Tag: API

  • RouteLLM Unlocking Cost Effective LLM Routing

    Introduction Costs associated with using closed-source large language models can add up in the use cases of complex tasks due to the nature of how tokens are priced for using APIs. RouteLLM is a open-sourced project that creates a method to determine based on the query a user sends which LLM to choose based on…

  • Batch Jobs in Azure OpenAI

    Introduction In the existing landscape of Generative AI, optimizing API submissions is crucial for both cost and performance. Whether you’re fine-tuning token usage or streamlining context-aware requests using Retrieval-Augmented Generation (RAG), finding the right tools can make a significant difference. One of the most promising solutions is the Azure OpenAI Batch API, designed specifically for…

  • Adversarial Simulation in Azure AI Studio

    Large Language Models present a powerful enabler for various use-cases for most enterprises but without some form of due diligence and testing can spew some unintended responses. Content safety is a preventative mechanism that is used for Azure AI Studio and can also be tested with the Prompt-flow SDK. In this blog post I’ve going…

  • Vertex AI Agents

    Google Cloud Platform’s Vertex AI offers a comprehensive suite of tools designed to simplify the process of building, deploying, and scaling machine learning models. One of the standout features of Vertex AI is its support for Agents, which are frameworks that enable seamless integration and automation within AI workflows. In this blog post, we’ll delve…

  • API Server VNET Integration

    Connectivity in AKS If you’re running AKS in production you’ll likely encounter the private link scope and integration of leverage private DNS zones for putting the API server behind private IP’s rather than accessible on port 6443 or you should be doing this. But what about other options? Perhaps you’re spinning up a dev/test cluster…

  • Retina by Microsoft OSS

    KubeCon 2024 in Europe has recently wrapped up this past week with some major announcements from various vendors one that stood out to me is the use of Retina. Microsoft released a open-source cloud-agnostic Kubernetes Network Observability platform this can provide a path to customizable telemetry. This telemetry has multiple options on where you’d like…

  • Exploring KEDA Scaling to Zero

    Continuing experimentation on CNCF projects I’ve stumbled across one that is near and dear to the Microsoft Azure space since KEDA was introduced with some contributors from Microsoft and still has maintainers that are current as of this repo’s README.md. To understand further on what exactly KEDA is we start at the top of what…

  • KubeArmor Explored

    KubeArmor is a cloud-native runtime security enforcement system that works with restricting behavior (this resides with execution, file access, and network operations) of pods, containers, and nodes (VM’s) at the system level. The way this tool works is by using Linux Security Modules which to no surprise are enamored in the Certified Kubernetes Security Specialist…

  • Azure ML on AKS with Trusted Access

    Azure ML on AKS with Trusted Access

    Trusted Access which is in preview provides secure access to the Kubernetes API Server while granting services that are needed for operations without requiring a traditional (private-endpoint). This feature uses the system0assigned managed identity as a authentication mechanism as intermediary to access your AKS clusters. As always in any feature that is rolled out prior…

  • Capsule – Multi-tenant in Kubernetes

    Capsule – Multi-tenant in Kubernetes

    Introduction Multi-tenancy in Kubernetes refers to the ability to isolate and manage multiple user groups or ‘tenants’ within a single Kubernetes cluster. This approach is essential for organizations that want to maximize resource utilization while maintaining isolation and security between different user groups. Typically this can be achieved by either logical isolation mapping namespaces as…