Tag: Serverless

  • DeepSeek-V3 (MoE)

    DeepSeek-V3 is an open-source large language model that boasts a 671-billion-parameter Mixture-of-Experts architecture with only 37 billion parameters activated per token. The model uses Multi-Head Latent Attention (MLA) for inference, which compresses the attention keys and values into a low-dimensional latent representation. It also uses an Auxiliary-Loss-Free load-balancing strategy…

  • Hacking Kubernetes via ServiceAccountTokens

    Kubernetes has many advanced features and sound built-in security principles, but these depend on configurations that are typically not well known to end users. In particular, Service Accounts (non-human identities) are present in many services, as they act as the go-between for services to authenticate and operate…

  • Evaluations in Azure Foundry

    Evaluations in Generative AI applications serve as a backstop for building trust and confidence in your AI-centric applications. Measuring the output and context as they are produced in your application can help you understand, in a verifiable way, how your application will perform under certain conditions. Given the natural language usage of…

  • Image Policy Webhook

    Image Policy Webhook is a native Kubernetes admission plugin that enforces security policies by validating container images before they are deployed. This ensures that only trusted and compliant images run in your environment. It takes the image being applied, compares it against predefined policies, and if those policies allow the image…

  • PyRIT for LLM Security

    Microsoft launched PyRIT (Python Risk Identification Tool) in 2024 as an open-source framework for identifying risks in Generative AI systems by testing them with multiple attack methods. Given the growing number of methods for jailbreaking systems, this allows for the dynamic adaptation of attacks to quickly automate processes of…

  • Garak Red Teaming LLMs

    As Generative AI plays a role in more organizations, tools for identifying risks and vulnerabilities are growing in popularity. In this blog I’m exploring Garak, an LLM vulnerability scanner developed by NVIDIA and an OSS project that helps strengthen LLM security. When the term “Red Team” appears in the approach of simulation…

  • Azure AI Foundry

    This week at Microsoft Ignite, Azure AI Foundry was unveiled as the rebranded successor to “Azure AI Studio.” This marks a significant step toward unifying AI development tools under one cohesive platform. Azure AI Foundry provides a streamlined toolchain and an SDK designed for efficient consumption of AI models, supporting both OpenAI and Mistral…

  • Phi-3.5 Mixture of Experts

    Microsoft recently open-sourced its Phi-3.5 Mixture of Experts model in the Azure AI Studio catalog, offered as Model-as-a-Service that you can run on Azure; you can also use Hugging Face to run the model. The first question, depending on how closely you’re following the constant upstream releases of models, is the…

  • RouteLLM Unlocking Cost Effective LLM Routing

    Costs associated with using closed-source large language models can add up for complex tasks because of how API tokens are priced. RouteLLM is an open-source project that provides a method to decide, from the query a user sends, which LLM to choose based on…

  • GitHub Actions with Azure ML Jobs

    It’s no secret that GitHub is a premier development platform for source control management and has robust features that also allow for Continuous Integration/Continuous Deployment. These features are part of GitHub Actions. In this blog post I’m going to use the example code in a repository here this…