Tag: Compute

  • DeepSeek-V3 (MoE)

    DeepSeek-V3 (MoE)

    DeepSeek-V3 is an open-source large language model that boast a 671-billion parameter Mixture-of-Experts architecture with only 37 billion parameters activated per token. This specific model uses Multi-Head Latent Attention (MLA) for inference this compresses the attention keys and values in a low dimensional latent representation. Additionally this has also the strategy of Auxiliary-Loss-Free load balancing…

  • Hacking Kubernetes via ServiceAccountTokens

    Kubernetes has a large amount of advancements and inherent good security principles but these are dependent on configurations that are typically not well-known to end users. Predominantly the constructs of Service Accounts or (Non-human Identities) for the masses are populated in many services as they act as the go-between for service to authenticate and operate…

  • Evaluations in Azure Foundry

    Evaluations in the application of Generative AI serve as a backstop component to build trust and confidence in your AI-centric applications. Measuring the output and context as it is produced in your application can help you grasp in a verifiable method how your application will perform under certain conditions. Given the natural language usage of…

  • Image Policy Webhook

    Image Policy Webhook is a native Kubernetes admission plugin that enforces security policies by validating container images before they are deployed. This ensures that only trusted and compliant images run in your environment. This will take the image that is attempted to be applied compare against predefined policies, and if those policies allow the image…

  • Garak Red Teaming LLMs

    As Generative AI is playing a role in multiple organizations so is the popularity of tools for identifying risks and vulnerabilities. In this blog I’m exploring Garak a LLM vulnerability scanner developed by NVIDIA and is a OSS project to help strengthen LLM Security. When the term “Red Team” appears in the approach of simulation…

  • Bill of Materials CKS Refresher

    A Software Bill of Materials (SBOM) is like the ingredients list on your food package—it reveals what components, libraries, and dependencies go into building the final software. Just as checking food labels helps you understand nutritional content and potential allergens, an SBOM provides transparency into third-party components, helping identify vulnerabilities early in the software supply…

  • Azure AI Foundry

    Introduction This week at Microsoft Ignite, Azure AI Foundry was unveiled as the rebranded successor to “Azure AI Studio.” This marks a significant step toward unifying AI development tools under one cohesive platform. Azure AI Foundry provides a streamlined toolchain and an SDK designed for efficient consumption of AI models, supporting both OpenAI and Mistral…

  • RouteLLM Unlocking Cost Effective LLM Routing

    Introduction Costs associated with using closed-source large language models can add up in the use cases of complex tasks due to the nature of how tokens are priced for using APIs. RouteLLM is a open-sourced project that creates a method to determine based on the query a user sends which LLM to choose based on…

  • Github Actions with Azure ML Jobs

    Introduction Its no secret that Github is a premier development platform for Source Control Management and does have a robust features that also allow for Continuous Integration/Continuous Deployment. These features are a part of the use of Github Actions, in this blog post I’m going to use the example code in a repository here this…

  • Mutability of FIPS on AKS

    Introduction Your in compliance and tasked with identifying which microservice supported supports Federal Information Processing standards. Operations are dynamic and can change from supporting a business unit that might have this requirement, so what are you options if you have to revert and keep the cluster? Currently in Azure Kubernetes Service this has been capable…