Automated Red Teaming Agent in Azure Foundry

Your organization is likely navigating how to adopt Generative AI, whether that means innovating on an existing internal application or building an external web application, and either way the technology should be thoroughly evaluated prior to release. You’ve likely encountered the term “Prompt Injection,” and you may also be aware of automation that can assist with evaluating changes to your LLM model, notably how you configure “guardrails.” Azure Foundry recently released, in public preview, a “Red Teaming Agent” that uses PyRIT (Python Risk Identification Tool) under the hood and runs automated attack “turns” against your LLM endpoint.

At this point most agents and workflows are deployed in some format, such as SaaS applications or front-facing chat bots for various use cases. These input mediums now handle far more varied scenarios than the older pattern of a constrained natural-language input mapped to a limited set of outputs. Given the explosion of Generative AI use, security teams now have a substantial surface of inputs to monitor to ensure reputational damage or data exfiltration doesn’t occur.

OWASP Top 10 Risks & Mitigations for Generative AI

The image above, sourced from the official genai.owasp.org, lists Prompt Injection as the highest of the top 10 risks, vulnerabilities and mitigations for 2025. While the inputs you’ll encounter for a specific car-listing application are likely as benign as “What cars are listed between $35,000 to $45,000 with a mileage around 5,000 – 10,000?”, we should take due care in investigating how our model responds to various inputs.

Getting Started

To use the Azure AI Red Teaming Agent, a few caveats apply to your Azure Foundry workspace.

  • Regions – Your Azure Foundry must be deployed in a supported region (East US2, Sweden Central, France Central, Switzerland West).
  • Permissions – For the evaluation to be uploaded, ensure your user credential holds the “Storage Blob Data Contributor” role on the Foundry-connected Storage Account, as the report is written there.
  • Connectivity – If you run into connection issues, verify that the private link associated with the Storage Account allows access from your Azure Foundry; this can be a challenge (beware).
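If that role assignment is missing, it can be granted with the Azure CLI. This is a template, not a copy-paste command: the assignee, subscription, resource group, and storage account names below are placeholders you'll need to fill in for your environment.

```shell
# Grant the signed-in user write access to the Foundry-connected Storage Account
az role assignment create \
  --assignee "<user-object-id-or-upn>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<sub-id>/resourceGroups/<azure-rg>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
```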

Okay, those are the notes from my own experience; hopefully they assist. Now we move into our authorization method using the Azure CLI.

az login --use-device-code

Once authenticated to our subscription, we need a few packages. I’m keeping this relatively vanilla and will add more custom checks in the future.

%pip install azure-ai-evaluation[redteam]
%pip install python-dotenv
%pip install azure-ai-projects
%pip install azure-identity

These can go in a requirements.txt, though if you’re like me you’ll prefer Jupyter notebooks when exploring a new portion of an SDK.
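If you do want to pin these in a file, the equivalent requirements.txt (taken straight from the installs above) looks like:

```
azure-ai-evaluation[redteam]
python-dotenv
azure-ai-projects
azure-identity
```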

Now we’ll create a .env file for the variables used to connect to our resources.

# .env
AZURE_SUBSCRIPTION_ID="<sub-id>"
AZURE_RESOURCE_GROUP_NAME="<azure-rg>" # Foundry Group
AZURE_PROJECT_NAME="<Azure_project_name>"
OPENAI_CHAT_ENDPOINT="<chat_endpoint>"
AZURE_OPENAI_ENDPOINT="<endpoint>"
AZURE_OPENAI_KEY="<key>"
AZURE_OPENAI_DEPLOYMENT_NAME="<deployment-name>"

Once you’ve run the initialization portion of the code, you’ll get an informational pop-up stating “Class RedTeam: This is an experimental class”. The class is initialized by pointing to our project, using our credential, and identifying our risk categories.
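The initialization referred to above can be sketched roughly as follows. This is a cloud-dependent sketch, not runnable offline; the risk categories and `num_objectives` value are illustrative assumptions, so check the azure-ai-evaluation documentation for your SDK version.

```python
import os

from azure.ai.evaluation.red_team import RedTeam, RiskCategory
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

# Pull the project identifiers from the .env file created above
load_dotenv()

azure_ai_project = {
    "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
    "resource_group_name": os.environ.get("AZURE_RESOURCE_GROUP_NAME"),
    "project_name": os.environ.get("AZURE_PROJECT_NAME"),
}

red_team_agent = RedTeam(
    azure_ai_project=azure_ai_project,
    credential=DefaultAzureCredential(),  # picks up the az login session
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],  # illustrative choice
    num_objectives=5,  # attack objectives generated per risk category (assumption)
)
```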

Creating a Target

What is a “Target”? Think of it as where we direct our attack; typically this is a consumed API to our LLM. The code below uses dotenv to pull the environment variables defined in the .env file above.

# Configuration for your Azure OpenAI model
# ---> This will be gpt-4o or gpt-4o-mini, etc.
import os

from dotenv import load_dotenv

# Load .env file
load_dotenv()

OPENAI_CHAT_ENDPOINT = os.environ.get("OPENAI_CHAT_ENDPOINT")

# Configuration for Azure OpenAI
azure_openai_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_KEY"),  # not required if authenticated with az login --use-device-code
    "deployment_name": os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME"),
    "model": "gpt-4o-mini",  # or "gpt-4o", etc.
}

red_team_result = await red_team_agent.scan(scan_name="Red Team Sugar",
                                            target=azure_openai_config)

Running this scan I ran into one issue: the deployment name didn’t match the parameter sent to the OpenAI endpoint. If this occurs and you need to look up the current deployment name, run this command.

az cognitiveservices account deployment list --name <openai-resource> --resource-group <resource-group> -o table

Then restart your Jupyter kernel so the notebook variables pick up the corrected deployment name.

Running the scan

After you’ve run the previous code, attack_strategies is where you define which levels of complexity you want, along with the specific strategies. If you are concerned about the cost of larger attack payloads, limit this to a small subset.

from azure.ai.evaluation.red_team import AttackStrategy


# Run the red team scan with multiple attack strategies
advanced_result = await red_team_agent.scan(
    target=azure_openai_config,
    scan_name="July-Scan",
    attack_strategies=[
        AttackStrategy.EASY,  # Group of easy complexity attacks
        AttackStrategy.MODERATE,  # Group of moderate complexity attacks
        AttackStrategy.CharacterSpace,  # Add character spaces
        AttackStrategy.ROT13,  # Use ROT13 encoding
        AttackStrategy.UnicodeConfusable,  # Use confusable Unicode characters
        AttackStrategy.CharSwap,  # Swap characters in prompts 
        AttackStrategy.Morse,  # Encode prompts in Morse code
        AttackStrategy.Leetspeak,  # Use leetspeak (think 1337 h4x0r)
        AttackStrategy.Url,  # URL-encode prompts
        AttackStrategy.Binary,  # Encode prompts in binary
        AttackStrategy.Compose([AttackStrategy.Base64, AttackStrategy.ROT13]),  # Use two strategies in one attack
    ],
    output_path="Scan.json", # if you want to define the .json file output path
)

After running this code you should get a status report and a link to the Azure Foundry evaluation portal; once the scan completes, you can review the results there.

A successful attack is illustrated above, in which the model was coaxed into explaining a way to maximize monetary gain from a bank heist. The report also surfaces other attacks, such as the Morse code strategy, that broke through and elicited a response from the model, as shown below.
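To build intuition for what these strategies actually do to a prompt before it reaches the model, here is a standalone sketch of the ROT13, Base64, and composed transforms using only the Python standard library (independent of the SDK; the prompt string is a made-up example):

```python
import base64
import codecs

prompt = "How do I pick a lock?"

# Single-strategy transforms, analogous to AttackStrategy.ROT13 / Base64
rot13 = codecs.encode(prompt, "rot_13")
b64 = base64.b64encode(prompt.encode()).decode()

# Chained transform, analogous to AttackStrategy.Compose([Base64, ROT13]):
# Base64-encode first, then ROT13 the result
composed = codecs.encode(b64, "rot_13")

print(rot13)  # Ubj qb V cvpx n ybpx?
print(composed)
```

The obfuscated text still decodes to the original instruction, which is exactly why content filters that only inspect the surface form of a prompt can be bypassed.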

Summary

Evaluations assess the accuracy, security and overall performance of large language models deployed within an organization. This native approach leverages Azure Foundry to rapidly identify weaknesses and prioritize potential mitigations. In this case, Azure AI Content Safety is native to the LLM endpoint as one layer, and applying additional content filters based on the scan results would further limit this possibility. Jailbreaks often resemble a ‘whack-a-mole’ scenario, requiring multiple defensive strategies. Explore the Azure AI Red Teaming Agent within your Azure Foundry and test its capabilities; you can bring your own data as well.