Introduction
Agents are the next iteration of LLM usage, taking traditional stateless interactions with LLMs and making them stateful, typically through the Assistants API or by extending an LLM with a framework. Popular ways to create an agent workflow include tools such as Promptflow, CrewAI, LangGraph, LangChain, and others. In this blog post I'm going to demonstrate LangGraph, which comes from the creators of LangChain and simplifies the process of creating agents. The way I frame agents is as a directed motion toward what you want from an LLM response, equipping the model with the specific tools, functions, and files pertinent to its use.
Architecture
This is the defined graph that shows our agents; we define two in this example, a Researcher and a Summary Agent. To get to this point we will work step by step, and for starters we need access to an LLM. To keep costs relatively free I've used Groq; you can opt for other open-source models, and you can also run Ollama locally.
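If you would rather run the model locally, a minimal sketch with Ollama might look like the following. This assumes you have Ollama running and the langchain-ollama package installed; the model name is only an example of one you might have pulled, and the rest of this post uses Groq instead.
# Hypothetical local alternative: assumes Ollama is running locally
# and the langchain-ollama package is installed.
from langchain_ollama import ChatOllama

local_llm = ChatOllama(
    model="llama3.1",  # any model you have pulled with `ollama pull`
    temperature=0,
)
print(local_llm.invoke("Say hello in one sentence.").content)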
Code
First we start with the imports. Assuming you're using a Python Jupyter notebook, this will look like the following.
%pip install -U langgraph langchain-openai langchain-groq
%pip install python-dotenv
%pip install exa-py
The package we are installing beyond LangGraph is the Exa SDK, which provides the search tool that the research agent (and the summary agent) will have access to. If you want to use another tool, an alternative is the Tavily API, which offers a generous free tier.
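If you go the Tavily route instead, a rough sketch of the tool might look like this. It assumes the langchain-community package is installed and a TAVILY_API_KEY environment variable is set; it is not used in the rest of this post.
# Hypothetical alternative search tool via Tavily.
# Assumes langchain-community is installed and TAVILY_API_KEY is set.
from langchain_community.tools.tavily_search import TavilySearchResults

tavily_tool = TavilySearchResults(max_results=5)
print(tavily_tool.invoke("latest research on Generative AI Security"))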
Setting our environment variables should be simple. Since I'm running locally, I'm using the python-dotenv library to read the key/value pairs.
import os
from dotenv import load_dotenv

def check_openai_api_key():
    # Load environment variables
    load_dotenv()
    # Get the API keys from environment variables
    openai_api_key = os.getenv("OPENAI_API_KEY")
    exa_api_key = os.getenv("EXA_API_KEY")
    groq_api_key = os.getenv("GROQ_API_KEY")
    # Report each key individually so a missing one is easy to spot
    for name, value in [
        ("OPENAI_API_KEY", openai_api_key),
        ("EXA_API_KEY", exa_api_key),
        ("GROQ_API_KEY", groq_api_key),
    ]:
        print(f"{name} found." if value else f"Error: {name} not found.")
    # Only return the keys if all of them are present
    if openai_api_key and exa_api_key and groq_api_key:
        return openai_api_key, exa_api_key, groq_api_key
    return None

# Example usage
api_keys = check_openai_api_key()
if api_keys:
    # Proceed with using the API keys
    print("Proceeding with API keys.")
else:
    # Handle the error
    print("Please set the missing environment variables in your .env file.")
This assumes you have a .env file with these values set. Again, INCLUDE THIS FILE IN .gitignore and don't commit secrets.
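For reference, the .env file would contain something like the following (placeholder values only, substitute your own keys).
OPENAI_API_KEY=your-openai-key
EXA_API_KEY=your-exa-key
GROQ_API_KEY=your-groq-key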
For the code below I'm using the example provided on the LangGraph multi-agent collaboration page here.
from langchain_core.messages import (
    BaseMessage,
    HumanMessage,
    ToolMessage,
)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.graph import END, StateGraph, START

def create_agent(llm, tools, system_message: str):
    """Create an agent."""
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a helpful AI assistant, collaborating with other assistants."
                " Use the provided tools to progress towards answering the question."
                " If you are unable to fully answer, that's OK, another assistant with different tools "
                " will help where you left off. Execute what you can to make progress."
                " If you or any of the other assistants have the final answer or deliverable,"
                " prefix your response with FINAL ANSWER so the team knows to stop."
                " You have access to the following tools: {tool_names}.\n{system_message}",
            ),
            MessagesPlaceholder(variable_name="messages"),
        ]
    )
    prompt = prompt.partial(system_message=system_message)
    prompt = prompt.partial(tool_names=", ".join([tool.name for tool in tools]))
    return prompt | llm.bind_tools(tools)
Then, if we want to limit the agent's access to specific domains, we can annotate the tool accordingly. I'm searching specific sites that I use constantly, which narrows the results down to the use case.
from exa_py import Exa
from langchain_groq import ChatGroq
import json

llm = ChatGroq(
    model="mixtral-8x7b-32768",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

# For use of the tool we will use the Exa API
from typing import Literal
from langchain_core.tools import tool

exa = Exa(os.getenv("EXA_API_KEY"))

# Define the search function
@tool
def search(query: str) -> str:
    """
    Perform an Exa SDK search against the given query
    @param query: Search query
    @return: JSON string of search results
    """
    try:
        result = exa.search_and_contents(
            query=query,
            type="neural",
            include_domains=["arxiv.org", "bing.com", "google.com"],
            start_published_date="2023-12-31",
            end_published_date="2024-09-12",
            use_autoprompt=True,
            num_results=5,
            text=True,
        )
        output = []
        # The SearchResponse object exposes its hits on the .results attribute
        for item in result.results:
            output.append({
                "title": item.title,
                "link": item.url,
                "snippet": item.text,
                "score": item.score,
                "publish": item.published_date,
            })
        return json.dumps(output)
    except Exception as e:
        print(f"Error during search: {e}")
        return json.dumps({"error": str(e)})

tools = [search]
This code defines our tool, which we expose via the Exa SDK. The client is given specific date parameters to ensure the results are as up to date as possible, up to September 12, 2024.
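Before wiring the tool into the graph, you can sanity-check it on its own by invoking it directly. This continues from the code above; the query string is just an example.
# Quick standalone check of the search tool (example query)
print(search.invoke("latest research on Generative AI Security"))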
Defining the State
This portion of the code defines the state for our LangGraph.
import operator
from typing import Annotated, Sequence
from typing_extensions import TypedDict
from langchain_groq import ChatGroq

# This defines the object that is passed between each node
# in the graph. We will create different nodes for each agent and tool
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
    sender: str
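The Annotated[..., operator.add] reducer on the messages key is what makes each agent's output accumulate rather than overwrite: when a node returns a partial state containing a messages list, LangGraph concatenates it onto the existing list. A tiny illustration of that behaviour, outside of the graph, looks like this.
# Illustration of the operator.add reducer on the messages key:
# each node returns a partial state and LangGraph concatenates the lists.
from langchain_core.messages import AIMessage, HumanMessage
import operator

existing = [HumanMessage(content="Find recent research on Generative AI Security.")]
update = [AIMessage(content="Here is what I found so far...")]
merged = operator.add(existing, update)  # equivalent to existing + update
print([type(m).__name__ for m in merged])  # ['HumanMessage', 'AIMessage']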
Once that is defined, we can put our agent nodes together as shown below.
import functools
from langchain_core.messages import AIMessage

# Helper function to create a node for a given agent
def agent_node(state, agent, name):
    result = agent.invoke(state)
    # We convert the agent output into a format that is suitable to append to the global state
    if isinstance(result, ToolMessage):
        pass
    else:
        result = AIMessage(**result.dict(exclude={"type", "name"}), name=name)
    return {
        "messages": [result],
        # Since we have a strict workflow, we can
        # track the sender so we know who to pass to next.
        "sender": name,
    }

# Specify which model the agents consume
llm = ChatGroq(model="mixtral-8x7b-32768")

# Research agent and node
research_agent = create_agent(
    llm,
    tools,
    system_message="You should provide accurate data for the summary_agent to use.",
)
research_node = functools.partial(agent_node, agent=research_agent, name="Researcher")

# summary_agent and node
summary_agent = create_agent(
    llm,
    tools,
    system_message="You should write a clear, concise summary of the research for the user.",
)
summary_node = functools.partial(agent_node, agent=summary_agent, name="summary_agent")
Here we define our two agents, the Researcher and the Summary Agent, and specify the model they use via the Groq SDK, in this case Mixtral-8x7B.
Next we put together our tools using the following code.
from langgraph.prebuilt import ToolNode
tools = [search]
tool_node = ToolNode(tools)
The next piece of code is the edge logic: how we decide what to do next based on the results of the agents.
# Either agent can decide to end
from typing import Literal

def router(state) -> Literal["call_tool", "__end__", "continue"]:
    # This is the router
    messages = state["messages"]
    last_message = messages[-1]
    if last_message.tool_calls:
        # The previous agent is invoking a tool
        return "call_tool"
    if "FINAL ANSWER" in last_message.content:
        # Any agent decided the work is done
        return "__end__"
    return "continue"
Next our code defines the StateGraph, which is pivotal for compiling everything together.
workflow = StateGraph(AgentState)

workflow.add_node("Researcher", research_node)
workflow.add_node("summary_agent", summary_node)
workflow.add_node("call_tool", tool_node)

workflow.add_conditional_edges(
    "Researcher",
    router,
    {"continue": "summary_agent", "call_tool": "call_tool", "__end__": END},
)
workflow.add_conditional_edges(
    "summary_agent",
    router,
    {"continue": "Researcher", "call_tool": "call_tool", "__end__": END},
)
workflow.add_conditional_edges(
    "call_tool",
    # Each agent node updates the 'sender' field
    # the tool calling node does not, meaning
    # this edge will route back to the original agent
    # who invoked the tool
    lambda x: x["sender"],
    {
        "Researcher": "Researcher",
        "summary_agent": "summary_agent",
    },
)
workflow.add_edge(START, "Researcher")
graph = workflow.compile()
Now we can visualize the graph, similar to the diagram shown in the Architecture section at the beginning.
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
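If the Mermaid PNG rendering fails in your environment (it pulls in extra dependencies), an ASCII rendering of the same graph is a reasonable fallback; this assumes the grandalf package is installed.
# Fallback visualization if PNG rendering isn't available (requires grandalf)
graph.get_graph().print_ascii()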
Calling our Agents
events = graph.stream(
    {
        "messages": [
            HumanMessage(
                content="Generate the latest research on Generative AI Security, and summarize a report on the latest research in the field."
            )
        ],
    },
    # Maximum number of steps to take in the graph
    {"recursion_limit": 15},
)
for s in events:
    print(s)
    print("----")
Output
While this shows only a small portion of the output, the recursion_limit is set to 15 to keep the request within the tokens-per-minute limit, since the graph pulls in several research articles and then passes them to the LLM to summarize.
{‘Researcher’: {‘messages’: [AIMessage(content=’In the field of Generative AI Security, recent research has identified and addressed several potential threats and risks. One study focuses on the misuse of Large Language Models (LLMs) by cybercriminals, who exploit these models to generate and automate cyber attacks. The research demonstrates how LLMs can be manipulated to bypass ethical and privacy safeguards, effectively creating threats such as social engineering, malicious code, payload generation, and spyware. By testing these AI-generated attacks on live systems, the study assesses their effectiveness and the vulnerabilities they exploit, shedding light on the risks AI poses to critical infrastructure.
In response to the growing problem of unauthorized malicious image generation, another study proposes a new adversarial attack method that adds strong perturbation on the high-frequency areas of images. This method aims to make adversarial images more robust to adversarial purification, making it more difficult for attackers to generate malicious images.
A paper on secret collusion among generative AI agents comprehensively formalizes the problem of secret collusion and proposes a variety of mitigation measures. The study results in a model evaluation framework that systematically tests capabilities required for various forms of secret collusion, providing extensive empirical results across a range of contemporary LLMs. While the steganographic capabilities of current models remain limited, the research highlights the need for continuous monitoring of steganographic frontier model capabilities.
In the context of the Internet of Things (IoT), the integration of generative AI poses emerging security risks. Recent research discusses these risks and analyzes the potential for data breaches when using generative AI in IoT ecosystems. Strategic approaches for mitigating these risks include the development of robust security protocols, multi-layered security approaches, and the adoption of AI technological solutions.
A study on generated data with fake privacy reveals potential privacy risks associated with fine-tuning large language models on generated data. Fine-tuning with LLM-generated data may enhance privacy, but it can also pose additional privacy risks. The research focuses on two primary approaches to fine-tuning with generated data, highlighting the potential privacy risks in LLMs when fine-tuning with generated data.
In summary, the latest research in Generative AI Security emphasizes the importance of addressing potential risks and threats, including cyber attacks, unauthorized image generation, secret collusion among AI agents, IoT security risks, and privacy risks associated with fine-tuning large language models on generated data.
Summary
The use of agents is by far one of the best methods of achieving accuracy for the intended output, since the inputs can be made more relevant to what you are trying to achieve, such as "the latest news from a specific region between August and September". This usually relies on tools working in tandem within the workflow: results are parsed, cleaned, and sent to the LLM as context, and fine-tuning can assist further. Having used CrewAI, in my opinion LangGraph is a robust framework that is much more customizable, making it easier to truncate responses and avoid excessive (and costly) token use. Check out LangGraph and the examples presented to see if it meets your use case.