Groq + Exa.ai: Powerful Searching Across LLMs

Introduction

I’ve been exploring APIs that extend the search capabilities of existing LLMs, covering knowledge the underlying model doesn’t have and ideally surfacing relevant knowledge bases for some research I’m conducting. I’ve tried a handful of APIs, such as the Serper API, which is very powerful, and I recently did a video using the Bing API via function calling; it has proven a genuinely useful capability to add to an application that uses an LLM. Given the current landscape of startups building AI into their products, I came across a company called Exa.ai, which portrays itself as “Google 2.0 (Search for AI)”, a very bold claim, and in this post I’m going to run a few sample turns with its API.

The setup

If you’d like to use this type of function call, the documentation on Exa.ai is genuinely plug-and-play for the Python SDK, and if you choose to use it within an agentic framework such as CrewAI there is a how-to guide as well. The items below are what I’m running; to replicate this you can choose your own LLM as long as it supports function calling. I’ve hosted the source code in this repository, and a quick check that both keys are present is sketched right after the list below.

  • Groq API key (for Llama 3 8B)
  • Exa.ai API key
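As a minimal sketch (assuming both keys are exported as the environment variables GROQ_API_KEY and EXA_API_KEY, which is how the code below reads them), you can fail fast if either one is missing:

import os

# Hypothetical startup check: confirm both API keys are exported before running the script
for key in ("GROQ_API_KEY", "EXA_API_KEY"):
    if not os.environ.get(key):
        raise RuntimeError(f"Missing environment variable: {key}")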

To start off, this code is hosted on my GitHub if you’d like to reuse it. I’ve declared a search function as follows.

First we gather our requirements and define the search function; its description tells the LLM that if the knowledge isn’t known to the model, it should invoke this tool.

import os
import json

from exa_py import Exa
from rich.console import Console

console = Console()

# Exa API client
exa = Exa(api_key=os.environ.get("EXA_API_KEY"))

# Define the search function
def search(query: str) -> str:
    """
    Perform an Exa SDK search against the given query.

    @param query: Search query
    @return: JSON string of search results
    """
    try:
        result = exa.search_and_contents(
            query=query,
            type="neural",
            use_autoprompt=True,
            num_results=5,
            text=True,
        )

        output = []

        # The SearchResponse object exposes its results via the .results attribute
        for item in result.results:
            output.append({
                "title": item.title,
                "link": item.url,
                "snippet": item.text,
                "score": item.score,
            })

        return json.dumps(output)
    except Exception as e:
        console.print(f"Error during search: {e}", style="bold red")
        return json.dumps({"error": str(e)})

This defines our function; now we can use some familiar code to call Groq and specify our model of choice. For most uses, Llama 3 works well.

from groq import Groq

# Groq API client
client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

# Query the user for a search query
query = input("Enter a desired query: ")

# Perform a search
def run_conversation(query):
    # Initial user message that is passed to the API
    messages = [{"role": "user", "content": query}]

    # Define the search function (tool) for the model
    tools = [
        {
            "type": "function",
            "function": {
                "name": "search",
                "description": "Search for any query that is not known or understood by the model.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "LLM Security encompasses a variety of approaches that should be considered such as OWASP Top 10 LLMs.",
                        },
                    },
                    "required": ["query"],
                },
            }
        }
    ]
    # First API call: Ask the model to use the function to search for the query
    chat_completion = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=messages,
        tools=tools,
        tool_choice="auto",
        max_tokens=4096,
    )
    
    # Process the model response
    response_message = chat_completion.choices[0].message
    tool_calls = response_message.tool_calls
    if tool_calls:
        available_functions = {
            "search": search,
        }
        messages.append(response_message)
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                query=function_args.get("query")
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )
        # Print the response
        console.print("Model's Response (LLM)", style="bold red")
        console.print(response_message, style="bold red")

        # Second API call: Pass the response to the model
        final_response = client.chat.completions.create(
            model="llama3-8b-8192",
            messages=messages,
            max_tokens=4096,
        )

        # Print the final response
        final = final_response.choices[0].message.content

        # Distinguish the print with a color to make the function-call result clear
        console.print("Final Response:", final, style="bold blue")

        # Validate that the function was called
        console.print("Raw Final Response:", final_response, style="bold yellow")

        # Return the final response (comment this out if you only need the printed output)
        return final

print(run_conversation(query))

I’ve tested this out, and from this point we can start querying the function tool with our request. Since any LLM model on the market has a cutoff date on the information it knows, I’m going to ask about current research.
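As an example, and the query itself is just a hypothetical one I’m using for illustration, you could hand run_conversation a question beyond the model’s cutoff instead of prompting for input:

# Hypothetical example query about recent research beyond the model's knowledge cutoff
example_query = "What are the most recent papers on prompt injection defenses for LLMs?"
print(run_conversation(example_query))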

Using the rich library for Python, I’ve added some color so the end user can tell which response came from our tool. The blue output specifically is the final answer built from our function call, which retrieves the research via Exa.ai; notice it references the research papers we were looking for, and the tool call is also given a tool ID.

Now if we scroll down, the response also includes what is labelled the Raw Final Response.

Tailoring our search to specific domains

If we navigate back to our search function, we can also use parameters such as include_domains and exclude_domains; since we know we are performing research, we’d like to target the research sites that house most of this information.

We can capture this by modifying our exa.search_and_contents call with include_domains, start_published_date, and end_published_date. Perhaps you want your search to be relevant to a specific time frame; this is where those parameters come into play. Additionally, I want to ensure my results are focused on academic research, as sketched below.
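Here is a minimal sketch of that modified call; the domain list and date range below are my own illustrative assumptions rather than values from the original setup:

# Sketch: restrict the Exa search to academic sources within a date window
# (the domains and dates are illustrative assumptions)
result = exa.search_and_contents(
    query=query,
    type="neural",
    use_autoprompt=True,
    num_results=5,
    text=True,
    include_domains=["arxiv.org", "aclanthology.org", "semanticscholar.org"],
    start_published_date="2024-01-01",
    end_published_date="2024-06-30",
)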

Our response is enhanced this time with more accurate research from the specific domains we’ve pointed our search function at, and this feeds into our LLM, which summarizes the research back to us. The raw response is shown to illustrate what is happening behind the scenes in the response from Groq.

Summary


In conclusion, integrating the Groq API with Exa.ai to power search within a Large Language Model (LLM) significantly enhances search results. By using function calling backed by Exa.ai, relevant topics are surfaced in a concise and customizable format. This approach has proven invaluable in my projects, even surpassing what I got from the Bing API. The straightforward addition of a function call to your LLM not only improves query handling but also boosts response accuracy. Check out Exa.ai if you’re looking for search functionality; in a future blog post I’m going to show how to use this with agent workflows to perform search tasks and write reports.