An AI agent, also known as an autonomous agent or intelligent agent, is a generative AI program capable of interacting with its environment and accomplishing predetermined tasks autonomously. These artificial intelligence agents use the reasoning capabilities of large language models (LLMs) to autonomously develop plans for achieving a given task. The AI agents are further empowered to access real-world tools (e.g., a web browser) to execute their plans.
So how are these agents different from adjacent technologies like AI copilots? Though some have agentic AI built into them, copilots aren’t intended to function autonomously. Instead, they augment employees’ abilities through direct real-time interaction. Think of a customer service copilot retrieving information and suggesting responses to help a contact center agent resolve a complex query, or a field service copilot helping a technician repair an engine.
Back to building AI agents, let’s say we want to task one with ordering a pizza. Our instruction to the agent may look like:
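```
Order a large pepperoni pizza from a nearby pizzeria and have it delivered to my address.
```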
The AI agent will respond with a comprehensive plan to execute on the task (truncated for conciseness):
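```
1. Identify a nearby pizzeria that offers online ordering.
2. Open a web browser and navigate to the pizzeria's ordering page.
3. Select a pizza and any toppings.
4. Enter the delivery address and payment details.
...
```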
By making sure the AI agent has the right tools, we empower it to complete the task on its own. In this tutorial, we’ll detail how to build your own autonomous plan-and-execute agent using Python, OpenAI, and Beautiful Soup.
Set Up the AI Agent’s Tools and Memory
We’re going to build a plan-and-execute agent that’s capable of accepting instructions and navigating within the context of a web browser using links on the page. To accomplish this, we need to equip our agent with some capabilities:
- Tools: These are functions that the LLM can choose from and has access to. In this project, we’ll give it the ability to open a Chrome browser, navigate to a given page, and a function to parse out links on the page.
- Memory: This is a data structure that allows the LLM to remember what has already been done when planning future tasks. This includes breadcrumbs for itself as well as which tools have been called. Memory can be kept short-term in the agent or long-term to track overall progress toward the goal.
With these core components in place, we can now set up our plan-and-execute loop.
Create the Plan-and-Execute Loop
Agents work by analyzing their goals, creating steps that will reasonably accomplish that goal, and executing them using tools.
The heart of this process is the plan-and-execute loop. Our example is built around the following plan-and-execute loop:
1. First, we provide the agent with an instruction through the terminal. In this case, “Get all the links from Hacker News.”
2. Next, the agent creates a plan to accomplish the task considering its instructions and the tools that it has at its disposal. Given the tools we outlined, it responds with the following steps:
- open_chrome
- navigate_to_hackernews
- get_https_links
3. Last, with a completed plan, the agent moves on to the execute phase, calling the functions in order. When a function is called, the memory updates with the tool that was called and the related metadata, including the task, parameters, and result of the call.
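In Python-flavored pseudocode, the loop we'll build boils down to the following sketch (`llm_plan` and `llm_choose_tool` are hypothetical stand-ins for the prompts we write later):

```python
# One planning call up front, then step-by-step execution.
plan = llm_plan(instruction, tools)         # hypothetical planning helper
for task in plan:
    choice = llm_choose_tool(task, memory)  # LLM picks a tool as JSON
    result = tools[choice["tool"]](**choice.get("parameters", {}))
    # Memory records the task, tool, parameters, and result of each call.
    memory.append({"task": task, "choice": choice, "result": result})
```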
Enhance the Agent’s Memory, Prompts, and Tools
To give our tools greater functionality, we'll import several libraries. We chose these tools to showcase different ways agents can interact with a system:
- `subprocess` allows us to open system applications.
- `requests` and `BeautifulSoup` allow us to get links from a URL.
- `OpenAI` allows us to make LLM calls.
Setup:
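A minimal version of that setup, assuming the `openai` v1 Python client and an `OPENAI_API_KEY` environment variable, might look like:

```python
import json
import subprocess

import requests
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
```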
Memory
For our AI agent’s plans to be effective, we need to know what the agent has done and still needs to do. This step is often abstracted in frameworks, making it important to call out. Without updating the agent’s “state,” the agent won't know what to call next. Think of this as similar to conversation history in ChatGPT.
Let’s create an `Agent` class and initialize it with variables that we can use for completing our task. We will need to know what tasks have been given and the LLM’s responses to those tasks. We can save those as `memory_tasks` and `memory_responses`. We will also want to store our planned actions along with any URLs and links we might come across.
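Here's a minimal sketch of that class; `memory_tasks` and `memory_responses` come straight from the description above, while `plan`, `urls`, and `links` are plausible names for the remaining state:

```python
class Agent:
    def __init__(self):
        self.memory_tasks = []      # tasks handed to the LLM so far
        self.memory_responses = []  # the LLM's responses to those tasks
        self.plan = []              # the ordered list of planned actions
        self.urls = []              # URLs visited along the way
        self.links = []             # links scraped from pages
```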
Plan-and-execute prompts
In order to make and execute a plan, the agent needs to know what tools it has and to keep a memory of the steps already taken.
Since we want to navigate to a website and get links, we should call out tooling for opening Chrome, navigating to a website, and getting links. We can structure the prompt clearly around a role, a task, instructions, the tools available, and the expected output format. With this, we are asking the LLM to plan the steps needed to accomplish our task with the given tools.
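One way to structure such a planning prompt (the exact wording here is illustrative):

```python
PLANNING_SYSTEM_PROMPT = """
Role: You are a planning agent that controls a web browser.
Task: Produce a step-by-step plan that accomplishes the user's goal.
Instructions: Use only the tools listed below, one tool per step.
Tools available: open_chrome, navigate_to_hackernews, get_https_links
Output format: a JSON object such as {"steps": ["tool_name", ...]},
with the tools listed in execution order.
"""
```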
The `update_system_prompt` function was added specifically to show memory being injected during the execution of this workflow. The only major differences between the planning and execution prompts are the addition of our memory and the response format of the outputs.
When completing a task, the LLM looks at its memory of completed tasks and chooses a tool for the task at hand. It returns a tool name in JSON format, along with any variables needed to complete the task.
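A sketch of the execution prompt and the memory update, with the exact wording again illustrative:

```python
EXECUTION_SYSTEM_PROMPT = """
Role: You are an execution agent working through a plan.
Task: Choose the single best tool for the current step.
Tools available: open_chrome, navigate_to_hackernews, get_https_links
Memory of steps already taken: {memory}
Output format: a JSON object such as
{{"tool": "tool_name", "parameters": {{}}}}
"""

def update_system_prompt(memory_tasks, memory_responses):
    # Fold the agent's memory into the execution prompt so the LLM
    # can see which steps have already been completed.
    memory = json.dumps({"tasks": memory_tasks, "responses": memory_responses})
    return EXECUTION_SYSTEM_PROMPT.format(memory=memory)
```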
The `openai_call` function allows us to call OpenAI and request different response formats, showcasing the difference between planning and execution. The JSON format is important here because we use the response to determine which tool function to actually run.
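A sketch of that helper, assuming the `openai` v1 chat completions API (the model name is a placeholder):

```python
def openai_call(system_prompt, user_prompt, response_format):
    # response_format distinguishes planning from execution calls; JSON
    # matters because we parse the reply to decide which tool to run.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any capable chat model works
        response_format=response_format,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```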
Tools
Tools are functions that an agent has access to. Tools vary widely depending on the tasks the agent aims to accomplish.
In the code below, there are three specific tools that showcase how we can open a browser, navigate to a website, and get links. The way that these tools will be called is shown below when we put it all together. For now, think of this as a function an LLM can choose to call or not based on the current task of the plan.
The `open_chrome()` tool uses Python’s `subprocess` module to open a Chrome instance that will allow for navigation.
The `navigate_to_hackernews()` tool is hard-coded for this demo to showcase the idea; it should be expanded to accept any URL.
The `get_https_links()` tool uses the Beautiful Soup library. It fetches the page at the given URL and extracts all of its https links.
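Minimal sketches of the three tools; the Chrome launch command assumes macOS, so adjust it for your platform:

```python
def open_chrome():
    # Launch a Chrome instance that later steps can navigate.
    subprocess.Popen(["open", "-a", "Google Chrome"])
    return "Chrome opened"

def navigate_to_hackernews():
    # Hard-coded for this demo; a real agent would accept any URL.
    url = "https://news.ycombinator.com"
    subprocess.Popen(["open", "-a", "Google Chrome", url])
    return url

def get_https_links(url="https://news.ycombinator.com"):
    # Fetch the page and parse out every https link with Beautiful Soup.
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)
            if a["href"].startswith("https")]
```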
Bring All Your Work Together
Here, we create a function to plan and then execute that same plan.
To do this, we need to:
- Send the planning system prompt to the LLM to create a plan and save those tasks as a list.
- For every task in that list:
- Keep track of the task given to the agent.
- Pull in the task prompt with the history.
- Call the LLM to choose a tool based on the given task and return as JSON.
- Update our memory with the LLM response.
- Match the tool called by the LLM to one of the tools available.
- Run the function called.
- Start the next task.
Here’s what this looks like in our code below:
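A sketch of that loop, assuming the helpers above and the JSON shapes the prompts request:

```python
TOOLS = {
    "open_chrome": open_chrome,
    "navigate_to_hackernews": navigate_to_hackernews,
    "get_https_links": get_https_links,
}

def plan_and_execute(agent, instruction):
    # Plan: ask the LLM for an ordered list of tool names.
    raw_plan = openai_call(PLANNING_SYSTEM_PROMPT, instruction,
                           {"type": "json_object"})
    agent.plan = json.loads(raw_plan)["steps"]

    # Execute: work through the plan one task at a time.
    for task in agent.plan:
        agent.memory_tasks.append(task)
        system_prompt = update_system_prompt(agent.memory_tasks,
                                             agent.memory_responses)
        # The LLM picks a tool (and any parameters) for this task as JSON.
        choice = json.loads(openai_call(system_prompt, task,
                                        {"type": "json_object"}))
        agent.memory_responses.append(choice)
        # Match the tool name to an available function and run it.
        result = TOOLS[choice["tool"]](**choice.get("parameters", {}))
        print(f"{choice['tool']} -> {result}")
```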
With our agent in place, we can now run a request and see how the agent responds.
“Get all the links from Hacker News.”
The plan the agent generated:
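With the prompts sketched above, it might look like:

```
{"steps": ["open_chrome", "navigate_to_hackernews", "get_https_links"]}
```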
The outputs of each of the tasks being run:
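With the sketch above, the trace would look something like:

```
open_chrome -> Chrome opened
navigate_to_hackernews -> https://news.ycombinator.com
get_https_links -> ['https://news.ycombinator.com/...', ...]
```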
Pitfalls When Building Plan-and-Execute Agents
If we start every agent call by making a plan, we don’t always have the flexibility to handle problems that arise. How we handle failures, and what tools the agent has to mitigate them, matters a great deal. In our plan-and-execute approach, it isn’t easy to update the plan unless everything is wrapped in a try/except and we replan on failure.
More problems that could arise include:
- One call for planning, multiple calls for execution. While we presented things this way to illustrate the steps, using a single prompt to both plan the next step and execute it allows for a more dynamic agent.
- The plan is linear and created at the outset, so if we run into any problems, we aren’t able to continue effectively. This is where we can have an agent reflect on the outcome and choose the next best approach.
- The AI agent will fall apart if it has to interact with any graphical user interfaces (GUIs). This is where multiple agents, or custom tools that enable navigation of a specific application, can come in.
- As we add more tools, the prompts can become unwieldy. As the tool set grows, it’s best to store tool definitions outside the prompt and to scope agents to specific tasks.
- We are responsible for creating all the tools. Emerging standards look to establish best practices for interactions with service providers.
- All memory is currently saved in the class itself. We should move it to a central store so the data persists.
Fortunately, most of these issues can be solved with more advanced techniques and concepts.
Advanced Techniques and Concepts for Building AI Agents
With our understanding of plan-and-execute agents, we can build on what we know to create agents with more advanced capabilities. These include utilizing agentic frameworks and creating specific architectures.
While we covered the basics of building a plan-and-execute agent in this blog, you can push its abilities even further by incorporating advanced techniques such as:
- ReAct (Reason and Act): This merges the planning and execution prompts into one, allowing the agent to reason and act one step at a time.
- ADaPT (As-Needed Decomposition and Planning for complex Tasks): Perhaps best thought of as an extension of ReAct, ADaPT plans step by step, but also allows the agent to recursively break down problems when they arise or when steps are too large.
- Reflexion: This allows an agent to know how well it did or if it needs to try again after a task has been completed.
As for advanced concepts that you could incorporate into existing agents or alongside other techniques, you could explore options such as:
- Recursion: Agents call their own functions to keep tasks going until the problem is solved. Think of this as a `while True` loop, and be careful about infinite loops. Recursion holds a lot of power for autonomy.
- Multiple agents: These are perfect if you require agents to talk to one another or orchestrate a more complex task. Not every agent needs access to all tools or systems.
- Data standardization: Standards such as Anthropic’s Model Context Protocol (MCP) are emerging for integrating tools and data services with AI. Efforts to establish best practices for how AI agents interact with data services are becoming increasingly important.
- Frameworks: Agentic frameworks allow for the creation of agents, but often have their own paradigms that are built off of the ideas presented above. Autogen, LangGraph, and LlamaIndex are a few examples of agentic frameworks.
With advanced techniques and concepts, you can build more robust agents capable of things like comparing different scenarios when making decisions, or helping customer service chatbots handle more complex queries.
Start Building AI Agents Tailored to Your Business Needs
In this post, we looked at the anatomy of an AI agent through a practical plan-and-execute example, showcasing how agents use planning, execution, tools, and memory to accomplish tasks. Our basic implementation highlighted key considerations like tool selection and pitfalls, while also introducing more advanced concepts such as ReAct and multi-agent systems.
Mastering these concepts will prepare you to build agents capable of handling dynamic and generative tasks, including for high-value use cases like:
- Process automation where agents enhance workflows by navigating internal systems, aggregating data from multiple sources, and generating reports.
- Product development where AI-powered features can adapt to user requests and interact with various APIs and services.
WillowTree can accelerate your efforts by helping you implement the right frameworks and standards to develop and deploy AI agents. Learn more about our Data & AI consulting services.