How to Build an AI Agent with ChatGPT (2026 Guide)
Step-by-step build guide for a working AI agent with ChatGPT. Custom GPTs, the Assistants API, Actions, function calling, and when to graduate to a real agent framework.
Short answer. To build an AI agent with ChatGPT in 2026 you have three real options. The fastest is a Custom GPT (no code, 15 minutes, included with ChatGPT Plus). The most capable is the OpenAI Assistants API with Actions and function calling (a few hundred lines of code, $5-50 in tokens to test). The most production-ready is graduating to a dedicated agent framework (LangGraph, CrewAI, or Anthropic’s Agents SDK) once you outgrow a single chat session.
This guide walks all three paths. We do not assume you have built an agent before. We do assume you can install a Python package and read a JSON snippet. By the end of this page you will have a working agent that uses tools, holds state, and completes a multi-step task.
What “AI agent” actually means in this context
Before you build, get the definition straight. An AI agent is a system that takes a goal, breaks it into steps, calls external tools, observes the results, and iterates until the goal is met. The key word is iterate. A regular ChatGPT conversation answers your question once and stops. An agent loops: think, act, observe, think again, act again.
That loop is what makes an agent different from a chatbot. Chatbots talk. Agents do.
The simplest agent has three pieces:
- A model (the brain - in this guide, ChatGPT or GPT-4 class)
- A set of tools (functions the agent can call to read or change the world)
- A loop (the harness that keeps calling the model until the goal is done)
Everything you build below is a variation on those three pieces.
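In code, those three pieces fit together roughly like this. This is a toy sketch: `call_model` is a stub standing in for a real LLM call, and `lookup_weather` stands in for a real tool, but the loop is the same one every path below implements.

```python
# Toy agent loop: model + tools + loop.
# `call_model` and `lookup_weather` are hypothetical stand-ins for a
# real LLM client and a real integration.

def lookup_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real API call

TOOLS = {"lookup_weather": lookup_weather}

def call_model(goal: str, history: list) -> dict:
    # A real implementation calls an LLM here. This stub asks for the
    # tool once, then finishes with the observed result.
    if not history:
        return {"action": "tool", "name": "lookup_weather", "args": {"city": "Paris"}}
    return {"action": "finish", "answer": history[-1]}

def run_agent(goal: str) -> str:
    history = []
    while True:  # the loop: think, act, observe, repeat
        decision = call_model(goal, history)
        if decision["action"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["name"]](**decision["args"])  # act
        history.append(result)  # observe

print(run_agent("What's the weather in Paris?"))
```

Swap the stubs for a real model call and real tools and you have every agent described below.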
Path 1: Custom GPT (no-code, 15 minutes)
If you have a ChatGPT Plus, Team, or Enterprise subscription, you can create a Custom GPT without writing any code. This is the right starting point for 90% of first-time builders.
What a Custom GPT actually is
A Custom GPT is a configured version of ChatGPT with three things baked in: a system prompt (its personality and rules), a knowledge base (PDFs, docs, or files you upload), and Actions (HTTP calls it can make to your own APIs). It runs inside chatgpt.com. You can share it with anyone who has a ChatGPT account.
Step-by-step
1. Go to chatgpt.com, click your profile, then My GPTs, then Create a GPT.
2. In the Create tab, describe what your agent should do in plain English. The builder asks you follow-up questions to refine it. After 4-5 turns you have a usable draft.
3. Switch to the Configure tab. This is where the real configuration lives:
   - Name and description: keep these short and benefit-focused.
   - Instructions: the system prompt. Be specific. Tell it what to do, what NOT to do, what tone to use, and how to handle edge cases. 200-600 words is typical.
   - Conversation starters: 3-4 buttons your users see when they open the GPT.
   - Knowledge: upload up to 20 files (PDF, docx, txt, csv). The GPT can search them with retrieval.
   - Capabilities: toggle web browsing, DALL-E, code interpreter, and canvas as needed.
   - Actions: this is the agent layer. We cover it in step 4.
4. Add Actions to make the GPT into an agent. Click Create new action, then paste an OpenAPI 3.0 schema describing your API. The Custom GPT can now call your endpoints. Examples:
   - A “support assistant” that calls your CRM to look up the customer’s order status
   - A “lead qualifier” that calls your enrichment API to score inbound leads
   - An “audit bot” that calls your DataForSEO endpoint to pull live SERP data
5. Test in the preview pane on the right side of the builder. Trigger the Action with a test prompt, watch the request and response, and iterate until it behaves correctly.
6. Publish to “Only me”, “Anyone with a link”, or the GPT Store.
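For reference, a minimal Actions schema might look like the following. The endpoint, URL, and fields are hypothetical; the parts that matter are operationId (the name the GPT uses to call it) and the summary text (which the model reads to decide when to call it).

```yaml
openapi: 3.0.0
info:
  title: Order Lookup API
  version: 1.0.0
servers:
  - url: https://api.example.com   # hypothetical: your own API host
paths:
  /orders/{order_id}:
    get:
      operationId: getOrderStatus
      summary: Look up the current status of a customer order
      parameters:
        - name: order_id
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: Order status
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                  eta:
                    type: string
```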
When Custom GPT is the right choice
- The agent’s job is contained and predictable
- You do not need to embed it on your own website
- You are fine with users needing a ChatGPT account to use it
- Your latency tolerance is conversational, not sub-second
When you outgrow Custom GPT
- You need to embed the agent on your own site or app
- You need fine control over the system prompt at runtime per user
- You need rate limits, billing, or auth tied to your own backend
- You need long-running tasks (Custom GPTs reset state per conversation)
- You need a model other than GPT-4 class (Claude, Gemini, open-source)
That is when you move to path 2.
Path 2: OpenAI Assistants API with Actions and function calling
The Assistants API is OpenAI’s official way to build a stateful agent in your own code. It handles the agent loop, thread state, and tool execution for you. You control the system prompt, the tools, and where the conversation lives.
What you need
- An OpenAI API key with billing enabled
- Python 3.10+ (or Node.js, or any language with HTTP)
- ~30 minutes for the first working version
Minimal working agent in Python
This is the smallest useful agent. It has one tool (a calculator) and one job (answer math questions correctly). It demonstrates every concept you need.
```python
from openai import OpenAI
import json

client = OpenAI()

def calculator(expression: str) -> str:
    """Safely evaluate a math expression."""
    try:
        # Production: use a real parser, not eval. This is for demo only.
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"

assistant = client.beta.assistants.create(
    name="Math Agent",
    instructions="You are a careful math assistant. When the user asks a calculation, always use the calculator tool. Never guess.",
    model="gpt-4o",
    tools=[{
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate a math expression and return the result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "A math expression like '2+2' or '15*0.18'"}
                },
                "required": ["expression"]
            }
        }
    }]
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(thread_id=thread.id, role="user", content="What is 1247 * 38?")
run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)

while run.status == "requires_action":
    tool_outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        if call.function.name == "calculator":
            args = json.loads(call.function.arguments)
            result = calculator(args["expression"])
            tool_outputs.append({"tool_call_id": call.id, "output": result})
    run = client.beta.threads.runs.submit_tool_outputs_and_poll(
        thread_id=thread.id, run_id=run.id, tool_outputs=tool_outputs
    )

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```
Save this as agent.py, set the OPENAI_API_KEY environment variable, and run python agent.py. You now have a working agent.
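As the comment in the code says, eval is for demos only. One safer sketch uses Python's ast module to whitelist basic arithmetic; it handles the four operators and negation and rejects everything else.

```python
import ast
import operator

# Whitelist of allowed binary operators.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_calculator(expression: str) -> str:
    """Evaluate plain arithmetic only; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Names, calls, attribute access, etc. are all disallowed.
        raise ValueError("disallowed expression")
    try:
        return str(walk(ast.parse(expression, mode="eval")))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("1247 * 38"))        # 47386
print(safe_calculator("__import__('os')")) # Error: disallowed expression
```

Drop it in as the body of `calculator` and the rest of the agent is unchanged.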
What is happening
- You define an Assistant with a system prompt (instructions) and a list of tools (just calculator here).
- You create a Thread (a persistent conversation) and post the user message.
- You start a Run. The model decides whether to answer directly or call a tool.
- If the model wants to call a tool, the run pauses with status requires_action. You execute the tool yourself (run calculator("1247*38")), then submit the output. The model continues.
- When the model is done, the run completes and the answer is in the latest message.
That loop is the agent. Everything else is just adding more tools, smarter prompts, and better error handling.
Adding real tools
Replace calculator with anything: an HTTP request to your CRM, a database query, a Slack post, a file write. As long as you can write a function that takes JSON and returns a string, the agent can use it.
A useful pattern: keep a tools/ directory in your project, one file per tool, each exporting a function and an OpenAPI-style spec. Wire them all into the Assistant at startup. This is how production agents stay maintainable as the tool count grows.
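One possible sketch of that pattern, assuming a convention (ours, not OpenAI's) where each module in tools/ exports a run function and a SPEC dict:

```python
# Sketch of the one-file-per-tool pattern. Assumes each module inside
# the tools/ package exposes `run` (the callable) and `SPEC` (its
# function-calling schema) -- a project convention, not an OpenAI API.
import importlib
import pkgutil

def load_tools(package_name: str = "tools"):
    """Collect every tool's spec and callable from the package."""
    package = importlib.import_module(package_name)
    specs, functions = [], {}
    for mod_info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{mod_info.name}")
        specs.append({"type": "function", "function": module.SPEC})
        functions[module.SPEC["name"]] = module.run
    return specs, functions

# At startup:
#   specs, functions = load_tools()
#   assistant = client.beta.assistants.create(..., tools=specs)
# Inside the requires_action loop:
#   output = functions[call.function.name](**json.loads(call.function.arguments))
```

Adding a tool then means adding one file, with no changes to the agent loop.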
Cost expectations
A simple agent run costs $0.001-$0.05 depending on context size and tool count. A complex multi-step research agent that processes documents can run $0.50-$5 per task. Budget $20-100 for the development phase, then watch usage as you scale.
Path 3: Graduating to a real agent framework
The Assistants API is fine for single-purpose agents. Once you need multiple agents collaborating, complex memory, branching plans, or evaluation harnesses, you graduate to a framework.
The serious options as of 2026:
- LangGraph (LangChain’s agent framework). Most flexible, most mature ecosystem, steepest learning curve. State machine based. Ideal for production agent systems that need observable, debuggable graphs.
- CrewAI. Multi-agent collaboration as the headline feature. You define “crews” of agents with roles and goals. Friendly Python API. Less flexible than LangGraph, faster to prototype.
- Anthropic Agents SDK. Newer, tightly integrated with Claude. Excellent if you have already chosen Claude as your model. Strong on tool use and subagent spawning.
- OpenAI Swarm (experimental). Lightweight multi-agent handoff library from OpenAI. Useful for simple routing, less so for complex orchestration.
Pick the framework that matches your model choice. If you are GPT-first, LangGraph or CrewAI. If you are Claude-first, the Anthropic SDK. If you are mixing models, LangGraph.
When you migrate from path 2 to path 3, you will rewrite the loop, but the tools and prompts port directly. Plan for ~2-3 days of rework if you have a working Assistants API agent.
How to choose between the three paths
| Need | Custom GPT | Assistants API | Framework |
|---|---|---|---|
| Build today, no code | Yes | No | No |
| Embed on your own site | No | Yes | Yes |
| Multi-step, multi-tool task | Limited | Yes | Yes |
| Multiple agents collaborating | No | Limited | Yes |
| Switch model providers | No | No | Yes |
| Production observability | Limited | Limited | Yes |
| Cost to test | $0 | $5-50 | $50-200 |
Rule of thumb: start at path 1 to learn, move to path 2 the moment you need control, move to path 3 the moment you need scale.
Common mistakes building your first agent
Mistake 1: too many tools too soon. Five tools is plenty for a first agent. With 20 tools, the model gets confused, calls the wrong one, and your debugging gets brutal. Add tools only when you have a working version without them.
Mistake 2: vague system prompts. “You are a helpful assistant” produces a helpful chatbot, not a useful agent. Specify the goal, the constraints, the output format, and the failure modes. 300 words of crisp instructions beats 50 words of platitudes.
Mistake 3: no error handling on tool calls. When a tool fails (timeout, 500, malformed JSON), the agent has to know. Always return a structured error string, never let the tool throw silently.
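A wrapper like the following is one way to enforce that; it is a sketch, so adapt the error shape to your own agent:

```python
import json

def safe_tool_call(fn, arguments_json: str) -> str:
    """Run a tool and always return a string the model can read,
    even when the arguments or the tool itself fail."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as e:
        return json.dumps({"error": f"Malformed arguments: {e}"})
    try:
        return str(fn(**args))
    except Exception as e:
        # Structured error: the model sees what went wrong and can retry.
        return json.dumps({"error": type(e).__name__, "detail": str(e)})

# Usage inside the requires_action loop:
#   output = safe_tool_call(functions[call.function.name], call.function.arguments)
```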
Mistake 4: testing only the happy path. Run the agent against weird inputs, contradictory user messages, and tools that return empty results. Most agents fall apart in the first 20 minutes of unfriendly testing.
Mistake 5: skipping cost monitoring. A misbehaving agent in a loop can rack up $50 of API bills in an hour. Set hard token limits per run and log cost per task from day one.
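A minimal cost guard might look like this. The per-token prices are placeholders, not current OpenAI rates, so check your provider's pricing page before trusting the numbers:

```python
# Sketch of a per-task cost guard. Prices below are ASSUMED placeholder
# rates -- substitute your provider's actual pricing.
PRICE_PER_1M_INPUT = 2.50    # assumed $/1M input tokens
PRICE_PER_1M_OUTPUT = 10.00  # assumed $/1M output tokens

class BudgetExceeded(Exception):
    pass

class CostMeter:
    """Accumulates token usage and aborts when a dollar budget is hit."""
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def record(self, prompt_tokens: int, completion_tokens: int):
        self.spent_usd += prompt_tokens / 1e6 * PRICE_PER_1M_INPUT
        self.spent_usd += completion_tokens / 1e6 * PRICE_PER_1M_OUTPUT
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(
                f"Spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f} budget"
            )

# After each run completes, feed it that run's reported usage, e.g.:
#   meter = CostMeter(budget_usd=0.50)
#   meter.record(run.usage.prompt_tokens, run.usage.completion_tokens)
```

Raise the exception, log it, and stop the loop; a runaway agent then costs you cents, not $50.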
What to build first
If you are blocked on what to build, three first-agent ideas that reliably work:
- A research agent that takes a topic, searches the web (via a search tool), reads 5 articles, and writes a synthesis. Tests web fetching, summarisation, and structured output.
- A meeting-prep agent that takes a calendar event, looks up the attendees on LinkedIn, pulls recent news, and writes a 1-page brief. Tests enrichment, multi-source aggregation, and formatting.
- A support-triage agent that reads an incoming customer email, classifies the issue, looks up their account, and drafts a reply. Tests classification, internal-API calls, and policy-aware writing.
Each of these can be built in a day on path 2 and gives you a real artefact to show.
FAQ
Can I build an AI agent with ChatGPT for free?
Yes, with limits. Custom GPTs are included with ChatGPT Plus ($20/month). The Assistants API has no free tier, though new accounts have sometimes received a small trial credit; budget a few dollars for testing. For a fully free path, you can use the open-source models on Hugging Face plus a framework like CrewAI, but you sacrifice quality and latency.
Do I need to know how to code to build an agent with ChatGPT?
For Custom GPTs, no. The builder is conversational. For the Assistants API, basic Python or JavaScript is required. You do not need to be an experienced engineer; reading and adapting a 50-line script is enough.
What is the difference between a Custom GPT and an OpenAI Assistant?
A Custom GPT runs inside chatgpt.com and is configured through the ChatGPT UI. An Assistant runs through the API in your own code. Custom GPTs are faster to build and demo. Assistants are more flexible, embeddable, and production-ready. The Assistants API supports more tools, longer conversations, and finer control.
How long does it take to build an AI agent with ChatGPT?
A Custom GPT: 15-30 minutes for a useful first version, 1-3 hours to polish. An Assistants API agent: 2-6 hours for a single-tool agent, 1-3 days for a multi-tool production version. A framework-based multi-agent system: 1-2 weeks for the first usable build.
Can a ChatGPT agent take real actions like sending emails or making purchases?
Yes, through Actions (Custom GPT) or function calling (Assistants API). You define the tool, the agent decides when to call it, and your code performs the action. For sensitive actions like payments or sending emails, always require explicit user confirmation in your tool’s logic, never let the agent execute silently.
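One way to enforce that confirmation (a sketch, with a hypothetical email tool): the tool the agent calls can only queue the action, and a separate function, callable only from your own UI, executes it. Crucially, the approval never comes from model output.

```python
# Sketch: pending-approval queue for sensitive actions. The model can
# only REQUEST the action; your own UI flips it to approved.
PENDING: dict[str, dict] = {}

def request_send_email(to: str, body: str) -> str:
    """The tool exposed to the agent. Queues the action, never sends."""
    action_id = f"email-{len(PENDING) + 1}"
    PENDING[action_id] = {"to": to, "body": body, "approved": False}
    return f"Queued as {action_id}. Waiting for user approval."

def approve(action_id: str) -> str:
    """Called by your app after the human clicks approve -- never by the agent."""
    action = PENDING[action_id]
    action["approved"] = True
    # ... actually send via your email provider here ...
    return f"Sent to {action['to']}"
```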
Is OpenAI’s Assistants API the same as agents from Anthropic or Google?
The concepts are the same: model plus tools plus a loop. The implementations differ. Anthropic’s Claude has Computer Use and an Agents SDK with strong subagent support. Google’s Vertex AI Agents target enterprise Google Cloud workloads. OpenAI’s Assistants API is the most popular due to GPT-4’s adoption, but functionally similar to the others.
How do I keep my ChatGPT agent secure?
Three rules. Never put secrets in the system prompt or knowledge files: they leak. Always validate tool inputs server-side: the agent will sometimes pass weird arguments. Always require auth on the APIs your agent calls: never assume the agent is the only caller.
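As a sketch of the second rule, with a hypothetical order-lookup tool:

```python
# Sketch of server-side validation for tool arguments. Never trust the
# model to send well-formed input; check before touching your database.
def validated_lookup(args: dict) -> str:
    order_id = args.get("order_id")
    if not isinstance(order_id, str):
        return "Error: order_id must be a string."
    if not order_id.isalnum() or len(order_id) > 32:
        # Reject anything that could be an injection attempt.
        return "Error: order_id must be alphanumeric, max 32 chars."
    # Safe to use in a parameterised query or URL path from here.
    return f"Looking up order {order_id}"
```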
What’s the future of AI agents built on ChatGPT?
OpenAI is moving toward more autonomous, longer-running agents (the “Operator” line of products). Custom GPTs and the Assistants API are converging with these new capabilities. Expect agents that can run for hours, manage their own memory, and coordinate with other agents to become standard within 12-18 months. Building a small agent today is the best preparation for that shift.
Related guides
- Agent SEO: how to make your tools and APIs discoverable to agents
- Claude Code vs Cursor: which AI coding tool actually wins
Published by Online Optimisers. Want this audited or built for your business? We run agent builds for ambitious local and B2B teams. Book a 30-minute call.