The n8n AI Agent Node: Complete Guide
What the AI Agent Node Actually Is
Most n8n nodes do one thing: take input, transform it, pass it on. The AI Agent node is different. It doesn't follow a fixed path. It reasons. You give it a goal, a set of tools, and a model, and it figures out the steps on its own. It can call tools, observe results, decide what to do next, and loop until the job is done or it hits a limit.
Under the hood, the AI Agent node is built on LangChain and implements what's called a ReAct loop: Reason, Act, Observe, Repeat. The LLM acts as the brain. It reads the incoming message, looks at the tools available, decides which one to call (if any), gets the result back, and decides whether the task is complete or whether it needs to do more. This loop can run multiple times per single user message.
That's the core distinction from a basic LLM node. A basic LLM node is a one-shot call: prompt in, text out. The AI Agent node is a loop: it can take 10 actions before it gives you a final answer.
When to Use an Agent vs. a Simple LLM Call
This is the most important question to answer before you start building. Agents are powerful, but they're also slower, more expensive, and harder to debug than a direct LLM call. Don't reach for the AI Agent node by default.
| Scenario | Use Agent? | Why |
|---|---|---|
| Summarize a document | No | Single LLM call is faster and cheaper |
| Classify an email into categories | No | Fixed output, no tool use needed |
| Extract structured fields from text | No | LLM Chain + Structured Output Parser handles this |
| Answer a question that needs a live web search | Yes | Agent decides when and how to call the search tool |
| Multi-turn chat that remembers context | Yes | Agent + Memory maintains conversation state |
| Task requiring multiple API calls in unknown order | Yes | Agent reasons about which tool to call next |
| Fixed sequential pipeline (always same steps) | No | Standard workflow nodes are more predictable |
| Research task pulling from multiple sources | Yes | Agent can query, synthesize, and verify iteratively |
| Customer support bot with CRM lookups | Yes | Agent decides when to look up data vs. answer directly |
| Translate text to another language | No | One-shot LLM call, no reasoning loop needed |
The rule of thumb: if the AI needs to decide what to do next based on intermediate results, use an agent. If the steps are predictable and fixed, use a standard workflow with LLM nodes where needed. If you're still getting familiar with how n8n workflows are structured, the first workflow guide is a good starting point before diving into agents.
Your First Agent: Chat Trigger + AI Agent + Model
Before going deep on every option, let's build the simplest possible agent so you have a working mental model. Three nodes. Five minutes.
Add a Chat Trigger node
This is your entry point. The Chat Trigger creates a built-in chat interface you can use directly in n8n for testing. In production, it can be embedded in a website or connected to a messaging platform. No configuration is needed to get started; just add it to the canvas.
Add the AI Agent node
Click the + connector on the Chat Trigger and search for "AI Agent". Add it. You'll see the node has connection points along the bottom for sub-nodes: Chat Model, Memory, Tool, and Output Parser. These are not regular node connections; they're sub-node slots that extend the agent's capabilities.
Connect a Chat Model
Click the + under "Chat Model" on the AI Agent node. Search for your provider (OpenAI, Anthropic, Google Gemini, Groq, Mistral, or another). Select it, add your API credentials, and choose a model. For OpenAI, gpt-4o-mini is a solid starting point. For Anthropic, claude-3-5-haiku is fast and affordable.
The Chat Model is the only required sub-node. Everything else (memory, tools, output parsers) is optional.
Test it
Click the Chat button at the bottom of the canvas. A chat window opens. Type anything. The agent will respond using the connected model. You now have a working AI agent: no tools yet, just a direct LLM conversation. From here, every section below adds capability.
Import This Workflow
The minimal agent above is available as an importable workflow. Copy its JSON and paste it directly into the n8n canvas with Ctrl+Shift+V, swap in your OpenAI credentials, and you're running.
Node Parameters: The Prompt Section
Open the AI Agent node and you'll see two main parameters before you get to the Options section.
Source for Prompt (User Message)
This controls where the agent gets its user message from. Two choices:
- Take from previous node automatically: The agent looks for a field called `chatInput` in the incoming data. The Chat Trigger node outputs exactly this field, so if you're using Chat Trigger, leave the default and it just works.
- Define below: You manually write the prompt, either as static text or as an n8n expression. Use this when the agent is triggered by something other than a chat (a webhook, a schedule, a form submission) and you want to construct the prompt dynamically from the incoming data.
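For example, with Define below selected and a webhook trigger, the prompt can be assembled from the incoming payload. The field names here (`body.email`, `body.message`) are hypothetical; use whatever fields your webhook actually receives:

```
Summarize this support request from {{ $json.body.email }}:

{{ $json.body.message }}
```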
Require Specific Output Format
When you turn this on, a new sub-node slot appears on the agent: Output Parser. This tells the agent it must return its response in a specific structured format rather than free-form text. When enabled, you connect one of three output parsers (covered in detail in the Output Parsers section below). Leave this off unless you specifically need structured JSON output from the agent.
Node Options: Every Setting Explained
Click Add Option at the bottom of the AI Agent node to reveal these settings. None are required; they're all refinements.
System Message
This is the most important option in the entire node. The system message is a set of instructions sent to the LLM before any user message. It defines the agent's role, personality, constraints, what tools to use and when, and how to format responses. Think of it as the agent's job description and rulebook combined.
You can use n8n expressions here to make it dynamic. For example, inject the current user's name or account type from the incoming data to personalize the agent's behavior per request.
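For instance, assuming the incoming item carries hypothetical `userName` and `accountType` fields, a system message could open with:

```
You are a support assistant helping {{ $json.userName }}, who is on the
{{ $json.accountType }} plan. Adjust your level of detail to their plan tier.
```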
Max Iterations
The maximum number of times the agent's reasoning loop can run before it's forced to stop and return whatever it has. Each iteration may involve one LLM call plus one tool call. If your agent is doing complex multi-step research, you might need to raise this. If you're worried about runaway costs, lower it. For simple agents with one or two tools, 5 is usually plenty.
Return Intermediate Steps
When turned on, the agent's output includes every step it took (every tool call, every observation, every reasoning step), not just the final answer. This is extremely useful for debugging. In production, you'd typically leave this off and only enable it when something isn't working as expected. The intermediate steps appear in the node's output data and in the execution logs.
Automatically Passthrough Binary Images
When enabled, any binary image data in the incoming workflow data is automatically forwarded to the agent as image-type messages. This is what enables multimodal workflows. For example, a user uploads an image via a form, and the agent can see and analyze it. Requires a model that supports vision (GPT-4o, Claude 3, Gemini 1.5, etc.).
Enable Streaming
When enabled, the agent sends its response back token by token as it generates it, rather than waiting until the full response is ready. This makes the chat feel much more responsive for long answers. For streaming to work, your trigger must support it. The Chat Trigger and Webhook node (with Response Mode set to Streaming) both do. If you're using a different trigger, disable this.
Custom Tracing
Add custom key-value pairs that get attached to tracing events for this agent. This is specifically for use with LangSmith or similar LLM observability tools. If you're not using a tracing platform, ignore this option entirely.
Writing a Good System Prompt
The system message is where most of the agent's behavior is defined. A vague system prompt produces unpredictable results. A well-structured one makes the agent consistent and reliable. Here's a template that works well:
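One possible shape for that template; the role, tool names, and constraints are placeholders to adapt to your own agent:

```
You are a [role], responsible for [goal].

Tools available to you:
- [tool_name]: use this when [specific condition].
- [tool_name]: use this when [specific condition].

Rules:
- [explicit constraint, e.g. never do X without first verifying Y]
- If a tool returns no results, say so instead of guessing.

You are done when [completion criteria]. Then reply in [format].
```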
A few things that consistently improve agent behavior:
- Tell the agent when to use each tool, not just that the tools exist. "Use the calendar tool when the user asks about scheduling or availability" is better than just listing the tool.
- Set explicit constraints. "Never share order details without first verifying the customer's email" prevents the agent from doing things you don't want.
- Define what "done" looks like. Without this, agents sometimes keep calling tools unnecessarily.
- Use n8n expressions to inject dynamic context: `Today's date is {{ $now.toFormat('yyyy-MM-dd') }}.` This is especially useful for date-aware agents.
Supported Chat Models
The AI Agent node requires a Chat Model sub-node. The model must support function/tool calling; not all models do. Here are the officially supported providers and what to know about each. For a deeper comparison of cost, speed, and quality across these providers, see the Choosing the Right AI Model guide.
| Provider | Node Name | Best For | Notes |
|---|---|---|---|
| OpenAI | OpenAI Chat Model | General purpose, most reliable tool calling | GPT-4o for quality, GPT-4o-mini for cost. Most widely tested with n8n. |
| Anthropic | Anthropic Chat Model | Long context, nuanced reasoning, safety | Claude 3.5 Sonnet/Haiku. Excellent for complex instructions and large documents. |
| Google Gemini | Google Gemini Chat Model | Multimodal, Google ecosystem | Gemini 1.5 Pro/Flash. Strong vision capabilities. Free tier available. |
| Google Vertex AI | Google Vertex Chat Model | Enterprise Google Cloud deployments | Same models as Gemini but via Vertex AI API with enterprise billing. |
| Azure OpenAI | Azure OpenAI Chat Model | Enterprise compliance, data residency | OpenAI models hosted on Azure. Required for some regulated industries. |
| Groq | Groq Chat Model | Speed, extremely fast inference | Llama, Mixtral, Gemma models. Best for latency-sensitive workflows. |
| Mistral | Mistral Cloud Chat Model | European hosting, open-weight models | Mistral Large/Small. Good alternative to OpenAI with EU data residency. |
| Ollama | Ollama Chat Model | Local/self-hosted, privacy, no API costs | Runs models on your own hardware. Requires Ollama server. Tool calling support varies by model. |
| AWS Bedrock | AWS Bedrock Chat Model | AWS ecosystem, enterprise | Access to Claude, Llama, Titan, and others via AWS. |
| DeepSeek | DeepSeek Chat Model | Cost-effective, strong reasoning | Very competitive pricing. DeepSeek-R1 has strong reasoning capabilities. |
Each Chat Model sub-node has its own settings. The most important ones are:
- Model: Which specific model to use (e.g., `gpt-4o`, `claude-3-5-sonnet-20241022`)
- Temperature: Controls randomness. 0 = deterministic, 1 = creative. For agents doing factual tasks or tool use, keep this at 0 or 0.1. For creative tasks, 0.7 to 1.0.
- Max Tokens: Maximum length of the model's response. Set this to avoid unexpectedly long (and expensive) outputs.
- Timeout: How long to wait for a response before failing. Important for production workflows.
Tools: What the Agent Can Do
Tools are what separate an AI Agent from a chatbot. Each tool is a capability the agent can invoke during its reasoning loop. The agent reads the tool's name and description, decides whether it needs it, calls it with the right parameters, and uses the result to continue reasoning.
You connect tools by clicking the + under "Tool" on the AI Agent node. You can connect as many tools as you want, but more tools means more decisions for the agent, which can lead to confusion and higher costs. Start with the minimum set needed.
Built-in Tool Nodes
n8n ships with several ready-to-use tool nodes that require no custom configuration:
Calculator (Math Operations)
Performs arithmetic. Use when the agent needs to compute exact numbers rather than estimate. No configuration needed.
Wikipedia (General Knowledge)
Searches Wikipedia for factual information. Good for general knowledge lookups without needing a paid search API.
Wolfram Alpha (Computational Knowledge)
Scientific queries, unit conversions, data analysis. Requires a Wolfram Alpha API key.
SerpAPI (Google Search)
Real-time web search via Google. Use when the agent needs current information. Requires a SerpAPI key.
Code (Execute JavaScript or Python)
Runs custom JavaScript or Python. Use for data transformations or logic that no other tool handles.
HTTP Request (Call Any API)
The most flexible tool. Configure it to call any REST API. Define the description carefully so the agent knows when to use it.
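As an illustration of a careful description, a write-up for a hypothetical order-status API might read:

```
Look up the current status of a customer order. Use this whenever the user
asks where their order is or when it will arrive. Input: the order ID
(a string like "ORD-12345"). Returns JSON with status, carrier, and ETA.
```

The agent only sees the name and description when deciding whether to call a tool, so the description should spell out both the trigger condition and the expected input.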
App Integration Tools
The Tools Agent supports over 100 native app integrations as tools. These include Gmail, Google Sheets, Google Calendar, Slack, Notion, Airtable, HubSpot, Jira, GitHub, Postgres, MySQL, MongoDB, Stripe, Shopify, Telegram, Discord, and many more. Each one exposes specific operations (read, write, search, create) that the agent can call.
To add one, click + under Tool, search for the app name, and configure it with your credentials and the specific operation you want to expose. The agent will use the tool's name and description to decide when to call it.
The Workflow Tool: Your Most Powerful Option
The Call n8n Workflow tool lets your agent call another n8n workflow as a tool. This is the foundation of multi-agent architectures. You create specialized sub-workflows (one for CRM lookups, one for sending emails, one for database queries), and your main agent calls them as needed. Each sub-workflow can have its own credentials, error handling, and logic, completely isolated from the main agent.
The $fromAI() Function: Dynamic Tool Parameters
When configuring a tool node (like Google Sheets or Gmail), you normally hardcode the parameters. But with $fromAI(), you let the agent fill in the parameters dynamically based on context.
The $fromAI(key, description, type) function takes three arguments: a key name (what the AI uses to identify the value), an optional description (gives the AI context about what to look for), and an optional type hint (string, number, boolean, json). The AI model fills in the value from its context, from the conversation, from previous tool results, or by asking the user if it can't find it.
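For example, in a Gmail tool's To and Subject fields you might write (the keys and descriptions here are illustrative):

```
{{ $fromAI('to_email', 'The email address of the recipient', 'string') }}
{{ $fromAI('subject', 'A short subject line for the email', 'string') }}
```

At runtime, the model supplies these values from the conversation instead of you hardcoding them.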
Human-in-the-Loop for Tools
For sensitive tool calls (sending emails, deleting records, making payments), you can require human approval before the agent executes the tool. In the Tools Panel, there's a "Human review" section where you configure an approval channel (Chat, Slack, Telegram, email). When the agent wants to use a gated tool, the workflow pauses and sends an approval request. The reviewer approves or denies, and the workflow continues accordingly.
Memory: Giving the Agent Context
Without memory, every message to the agent starts a completely blank conversation. The agent has no idea what was said before. Memory fixes this by storing conversation history and making it available on each turn.
You connect a memory node by clicking the + under "Memory" on the AI Agent node. Here are all the memory types available:
| Memory Type | Persistence | Best For | Requires |
|---|---|---|---|
| Simple Memory | Session only (lost on restart) | Testing, demos, single-session interactions | Nothing, built into n8n |
| Window Buffer Memory | Session only | Conversations where you want a fixed context window | Nothing, built into n8n |
| Postgres Chat Memory | Persistent | Production chatbots, multi-session conversations | Postgres database |
| Redis Chat Memory | Persistent | High-performance production, fast reads | Redis server |
| Motorhead Memory | Persistent | Managed memory service with summarization | Motorhead server |
| Xata Memory | Persistent | Serverless Postgres-compatible storage | Xata account |
| Zep Memory | Persistent | Semantic memory with knowledge graph | Zep server |
| Vector Store Memory | Persistent | Semantic recall over long histories, RAG-style memory | Embeddings node + Vector Store node |
Simple Memory (Start Here)
Stores the full conversation history in n8n's in-memory store. Zero setup. Perfect for testing. Lost when the workflow restarts or n8n restarts. Configurable: set how many previous messages to keep (default is the last 5 interactions).
Window Buffer Memory (Cost Control)
Like Simple Memory but with a hard cap on how many messages are kept. When the window fills, the oldest messages are dropped. Use this to prevent context from growing indefinitely and driving up token costs.
Postgres and Redis Chat Memory (Production Choice)
Stores conversation history in an external database. Survives restarts. Supports multiple concurrent users via Session ID. Use Postgres if you already have it. Use Redis if you need sub-millisecond read speeds.
Vector Store Memory (Semantic Recall)
On each turn, retrieves the most semantically relevant past messages, not just the most recent ones. Requires an Embeddings sub-node and a Vector Store sub-node. See the Vector Stores and RAG guide for setup.
All persistent memory types use a Session ID to separate conversations. This is critical. If you don't set a unique Session ID per user or conversation, all users will share the same memory. A common pattern:
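With the Chat Trigger, the trigger's own session ID is the natural choice; for other channels, use a stable per-user identifier. The Telegram example below assumes the standard Telegram trigger output; verify the field path against your trigger's actual data:

```
{{ $json.sessionId }}
{{ $json.message.chat.id }}
```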
Chat Memory Manager
For advanced memory management beyond what the standard memory nodes offer, n8n provides the Chat Memory Manager node. Use it when you need to: inject custom messages into the agent's memory (to give it context it didn't learn from the conversation), check and reduce memory size programmatically, or manage memory in workflows where you can't attach a memory sub-node directly to the agent.
Output Parsers: Getting Structured Data
By default, the AI Agent returns free-form text. That's fine for chat. But if the agent's output needs to feed into downstream nodes (a database insert, an API call, a conditional branch), you need structured data. Output parsers enforce a specific format on the agent's response.
To use an output parser, first enable Require Specific Output Format in the node parameters. This reveals the Output Parser sub-node slot. Then connect one of the three available parsers.
Structured Output Parser
The most common parser. Forces the LLM to return a JSON object matching a schema you define. Two ways to configure it:
- From a JSON example: paste a sample of the output you want, and the schema is generated from it.
- From a JSON Schema: define the schema manually for full control over types, required fields, and nesting.
Note: n8n's JSON Schema implementation doesn't support $ref for referencing other schemas. Keep schemas self-contained.
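A small, self-contained schema sketch for a hypothetical ticket-triage agent (note there's no `$ref` anywhere):

```json
{
  "type": "object",
  "properties": {
    "category": { "type": "string", "enum": ["billing", "technical", "other"] },
    "priority": { "type": "string", "enum": ["low", "medium", "high"] },
    "summary": { "type": "string" }
  },
  "required": ["category", "priority", "summary"]
}
```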
Auto-fixing Output Parser
A wrapper around the Structured Output Parser that adds resilience. When the LLM's output doesn't match the schema, instead of failing, this parser sends the malformed output back to the LLM with instructions to fix it. It requires a second Chat Model connection (the "fixing" model) and a Structured Output Parser connection (the schema to fix toward).
Use this in production systems where occasional parsing failures are unacceptable. Avoid it in cost-sensitive or time-critical workflows: each fix attempt adds an extra LLM call.
Item List Output Parser
The simplest parser. Forces the LLM to return a plain list of items: strings, keywords, tags, categories. Use this when you need an array of simple values rather than a complex object. No schema definition needed; just connect it and the agent will return a JSON array.
| Parser | Output Format | Best For | Reliability |
|---|---|---|---|
| Structured Output Parser | JSON object matching your schema | Structured data for downstream nodes | Good with capable models |
| Auto-fixing Output Parser | JSON object (with retry on failure) | Production systems needing reliability | Best, retries on failure |
| Item List Output Parser | JSON array of strings | Tags, keywords, simple lists | Very good, simple format |
How the Agent Loop Works
Understanding the loop helps you debug unexpected behavior and write better system prompts. Here's exactly what happens when a message hits the AI Agent node:
1. Assemble context: the system message, memory (if connected), the user message, and the definitions of every connected tool are bundled into one request.
2. Reason: the LLM reads the context and decides whether to answer directly or call a tool.
3. Act: if a tool was chosen, n8n executes it with the parameters the model supplied.
4. Observe: the tool's result is appended to the context and handed back to the model, which returns to step 2.
5. Finish: once the model produces a final answer (or Max Iterations is hit), the response is optionally passed through an output parser, memory is updated, and the output flows to the next node.
Each pass through steps 2–4 counts as one iteration toward the Max Iterations limit. A simple question with no tool use takes one iteration. A research task that searches the web three times and synthesizes results takes four or more iterations.
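The loop above can be sketched in a few lines of JavaScript. This is a simplified illustration, not n8n's actual implementation: `callModel` and the tool registry are stand-ins for the real LLM and tool sub-nodes.

```javascript
// Simplified sketch of a ReAct-style agent loop (illustrative only).
function runAgent(userMessage, tools, callModel, maxIterations = 10) {
  const context = [{ role: "user", content: userMessage }];
  for (let i = 0; i < maxIterations; i++) {
    const decision = callModel(context);                      // Reason
    if (decision.finalAnswer !== undefined) {
      return decision.finalAnswer;                            // task complete
    }
    const result = tools[decision.tool](decision.args);       // Act
    context.push({ role: "tool", content: String(result) });  // Observe, repeat
  }
  return "Stopped: Max Iterations reached.";
}

// Demo: a fake model that calls a calculator once, then answers.
const demoTools = { calculator: ({ a, b }) => a + b };
let step = 0;
const fakeModel = (ctx) =>
  step++ === 0
    ? { tool: "calculator", args: { a: 2, b: 3 } }
    : { finalAnswer: "The sum is " + ctx[ctx.length - 1].content };

console.log(runAgent("add 2 and 3", demoTools, fakeModel)); // The sum is 5
```

Note how Max Iterations acts purely as a safety valve: the loop exits early the moment the model produces a final answer.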
Agent Architecture Patterns
As your use cases grow, the architecture you choose matters. Here are the four patterns that cover most real-world scenarios. If you want to go further and have Claude build these architectures for you automatically, check out the Automate n8n with Claude guide.
One agent, multiple tools
The simplest setup. One AI Agent node with all tools connected. Good for prototypes and straightforward tasks with fewer than 5–6 tools. Gets unwieldy fast as complexity grows.
Classify → Route → Specialist
A classifier (simple LLM call or Switch node) categorizes the incoming request and routes it to a specialized agent. Each specialist has focused tools and a tight system prompt. Best when you have clear, distinct request categories.
Master agent → Sub-agents
A master agent has Workflow tools that call specialized sub-workflows, each containing their own AI Agent. The master decides which specialist to invoke. Scales well for enterprise use cases with many domains.
Agent A → Agent B → Agent C
Agents process in stages, each handling one phase of a pipeline. Research → Analysis → Report is a classic example. Each agent receives the previous agent's output as its input.
A Real Example: Support Agent with CRM Lookup
Here's a complete, production-ready agent setup you can replicate. A customer support agent that can look up orders, check product availability, and escalate tickets.
System Message for this agent:
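A sketch of a system message for this agent, with placeholder tool names:

```
You are a customer support agent for an online store.

Tools:
- order_lookup: use when the user asks about an order's status. Always
  verify the customer's email before sharing order details.
- product_search: use when the user asks whether a product is available.
- escalate_ticket: use when you cannot resolve the issue, or when the
  user asks for a human.

Never invent order information. If a lookup fails, apologize and escalate.
You are done when the question is answered or a ticket has been created.
```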
The HTTP Request tool for order lookup:
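A sketch of how that HTTP Request tool could be configured; the URL and fields are hypothetical stand-ins for your own order API:

```
Description: Look up an order by ID. Use when the user asks about order status.
Method:      GET
URL:         https://api.example.com/orders/{{ $fromAI('order_id', 'The order ID the customer provided', 'string') }}
Headers:     Authorization: Bearer <your credential>
```

The `$fromAI()` expression lets the agent extract the order ID from the conversation and inject it into the URL at call time.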
Debugging Agent Workflows
When an agent behaves unexpectedly, the execution logs are your best friend. Go to Executions, click the relevant execution, click the AI Agent node, and expand the output. You'll see every step the agent took: what it received, what it decided, which tools it called, what they returned, and what it concluded.
Common failure patterns and fixes:
| Symptom | Likely Cause | Fix |
|---|---|---|
| Agent ignores tools entirely | Tool descriptions are vague or the system prompt doesn't mention when to use them | Rewrite tool descriptions to be specific about when and how to use each tool |
| Agent calls the wrong tool | Tool descriptions overlap or are ambiguous | Add clearer differentiation; reduce the number of tools |
| Agent loops without stopping | Tool returns unclear results; no completion criteria in prompt | Add explicit "you're done when..." instructions to the system prompt; lower Max Iterations |
| Agent makes up tool results | Tool is returning errors or empty responses that the agent doesn't recognize | Add error handling to tool nodes; add instructions for handling empty results in the system prompt |
| Agent forgets previous messages | No memory node connected, or Session ID is wrong | Add a memory node; verify Session ID is consistent per user |
| Structured output fails | Model doesn't reliably follow the schema | Use Auto-fixing Output Parser; simplify the schema; use a more capable model |
Keeping Costs Under Control
Agents can get expensive fast if you're not careful. Each iteration involves at least one LLM call, and each LLM call sends the full context (system prompt + memory + conversation history + tool results) to the model. Here's how to keep costs reasonable. For a full breakdown of which providers offer the best cost-to-quality ratio, see the AI Inference Providers guide.
- Right-size your model. Use the smallest model that handles your use case. GPT-4o-mini or Claude Haiku for most tasks. Reserve GPT-4o or Claude Sonnet for tasks that genuinely need the extra capability.
- Cap memory. Use Window Buffer Memory with a small window (5 to 10 messages) instead of unlimited Simple Memory. Every message in memory costs tokens on every turn.
- Lower Max Iterations. If your agent rarely needs more than 3 tool calls, set Max Iterations to 5. This prevents runaway loops from burning tokens.
- Keep system prompts tight. Every token in the system prompt is sent on every turn. A 2,000-token system prompt on a 100-message conversation is 200,000 tokens just for the prompt.
- Cache frequent queries. If the same questions come up repeatedly, consider caching responses in a database and checking the cache before hitting the agent.
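To make the system-prompt math above concrete, here's a quick back-of-envelope calculation; the per-token price is a hypothetical placeholder, so check your provider's actual pricing:

```javascript
// Tokens spent on the system prompt alone across a long conversation.
const systemPromptTokens = 2000; // size of the system prompt
const turns = 100;               // messages in the conversation
const totalPromptTokens = systemPromptTokens * turns;

// Hypothetical input price of $0.15 per 1M tokens.
const costUsd = (totalPromptTokens / 1_000_000) * 0.15;

console.log(totalPromptTokens);     // 200000
console.log(costUsd.toFixed(2));    // 0.03
```

Thirty cents per million sounds cheap until you multiply by thousands of conversations a day, which is why trimming the system prompt pays off.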
The Full Picture
The AI Agent node is n8n's most capable and most complex node. Here's a quick reference of everything it connects to:
| Sub-node Type | Required? | Options |
|---|---|---|
| Chat Model | Required | OpenAI, Anthropic, Gemini, Groq, Mistral, Azure OpenAI, Ollama, AWS Bedrock, DeepSeek, Google Vertex AI |
| Memory | Optional | Simple Memory, Window Buffer, Postgres, Redis, Motorhead, Xata, Zep, Vector Store Memory |
| Tool | Optional* | Calculator, Wikipedia, Wolfram Alpha, SerpAPI, Code, HTTP Request, Call n8n Workflow, 100+ app integrations |
| Output Parser | Optional | Structured Output Parser, Auto-fixing Output Parser, Item List Output Parser |
* Technically optional, but an agent without tools is just an expensive chatbot.
The key settings to know:
- System Message: Defines the agent's behavior. The most impactful thing you can configure.
- Max Iterations: Controls how many reasoning loops the agent can run. Default 10.
- Return Intermediate Steps: Shows every tool call and reasoning step in the output. Use for debugging.
- Enable Streaming: Sends responses token by token for a more responsive chat experience.
- Require Specific Output Format: Enables the Output Parser sub-node slot for structured JSON output.
- Automatically Passthrough Binary Images: Enables multimodal input (images) for vision-capable models.