© 2026 Octo

Generative AI Explained
Module 3

What Are AI Agents?

AI agents don't just answer questions — they take action. Here's what they are, how they work, and why everyone from startups to Fortune 500 companies is building them.

You asked it to book a trip. It booked the trip.

You open your laptop on a Friday evening and type: "Plan a long weekend in Lisbon for two. We like street food, live music, and walkable neighborhoods. Budget is $2,000 including flights."

Thirty seconds later, you're not looking at a list of suggestions. You're looking at a confirmation. The AI found flights within budget, reserved a boutique hotel in Alfama, built a day-by-day itinerary with restaurant reservations, and sent the whole package to your partner's email for approval — all without you clicking a single link.

That's not a chatbot. A chatbot would have given you ten paragraphs of advice and wished you a wonderful trip. This is an AI agent — software that doesn't just think, it acts.

And this shift from "AI that talks" to "AI that does" is the biggest change in technology since the smartphone.

Agents vs. chatbots vs. traditional AI

Let's get the definitions straight, because these three things get conflated constantly:

✗ Chatbot

  • ✗ Responds to your prompt
  • ✗ Generates text, images, or code
  • ✗ Stops when it gives you an answer
  • ✗ Has no memory between sessions (unless added)
  • ✗ Knows things but can't DO things

✓ AI agent

  • ✓ Breaks your goal into sub-tasks
  • ✓ Calls tools, APIs, and external services
  • ✓ Keeps working until the goal is achieved
  • ✓ Maintains state across multiple steps
  • ✓ Knows things AND takes action on your behalf

Traditional AI (pre-LLM) classified data — spam filters, recommendation engines, fraud detection. Chatbots like ChatGPT generate content in response to prompts. AI agents take a goal and autonomously execute multi-step plans to achieve it. They perceive their environment, decide what to do, take action, observe the result, and adjust.

Think of it this way: a chatbot is a brilliant advisor sitting in a chair. An AI agent is that same brain strapped to hands, eyes, and a phone — it can actually do the things it recommends.

There Are No Dumb Questions

"Isn't an AI agent just a chatbot with extra steps?"

Technically, most agents today are built on top of LLMs (the same technology powering chatbots). But calling an agent "a chatbot with extra steps" is like calling a self-driving car "cruise control with extra steps." The architecture is fundamentally different: agents have a planning loop, access to tools, persistent memory, and the ability to take real-world actions. Those "extra steps" change everything.

"Do agents actually understand what they're doing?"

No more than an LLM does. They don't understand your Lisbon trip — they execute a well-structured sequence of API calls based on pattern-matched reasoning. But they don't need to understand in the human sense. A thermostat doesn't understand temperature, but it keeps your house at 72 degrees.

The agent loop: perceive, think, act, learn

Every AI agent — from a customer service bot to a coding assistant — runs the same fundamental loop:

The AI agent loop

Perceive (read input, check state)
Think (reason, plan next step)
Act (call tool, send message)
Learn (observe result, update)

Here's what happens at each stage:

Perceive — The agent takes in information: your instructions, data from APIs, the results of its previous actions, error messages. It reads the current state of the world.

Think — The LLM "brain" reasons about what to do next. Should it search the web? Call an API? Ask the user a clarifying question? It forms a plan.

Act — The agent executes: it calls a tool (search engine, database, email API, code interpreter), sends a message, writes a file, or takes any action it has access to.

Learn — The agent observes the result. Did the API return an error? Did the search find what was needed? It incorporates this feedback and loops back to "perceive" with updated context.

This loop repeats — sometimes dozens of times — until the agent either achieves the goal or determines it can't. The key difference from a chatbot: a chatbot runs this loop once (prompt in, response out). An agent runs it as many times as needed.
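The loop is simple enough to sketch in a few lines of Python. Everything below is a toy: `think` is a hard-coded stub standing in for the LLM call, and the two tools return canned strings instead of hitting real APIs.

```python
# Toy sketch of the perceive-think-act-learn loop.
# `think` is a hard-coded stub standing in for an LLM call, and the
# tools return canned strings -- no real APIs are involved.

def search_flights(budget):
    return f"found round trip for ${budget - 800}"   # pretend flight API

def book_hotel(city):
    return f"booked hotel in {city}"                 # pretend booking API

TOOLS = {"search_flights": search_flights, "book_hotel": book_hotel}

def think(context):
    """Stub LLM: decides the next action from what has happened so far."""
    done = {line.split(" -> ")[0] for line in context if " -> " in line}
    if "search_flights" not in done:
        return {"tool": "search_flights", "args": {"budget": 2000}}
    if "book_hotel" not in done:
        return {"tool": "book_hotel", "args": {"city": "Lisbon"}}
    return {"tool": "finish", "summary": "Trip planned."}

def run_agent(goal, max_steps=10):
    context = [f"Goal: {goal}"]                        # Perceive: initial state
    for _ in range(max_steps):
        action = think(context)                        # Think: plan next step
        if action["tool"] == "finish":
            return action["summary"]
        result = TOOLS[action["tool"]](**action["args"])   # Act: call the tool
        context.append(f"{action['tool']} -> {result}")    # Learn: feed result back
    return "Step limit reached"

print(run_agent("Plan a long weekend in Lisbon"))  # prints: Trip planned.
```

The chatbot version of this program would call `think` once and return its text. The agent behavior comes entirely from the loop: each tool result becomes part of the context for the next decision.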

🔑Tools are what give agents their superpowers
An LLM alone can only generate text. But when you give it **tools** — functions it can call — suddenly it can search the web, query databases, send emails, write and run code, book flights, update spreadsheets, and interact with any system that has an API. The LLM is the brain. Tools are the hands. An agent is both working together.

Tools and function calling: how agents interact with the world

When you ask ChatGPT "What's the weather in Tokyo?", the LLM doesn't know the weather. Instead, the system recognizes this requires real-time data, calls a weather API, gets the result, and weaves it into a natural-language response. That's function calling — and it's the mechanism that makes agents possible.

Here's how it works under the hood:

| Step | What happens | Example |
| --- | --- | --- |
| 1. User request | You state a goal | "Cancel my subscription to StreamMax" |
| 2. Planning | Agent decides which tools to use | Needs: account lookup, subscription API, email sender |
| 3. Tool call | Agent invokes a function with parameters | `lookup_account(email="[email protected]")` |
| 4. Result | Tool returns data | `{ subscription: "StreamMax Pro", status: "active" }` |
| 5. Next action | Agent uses result to make next call | `cancel_subscription(id: "sub_12345")` |
| 6. Confirmation | Agent reports back to user | "Done. Your StreamMax subscription is cancelled. You'll have access until March 31." |

The agent didn't write a persuasive paragraph about cancellation policies. It cancelled your subscription. That's the difference.
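The six-step flow above can be sketched as a tool registry plus a dispatcher. The schema shape loosely mirrors common function-calling APIs, but the tool names, fields, and canned return values are all invented for this example.

```python
# Sketch of function calling: the model emits a structured call as JSON text,
# and the orchestrator parses it and runs the matching function. Tool names
# and data are invented; the return values are canned.
import json

def lookup_account(email):
    return {"subscription": "StreamMax Pro", "id": "sub_12345", "status": "active"}

def cancel_subscription(id):
    return {"id": id, "status": "cancelled", "access_until": "March 31"}

# Descriptions like these are what the LLM reads when choosing a tool.
TOOL_SCHEMAS = [
    {"name": "lookup_account", "description": "Find an account by email",
     "parameters": {"email": "string"}},
    {"name": "cancel_subscription", "description": "Cancel a subscription by id",
     "parameters": {"id": "string"}},
]

REGISTRY = {"lookup_account": lookup_account,
            "cancel_subscription": cancel_subscription}

def dispatch(tool_call_json):
    """Parse the model's emitted tool call and execute it."""
    call = json.loads(tool_call_json)
    return REGISTRY[call["name"]](**call["arguments"])

# The model emits the first call; its result feeds the second.
account = dispatch('{"name": "lookup_account", "arguments": {"email": "[email protected]"}}')
result = dispatch(json.dumps(
    {"name": "cancel_subscription", "arguments": {"id": account["id"]}}))
print(result["status"])  # prints: cancelled
```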

Types of AI agents

Not all agents are created equal. They range from simple to terrifyingly capable:

| Type | What it does | Example | Autonomy level |
| --- | --- | --- | --- |
| Conversational | Handles dialogue with access to tools | Customer support agent that checks order status | Low — human stays in control |
| Task-based | Completes specific workflows end-to-end | "Schedule a meeting with the team next week" | Medium — executes within guardrails |
| Research | Gathers, synthesizes, and reports on information | "Analyze our competitors' pricing strategies" | Medium — reads extensively, reports back |
| Coding | Writes, tests, debugs, and deploys code | GitHub Copilot, Cursor, Devin | Medium-high — makes implementation decisions |
| Autonomous | Pursues open-ended goals with minimal oversight | AI scientist designing and running experiments | High — plans and acts independently |

🔑AI agent adoption is growing fast
A majority of enterprise companies are exploring AI agent deployments (multiple industry surveys, 2024–2025). Interest has grown explosively — searches for "AI agent" have surged since 2023 (per Google Trends). The agent market is projected to reach tens of billions of dollars by decade's end (various analyst estimates).

Real-world agents you can use today

This isn't science fiction. These agents exist right now:

Coding agents like GitHub Copilot, Cursor, and Claude Code don't just suggest the next line. They read your entire codebase, understand context across files, write multi-file features, run tests, fix bugs, and iterate until the code works. Developers using coding agents report significant productivity gains — some report 2–3x faster on specific tasks, though controlled studies typically show 30–55% improvement on defined work (GitHub/Microsoft Research, 2022–2023; verify against the most recent published studies as this area is evolving rapidly).

Customer service agents in production deployments handle anywhere from 60–80% of routine support tickets without human intervention (varies significantly by deployment and industry). They look up accounts, process refunds, track packages, reset passwords, and escalate to humans only when needed. Klarna claimed in a February 2024 press release — a self-reported figure that was not independently verified — that its AI handled the work of 700 full-time support staff, a claim the company later complicated when it resumed hiring for customer service roles in September 2024, illustrating why vendor claims require independent scrutiny.

Research agents like Perplexity and Deep Research read dozens of sources, cross-reference claims, synthesize findings, and produce cited reports. A research task that once took an analyst a full day can now be drafted in minutes — though the output still needs human verification.

Personal assistants manage email, schedule meetings, draft follow-ups, and organize your day. They connect to your calendar, email, and task manager — acting on your behalf rather than just suggesting actions.

Multi-agent systems are where things get wild. Instead of one agent doing everything, you orchestrate teams of specialized agents. One agent researches, another writes, a third edits, and a fourth fact-checks — just like a human team, but operating in seconds.

Multi-agent systems: teams of AI

The most powerful agent setups don't use a single agent — they use many agents collaborating. Think of it like a company org chart, but every employee is an AI:

| Role | What this agent does | Why it's separate |
| --- | --- | --- |
| Planner | Breaks the goal into sub-tasks and assigns them | Needs high-level reasoning, not specialized tools |
| Researcher | Searches the web, reads documents, gathers data | Optimized for information retrieval |
| Writer | Drafts content based on research findings | Optimized for generation and tone |
| Critic | Reviews output for errors, bias, and quality | Checks work without the bias of having created it |
| Executor | Takes final actions (sends emails, updates databases) | Has permissions and tool access the others don't |

Why not have one super-agent do everything? For the same reason companies don't have one employee do everything: specialization produces better results. A "researcher" agent can be given read-only access to sensitive databases without the risk of it accidentally modifying data. A "critic" agent evaluates writing more objectively when it didn't write it.

Multi-agent systems are already being used for software development (one agent writes code, another writes tests, a third reviews), content creation (research, draft, edit, fact-check), and business operations (monitor metrics, identify anomalies, draft reports, alert stakeholders).
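A minimal sketch of that division of labor, with each "agent" reduced to a plain function that returns canned text. In a real system each role would be a separate LLM call with its own system prompt and tool access; every name here is invented.

```python
# Toy multi-agent pipeline: a planner delegates to specialist "agents".
# Each agent is just a function returning canned text in this sketch.

def researcher(topic):
    return f"notes on {topic}"                # would search and read sources

def writer(notes):
    return f"draft based on {notes}"          # would generate prose

def critic(draft):
    """Would review for errors and bias; here it only checks for sources."""
    return [] if "notes" in draft else ["missing sources"]

def planner(topic):
    """Orchestrates the team: research, draft, review, revise if needed."""
    notes = researcher(topic)
    draft = writer(notes)
    for issue in critic(draft):
        draft = writer(f"{notes} (fixed: {issue})")   # one revision per issue
    return draft

print(planner("competitor pricing"))  # prints: draft based on notes on competitor pricing
```

Note that the critic never sees the writer's internals, only its output — the same separation that lets a human editor judge a draft more objectively than its author.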

There Are No Dumb Questions

"If agents can do all this, why do I still need to do anything?"

Because agents are tools, not replacements. They're exceptional at structured tasks with clear success criteria — "book a flight under $500," "fix this failing test," "summarize these 50 documents." They struggle with ambiguity, creative judgment, ethical nuance, and anything requiring genuine understanding of human context. You set the goals, define the constraints, and make the judgment calls. The agent handles the execution.

"What's stopping an agent from going rogue and doing something I didn't ask for?"

Good agent design includes guardrails: confirmation steps before irreversible actions, scoped permissions (an agent can read your calendar but not delete meetings), spending limits, and human-in-the-loop checkpoints. The best agents ask before acting on anything consequential. The worst ones... well, that's why guardrails matter.
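Those guardrails amount to a thin policy layer that sits between the agent's proposed action and its execution. The policy sets, tool names, and return strings below are invented for illustration.

```python
# Sketch of a guardrail layer: reversible actions run automatically,
# consequential ones need human sign-off, and some are never allowed.
# All tool names and policy choices here are invented.

AUTO_APPROVE = {"read_calendar", "send_slack_message"}   # easy to undo
NEEDS_HUMAN  = {"send_email", "cancel_subscription"}     # consequential
FORBIDDEN    = {"delete_account"}                        # never, even if asked

def guarded_call(tool_name, execute, ask_human):
    """Run `execute` only if policy allows; escalate consequential actions."""
    if tool_name in FORBIDDEN:
        return "blocked by policy"
    if tool_name in NEEDS_HUMAN and not ask_human(tool_name):
        return "declined by user"
    return execute()             # auto-approved (or human-approved) actions run

# The agent proposes; the guardrail layer decides.
print(guarded_call("delete_account", lambda: "gone", lambda t: True))        # blocked by policy
print(guarded_call("send_email", lambda: "email sent", lambda t: False))     # declined by user
print(guarded_call("read_calendar", lambda: "3 meetings", lambda t: False))  # 3 meetings
```

The key design choice is that the policy lives outside the LLM: even if the model is tricked into proposing a forbidden action, the orchestration code refuses to execute it.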

⚡ Agent or Chatbot? (25 XP)

For each scenario, decide: chatbot or agent?

  1. You ask an AI "What are the best Italian restaurants near me?" and it gives you a list with descriptions.
  2. You tell an AI "Book a table for two at the highest-rated Italian restaurant near me at 7pm tonight" and it makes the reservation.
  3. You paste code into an AI and ask "What's wrong with this?" and it explains the bug.
  4. You tell an AI "Fix the bug in my authentication module, run the tests, and push the fix to GitHub." It does all three.
  5. You ask "What should our Q2 marketing strategy be?" and it writes a 2,000-word analysis.
  6. You say "Execute our Q2 marketing strategy: schedule the social posts, set up the email campaigns, and brief the design team." It does it.

The anatomy of an agent: what's actually inside

When engineers build an AI agent, it typically has these components:

LLM core — The language model (GPT-4, Claude, Gemini, o1/o3, or other frontier models) that handles reasoning, planning, and natural language understanding. This is the "brain."

System prompt — Instructions that define the agent's role, personality, boundaries, and goals. "You are a customer support agent for Acme Corp. Never promise refunds over $500 without human approval."

Tool definitions — A list of functions the agent can call, with descriptions of what each one does and what parameters it needs. The LLM reads these descriptions and decides which tools to use.

Memory / context — Short-term (conversation history) and long-term (stored facts, user preferences, past interactions). Without memory, every interaction starts from zero.

Orchestration layer — The code that manages the perceive-think-act-learn loop: sending prompts to the LLM, parsing tool calls, executing them, feeding results back, and handling errors.

The surprising part: the LLM itself doesn't "know" it's an agent. It's just doing next-token prediction — the same thing it does in a chatbot. The agent behavior emerges from the orchestration layer around the LLM, which feeds it context, parses its responses for tool calls, executes those calls, and loops the results back in. The magic isn't in the model — it's in the loop.
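That point is worth making concrete. In the sketch below, the stub "LLM" only ever returns text; the orchestration code is what notices a TOOL:-prefixed line, executes the named function, and appends the result to the prompt. The TOOL: convention and every name here are invented.

```python
# The model only predicts text; the loop around it creates the agent.
# `fake_llm` is a stand-in for a real model, and the TOOL: line format
# is an invented convention for this sketch.
import json
import re

def get_weather(city):
    return {"city": city, "temp_c": 18, "sky": "clear"}   # pretend weather API

def fake_llm(prompt):
    """Stub model: emits a tool call first, then a final answer."""
    if "RESULT:" not in prompt:
        return 'TOOL: get_weather({"city": "Tokyo"})'
    return "It is 18C and clear in Tokyo."

def orchestrate(user_prompt):
    prompt = user_prompt
    while True:
        reply = fake_llm(prompt)                      # plain text in, text out
        match = re.match(r'TOOL: (\w+)\((.*)\)', reply)
        if not match:                                 # no tool call: final answer
            return reply
        args = json.loads(match.group(2))
        result = globals()[match.group(1)](**args)    # the loop acts, not the model
        prompt += f"\nRESULT: {json.dumps(result)}"   # feed the observation back in

print(orchestrate("What's the weather in Tokyo?"))  # prints: It is 18C and clear in Tokyo.
```

Swap `fake_llm` for a real model call and the structure is the same: the model never executes anything itself — the orchestration layer does.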

Why 2025-2026 is the "year of agents"

AI agents aren't new as a concept — researchers have studied them for decades. So why is now the moment they're exploding? Three forces converged:

1. LLMs got good enough to reason. Early language models couldn't plan multi-step actions reliably. Models like GPT-4, Claude, Gemini, and newer reasoning-focused models like o1/o3 can break down complex goals, handle unexpected errors, and adjust plans on the fly. The "brain" finally got smart enough to drive the "body."

2. Tool ecosystems matured. There are now standard protocols (like function calling and the Model Context Protocol) that let agents interact with thousands of external services — from Slack to Salesforce to your company's internal database. The plumbing is in place.

3. The economics work. Running an agent that makes 20 API calls to complete a task costs pennies. Paying a human to do the same work costs dollars or hundreds of dollars. At enterprise scale, this math is irresistible.

⚠️The capability gap is closing fast
In early 2024, agents could handle about 3-5 step tasks reliably. By late 2025, frontier agents handle 20-50 step workflows with error recovery (directional; specific benchmarks evolve rapidly as models improve). The gap between "what agents can do" and "what humans do at work" is narrowing faster than most people realize. This doesn't mean mass replacement — it means mass augmentation. The professionals who learn to orchestrate agents will have an enormous advantage.

Risks and limitations

Agents aren't magic, and they come with real risks you need to understand:

Compounding errors

When a chatbot makes a mistake, you get a wrong answer. When an agent makes a mistake, it might act on that wrong answer — and then build subsequent actions on top of it. One bad assumption early in a 20-step plan can cascade into a mess. This is why human checkpoints matter.

Security and permissions

An agent with access to your email, calendar, and bank account is powerful — and dangerous. If compromised, an attacker could act as you. Agent security is an unsolved problem that the industry is racing to address.

Over-reliance and deskilling

If agents handle all your routine work, you might lose the skills to do it yourself. A developer who never debugs manually may struggle when the agent fails. Balance automation with maintaining core competencies.

Accountability gaps

When an agent makes a consequential mistake — sends the wrong email, cancels the wrong subscription, deploys buggy code — who's responsible? The user who gave the instruction? The company that built the agent? The answer is still evolving legally and ethically.

Hallucination in action

Chatbot hallucinations produce wrong text. Agent hallucinations produce wrong actions. An agent that confidently "finds" a nonexistent API endpoint and builds a workflow around it wastes time and creates broken systems.

⚡ Design an Agent's Guardrails (50 XP)

You're building an AI agent that helps HR teams onboard new employees. The agent can: create email accounts, assign software licenses, schedule orientation meetings, send welcome emails, and add employees to Slack channels. Design the guardrails for this agent:

  1. Which actions should the agent do **automatically** without asking? Why?
  2. Which actions should require **human approval** before executing? Why?
  3. What's one action the agent should **never** be allowed to do, even if asked?
  4. What happens if the agent encounters an error halfway through onboarding?

*Think about reversibility: actions that are easy to undo (like sending a Slack message) need fewer guardrails than actions that are hard to undo (like creating accounts with access to sensitive systems).*

What this means for your career

The rise of agents changes the value equation at work. Here's the shift:

| Skill that loses value | Skill that gains value |
| --- | --- |
| Manually executing repetitive workflows | Designing and orchestrating agent workflows |
| Gathering information from multiple sources | Evaluating and synthesizing agent-gathered information |
| Writing boilerplate code, emails, reports | Defining clear goals, constraints, and success criteria |
| Being the person who "knows how the systems work" | Being the person who decides what should happen and why |

The career advice is clear: move up the abstraction ladder. The people who thrive will be the ones who can clearly articulate goals, break complex problems into agent-executable tasks, evaluate outputs critically, and handle the judgment calls that agents can't make.

You don't need to become an AI engineer. But you do need to become someone who can work with agents — the same way you didn't need to become a programmer to use spreadsheets, but you did need to learn what spreadsheets could do.

New roles are already emerging: Agent Designer (defines agent goals, constraints, and tool access), Prompt Engineer (crafts the system prompts that shape agent behavior), AI Operations (monitors agent performance, catches failures, tunes workflows). Even if your title doesn't change, the content of your work will shift toward higher-level orchestration.

The bottom line: the question isn't "will agents affect my job?" It's "am I learning to direct agents, or am I waiting to be replaced by someone who does?"

Back to the Lisbon trip

You typed one sentence — "Plan a long weekend in Lisbon for two" — and the agent booked flights, reserved a hotel, built an itinerary, and emailed your partner. No clicking through travel sites, no comparing prices in twelve tabs, no copy-pasting confirmation numbers. The agent did what a chatbot could only describe: it took your goal and executed a multi-step plan to achieve it. That shift — from AI that talks to AI that does — is why agents are the biggest change in technology since the smartphone.

Key takeaways

  • AI agents take action, not just generate text. They perceive, think, act, and learn in a continuous loop until a goal is achieved.
  • Tools and function calling are what separate agents from chatbots — they give the LLM "hands" to interact with the real world.
  • Types range from simple conversational agents to fully autonomous systems that plan and execute complex workflows independently.
  • 2025-2026 is the breakout moment because LLMs are smart enough to plan, tool ecosystems are mature, and the economics make sense.
  • Real agents exist today — coding assistants, customer service bots, research tools, and multi-agent systems are already transforming work.
  • Risks are real: compounding errors, security exposure, deskilling, and accountability gaps require thoughtful guardrails and human oversight.
  • Career implication: the value is shifting from doing tasks to orchestrating agents that do tasks. Learn to define goals, set constraints, and evaluate results.

Knowledge Check

  1. What is the fundamental difference between a chatbot and an AI agent?
  2. What is "function calling" in the context of AI agents?
  3. Why is "compounding errors" a bigger risk for AI agents than for chatbots?
  4. A company wants to deploy an AI agent that processes customer refunds. Which guardrail approach is most appropriate?
