Level 4: Agentic AI - When AI Runs Without You

May 28

Published April 28, 2026 · The Human Question · By Rob Gonzales, CPA, PhD

There is a moment, after you have spent enough time with workflows, when you start asking a different question.

You stop asking, “how do I prompt this better?” You stop asking, “what context should I give it?” You stop even asking, “what should the steps be?”

You start asking, “why am I the one running this every morning?”

That is the question that takes you to agents.

What an agent actually is

An agent is an AI that takes actions, not just answers questions. It can use tools. It can read files. It can write files. It can browse the web, send messages, run code, query a database, schedule a task. It can chain those actions together to finish a task that has a real outcome attached.

The chat model writes you a draft. The agent sends the email.

A chat model produces text. An agent produces effects.
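To make the difference between text and effects concrete, here is a minimal sketch of the tool layer. Everything in it is hypothetical and for illustration only: the `send_email` and `save_note` tools write to in-memory stand-ins rather than a real mail server or disk, and the hard-coded plan takes the place of the model, which in a real agent would choose each tool call itself.

```python
from typing import Callable

# Hypothetical in-memory stand-ins for real side effects.
outbox: list[dict] = []
notes: dict[str, str] = {}

def send_email(to: str, body: str) -> str:
    """Pretend to send an email by recording it in the outbox."""
    outbox.append({"to": to, "body": body})
    return f"sent to {to}"

def save_note(name: str, text: str) -> str:
    """Pretend to write a file by storing it in a dict."""
    notes[name] = text
    return f"saved {name}"

# The tool registry: the only names the agent is allowed to call.
TOOLS: dict[str, Callable[..., str]] = {
    "send_email": send_email,
    "save_note": save_note,
}

def run_plan(plan: list[dict]) -> list[str]:
    """Run each requested tool call in order and collect the results."""
    results = []
    for step in plan:
        tool = TOOLS[step["tool"]]  # raises KeyError for an unknown tool
        results.append(tool(**step["args"]))
    return results
```

Calling `run_plan` with a `save_note` step followed by a `send_email` step leaves a note in `notes` and a message in `outbox`. The return values are almost beside the point. What matters is that state changed.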

The year that made agents real

For most of 2024, “agents” were a research demo. Impressive in a controlled environment, brittle in the wild. By the end of 2025 that had changed. Three things converged.

First, models got reliable enough at long-horizon tasks that they could complete work over many minutes or hours without losing the thread. The error rate per step dropped low enough that twenty steps in a row started to work most of the time.

Second, the tool layer matured: the Model Context Protocol, computer-use capabilities, browser-driving APIs, scheduled-task systems, and the ability for a model to call structured tools without falling over. The plumbing was finally there.

Third, the products. Coding agents that ship pull requests. Browser agents that fill out forms. Desktop agents like Cowork that organize files. Daily-briefing agents that read your inbox and write you a memo before you wake up. Agents stopped being a demo and started being a category.

My morning briefing, as an example

I run a small agent every weekday morning. It is not a person. It is a scheduled task. Here is what it does while I am still in bed.

1. Reads a curated list of AI news sources, podcasts, and a few academic feeds.

2. Filters for the items that match the topics I care about — accounting AI, audit, education, workflow design.

3. Drafts a summary in my voice, using the tone I have already taught it and the structure I always use.

4. Renders the summary as a one-page PDF.

5. On Fridays, generates a video script ready for an avatar tool.

6. Sends me the result by 7 AM.

I did not write a prompt this morning. I did not even open the laptop. The work was done by the time my coffee finished brewing. That is what agentic AI looks like in practice.

I have built and run a few of these in my own work. The morning briefing is one. There are others. Each one took an afternoon to set up and paid back the time within the first week. The pattern that holds them together is the same: the agent runs, the human checks, the work compounds.
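The shape of the morning briefing can be sketched in a few lines. This is not my actual setup, only an illustration under stated assumptions: the `Item` type, the `TOPICS` tuple, and the `build_briefing` function are made-up names, a plain keyword match stands in for the model's filtering and drafting, and the PDF rendering and delivery steps are left out.

```python
from dataclasses import dataclass
from datetime import date

# Topics from the filter step, lowercased for matching (illustrative only).
TOPICS = ("accounting ai", "audit", "education", "workflow design")

@dataclass
class Item:
    title: str
    summary: str

def matches_topics(item: Item, topics: tuple = TOPICS) -> bool:
    """Keep an item if any topic phrase appears in its title or summary."""
    text = f"{item.title} {item.summary}".lower()
    return any(topic in text for topic in topics)

def build_briefing(items: list[Item], today: date) -> dict:
    """Filter the items, assemble the body, and add a script on Fridays."""
    kept = [item for item in items if matches_topics(item)]
    lines = [f"- {item.title}: {item.summary}" for item in kept]
    briefing = {"date": today.isoformat(), "body": "\n".join(lines)}
    if today.weekday() == 4:  # Monday is 0, so 4 is Friday
        titles = "; ".join(item.title for item in kept)
        briefing["video_script"] = f"This week: {titles}"
    return briefing
```

A scheduler (cron, a cloud task runner, or the agent product's own scheduling feature) would call something like `build_briefing` each weekday, then hand the result to the rendering and sending steps.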

Where agents work today

Research and synthesis

Long, multi-source research tasks. Pull twenty papers on a topic, synthesize the trend, surface the disagreements. The agent runs for ten or twenty minutes. You read the report.

Coding

Coding agents now ship real changes. They open a branch, write the code, run the tests, file the pull request. The human reviews. This is the area where the productivity gains are largest and hardest to argue with.

Operations

Inbox triage, calendar wrangling, expense categorization, weekly reporting. The boring layer of knowledge work that nobody wants to do. Agents do not mind.

Content production at scale

Daily briefings. Status updates. Client recaps. Anything that has a stable structure and a fresh data input. Build the agent once, run it forever.

Where agents still struggle

Agents are not magic. They fail in predictable ways.

1. Ambiguous goals. If you cannot tell the agent exactly what success looks like, the agent cannot tell either.

2. Compounding errors. 95 percent accuracy per step is fine for one step. Across twenty steps in a row, the chance that every step succeeds drops to about 36 percent (0.95 multiplied by itself twenty times).

3. Trust and authority. Anything that touches money, sends external messages, or makes legal commitments still needs a human at the gate. Always.

4. Context limits. Even with bigger windows, agents lose the thread on truly long tasks. Designing the workflow to have natural breakpoints matters.
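The compounding-error arithmetic is easy to check, and it also suggests where to put breakpoints. The sketch below assumes each step succeeds or fails independently, which is a simplification, and the function names are mine, not from any library.

```python
def chain_success_rate(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step ** steps

def max_steps_for(per_step: float, target: float) -> int:
    """Largest run of consecutive steps whose success rate stays >= target."""
    if not 0 < per_step < 1:
        raise ValueError("per_step must be strictly between 0 and 1")
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    steps = 0
    while chain_success_rate(per_step, steps + 1) >= target:
        steps += 1
    return steps

print(round(chain_success_rate(0.95, 20), 2))  # 0.36
print(round(chain_success_rate(0.99, 20), 2))  # 0.82
print(max_steps_for(0.95, 0.8))                # 4
```

Read the last line as a design hint: at 95 percent per step, a human check or automated test every four steps or so keeps each stretch of unattended work above 80 percent reliable.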

The hierarchy, complete

Step back and look at the four levels together.

1. Prompt. How clearly you ask.

2. Context. What the AI knows when you ask.

3. Workflow. How the work is broken into steps and gates.

4. Agent. When the work runs without you.

Each level depends on the one before it. You cannot build a reliable agent on top of a broken workflow. You cannot build a real workflow without real context. You cannot build context without first writing a prompt that respects the medium.

Most people start at the top, fail, and quit. The work is in the bottom three levels. The visible payoff lives in the fourth.

Three rules to take with you

1. Earn the agent. Do not deploy an agent for work you cannot do by hand. The agent inherits your process, including its flaws.

2. Keep the human at the gate. For anything that has consequences in the world, the human approves before the action is taken. No exceptions.

3. Start small, ship one. One agent, one job, one schedule. Get it working. Then build the next.
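Rule 2 is the one most worth putting in code rather than leaving to good intentions. Here is a minimal sketch of a human gate. The action names in `CONSEQUENTIAL` and the `GatedRunner` class are hypothetical. In practice the `approve` callback might be a chat message, an email link, or a button in a dashboard.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names for actions with consequences in the world.
CONSEQUENTIAL = {"send_external_email", "pay_invoice", "sign_contract"}

@dataclass
class GatedRunner:
    # Called with (action, args); returns True only if a human said yes.
    approve: Callable[[str, dict], bool]
    log: list = field(default_factory=list)

    def run(self, action: str, args: dict,
            execute: Callable[[dict], str]) -> str:
        """Run low-stakes actions directly; hold the rest for approval."""
        if action in CONSEQUENTIAL and not self.approve(action, args):
            self.log.append((action, "blocked"))
            return "blocked: awaiting human approval"
        result = execute(args)
        self.log.append((action, "done"))
        return result
```

The design choice is that the gate lives in the runner, not in the prompt. The agent can ask to pay an invoice as often as it likes, but nothing happens until `approve` returns True.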

The real lesson

If prompting is communication, and context is knowledge, and workflow is process, then agentic AI is what happens when those three layers compound.

The point of all of this is not to replace anyone. The point is to take the parts of your work that you would never miss and let them run themselves, so the parts you actually care about can have all of you.

Even the most autonomous agent runs in a system where the human is still the one who sets the rules and reviews the output. The freedom comes from the structure, not from giving up control.

Coming next: Level 5 — Workflow Augmentation: When AI Stops Being a Tool and Becomes a System

thehumanquestion.org

Roberto Gonzales