The Future of AI Agents
Every AI agent consists of two core conceptual components: A) planning a course of action and B) executing that plan (taking action). Planning a course of action is an iterative process that may incorporate the user's direct intent or sublayers of intent as the agent navigates a complex task with many embedded steps.
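As a rough illustration of this two-part structure, a plan-then-execute loop might look like the sketch below. This is a minimal, hypothetical sketch: the names (`Step`, `Agent`, `make_plan`, `execute`) are illustrative and do not come from any particular agent framework.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    description: str
    done: bool = False

@dataclass
class Agent:
    goal: str
    plan: list[Step] = field(default_factory=list)

    def make_plan(self) -> None:
        # (A) Planning: in practice this would be an LLM call that decomposes
        # the user's goal into sub-steps; stubbed here with fixed placeholders.
        self.plan = [Step(f"sub-step {i} toward: {self.goal}") for i in range(3)]

    def execute(self) -> None:
        # (B) Execution: act on each step in turn. A real agent would feed the
        # results back into planning, since new sub-intents emerge mid-task.
        for step in self.plan:
            print(f"executing: {step.description}")
            step.done = True

agent = Agent(goal="book a flight to Tokyo")
agent.make_plan()
agent.execute()
```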
Currently, AI agents are falling apart at step A -- planning. Agents spin out of control, lose track of their original intent, or get sidetracked with steps that don't serve the original goal. Some believe we may need a new agent architecture to fix this. However, most AI researchers believe the primary issue is that the models are simply too limited: context windows are too short and the models are not powerful enough.
Many believe that as soon as GPT-5 or GPT-6 comes out, the current architecture for agent planning will start to work. Agents will have enough context length to reason coherently over long sequences of steps, and the models will be capable enough to produce a good plan of action.
Once models are bigger and planning improves, executing the plan becomes the bottleneck. Executing actions requires accessing websites (browsing), authenticating (e.g., getting access to credentialed databases), and, importantly, making payments.
Most agent actions online will involve coordinating with software/tools from other companies, using data owned by other companies, and utilizing models or agents owned by other companies. These interactions are all exchanges of value.
Currently, there is infrastructure for other types of agent actions like web browsing (e.g., Browserbase) and authentication (e.g., Anon), but there is no effective way for AI agents to make payments. As models improve, agent payments will become a primary bottleneck to AI agents functioning effectively. Without payment capabilities, AI agents will be unable to access the resources needed to complete tasks.
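To make that gap concrete, here is a hedged sketch of the capability slots an agent needs at execution time. The class and method names are hypothetical and do not reflect Browserbase's or Anon's actual APIs; the point is simply that the first two slots have infrastructure behind them today, while the third does not.

```python
class AgentToolkit:
    """Hypothetical capability slots an agent needs to execute a plan."""

    def browse(self, url: str) -> str:
        # Web browsing: served today by headless-browser infrastructure
        # (e.g., Browserbase); stubbed here with a placeholder response.
        return f"<html>contents of {url}</html>"

    def authenticate(self, service: str) -> str:
        # Authentication: delegated user credentials (e.g., via Anon);
        # stubbed here with a placeholder session token.
        return f"session-token-for-{service}"

    def pay(self, payee: str, amount_usd: float) -> str:
        # Payments: the missing layer -- there is no widely adopted way
        # for an agent to transact on a user's behalf yet.
        raise NotImplementedError("agent-native payments do not exist yet")
```

In this framing, an agent can browse and authenticate its way through a task, but the moment a step costs money, `pay()` has nothing to call into -- which is the bottleneck described above.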