What is Sandboxing? Why Do AI Agents Need Sandboxes?
December 11, 2025
As AI agent development matures, the term sandboxing has returned to community discussions. Sandboxing is not a new concept; it existed in the software industry long before the AI agent era. But the nature of AI agents has made this technology an essential component of AI agent products.
In this article, we will discuss what sandboxing is, why AI agents need it, and what boundaries a basic AI agent sandbox should draw.
What is Sandboxing?
The term sandbox was borrowed into the software world from everyday life. As far back as the 19th century, Germans would build sandboxes in their gardens for children to play in. For parents, the greatest benefit of a sandbox is that the sand stays contained within it: no matter how enthusiastically the children play, there is no need to worry about sand spreading everywhere.
Furthermore, if children get the sandbox too dirty while playing, parents simply need to replace the sand with new, clean sand. Because the sandbox is limited in scope, the time and cost to change the sand are minimal.
Applied to software development, a sandbox likewise provides an isolated environment. If something goes wrong inside it (such as malicious code being executed), the impact is contained within the sandbox's boundaries and cannot spread to the wider environment. And when something does go wrong, you simply shut the sandbox down and create a new one.
In fact, the term sandbox extends beyond software development. In technology innovation policy, you often hear the term "regulatory sandbox." A regulatory sandbox is an environment where startups trying to innovate can operate within certain boundaries without being subject to current laws and regulations.
Many technological innovations require breaking existing frameworks. In other words, they often conflict with current regulations or operate in areas no regulation yet covers. For example, when Uber first expanded globally, the technology itself was the easiest part; the harder part was the ensuing conflict with governments and regulators in different countries.
If regulation is too rigid, innovation slows down; but if it is relaxed entirely without proper oversight, problems can have far-reaching negative consequences. A regulatory sandbox lets tech startups experiment freely: if something goes wrong during an experiment, the impact is contained within the sandbox's boundaries, and the wider system is unaffected.
Why Do AI Agents Need Sandboxes?
Sandboxing isn't a new concept, but it became a focal point of discussion again with the rise of AI agents. Agents are more useful than simple chatbots because they can execute tasks on your behalf; that same capability also makes them more dangerous.
In the article Information Security Issues to Consider When Developing AI Products, we discussed the "lethal trifecta" problem with AI agents: if an agent is allowed to access private data, is exposed to untrusted content, and can communicate with the outside world, all three conditions for serious security issues are met.
To let an AI agent execute tasks, it is typically granted permissions, such as the ability to read and modify files. This alone satisfies the condition of "access to private data." And most developers don't want to manually confirm the safety of every operation the agent performs, because that is tedious and time-consuming: the point of using an AI agent is to eliminate manual work, and if you have to approve each action, you lose much of that benefit.
However, if the agent operates completely autonomously and is tricked into executing malicious code, it can contaminate the environment it runs in. An attacker could then steal valuable information or delete files, causing irreversible harm.
In this context, a sandbox, by limiting the scope of the environment, can contain the potential harm when an AI agent is subjected to a malicious attack.
What Boundaries Should Be Drawn for an AI Agent's Sandbox?
Having seen how sandboxes limit the potential harm an AI agent might cause, readers will likely wonder: what, specifically, should be restricted, and how should the boundaries be drawn?
To address this question, there are currently two common types of boundaries:
- Filesystem isolation: Ensures the agent can only read or modify specific directories, preventing it from touching sensitive system files when subjected to a prompt injection attack.
- Network isolation: Ensures the agent can only connect to approved servers, preventing a compromised agent from exfiltrating sensitive information or downloading malicious code that could affect the entire system.
These two types of isolation address different domains: filesystem isolation is local, network isolation is remote. Both boundaries must exist simultaneously; neither can be missing. Even with strong network isolation, weak filesystem isolation could let a compromised agent modify system files, escape the sandbox, and regain network access. Conversely, with good filesystem isolation but poor network isolation, a compromised agent could still exfiltrate any sensitive files it can read (such as SSH keys, if they fall within its permitted scope).
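To make this concrete, below is a minimal sketch of one way to enforce both boundaries at once, using Docker as the sandbox. Treat it as an illustration under assumptions rather than a production design: it assumes Docker is installed, and the function name, image, workspace path, and timeout are all arbitrary choices.

```python
import subprocess

def run_agent_code_sandboxed(code: str, workspace: str) -> str:
    """Run untrusted, agent-generated Python code in a throwaway container."""
    result = subprocess.run(
        [
            "docker", "run", "--rm",          # throwaway container, removed after the run
            "--network", "none",              # network isolation: no outbound connections at all
            "--read-only",                    # filesystem isolation: root filesystem is read-only
            "--tmpfs", "/tmp",                # writable scratch space lives in memory only
            "-v", f"{workspace}:/workspace",  # the only host directory the agent can touch
            "-w", "/workspace",
            "python:3.12-slim",               # illustrative image choice
            "python", "-c", code,
        ],
        capture_output=True,
        text=True,
        timeout=60,  # kill runaway code; raises subprocess.TimeoutExpired
    )
    return result.stdout if result.returncode == 0 else result.stderr
```

Note that `--network none` is the bluntest form of network isolation; allowing connections only to approved servers, as described above, generally requires an egress proxy or firewall rules on top of this. The `--rm` flag also gives you the "replace the sand" property from earlier: each run gets a fresh container that is discarded afterward.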
Read More
If you're interested in sandboxing, or in how Anthropic's open-source sandbox-runtime is implemented, these topics are covered in detail in the E+ Growth Program.
For readers who want a deeper understanding of this topic, along with other frontend/backend development, software engineering, and AI engineering topics, we invite you to join the E+ Growth Program (link).