Building a Coding Agent in Swift, Part 1: The Agent Loop

A language model can reason about code — it can plan how to fix a bug, suggest a refactoring, or design a feature. But it can’t touch the real world. It can’t read files, run tests, or check whether its suggestion actually compiles. Without some kind of bridge, every interaction is a dead end: the model suggests something, we copy-paste it into a terminal, paste the result back, the model adjusts, and we do it all over again. We are the loop.

The entire point of a coding agent is to close that loop automatically. Give the model a way to execute commands, feed the results back, and let it keep going until it’s done. That’s what we’ll build in this guide — and it turns out the core mechanism is surprisingly small.

The complete source code for this stage is available at the 01-agent-loop tag on GitHub. Code blocks below show key excerpts.

The problem: we are the middleware

Let’s say we ask the model to create a file. Without an agent loop, the interaction looks like this: we send a prompt, the model responds with a shell command, we manually run the command, then paste the output back so the model can verify it worked. Every single tool use requires a human round-trip. For a task that involves ten commands, that’s ten manual copy-paste cycles.

What we want instead is a loop that does this automatically:

+--------+      +-------+      +---------+
|  User  | ---> |  LLM  | ---> |  Tool   |
| prompt |      |       |      | execute |
+--------+      +---+---+      +----+----+
                    ^                |
                    |   tool_result  |
                    +----------------+
                    (loop until stop_reason != tool_use)

The user sends one prompt. The model calls tools as many times as it needs — reading files, running commands, checking results — and only stops when it’s satisfied. One exit condition controls the entire flow.

Two loops, two jobs

Our agent actually has two loops, each with a distinct purpose. The outer loop is the REPL — it reads user input, hands it to the agent, and waits for the next prompt. The inner loop is the agent loop — it calls the API, executes tools, and keeps going until the model decides it’s done.

The REPL is the user-facing shell:

// Sources/cli/SwiftClaudeCode.swift
while true {
  print("\(ANSIColor.cyan)\(ANSIColor.bold)>\(ANSIColor.reset) ", terminator: "")
  guard let input = readLine(strippingNewline: true) else {
    break
  }

  let trimmed = input.trimmingCharacters(in: .whitespacesAndNewlines)
  if trimmed.isEmpty { continue }
  if ["exit", "quit", "q"].contains(trimmed.lowercased()) { break }

  do {
    _ = try await agent.run(query: trimmed)
  } catch {
    print("\(ANSIColor.red)Error: \(error)\(ANSIColor.reset)")
  }

  print()
}

This loop lives forever. Each iteration reads one line of input, calls agent.run(query:), and prints the result. The agent handles everything in between — however many API calls and tool executions that takes. When the agent returns, the REPL is back to waiting for the next prompt.

The critical detail: the messages array lives on the Agent instance, not inside run(). This means conversation history persists across REPL turns. The second prompt the user types has full context of everything the agent did for the first one. During development, we briefly moved messages to a local variable for “cleanliness” — and immediately broke multi-turn conversations. The REPL calls run() per input; if messages don’t survive between calls, the agent has amnesia.

The agent loop: one exit condition

The inner loop is the actual agent. Let’s walk through the mechanism before seeing the full implementation.

First, the user’s query becomes a message:

messages.append(.user(query))

Next, we send the full conversation — plus our tool definitions — to the API:

let request = APIRequest(
  model: model,
  maxTokens: 4096,
  system: systemPrompt,
  messages: messages,
  tools: [Self.bashToolDefinition]
)
let response = try await apiClient.createMessage(request: request)
messages.append(Message(role: .assistant, content: response.content))

Now comes the single branching point. We check stopReason — if the model didn’t ask to use a tool, we’re done:

guard response.stopReason == .toolUse else {
  return response.content.textContent
}

Otherwise, we execute each tool call, collect the results, and append them as a user message. Then we loop back to the API call:

var results: [ContentBlock] = []
for case .toolUse(let id, let name, let input) in response.content {
  let toolResult = await executeTool(name: name, input: input)
  switch toolResult {
  case .success(let output):
    results.append(.toolResult(toolUseId: id, content: output, isError: false))
  case .failure(let error):
    results.append(.toolResult(toolUseId: id, content: "\(error)", isError: true))
  }
}
messages.append(Message(role: .user, content: results))

Assembled into one method, this is the complete agent loop:

// Sources/Core/Agent.swift
public func run(query: String) async throws -> String {
  messages.append(.user(query))

  while true {
    let request = APIRequest(
      model: model,
      maxTokens: 4096,
      system: systemPrompt,
      messages: messages,
      tools: [Self.bashToolDefinition]
    )

    let response = try await apiClient.createMessage(request: request)
    messages.append(Message(role: .assistant, content: response.content))

    for case .text(let text) in response.content {
      print("\(ANSIColor.cyan)\(text)\(ANSIColor.reset)")
    }

    guard response.stopReason == .toolUse else {
      return response.content.textContent
    }

    var results: [ContentBlock] = []
    for case .toolUse(let id, let name, let input) in response.content {
      printToolCall(name: name, input: input)
      let toolResult = await executeTool(name: name, input: input)

      switch toolResult {
      case .success(let output):
        print("\(ANSIColor.dim)\(String(output.prefix(200)))\(ANSIColor.reset)")
        results.append(.toolResult(toolUseId: id, content: output, isError: false))
      case .failure(let error):
        let message = "\(error)"
        print("\(ANSIColor.red)\(message)\(ANSIColor.reset)")
        results.append(.toolResult(toolUseId: id, content: message, isError: true))
      }
    }

    messages.append(Message(role: .user, content: results))
  }
}

With that in place, we have a fully functional coding agent — and the entire mechanism fits in a single method. The branching point is one guard on stopReason. Everything else in this series layers on top of this loop — without changing it. Tools are the variable; the loop is the invariant.

Bash is all you need

We only give the model one tool: bash. That might seem limiting, but think about what bash can do — read files, write files, search codebases, run compilers, execute tests, install packages, manage git. A shell command is a universal interface to the operating system. The model decides what commands to run; we just execute them and report back.

In Swift, executing a shell command means wrapping Foundation’s Process:

// Sources/Core/ShellExecutor.swift
let process = Process()
let stdoutPipe = Pipe()
let stderrPipe = Pipe()

process.executableURL = URL(fileURLWithPath: "/bin/bash")
process.arguments = ["-c", command]
process.standardOutput = stdoutPipe
process.standardError = stderrPipe
process.currentDirectoryURL = URL(fileURLWithPath: cwd)

try process.run()

// Read pipe data BEFORE waitUntilExit() to avoid deadlock
let stdoutData = stdoutPipe.fileHandleForReading.readDataToEndOfFile()
let stderrData = stderrPipe.fileHandleForReading.readDataToEndOfFile()
process.waitUntilExit()

One thing we discovered during research that saved us from a nasty bug: pipe data must be read before calling waitUntilExit(). Foundation’s Pipe uses kernel buffers that are typically around 64 KB. If a command produces more output than that, the child process blocks on write() because the buffer is full, while the parent blocks on waitUntilExit() waiting for the child to exit. Neither side makes progress — a classic deadlock that would have been silent and hard to diagnose.

Message accumulation: the growing conversation

One pattern worth understanding is how the messages array grows during a single run() call. Let’s say the user asks “create a file called greeting.txt that says Hello World.” Here’s what messages looks like at each step:

[user("create a file...")] — we append the query
[user, assistant(tool_use: bash "echo ...")] — the model responds with a command
[user, assistant, user(tool_result: "")] — we execute it, append the result
[user, assistant, user, assistant(tool_use: bash "cat greeting.txt")] — the model verifies
[user, assistant, user, assistant, user(tool_result: "Hello World")] — we run cat
[user, assistant, user, assistant, user, assistant("Done! I created...")] — model is satisfied, stopReason is end_turn

Each API call sends the entire array. The model sees the full history of what it’s done and what happened — which is how it knows to verify the file exists after creating it, and how it knows to stop once everything looks correct. This accumulation is what gives the agent memory within a single task.

The cost is obvious: this array grows without bound. For now that’s fine, but eventually we’ll hit the context window ceiling. We’ll solve that in a later guide when we build context compaction.

Building the types

Since there’s no first-party Anthropic SDK for Swift, we also need to build the supporting types that make this loop work. The API client is a thin wrapper around AsyncHTTPClient — encode a Codable request as JSON, send it with the right headers, decode the Codable response. The interesting type decision is how we model the API’s polymorphic content blocks. Each block can be text, a tool use request, or a tool result, and Swift enums with associated values are a natural fit:

// Sources/Core/API/APIModels.swift
public enum ContentBlock: Sendable, Equatable {
  case text(String)
  case toolUse(id: String, name: String, input: JSONValue)
  case toolResult(toolUseId: String, content: String, isError: Bool)
}

Tool inputs are arbitrary JSON, so we model JSON itself as a recursive enum (JSONValue) with cases for every JSON type. These supporting types are verbose to set up — about 200 lines of Codable conformances and API models — but they’re plumbing we write once and never change. The agent loop above is the part that matters.

Taking it for a spin

Here’s the agent in action — a single prompt triggers multiple tool calls, with the loop driving the entire interaction:

swift-claude-code demo

If we build and run our agent now, we can try the kind of multi-step tasks that show the loop in action:

swift build && swift run claude

Try asking it to create a file called greeting.txt that says "Hello, World!" and watch the agent call bash, verify the result, and respond. Then try list all Swift files in this directory or what is the current git branch? — single-tool-call tasks that return immediately. For something more interesting, try create a directory called test_output and write 3 files in it — watch how the model calls bash multiple times, once to create the directory, then once for each file, checking results along the way. We typed one prompt; the agent ran four or five commands. That’s the loop doing its job.

What we’ve built and where we’re going

We now have a working coding agent — one loop, one tool, and an accumulating message history. The model decides what commands to run, our loop executes them and feeds results back, and a single stopReason check controls when to stop. This is the kernel that drives everything else in the series. Over the next seven guides, we’ll add more tools, task tracking, subagents, context compaction, and parallel execution — but this while true loop won’t change. We’ll only add entries to the tool list and injection points around it.

In the next guide, we’ll give our agent more than just bash — we’ll add read_file, write_file, and edit_file tools, and build a dictionary-based dispatch system that scales to any number of tools without touching the loop. Thanks for reading!