

Colossal Cave Adventure... In Go

TL;DR

Colossal Cave

Adventure or Colossal Cave Adventure is a text adventure game, originally written in Fortran for the DEC PDP-10 in 1976 by Will Crowther. Of course, like all the best things in tech, it started life on one of Digital Equipment’s PDP line of computers.

PDP-10

Adventure is a text adventure in which the player explores the titular colossal cave, solving puzzles, encountering dwarves, pirates and a dragon, and collecting treasure on the way to escaping the cave. It is played on a terminal and is the mother of all adventure games. Indeed, the name ‘adventure game’ comes from Adventure itself.

Adventure could be found on many a mainframe or mini-computer all through the mid ’70s and ’80s and has since been ported to almost every platform, including a recent graphical 3D version created by Roberta and Ken Williams, icons of the point-and-click adventure genre. There is even a lovely port to the Pico-8 virtual console.

Thankfully, versions of the original Fortran source code have been recovered and can be found in many places on the internet. A key source is the archive of various ports and source material on Rick Adams’ website, from which convenient GitHub repositories have been created; two prominent repositories are this one and this one.

If you are dedicated, you can spin up a PDP-10 using the amazing SimH machine simulator and run the original 1976 version of the game to experience it as it was, in all its original glory.

If you are even more crazy about this sort of stuff, you can do what I did and build your own PDP-10 replica console from a kit, in order to get the most authentic SimH and PDP-10 experience!

PDP-10 Kit PDP-10 Kit Finished

As mentioned on Rick Adams’ site, the Fortran code base does not make for a good source for easy porting as it relies on quirks and intricacies of the PDP-10 to work. Thankfully, Eric S. Raymond, open source advocate, author of ‘The Cathedral and the Bazaar’, editor of the Jargon File and nowadays a somewhat controversial character, ported the original Adventure to C and released it as ‘open-adventure’.

The C version can easily be compiled and run on any modern machine, making it an easy option if you want to experience Colossal Cave as close to the original as possible.

Given this version is a direct port from the Fortran, it is probably the best version to use as the basis for subsequent ports that want to remain faithful to the original but don’t want to deal with the headache of translating Fortran. So this is exactly what I did.

My History with Adventure

Colossal Cave has a special place in my history with both computers and computer games. I think it is probably what hooked me on both. Back in the 1980s my father was a mainframe operator at the University of Canterbury in Christchurch, New Zealand. This was back in the day when students submitted programs as massive stacks of punched cards (contributing to my fascination with those as well, I assume) and the operators would run them while also tending to the needs of the mainframe.

As a kid I visited dad at work on occasion and it was like visiting the future. The core of the computer centre was separated into two areas. There was the area where the operators worked, where all the consoles and punch card readers were. The other area was the dinosaur pen where the computers lived, separated by a windowed wall to isolate the operators from the noise of the machines.

I could peer into the computer room and look at the blinking lights on the machines. I have a vivid memory of the machines blinking away with spinning tape drives, massive disk drives that took multi-platter disk packs, and a trolley with an oscilloscope on it for diving deeper into faults indicated by the blinking lights. It looked like, and was, the height of technology.

On one such visit, Dad sat me down at a console, loaded up ‘Adventure’ and left me to it. I had never seen such a thing! I was presented with a prompt and I could type instructions to the computer and it would describe a world. Many things struck me about this. How was this world constructed? How was the computer able to interpret what I was instructing it to do? How do you get out of this maze of little twisty passages, all alike? Yes, it was all text, but in my imagination the ‘cheerful little bird’, the ‘fierce snake’ and the ‘threatening little dwarf’ were all vividly rendered.

I was fascinated by both the gameplay and the mechanics of the world building.

Dad would often have to be on call for work, which meant bringing home a teletype machine with an acoustic coupler attached. I wish I could find a picture of it, but I couldn’t. It looked a lot like the teletype shown sitting in front of the PDP-10 in the picture above.

It had no screen, just linefeed paper for output. The acoustic coupler strapped the phone handset to the terminal, allowing it to talk to the computer over a 300 baud modem connection. The picture below illustrates what this was like. The teletype my dad used had one integrated into the side of the terminal.

Acoustic Coupler

On some lucky evenings I was allowed to play Adventure at home. This was, I suppose, my first experience of online gaming! It also contributed to my later obsession with BBSs and early online forums like FidoNet (more on this to come).

So Colossal Cave Adventure has a special place in my nostalgic heart. In 2025 I wanted to revisit it, take a deeper look and understand it. So I decided to do what I did with Elite, another seminal game in my gaming history, and rewrite it in Go.

goAdventure

All the code and various pre-compiled binaries for goAdventure can be found on my GitHub.

Making the Dungeon

The first simple challenge of this port was to update the code that generates the dungeon. The original game used a text database full of ‘inscrutable numeric literals’ to build the dungeon and establish the relationships between locations, characters and items in the game. Eric S. Raymond’s C port of Adventure replaces this database with a YAML file that defines more coherent structures. The YAML file is then used as input to a Python script that generates the C code that describes the dungeon.

It was relatively straightforward to change this Python script to generate Go rather than C. In the Go version, make_dungeon.py generates the dungeon package, which is imported by the game.
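To give a flavour of what the generated package looks like, here is a simplified sketch of the kind of Go the script emits. The type and field names below are illustrative only, not the actual generated code:

package dungeon

// Location holds the text for one place in the cave. The real generated
// tables also carry travel rules, flags and more.
type Location struct {
	LongDesc  string // full description shown on first visit or on LOOK
	ShortDesc string // terse description shown on revisits
}

// Locations is generated from the YAML source, one entry per location.
var Locations = []Location{
	{
		LongDesc:  "You are standing at the end of a road before a small brick building.",
		ShortDesc: "You're in front of building.",
	},
}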

The Game Skeleton

For my version of the game I separated the main game ‘engine’ into the advent package. Here I set up the game struct, mirroring the C version. All game state is maintained within this monolithic game struct, and all of the game functions are defined as methods on game. This allows me to implement Adventure with either a classic text-only interface or the Terminal User Interface (TUI).
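As a rough sketch of that shape (the real struct carries far more state, and these field names are illustrative, not the actual ones):

package advent

import "fmt"

// game holds all mutable state, mirroring the monolithic struct in the C version.
type game struct {
	loc     int   // current location index into the dungeon tables
	turns   int   // number of turns taken
	score   int   // current score
	holding []int // object indices the player is carrying
}

// All behaviour hangs off methods on *game, so the same engine can be
// driven by either the plain text interface or the TUI.
func (g *game) advanceTurn() {
	g.turns++
}

func (g *game) status() string {
	return fmt.Sprintf("turn %d, score %d, location %d", g.turns, g.score, g.loc)
}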

Input is read from the terminal and tokenised into a structured command using look-up tables for MOTION, OBJECT, ACTION or NUMERIC words. The command is then actioned and the game state is updated. This establishes the main game loop.

The simple two-word command structure (get lamp, drop rod, etc.) makes it easier to understand how the tokenisation and command logic works. I had never really looked at how this is done, and Adventure was an excellent way to learn one approach. The player input must be processed into something deterministic that the game can action; this is what the commandProcessing code does, as sketched below.
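To illustrate the idea (this is not the actual commandProcessing code, and uses a toy vocabulary in place of the real look-up tables), tokenising a two-word command might look something like this:

package main

import (
	"fmt"
	"strings"
)

type wordClass int

const (
	MOTION wordClass = iota
	OBJECT
	ACTION
	UNKNOWN // the real game also recognises NUMERIC words
)

// Toy vocabulary; the real tables are built from the generated dungeon data.
var vocab = map[string]wordClass{
	"north": MOTION, "enter": MOTION,
	"lamp": OBJECT, "rod": OBJECT,
	"get": ACTION, "drop": ACTION,
}

type command struct {
	verb, noun string
	verbKind   wordClass
	nounKind   wordClass
}

func tokenise(line string) (command, error) {
	words := strings.Fields(strings.ToLower(line))
	if len(words) == 0 || len(words) > 2 {
		return command{}, fmt.Errorf("please use one or two word commands")
	}
	cmd := command{verb: words[0], verbKind: lookup(words[0])}
	if len(words) == 2 {
		cmd.noun, cmd.nounKind = words[1], lookup(words[1])
	}
	return cmd, nil
}

func lookup(w string) wordClass {
	if kind, ok := vocab[w]; ok {
		return kind
	}
	return UNKNOWN
}

func main() {
	cmd, _ := tokenise("get lamp")
	fmt.Printf("%+v\n", cmd) // {verb:get noun:lamp verbKind:2 nounKind:1}
}

The structured command is then dispatched to the appropriate handler, which mutates the game state and produces the next piece of output.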

Since the entire game has relatively few locations and objects (by today’s standards), everything is easily handled procedurally with tight coupling of game logic and game state. There is explicit handling for every location and state.

At times the code takes some work to follow, but on the whole it is quite simple.

My Minions Claude and Copilot

I implemented the main game loop, state management, command tokenising and command processing logic by hand, as I had done with Elite. Once I had it all working, the task of translating all the distinct actions and locations quickly became laborious. I was looking forward to implementing the fun bits like the TUI, but it was taking me hours to do just the basic translation work. So here is where AI saved the day. Happily I was able to use it without totally ruining the experience and sense of achievement.

With Claude Opus 4.5 at my fingertips, I expect it would have been quite simple to just point Claude at the C code and say ‘Translate this game from C to Go’ and Claude would have done the whole thing for me, probably better than I did. But then it wouldn’t be my version of the game. It would have been a lot of incoherent (to me) generated code that would have ticked the ‘make this thing into a Go app’ box, but not the ‘make my own version of the game and understand the original’ box. So I am glad I explicitly wrote the core game engine and logic myself.

Once I had set up the game code the way I wanted it, I could then use Claude Code and Copilot CLI. I could switch between the two quite easily, making the most of the tokens available to me! Claude Code is a more sophisticated tool than Copilot CLI, with its sub-agents, skills and plug-ins, but for the basic drudge work of translating the bulk of the C code, both tools were more than up to the job. This approach allowed me to use the AI assistants for long stretches to do the ‘donkey work’ of translating the rest of the action handling code from C to Go while keeping to my game framework. This worked spectacularly well and I was able to accelerate completion of a basic working version of the game by days.

Another place the AI tools accelerated me was in fixing annoying corner-case bugs as I play tested the game. The speed at which AI can isolate the cause of a bug hiding deep in the game logic is a real timesaver. Tracking down esoteric issues is hair-pullingly frustrating and often leads to giving up, so it’s a real game changer to have AI tools as an unblocker here.

This allowed me to focus on executing my basic vision for the TUI, which I enjoy far more than the grind.

Terminal User Interface

My absolute favourite Go package is the Terminal User Interface (TUI) framework Bubbletea from Charm. I love to find excuses to use it, so of course I had to use it for this version of Adventure.
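For anyone who hasn’t used it, Bubbletea programs are built around a model type implementing Init, Update and View. Here is a stripped-down sketch of how a game like this can hang off that interface; it is not my actual TUI code, just the shape of it:

package main

import (
	"fmt"
	"os"

	tea "github.com/charmbracelet/bubbletea"
)

type model struct {
	location string // current location description
	output   string // last game response
	input    string // command being typed
}

func (m model) Init() tea.Cmd { return nil }

func (m model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	switch msg := msg.(type) {
	case tea.KeyMsg:
		switch msg.Type {
		case tea.KeyCtrlC:
			return m, tea.Quit
		case tea.KeyEnter:
			// hand m.input to the command processor here
			m.output = "You typed: " + m.input
			m.input = ""
		case tea.KeyRunes:
			m.input += string(msg.Runes)
		}
	}
	return m, nil
}

func (m model) View() string {
	return fmt.Sprintf("%s\n\n%s\n\n> %s", m.location, m.output, m.input)
}

func main() {
	start := model{location: "You are standing at the end of a road before a small brick building."}
	if _, err := tea.NewProgram(start).Run(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}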

Starting the game with no command line options drops you into the TUI:

Adventure TUI

At first I built a main screen with lots of panes and windows for different aspects of the game, but I quickly realised that overdoing it detracted from the intended way to experience the game. So I stripped it right back to just five key elements:

  • Description of the current location and items. A simple affordance to mitigate the need to constantly ‘LOOK’
TUI Location
  • Inventory and Recent Moves. The Inventory pane is a nice affordance I think, letting the player avoid many tedious invocations of ‘INVENT’. I also realised that because the command output isn’t echoed to the game output in the TUI version, it’s easy for a player to lose track of what moves they have made recently, so the recent moves pane is important. That said, the game doesn’t necessarily allow you to logically backtrack once you get lost in a maze or in the woods at the start of the game, so recent moves aren’t necessarily always helpful for pathfinding!
TUI Invent
  • Main Game Output
TUI Output
  • Command Entry with Score and Turns output on the footer
TUI Command

This kept it minimal and practical. I then used AI agents to do the one thing I do find laborious, and that is styling the UI elements. I set up the style I wanted and got the agents to complete the styling.
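Styling in the Charm ecosystem is usually done with the companion Lip Gloss library. As a small sketch of the kind of pane styling involved (assuming Lip Gloss, and with placeholder colours and sizes rather than the ones used in the game):

package main

import (
	"fmt"

	"github.com/charmbracelet/lipgloss"
)

func main() {
	// A rounded-border pane, roughly how location and inventory panes can be framed.
	pane := lipgloss.NewStyle().
		Border(lipgloss.RoundedBorder()).
		Padding(0, 1).
		Width(40)

	title := lipgloss.NewStyle().Bold(true).Foreground(lipgloss.Color("205"))

	location := pane.Render(title.Render("Location") + "\nYou're inside a building, a well house for a large spring.")
	inventory := pane.Render(title.Render("Inventory") + "\nbrass lantern\nset of keys")

	// Lay two panes side by side, as a multi-pane TUI does with its panels.
	fmt.Println(lipgloss.JoinHorizontal(lipgloss.Top, location, inventory))
}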

I am still not 100% happy with the TUI, but it’s a good start and I can refine from here.

You can still play the game without the TUI by passing the -notui option on the command line.

OpenTelemetry Tracing

There are a couple of unique features in this version of Adventure. The first is the novel addition of OpenTelemetry tracing to the game.

Site Reliability Engineering, with observability principles such as tracing, error budgets and Service Level Objectives, has been part of my day job for the last five years. So I thought I ought to add telemetry to at least one of my own toy applications.

While this is far from a pure observability-driven implementation, it is a lot of fun. If you start the game with the -trace option, then as you traverse the locations in the game world, the game will emit spans that can be collected by an OpenTelemetry Collector and sent to a telemetry store and tracing backend. These spans then log the player’s progress through the game in a unique way.

I played around with lots of ways to trace the game and most of them just didn’t represent it very well. I settled on simply emitting a span for each location a player visits. This isn’t strictly what you want in a ‘proper’ trace, where you’d have a waterfall of spans within a single trace, but as the game loop executes that just results in an endless waterfall which doesn’t work very well. I did play around with some more sophisticated trace output where traces were ended at key points and new traces started, but it was quite messy. Like the TUI, I will probably revisit this and have another go at making more interesting traces. That said, I think the current implementation is pretty neat.
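This sort of thing is done with the OpenTelemetry Go SDK. A minimal sketch of the approach, assuming the OTLP/HTTP exporter pointed at a local collector (span names and attributes here are illustrative, not the game’s exact ones):

package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Export spans over OTLP/HTTP to a local collector
	// (the Jaeger all-in-one container listens on port 4318).
	exporter, err := otlptracehttp.New(ctx, otlptracehttp.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	defer func() { _ = tp.Shutdown(ctx) }()
	otel.SetTracerProvider(tp)

	tracer := otel.Tracer("goAdventure")

	// Emit one span per location visited, roughly what the -trace option does.
	for _, loc := range []string{"end of road", "inside building", "debris room"} {
		_, span := tracer.Start(ctx, loc)
		span.SetAttributes(attribute.String("game.location", loc))
		span.End()
	}
}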

The easiest way to see this in action is to use the Jaeger All-in-one docker container. This spins up a collector and the OpenTelemetry compatible Jaeger distributed tracing platform. Simply run:

docker run -d --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest

Then browse to http://localhost:16686. The service is called goAdventure and, rather than a few complete traces containing a number of spans, you’ll see a complete abuse of tracing: a large number of small traces with just a few spans each.

Traces

Each trace plots a turn or location within the game. Distributed tracing this is not! But it is fun.

Trace Detail

Large Language Models vs the Colossal Cave

The other feature I implemented was getting a Large Language Model (LLM) to play the game. If you have a reasonable machine with ollama installed and a few local models available, you can get them to play the game by starting it with the -ai command line flag and using the -model option to choose the model you’d like to use. For example:

./goAdventure -trace -ai -model qwen2.5:14b

This will have the qwen2.5 14-billion-parameter model play the game. Claude suggested the llama3.1:8b, mistral:7b and qwen2.5:14b models as good at ‘instruction following’ and therefore potentially good players. I tried all of them, as well as deepseek-r1:8b and 70b. I can run the 70b version on my Mac (slowly), but it failed to play the game because I had to set too long a timeout and too small a context window for it to work.
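For flavour, here is a rough sketch of how a local model can be driven over ollama’s HTTP API, in the spirit of the -ai mode: send the latest game output to the /api/generate endpoint and read back the model’s next command. The real implementation’s prompting, conversation history and retries are simplified away, and the prompt text below is made up for illustration:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

type generateResponse struct {
	Response string `json:"response"`
}

// nextCommand asks a local ollama model for the next move given the latest game output.
func nextCommand(model, gameOutput string) (string, error) {
	body, err := json.Marshal(generateRequest{
		Model:  model,
		Prompt: "You are playing Colossal Cave Adventure. Reply with a one or two word command only.\n\n" + gameOutput,
		Stream: false,
	})
	if err != nil {
		return "", err
	}
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

func main() {
	cmd, err := nextCommand("qwen2.5:14b", "You are standing at the end of a road before a small brick building.")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("model says:", cmd)
}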

I assumed the models would have Colossal Cave somewhere in their training data, and that while they might stumble a bit, they would probably have the statistical weights correctly represented to solve the game. I was wrong.

All of the local models were rubbish at the game. Even with:

  • A detailed system prompt
  • A walkthrough of the start of the game to dial in the structure and instruction model
  • A basic reinforcement learning system
  • Some heavy hints
  • Exposure to more game state than a human player gets
  • Explicit instructions on how to issue commands

The LLM would get lost and start to thrash almost immediately.

Here is an example. qwen2.5:14b playing without the TUI:

LLM Struggles 1

With the walkthrough instructions, the LLM has made it to the cave and found the bird. It recognises the hint in its prompt about catching the bird, but then completely ignores the explicit instructions in its prompt to use only two-word command phrases, and ignores the game telling it the same!

Here is the LLM playing with the TUI enabled:

LLM Struggles 2

Despite the walkthrough and feedback from the game, the LLM gets lost on the way to the cave and just starts to do nonsense things.

I suppose this makes sense when you think about what is going on. There are no embeddings in models that make sense of even the simple path through the game.

LLMs are of course not a simple drop-in replacement for game-playing ‘AIs’. I didn’t think they would be, but in this case I thought they might know about Colossal Cave well enough to issue statistically accurate commands in sequence and get some way through the game. This is not the case.

It’s a good reminder that these LLMs are not as adaptable or as good at reasoning as we trick ourselves into thinking they are, nor anywhere close to general intelligences. It’s amazing to think that these models could comfortably write this game, to all appearances understanding how the code and the game mechanics work, but completely fail to play it!

I tried to tune it a lot with the help of Claude Code, but nothing made much difference. I am sure there are clever things I (or Claude) could do to make it better. Here is what Claude said when I asked it why LLMs were so bad at this:

Claude on LLM’s

I had a further conversation with Claude because I wanted to understand better why LLMs were so bad at this when there are bound to be walkthroughs in the training data. The answer is great for a broader understanding of the limits of LLMs. Claude said:

The core issue: knowing about vs. knowing how

Training data almost certainly contains walkthroughs, discussions, and transcripts of Colossal Cave. But there’s a crucial difference between having seen text about something and being able to execute it coherently. The LLM has absorbed patterns like “XYZZY is a magic word” or “you need the lamp in the dark areas” as isolated facts, not as an executable procedure tied to game state.

State tracking is the killer

Text adventures are fundamentally state machines. Playing well requires:

  • Tracking your current location
  • Remembering your inventory
  • Recalling what you’ve already done
  • Planning actions based on all of the above

LLMs don’t maintain state between outputs. Each response requires reconstructing all of this from the conversation history, which degrades quickly. By turn 30, the model is struggling to keep track of where it is, what it’s carrying, and what it was trying to accomplish.

Retrieval doesn’t work the way you’d hope

Even if the model has “memorized” a walkthrough, it can’t just play it back step by step. It would need to know where in the walkthrough it is and adapt if anything goes slightly differently from the memorized version. One unexpected room description or parser message throws off the mapping entirely.

No feedback loop

Humans learn to play by trial and error: try something, observe results, adjust. LLMs have no such mechanism at inference time. They’re generating “plausible next commands” based on pattern matching, not systematically learning what works in this game session.

Parser brittleness

The model might generate “pick up the brass lantern” when the game wants GET LAMP. Small mismatches cascade into failures, and the model often can’t diagnose why its commands aren’t working.

Conclusion

This project was great fun. Getting to know the code of another seminal game in my gaming history was really rewarding. Using AI coding tools to accelerate and unblock progress made it even more enjoyable, and provided the time and assistance to add in some fun features.

The ‘LLM plays the game’ feature was a revelation and super interesting.

All the code for the game and the binaries are available on my GitHub.