AI coding agents are genuinely impressive right now. You give Claude or GPT a task, it reads your code, writes a fix, runs the tests. Awesome. But here's the thing that's been bugging me — they're behaviourally static. Same temperature, same approach, same level of caution whether they just broke the build or nailed it first try. It's like having a developer who never adjusts their approach based on what just happened.

That felt like a problem worth poking at. And the answer I landed on came from a pretty unexpected place — a roundworm with 302 neurons.

The static agent problem

If you've used AI coding agents for any real work, you've probably noticed this. They don't adapt mid-task. The agent doesn't get more cautious after breaking something. It doesn't get more exploratory when it's stuck. It just keeps doing the same thing with the same settings, tick after tick.

I've spent years managing engineering teams — at Vend, at Xero, at Cotiss — and the best engineers constantly adjust their approach. They slow down when things are fragile. They get creative when they're blocked. They know when to stop. Current AI agents don't do any of that.

So the question became: what if you could build a control layer that sits outside the AI and adjusts its behaviour in real time based on how things are going?

Enter c302

c302 is a research project I've been building, named after the c302 model from the OpenWorm project: an open-source computational model of the Caenorhabditis elegans roundworm's nervous system.

C. elegans is a fascinating little creature. It's got exactly 302 neurons and roughly 7,000 synaptic connections, and it was the first organism to have its entire nervous system mapped. Every neuron, every connection. That's it — that's the whole brain. And with those 302 neurons, it navigates its environment, finds food, avoids danger, and adapts its behaviour based on what's working and what isn't.

My project uses those neural dynamics as a behavioural controller for an LLM coding agent. The worm doesn't write code — it doesn't see code, prompts, or any LLM output at all. It sees five simple signals about what just happened (did tests pass? did the build break? how big was the change?) and emits seven parameters that constrain how the AI behaves on the next step — things like mode (diagnose, search, edit, test), temperature, token budget, and how aggressive the edits should be.
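To make that shape concrete, here's a minimal Python sketch of the interface. The type and field names are illustrative placeholders rather than the project's real API, and I've only spelled out the four parameters named above:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """The five simple signals the controller sees (illustrative names)."""
    tests_passed: bool           # did the test suite pass?
    build_broken: bool           # did the last change break the build?
    diff_size: int               # how big was the change, in lines?
    steps_without_progress: int  # how long since things last improved?
    reward: float                # scalar feedback for the controller

@dataclass
class BehaviourParams:
    """What the controller emits to constrain the next step (four of seven shown)."""
    mode: str                    # "diagnose" | "search" | "edit" | "test"
    temperature: float           # sampling temperature for the next LLM call
    token_budget: int            # how many tokens the agent may spend
    edit_aggressiveness: float   # 0.0 = tiny tweaks, 1.0 = sweeping rewrites
```

The point of the split is that the controller never needs to understand code: it maps a handful of numbers about outcomes to a handful of numbers about behaviour.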

It's a closed loop. The AI acts, the outcomes get observed, a reward signal feeds back into the controller, and the controller's internal state shifts. Then the next set of behavioural parameters reflects that changed state.
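Here's that loop as a toy program. `WormController` is a one-variable stand-in I wrote purely for illustration; the real controller's dynamics come from the connectome model, not from anything this simple:

```python
class WormController:
    """Toy stand-in for the connectome-derived controller.

    A single 'arousal' variable plays the role of internal state:
    bad outcomes raise it, good outcomes calm it down, and the
    emitted behavioural parameters shift accordingly.
    """
    def __init__(self):
        self.arousal = 0.5

    def step(self, reward):
        # Negative reward (broken build, failing tests) raises arousal.
        self.arousal = min(1.0, max(0.0, self.arousal - 0.2 * reward))
        return {
            # More aroused -> more cautious: cooler sampling, smaller
            # edits, and a switch into diagnosis rather than more editing.
            "temperature": round(0.9 - 0.6 * self.arousal, 2),
            "edit_aggressiveness": round(1.0 - self.arousal, 2),
            "mode": "diagnose" if self.arousal > 0.7 else "edit",
        }

# One closed-loop episode with a scripted run of outcomes:
controller = WormController()
for reward in (1.0, -1.0, -1.0, -1.0, 1.0):
    params = controller.step(reward)
    # ...the agent would act here, constrained by `params`,
    # and the next reward would come from observing the repo.
```

After a run of failures the sketch controller drops the temperature and flips into diagnose mode, which is the qualitative behaviour the real dynamics are meant to produce.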

Why a worm?

This sounds ridiculous, I know. But there's a serious idea underneath it.

C. elegans has had roughly a billion years of evolution to solve what's fundamentally a control problem — how do you steer behaviour adaptively with minimal hardware? Its nervous system doesn't have the luxury of scale. It can't throw more neurons at the problem. Instead, it's evolved incredibly efficient circuits for things like balancing exploration and exploitation, persisting on a strategy when it's working, and backing off when it's not.

Those are exactly the problems a coding agent faces. When should it keep editing? When should it pause and run the tests? When should it try a completely different approach? When should it give up entirely?

The bet is that the control architecture biology evolved for navigating a physical environment might generalise to navigating a software engineering task. Not because worms and code have anything in common — but because the control problem is structurally similar.

How this differs from other C. elegans + AI work

There's actually a growing body of research connecting C. elegans neuroscience to artificial intelligence, and I want to be upfront about where my work sits relative to it — because the approach is quite different.

Most existing work uses the connectome to build AI. The Elegans-AI project used the connectome's topology to design neural network architectures for image classification. BAAIWorm, published in Nature Computational Science in 2024, built an incredible closed-loop simulation of the worm's brain, body, and physical environment — the worm navigating through a 3D fluid simulation. A recent Frontiers paper on translating C. elegans circuits into AI reviews the whole movement, and their framing captures it well: extracting functional principles from the connectome to design better neural network architectures.

All of that work asks: can the worm's brain make a better AI model?

My project asks something different: can the worm's brain control an existing AI model?

The LLM stays completely untouched. Claude Sonnet does all the reasoning, all the code generation, all the problem-solving. The connectome-derived controller sits outside it and modulates how it behaves — not what it knows. It's a control layer, not a model architecture.
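In code terms, "modulating how it behaves" just means the controller's output becomes the settings on the next model call. A sketch under the assumption of a generic `client.complete(...)` API (made up for illustration, not any vendor's real SDK):

```python
def next_agent_step(client, params, task_context):
    """Apply the controller's emitted parameters to one LLM call.

    The model itself is untouched: only the request settings and the
    framing of the task change from step to step.
    """
    system_by_mode = {
        "diagnose": "Investigate the failure. Do not edit files yet.",
        "search": "Explore the codebase and gather relevant context.",
        "edit": "Make the smallest change that could plausibly fix the issue.",
        "test": "Run the test suite and report what you find.",
    }
    return client.complete(
        system=system_by_mode[params["mode"]],
        prompt=task_context,
        temperature=params["temperature"],  # caution level, set by the controller
        max_tokens=params["token_budget"],  # spend ceiling, set by the controller
    )
```

Nothing about the model's weights or knowledge changes; swap in a different controller and the same LLM behaves differently.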

BAAIWorm is probably the closest conceptual parallel. They've got the same closed-loop structure — brain modulates behaviour, environment provides feedback. But their environment is physics (fluid dynamics, muscle actuation, chemotaxis). Mine is a git repository with failing tests.

A disclaimer about who I am

I should be really clear about something: I'm not a neuroscientist. I'm not a research scientist of any kind. I'm a software engineer who's spent the last fifteen years building products and leading engineering teams in the NZ tech scene. My education is from a polytechnic in Christchurch — practical, not academic.

This is hobbyist research. Citizen science, if you want a fancier term for it. I had a question that genuinely fascinated me, I had access to open-source tools like the OpenWorm simulation models, and I had evenings and weekends. The FlyWire connectome project showed that amateur contributors can make real contributions to neuroscience — they credited volunteer researchers as authors on peer-reviewed papers. That gave me confidence that you don't need a PhD to ask interesting questions.

I'm going to share my findings honestly — including every limitation, everything I got wrong, and every place where the data doesn't support the conclusion I wanted. I've run 167 experiments across three difficulty levels. The experimental pipeline is solid. But the sample sizes are small, and most individual comparisons aren't statistically significant at p<0.05. I'll be upfront about all of that.

What's coming in this series

This post is the first in a series where I'll walk through the entire c302 research project — what I built, what I found, and what it might mean.

The series will cover the experimental results across three difficulty levels — from a trivial single-function task where every controller succeeds, through multi-file coordination where differences start appearing, to regression recovery where the adaptive controller's error detection actually matters. I'll share the actual data, the traces, the cost breakdowns, and the honest limitations.

After that, the really interesting bit — Phase 2, where the hand-tuned synthetic controller gets replaced with actual connectome-derived neural dynamics from the c302 simulation. That's where the neuroscience question gets answered: does biological provenance actually add something that hand-tuning can't replicate?

The tools are all open source. The question — whether a billion years of evolved neural architecture can improve how we control AI systems — is genuinely one of the most interesting things I've worked on.