Introducing seeds

Almost all of my software work happens through AI coding agents now: the kind that read a codebase, write code, and run commands on my behalf while I steer. These agents get dramatically more useful when I give them tools built for them to operate rather than for me. I learned that lesson from beads, a git-backed issue tracker that the programmer and blogger Steve Yegge built for his AI agents to use.

This post is about a small CLI tool I built called seeds. Its name is a deliberate rhyme on beads with a gardening twist, and it exists because of what beads showed me was possible.

I built seeds because plan files were failing me. The deliberation that went into a plan was being silently overwritten every time the AI rewrote the plan file. Beads, meanwhile, was working — after several months of heavy use I had watched my AI agents pick up the beads CLI and wield it as a native extension of themselves, as a bridge between me and the work they were doing. That broke open what I thought tools could be. So I went looking for something that did the same trick for the deliberation behind decisions, not the decisions themselves. I couldn’t find one. So I built seeds.

The relationship with beads

Steve Yegge built beads — in his words, “an issue tracker. A special one,” and “a tool that AI has built for itself.” The problem he was solving: coding agents have, in his framing, “no memory between sessions — sessions that only last about ten minutes.” After installing beads, my agents started saying things like “I notice all your tests are broken … and I’ve filed issue 397 to get them working again” — flagging unrelated work without losing the current task, and trusting that the issue would still be there when somebody came back to it. The agent doesn’t forget. The work survives until I’m ready for it.

Seeds catches the work that comes earlier. Plans, deliberations, questions, half-baked ideas, things I haven’t decided are work yet, and may decide are not work at all. A seed is allowed to be deferred indefinitely. A seed is allowed to be abandoned (with a reason). A seed is allowed to spawn six children that disagree with each other and can be resolved at a later time. Beads, designed around an execution lifecycle, doesn’t appear to me to have a comfortable place for any of that.

Yegge has been explicit about beads’ scope. From Beads Best Practices: “Everyone is focused on making planning tools, and Beads is an execution tool.” The one product he names as a “planning tool” is OpenSpec. On the project’s GitHub discussions he put the same point a different way: “I love the idea of planning tools, but I think they belong in a completely separate layer from Beads.” And in Beads Blows Up he draws an even sharper line: “Finished issues and future issues don’t really belong in Beads — it’s best to keep them in a separate store.”

The way I think about this now: there are really three layers. Planning tools like OpenSpec, GitHub Spec Kit, and the various task-master flavors produce a spec or a PRD or a task list as their output. Execution tools like beads track what is being built right now. Deliberation tools capture the reasoning that decides whether something becomes plannable in the first place — and that’s the layer seeds is trying to be. Yegge’s complaint about “everyone making planning tools” is about the first bucket, not the third. I read it as agreement, not opposition.

Deliberation has always mattered, but now it’s cheap to track

One thing I have learned across two decades of maintaining long-lived software is that the decisions behind a piece of code are often just as important as the code itself. Why a feature behaves the way it does, why some other feature was considered and dismissed, what data we did and didn’t trust at the time — that information shapes every future decision we’ll make about the same project. I have lost count of the number of times I have sat in a meeting to discuss a feature, only to dimly remember that we considered and rejected it years ago for reasons I can no longer recall, and the meeting becomes an exercise in re-deriving conclusions we already had.

Historically, I never bothered to capture the majority of what went into those decisions. I was too busy participating in the discussion to take minutes. I was too busy implementing to record the thinking that went into the implementation. For a human, documentation is expensive. For a human, retrieval from a large pile of freeform text is also expensive. For a solo developer, the math never quite worked out.

Both costs have plummeted with LLMs. Agents will happily transcribe meetings, summarize decisions, and write copious documentation alongside the code. They will also happily go off and read every document, transcript, and issue in a project to pull together a coherent picture from all of it. Capture is now cheap. Retrieval is now cheap. The value proposition for keeping a real deliberation log is completely different than it was just a couple of years ago.

Why plan files don’t carry it

The oft-recommended plan-file-based workflow turned out to be a slow leak. For me the workflow looked like this: the AI and I converge on a plan via a conversation. The AI writes the plan into a file. I give the AI feedback. The AI rewrites the file based on that feedback. The new file is cleaner and more confident, but silent about the alternatives we had considered, the ideas I rejected, the questions I asked along the way. The plan that lands on disk is the destination. The journey is in the chat scrollback, which gets summarized, compacted, and, by default, silently discarded after 30 days.

Eventually I noticed that, for the plan documents I cared about, I had started either withholding feedback or maintaining ever-incrementing versions, because the feedback would overwrite the very history I wanted to preserve. Important information was slipping through my fingers, and I was distraught.

Looking for an existing tool

For me the question became: how do I capture and organize all these decisions? That was the wrong question, so I got the wrong answer: Architecture Decision Records. I read up on ADRs, played with the small set of CLI tools that exist for them, and got partway through building one of my own. ADRs are a great pattern. They were not the pattern I needed.

ADRs capture the decision after it has been made. They summarize, neatly and concisely, what was decided, along with a little of what was considered. To me that’s only one step removed from a plan file. The best software I have worked on came out of hour-long conversations that weren’t reviews of decisions but deep dives into the problem domain — the messy, almost philosophical, in-the-weeds part where the interesting solutions actually live. ADRs, as I have come to understand them, are not intended to capture that.

Instead I started asking a different question: how do people capture and organize all the discussion that goes into a decision? That is where I came across deliberation tools, decision software, IBIS (Issue-Based Information System), and dialogue mapping. It is a rich topic, and it was exactly what I had been looking for. It is also a bit of a ghost town. There is a literature stretching back to the 1960s and lots of frameworks, but almost nothing in active use. The tools that exist are too formal to reach for in the moment, and the moment is where the actual thinking is happening. More importantly, the tools were written for humans, and I needed a tool written for agents, by agents.

I figured I’d put together a prototype to see how it went.

seeds, briefly

I called the tool seeds. It is, deliberately, a rhyming, gardening cousin of beads. It’s also about the metaphor: ideas are like seeds — they take time to germinate, they need to be nurtured and cared for, they can eventually sprout and blossom into something amazing.

A taste of what it looks like:

seeds jot "We should write native PostgreSQL-flavored SQL and then transpile it for other RDBMSes"
# Created seed-a1b2: We should write native PostgreSQL-flavored SQL and then transpile it for other RDBMSes

seeds ask "are there any good libraries for transpiling SQL" --seed seed-a1b2
seeds answer q-c3d4 "There are transpilation libraries but only the following are written in the language used by this project: ..."

seeds resolve seed-a1b2

There are more verbs (explore, defer, abandon, link, tree, prime), a SQLite store underneath, and a JSONL export so the deliberation graph stays git-trackable. The README has the rest. The point is that every seed has a body the agent can fill with its investigations and rationale, and that every state transition — resolving, deferring, abandoning — takes a reason. Both habits leave a trail of why.
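To make those two habits concrete, here is a minimal Python sketch of a record with a free-form body and reason-bearing transitions, serialized one object per line. It is not the seeds implementation or its schema: the Seed class, the field names, the status values, and the to_jsonl helper are all hypothetical, chosen only to illustrate the idea.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical status names for illustration; the real tool may differ.
ALLOWED_STATUSES = {"open", "resolved", "deferred", "abandoned"}


@dataclass
class Seed:
    id: str
    title: str
    body: str = ""          # investigations and rationale accumulate here
    status: str = "open"
    history: list = field(default_factory=list)  # one entry per transition

    def transition(self, new_status: str, reason: str) -> None:
        """Change status, refusing to do so without a stated reason."""
        if new_status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        if not reason.strip():
            raise ValueError("a state transition needs a reason")
        self.history.append(
            {"from": self.status, "to": new_status, "reason": reason}
        )
        self.status = new_status


def to_jsonl(seeds):
    """Serialize one JSON object per line so git diffs stay line-oriented."""
    return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in seeds)


seed = Seed(id="seed-a1b2", title="Write PostgreSQL-flavored SQL and transpile it")
seed.body += "Surveyed transpilation libraries; see answered question q-c3d4.\n"
seed.transition("resolved", "Found a transpiler written in the project's language")
print(to_jsonl([seed]))
```

The design choice worth noticing is that the reason is a required argument rather than an optional note, so the trail of why cannot be skipped by accident.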

Now an awkward admission. I have never personally invoked the seeds CLI. Not once. I have probably read less than one percent of the content in my own seeds databases. To be fair, I have the same approach with beads — after several months of heavy use, I do not believe I have ever run bd ready from a terminal. Both tools, for me, are magical black boxes mediated entirely through a session with an AI agent. The tools work well enough that I haven’t had to pop the hood and muck around.

This shaped the design of seeds more than anything else. The bar I set early on was: if seeds required me to use it, it was not going to get used. It had to feel like a natural extension of the AI’s working memory. Flag-based, atomic, no interactive prompts; every command does one thing and takes simple arguments. The interface is for the agent. The agent is the interface for me.

The workflow rhythm

This is the workflow I’ve fallen into with seeds over the last five months. I run it multiple times a day across about a dozen different projects. By the time I’m done, more often than not I end up with a set of beads that my agents use to one-shot an entirely new feature for a given project. Every time it feels like magic.

Step zero is the brain dump. I hop into a project, fire up dictation, and talk through every stray thought I have about what I might want — the gist, the weird ideas, the directions I’m leaning, the parts that worry me. I don’t try to be organized.

Step one is “turn this into seeds.” The agent reads the dump and creates a batch of seeds — some ideas, some concerns, some open questions, some early decisions. Messy and partial and exactly what I wanted.

Step two is the agent interviewing me. It asks for clarification on things I said and only half-explained. It asks about things I didn’t say but should have considered. It flags places where my own thoughts contradict each other. I answer what I can, and ask questions when I don’t know the answer.

Step three is the magic part. The agent figures out which questions need to be answered before others can be opened up — which seeds gate the rest. It does this completely without my intervention or guidance. The agent builds parent-child relationships, tags things, links related seeds, and surfaces which open questions are foundational. Now I have a database of ideas that are ready for curation and further exploration.
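The gating idea in step three can be pictured as a dependency graph. The sketch below is only an illustration of that shape, not how seeds or the agent actually does it: the seed names and the gated_by mapping are made up, and the ordering uses Python's standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical seeds, each mapped to the seeds that must be settled first.
gated_by = {
    "choose-transpiler": {"sql-dialect"},
    "ci-matrix": {"choose-transpiler", "supported-databases"},
    "supported-databases": set(),
    "sql-dialect": set(),
}


def foundational(graph):
    """Seeds with no prerequisites: the questions that gate the rest."""
    return sorted(seed for seed, deps in graph.items() if not deps)


def resolution_order(graph):
    """One valid order in which every seed follows the seeds gating it."""
    return list(TopologicalSorter(graph).static_order())


print(foundational(gated_by))
order = resolution_order(gated_by)
print(order)
```

Several orders can be valid for the same graph, so the useful output is less the exact sequence than the guarantee that no seed is opened before the ones it depends on.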

Step four is selecting a few seeds to nurture. Sometimes I know what I want to start implementing first. Sometimes I ask my agent what it thinks a good candidate feature might be. We zero in on a handful of seeds to flesh out and refine. One of the things I love best about seeds is that it helps the agent and me stay focused. We home in on the seeds we want to address now and defer the ones that are related but not immediately relevant. The deferred seeds are still there, and the agent will often take them into consideration as we deliberate, which keeps the new feature on track and in alignment with other potential future work. I’ve never been so focused in a planning session as I am with seeds.

Step five is cycling on nurturing until we’re done. In considering our selected seeds, sometimes the agent has questions so I answer them, and the agent updates the relevant seeds with my answers. Sometimes I have questions so the agent investigates by reading the docs, looking at the code, running a query against the data, searching the web, etc., and then answers me and updates the seed with its findings.

This is one of the aspects of seeds that gives me the most peace of mind. The conversation isn’t just in chat that evaporates at the next compaction; the thoughts and findings land in the seeds and stay there. Months later, when some future agent revisits a seed, it has the same information that was on the table when the decision was made. Not a summary. The actual findings.

Eventually the agent and I run out of things to consider. We have made virtually all the decisions we needed to make for this session, and we got there together, organically. Agents are often chomping at the bit to start implementing long before I’m ready, and even with seeds my agent will eventually say: “hey, want me to make some beads so we can get to writing code?” The difference is that, after following all of the workflow above, my answer is often yes. We have spent enough time thinking. The agent has not prematurely jumped to implementation, and I have not lingered in planning past usefulness.

Step six is the handoff. “Make some beads out of these seeds.” The actionable seeds become bd issues with their seed-body context attached, and implementation begins.

Step seven is the loop, lighter than it used to be. More often than not, implementation goes off without a hitch — no revisiting, no weird surprises, not much going back to the drawing board. Certainly nothing like the churn I used to live in with plan files. When the drawing board is needed, I revise existing seeds or supersede them with new ones. Resolved seeds occasionally get reopened because reality disagreed. The deliberation feeds the next round of implementation.

The surprises

I built seeds because I wanted to capture deliberation. I got that. I also got four things I didn’t predict, and together they’re why I cannot go back to plan files.

Focus. Seeds aren’t a giant document inviting me to add another sentence. They’re discrete pieces of ideas, decisions, and questions. When the seeds I care about are resolved, I stop. Back when I was using plan files, an AI in a coding context was unreasonably satisfied with insufficient planning — pressuring me to jump into implementation when there was real thinking still to be done. Conversely, once I had nudged the AI into a planning context, it would happily stay there: plan, nitpick, over-plan, overthink, as far as I was willing to engage. I was already suffering scope creep in my plan files, before a line of implementation got written. With seeds, I no longer end up with plans ten times the size they should be.

Peace of mind. The deliberation isn’t being lost. The level of detail and permanence I want is the level I’m getting. I don’t feel the slow leak anymore. This is the surprise I appreciate most when I sit down to plan something: there’s no anxiety humming in the background that the thinking I’m doing right now is going to evaporate by next week.

Better implementations. Implementation goes off without a hitch more often than it used to — fewer surprises, less revisiting, less churn. I cannot disentangle this from the AI getting better at implementation in general — that is a real and ongoing trend — but the shape of the improvement is consistent with what the deliberation log is doing. The agent walks into implementation with the rationale attached to each beads issue, and it doesn’t have to re-derive it.

Tighter iteration. The cycle between planning and implementing has shortened. I’m no longer fighting an AI that wants to either start coding right now or plan forever. One pattern that has quietly become my default: when I’m planning a feature, I often have a pie-in-the-sky version in mind — what the feature could become three years out. Rather than fight that, I capture the pie-in-the-sky vision as one seed, with all its nuance and complexity intact. Then I capture the actually-needed-now version as a separate seed and implement that. The pie-in-the-sky seed gets deferred. It sits in the database with the thinking already underway; if the bigger version is ever called for, the deliberation has already started. The complexity gets parked, not lost. I get to be ambitious during planning without paying for it during implementation.

I don’t know how much of this is the tool, how much is me getting better at working with AI, and how much is AI itself getting better over time. Teasing that apart is intractable. What I can tell you is that the four together feel real, and feel different from what I had before.

Where this has shown up in real work

A few shapes seeds has taken in real projects, anonymized.

ETL design. I do healthcare-data ETL for a living. ETL is a parade of small, fiddly decisions: which incoming columns get used, which transformations apply to each, which rows get dropped or merged or backfilled, and why. The “why” is what a downstream consumer needs in order to trust the resulting data. I have watched groups try to capture that “why” in documentation, but there is never any guarantee that the documentation tracks the implementation, and the day-of decisions made by the people doing the work almost never bubble back up. In my last two ETL design sessions I used seeds to capture every characterization query, every decision, every compromise, with a level of fidelity I have never been able to maintain before. When my downstream consumers had questions about the data, seeds was there to tell me not only what had happened, but why the ETL was done that way.

Greenfield projects. Every greenfield project I have started in the last four months began with seeds. So far I have used the brain-dump → triage → resolve → handoff workflow to design two web apps and three smaller projects in support of those web apps. The most useful run was a complicated data-collection tool that will eventually ingest from forty-two distinct sources; seeds helped me prioritize which sources to explore first and figure out how to catalog and prepare each one for harvesting.

Decade-old codebases. I have recently been on the receiving end of another round of feature requests for one of my long-running projects — some genuinely useful, several that need real discussion before they go anywhere. I have captured each request as a seed, along with whatever discussion has been had about it, so the topics can be picked up later for further refinement. This is not yet the longitudinal archive of deliberation I ultimately want for every project, but it has demonstrated that seeds can slot into an existing project with very minimal friction.

Things that aren’t software. I was recently invited to join a tabletop role-playing game and I used seeds to brainstorm a character — names, attributes, personality quirks, a homebrewed skill tree. Brainstorming a character throws off a lot of possibilities, most of which get cast aside or refined. Seeds was a fine fit. I suspect there is a fair amount of universality to the planning-and-deciding shape that goes well outside software.

What seeds doesn’t do

Some honest limits, in case anyone is about to talk themselves into this.

Seeds doesn’t enforce completeness. It is a forest, not a checklist. There is no workflow that says “you have twenty source columns and only seventeen of them are resolved, finish the other three.” But it isn’t blind, either. If each column has its own seed, the agent can see that three remain unresolved and bring it up. Completeness is in how you use the tool, not enforced by a workflow.
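The column example above amounts to a simple tally that an agent could run over its own records. Here is a small Python sketch of that check, under assumptions: the column_seeds records, their fields, and the status values are invented for illustration and do not reflect how seeds stores anything.

```python
from collections import Counter

# Hypothetical records: one seed per source column, with made-up statuses.
column_seeds = [
    {"column": f"col_{i:02d}", "status": "resolved"} for i in range(1, 18)
]
column_seeds += [
    {"column": f"col_{i:02d}", "status": "open"} for i in range(18, 21)
]


def unresolved(seeds):
    """Return the columns whose seeds have not been resolved yet."""
    return [s["column"] for s in seeds if s["status"] != "resolved"]


counts = Counter(s["status"] for s in column_seeds)
print(f"{counts['resolved']} of {len(column_seeds)} columns resolved")
print("still open:", unresolved(column_seeds))
```

Nothing here enforces that the three open columns get finished; it only makes the gap visible, which matches the point that completeness comes from how the tool is used.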

Capture in the moment is still hard. Even with the CLI installed and a seeds prime command teaching the agent the deliberation context, the agent doesn’t always reach for seeds when a decision is implicit in the conversation. It is pretty good. It needs reminding. The capture-in-the-moment problem isn’t solved. I’m considering agent hooks and backfill commands but those are still just a couple of seeds I’m letting germinate.

Seeds quietly assumes a very small team. I built it for a solo developer and their agent coworker — at most a team of one or two humans. There’s no multi-user model, no concurrent-editor story, no permissions, nothing for coordinating a crowd around the same deliberation graph. The JSONL export is git-trackable, so in principle a couple of people could share a seeds database the way they’d share any other file in a repo, but I’ve never tried it and I designed nothing for it. If you’re picturing seeds as shared deliberation infrastructure for a large team, that isn’t the tool I built.

Seeds may not work the same for everyone. My agents are increasingly making notes, memories, and agent files for themselves to better tailor their interaction with me. Because I’ve been using seeds for many months, I can only imagine the amount of customization my agents have built up around working with seeds. One of my greatest concerns about releasing and promoting seeds is whether it will behave the same way for others as it does for me.

I cannot quantify any of this. Much of the AI world runs on vibes, and seeds is no exception. The vibes I get from seeds are a sense of security that the ideas and plans and designs I’m capturing are being kept at the level of detail and permanence I want them at, and a sense of wonder that an agent can wield the tool as masterfully as it does. None of that is science. For me, seeds passes the vibe check. I don’t know how to make it pass yours.

Where this goes from here

Software development has always been one branch of a much larger discipline: taking a hard problem, breaking it into smaller ones, finding solutions, and stitching the solutions back into a whole. Agents are very good at the implementation step now. The homing-in-on-the-solution step is what’s left for the rest of us. Tools that help with that step feel more critical to me, not less, in a world where implementation is cheap.

The deliberation space is a rich, criminally under-addressed area. Now that capture and retrieval are cheap, there is room — and reason — for more tools here. I would be surprised if seeds is the last word, or even the right word; I built mine because I needed it. If you can use mine, use it. If it inspires you to build something better, even better. I would like to see more emphasis put on this space. For my part, I’ve recently begun encoding my deliberation rhythm into a couple of agent skills, so the agent captures decisions and feedback as they happen instead of waiting for me to ask.

The bottom line

I built seeds for myself. The way I use it may not be the way you would use it. I have very modest plans for it going forward — it is genuinely a minimum viable product that an agent vibe-coded for me to start with — but at this point I cannot imagine planning or designing anything of any sophistication without it. I do not use plan files anymore. The cycle is seeds, then beads, then implementation, then back to seeds. Over and over and over.

Honestly: I don’t feel safe planning if I’m not planning in seeds.

If you give it a try, just ask your AI agent to set it up and run it for you. I’d be interested to see how well it works for you and your agents.

Try it

github.com/outcomesinsights/seeds. MIT licensed. Beta. Issues and pull requests welcome.