From 1s and 0s to AI: A Long Walk

I burned my first program into a UV-erasable EPROM sometime in the mid-1980s. The erase cycle took about 20 minutes under a UV lamp. The write cycle took maybe three minutes. If I had a typo, I started over. Not a "recompile and re-run" kind of start over. A "put the chip back under the lamp, wait 20 minutes, then try again" kind of start over.

That is where this walk starts. It ends at sub-agents that spawn sub-agents on the fly, pulling from context windows that would have sounded like science fiction from the EPROM bench. The distance between those two points is not a linear progression. It is a series of thresholds, each one crossed when nobody was entirely sure the crossing was possible.

The init stack era

The early part of my career was low-level embedded work. You were close to the metal. You knew where every byte lived. The initialization sequence — setting up memory, configuring registers, getting the chip to a known state before anything useful happened — was yours to write and yours to debug when it went wrong.

This was not abstract programming. A mistake in the init stack meant the system came up in an undefined state, and undefined states have a way of being very creative about how they fail. Undefined does not mean "probably fine." It means the chip is doing something, you just do not know what, and finding out requires a logic analyzer and patience.

There was craftsmanship in that work. The constraint of working close to hardware teaches you that every system state is a choice, and that the choices you make before the application runs determine what the application can do. That lesson translates. Whether you are writing init code for a microcontroller or designing the startup sequence for a distributed system, the first seconds of a system's life matter more than most engineers think.

Y2K — the great COBOL panic

Fast forward to the late 1990s. The Year 2000 problem. For those who were not there: the computing world had a large amount of production code — much of it written in COBOL, running on mainframes, handling payroll and banking and logistics — that stored years as two digits. 99 for 1999. The question was what happened when the year rolled to 00.

The answer, for properly fixed systems, was nothing. But getting to "properly fixed" required finding every instance of year logic in decades of accumulated code, much of it written by people who were no longer around and documented by people who had assumed the code would never run in the year 2000.
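To make the failure concrete, here is a minimal sketch in Python rather than COBOL, purely for readability. It shows naive two-digit year arithmetic breaking at the rollover, and one widely used remediation, "windowing," which interprets two-digit years relative to a pivot. The function names and the pivot value of 50 are assumptions chosen for illustration, not taken from any particular system.

```python
def years_of_service_naive(hired_yy: int, current_yy: int) -> int:
    """Subtract two-digit years directly, as much legacy code did."""
    return current_yy - hired_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing: two-digit years below the pivot are read as 20xx,
    the rest as 19xx. The pivot of 50 is an illustrative assumption."""
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year, got {yy}")
    return 2000 + yy if yy < pivot else 1900 + yy


def years_of_service_windowed(hired_yy: int, current_yy: int, pivot: int = 50) -> int:
    """Same calculation, but on expanded four-digit years."""
    return expand_year(current_yy, pivot) - expand_year(hired_yy, pivot)


if __name__ == "__main__":
    print(years_of_service_naive(85, 99))    # 14: fine in 1999
    print(years_of_service_naive(85, 0))     # -85: wrong after the rollover
    print(years_of_service_windowed(85, 0))  # 15: correct with windowing
```

Windowing only defers the problem to the edge of the window. The alternative was expanding the stored field to four digits, which meant touching every record and every program that read it.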

The panic was real. The work was real. And the outcome was that most critical systems were fine, because a lot of engineers spent a lot of unglamorous hours finding the needle in the haystack. It was not a triumph of technology. It was a triumph of methodical, painstaking remediation work. The lesson I took was that technical debt is not theoretical. It has a due date.

The GPU compute moment

For most of their early history, graphics processors were for graphics. They were specialized hardware for rendering pixels fast, and the general computing world did not pay much attention to them for anything else.

Then, gradually and then suddenly, researchers started noticing that the massively parallel arithmetic that makes a GPU good at rendering is also good at linear algebra. And linear algebra is the thing neural networks run on. CUDA landed in 2007 and gave programmers a real interface to GPU compute. The research community started moving in. The hardware kept getting better.
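A small sketch of why the fit is so natural, written in plain Python with no GPU involved. One layer of a neural network is a matrix-vector product plus a bias, followed by a simple nonlinearity. Each output value depends only on its own row of weights, so every row can be computed at the same time. The function names and the sample numbers are illustrative assumptions.

```python
def dot(u: list[float], v: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dense_layer(x: list[float], weights: list[list[float]], bias: list[float]) -> list[float]:
    """One fully connected layer with a ReLU activation.

    Each output element uses only its own weight row and bias, so the
    elements are independent of one another. That independence is what
    lets parallel hardware compute them all at once; this loop simply
    does them one after another.
    """
    return [max(0.0, dot(row, x) + b) for row, b in zip(weights, bias)]


if __name__ == "__main__":
    x = [1.0, 2.0]
    weights = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0]]
    bias = [0.0, 0.0, -1.0]
    print(dense_layer(x, weights, bias))  # [1.0, 0.0, 3.0]
```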

The moment that was visible to me as a threshold — not in research papers but in what you could actually run on hardware you could buy — came around 2012-2013. Before that moment, training a useful neural network on a non-trivial dataset required either specialized hardware, university cluster time, or weeks of patience. After that moment, the economics changed. GPU compute became accessible. The cost to experiment dropped. The experiments accelerated. Everything that came after follows from that threshold.

Voice to text crosses the line

For a long time, voice recognition was a punchline. You spoke clearly, at a measured pace, into a decent microphone, and the software transcribed about 80% of what you said correctly. The remaining 20% ranged from near misses to output worse than random.

At some point in the early 2020s, that changed. The models crossed a threshold where voice-to-text was not just "pretty good" but genuinely better than a fast typist for most practical purposes. You could dictate naturally, with real speech patterns, at conversational pace, and the output was clean.

I noticed this not from a benchmark but from a behavior change in myself. I started dictating things I would previously have typed. That is the real threshold test: when a technology changes how you naturally work, not because you are forcing yourself to use it but because it is actually easier.

LLM chat and image generation as casual tools

The release of large language model chat interfaces to the public in late 2022 had the same quality: behavior-changing. Not because the underlying research was new — it was not, much of the foundational work had been published for years — but because the interface was finally good enough that non-specialists could use it without friction.

Within months, asking a language model to draft a document, explain a concept, or review code went from "unusual thing that researchers do" to "thing engineers do without thinking about it." Similarly, generating images from text descriptions went from research demonstration to production tool in what felt like a year.

The interesting part of this phase was not the technology. It was watching people figure out the interface. The model can do a lot. The hard part is knowing how to ask. That is a skill — prompt engineering is a real skill, even if the name sounds odd — and watching that skill develop in real time across the whole engineering profession was something I had not seen before. Every new era requires learning what the new tool is actually for.

Cut-and-paste coding becomes AI-augmented coding

The progression from "AI helps me write code" to "AI writes substantial chunks of code that I direct and review" happened faster than I expected. Three years ago, a code-generation tool was impressive when it autocompleted a function. Now, you can describe a system in prose, and an AI assistant will produce a working implementation across multiple files that often needs only minor correction.

That is not a replacement for understanding what you are building. The engineers who use these tools well are the ones who can evaluate the output, catch the wrong architectural choice, and direct the model toward the actual goal. The engineers who use them poorly are the ones who accept output without understanding it, and discover the problems downstream.

The craft has not gone away. It has moved up the stack. Knowing what to build, how to evaluate whether it was built correctly, and how to course-correct — those are the skills that matter more now, not less.

Sub-agents spawning sub-agents

The most recent threshold I have watched being crossed in real time is multi-agent systems. Not a single model responding to a user. A network of agents, each with specialized roles, orchestrated by a planning layer, breaking problems down and routing sub-problems to the right specialist.

I run this kind of system for engineering work now. A planning agent breaks a task into components. Sub-agents handle implementation, testing, research. The orchestration decides what runs in parallel and what waits. The output of one agent becomes the input of the next.
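The shape of that flow can be sketched in a few lines of Python with asyncio. This is a toy under stated assumptions, not a description of any real system: the agents are stub coroutines that return strings instead of calling a model, and every name here (plan, research, implement, run_tests, orchestrate) is hypothetical.

```python
import asyncio


def plan(task: str) -> list[str]:
    """Planning agent (stub): break a task into components."""
    return [f"{task}/part-{i}" for i in (1, 2)]


async def research(component: str) -> str:
    """Research sub-agent (stub)."""
    return f"notes({component})"


async def implement(component: str, notes: str) -> str:
    """Implementation sub-agent (stub): consumes the research output."""
    return f"code({component}) using {notes}"


async def run_tests(code: str) -> str:
    """Testing sub-agent (stub): consumes the implementation output."""
    return f"passed: {code}"


async def build(component: str) -> str:
    """Within one component, each step waits for the one before it."""
    notes = await research(component)
    code = await implement(component, notes)
    return await run_tests(code)


async def orchestrate(task: str) -> list[str]:
    """Independent components run in parallel; gather keeps their order."""
    return list(await asyncio.gather(*(build(c) for c in plan(task))))


if __name__ == "__main__":
    for report in asyncio.run(orchestrate("parser")):
        print(report)
```

The design choice worth noticing is where the waiting happens. Steps inside a component are sequential because each consumes the previous output, while the components themselves have no dependency on each other and can run side by side.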

From the EPROM bench, a description of this would have been indistinguishable from science fiction. The concept of software that directs other software, adaptively, based on the nature of the work, would have required a long explanation before anyone believed you were talking about something real.

And yet here it is. Running on hardware you can lease by the hour. With interfaces good enough that a small engineering firm can build and operate it.

The arc

Each of these thresholds looked impossible from the era before it. The engineers before the GPU compute moment did not expect neural networks to become practical at that speed. The engineers before voice transcription crossed its line did not expect to be dictating to their computers. The engineers before LLM chat interfaces did not expect prose-to-code at production quality this decade.

Every impossible thing, from where I am standing, has happened inside one career. That is not a cause for complacency. It is a cause for staying attentive. The next threshold is coming. It looks impossible from this one, too.

— AK-mee Engineering