
The Speculative Decoding Pattern
Pattern Defined. Precise Definition: Speculative Decoding is an optimization pattern where a...

Author profile
Systems architect & technical product leader with roots in bare-metal engineering. I design modern local-first, data-sovereign AI platforms in Go/Python and scale elite core infrastructure teams.
Browse the latest writing surfaced through DevArt.

In previous steps, we gave our system Eyes (Local Vision) and a Shield (The Redactor). But a list of...

Part 3 of 5 in the series: When Your AI Pipeline Grows Up. In the previous post, we worked through...

For years, we have treated LLMs as a rented brain. We have poured our debugging sessions, research...

In the last post, we gave our forensic system "Eyes" using local Multimodal Vision. We successfully...

In the previous post, we identified three categories of pressure that expose architectural weaknesses...

From Stateless Prompts to Persistent Intelligence. Where this fits: This article bridges two...

Pattern Defined. Precise Definition: Inference Patterns are repeatable architectural...

We’ve built a system that is Reliable, Affordable, and Governed. But until now, our Forensic Team has...

Why your tech stack doesn’t matter when building multi-agent systems. Learn the three architectural principles that determine reliability: state, context, and control.

There’s a familiar arc in AI development. A team builds a model, wires up a pipeline, and ships it....

At the beginning of this series, the problem seemed simple. There were a lot of rocks in the...

Autonomous agents are a liability for high-stakes enterprise decisions. Learn how to build 'The Guardian'—a stateful Human-in-the-Loop (HITL) checkpoint that requires a human to sign off on high-severity findings before they are committed to the record.

Over the past several weeks, I’ve been spending a lot of time thinking about systems. Some of that...

By now, the Backyard Quarry system has grown beyond its original intent. We started with a pile of...

Stop overpaying for AI reasoning. Learn how to build 'The Accountant'—a semantic router that uses local SLMs to classify task complexity and slash inference costs by up to 80%.

So far, the Backyard Quarry system has worked well. We have: a schema, a capture process, ...

Stop "vibe-checking" your AI agents. Learn how to build a production-grade 'Judge Agent' that audits your team against a Golden Dataset, providing quantitative reliability scores and an automated path to optimization.

OpenAI’s 5.4-Cyber release just changed the rules of DevSecOps. If your pipeline isn't autonomous by the end of 2026, it's a liability. Here's why.

In Part 5 of the Backyard Quarry series, the project finally starts to connect to a bigger idea: digital twins.

By this point, the Backyard Quarry has a schema, a capture process, and a growing collection of...

The 5-Minute "Hello World" Comparison. We’ve spent the last month talking about the End of Glue Code...

How do you scale AI agents without losing control? Transition from 'it works on my machine' to enterprise-grade governance. Explore the AI Mesh architecture using Oracle 26ai, RLS, and immutable audit trails.

This post looks at what it actually takes to turn physical objects into usable data.

This is a submission for the 2026 WeCoded Challenge: Echoes of Experience. In the world of rare and...

Large Models Build Systems. Small Models Run Them. For most developers, modern AI systems...

In the first post of this series we set the stage for the Backyard Quarry project. Once you decide...

Another round of tech layoffs rolled through the industry recently, and I was one of the people...

Stop building monolithic agents. Learn how to architect a specialized 'Forensic Team' using a Python Supervisor and a TypeScript MCP Server. A masterclass in multi-agent orchestration and functional specialization.

Stop hard-coding AI integrations. Learn how the Model Context Protocol (MCP) replaces brittle M x N glue code with a scalable, protocol-driven architecture.