Staff AI Systems Engineer

I design and build
production AI systems.

End-to-end AI platforms with control loops across every layer. Built for reliability, observability, and real-world execution.

intent → orchestration → execution → verification → supervision
Read the full control systems thesis →

Sr Software Architect @ VectorVest

Built AI-assisted engineering systems → ~3x throughput to production

Most AI systems fail in production.

Not because the model is wrong — but because the system has no control loop.

I build AI platforms that:

  • orchestrate end-to-end workflows (retrieval → inference → tools)
  • define a source of truth (evals, golden datasets)
  • trace execution across every layer (OpenTelemetry, replay)
  • enforce verification at execution time (tests, outputs, tool validation)
  • expose supervision interfaces for human control

This is LLMOps as a system, not just model integration.
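The loop above can be sketched in a few lines of TypeScript. This is a minimal illustration, not a real API: the stage names, types, and eval gate are all hypothetical stand-ins.

```typescript
// Minimal control-loop sketch: each request flows through orchestration,
// execution, and verification; the result and verdict are surfaced to a
// supervision layer instead of passing through unchecked.

type Verdict = { ok: boolean; reason?: string };

interface Stage<I, O> {
  run(input: I): O;
}

// Hypothetical stages standing in for retrieval -> inference -> tools.
const orchestrate: Stage<string, string> = {
  run: (intent) => `plan(${intent})`,
};
const execute: Stage<string, string> = {
  run: (plan) => `result-of-${plan}`,
};

// Execution-time verification: an eval gate the output must pass.
function verify(output: string): Verdict {
  return output.startsWith("result-of-")
    ? { ok: true }
    : { ok: false, reason: "output failed eval gate" };
}

// Supervision entry point: every run yields an inspectable record.
function supervise(intent: string): { output: string; verdict: Verdict } {
  const plan = orchestrate.run(intent);
  const output = execute.run(plan);
  return { output, verdict: verify(output) };
}
```

The point of the shape, not the placeholder logic: verification runs at execution time, and the supervision layer sees both the output and the verdict.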

Selected Systems

Production AI systems I've designed and built.

AI PR Generation System

Problem
Manual PR authoring was the primary bottleneck across the engineering org.
System
User story → code → tests/lint → review gate.
Control
Execution-time evals + feedback loop into prompts and system tuning.
Outcome
~3x increase in throughput to production.
LLMOps · RAG · MCP · Evals

Distributed AI Ingestion Pipeline

Problem
Large-document ingestion was unreliable — silent failures corrupted retrieval quality.
System
Queue-based chunking + distributed workers + DLQ.
Control
Data validation + retrieval evals.
Outcome
Stable ingestion for large-scale knowledge systems.
AI Platform · RAG · Distributed Systems · Evals
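The dead-letter path in this pipeline can be sketched as follows. This is illustrative only: the failure predicate, retry limit, and in-memory queues are placeholders for real chunking workers and a real DLQ.

```typescript
// Sketch of queue-based ingestion with a dead-letter queue (DLQ):
// chunks that still fail after MAX_ATTEMPTS are parked in the DLQ for
// inspection instead of failing silently and corrupting retrieval quality.

interface Chunk { id: number; text: string; attempts: number }

const MAX_ATTEMPTS = 3;
const queue: Chunk[] = [];
const dlq: Chunk[] = [];
const indexed: Chunk[] = [];

// Hypothetical worker step; a real one would embed + upsert the chunk.
function process(chunk: Chunk): boolean {
  return !chunk.text.includes("\u0000"); // placeholder failure predicate
}

function drain(): void {
  while (queue.length > 0) {
    const chunk = queue.shift()!;
    if (process(chunk)) {
      indexed.push(chunk);
    } else if (++chunk.attempts < MAX_ATTEMPTS) {
      queue.push(chunk); // retry (with backoff, in a real system)
    } else {
      dlq.push(chunk); // parked for inspection, never dropped
    }
  }
}
```

The DLQ is what turns silent failures into visible ones: bad chunks become a queue you can inspect, replay, and run retrieval evals against.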

Agent Execution + Supervision System

Problem
Agent workflows were opaque — no tracing, no intervention, no post-hoc debugging.
System
Real-time supervision interface for AI agent execution (OpenCode).
Control
Execution tracing + human-in-the-loop checkpoints.
Outcome
Debuggable, controllable agent workflows.
AI Agent Supervision · Observability · TypeScript
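A human-in-the-loop checkpoint of this kind can be sketched like so. All names here are illustrative, not the OpenCode API: the idea is that an agent pauses at a gate before side-effecting steps, and every decision stays in an inspectable trace.

```typescript
// Sketch of a human-in-the-loop checkpoint: an agent step requests
// permission, proceeds only on approval, and the full trace remains
// available for post-hoc debugging.

type Decision = "approve" | "reject";

interface Checkpoint {
  step: string;
  decision?: Decision;
}

class Supervisor {
  private trace: Checkpoint[] = [];

  // Agent requests permission before a side-effecting step.
  request(step: string): Checkpoint {
    const cp: Checkpoint = { step };
    this.trace.push(cp);
    return cp;
  }

  // Human (or policy) records a decision; true means the agent may proceed.
  decide(cp: Checkpoint, decision: Decision): boolean {
    cp.decision = decision;
    return decision === "approve";
  }

  // Post-hoc debugging: the execution trace stays inspectable.
  history(): ReadonlyArray<Checkpoint> {
    return this.trace;
  }
}
```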
View all systems →

AI systems in production behave like control systems.

intent → orchestration → execution → verification → supervision

Failures come from:

  • misaligned intent (did we deliver value?)
  • weak retrieval / context quality
  • lack of observability across system layers
  • missing or delayed feedback loops

My focus is making these systems reliable, measurable, and controllable.

Focus Areas

  • AI Platform Engineering / LLMOps
  • Retrieval systems + evaluation (RAG, grounding)
  • Agent orchestration + workflows
  • Observability + tracing (OpenTelemetry)
  • Execution-time verification + eval systems

Writing

I write about building reliable AI systems in production.

Read more →

If you're building AI platforms, agent systems, or production LLM features where correctness, observability, and control matter —

let's talk.