
Selected work

Public edges of AI systems that shipped.

I work where frontier AI ideas become product surfaces: agentic workflows, eval gates, review paths, and recovery systems that have to hold up outside a demo. Some diagrams are simplified because the real data and implementation details are private.

Production: Multi-agent development system used by PM and Design
Hands-on: I build the agent loops, evals, and repair paths myself
Research taste: Atlas turns agent failures into testable product questions
[Image: Abstract mobile AI development workflow board with phones, agent steps, and review paths.]
Concrete signal: PM and Design teammates used the system directly to create and merge a meaningful volume of production PRs, all with engineering review. The volume was high enough that review capacity became the next bottleneck.

Microsoft Copilot Mobile

Multi-agent development system

I created and shipped a production multi-agent development system for Microsoft Copilot Mobile. It gives PMs, designers, and engineers a safer path from product intent to mobile changes: agents do the mechanical work, evals catch regressions, and review stays close to the diff.

  • Role: Original builder, active developer, and product owner for the workflow
  • Signal: PM and Design teammates used it directly to create and merge a meaningful volume of production PRs with engineering review
  • Focus: Agent workflows, eval gates, review capacity, source-of-truth checks, and recovery paths
[Image: Abstract mobile AI development workflow board with phone screens, review paths, and quality gates.]

Atlas

Private AI workflow and eval lab

Atlas is my private system for studying how agents behave when they have real tools, local context, scheduled jobs, memory, and sensitive data boundaries. The public page shows the system shape without exposing private records or implementation details.

  • Role: Creator, primary user, and evaluator
  • Shape: CLI spine, agent layer, memory boundary, always-on jobs, eval harness, and replayable failures
  • Learning: Reliable agents need tool-path discipline, observability, and recovery paths, not just better prompts
[Image: Abstract private AI workflow system map with a command-line center, eval loops, data boundary, and protected vault.]