I build and lead production-grade AI systems that perform reliably under real-world constraints. My focus is execution, system design, governance, and measurable outcomes.
I've built and scaled production systems across enterprise, growth-stage, and regulated environments, including fintech and healthcare.
Most recently, I founded Beesla and led the design of an AI-native platform integrating agent workflows, retrieval pipelines, and production guardrails from day one.
I operate at the intersection of product decisions and engineering execution, translating business priorities into systems that ship cleanly and hold up under pressure.
Every fifteen minutes, Mender wakes up, reads the last hour of traces from another agent via the Arize Phoenix MCP server, clusters failures, hypothesizes a root cause, generates a focused eval set, drafts a prompt patch, and re-runs the same evals against the patched version. If the pass rate measurably improves, it posts a structured incident card to Slack with one-click human approval. Mender also reads its own traces every cycle and tunes itself: how many evals to generate, what confidence threshold to use, when to ask for help.
End-to-end verified: Mender detected a real regression in a target agent ("ambiguous source currency silently defaulted to USD"), generated 10 focused eval cases, and lifted the pass rate from 4/10 to 10/10 on the patched version, a gain of 60 percentage points, ready for one-click Slack approval.
Stack: Google ADK (Python) · Gemini 3 on Vertex AI · Arize Phoenix + Phoenix MCP · Slack Block Kit · Cloud Run · Cloud Scheduler · Firestore.
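The "post only if the patch measurably improves" step in the Mender cycle can be sketched roughly as below. This is an illustrative sketch, not Mender's actual code: the names `EvalRun`, `lift_points`, and `should_request_approval`, and the fixed `min_lift_points` default, are assumptions made for the example (Mender is described as tuning its own threshold).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRun:
    """Outcome of one eval set run against one prompt version (hypothetical type)."""
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        if self.total <= 0:
            raise ValueError("an eval run needs at least one case")
        return self.passed / self.total


def lift_points(baseline: EvalRun, patched: EvalRun) -> float:
    """Absolute change in pass rate, in percentage points."""
    return round((patched.pass_rate - baseline.pass_rate) * 100, 1)


def should_request_approval(
    baseline: EvalRun,
    patched: EvalRun,
    min_lift_points: float = 10.0,  # assumed threshold, chosen for illustration
) -> bool:
    """Gate: only surface a patch to humans when it clears the lift threshold."""
    if baseline.total != patched.total:
        # The same eval set must be used for both runs to make them comparable.
        raise ValueError("baseline and patched runs must use the same eval set")
    return lift_points(baseline, patched) >= min_lift_points


if __name__ == "__main__":
    before = EvalRun(passed=4, total=10)
    after = EvalRun(passed=10, total=10)
    print(lift_points(before, after))              # 60.0
    print(should_request_approval(before, after))  # True
```

Measuring lift in percentage points rather than as a relative change keeps the number comparable across eval sets with very different baselines.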
Designing and deploying autonomous agent architectures with orchestration, memory, and reliability patterns.
Context engineering, structured outputs, evaluation frameworks, and retrieval-augmented generation at scale.
Scalable system design, microservices, API-first platforms, and operational reliability at enterprise scale.
AWS and Azure architecture, infrastructure-as-code, CI/CD pipelines, and cloud-native operational patterns.
Team building, delivery discipline, cross-functional alignment, and shipping culture in fast-paced environments.
Production readiness frameworks, guardrails, risk assessment, and compliance in regulated domains.
Production over prototype.
Reliability over novelty.
Clear architecture over complexity.
Measurable outcomes over experimentation theater.