Build and ship the agent orchestration layer. Optimize model selection, latency and cost across thousands of agents.
Location: Remote (EU)
Type: Full-time
Compensation: $110k to $160k + equity
About the role
Our agent orchestration layer routes millions of conversations a month across LLMs, voice models and custom skills. You'll own its performance, reliability and cost.
We optimize for sub-300ms response times in voice and sub-2s in chat. We pick models per request based on quality vs cost tradeoffs. We A/B test prompts in production. You'll do all of this and more.
This is a hands-on engineering role with significant research components. You'll write code, read papers, run experiments, and ship to production.
What you'll do
Own the model serving and orchestration layer
Optimize for latency, cost and quality across providers (OpenAI, Anthropic, in-house)
Build evaluation harnesses and A/B testing infrastructure
Research and ship improvements to RAG, tool-use and multi-step reasoning
Mentor other engineers on ML best practices
What we're looking for
5+ years ML engineering, 2+ years production LLM systems
Deep Python; experience with vLLM, Triton or similar inference servers
Strong systems thinking: you've debugged production latency or cost issues
Comfortable reading research papers and prototyping their ideas
Bonus: distributed systems experience
Perks & benefits
Remote-first
Equity package
Health insurance
$2k learning budget
Compute budget for experiments
Apply for Senior ML Engineer
Tell us about yourself. We read every application and reply within 1 business week.