Push the limits of voice quality, latency and multilingual accuracy. Publish, prototype, productionize.
Location: Remote
Type: Full-time
Compensation: $120k to $180k + equity
About the role
Our voice AI is good. We want it to be the best in the world for live customer conversations. That's the bar.
You'll work on streaming TTS, low-latency ASR, turn detection, multi-language switching, and emotional prosody. You'll publish externally, prototype freely, and ship to production.
We have GPU compute, audio data and the patience for real research. We just want to ship better voice agents.
What you'll do
Improve voice latency, naturalness and multilingual support
Run experiments and ship winners to production
Publish papers, blog posts and open-source contributions
Partner with ML engineers to productionize research
Set technical direction for the voice stack
What we're looking for
PhD or strong research track record in TTS/ASR/audio ML
Hands-on experience with Python, PyTorch and audio processing
Experience shipping or contributing to a production voice system
Strong publication record
Bonus: experience with diffusion models or RLHF for audio
Perks & benefits
Remote-first
Significant equity
Health insurance
Conference budget
Compute budget
Publication support
Apply for Voice AI Researcher
Tell us about yourself. We read every application and reply within one business week.