Shared-device conversations, live mode, and multilingual voice output that holds up in motion.
The challenge
Translation apps often break the rhythm of a real conversation. Yap United needed to handle live voice, turn-taking, and multilingual community chat without making people fight the interface.
How we built it
I built a dual-mode speech system: a turn-based flow for shared-device conversations and a Gemini Live pipeline for hands-free mode. Audio is recorded with Expo, translated with Gemini, voiced with ElevenLabs, and routed to the correct earbud side. When the live session drops, the app reconnects with backoff.
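As a rough illustration of the reconnect path, here is a minimal sketch assuming exponential backoff with jitter. The names (`backoffDelayMs`, `reconnectWithBackoff`, `BackoffOptions`), the default timings, and the `connect` callback are hypothetical stand-ins, not the actual Yap United code or any Gemini Live API.

```typescript
// Hypothetical sketch: names and default values are illustrative assumptions.
interface BackoffOptions {
  baseMs: number;      // delay before the first retry
  maxMs: number;       // upper bound on any single delay
  maxAttempts: number; // stop after this many failed connects
}

const DEFAULT_BACKOFF: BackoffOptions = { baseMs: 500, maxMs: 8000, maxAttempts: 6 };

// Exponential backoff: baseMs * 2^attempt, capped at maxMs.
// `jitter` in [0, 1] scales the result between 50% and 100% of the capped
// delay so that many clients do not all retry at the same instant.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  jitter: number = 1,
): number {
  const capped = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  return Math.round(capped * (0.5 + 0.5 * jitter));
}

// Retries `connect` until it succeeds or maxAttempts is reached.
// `sleep` is injectable so the loop can be exercised without real timers.
async function reconnectWithBackoff<T>(
  connect: () => Promise<T>,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // No point waiting after the final failed attempt.
      if (attempt < opts.maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, opts, Math.random()));
      }
    }
  }
  throw lastError;
}
```

In this sketch the caller would pass whatever function opens the live session as `connect`, and fall back to the turn-based flow if the promise finally rejects. Capping the delay keeps the wait bounded, which matters when two people are standing mid-conversation.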
What shipped
Yap United supports 15 languages end-to-end and lets each user keep a distinct voice identity. It extends beyond translation with community zones, moderation controls, and non-Latin script handling that keeps conversations usable under real conditions.
Outcomes
- Built both turn-based translation and hands-free live mode for real conversations on a shared device.
- Supported 15 languages end-to-end across transcription, translation, and voice output.
- Added per-user voice assignment, location-based community chat, and social moderation controls around the core translation flow.