Staff AI Engineer
About Wati
Launched as a WhatsApp team inbox in 2020, Wati has evolved into a full revenue orchestration system that goes beyond a single platform. We empower businesses that sell, support, and grow through conversations by observing customer intent in real time, deciding the next best revenue action, and executing it seamlessly across marketing, sales, and support, all within WhatsApp and connected messaging channels.
Our Platform & AI Capabilities
Wati is designed for scalability and intelligence. Our AI-native platform simplifies complex customer communication operations through a unified inbox, a robust multi-channel messaging infrastructure, and no-code automation. At the heart of our solution is Astra, our intelligent AI layer, which lets you create AI agents for every customer interaction across all your messaging platforms. By integrating AI agents into the ecosystem, we enable businesses of all sizes to deliver measurable ROI and build deeper customer relationships.
Our Backing & Partnerships
Trusted by over 16,000 customers across 190+ countries, Wati is proudly backed by world-class investors including Tiger Global, Sequoia Capital, DST Global, and Shopify. As a Premium-tier Partner of Meta and Google, we maintain the highest standards of platform excellence and integration.
About the Role
We’re hiring a Staff AI Engineer to own LLM orchestration, RAG, and agent infrastructure at 4B+ messages/year scale.
Our platform processes over 4 billion messages per year across 100+ countries. Your mission is to build the robust, scalable systems that turn conversation data into intelligent, real-time customer experiences.
In this role, you will lead the architecture, deployment, and optimization of our LLM-driven services — including multi-provider inference orchestration, RAG pipelines, multi-agent workflows, and voice AI. This is a senior IC role with significant technical influence across the AI stack.
We need a "builder" who can bridge the gap between complex AI capabilities and massive-scale production environments, ensuring our AI is fast, reliable, and cost-effective.
What You Will Own
- Core LLM Infrastructure: Architect and own our AI production stack, including multi-provider LLM gateway optimization, token budget management, and low-latency inference routing across OpenAI, Gemini, and other providers.
- Agentic AI & RAG: Design and implement scalable RAG (Retrieval-Augmented Generation) systems, multi-step AI agent workflows, and tool-calling infrastructure (MCP), ensuring high accuracy and reliability in customer interactions.
- Voice & Multimodal AI: Lead the evolution of our voice AI layer (WebRTC/realtime) and cross-channel agent coordination across text, voice, and connected messaging platforms.
- AI Production Lifecycles: Own the "Engineering-to-AI" loop: build automated pipelines for data collection, cleaning, fine-tuning orchestration, and model versioning.
- Performance & Cost Optimization: Continuously optimize API costs, token budgets, latency, and caching strategies to ensure our 4-billion-message scale remains sustainable and performant.
- Evaluation & Benchmarking: Build the infrastructure for systematic AI quality assessment, identifying failure modes and ensuring model improvements are grounded in real-world production metrics.
- Technical Roadmap: Drive technology decisions in close collaboration with engineering leadership, selecting frameworks and architectural patterns that will define our AI future.
What We Are Looking For
- Systems Expert: 5+ years of professional experience in backend or infrastructure engineering. Mastery of at least one high-performance language (Go, Rust, or C++) and deep proficiency in Python.
- AI Deployment Mastery: Proven track record of taking LLMs/NLP models from experiments to high-traffic production. You understand multi-provider orchestration, prompt engineering at scale, and model drift management.
- Data Pipeline Experience: Strong experience building data pipelines for AI workloads, including document processing, embedding generation, and vector search.
- Product-Minded Engineer: You don’t just build for the sake of tech; you understand how AI performance impacts customer outcomes and business value.
- Autonomous Builder: You thrive in environments with high ambiguity and can design, code, and deploy complex systems independently.
- Experience with vector databases (e.g., Qdrant, Milvus, Pinecone) and RAG architecture patterns.
- Familiarity with agentic frameworks, tool-calling protocols (MCP, function calling), or multi-agent orchestration.
- Experience with real-time voice/audio AI pipelines (WebRTC, LiveKit, or similar).
- Infrastructure-as-Code experience with GCP/AWS, Docker, and Kubernetes.
You’ll own AI quality across a platform that serves 16,000+ businesses in 190+ countries. The data pipeline and production infrastructure are in place — your job is to push the frontier: better models, smarter agents, faster inference, and measurable business impact.
You’ll have direct access to the founding team and the autonomy to shape our AI roadmap. This is a rare IC opportunity to own AI end-to-end at production scale, with real data, real customer impact, and a direct line to product decisions.