LiveKit

Software Engineer, Data

LiveKit · Seed

Role Details

Location
Remote, U.S.
Salary (est. USD)
~$85K – $136K

Estimated based on role seniority, company stage (Seed), and industry benchmarks. Actual compensation may vary.

How is this calculated?
Seniority band: Mid-level
Base range: $100K – $160K
Stage adjustment: Seed (−15%)
Adjusted range: $85K – $136K

Based on Web3 & AI industry compensation data. Seniority is inferred from role title keywords. Company stage affects ranges: early-stage (−15%), late-stage/public (+10%).

Department
Engineering
Type
Full-time
Vertical
AI

Job Description

LiveKit is building the infrastructure layer for the voice-driven era of computing. Our platform gives developers everything they need to build, test, deploy, scale, and observe agents in production. Founded in 2021, LiveKit powers voice AI applications for OpenAI, xAI, Salesforce, Coursera, Spotify, and thousands of others, collectively facilitating billions of calls each year.

You'll thrive at LiveKit if you:

  • obsess over crafting code that is fast, reliable, and practical for the problem

  • are known as the go-to person for tackling tough technical problems

  • work hard and can build and ship fast

  • can clearly explain complex technical concepts to others

  • are a fast learner, frequently picking up new languages and tools

The best way to impress us is with thoughtful Issues and/or PRs on our GitHub repos 😊

About This Role:

We're looking for a Software Engineer, Data to help design and operate the metering and analytics infrastructure that powers LiveKit's platform. Our systems process massive volumes of usage data generated by thousands of developers and their end-users across billions of sessions, spanning real-time analytics, long-term trend analysis, data transfer, data governance, and retention policies. The infrastructure is heavily built on Go, with data flowing through blob stores, SQL stores, and ClickHouse across dozens of global regions. This is a deeply socio-technical role - you'll work closely with engineers across every team to ensure correctness of metering and analytics for their domains, while collaborating on practical designs that hold up in production at scale. If you're energized by building resilient data infrastructure and have strong opinions about schema evolution and data quality, this is the role.

What You'll Do:

  • Design and evolve metering and analytics infrastructure spanning real-time analytics, long-term analysis, data transfer, governance, and retention policies

  • Collaborate with teams across the organization to ensure metering and analytics are correct and complete for their domains - Agents, Agent Insights, Cloud Dashboard, and customer-facing reporting

  • Monitor and manage datasets with varying cardinality - both internally-defined datasets we control and customer-produced datasets where cardinality is unbounded and efficient querying is essential

  • Ensure data reliability through delivery guarantees, dead letter queues, reconciliation, validation, alerting, and anomaly detection across our distributed service fleet

  • Design and enforce schema evolution strategies (e.g., schema registries, backward/forward compatibility contracts) to evolve infrastructure without breaking downstream consumers

  • Optimize ClickHouse and blob storage for query performance, cost efficiency, and reliability across global regions

  • Reduce operational toil through automation, self-service tooling, and runbooks

Who You Are:

  • You have strong experience designing and operating data pipelines and distributed systems in production across dozens of global regions

  • You're a Gopher who has worked extensively with Go and contributes comfortably to a distributed systems architecture

  • You have deep experience with columnar/analytical databases (ClickHouse, BigQuery, or similar) and blob storage for high-volume workloads

  • You think deeply about data correctness - delivery semantics, idempotency, schema compatibility, and the failure modes that cause silent data loss

  • You're a strong cross-team collaborator who translates domain requirements into practical infrastructure designs

  • You have previous experience working on data-intensive SaaS applications with web-based dashboards in the analytics (reporting, observability or finance) space

  • Nice to have: Experience with stream processing frameworks (Kafka, Pulsar), Kubernetes, OpenTelemetry, query federation engines (Trino, Presto, Dremio), protobuf/Avro schema registries, or usage-based billing/metering systems

Our Commitment to You:

  • An opportunity to build something with real impact on the world

  • Contribute to open source alongside world-class engineers

  • Competitive salary and equity package

  • Health, dental, and vision benefits

  • Flexible vacation policy

LiveKit is an equal opportunity employer and does not discriminate on the basis of any characteristic protected by applicable law. If you require a reasonable accommodation during the application or interview process, please contact recruiting@livekit.io.

About LiveKit

Open-source voice, video, and physical AI framework. Powers OpenAI ChatGPT Voice Mode.
