Self-host · Apache 2.0 · GDPR · Air-gappable

Self-Host Meeting Transcription

Run the whole meeting bot + transcription pipeline inside your VPC. Apache 2.0, Docker + Kubernetes, full data residency, no audio leaves your infrastructure. Works with Google Meet, Microsoft Teams, and Zoom from a single API.

Book a demo View on GitHub

No credit card required

Starred by 2.6k+ developers on GitHub

Real-time Meeting Data for AI

Feed meetings into your AI. Live.

AI needs data, and the best data is real-time. Vexa streams transcripts, audio, and context from any meeting straight to your models and agents — as it happens.

Built for self-hosting

Vexa is the only Apache-2.0 meeting bot + transcription API that fully self-hosts. The bot pod, audio capture, Whisper transcription service, API gateway, and transcript store all run as containers in your environment — Docker Compose for dev, Helm chart for production Kubernetes. No SaaS dependency, no telemetry phone-home, no audio path that crosses your firewall. Regulated industries (legal, healthcare, financial services, government) run Vexa fully air-gapped on-premise; everyone else runs it in their own AWS / GCP / Azure / on-prem cluster with the same Helm chart.

Apache 2.0 — no license-key checks, no callbacks home, no per-seat metering
Docker Compose for dev (5-min quickstart) + Helm chart for production Kubernetes
Audio never leaves your VPC — Whisper runs on your GPU nodes, transcripts persist in your Postgres

5-minute self-host quickstart — Docker Compose

# Clone the repo
git clone https://github.com/Vexa-ai/vexa.git
cd vexa

# Configure (one env file)
cp .env.example .env
# Edit .env: WHISPER_MODEL_SIZE, JWT_SECRET, ADMIN_API_TOKEN, ...

# Bring up the whole stack
docker compose up -d

# Mint your first API key (admin endpoint)
# ADMIN_API_TOKEN must be exported in this shell with the same value as in .env
curl -X POST http://localhost:8056/admin/users \
  -H "X-Admin-Token: $ADMIN_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"email": "you@yourco.com"}'

# Drop a bot into any meeting
curl -X POST http://localhost:8056/bots \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "platform": "google_meet",
    "native_meeting_id": "abc-defg-hij",
    "transcribe_enabled": true,
    "transcription_tier": "realtime"
  }'

The commands above target a self-hosted stack at http://localhost:8056; on hosted Vexa, use https://api.cloud.vexa.ai instead. $API_KEY is the key minted by the admin call (on hosted, get one from /get-started).
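When bots are launched from a script for many meetings, the request body can be generated instead of hand-written. A minimal sketch, assuming only the fields shown in the quickstart; the helper name bot_payload is hypothetical and not part of Vexa, and the printf template does not escape quotes inside a meeting ID:

```shell
#!/bin/sh
# Sketch: build the /bots request body from a platform name and a meeting ID.
# Field names are copied from the quickstart; bot_payload is a hypothetical helper.
set -eu

bot_payload() {
  # $1 = platform (for example google_meet), $2 = native meeting ID
  printf '{"platform":"%s","native_meeting_id":"%s","transcribe_enabled":true,"transcription_tier":"realtime"}\n' \
    "$1" "$2"
}

# Print the body for the example meeting; nothing is sent over the network here.
bot_payload google_meet abc-defg-hij
```

To send it, the output can be piped into the same call as above, with curl reading the body from standard input: bot_payload google_meet abc-defg-hij | curl -X POST http://localhost:8056/bots -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" -d @-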

Self-host FAQ

What operators ask before they deploy Vexa in their own infrastructure.

Is the self-host path really feature-complete vs hosted Vexa?

Yes — there's no functionality split between hosted and self-hosted. Same bot, same Whisper service, same REST API, same MCP server. Hosted Vexa is just a managed deployment of the same Apache-2.0 code. Customers regularly start on hosted, then migrate to self-host once they cross volume / compliance thresholds.

What infrastructure do I actually need?

For dev/single-tenant: one VM with Docker (≥4 CPU, 8 GB RAM, optional GPU for faster Whisper). For production: a Kubernetes cluster with at least one GPU node pool (L4, A10, A100, or RTX 4090 work well; the Helm chart documents tested configurations), plus Postgres for state, Redis for queues, and optionally MinIO or another S3-compatible store for recordings.
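Before bringing up the dev stack, it can help to check the host against those minimums. A rough sketch for Linux hosts; the function name and the 7,500,000 kB threshold (slightly under 8 GB, to allow for memory the kernel reserves) are assumptions for illustration, not values from the Vexa docs:

```shell
#!/bin/sh
# Sketch: compare a Linux host against the dev minimums quoted above (4 CPUs, 8 GB RAM).
set -eu

meets_minimum() {
  # $1 = CPU count, $2 = total memory in kB (as reported by /proc/meminfo)
  # 7500000 kB is an assumed threshold that tolerates kernel-reserved memory on an 8 GB VM.
  [ "$1" -ge 4 ] && [ "$2" -ge 7500000 ]
}

# Only probe the host when the Linux-specific sources are available.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cpus="$(nproc)"
  mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  if meets_minimum "$cpus" "$mem_kb"; then
    echo "host meets the dev minimum"
  else
    echo "host is below the dev minimum (${cpus} CPUs, ${mem_kb} kB RAM)"
  fi
fi
```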

Can I run Vexa fully air-gapped, no internet access at all?

Yes, with one caveat. Mirror all container images to your private registry, copy the Whisper model weights to internal storage, and deny outbound traffic by default. The only outbound access the stack needs is from the bot pod to the meeting platform (meet.google.com, teams.microsoft.com, zoom.us). There is no Vexa-controlled telemetry endpoint to allow.
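Mirroring images is usually done once from a connected host. The sketch below only prints the docker pull, tag, and push commands so they can be reviewed first; the registry host and image names are placeholders, not Vexa's real image list, and mirror_ref is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: print the docker commands that mirror images into a private registry.
# REGISTRY and the image names below are placeholders, not Vexa's actual images.
set -eu

REGISTRY="registry.internal.example:5000"

mirror_ref() {
  # $1 = private registry host, $2 = source image reference
  printf '%s/%s\n' "$1" "$2"
}

for image in example/vexa-bot:1.0 example/whisper-svc:1.0; do
  target="$(mirror_ref "$REGISTRY" "$image")"
  # Dry run: echo the commands instead of executing them.
  echo "docker pull $image"
  echo "docker tag $image $target"
  echo "docker push $target"
done
```

Once the printed commands look right, they can be run on a host that reaches both the public registry and the internal one.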

How does the Helm chart handle scale?

Bot pods are stateless and horizontally scaled per concurrent meeting (one pod per active bot). The Whisper service auto-scales on GPU node pools — typical sizing is 1 GPU per 5-10 concurrent realtime transcriptions, depending on model tier. Postgres + Redis are sized per tenant. See the scaling doc for production sizing tables.
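The 5-10 transcriptions per GPU range translates into a GPU count by ceiling division. A small sketch of that arithmetic; the peak of 40 concurrent meetings is a hypothetical figure, and gpus_needed is an illustrative helper rather than anything shipped with the Helm chart:

```shell
#!/bin/sh
# Sketch: estimate the GPU range for a given number of concurrent realtime transcriptions.
set -eu

gpus_needed() {
  # $1 = concurrent realtime transcriptions, $2 = transcriptions one GPU can serve
  # Integer ceiling division: a partially used GPU still counts as a whole one.
  echo $(( ($1 + $2 - 1) / $2 ))
}

meetings=40   # hypothetical peak concurrency
echo "$meetings concurrent transcriptions: $(gpus_needed "$meetings" 10) to $(gpus_needed "$meetings" 5) GPUs"
```

The lower bound uses 10 transcriptions per GPU and the upper bound uses 5; the right point in that range depends on the model tier.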

What's the difference vs Vexa Lite?

Vexa Lite is the same code packaged as a single all-in-one Docker container for the simplest deployments (1 user, low volume, single host). Vexa "full" is the Compose / Helm multi-service deployment for production. Both are Apache 2.0; pick based on volume + ops surface.

Common patterns

Who runs their own Vexa

Three buyer shapes we see repeatedly across self-hosted Vexa deployments.

Pattern 1

Regulated industries (legal, healthcare, financial)

Scenario

You're subject to HIPAA / GDPR / FINRA. Sending meeting audio to a third-party SaaS is a non-starter. You need transcription with a full audit trail under your control.

What you get

Self-host Vexa in your VPC behind your firewall. Audio captured in bot pods, transcribed on your GPU nodes, persisted in your Postgres. Every transcript event is auditable. Pair with your existing SIEM for the audit trail.

Pattern 2

EU data residency (Schrems II, GDPR)

Scenario

Your buyers require EU-only data processing. Hosted SaaS — even with EU regions — fails their procurement review due to sub-processor sprawl.

What you get

Self-host on your EU cloud (AWS Frankfurt, GCP Belgium, Hetzner, OVH). Zero sub-processors. Sign a DPA with yourself. Pass procurement on the first attempt.

Pattern 3

High-volume / cost-optimized at scale

Scenario

You run thousands of transcription hours per month, and hosted per-bot-hour pricing has crossed the break-even point against running your own GPU pool.

What you get

Self-host on a reserved GPU cluster (your own A10s, A100s, or 4090s). Marginal cost drops from $0.30-0.45/hr to compute COGS (~$0.05-0.15/hr at scale). Hosted Vexa remains an option for spillover.
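To see what those per-hour ranges mean in monthly terms, multiply them by your volume. A sketch of the arithmetic using the rates quoted above; the 5,000 hours per month is a hypothetical volume, and the result covers marginal cost only, not the fixed cost of operating a cluster:

```shell
#!/bin/sh
# Sketch: monthly marginal cost at the hosted and self-host hourly rates quoted above.
set -eu

monthly_cost() {
  # $1 = bot hours per month, $2 = cost per hour in USD
  awk -v h="$1" -v r="$2" 'BEGIN { printf "%.2f\n", h * r }'
}

hours=5000   # hypothetical monthly volume
echo "hosted:    \$$(monthly_cost "$hours" 0.30) to \$$(monthly_cost "$hours" 0.45)"
echo "self-host: \$$(monthly_cost "$hours" 0.05) to \$$(monthly_cost "$hours" 0.15)"
```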

Quickstart

Self-host Vexa in 5 minutes

docker compose up brings the full stack online: bot dispatcher, runtime API, Whisper, gateway, Postgres, Redis.

1. Clone the repo
   Run git clone https://github.com/Vexa-ai/vexa.git to get the code. Apache 2.0: fork, modify, redistribute, no asks.
2. Configure one env file
   Copy .env.example to .env. Set WHISPER_MODEL_SIZE (base/small/medium/large/turbo), JWT_SECRET, ADMIN_API_TOKEN, and optionally Postgres / Redis URLs.
3. docker compose up
   The whole stack (bot dispatcher, runtime API, Whisper service, gateway, Postgres, Redis) comes up in 2-3 minutes.
4. Mint your first API key
   Hit the admin endpoint with your ADMIN_API_TOKEN to create a user + key. Multi-tenant patterns are documented in self-hosted-management.
5. Production: Helm chart
   For production Kubernetes, helm install vexa from the chart in deploy/helm, which covers GPU node pools, ingress, and the Postgres operator.
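A missing or empty value in .env (step 2) is an easy mistake to make before the first docker compose up. A minimal sketch that lists required variables with no value, assuming a plain NAME=value file; the function name is hypothetical and the check does not catch quoted empty values such as JWT_SECRET="":

```shell
#!/bin/sh
# Sketch: report required variables that are missing or empty in an env file.
set -eu

missing_vars() {
  # $1 = path to the env file; prints one variable name per line
  for name in WHISPER_MODEL_SIZE JWT_SECRET ADMIN_API_TOKEN; do
    grep -Eq "^${name}=.+" "$1" || echo "$name"
  done
}

# Check ./.env when it exists; an empty result means all three variables are set.
if [ -f .env ]; then
  missing_vars .env
fi
```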

Get an API key Read the deployment guide

How it works under the hood

Self-Host architecture

The self-host stack: bot dispatcher service (allocates bot pods), runtime API (control plane), Whisper transcription service (runs on GPU nodes, configurable model size), gateway (REST + WebSocket + MCP entry point), Postgres (state), Redis (queues + cache). Everything is one Helm chart. No telemetry, no callbacks, no SaaS dependency — Apache 2.0 means you can audit every line, fork if needed, run it forever.

Pipeline

bot pod → audio capture → whisper-svc → transcript chunks → REST / WebSocket

Documentation

Go deeper on self-hosting

Deployment, scaling, security, admin API, and Whisper sizing — everything you need to run Vexa in production.

Self-host deployment guide

Docker Compose for dev, Helm for production, sizing tables, GPU pool patterns.

Vexa Lite (single-container)

All-in-one Docker container — fastest path to a running instance, low-volume use cases.

Scaling & architecture

Concurrent-bot limits, GPU sizing, Kubernetes node-pool layout.

Security & data handling

Threat model, encryption-at-rest, audio retention controls, audit-trail patterns.

Admin API

Self-hosted user / key management, tenant scoping, quota enforcement.

Transcription quality

Whisper model tiers, language support, hallucination filtering, when to pick which.

Full reference at docs.vexa.ai.

Keep going on Self-Host

Tutorials, deep-dives, and adjacent surfaces from across the Vexa substrate.

Blog

Self-host meeting transcription in 5 minutes

Docker Compose quickstart — the canonical first-deploy guide.

Blog

Privacy-first meeting transcription — why self-hosted matters

GDPR, HIPAA, FINRA — the compliance case for keeping audio in your VPC.

Blog

Open-source transcription API — complete guide

Architecture overview, services breakdown, what runs where.

Pricing

Pay only for what you use

Self-host for free, or let us run it for you. Simple usage-based pricing—no per-seat tax.

Free

Apache 2.0 · Self-host forever

Best for teams who need full control

View on GitHub

Full platform — no limits

Your infrastructure, your data

Google Meet + Teams + Zoom

Docker Compose / K8s deploy

REST API + WebSockets + Dashboard

Community support

Enterprise

Let’s talk

Managed cloud, or run it in your own infrastructure

Best for teams running Vexa in production

Book a call

30-min intro — no commitment

Managed cloud or on-prem / OpenShift / K8s

Google Meet + Teams + Zoom

Dedicated support + SLA

Security & data-residency review

Volume pricing — no per-seat tax

REST API + WebSockets + Dashboard

Frequently asked questions

What's included in the $0.30/hr Bot Service?

The base rate covers bot infrastructure: audio capture, webhooks, and 12 months of audio storage across Google Meet, Microsoft Teams, and Zoom. Add real-time transcription for +$0.20/hr ($0.50/hr total).

Which plan is right for me?

Choose Individual ($12/mo) if you need one bot for personal use—it includes real-time transcription, storage, and the Dashboard. Choose Pay-as-you-go ($0.30/hr) if you need multiple simultaneous bots or want to pay only for what you use—ideal for teams and API integrations. Both plans include Google Meet and Microsoft Teams support. Self-host for free from GitHub if you need full data control.

How does the free credit work?

Every new account gets $5 in free bot credit—no credit card required. That covers ~16 hours of bot time at $0.30/hr. All features are available: audio capture, transcription, real-time data, and full API access.

Can I self-host Vexa?

Yes. Vexa is Apache 2.0 licensed and fully self-hostable. Deploy with Docker Compose, Kubernetes, or OpenShift on your own infrastructure with complete data sovereignty.

How does Vexa compare to Recall.ai?

Vexa is open source, self-hostable, and up to 40% cheaper on the bot rate: $0.30/hr vs Recall.ai's ~$0.50/hr. The transcription add-on is $0.20/hr vs ~$0.15/hr. See our detailed comparison.

What's the Individual plan for?

The Individual plan ($12/mo) is for single users who need one concurrent bot with unlimited meetings. It includes real-time transcription, recording, storage, REST API, WebSockets, and the UI Dashboard—everything you need for personal meeting intelligence. It supports Google Meet and Microsoft Teams, with Zoom coming soon.

Ready when you are

Run your own Vexa.
Up in 5 minutes.

Apache 2.0 — no license-key, no callbacks home, no per-seat metering. Clone the repo, docker compose up, mint a key, drop your first bot.

Self-host quickstart Deployment docs GitHub

Apache 2.0 · Self-host or cloud · No credit card required
