Komal Vardhan.
Build like an engineer. Teach like a friend.

© 2026 Komal Vardhan Lolugu

Designed & built by Komal. Made in India.
2025 · Developer tools · PyPI · v0.2.0 · Author & Maintainer

az-realtime-webrtc

Python companion to the npm package. Async WebSocket client, streaming iterators, function calling, and server middleware for Flask & FastAPI.

  • Python ≥3.10: required; async-first from the ground up
  • 2 framework integrations: Flask blueprint + FastAPI router
  • async for: native streaming with transcript, audio, and event iterators
  • PyPI: pip install azure-realtime-webrtc
§ 01

The Problem

After I shipped the TypeScript SDK, it was clear the Python ecosystem had the same gap. Python developers building voice agents or FastAPI backends had no clean way to connect to Azure OpenAI's Realtime API without weeks of low-level WebSocket work.

§ 02

The Solution

I built az-realtime-webrtc, an async-first Python SDK. RealtimeClient wraps the WebSocket in a context manager, so the connection is opened and closed in one place. Streaming is built in: async for chunk in session.transcript_stream(). A Flask blueprint and a FastAPI router each handle the token server in under 10 lines.

§ 02b

How it works

01
Token server

Flask blueprint or FastAPI router handles POST /api/realtime/token — keeps API key server-side.
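The idea behind the token endpoint can be sketched without any framework. This is a minimal illustration, not the SDK's actual blueprint or router: `handle_token_request`, `fake_mint`, and the `AZURE_OPENAI_API_KEY` environment variable name are all assumptions made for the example, and the call to Azure is faked so the sketch runs offline.

```python
import json
import os
import secrets
import time
from typing import Callable


def fake_mint(api_key: str) -> dict:
    """Hypothetical stand-in for the server-to-Azure call that trades the
    API key for a short-lived client token. No network I/O happens here."""
    if not api_key:
        raise ValueError("API key is not configured")
    return {"token": secrets.token_urlsafe(16), "expires_at": int(time.time()) + 60}


def handle_token_request(mint: Callable[[str], dict] = fake_mint) -> tuple[int, str]:
    """Body of a POST /api/realtime/token handler: returns (status, JSON body)."""
    # The key is read on the server and never copied into the response.
    api_key = os.environ.get("AZURE_OPENAI_API_KEY", "")
    try:
        payload = mint(api_key)
    except ValueError as exc:
        return 500, json.dumps({"error": str(exc)})
    # Only the short-lived token leaves the server.
    return 200, json.dumps(payload)
```

A Flask view or FastAPI route would wrap a function like this and return its status and body; the browser or client then connects with the short-lived token instead of the key.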

02
Async client

async with client.connect() as session opens the WebSocket, completes the session handshake, and yields a RealtimeSession.
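The open, handshake, yield, close sequence maps naturally onto an async context manager. The sketch below shows the pattern only; `FakeSocket`, `Session`, and the placeholder handshake message are hypothetical stand-ins, and the SDK's real `RealtimeClient` internals are not shown in this write-up.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field


@dataclass
class FakeSocket:
    """Stand-in for a WebSocket connection (hypothetical; no network I/O)."""
    closed: bool = False
    sent: list = field(default_factory=list)

    async def send(self, message: str) -> None:
        self.sent.append(message)

    async def close(self) -> None:
        self.closed = True


@dataclass
class Session:
    socket: FakeSocket


@asynccontextmanager
async def connect():
    socket = FakeSocket()                  # 1. open the connection
    await socket.send("handshake")         # 2. placeholder handshake message
    try:
        yield Session(socket)              # 3. hand the session to the caller
    finally:
        await socket.close()               # 4. always close, even on error


async def run_failing_session() -> FakeSocket:
    """Raise inside the block to show that cleanup still runs."""
    try:
        async with connect() as session:
            raise RuntimeError("caller failed mid-session")
    except RuntimeError:
        pass
    return session.socket
```

Because the close call sits in a `finally` block, the socket is closed whether the caller's code returns normally or raises.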

03
Streaming

async for chunk in session.transcript_stream() yields TranscriptChunk objects word by word.
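An async generator is one way to expose a word-by-word stream to `async for`. In this sketch the `TranscriptChunk` field, the queue-based event source, and the `None` end-of-stream sentinel are assumptions for illustration; the SDK's real chunk type may carry more fields.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass(frozen=True)
class TranscriptChunk:
    text: str  # hypothetical field: one word of the transcript


async def transcript_stream(events: asyncio.Queue) -> AsyncIterator[TranscriptChunk]:
    """Yield one chunk per queued word until a None sentinel arrives."""
    while True:
        word: Optional[str] = await events.get()
        if word is None:
            return
        yield TranscriptChunk(text=word)


async def collect(words: list[str]) -> list[str]:
    """Feed words into a queue, then consume them through the stream."""
    queue: asyncio.Queue = asyncio.Queue()
    for word in words:
        queue.put_nowait(word)
    queue.put_nowait(None)  # signal end of stream
    return [chunk.text async for chunk in transcript_stream(queue)]
```

The consumer never touches the queue directly; it only sees typed chunks arriving in order.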

04
Tool loop

register_tool() binds a handler to a ToolDefinition. SDK runs the handler and sends the result back automatically.
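The tool loop comes down to a name-to-handler registry plus a dispatch step. The sketch below assumes a hypothetical `ToolRegistry` class, a two-field `ToolDefinition`, and JSON-encoded arguments; only the names `register_tool` and `ToolDefinition` come from the description above, and their real signatures may differ.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Any, Awaitable, Callable

Handler = Callable[..., Awaitable[Any]]


@dataclass(frozen=True)
class ToolDefinition:
    name: str
    description: str


class ToolRegistry:
    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def register_tool(self, definition: ToolDefinition, handler: Handler) -> None:
        self._handlers[definition.name] = handler

    async def dispatch(self, name: str, arguments_json: str) -> str:
        """Run the handler for a tool call and return the JSON result to send back."""
        handler = self._handlers.get(name)
        if handler is None:
            return json.dumps({"error": f"unknown tool: {name}"})
        result = await handler(**json.loads(arguments_json))
        return json.dumps({"result": result})


async def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # canned answer for the demo


async def run_tool_call() -> str:
    registry = ToolRegistry()
    registry.register_tool(ToolDefinition("get_weather", "Look up weather"), get_weather)
    return await registry.dispatch("get_weather", '{"city": "Pune"}')
```

In a real session, the string returned by `dispatch` is what would be sent back over the WebSocket as the tool result.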

§ 03

What I Learnt

  • 01

    async with as the primary API pattern is the right choice for Python — it makes session lifecycle explicit and prevents the most common bug: not closing the WebSocket cleanly.

  • 02

    Mirroring the TypeScript SDK's naming conventions makes it immediately familiar to developers using both packages.

  • 03

    Flask's synchronous views require asyncio.run() internally for the token endpoint. Surfacing this in the docs up front headed off many support questions.

  • 04

    A py.typed marker plus dataclass-based types make the SDK first-class for mypy and pyright.
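The sync-to-async bridge mentioned in point 03 can be sketched in a few lines. `fetch_token` and `token_view` are hypothetical names, and the async call is faked; the point is only how a synchronous view function can drive a coroutine.

```python
import asyncio


async def fetch_token() -> dict:
    """Hypothetical async token call; yields to the loop instead of doing network I/O."""
    await asyncio.sleep(0)
    return {"token": "ephemeral-demo"}


def token_view() -> dict:
    """Synchronous view function, shaped like a plain Flask route.

    asyncio.run() creates an event loop, runs the coroutine to completion,
    and closes the loop. It raises RuntimeError if a loop is already running
    in the current thread, so it only suits sync entry points.
    """
    return asyncio.run(fetch_token())
```

The cost of this approach is a fresh event loop per request, which is usually acceptable for a low-traffic token endpoint but worth documenting.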

§ 04

Technologies Used

Python (asyncio)

Async-first WebSocket client: async with, async for, async generators

websockets

Underlying WebSocket transport layer

Flask (optional)

Blueprint for the token server: POST /api/realtime/token

FastAPI (optional)

Async router for the token server, with auto-generated Swagger UI

azure-identity (optional)

Microsoft Entra ID auth support via DefaultAzureCredential
