Live Transcription

Live transcription converts speech to text in real time during calls. Use it for live monitoring, agent assistance, compliance recording, and post-call analysis.

How It Works

Caller speaks
     ↓
Audio stream captured
     ↓
Speech-to-text processing
     ↓
Text emitted via WebSocket/webhook
     ↓
Display in dashboard or your app

Enabling Transcription

Global Setting

Enable for all calls in your Crew:

{
  "transcription": {
    "enabled": true,
    "language": "en-US"
  }
}

Per-Call Setting

Enable for specific calls:

curl -X POST https://api.usecrew.ai/v1/calls/outbound \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "to": "+14155551234",
    "agent_id": "agent_sarah",
    "transcription": {
      "enabled": true
    }
  }'
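
The same per-call request can be made from JavaScript. This is a minimal sketch, assuming a runtime with a global `fetch` (Node.js 18+ or a browser). The helper names `buildOutboundCallRequest` and `startOutboundCall` are illustrative and not part of any Crew SDK, and the JSON `Content-Type` header is an assumption.

```javascript
// Hypothetical helper: builds the request for the outbound-call endpoint shown above.
function buildOutboundCallRequest(apiKey, { to, agentId, transcription = { enabled: true } }) {
  return {
    url: 'https://api.usecrew.ai/v1/calls/outbound',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json', // assumed: the body is sent as JSON
      },
      body: JSON.stringify({ to, agent_id: agentId, transcription }),
    },
  };
}

// Sends the request and returns the parsed JSON response.
// The response shape is not documented here, so it is returned as-is.
async function startOutboundCall(apiKey, params) {
  const { url, options } = buildOutboundCallRequest(apiKey, params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Outbound call request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect or unit test without placing a real call.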

Configuration Options

{
  "transcription": {
    "enabled": true,
    "language": "en-US",
    "punctuation": true,
    "profanity_filter": false,
    "speaker_labels": true,
    "interim_results": true,
    "word_timestamps": false
  }
}

Option	Description	Default
`enabled`	Turn transcription on	`false`
`language`	Primary language of the call, as a code from Supported Languages	`en-US`
`punctuation`	Add punctuation to transcript text	`true`
`profanity_filter`	Mask profanity in transcript text	`false`
`speaker_labels`	Label each segment with its speaker	`true`
`interim_results`	Emit partial transcripts before a segment is final	`true`
`word_timestamps`	Include per-word timing	`false`

Supported Languages

Language	Code
English (US)	`en-US`
English (UK)	`en-GB`
Spanish (US)	`es-US`
Spanish (Spain)	`es-ES`
French	`fr-FR`
German	`de-DE`
Portuguese (Brazil)	`pt-BR`
Japanese	`ja-JP`
Mandarin	`zh-CN`
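
The defaults and language codes above can be checked on the client before a configuration is sent. The following is a sketch only: `resolveTranscriptionConfig` is a hypothetical helper, not an API function, and it validates only the `language` value against the table above.

```javascript
// Defaults and language codes copied from the tables above.
const TRANSCRIPTION_DEFAULTS = Object.freeze({
  enabled: false,
  language: 'en-US',
  punctuation: true,
  profanity_filter: false,
  speaker_labels: true,
  interim_results: true,
  word_timestamps: false,
});

const SUPPORTED_LANGUAGES = new Set([
  'en-US', 'en-GB', 'es-US', 'es-ES', 'fr-FR', 'de-DE', 'pt-BR', 'ja-JP', 'zh-CN',
]);

// Hypothetical client-side helper: merges overrides onto the documented
// defaults and rejects unsupported language codes before anything is sent.
function resolveTranscriptionConfig(overrides = {}) {
  const config = { ...TRANSCRIPTION_DEFAULTS, ...overrides };
  if (!SUPPORTED_LANGUAGES.has(config.language)) {
    throw new Error(`Unsupported language code: ${config.language}`);
  }
  return { transcription: config };
}
```

For example, `resolveTranscriptionConfig({ enabled: true, language: 'es-ES' })` returns a full `transcription` object with the remaining options at their defaults.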

Receiving Transcripts

WebSocket Stream

Connect to receive real-time transcripts:

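// Replace with the ID of the call you want to follow.
const callId = 'call_abc123';
const ws = new WebSocket(`wss://api.usecrew.ai/v1/calls/${callId}/transcript`);

// Each message is a JSON object; see Message Format.
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(`[${data.speaker}]: ${data.text}`);
};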

Message Format

{
  "type": "transcript",
  "call_id": "call_abc123",
  "timestamp": "2024-01-15T10:30:15.123Z",
  "speaker": "caller",
  "text": "I'd like to schedule an appointment",
  "is_final": true,
  "confidence": 0.95
}
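
A message handler usually needs to guard against malformed input and decide which messages to act on. The sketch below parses one message in the format above. The function name `parseTranscriptMessage` and the `finalOnly` and `minConfidence` options are illustrative choices, not API settings.

```javascript
// Parses one raw WebSocket message and returns a normalized transcript
// object, or null if the message is not a usable transcript.
function parseTranscriptMessage(raw, { finalOnly = false, minConfidence = 0 } = {}) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (data === null || typeof data !== 'object') return null;
  if (data.type !== 'transcript' || typeof data.text !== 'string') return null;
  if (finalOnly && data.is_final !== true) return null;
  if (typeof data.confidence === 'number' && data.confidence < minConfidence) return null;
  return {
    callId: data.call_id,
    speaker: data.speaker,
    text: data.text,
    isFinal: data.is_final === true,
    confidence: data.confidence,
    at: new Date(data.timestamp),
  };
}
```

Inside `ws.onmessage`, calling `parseTranscriptMessage(event.data, { finalOnly: true })` and skipping `null` results would keep only finalized text.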

Webhook Delivery

Receive transcripts via webhook:

{
  "webhooks": {
    "transcription.segment": "https://yourapp.com/hooks/transcript"
  }
}

Webhook Payload

{
  "event": "transcription.segment",
  "call_id": "call_abc123",
  "segments": [
    {
      "speaker": "agent",
      "text": "Good afternoon, how can I help you today?",
      "start_time": 0.0,
      "end_time": 2.5
    },
    {
      "speaker": "caller",
      "text": "I'd like to schedule an appointment for next week.",
      "start_time": 3.0,
      "end_time": 5.8
    }
  ]
}
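
A receiving endpoint for this payload might look like the sketch below, which uses only Node's built-in `http` request and response objects. The function names are illustrative. It assumes `start_time` and `end_time` are in seconds and that any 2xx response acknowledges delivery, neither of which is stated above. Request authentication is not covered.

```javascript
// Converts a transcription.segment payload into printable lines.
// Returns an empty array for other events or malformed payloads.
function formatWebhookSegments(payload) {
  if (!payload || payload.event !== 'transcription.segment' || !Array.isArray(payload.segments)) {
    return [];
  }
  return payload.segments.map((segment) => {
    const start = Number(segment.start_time).toFixed(1); // assumed to be seconds
    const end = Number(segment.end_time).toFixed(1);
    return `${payload.call_id} [${start}s-${end}s] ${segment.speaker}: ${segment.text}`;
  });
}

// Request listener for Node's built-in http module, for example:
//   require('node:http').createServer(handleWebhookRequest).listen(3000);
function handleWebhookRequest(req, res) {
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    try {
      formatWebhookSegments(JSON.parse(body)).forEach((line) => console.log(line));
      res.writeHead(204); // assumed: any 2xx acknowledges receipt
      res.end();
    } catch {
      res.writeHead(400); // body was not valid JSON
      res.end();
    }
  });
}
```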

Dashboard View

View live transcripts in the Crew dashboard:

Navigate to Calls → Active Calls
Click on an active call
View the live transcript panel

Interim vs Final Results

Interim Results

Partial transcripts as words are spoken:

Speaker: "I'd like to sche..."
Speaker: "I'd like to schedule an app..."
Speaker: "I'd like to schedule an appointment" [FINAL]

Enable interim results for real-time display; disable them to reduce webhook traffic.

Final Results

Complete, corrected transcripts, emitted after the speaker pauses:

{
  "is_final": true,
  "text": "I'd like to schedule an appointment for next Thursday."
}
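
When displaying interim and final results together, each new interim should replace the previous one rather than be appended, and a final result should replace the pending interim. The sketch below shows one way to keep that state. It assumes, as in the example above, that each interim carries the full text of the utterance so far, and that a final result supersedes the interims from the same speaker. The class name `LiveTranscriptView` is illustrative.

```javascript
// Keeps a display-ready view of a call: finalized lines plus, per speaker,
// the latest interim text that has not been finalized yet.
class LiveTranscriptView {
  constructor() {
    this.finalLines = [];     // [{ speaker, text }]
    this.pending = new Map(); // speaker -> latest interim text
  }

  // Applies one transcript message (see Message Format).
  apply(message) {
    if (message.is_final) {
      this.finalLines.push({ speaker: message.speaker, text: message.text });
      this.pending.delete(message.speaker); // the interim is superseded
    } else {
      this.pending.set(message.speaker, message.text); // overwrite, never append
    }
  }

  // Returns the lines to show: final text first, then in-progress text marked with "...".
  render() {
    const lines = this.finalLines.map((line) => `[${line.speaker}]: ${line.text}`);
    for (const [speaker, text] of this.pending) {
      lines.push(`[${speaker}]: ${text}...`);
    }
    return lines;
  }
}
```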

Speaker Identification

With `speaker_labels` enabled, each transcript segment identifies who is speaking:

Speaker	Description
`agent`	The AI agent
`caller`	The external party
`human_agent`	A human who joined the call

{
  "segments": [
    { "speaker": "agent", "text": "How can I help you?" },
    { "speaker": "caller", "text": "I need to cancel my appointment." },
    { "speaker": "agent", "text": "I can help with that." }
  ]
}
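
Speaker labels make it straightforward to render a transcript as a dialogue. The sketch below merges adjacent segments from the same speaker into one turn and maps the labels to display names. The display names and function names are illustrative.

```javascript
// Display names for the documented speaker values.
const SPEAKER_NAMES = { agent: 'AI agent', caller: 'Caller', human_agent: 'Human agent' };

// Merges adjacent segments from the same speaker into a single turn.
function groupIntoTurns(segments) {
  const turns = [];
  for (const { speaker, text } of segments) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ` ${text}`;
    } else {
      turns.push({ speaker, text });
    }
  }
  return turns;
}

// Renders turns as "Name: text" lines, falling back to the raw label
// for any speaker value that has no display name.
function renderTurns(segments) {
  return groupIntoTurns(segments).map(
    (turn) => `${SPEAKER_NAMES[turn.speaker] ?? turn.speaker}: ${turn.text}`
  );
}
```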

Post-Call Transcripts

Access complete transcripts after calls end:

API

curl https://api.usecrew.ai/v1/calls/{call_id}/transcript \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "call_id": "call_abc123",
  "duration": 127,
  "transcript": {
    "full_text": "Agent: Good afternoon...\nCaller: Hi, I'd like to...",
    "segments": [
      {
        "speaker": "agent",
        "text": "Good afternoon, thank you for calling Acme Medical.",
        "start_time": 0.0,
        "end_time": 3.2
      }
    ]
  }
}
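
The segment timings in this response support simple post-call metrics. The sketch below computes talk time per speaker and the share of the call spent in silence. It assumes `duration`, `start_time`, and `end_time` are all in seconds and that segments do not overlap, which the response above suggests but does not state. The function names are illustrative.

```javascript
// Sums speaking time per speaker from a post-call transcript response.
// Assumes start_time and end_time are seconds from the start of the call.
function talkTimeBySpeaker(response) {
  const totals = {};
  for (const segment of response.transcript.segments) {
    const length = Math.max(0, segment.end_time - segment.start_time);
    totals[segment.speaker] = (totals[segment.speaker] ?? 0) + length;
  }
  return totals;
}

// Fraction of the call (0 to 1) during which nobody was speaking.
// Assumes `duration` is the call length in seconds and segments do not overlap.
function silenceRatio(response) {
  const spoken = Object.values(talkTimeBySpeaker(response)).reduce((a, b) => a + b, 0);
  return response.duration > 0 ? Math.max(0, 1 - spoken / response.duration) : 0;
}
```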

Use Cases

Live Agent Assistance

Display transcripts to human supervisors for real-time coaching:

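// Supervisor dashboard.
// `socket` is your application's own event channel carrying transcript
// messages; `monitoredCall` and the two helper functions are placeholders.
socket.on('transcript', (data) => {
  if (data.call_id === monitoredCall) {
    displayTranscript(data);
    checkForCoachingOpportunities(data);
  }
});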
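
One possible shape for a helper like `checkForCoachingOpportunities` is a small set of phrase rules applied to caller utterances. The rules, hints, and function name below are hypothetical and would need tuning for real calls.

```javascript
// Hypothetical coaching rules: each rule matches caller phrases that a
// supervisor might want to act on.
const COACHING_RULES = [
  { id: 'cancellation', pattern: /\bcancel/i, hint: 'Caller mentioned cancelling.' },
  { id: 'escalation', pattern: /\b(manager|supervisor|complaint)\b/i, hint: 'Caller may want to escalate.' },
];

// Returns the rules triggered by one caller utterance.
// Interim results (is_final === false) are skipped to avoid duplicate hits.
function findCoachingOpportunities(data) {
  if (data.speaker !== 'caller' || data.is_final === false) return [];
  return COACHING_RULES.filter((rule) => rule.pattern.test(data.text));
}
```

Matching only on final caller text keeps a single utterance from triggering the same rule once per interim update.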

Compliance Recording

Store transcripts for regulatory requirements:

{
  "transcription": {
    "enabled": true,
    "storage": {
      "enabled": true,
      "retention_days": 365,
      "format": "json"
    }
  }
}
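
To reason about when a stored transcript ages out, the retention window can be turned into a date. This is a sketch under one assumption that is not confirmed above: that `retention_days` is counted in whole 24-hour days from the end of the call. The function names are illustrative.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the date after which a stored transcript is past its retention window.
// Assumption: the window is counted in whole days from the end of the call.
function retentionExpiry(callEndedAt, retentionDays) {
  return new Date(new Date(callEndedAt).getTime() + retentionDays * MS_PER_DAY);
}

// True if a transcript from a call that ended at `callEndedAt` is still inside the window.
function isWithinRetention(callEndedAt, retentionDays, now = new Date()) {
  return now.getTime() < retentionExpiry(callEndedAt, retentionDays).getTime();
}
```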

Real-Time Analytics

Analyze conversations as they happen:

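// analyzeSentiment and alertSupervisor are your own functions; this example
// assumes analyzeSentiment returns a score from -1 (negative) to 1 (positive).
socket.on('transcript', (data) => {
  // Score the latest utterance
  const sentiment = analyzeSentiment(data.text);

  // Alert on strongly negative sentiment
  if (sentiment < -0.5) {
    alertSupervisor(data.call_id, 'Negative sentiment detected');
  }
});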
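
A single utterance is a noisy signal, so alerts are often based on a rolling average instead. The sketch below pairs a toy word-list scorer with a per-call window. The word lists, the window size, and the `SentimentWindow` class are illustrative only; a production system would use a trained sentiment model or service in place of this `analyzeSentiment`.

```javascript
// Toy word lists for illustration only.
const NEGATIVE_WORDS = new Set(['angry', 'terrible', 'unacceptable', 'frustrated', 'worst']);
const POSITIVE_WORDS = new Set(['great', 'thanks', 'perfect', 'helpful', 'wonderful']);

// Scores text from -1 (only negative words) to 1 (only positive words); 0 if none match.
function analyzeSentiment(text) {
  let positive = 0;
  let negative = 0;
  for (const word of text.toLowerCase().match(/[a-z']+/g) ?? []) {
    if (POSITIVE_WORDS.has(word)) positive += 1;
    if (NEGATIVE_WORDS.has(word)) negative += 1;
  }
  const matched = positive + negative;
  return matched === 0 ? 0 : (positive - negative) / matched;
}

// Tracks the mean score of the last `size` utterances per call.
class SentimentWindow {
  constructor(size = 3) {
    this.size = size;
    this.scores = new Map(); // call_id -> recent scores
  }

  // Records one utterance and returns the rolling mean for its call.
  add(callId, text) {
    const recent = [...(this.scores.get(callId) ?? []), analyzeSentiment(text)].slice(-this.size);
    this.scores.set(callId, recent);
    return recent.reduce((sum, score) => sum + score, 0) / recent.length;
  }
}
```

Comparing the value returned by `add` against the `-0.5` threshold, rather than the score of one utterance, reduces alerts triggered by a single negative word.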

Search and Discovery

Search stored transcripts by keyword and date range:

curl https://api.usecrew.ai/v1/calls/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "appointment cancellation",
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-01-31"
    }
  }'

Accuracy and Limitations

Accuracy Factors

Factor	Impact
Audio quality	Higher quality = better accuracy
Background noise	Reduces accuracy
Accents	May affect recognition
Technical jargon	May require custom vocabulary
Speaking speed	Very fast speech reduces accuracy

Improving Accuracy

Use high-quality telephony connections
Minimize background noise
Add custom vocabulary for industry terms
Set `language` to match the language spoken on the call

Pricing

Transcription usage is billed per minute of processed audio:

Plan	Included Minutes	Additional Cost
Starter	100/month	$0.02/min
Professional	1,000/month	$0.015/min
Enterprise	Custom	Custom
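
The overage charge follows directly from the table: minutes beyond the included allowance multiplied by the per-minute rate. The sketch below does that arithmetic. It assumes usage is passed in as a minute count and is not rounded up per call, which the table does not specify. The function name is illustrative.

```javascript
// Plan terms copied from the pricing table above (Enterprise is custom and omitted).
const PLANS = {
  starter: { includedMinutes: 100, overageRate: 0.02 },
  professional: { includedMinutes: 1000, overageRate: 0.015 },
};

// Estimates the monthly overage charge in dollars, rounded to cents.
function estimateOverage(plan, minutesUsed) {
  const terms = PLANS[plan];
  if (!terms) throw new Error(`No fixed pricing for plan: ${plan}`);
  const billable = Math.max(0, minutesUsed - terms.includedMinutes);
  return Math.round(billable * terms.overageRate * 100) / 100;
}
```

For example, 250 minutes on Starter leaves 150 billable minutes, which at $0.02/min comes to $3.00.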

Next Steps

Call Routing — Use transcripts to inform routing decisions
Webhooks — Process transcripts in your systems
Analytics — Analyze call patterns
