Live Transcription

Live transcription converts speech to text in real time during calls. Use it for live monitoring, agent assistance, compliance recording, and post-call analysis.

How It Works

  1. Caller speaks
  2. Audio stream captured
  3. Speech-to-text processing
  4. Text emitted via WebSocket/webhook
  5. Display in dashboard or your app

Enabling Transcription

Global Setting

Enable for all calls in your Crew:
{
  "transcription": {
    "enabled": true,
    "language": "en-US"
  }
}

Per-Call Setting

Enable for specific calls:
curl -X POST https://api.usecrew.ai/v1/calls/outbound \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "to": "+14155551234",
    "agent_id": "agent_sarah",
    "transcription": {
      "enabled": true
    }
  }'

Configuration Options

{
  "transcription": {
    "enabled": true,
    "language": "en-US",
    "punctuation": true,
    "profanity_filter": false,
    "speaker_labels": true,
    "interim_results": true,
    "word_timestamps": false
  }
}
| Option | Description | Default |
| --- | --- | --- |
| enabled | Enable transcription | false |
| language | Primary language | en-US |
| punctuation | Add punctuation | true |
| profanity_filter | Mask profanity | false |
| speaker_labels | Identify speakers | true |
| interim_results | Partial transcripts | true |
| word_timestamps | Per-word timing | false |
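The defaults in the table can also be applied client-side, so that a request states only the options it changes. The following is a minimal sketch, not part of the Crew API: `buildTranscriptionConfig` and `TRANSCRIPTION_DEFAULTS` are hypothetical names, and the type check assumes each option keeps the type of its documented default.

```javascript
// Defaults copied from the options table above.
const TRANSCRIPTION_DEFAULTS = Object.freeze({
  enabled: false,
  language: 'en-US',
  punctuation: true,
  profanity_filter: false,
  speaker_labels: true,
  interim_results: true,
  word_timestamps: false,
});

// Hypothetical helper: merges overrides onto the documented defaults and
// checks that each documented option keeps the type of its default.
function buildTranscriptionConfig(overrides = {}) {
  for (const [key, value] of Object.entries(overrides)) {
    const expected = typeof TRANSCRIPTION_DEFAULTS[key];
    // Keys that are not in the table (for example "storage") pass through unchecked.
    if (key in TRANSCRIPTION_DEFAULTS && typeof value !== expected) {
      throw new TypeError(`${key} must be a ${expected}`);
    }
  }
  return { transcription: { ...TRANSCRIPTION_DEFAULTS, ...overrides } };
}

// Usage: enable transcription but suppress partial results.
console.log(JSON.stringify(buildTranscriptionConfig({ enabled: true, interim_results: false }), null, 2));
```

Sending the full object rather than relying on server-side defaults keeps the behavior explicit if the defaults ever change.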

Supported Languages

| Language | Code |
| --- | --- |
| English (US) | en-US |
| English (UK) | en-GB |
| Spanish (US) | es-US |
| Spanish (Spain) | es-ES |
| French | fr-FR |
| German | de-DE |
| Portuguese (Brazil) | pt-BR |
| Japanese | ja-JP |
| Mandarin | zh-CN |
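If the language comes from user or browser settings, it may not match one of the listed codes exactly. The sketch below shows one possible way to map an arbitrary locale onto the table; `resolveLanguage` is a hypothetical helper, and falling back to a different regional variant is an assumption that may not suit every application.

```javascript
// Language codes from the table above.
const SUPPORTED_LANGUAGES = [
  'en-US', 'en-GB', 'es-US', 'es-ES', 'fr-FR', 'de-DE', 'pt-BR', 'ja-JP', 'zh-CN',
];

// Hypothetical helper: returns an exact (case-insensitive) match if there is
// one, otherwise the first listed code with the same base language,
// otherwise the fallback.
function resolveLanguage(requested, fallback = 'en-US') {
  const normalized = String(requested).trim().replace('_', '-').toLowerCase();
  const exact = SUPPORTED_LANGUAGES.find((code) => code.toLowerCase() === normalized);
  if (exact) return exact;

  const base = normalized.split('-')[0];
  const sameBase = SUPPORTED_LANGUAGES.find((code) =>
    code.toLowerCase().startsWith(`${base}-`)
  );
  return sameBase ?? fallback;
}

console.log(resolveLanguage('en_gb')); // exact match after normalization
console.log(resolveLanguage('es-MX')); // same base language, different region
console.log(resolveLanguage('it-IT')); // unsupported, uses the fallback
```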

Receiving Transcripts

WebSocket Stream

Connect to receive real-time transcripts:
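// Replace with the ID of the call you want to follow.
const callId = 'call_abc123';
const ws = new WebSocket(`wss://api.usecrew.ai/v1/calls/${callId}/transcript`);

// Each message is a JSON document in the format shown under Message Format.
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log(`[${data.speaker}]: ${data.text}`);
};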

Message Format

{
  "type": "transcript",
  "call_id": "call_abc123",
  "timestamp": "2024-01-15T10:30:15.123Z",
  "speaker": "caller",
  "text": "I'd like to schedule an appointment",
  "is_final": true,
  "confidence": 0.95
}
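A message handler usually needs to tolerate malformed input and ignore messages it does not understand. The sketch below formats one message in the shape shown above. `formatTranscriptMessage` and its `minConfidence` parameter are hypothetical, and the idea that other `type` values may arrive on the same stream is an assumption.

```javascript
// Parses one raw WebSocket message and returns a display line, or null if
// the message is not a usable transcript message.
function formatTranscriptMessage(raw, minConfidence = 0) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // malformed JSON: skip rather than crash the handler
  }
  if (data === null || typeof data !== 'object') return null;
  if (data.type !== 'transcript' || typeof data.text !== 'string') return null;
  if (typeof data.confidence === 'number' && data.confidence < minConfidence) {
    return null; // below the caller's confidence threshold
  }

  // Show the time of day (UTC) when the timestamp parses, a placeholder otherwise.
  const parsed = new Date(data.timestamp);
  const time = Number.isNaN(parsed.getTime())
    ? '--:--:--'
    : parsed.toISOString().slice(11, 19);

  // Mark messages that are not final with a trailing ellipsis.
  const marker = data.is_final ? '' : ' …';
  return `${time} [${data.speaker ?? 'unknown'}] ${data.text}${marker}`;
}

const sample = JSON.stringify({
  type: 'transcript',
  call_id: 'call_abc123',
  timestamp: '2024-01-15T10:30:15.123Z',
  speaker: 'caller',
  text: "I'd like to schedule an appointment",
  is_final: true,
  confidence: 0.95,
});
console.log(formatTranscriptMessage(sample));
// prints: 10:30:15 [caller] I'd like to schedule an appointment
```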

Webhook Delivery

Receive transcripts via webhook:
{
  "webhooks": {
    "transcription.segment": "https://yourapp.com/hooks/transcript"
  }
}

Webhook Payload

{
  "event": "transcription.segment",
  "call_id": "call_abc123",
  "segments": [
    {
      "speaker": "agent",
      "text": "Good afternoon, how can I help you today?",
      "start_time": 0.0,
      "end_time": 2.5
    },
    {
      "speaker": "caller",
      "text": "I'd like to schedule an appointment for next week.",
      "start_time": 3.0,
      "end_time": 5.8
    }
  ]
}
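On the receiving side, the payload above can be flattened into one record per segment before it is stored or displayed. This is a sketch of the handler body only, without the HTTP server around it. `handleTranscriptionWebhook` is a hypothetical name, and treating `start_time` and `end_time` as seconds is an assumption, since the payload does not state the unit.

```javascript
// Hypothetical handler body for the transcription.segment webhook.
// Returns one flat record per segment; ignores other event types.
function handleTranscriptionWebhook(payload) {
  if (payload?.event !== 'transcription.segment') {
    return []; // not a transcript event: nothing to do
  }
  if (!Array.isArray(payload.segments)) {
    throw new TypeError('segments must be an array');
  }
  return payload.segments.map((segment) => ({
    callId: payload.call_id,
    speaker: segment.speaker,
    text: segment.text,
    // Assumed to be seconds; rounded to milliseconds to hide float noise.
    durationSeconds: Number((segment.end_time - segment.start_time).toFixed(3)),
  }));
}

const records = handleTranscriptionWebhook({
  event: 'transcription.segment',
  call_id: 'call_abc123',
  segments: [
    { speaker: 'agent', text: 'Good afternoon, how can I help you today?', start_time: 0.0, end_time: 2.5 },
    { speaker: 'caller', text: "I'd like to schedule an appointment for next week.", start_time: 3.0, end_time: 5.8 },
  ],
});
console.log(records);
```

Keeping the handler a pure function makes it easy to test without sending real HTTP requests.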

Dashboard View

View live transcripts in the Crew dashboard:
  1. Navigate to Calls > Active Calls
  2. Click on an active call
  3. View the live transcript panel

Interim vs Final Results

Interim Results

Partial transcripts as words are spoken:
Speaker: "I'd like to sche..."
Speaker: "I'd like to schedule an app..."
Speaker: "I'd like to schedule an appointment" [FINAL]
Enable interim_results for real-time display; disable it to reduce webhook traffic.

Final Results

Complete, corrected transcripts, emitted after the speaker pauses:
{
  "is_final": true,
  "text": "I'd like to schedule an appointment for next Thursday."
}
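A live display has to replace each interim result with the next one, then commit the text once the final result arrives. The sketch below shows one way to do that. `TranscriptBuffer` is a hypothetical class, and it assumes interim messages carry the same `speaker` and `text` fields as final ones, with `is_final` set to false.

```javascript
// Keeps committed (final) lines plus at most one in-progress (interim)
// line per speaker. Each new interim result replaces the previous one.
class TranscriptBuffer {
  constructor() {
    this.finals = [];         // committed lines, in arrival order
    this.interim = new Map(); // speaker -> latest interim text
  }

  push(message) {
    if (message.is_final) {
      this.finals.push({ speaker: message.speaker, text: message.text });
      this.interim.delete(message.speaker); // the final result supersedes the interim
    } else {
      this.interim.set(message.speaker, message.text);
    }
  }

  // Final lines first, then any in-progress lines marked with an ellipsis.
  render() {
    const done = this.finals.map((line) => `[${line.speaker}] ${line.text}`);
    const pending = [...this.interim].map(([speaker, text]) => `[${speaker}] ${text}…`);
    return [...done, ...pending].join('\n');
  }
}

const buffer = new TranscriptBuffer();
buffer.push({ speaker: 'caller', text: "I'd like to sche", is_final: false });
buffer.push({ speaker: 'caller', text: "I'd like to schedule an app", is_final: false });
buffer.push({ speaker: 'caller', text: "I'd like to schedule an appointment", is_final: true });
console.log(buffer.render());
// prints: [caller] I'd like to schedule an appointment
```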

Speaker Identification

With speaker_labels enabled, transcripts identify who is speaking:
| Speaker | Description |
| --- | --- |
| agent | The AI agent |
| caller | The external party |
| human_agent | A human who joined the call |
{
  "segments": [
    { "speaker": "agent", "text": "How can I help you?" },
    { "speaker": "caller", "text": "I need to cancel my appointment." },
    { "speaker": "agent", "text": "I can help with that." }
  ]
}
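For display, consecutive segments from the same speaker often read better when merged into a single turn. The sketch below is one way to do that; `toTurns` and the display names in `SPEAKER_NAMES` are hypothetical, and only the three speaker labels come from the table above.

```javascript
// Display names for the documented speaker labels (the names are illustrative).
const SPEAKER_NAMES = {
  agent: 'AI agent',
  caller: 'Caller',
  human_agent: 'Human agent',
};

// Merges consecutive segments from the same speaker into one turn.
function toTurns(segments) {
  const turns = [];
  for (const { speaker, text } of segments) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ` ${text}`; // same speaker is still talking
    } else {
      // Unrecognized labels fall back to the raw label.
      turns.push({ speaker, label: SPEAKER_NAMES[speaker] ?? speaker, text });
    }
  }
  return turns;
}

const turns = toTurns([
  { speaker: 'agent', text: 'Good afternoon.' },
  { speaker: 'agent', text: 'How can I help you?' },
  { speaker: 'caller', text: 'I need to cancel my appointment.' },
]);
for (const turn of turns) {
  console.log(`${turn.label}: ${turn.text}`);
}
```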

Post-Call Transcripts

Access complete transcripts after calls end:

API

curl https://api.usecrew.ai/v1/calls/{call_id}/transcript \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "call_id": "call_abc123",
  "duration": 127,
  "transcript": {
    "full_text": "Agent: Good afternoon...\nCaller: Hi, I'd like to...",
    "segments": [
      {
        "speaker": "agent",
        "text": "Good afternoon, thank you for calling Acme Medical.",
        "start_time": 0.0,
        "end_time": 3.2
      }
    ]
  }
}
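The same request can be made from JavaScript, and the segment timings support simple post-call metrics such as talk time per speaker. This is a sketch under stated assumptions: `fetchTranscript` and `talkTimeBySpeaker` are hypothetical helpers, the global `fetch` requires Node 18+ or a browser, and `start_time` and `end_time` are assumed to be seconds.

```javascript
// Fetches the post-call transcript using the endpoint shown above.
async function fetchTranscript(callId, apiKey) {
  const url = `https://api.usecrew.ai/v1/calls/${encodeURIComponent(callId)}/transcript`;
  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Transcript request failed: HTTP ${response.status}`);
  }
  return response.json();
}

// Sums segment durations per speaker, rounded to milliseconds.
function talkTimeBySpeaker(segments) {
  const totals = {};
  for (const { speaker, start_time, end_time } of segments) {
    totals[speaker] = (totals[speaker] ?? 0) + (end_time - start_time);
  }
  for (const speaker of Object.keys(totals)) {
    totals[speaker] = Math.round(totals[speaker] * 1000) / 1000;
  }
  return totals;
}

// Usage with locally defined segments (no network call):
const talkTime = talkTimeBySpeaker([
  { speaker: 'agent', text: 'Good afternoon.', start_time: 0.0, end_time: 3.2 },
  { speaker: 'caller', text: 'Hi.', start_time: 3.5, end_time: 4.5 },
  { speaker: 'agent', text: 'How can I help?', start_time: 5.0, end_time: 6.3 },
]);
console.log(talkTime);
```

With a real call, the input would be `(await fetchTranscript(callId, apiKey)).transcript.segments`.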

Use Cases

Live Agent Assistance

Display transcripts to human supervisors for real-time coaching:
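// Supervisor dashboard
// `socket` is assumed to be an event emitter in your app that re-emits parsed
// transcript messages; `monitoredCall` and both helper functions are app-defined.
socket.on('transcript', (data) => {
  if (data.call_id === monitoredCall) {
    displayTranscript(data);
    checkForCoachingOpportunities(data);
  }
});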

Compliance Recording

Store transcripts for regulatory requirements:
{
  "transcription": {
    "enabled": true,
    "storage": {
      "enabled": true,
      "retention_days": 365,
      "format": "json"
    }
  }
}

Real-Time Analytics

Analyze conversations as they happen:
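// analyzeSentiment and alertSupervisor are defined by your app; scores are
// assumed to range from -1 (most negative) to 1 (most positive).
socket.on('transcript', (data) => {
  // Detect sentiment
  const sentiment = analyzeSentiment(data.text);

  // Alert on negative sentiment
  if (sentiment < -0.5) {
    alertSupervisor(data.call_id, 'Negative sentiment detected');
  }
});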
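The snippet above leaves `analyzeSentiment` to the application. As a placeholder for experimenting with the alert threshold, a toy keyword scorer is sketched below. It is not a real sentiment model: the word lists are illustrative assumptions, and a production system would normally call a dedicated sentiment service or model instead.

```javascript
// Illustrative word lists; not taken from any real sentiment lexicon.
const NEGATIVE_WORDS = new Set(['angry', 'awful', 'frustrated', 'terrible', 'unacceptable', 'worst']);
const POSITIVE_WORDS = new Set(['great', 'happy', 'perfect', 'thank', 'thanks', 'wonderful']);

// Toy scorer: (positive - negative) / matched words, giving a value in
// [-1, 1]. Returns 0 when no listed word appears in the text.
function analyzeSentiment(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) ?? [];
  let positive = 0;
  let negative = 0;
  for (const word of words) {
    if (POSITIVE_WORDS.has(word)) positive += 1;
    else if (NEGATIVE_WORDS.has(word)) negative += 1;
  }
  const matched = positive + negative;
  return matched === 0 ? 0 : (positive - negative) / matched;
}

console.log(analyzeSentiment('This is terrible, the worst service ever.')); // -1
console.log(analyzeSentiment("I'd like to schedule an appointment."));      // 0
```

Scoring each message in isolation is noisy; averaging over the last few final results from the caller is one way to avoid alerting on a single word.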

Search and Discovery

Search transcripts from past calls by keyword and date range:
curl -X POST https://api.usecrew.ai/v1/calls/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "appointment cancellation",
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-01-31"
    }
  }'

Accuracy and Limitations

Accuracy Factors

| Factor | Impact |
| --- | --- |
| Audio quality | Higher quality = better accuracy |
| Background noise | Reduces accuracy |
| Accents | May affect recognition |
| Technical jargon | May require custom vocabulary |
| Speaking speed | Very fast speech reduces accuracy |

Improving Accuracy

  • Use high-quality telephony connections
  • Minimize background noise
  • Add custom vocabulary for industry terms
  • Set language to the code that matches the speaker's language and region

Pricing

Transcription usage is billed per minute of processed audio:
| Plan | Included Minutes | Additional Cost |
| --- | --- | --- |
| Starter | 100/month | $0.02/min |
| Professional | 1,000/month | $0.015/min |
| Enterprise | Custom | Custom |
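The overage charge follows directly from the table: minutes beyond the included allowance multiplied by the per-minute rate. The sketch below covers the two plans with published rates. `monthlyOverageUsd` is a hypothetical helper, and it assumes usage is billed exactly as given, since the table does not say how partial minutes are rounded.

```javascript
// Rates from the pricing table, stored in thousandths of a dollar so the
// arithmetic stays in integers.
const PLANS = {
  starter: { includedMinutes: 100, overageMills: 20 },       // $0.02/min
  professional: { includedMinutes: 1000, overageMills: 15 }, // $0.015/min
};

// Returns the overage charge in dollars for one month of usage.
// Enterprise pricing is custom, so it is intentionally not covered here.
function monthlyOverageUsd(plan, minutesUsed) {
  const rates = PLANS[plan];
  if (!rates) {
    throw new Error(`No published rate for plan: ${plan}`);
  }
  const extraMinutes = Math.max(0, minutesUsed - rates.includedMinutes);
  return (extraMinutes * rates.overageMills) / 1000;
}

console.log(monthlyOverageUsd('starter', 250));       // 150 extra minutes -> 3
console.log(monthlyOverageUsd('professional', 1500)); // 500 extra minutes -> 7.5
console.log(monthlyOverageUsd('starter', 80));        // within the allowance -> 0
```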

Next Steps

  • Call Routing — Use transcripts to inform routing decisions
  • Webhooks — Process transcripts in your systems
  • Analytics — Analyze call patterns