Voice Transcription
Convert speech to text with NVIDIA Parakeet
Yolocode provides audio transcription powered by NVIDIA's Parakeet model.
Endpoints
- POST /api/transcribe-stream: Streaming transcription
- POST /api/transcribe: Non-streaming transcription
Supported formats
- WAV
- FLAC
- PCM (raw audio)
Usage
JSON (base64 audio)
curl -X POST https://api.yolocode.ai/api/transcribe-stream \
  -H "Content-Type: application/json" \
  -d '{
    "audio": "<base64_encoded_audio>",
    "languageCode": "en-US"
  }'
Multipart form data
curl -X POST https://api.yolocode.ai/api/transcribe-stream \
  -F "audio=@recording.wav" \
  -F "languageCode=en-US"
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| audio | string/file | required | Base64 audio (JSON) or file (multipart) |
| languageCode | string | en-US | BCP-47 language code |
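For the JSON variant, the audio parameter has to be filled with the base64 encoding of a local file. The sketch below shows one way to build that request body in a shell script. The helper name build_payload is hypothetical and not part of the API, and the script assumes that languageCode is a plain BCP-47 tag with no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# build_payload is a hypothetical helper, not part of the Yolocode API.
# It prints a JSON body with the two documented parameters: audio and languageCode.
build_payload() {
  local file="$1" lang="${2:-en-US}"   # en-US is the documented default
  local b64
  # Remove line breaks so the result is one base64 string on both GNU and BSD base64.
  b64="$(base64 < "$file" | tr -d '\n')"
  # Base64 output contains no characters that need JSON escaping.
  printf '{"audio":"%s","languageCode":"%s"}' "$b64" "$lang"
}

# Usage (pipes the body over stdin so large files do not hit argument-length limits):
# build_payload recording.wav en-US |
#   curl -X POST https://api.yolocode.ai/api/transcribe-stream \
#     -H "Content-Type: application/json" \
#     --data-binary @-
```

Sending the body over stdin with --data-binary @- rather than inline with -d keeps long base64 strings off the command line.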
Notes
- Max request duration: 300 seconds
- Audio format is auto-detected from the data
- The streaming endpoint returns results as they're processed
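The notes above can be reflected on the client side with a few curl options. The following is a minimal sketch, not an official client. The wrapper name transcribe_stream and the overridable CURL variable are assumptions made for illustration, and it assumes the 300-second limit applies to the total request time. The format of the streamed output is not specified here, so the sketch only prints whatever the server sends.

```shell
#!/usr/bin/env bash
# transcribe_stream is a hypothetical wrapper; only the URL and form fields
# come from the documentation above.
CURL="${CURL:-curl}"   # overridable so the command line can be inspected without a network call

transcribe_stream() {
  local file="$1" lang="${2:-en-US}"
  # --no-buffer  : print output as it arrives instead of buffering it
  # --max-time   : give up after 300 seconds (assumed to match the request limit)
  # --fail       : exit non-zero on HTTP error statuses
  "$CURL" --fail --no-buffer --max-time 300 \
    -X POST https://api.yolocode.ai/api/transcribe-stream \
    -F "audio=@${file}" \
    -F "languageCode=${lang}"
}

# Usage:
# transcribe_stream recording.wav en-US
```

No format flag is passed because the audio format is auto-detected from the data.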