multipart/form-data in the audio field. For best results, your audio should meet the quality and format guidelines described on this page. Poor audio quality — excessive background noise, very low bitrate, or very short clips — can reduce signal accuracy.
Recommended Formats
Dolva works with common audio formats. The following are recommended for best compatibility and analysis quality:
| Format | Extension | Notes |
|---|---|---|
| WAV (PCM) | .wav | Best quality; uncompressed |
| MP3 | .mp3 | Widely supported; moderate compression |
| M4A / AAC | .m4a | Good quality; common on mobile devices |
| FLAC | .flac | Lossless compression; smaller than WAV but larger than lossy formats |
| OGG Vorbis | .ogg | Open format; good compression |
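As a rough pre-upload check, the extensions in the table above can be tested on the client side. The following is a minimal sketch: the names `RECOMMENDED_EXTENSIONS` and `is_recommended_format` are illustrative and not part of any Dolva SDK, and the check looks only at the file name, not at the codec actually stored in the file.

```python
from pathlib import Path

# Extensions copied from the table of recommended formats.
# The constant and helper names are hypothetical, chosen for this sketch.
RECOMMENDED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}


def is_recommended_format(filename: str) -> bool:
    """Return True if the file name ends in a recommended extension.

    This inspects the name only; it does not verify the audio encoding.
    """
    return Path(filename).suffix.lower() in RECOMMENDED_EXTENSIONS
```

A file that fails this check may still be accepted, since the table lists recommended formats rather than an exhaustive set.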
Recording Guidelines
For accurate cognitive and emotion signal extraction, follow these recording best practices:
- Minimize background noise: Record in a quiet environment. Loud background noise (traffic, music, crowds) reduces signal quality.
- Use a close microphone: A headset or phone microphone held close to the speaker produces better results than far-field recording.
- Avoid excessive clipping: Recording volume should be high enough to be clear but not so high that the audio distorts.
- Include full utterances: Clips of at least a few seconds that contain natural speech yield the most reliable signals.
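Some of these guidelines can be checked locally before uploading. The sketch below uses Python's standard `wave` module to report the duration, channel count, and share of clipped samples in a 16-bit PCM WAV file. The function name `inspect_wav` and the definition of "clipped" (a sample at or beyond full scale) are assumptions made for illustration; Dolva does not publish specific thresholds on this page, so any cutoff you apply to these numbers is your own choice.

```python
import array
import sys
import wave


def inspect_wav(source, clip_level=32767):
    """Return duration, channel count, and clipped-sample ratio for a WAV file.

    `source` is a file path or a binary file-like object.
    Only 16-bit PCM is handled in this sketch.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        channels = wav.getnchannels()
        frame_count = wav.getnframes()
        duration = frame_count / wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(frame_count))

    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Count samples that sit at or beyond full scale (assumed clipping proxy).
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    ratio = clipped / len(samples) if samples else 0.0
    return {"duration_s": duration, "channels": channels, "clipped_ratio": ratio}
```

A very short `duration_s` or a high `clipped_ratio` suggests the clip is worth re-recording before analysis. Background noise is not measured here.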
File Size
Keep audio files to a reasonable size for upload performance. For most conversational recordings, a few minutes of audio is sufficient. Very long files (e.g., several hours) should be split into segments before uploading.
Mono vs. Stereo
Dolva accepts both mono and stereo audio. If you have a stereo file (e.g., two speakers on separate channels), consider whether you want to analyze the full mix or extract individual channels for per-speaker analysis.
Language and Accent
Dolva’s acoustic models are language-agnostic: they analyze audio signal properties rather than words, so they work across languages and accents without additional configuration.
If you’re unsure whether your audio format is supported, test with a short clip first using the /v1/analyze/cognitive or /v1/analyze/emotion endpoint. A 422 Unprocessable Entity response may indicate a format issue.
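The short-clip test described above might look like the following sketch, which uses only the Python standard library. Several details are assumptions rather than documented behavior: `BASE_URL` is a placeholder host, the helper names `build_multipart` and `probe_format` are hypothetical, and any authentication headers your account requires must be supplied through the `headers` argument. The `audio` field name and the endpoint paths come from this page.

```python
import urllib.error
import urllib.request
import uuid
from pathlib import Path

# Assumption: placeholder host. Replace with the real Dolva API base URL.
BASE_URL = "https://api.example.com"


def build_multipart(field_name: str, filename: str, payload: bytes):
    """Encode one file as a multipart/form-data body.

    Returns a (body, content_type) tuple.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def probe_format(path: str, endpoint: str = "/v1/analyze/cognitive", headers=None) -> int:
    """Upload a short clip in the `audio` field and return the HTTP status code."""
    clip = Path(path)
    body, content_type = build_multipart("audio", clip.name, clip.read_bytes())
    request = urllib.request.Request(BASE_URL + endpoint, data=body, method="POST")
    request.add_header("Content-Type", content_type)
    # Assumption: authentication, if required, is passed in by the caller.
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # a 422 here may indicate a format issue
```

If `probe_format` returns 422 for a clip, converting it to one of the recommended formats and retrying is a reasonable next step.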