When Dolva analyzes an audio file, it returns a JSON object containing the detected signals. This page explains how to interpret the response, what the signal values mean, and how to use them in your application.

Response Structure

Both analysis endpoints (/v1/analyze/cognitive and /v1/analyze/emotion) return an application/json body on success. The top-level structure contains a status indicator and a signals object with the extracted data:
Cognitive Response Example
{
  "status": "ok",
  "signals": {
    "cognitive_load": 0.72,
    "clarity": 0.85
  }
}
Emotion Response Example
{
  "status": "ok",
  "signals": {
    "valence": 0.6,
    "arousal": 0.4,
    "dominant_emotion": "calm"
  }
}
The exact fields in the signals object may vary as Dolva’s models evolve. Design your integration to handle additional or missing fields gracefully — use optional chaining or null checks rather than assuming all fields are always present.
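One way to follow this advice in Python is to route every field read through a small helper that tolerates a missing signals object, a missing field, or an unexpected type. This is a minimal sketch; the helper name read_signal and the field "fluency" are illustrative and not part of Dolva's API.

```python
from typing import Any, Optional


def read_signal(response: dict[str, Any], name: str) -> Optional[float]:
    """Return a numeric signal, or None if it is missing or not a number."""
    signals = response.get("signals")
    if not isinstance(signals, dict):
        return None
    value = signals.get(name)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)


body = {"status": "ok", "signals": {"cognitive_load": 0.72, "clarity": 0.85}}
print(read_signal(body, "cognitive_load"))  # present in the response
print(read_signal(body, "fluency"))         # hypothetical field, returns None
```

Callers then branch on None instead of catching KeyError, so a field that is added or removed later does not break the integration.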

Cognitive Signals

Cognitive signals describe the speaker’s mental state as reflected in their speech patterns.
| Field | Range | Interpretation |
| --- | --- | --- |
| cognitive_load | 0.0 to 1.0 | Higher values indicate greater cognitive effort or mental load |
| clarity | 0.0 to 1.0 | Higher values indicate clearer, more organized speech patterns |
Reading the scores: values closer to 1.0 indicate stronger expression of that signal. A cognitive_load of 0.8 suggests high mental effort; a value of 0.2 suggests relaxed, low-effort processing.
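If your application needs labels rather than raw numbers, the scores can be bucketed. The sketch below assumes cut-offs of 0.33 and 0.66, which are arbitrary illustrations and not thresholds published by Dolva; tune them against your own data.

```python
def describe_level(score: float, low: float = 0.33, high: float = 0.66) -> str:
    """Map a 0.0 to 1.0 score to a coarse label (cut-offs are assumptions)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score < low:
        return "low"
    if score < high:
        return "moderate"
    return "high"


print(describe_level(0.8))  # high
print(describe_level(0.2))  # low
```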

Emotion Signals

Emotion signals describe the affective properties of the audio.
| Field | Range | Interpretation |
| --- | --- | --- |
| valence | 0.0 to 1.0 | Emotional positivity: 1.0 = very positive, 0.0 = very negative |
| arousal | 0.0 to 1.0 | Energy level: 1.0 = highly activated/energized, 0.0 = calm/subdued |
| dominant_emotion | string | The strongest detected emotion (e.g., "calm", "tense", "engaged") |
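Valence and arousal can also be read together. As a rough sketch, the function below splits each axis at 0.5 and returns a two-part description. The midpoint and the label wording are assumptions made for illustration; Dolva does not define these quadrants.

```python
def affect_quadrant(valence: float, arousal: float, midpoint: float = 0.5) -> str:
    """Describe a (valence, arousal) pair by which side of the midpoint each falls on."""
    tone = "positive" if valence >= midpoint else "negative"
    energy = "high-energy" if arousal >= midpoint else "low-energy"
    return f"{tone}, {energy}"


# Values taken from the emotion response example above.
print(affect_quadrant(0.6, 0.4))  # positive, low-energy
```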

Using Signals in Your Application

The following example uploads an audio file to the emotion endpoint, reads the response defensively, and flags recordings with low valence:
Python
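import os

import requests

token = os.environ["DOLVA_API_TOKEN"]

# Upload the audio file for emotion analysis.
with open("audio.wav", "rb") as f:
    resp = requests.post(
        "https://api.dolva.ai/v1/analyze/emotion",
        headers={"Authorization": f"Bearer {token}"},
        files={"audio": f},
        timeout=30,  # seconds; adjust for large files
    )

# Fail early on HTTP errors instead of parsing an error body.
resp.raise_for_status()
data = resp.json()

# Fields may be missing, so read them defensively.
signals = data.get("signals") or {}
valence = signals.get("valence")
if valence is not None and valence < 0.3:
    print("Low valence detected; consider flagging for review")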

Tracking Changes Over Time

Dolva’s signals are most powerful when compared across multiple recordings of the same person or conversation context. A single data point gives you a snapshot; a series of recordings reveals trends — cognitive fatigue accumulating across a workday, or emotional tone shifting across a therapy program. Store the full response JSON alongside a timestamp and a subject identifier, then compute trends in your own data layer.
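As one possible storage layout, each response can be appended to a JSON Lines file together with a UTC timestamp and a subject identifier. The file name, the record keys (subject_id, recorded_at, response), and the helper name append_record are all assumptions chosen for this sketch; a database table with the same three columns would work equally well.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_record(path: Path, subject_id: str, response: dict) -> dict:
    """Append one analysis response to a JSON Lines file and return the stored record."""
    record = {
        "subject_id": subject_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "response": response,  # full response JSON, kept unmodified
    }
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    sample = {"status": "ok", "signals": {"cognitive_load": 0.72, "clarity": 0.85}}
    append_record(Path("dolva_signals.jsonl"), "subject-001", sample)
```

Keeping the full response rather than selected fields means that signals added to the API later are already present in your history.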
Normalize signals to your own baseline before drawing conclusions. Individual speakers differ in their natural acoustic profiles — what counts as “high” cognitive load varies person to person.
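A simple way to normalize is to express each new score as a z-score against the same speaker's earlier recordings. This is a sketch under the assumption that you have already collected a per-speaker history; z-scoring is one common choice here, not a method prescribed by Dolva, and the sample values are invented.

```python
from statistics import mean, stdev
from typing import Optional


def baseline_zscore(history: list[float], current: float) -> Optional[float]:
    """Return how many standard deviations `current` sits from the speaker's own mean.

    Returns None when there is too little history, or no variation, to compare against.
    """
    if len(history) < 2:
        return None
    spread = stdev(history)
    if spread == 0:
        return None
    return (current - mean(history)) / spread


# Invented cognitive_load history for one speaker.
past_scores = [0.40, 0.45, 0.50, 0.55, 0.60]
z = baseline_zscore(past_scores, 0.72)
if z is not None and z > 2.0:
    print("cognitive_load is well above this speaker's usual range")
```

The same raw score of 0.72 would produce a much smaller z-score for a speaker whose history already sits around 0.7, which is the per-person difference the baseline is meant to capture.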