Single-word/sentence assessment usage

This mode assesses a single word or sentence from a single audio file (batch mode). The developer's server or app sends one HTTP call to the ELSA server with all the information needed for the assessment. The ELSA server processes the audio synchronously and returns the result in the same call once it is ready. To assess a word or sentence, the developer needs to provide:

The word or sentence expected to be found in the audio
The audio to be analyzed
The API token provided by ELSA

Header Parameters

Authorization: Elsa string required

multipart/form-data

Request Body

Audio data can be supplied in three different ways:

audio_file: @<local_audio_file_path> → sends a local audio file (note the ‘@’, which marks a multipart upload of a binary file)
audio_path: <remote_audio_path_url> → points to a URL path to an audio file available online. Make sure the audio file is accessible.
audio_data: <base64_data_string> → audio data sent as a string, encoded in base64
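The three options differ only in which form field carries the audio. A minimal Python sketch of preparing the value for each, using only the standard library. The helper name audio_form_field and its kind argument are illustrative assumptions and are not part of the ELSA API:

```python
from __future__ import annotations

import base64
from pathlib import Path


def audio_form_field(kind: str, source: str) -> tuple[str, str | bytes]:
    """Return (field_name, value) for one of the three audio options.

    kind is a hypothetical selector used only in this sketch:
      "file" -> audio_file, raw bytes read from a local path
      "url"  -> audio_path, the remote URL passed through unchanged
      "data" -> audio_data, bytes of a local file encoded as base64 text
    """
    if kind == "file":
        return "audio_file", Path(source).read_bytes()
    if kind == "url":
        return "audio_path", source
    if kind == "data":
        encoded = base64.b64encode(Path(source).read_bytes()).decode("ascii")
        return "audio_data", encoded
    raise ValueError(f"unknown audio kind: {kind!r}")
```

The ‘@’ prefix shown for audio_file is the convention command-line clients such as curl use to attach a file. In code, the file is simply read and sent as bytes.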

sentence string required

audio_file binary required

Possible values: Value must match regular expression @<local_audio_file_path>;<remote_audio_path_url> ;<base64_data_string>
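The request described above could be assembled roughly as follows. This is a sketch under stated assumptions, not ELSA's reference client: the endpoint URL is a placeholder because this section does not give one, the function name is made up for illustration, and the Authorization value is passed through unchanged, so its exact format should be taken from ELSA's documentation.

```python
from __future__ import annotations

import urllib.request
import uuid

# Hypothetical placeholder: the real endpoint URL is not given in this section.
ASSESSMENT_URL = "https://elsa.example.invalid/assessment"


def build_assessment_request(
    authorization: str,
    sentence: str,
    audio_bytes: bytes,
    filename: str = "audio.wav",
    url: str = ASSESSMENT_URL,
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = [
        # Text field: the word or sentence expected in the audio.
        delimiter,
        b'Content-Disposition: form-data; name="sentence"',
        b"",
        sentence.encode("utf-8"),
        # Binary field: the audio itself.
        delimiter,
        b'Content-Disposition: form-data; name="audio_file"; filename="'
        + filename.encode("utf-8")
        + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        audio_bytes,
        # Closing delimiter, followed by a final CRLF.
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    headers = {
        # Assumption: the caller supplies the complete header value.
        "Authorization": authorization,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Because the server answers in the same call, sending the request with urllib.request.urlopen(request, timeout=...) blocks until the assessment is ready, so a generous timeout is a sensible choice for longer audio.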

Responses

application/json

Schema

api_version string
total_time number
utterance object[]
Array [
    initial_silence boolean
    sentence string
    sentence_id integer
    total_time number
    has_speech boolean
    attempt_type string
    snr number
    decision string
    nativeness_score number
    nativeness_score_partial number
    ipa string
    intonation_score number
    fluency_score integer
    speed_metrics object
        words_per_minute number
        syllables_per_minute number
        phones_per_minute number
        articulated_words_per_minute number
        articulated_syllables_per_minute number
        articulated_phones_per_minute number
    pronunciation_rate_metrics object
        correct_words_per_minute number
        correct_syllables_per_minute number
        correct_phones_per_minute number
        correct_words_ratio number
        correct_syllables_ratio number
        correct_phones_ratio number
    pausing_metrics object
        pause_ratio number
        pause_ratio_without_initial_sil number
        pausing_score_percentage integer
    advanced_metrics_confidence string
    words object[]
    Array [
        start_index integer
        end_index integer
        start_time number
        end_time number
        trans_arpabet string
        decoded boolean
        text string
        text_orig string
        ipa string
        nativeness_score integer
        decision string
        phonemes object[]
        Array [
            start_index integer
            end_index integer
            text string
            trans string
            trans_arpabet string
            start_time number
            end_time number
            decision string
            nativeness_score integer
            phoneme_error string
            phoneme_error_arpabet string
        ]
        word_stress object[]
        Array [
            start_index integer
            end_index integer
            decision string
            stress_level_measured string
        ]
    ]
]
success boolean

Example (from schema)

{
  "api_version": "string",
  "total_time": 0,
  "utterance": [
    {
      "initial_silence": true,
      "sentence": "string",
      "sentence_id": 0,
      "total_time": 0,
      "has_speech": true,
      "attempt_type": "string",
      "snr": 0,
      "decision": "string",
      "nativeness_score": 0,
      "nativeness_score_partial": 0,
      "ipa": "string",
      "intonation_score": 0,
      "fluency_score": 0,
      "speed_metrics": {
        "words_per_minute": 0,
        "syllables_per_minute": 0,
        "phones_per_minute": 0,
        "articulated_words_per_minute": 0,
        "articulated_syllables_per_minute": 0,
        "articulated_phones_per_minute": 0
      },
      "pronunciation_rate_metrics": {
        "correct_words_per_minute": 0,
        "correct_syllables_per_minute": 0,
        "correct_phones_per_minute": 0,
        "correct_words_ratio": 0,
        "correct_syllables_ratio": 0,
        "correct_phones_ratio": 0
      },
      "pausing_metrics": {
        "pause_ratio": 0,
        "pause_ratio_without_initial_sil": 0,
        "pausing_score_percentage": 0
      },
      "advanced_metrics_confidence": "string",
      "words": [
        {
          "start_index": 0,
          "end_index": 0,
          "start_time": 0,
          "end_time": 0,
          "trans_arpabet": "string",
          "decoded": true,
          "text": "string",
          "text_orig": "string",
          "ipa": "string",
          "nativeness_score": 0,
          "decision": "string",
          "phonemes": [
            {
              "start_index": 0,
              "end_index": 0,
              "text": "string",
              "trans": "string",
              "trans_arpabet": "string",
              "start_time": 0,
              "decision": "string",
              "nativeness_score": 0,
              "phoneme_error": "string",
              "phoneme_error_arpabet": "string"
            }
          ],
          "word_stress": [
            {
              "start_index": 0,
              "end_index": 0,
              "decision": "string",
              "stress_level_measured": "string"
            }
          ]
        }
      ]
    }
  ],
  "success": true
}
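Once the JSON response is parsed, the nested utterance → words → phonemes structure can be flattened for display or logging. A small sketch, assuming the field names shown in the schema. The function name is illustrative, and treating success = false as an error is an assumption, since this section does not describe what a failed response looks like:

```python
from __future__ import annotations

from typing import Any


def summarize_assessment(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a parsed response into one summary dict per word."""
    # Assumption: a falsy "success" means there is no usable result.
    if not response.get("success"):
        raise ValueError("assessment did not succeed")

    summary = []
    for utterance in response.get("utterance", []):
        for word in utterance.get("words", []):
            summary.append({
                "sentence": utterance.get("sentence"),
                "word": word.get("text"),
                "nativeness_score": word.get("nativeness_score"),
                "decision": word.get("decision"),
                # (phoneme text, decision) pairs, in the order returned.
                "phoneme_decisions": [
                    (p.get("text"), p.get("decision"))
                    for p in word.get("phonemes", [])
                ],
            })
    return summary
```

Missing keys are read with dict.get so that a response lacking an optional field yields None instead of raising.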
