4 May 2023

Efficient Bulk Transcription from the Command Line

Introduction

In today’s data-driven world, businesses must leverage audio transcriptions to enhance their understanding of customer experience (CX) and optimize interactions. ElevateAI, powered by NICE’s award-winning AI, simplifies bulk audio transcription and delivers valuable CX insights.

In this post, we’ll explore how to upload multiple audio files and perform bulk transcription from the command line, walking through each step of a sample Python script that automates the process using the ElevateAI API. The code is available from its GitHub repository for easy reference.

Within the GitHub repo, you’ll find a submodule – the ElevateAI Python SDK – which includes the ElevateAI.py file to interface with the ElevateAI API. So, what are the high-level steps involved in transcription? There are three: inform ElevateAI that you want to transcribe an audio file, upload the file, and download the transcriptions and CX insights once ElevateAI completes the task. The functions in ElevateAI.py – DeclareAudioInteraction, UploadInteraction, GetPunctuatedTranscript (or GetWordByWordTranscription), and GetAIResults – will handle the heavy lifting.

To break it down further, this sample will:

  • Parse command-line arguments to get the list of audio files and the path to a configuration file.
  • Check if the configuration file and audio files exist and load the configuration data.
  • Upload each audio file and store the interaction status.
  • Display the interaction status of each file in a live-updating table.

Now, let’s dive into the code and examine the functions.

Step 1. Setup

Import necessary libraries and modules

Starting off, we need to import the ElevateAI Python SDK and rich, a library for creating tables in the terminal.

import sys
import os
import json
import time
import argparse
from ElevateAIPythonSDK import ElevateAI
from rich.live import Live
from rich.table import Table

Step 2. Check Status

Check and update the status of each uploaded file

Here we take the list of upload results and ask the API for the current status of each interaction.

def update_results(upload_results, config):
    updated_results = []
    for row in upload_results:
        response = ElevateAI.GetInteractionStatus(row[1], config["api_token"])
        response_json = response.json()
        new_row = (row[0], row[1], response_json["status"])
        updated_results.append(new_row)
    return updated_results
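The script keeps polling on a timer until the user exits. If you would rather stop once every file has finished, a small helper along these lines could decide when polling is done. This is only a sketch: the status strings in `TERMINAL_STATUSES` are assumptions about what the API reports, so check them against the ElevateAI documentation, and `all_finished` is a hypothetical name that is not part of the SDK.

```python
# Assumed terminal statuses; verify the exact strings in the ElevateAI docs.
TERMINAL_STATUSES = {"processed", "processingFailed", "fileUploadFailed"}


def all_finished(results, terminal_statuses=TERMINAL_STATUSES):
    """Return True when every (filename, identifier, status) row is terminal."""
    return all(row[2] in terminal_statuses for row in results)


rows = [
    ("call1.wav", "id-1", "processed"),
    ("call2.wav", "id-2", "processing"),
]
print(all_finished(rows))      # one file is still processing
print(all_finished(rows[:1]))  # the only row has finished
```

A polling loop could call this after each status refresh and break out when it returns True.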

Step 3. Display Status

Create a table to display the interaction status of each audio file using the rich library

For this test, we only want to print the name of each uploaded file, its unique identifier, and its current status.

def generate_table(results) -> Table:
    table = Table()
    table.add_column("Filename")
    table.add_column("Identifier")
    table.add_column("Status")

    for row in results:
        table.add_row(row[0], row[1], row[2])

    return table
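One optional refinement is to color the status column so failures stand out. The sketch below builds a string using rich's console markup (`[red]...[/red]`), which could be passed to `table.add_row` in place of the plain status. `styled_status` is a hypothetical helper, and treating `processed` as the success status is an assumption about the API's status values.

```python
def styled_status(status):
    """Wrap a status string in rich console markup based on simple rules."""
    lowered = status.lower()
    if "fail" in lowered or "error" in lowered:
        color = "red"       # anything that looks like a failure
    elif lowered == "processed":
        color = "green"     # assumed name of the success status
    else:
        color = "yellow"    # still in progress
    return f"[{color}]{status}[/{color}]"


print(styled_status("processed"))     # [green]processed[/green]
print(styled_status("Upload error"))  # [red]Upload error[/red]
```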

Step 4. Process & Return

Process command-line arguments and return the list of audio files and the path to the configuration file

Files to be uploaded are passed as arguments. The script itself places no limit on the number of files we can pass in.

def process_args(args):
    parser = argparse.ArgumentParser(description='Upload audio files to ElevateAI.')
    parser.add_argument('-f', '--files', nargs='+', help='Audio files to upload')
    parser.add_argument('-c', '--config', default='config.json', help='Path to config.json file')
    arguments = parser.parse_args(args)

    if arguments.files is None:
        parser.print_help()
        sys.exit(0)

    return arguments.files, arguments.config
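To see what `process_args` hands back, the snippet below rebuilds the same parser and feeds it a sample argument list instead of the real command line. The file names are placeholders.

```python
import argparse

parser = argparse.ArgumentParser(description="Upload audio files to ElevateAI.")
parser.add_argument("-f", "--files", nargs="+", help="Audio files to upload")
parser.add_argument("-c", "--config", default="config.json", help="Path to config.json file")

# Equivalent to passing: -f call1.wav call2.wav
arguments = parser.parse_args(["-f", "call1.wav", "call2.wav"])
print(arguments.files)   # ['call1.wav', 'call2.wav']
print(arguments.config)  # config.json (the default, since -c was not given)
```

Because `--files` uses `nargs='+'`, every value after `-f` is collected into one list until the next option appears.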

Step 5. Load Configuration

Check that the configuration file and audio files exist, and load the configuration

Before trying to do any real work, let’s make sure we have a configuration file and that every file passed in exists.

def check_files(config_file, audio_files):
    if not os.path.isfile(config_file):
        print(f"Config file '{config_file}' not found. A config.json file is required.")
        sys.exit(1)

    if not all(os.path.isfile(file) for file in audio_files):
        print("One or more audio files do not exist. Please check the file paths and try again.")
        sys.exit(1)

    with open(config_file) as f:
        config = json.load(f)

    return config
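The functions in this post read only one key from the configuration, `api_token`. A stricter loader could fail early when that key is missing, rather than raising a `KeyError` during the first API call. The sketch below is one way to do that; `load_config` is a hypothetical helper, and the token value is a placeholder.

```python
import json
import os
import tempfile


def load_config(config_file, required_keys=("api_token",)):
    """Load a JSON config and raise ValueError if a required key is absent."""
    with open(config_file) as f:
        config = json.load(f)
    missing = [key for key in required_keys if key not in config]
    if missing:
        raise ValueError(f"Config is missing required keys: {', '.join(missing)}")
    return config


# Write a throwaway config file to demonstrate the expected shape.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as f:
        json.dump({"api_token": "YOUR-API-TOKEN"}, f)
    config = load_config(path)
    print(sorted(config))  # ['api_token']
```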

Step 6. Upload Audio

Upload each audio file and store the interaction status in a list.

Here we loop over the list of files and upload them one at a time. An improvement would be to upload them concurrently, for example with threads.

def upload_files(audio_files, config):
    upload_results = []
    for file in audio_files:
        # upload_file() returns the response from the declare call,
        # which carries the interaction identifier.
        response = upload_file(file, config)
        interaction_id = response.json()["interactionIdentifier"]

        # The status code checked here belongs to that declare response.
        status = "Uploaded Successfully" if response.status_code == 201 else "Upload error"
        upload_results.append((file, interaction_id, status))

    return upload_results
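As a sketch of the concurrency idea mentioned above, the standard library's `ThreadPoolExecutor` can run several uploads at once. In the version below, `upload_one` stands in for the post's `upload_file` and is passed as a parameter so the example runs without network access. `upload_files_concurrently`, `fake_upload`, and the `max_workers` value are illustrative assumptions, and whether the SDK functions are safe to call from multiple threads has not been verified here.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_files_concurrently(audio_files, config, upload_one, max_workers=4):
    """Run upload_one(file, config) in worker threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as audio_files.
        responses = list(pool.map(lambda f: upload_one(f, config), audio_files))
    return list(zip(audio_files, responses))


def fake_upload(file_path, config):
    """Stand-in for upload_file(); returns a made-up identifier."""
    return f"id-for-{file_path}"


results = upload_files_concurrently(["a.wav", "b.wav"], {"api_token": "x"}, fake_upload)
print(results)  # [('a.wav', 'id-for-a.wav'), ('b.wav', 'id-for-b.wav')]
```

Threads suit this workload because each upload spends most of its time waiting on the network rather than using the CPU.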

Step 7. Upload to ElevateAI

Upload a single audio file and return the response from the DeclareAudioInteraction() call.

Here is where we do the majority of the work. Each file is announced to ElevateAI and then uploaded.

def upload_file(file_path, config):
    token = config['api_token']
    language_tag = "en-us"
    version = "default"
    transcription_mode = "highAccuracy"

    file_name = os.path.basename(file_path)

    declare_response = ElevateAI.DeclareAudioInteraction(language_tag, version, None, token, transcription_mode, False)
    declare_json = declare_response.json()
    interaction_id = declare_json["interactionIdentifier"]

    ElevateAI.UploadInteraction(interaction_id, token, file_path, file_name)
    return declare_response

Step 8. Live Updates

The main function puts it all together.

We call the functions above to check and load the configuration file, upload the audio files, and display the status in a table that refreshes every 15 seconds. Only one table is displayed, and the statuses are refreshed until the user exits the program.

def main(args):
    try:
        audio_files, config_file = process_args(args)
        config = check_files(config_file, audio_files)
        upload_results = upload_files(audio_files, config)

        # Live redraws the same table in place instead of printing a new one.
        with Live(generate_table(upload_results), refresh_per_second=4) as live:
            while True:
                time.sleep(15)
                upload_results = update_results(upload_results, config)
                live.update(generate_table(upload_results))
    except KeyboardInterrupt:
        print("\nGoodbye!")
        sys.exit(0)


# Run main() with the command-line arguments, minus the script name.
if __name__ == "__main__":
    main(sys.argv[1:])
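The script stops at showing status, but the third step described in the introduction is downloading the transcript and CX insights once processing completes. A minimal sketch of that step follows. It assumes `GetPunctuatedTranscript` and `GetAIResults` take an interaction identifier and a token, as `GetInteractionStatus` does, and return a response with a `.json()` method; confirm both against ElevateAI.py before relying on it. The `fetch_results` name and its `sdk` parameter are hypothetical; in real use you would pass the imported `ElevateAI` module as `sdk`.

```python
def fetch_results(sdk, interaction_id, token):
    """Fetch the transcript and AI results for one processed interaction.

    Assumes both SDK calls accept (interaction_id, token) and return an
    object with a .json() method; verify against ElevateAI.py.
    """
    transcript = sdk.GetPunctuatedTranscript(interaction_id, token).json()
    ai_results = sdk.GetAIResults(interaction_id, token).json()
    return {"transcript": transcript, "ai_results": ai_results}
```

This could be called for each row in the results list once its status shows that processing has finished.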


A copy of this sample bulk transcription code is available from its GitHub repository.

This script provides a convenient way to upload multiple audio files to ElevateAI for bulk transcription by following these steps:

  • Parse command-line arguments to get the list of audio files and the path to the config file.
  • Check if the config file and audio files exist and load the configuration data.
  • Upload each audio file and store the interaction status.
  • Display the interaction status in a live-updating table.

Neeraj Verma

Neeraj has extensive experience in the enterprise software space, having joined speech technology pioneer Nexidia straight out of college and spent his career in technology and customer experience. He transitioned to NICE with their 2016 acquisition of Nexidia and currently serves as the Vice President of Artificial Intelligence (AI), leading ElevateAI by NICE.