In today’s data-driven world, businesses must leverage audio transcriptions to enhance their understanding of customer experience (CX) and optimize interactions. ElevateAI, powered by NICE’s award-winning Enlighten AI, simplifies bulk audio transcription and delivers valuable CX insights.
In this blog, we’ll explore how to upload multiple audio files and perform bulk transcription from the command line, guiding you through each step of a sample Python script that efficiently automates the process using the ElevateAI Cloud API. Access the code from its GitHub repository for easy reference.
Within the GitHub repository, you’ll find a submodule – the ElevateAI Python SDK – which includes the ElevateAI.py file to interface with the ElevateAI API. So, what are the high-level steps involved in transcription? There are three: inform ElevateAI that you want to transcribe an audio file, upload the file, and download the transcriptions and CX insights once ElevateAI completes the task. The functions in ElevateAI.py – DeclareAudioInteraction, UploadInteraction, GetPunctuatedTranscript (or GetWordByWordTranscription), and GetAIResults – will handle the heavy lifting.
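The script in this post covers the first two steps (declare and upload) plus status polling; the third step, downloading results, uses the SDK functions named above. The response shapes are not shown in this post, so treat the following as a minimal sketch: it assumes `GetPunctuatedTranscript` and `GetAIResults` accept the interaction identifier and API token, mirroring the `GetInteractionStatus` call used later in the script, and you should verify the signatures against the SDK source.

```python
# Sketch of step three: downloading results once processing finishes.
# Assumes the SDK functions accept (interaction_id, token) -- verify
# against the ElevateAI.py source in the SDK submodule.
def download_results(interaction_id, token):
    # Imported inside the function so the sketch loads even without the SDK installed.
    from ElevateAIPythonSDK import ElevateAI

    transcript_response = ElevateAI.GetPunctuatedTranscript(interaction_id, token)
    ai_response = ElevateAI.GetAIResults(interaction_id, token)
    return transcript_response.json(), ai_response.json()
```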
To break it down further, the sample parses command-line arguments to get the list of audio files and the path to a configuration file, uploads each file, and then polls ElevateAI for status.
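The script reads its API token from that configuration file. A minimal config.json might look like the following; `api_token` is the only key this sample reads, and the placeholder value is not a real token:

```json
{
  "api_token": "YOUR-ELEVATEAI-API-TOKEN"
}
```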
Now, let’s dive into the code and examine the functions.
Starting off, we need to import the ElevateAI Python SDK and rich, a library for creating tables in the terminal.
```python
import sys
import os
import json
import time
import argparse

from ElevateAIPythonSDK import ElevateAI
from rich.live import Live
from rich.table import Table
```
Here we take in a list of uploaded file identifiers and check their status with the API.
```python
def update_results(upload_results, config):
    updated_results = []
    for row in upload_results:
        response = ElevateAI.GetInteractionStatus(row[1], config["api_token"])
        response_json = response.json()
        new_row = (row[0], row[1], response_json["status"])
        updated_results.append(new_row)
    return updated_results
```
For this test, we only want to print the name of each uploaded file, its unique identifier, and its current status.
```python
def generate_table(results) -> Table:
    table = Table()
    table.add_column("Filename")
    table.add_column("Identifier")
    table.add_column("Status")
    for row in results:
        table.add_row(row[0], row[1], row[2])
    return table
```
Files to be uploaded are passed as arguments. There is no limit to the number of files we can pass in to upload.
```python
def process_args(args):
    parser = argparse.ArgumentParser(description='Upload audio files to ElevateAI.')
    parser.add_argument('-f', '--files', nargs='+', help='Audio files to upload')
    parser.add_argument('-c', '--config', default='config.json', help='Path to config.json file')
    arguments = parser.parse_args(args)
    if arguments.files is None:
        parser.print_help()
        sys.exit(0)
    return arguments.files, arguments.config
```
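To see how the parser behaves, here is a small self-contained check (repeating a condensed copy of the parser above) showing that multiple files and the default config path are parsed as expected:

```python
import argparse

# Condensed copy of the parser defined in process_args above.
parser = argparse.ArgumentParser(description='Upload audio files to ElevateAI.')
parser.add_argument('-f', '--files', nargs='+', help='Audio files to upload')
parser.add_argument('-c', '--config', default='config.json', help='Path to config.json file')

# Any number of files can follow -f thanks to nargs='+'.
ns = parser.parse_args(['-f', 'call1.wav', 'call2.wav', 'call3.wav'])
print(ns.files)   # ['call1.wav', 'call2.wav', 'call3.wav']
print(ns.config)  # 'config.json' (the default, since -c was not given)
```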
Before trying to do any real work, let’s make sure the configuration file exists and that every audio file passed in does as well.
```python
def check_files(config_file, audio_files):
    if not os.path.isfile(config_file):
        print(f"Config file '{config_file}' not found. A config.json file is required.")
        sys.exit(1)
    if not all(os.path.isfile(file) for file in audio_files):
        print("One or more audio files do not exist. Please check the file paths and try again.")
        sys.exit(1)
    with open(config_file) as f:
        config = json.load(f)
    return config
```
Here we loop over the list of files and upload them one at a time; an improvement would be to use concurrency and threads.
```python
def upload_files(audio_files, config):
    upload_results = []
    for file in audio_files:
        response = upload_file(file, config)
        # A failed declare may not include an interactionIdentifier in its
        # response, so fall back to "N/A" rather than raising a KeyError.
        identifier = response.json().get("interactionIdentifier", "N/A")
        if response.status_code == 201:
            upload_results.append((file, identifier, "Uploaded Successfully"))
        else:
            upload_results.append((file, identifier, "Upload error"))
    return upload_results
```
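As noted above, the sequential loop could be parallelized, since each upload spends most of its time waiting on network I/O. Here is a minimal sketch using `concurrent.futures` with a stand-in upload function (the real script would pass `upload_file` and the config instead of `fake_upload`):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for upload_file(file, config); the real call performs network
# I/O, which is exactly the kind of work that benefits from threads.
def fake_upload(file):
    return (file, f"id-{file}", "Uploaded Successfully")

audio_files = ["a.wav", "b.wav", "c.wav"]

# executor.map preserves input order, so the results table stays aligned
# with the order the files were passed on the command line.
with ThreadPoolExecutor(max_workers=4) as executor:
    upload_results = list(executor.map(fake_upload, audio_files))

print(upload_results[0])  # ('a.wav', 'id-a.wav', 'Uploaded Successfully')
```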
Here is where we do the majority of the work. Each file is announced to ElevateAI and then uploaded.
```python
def upload_file(file_path, config):
    token = config['api_token']
    language_tag = "en-us"
    version = "default"
    transcription_mode = "highAccuracy"
    file_name = os.path.basename(file_path)
    declare_response = ElevateAI.DeclareAudioInteraction(
        language_tag, version, None, token, transcription_mode, False
    )
    declare_json = declare_response.json()
    interaction_id = declare_json["interactionIdentifier"]
    ElevateAI.UploadInteraction(interaction_id, token, file_path, file_name)
    return declare_response
```
We call the functions above to check and load the configuration file, upload the audio files, and display the statuses in a table, refreshing every 15 seconds. A single table is displayed, and the statuses are updated until the user exits the program.
```python
def main(args):
    try:
        audio_files, config_file = process_args(args)
        config = check_files(config_file, audio_files)
        upload_results = upload_files(audio_files, config)
        with Live(generate_table(upload_results), refresh_per_second=4) as live:
            while True:
                time.sleep(15)
                upload_results = update_results(upload_results, config)
                live.update(generate_table(upload_results))
    except KeyboardInterrupt:
        print("\nGoodbye!")
        sys.exit(0)


if __name__ == "__main__":
    # e.g. python upload.py -f call1.wav call2.wav -c config.json
    main(sys.argv[1:])
```
Get a copy of the sample bulk transcription from command line code from its GitHub repository.
This script provides a convenient way to upload multiple audio files to ElevateAI for bulk transcription: it parses the command-line arguments, validates the configuration and audio files, declares and uploads each interaction, and then polls the API to display live transcription status in a table.
Want more? Visit our Documentation Hub >> ElevateAI Documentation
Ready to Get Started? >> elevateai.com/getstarted