import asyncio
import aiohttp
import requests
import os
import sys
# These two lines allow asyncio to be used in Jupyter Notebooks
import nest_asyncio
nest_asyncio.apply()
from pathlib import Path
from typing import List, Optional
from time import perf_counter
Introduction
Discovering Gladia
Gladia offers a “Speech-to-Text” service powered by its Whisper-Zero ASR technology. You can start exploring it for free, with 10 hours of audio transcription per month.
Usage and Documentation
While Gladia’s documentation and API reference offer Python examples for both pre-recorded and live (streaming) scenarios, one important piece is not covered: integration with Python’s asyncio.
Asyncio with Gladia
This post delves into harnessing asyncio with the Gladia API, enabling applications to execute multiple tasks in parallel.
We’ll walk through the transcription process, which involves several I/O-bound tasks:
- Uploading audio files
- Initiating the transcription job
- Awaiting completion of the transcription
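To make the idea concrete before touching the API, here is a minimal sketch of several I/O-bound calls overlapping on a single event loop. `fake_io` is a hypothetical stand-in that sleeps instead of talking to a server; it is not part of Gladia or aiohttp.

```python
import asyncio
from time import perf_counter


async def fake_io(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound call such as an upload."""
    await asyncio.sleep(delay)  # while this waits, the event loop runs other tasks
    return name


async def main() -> tuple:
    start = perf_counter()
    results = await asyncio.gather(
        fake_io("file_a", 0.3),
        fake_io("file_b", 0.2),
        fake_io("file_c", 0.1),
    )
    return results, perf_counter() - start


results, elapsed = asyncio.run(main())
print(results)
print(f"Elapsed: {elapsed:.2f} seconds")
```

The elapsed time should be close to the longest delay rather than the sum of the three, because the waits overlap.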
Configuring your account
You can go to the Getting Started section in the documentation to configure your account and get an API key.
Resources
- Gladia Documentation
- Gladia samples: python
- Batch Processing OpenAI using asyncio and Instructor with Python
- In creating this guide, I leveraged Gemini and mostly ChatGPT, tools that helped me, a non-native English speaker, refine my ideas and express them clearly.
Asyncio with Gladia: A Step-by-Step Guide
Import Libraries
Get API Key
if "google.colab" in sys.modules:
    # If running in Colab
    from google.colab import userdata
    x_gladia_key = userdata.get('GLADIA_API_KEY')
else:
    from dotenv import load_dotenv
    load_dotenv("creds/.env")
    x_gladia_key = os.environ.get('GLADIA_API_KEY')
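Since `os.environ.get()` silently returns `None` when the variable is missing, it can help to fail early instead of discovering the problem on the first request. The helper below is a small sketch; the name `require_api_key` is my own and not part of any library.

```python
import os


def require_api_key(name: str = "GLADIA_API_KEY") -> str:
    """Return the environment variable `name`, or raise if it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file or Colab secrets")
    return value
```

Calling `require_api_key()` right after `load_dotenv(...)` would surface a missing key immediately.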
Context manager to measure time
# https://stackoverflow.com/a/69156219
class catchtime:
    def __enter__(self):
        self.start = perf_counter()
        return self

    def __exit__(self, type, value, traceback):
        self.seconds = round(perf_counter() - self.start, 2)
        m, s = divmod(self.seconds, 60)
        self.m, self.s = int(m), round(s, 1)
        self.readout = f'Time: {self.seconds:.3f} seconds'
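A quick usage sketch of this pattern. The class is repeated here in a shortened form so the snippet runs on its own, and `sleep` stands in for the work being timed.

```python
from time import perf_counter, sleep


class catchtime:
    def __enter__(self):
        self.start = perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.seconds = round(perf_counter() - self.start, 2)


with catchtime() as t:
    sleep(0.2)  # stand-in for the work being timed

# `seconds` is only set when the block exits, so read it afterwards
print(f"Time: {t.seconds} seconds")
```

This is why the functions below print `t.seconds` after the `with` block rather than inside it.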
Python Async Functions
async def process_response(
    response: aiohttp.client_reqrep.ClientResponse
) -> Optional[dict]:
    """Process aiohttp requests
    """
    if response.status not in (200, 201):
        print(f"- Request failed with status: {response.status}")
        json_response = await response.text()
        print(f"Json Response: {json_response}")
        print('- End of work')
        return None
    else:
        print("- Request successful")
        return await response.json()
async def async_make_request(
    session: aiohttp.client.ClientSession,
    url: str, headers: dict,
    method: str = "GET",
    data: aiohttp.formdata.FormData = None,
    json: dict = None
) -> Optional[dict]:
    """Send aiohttp requests
    """
    if method == "POST":
        async with session.post(
            url, headers=headers, data=data, json=json
        ) as response:
            return await process_response(response)
    else:
        async with session.get(url, headers=headers) as response:
            return await process_response(response)
async def a_upload_file(
    session: aiohttp.client.ClientSession,
    file_path: Path
) -> dict:
    """Upload audio file to Gladia
    """
    with catchtime() as t:
        file_name = str(file_path.with_suffix(''))
        content_type = f"audio/{file_path.suffix[1:]}"

        # The file must stay open while the request is being sent
        with open(file_path, "rb") as f:
            data = aiohttp.FormData()
            data.add_field("audio", f, filename=file_name, content_type=content_type)

            print("- Uploading file to Gladia...")
            json_response = await async_make_request(
                session, "https://api.gladia.io/v2/upload/",
                headers=headers, method="POST", data=data
            )
    print(f'Upload Time: {t.seconds} seconds for `{file_path.name}`')
    return json_response
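The upload function builds the content type from the file extension, which gives `audio/mp3` for `.mp3` files, while the registered MIME type is `audio/mpeg`. It also sends the path, directory included, as the filename. A possible alternative, sketched with the standard library only, is shown below. Whether Gladia relies on either value is an assumption I have not verified, and `upload_metadata` is a hypothetical helper name.

```python
import mimetypes
from pathlib import Path


def upload_metadata(file_path: Path) -> tuple:
    """Return (filename, content_type) for a multipart upload field."""
    guessed, _ = mimetypes.guess_type(file_path.name)
    # Fall back to a generic binary type when the extension is unknown
    return file_path.name, guessed or "application/octet-stream"


print(upload_metadata(Path("data/Introducing_ Better Offline.mp3")))
```

Note that for container formats such as `.webm` the guessed type may be a `video/` type, so the original extension-based approach may still be preferable for those files.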
async def a_create_transcription_job(
    session: aiohttp.client.ClientSession,
    audio_url: str,
    diarization: bool = False,
    enable_code_switching: bool = False,
    custom_metadata: Optional[dict] = None,
    **kwargs
) -> dict:
    """Initiate the transcription job
    """
    json_data = {
        "audio_url": audio_url,
        "diarization": diarization,
        "enable_code_switching": enable_code_switching,
        "custom_metadata": custom_metadata
    }
    # Any extra keyword arguments are passed through in the request body
    for key in kwargs.keys():
        json_data[key] = kwargs[key]

    print("- Sending transcription request to Gladia API...")
    with catchtime() as t:
        json_response = await async_make_request(
            session, "https://api.gladia.io/v2/transcription/",
            headers=headers, method="POST", json=json_data
        )
    print(f'Create Transcription Job: {t.seconds} seconds for `{audio_url}`')
    return json_response
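The request body can also be built in a separate, pure function, which makes it easy to test without a network call. The sketch below mirrors the fields used above and additionally drops unset values; `build_transcription_payload` and the `some_option` key in the example call are hypothetical names, not Gladia parameters.

```python
from typing import Any, Optional


def build_transcription_payload(
    audio_url: str,
    diarization: bool = False,
    enable_code_switching: bool = False,
    custom_metadata: Optional[dict] = None,
    **extra: Any,
) -> dict:
    """Build the JSON body for a transcription request."""
    payload = {
        "audio_url": audio_url,
        "diarization": diarization,
        "enable_code_switching": enable_code_switching,
        "custom_metadata": custom_metadata,
    }
    payload.update(extra)  # extra options override the defaults above
    # Drop unset (None) values so they are not sent as JSON null
    return {k: v for k, v in payload.items() if v is not None}


# `some_option` is a made-up key, used only to show the pass-through
print(build_transcription_payload("https://example.com/audio", some_option=True))
```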
async def a_wait_until_job_done(
    session: aiohttp.client.ClientSession,
    transcription_job: dict
):
    """Wait until the transcription job is done
    """
    result_url = transcription_job.get("result_url")
    job_id = transcription_job["id"]
    while True:
        poll_response = await async_make_request(
            session, url=result_url, headers=headers
        )
        if poll_response.get("status") == "done":
            print(f"- Transcription done. - id: ...{job_id[-5:]}")
            break
        elif poll_response.get("status") == "error":
            print(f"- Transcription failed. id: ...{job_id[-5:]}")
            print(poll_response)
            # Stop polling: a failed job will not change status
            break
        else:
            print(f"Transcription status: {poll_response.get('status')} - id: ...{job_id[-5:]}")
        await asyncio.sleep(4)
    return poll_response
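A polling loop like this one can run indefinitely if a job never reaches a final state. One way to bound it is to cap the number of attempts. The sketch below shows the idea with a fake status source instead of the real API; `poll_until_done`, `fetch_status` and the default limits are my own assumptions.

```python
import asyncio
from typing import Awaitable, Callable


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[dict]],
    interval: float = 4.0,
    max_attempts: int = 150,
) -> dict:
    """Poll `fetch_status` until it reports "done" or "error", or give up."""
    for _ in range(max_attempts):
        response = await fetch_status()
        if response.get("status") in ("done", "error"):
            return response
        await asyncio.sleep(interval)
    raise TimeoutError(f"Job still pending after {max_attempts} attempts")


async def demo() -> dict:
    # Fake status sequence standing in for successive API responses
    statuses = iter(["queued", "processing", "done"])

    async def fake_fetch() -> dict:
        return {"status": next(statuses)}

    return await poll_until_done(fake_fetch, interval=0.01)


print(asyncio.run(demo()))
```

In the real function, `fetch_status` would wrap the `async_make_request` call on `result_url`.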
Headers and example files
headers = {
    "accept": "application/json",
    "x-gladia-key": x_gladia_key,
}

files_path = Path("./data")
files_to_upload = [f for f in files_path.iterdir()]
files_to_upload
[PosixPath('data/Introducción Master Class.webm'),
PosixPath('data/Introducing_ Better Offline.mp3'),
PosixPath('data/You need to classify documents before trying to extract data.webm')]
asyncio.gather vs asyncio.as_completed
As we saw, the process to transcribe audio files has the following steps:
- Upload audio files
- Initiate the transcription job
- Await completion of the transcription
asyncio.gather
This function orchestrates tasks by leveraging asyncio.gather():
async def async_function_orchestrator(func: 'function', tasks_param: list):
    """Gather results from an async function and a list of parameters
    """
    async with aiohttp.ClientSession() as session:
        tasks = [
            func(session, p) for p in tasks_param
        ]
        results = await asyncio.gather(*tasks)
    return results
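Two properties of `asyncio.gather()` are worth keeping in mind here: results come back in the order the awaitables were passed in, not in completion order, and with `return_exceptions=True` a failing task returns its exception in the list instead of raising it. The sketch below illustrates both with a hypothetical `work` coroutine.

```python
import asyncio


async def work(value: int, delay: float) -> int:
    """Hypothetical task: waits, then doubles the value or fails on negatives."""
    await asyncio.sleep(delay)
    if value < 0:
        raise ValueError(f"bad value: {value}")
    return value * 2


async def main() -> list:
    # return_exceptions=True places exceptions in the result list
    # instead of raising the first one
    return await asyncio.gather(
        work(1, 0.2), work(-1, 0.1), work(3, 0.05),
        return_exceptions=True,
    )


results = asyncio.run(main())
for r in results:
    print(repr(r))
```

The orchestrator above uses the default behaviour, so a single failed request would raise out of `asyncio.gather()`.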
Uploading files asynchronously
The first step in the transcription journey involves uploading audio files to Gladia. With asyncio we can upload multiple files simultaneously. With asyncio.gather() we can initiate several upload tasks concurrently, allowing our script to move forward without having to wait for each file to finish uploading:
with catchtime() as t:
    upload_results = asyncio.run(
        async_function_orchestrator(func=a_upload_file, tasks_param=files_to_upload)
    )
print(f'Total Time: {t.seconds} seconds')
- Uploading file to Gladia...
- Uploading file to Gladia...
- Uploading file to Gladia...
- Request successful
Upload Time: 4.77 seconds for `Introducción Master Class.webm`
- Request successful
Upload Time: 6.14 seconds for `You need to classify documents before trying to extract data.webm`
- Request successful
Upload Time: 6.86 seconds for `Introducing_ Better Offline.mp3`
Total Time: 7.03 seconds
for file in upload_results:
    print(f"audio_url: {file['audio_url']}")
    print(f"filename: {file['audio_metadata']['filename']}")
    print(f"id: {file['audio_metadata']['id']}")
    print()
audio_url: https://api.gladia.io/file/5d7d3d23-2de3-4c78-93c8-9010c6d7b6a7
filename: data%2FIntroducci%C3%B3n%20Master%20Class
id: 5d7d3d23-2de3-4c78-93c8-9010c6d7b6a7
audio_url: https://api.gladia.io/file/dee3b9c4-90d4-4b15-8a94-fbd66f11d6e2
filename: data%2FIntroducing_%20Better%20Offline
id: dee3b9c4-90d4-4b15-8a94-fbd66f11d6e2
audio_url: https://api.gladia.io/file/1cf50050-a7c4-4a5b-b99c-f6face16e942
filename: data%2FYou%20need%20to%20classify%20documents%20before%20trying%20to%20extract%20data
id: 1cf50050-a7c4-4a5b-b99c-f6face16e942
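The filenames in this output are percent-encoded. If a readable name is needed, the standard library can decode them, as in this short sketch using one of the values above:

```python
from urllib.parse import unquote

encoded = "data%2FIntroducci%C3%B3n%20Master%20Class"
print(unquote(encoded))  # data/Introducción Master Class
```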
Asynchronously requesting transcriptions
Once files are uploaded, the next step is to request transcriptions. Similar to the upload process, asyncio.gather() enables us to send out transcription requests for all uploaded files in parallel. This ensures that we’re efficiently moving through our workload without unnecessary delays between requests:
audio_urls = [result["audio_url"] for result in upload_results]

with catchtime() as t:
    transcription_job_results = asyncio.run(
        async_function_orchestrator(a_create_transcription_job, audio_urls)
    )
print(f'Total Time: {t.seconds} seconds')
- Sending transcription request to Gladia API...
- Sending transcription request to Gladia API...
- Sending transcription request to Gladia API...
- Request successful
Create Transcription Job: 1.2 seconds for `https://api.gladia.io/file/dee3b9c4-90d4-4b15-8a94-fbd66f11d6e2`
- Request successful
Create Transcription Job: 1.24 seconds for `https://api.gladia.io/file/1cf50050-a7c4-4a5b-b99c-f6face16e942`
- Request successful
Create Transcription Job: 1.25 seconds for `https://api.gladia.io/file/5d7d3d23-2de3-4c78-93c8-9010c6d7b6a7`
Total Time: 1.25 seconds
transcription_job_results
[{'id': '517ca2e0-7830-4803-a66c-4cb2cb259fd5',
'result_url': 'https://api.gladia.io/v2/transcription/517ca2e0-7830-4803-a66c-4cb2cb259fd5'},
{'id': '8a247329-c586-4685-914d-06e3e204f581',
'result_url': 'https://api.gladia.io/v2/transcription/8a247329-c586-4685-914d-06e3e204f581'},
{'id': '9c7d8255-a7c3-465e-a0f5-01569ba49f4c',
'result_url': 'https://api.gladia.io/v2/transcription/9c7d8255-a7c3-465e-a0f5-01569ba49f4c'}]
Wait for the transcriptions to be ready
As with the upload and transcription request steps, we wait for the transcriptions in parallel:
with catchtime() as t:
    transcription_results = asyncio.run(
        async_function_orchestrator(a_wait_until_job_done, transcription_job_results)
    )
print(f'Total Time: {t.seconds} seconds')
- Request successful
Transcription status: queued - id: ...59fd5
- Request successful
Transcription status: queued - id: ...49f4c
- Request successful
Transcription status: queued - id: ...4f581
- Request successful
Transcription status: queued - id: ...59fd5
- Request successful
Transcription status: processing - id: ...49f4c
- Request successful
Transcription status: queued - id: ...4f581
- Request successful
Transcription status: processing - id: ...59fd5
- Request successful
Transcription status: processing - id: ...49f4c
- Request successful
Transcription status: processing - id: ...4f581
- Request successful
Transcription status: processing - id: ...59fd5
- Request successful
Transcription status: processing - id: ...4f581
- Request successful
Transcription status: processing - id: ...49f4c
- Request successful
Transcription status: processing - id: ...59fd5
- Request successful
Transcription status: processing - id: ...4f581
- Request successful
- Transcription done. - id: ...49f4c
- Request successful
- Transcription done. - id: ...59fd5
- Request successful
Transcription status: processing - id: ...4f581
- Request successful
- Transcription done. - id: ...4f581
Total Time: 28.89 seconds
for transcription in transcription_results:
    print(transcription["id"])
    print(transcription["file"]["filename"])
    print(transcription["result"]["transcription"]["languages"])
    print(transcription["result"]["transcription"]["full_transcript"][:250])
    print("...")
    print(transcription["result"]["transcription"]["full_transcript"][-250:])
    print()
517ca2e0-7830-4803-a66c-4cb2cb259fd5
data%2FIntroducci%C3%B3n%20Master%20Class
['es']
Música ¿Necesitas tutorías en tus tareas escolares? ¿Asesorías en proyectos académicos y empresariales? Aquí está la solución. Ingresa desde tu PC a www.masterclass.com.ec o descarga la aplicación desde tu móvil masterclass-ec. Después, selecciona la
...
tutorías recibidas, recibe una gratis. Recuerda que nuestra plataforma es inclusiva. Si necesitas que la tutoría vaya acompañada de un intérprete de lengua de señas ecuatoriana, escoge la opción Intérprete. Masterclass. El conocimiento a tu alcance.
8a247329-c586-4685-914d-06e3e204f581
data%2FIntroducing_%20Better%20Offline
['en']
Hi, I'm Ed Zitron, host of the Better Offline podcast on the Cool Zone Media Network. I've been both a tech writer and a tech executive for the last 15 years, and I've seen this industry grow from a bunch of dorks building things in their garage into
...
no bullshit, just a crystal clear window into a world that quietly finds new and innovative ways to make billionaires rich. Listen to Better Offline on the iHeartRadio app, Apple Podcasts, or wherever else you get your podcasts. Thanks for listening.
9c7d8255-a7c3-465e-a0f5-01569ba49f4c
data%2FYou%20need%20to%20classify%20documents%20before%20trying%20to%20extract%20data
['en']
Today I've been talking to a bunch of people on doing document extraction. And in particular, I think a lot of people who are coming into this world with that much machine learning experience kind of think that AGI is here and they think that Jupyter
...
y valuable. You might have to be in a world where you pay humans to do this relabeling. Because we have before, if you're wrong in your pre-work, it's very easy to not lose all that effort. And you can just rebuild a lot of these indices very easily.
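The loop above indexes directly into the response, which raises a `KeyError` or `TypeError` if a job failed and some keys are missing. A more defensive variant could look like the sketch below. It only uses the keys seen in this output; `summarize_transcription` is a hypothetical helper, and the sample dictionary is made up.

```python
def summarize_transcription(transcription: dict, width: int = 250) -> dict:
    """Extract the fields printed above, tolerating missing keys."""
    result = transcription.get("result") or {}
    body = result.get("transcription") or {}
    text = body.get("full_transcript") or ""
    return {
        "id": transcription.get("id"),
        "filename": (transcription.get("file") or {}).get("filename"),
        "languages": body.get("languages", []),
        "head": text[:width],
        # Only show a tail when the text is longer than the head
        "tail": text[-width:] if len(text) > width else "",
    }


sample = {
    "id": "made-up-id",
    "file": {"filename": "example"},
    "result": {"transcription": {"languages": ["en"], "full_transcript": "Hello world"}},
}
print(summarize_transcription(sample))
print(summarize_transcription({"id": "failed-job", "status": "error"}))
```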
asyncio.as_completed
Finally, instead of waiting for every file to finish one step before starting the next, we can chain the three steps for each file and run those chains concurrently. Using asyncio.as_completed() then allows us to process each end result as soon as its transcription finishes.
async def a_upload_and_process(
    session: aiohttp.client.ClientSession,
    file_path: Path
) -> dict:
    """Upload and process the file
    """
    # Upload the file
    uploaded = await a_upload_file(session, file_path)

    audio_url = uploaded["audio_url"]

    # Start the transcription
    transcription_job_result = await a_create_transcription_job(session, audio_url)

    # Wait for the transcription to complete
    transcription_result = await a_wait_until_job_done(session, transcription_job_result)

    return transcription_result
async def async_tasks_orchestrator(files_to_upload: List[Path]) -> None:
    """Process transcriptions as they complete
    """
    async with aiohttp.ClientSession() as session:
        transcription_tasks = [
            a_upload_and_process(session, file) for file in files_to_upload
        ]
        for transcription_task in asyncio.as_completed(transcription_tasks):
            transcription = await transcription_task
            process_transcription(transcription)
            #yield transcription
def process_transcription(transcription: dict) -> None:
    print(f"<<<<<Transcription with id: {transcription['id']} Done>>>>>")
    print(transcription["file"]["filename"])
    print(transcription["result"]["transcription"]["languages"])
    print(transcription["result"]["transcription"]["full_transcript"][:250])
    print("...")
    print(transcription["result"]["transcription"]["full_transcript"][-250:])
    print("<<<<</Transcription Done>>>>>")
with catchtime() as t:
    transcription_job_results = asyncio.run(
        async_tasks_orchestrator(files_to_upload)
    )
print(f'Total Time: {t.seconds} seconds')
- Uploading file to Gladia...
- Uploading file to Gladia...
- Uploading file to Gladia...
- Request successful
Upload Time: 3.56 seconds for `Introducción Master Class.webm`
- Sending transcription request to Gladia API...
- Request successful
Create Transcription Job: 0.35 seconds for `https://api.gladia.io/file/799bac2b-8217-49a1-a67c-c53966fb9b60`
- Request successful
Transcription status: queued - id: ...1fba0
- Request successful
Upload Time: 4.28 seconds for `Introducing_ Better Offline.mp3`
- Sending transcription request to Gladia API...
- Request successful
Upload Time: 4.4 seconds for `You need to classify documents before trying to extract data.webm`
- Sending transcription request to Gladia API...
- Request successful
Create Transcription Job: 0.47 seconds for `https://api.gladia.io/file/dd94ecad-ec21-40e7-97fd-ecd063afd686`
- Request successful
Create Transcription Job: 0.43 seconds for `https://api.gladia.io/file/47ae9c16-665c-46f8-8379-0ae38f81eb01`
- Request successful
Transcription status: queued - id: ...c17fb
- Request successful
Transcription status: queued - id: ...89830
- Request successful
Transcription status: processing - id: ...1fba0
- Request successful
Transcription status: queued - id: ...c17fb
- Request successful
Transcription status: queued - id: ...89830
- Request successful
Transcription status: processing - id: ...1fba0
- Request successful
Transcription status: queued - id: ...c17fb
- Request successful
Transcription status: queued - id: ...89830
- Request successful
- Transcription done. - id: ...1fba0
<<<<<Transcription with id: 145b8844-c360-406f-8bb9-2de82661fba0 Done>>>>>
data%2FIntroducci%C3%B3n%20Master%20Class
['es']
Música ¿Necesitas tutorías en tus tareas escolares? ¿Asesorías en proyectos académicos y empresariales? Aquí está la solución. Ingresa desde tu PC a www.masterclass.com.ec o descarga la aplicación desde tu móvil masterclass-ec. Después, selecciona la
...
tutorías recibidas, recibe una gratis. Recuerda que nuestra plataforma es inclusiva. Si necesitas que la tutoría vaya acompañada de un intérprete de lengua de señas ecuatoriana, escoge la opción Intérprete. Masterclass. El conocimiento a tu alcance.
<<<<</Transcription Done>>>>>
- Request successful
Transcription status: queued - id: ...c17fb
- Request successful
Transcription status: processing - id: ...89830
- Request successful
Transcription status: queued - id: ...89830
- Request successful
Transcription status: processing - id: ...c17fb
- Request successful
Transcription status: processing - id: ...89830
- Request successful
Transcription status: processing - id: ...c17fb
- Request successful
Transcription status: processing - id: ...89830
- Request successful
Transcription status: processing - id: ...c17fb
- Request successful
Transcription status: processing - id: ...89830
- Request successful
Transcription status: processing - id: ...c17fb
- Request successful
Transcription status: processing - id: ...89830
- Request successful
- Transcription done. - id: ...c17fb
<<<<<Transcription with id: 8f880055-b56f-4297-8390-5d7668ec17fb Done>>>>>
data%2FIntroducing_%20Better%20Offline
['en']
Hi, I'm Ed Zitron, host of the Better Offline podcast on the Cool Zone Media Network. I've been both a tech writer and a tech executive for the last 15 years, and I've seen this industry grow from a bunch of dorks building things in their garage into
...
no bullshit, just a crystal clear window into a world that quietly finds new and innovative ways to make billionaires rich. Listen to Better Offline on the iHeartRadio app, Apple Podcasts, or wherever else you get your podcasts. Thanks for listening.
<<<<</Transcription Done>>>>>
- Request successful
- Transcription done. - id: ...89830
<<<<<Transcription with id: 9b13aff2-aa31-4d90-8e9d-dfab9ff89830 Done>>>>>
data%2FYou%20need%20to%20classify%20documents%20before%20trying%20to%20extract%20data
['en']
Today I've been talking to a bunch of people on doing document extraction. And in particular, I think a lot of people who are coming into this world with that much machine learning experience kind of think that AGI is here and they think that Jupyter
...
y valuable. You might have to be in a world where you pay humans to do this relabeling. Because we have before, if you're wrong in your pre-work, it's very easy to not lose all that effort. And you can just rebuild a lot of these indices very easily.
<<<<</Transcription Done>>>>>
Total Time: 44.65 seconds
Processing times depend on Gladia’s response time. In this example we cannot directly compare asyncio.gather() with asyncio.as_completed() without taking into account the time Gladia takes to complete each transcription.
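Because the real timings depend on the service, a simulated run can make the difference between the two functions easier to see. The sketch below uses made-up delays and a hypothetical `job` coroutine in place of the upload, request and wait chain. Both approaches take about as long overall; what changes is the order in which results arrive and how early the first one can be handled.

```python
import asyncio
from time import perf_counter

DELAYS = {"a": 0.3, "b": 0.1, "c": 0.2}  # made-up processing times


async def job(name: str) -> str:
    """Hypothetical stand-in for one file's full processing chain."""
    await asyncio.sleep(DELAYS[name])
    return name


async def with_gather() -> list:
    # Results are returned together, in submission order
    return await asyncio.gather(*(job(n) for n in DELAYS))


async def with_as_completed() -> list:
    # Results are handled one by one, in completion order
    order = []
    for finished in asyncio.as_completed([job(n) for n in DELAYS]):
        order.append(await finished)
    return order


for runner in (with_gather, with_as_completed):
    start = perf_counter()
    names = asyncio.run(runner())
    print(runner.__name__, names, f"{perf_counter() - start:.2f}s")
```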
Conclusion
Integrating the Gladia transcription service with Python’s asyncio is an effective way to manage audio processing tasks. Whether we use asyncio.gather() for parallel uploads, requests and waits, or asyncio.as_completed() for immediate processing of each file as soon as its transcription is ready, the I/O-bound steps overlap instead of running one after another, which improves the speed and responsiveness of the process.