OpenAI Whisper Deployment on AWS as Asynchronous Endpoint

Published at: 5/31/2023
Categories: aws, openai, whisper, ai
Author: makawtharani

This project deploys the OpenAI Whisper ASR model on SageMaker and handles the inference results using SageMaker, S3, SNS, DynamoDB, and Lambda.

Description

The CDK script sets up the AWS infrastructure needed to automatically deploy a Whisper ASR model behind an Amazon SageMaker asynchronous endpoint. It also sets up resources to store, manage, and process model inference results.

Resources

  1. S3 Bucket: An S3 bucket is created to store the model and inference results. The bucket is automatically deleted when the stack is deleted.
  2. IAM Role: An IAM Role is set up with appropriate permissions for SageMaker to access the S3 bucket, CloudWatch Logs, and to pull images from the ECR repository.
  3. SageMaker Model: The script defines a SageMaker model with a container definition and sets up an endpoint configuration for asynchronous inference.
  4. SNS Topics: Two SNS topics are created for success and error notifications.
  5. DynamoDB Table: A DynamoDB table is created to store job results.
  6. Lambda Function: A Lambda function is created to process job results and store them in the DynamoDB table. The function uses the SNS topics as event sources.

Output

When deployment finishes, the script outputs the names of the created S3 bucket and the deployed SageMaker endpoint.

Functionality

When a job result is produced by the SageMaker model, it is sent to the appropriate SNS topic depending on whether the job was successful or not.
The SNS topic triggers the Lambda function, which processes the result and stores it in the DynamoDB table.
The following diagram shows the architecture and workflow of the solution.
[Diagram: Architecture]
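
The notification flow above could be sketched as a Lambda handler like the one below. This is a minimal sketch, not the repository's actual function: the TABLE_NAME environment variable, the item attribute names, and the optional table argument (used for local testing) are hypothetical, and the notification fields read here (inferenceId, invocationStatus, requestParameters, responseParameters, failureReason) are assumed from the SageMaker asynchronous inference notification format.

```python
import json
from typing import Any, Dict


def parse_notification(message: str) -> Dict[str, Any]:
    """Turn one SNS message body into a flat item for the results table."""
    body = json.loads(message)
    item = {
        "inference_id": body.get("inferenceId", "unknown"),
        "status": body.get("invocationStatus", "Unknown"),
        "input_location": body.get("requestParameters", {}).get("inputLocation"),
        "output_location": body.get("responseParameters", {}).get("outputLocation"),
        "failure_reason": body.get("failureReason"),
    }
    # Drop absent fields so success and error items only carry what applies.
    return {key: value for key, value in item.items() if value is not None}


def handler(event: Dict[str, Any], context: Any, table: Any = None) -> Dict[str, int]:
    """Lambda entry point: store every SNS record of the event in DynamoDB."""
    if table is None:
        # Imported lazily so the parsing logic can be tested without AWS access.
        import os
        import boto3

        # TABLE_NAME is a hypothetical environment variable set by the stack.
        table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])

    items = [parse_notification(record["Sns"]["Message"])
             for record in event.get("Records", [])]
    for item in items:
        table.put_item(Item=item)
    return {"stored": len(items)}
```

Because both SNS topics feed the same function, the status attribute is what distinguishes successful jobs from failed ones in the table.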

Asynchronous inference

Amazon SageMaker Asynchronous Inference is a SageMaker capability that queues incoming requests and processes them asynchronously.
This option is ideal for requests with large payload sizes (up to 1GB), long processing times (up to one hour), and near real-time latency requirements.
Asynchronous Inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests.
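
Scaling to zero is configured through Application Auto Scaling rather than on the endpoint itself. The sketch below shows one way it might be set up with boto3; the variant name AllTraffic, the maximum capacity, the policy name, and the target backlog value are illustrative assumptions, not values taken from this project.

```python
from typing import Any, Dict


def _resource_id(endpoint_name: str, variant_name: str) -> str:
    """Resource identifier format Application Auto Scaling uses for a variant."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scalable_target_args(endpoint_name: str, variant_name: str = "AllTraffic",
                         max_capacity: int = 2) -> Dict[str, Any]:
    """Arguments for register_scalable_target; MinCapacity=0 allows scale-in to zero."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,
        "MaxCapacity": max_capacity,  # assumed upper bound
    }


def backlog_policy_args(endpoint_name: str, variant_name: str = "AllTraffic",
                        target_backlog: float = 5.0) -> Dict[str, Any]:
    """Arguments for put_scaling_policy, tracking the queue backlog per instance."""
    return {
        "PolicyName": f"{endpoint_name}-backlog-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": _resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_backlog,  # assumed target
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateBacklogSizePerInstance",
                "Namespace": "AWS/SageMaker",
                "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
                "Statistic": "Average",
            },
        },
    }


def enable_scale_to_zero(endpoint_name: str, client: Any = None) -> None:
    """Register the variant as a scalable target and attach the backlog policy."""
    if client is None:
        import boto3  # imported lazily; only needed for real AWS calls

        client = boto3.client("application-autoscaling")
    client.register_scalable_target(**scalable_target_args(endpoint_name))
    client.put_scaling_policy(**backlog_policy_args(endpoint_name))
```

The AWS documentation also describes an additional step-scaling policy on the HasBacklogWithoutCapacity metric for scaling out from zero instances; it is left out here to keep the sketch short.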

How it works

Creating an asynchronous inference endpoint is similar to creating a real-time inference endpoint. You can use your existing SageMaker models and only need to add the AsyncInferenceConfig object to the endpoint configuration you create with the CreateEndpointConfig API. The following diagram shows the architecture and workflow of Asynchronous Inference.
[Diagram: Architecture]
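
To make the difference from a real-time endpoint concrete, here is a sketch of the request an asynchronous endpoint configuration might use with boto3's create_endpoint_config. The instance type, variant name, and concurrency setting are assumptions chosen for illustration; the project's CDK script may use different values.

```python
from typing import Any, Dict


def async_endpoint_config(config_name: str, model_name: str, output_s3_uri: str,
                          success_topic_arn: str, error_topic_arn: str,
                          instance_type: str = "ml.g4dn.xlarge") -> Dict[str, Any]:
    """Build the keyword arguments for SageMaker's CreateEndpointConfig API."""
    return {
        "EndpointConfigName": config_name,
        # Same shape as for a real-time endpoint.
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,  # assumed instance type
            "InitialInstanceCount": 1,
        }],
        # The only addition needed to make the endpoint asynchronous.
        "AsyncInferenceConfig": {
            "OutputConfig": {
                "S3OutputPath": output_s3_uri,
                "NotificationConfig": {
                    "SuccessTopic": success_topic_arn,
                    "ErrorTopic": error_topic_arn,
                },
            },
            # Assumed: one transcription at a time per instance.
            "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 1},
        },
    }


def create_async_endpoint_config(client: Any, **kwargs: str) -> Any:
    """Send the configuration using a boto3 'sagemaker' client passed in by the caller."""
    return client.create_endpoint_config(**async_endpoint_config(**kwargs))
```

Removing the AsyncInferenceConfig key from this dictionary yields an ordinary real-time endpoint configuration.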
To invoke the endpoint, you need to place the request payload in Amazon S3 and provide a pointer to this payload as a part of the InvokeEndpointAsync request.
Upon invocation, SageMaker queues the request for processing and returns an identifier and output location as a response.
Upon processing, SageMaker places the result in the Amazon S3 output location. Success or error notifications are delivered through Amazon SNS.
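
The invocation steps above could look roughly like this with boto3. The function names and the JSON content type are assumptions for illustration; the runtime client is passed in by the caller, for example boto3.client("sagemaker-runtime").

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlparse


def submit_transcription(runtime_client: Any, endpoint_name: str, input_s3_uri: str,
                         content_type: str = "application/json") -> Dict[str, str]:
    """Queue one request; the payload must already be in S3 at input_s3_uri."""
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,
        ContentType=content_type,  # assumed payload type
    )
    # SageMaker answers immediately with an identifier and the place where
    # the result will appear once processing is done.
    return {
        "inference_id": response["InferenceId"],
        "output_location": response["OutputLocation"],
    }


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key), e.g. to fetch the result later."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


# Example call (endpoint and bucket names are hypothetical):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   job = submit_transcription(runtime, "whisper-endpoint",
#                              "s3://my-bucket/requests/job-1.json")
#   bucket, key = split_s3_uri(job["output_location"])
```

The call returns before any transcription has happened, so the caller either waits for the SNS notification or checks the output location until the object exists.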

Additional Material

The ECR Image for the Whisper Model

It is a Flask-based Python server that uses the OpenAI Whisper ASR system to transcribe audio files. The server listens for HTTP POST requests containing information about an audio file, downloads the file from an Amazon S3 bucket, transcribes it, and then uploads the transcription back to the S3 bucket.
For more information, see app/asr_server.py and app/Dockerfile.
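
The download, transcribe, and upload flow of that server could be sketched as a plain function with the three steps injected, which keeps the logic independent of Flask, Whisper, and S3. This is not the code in asr_server.py: the payload fields bucket and key, the .txt output naming, and the function names are all hypothetical.

```python
import posixpath
from typing import Callable, Dict


def handle_invocation(payload: Dict[str, str],
                      download: Callable[[str, str], str],
                      transcribe: Callable[[str], str],
                      upload: Callable[[str, str, str], None]) -> Dict[str, str]:
    """Process one POST body describing an audio file stored in S3.

    download(bucket, key) returns a local file path,
    transcribe(path) returns the transcript text,
    upload(bucket, key, text) writes the transcript back to S3.
    """
    bucket = payload["bucket"]  # hypothetical field name
    key = payload["key"]        # hypothetical field name

    local_path = download(bucket, key)
    text = transcribe(local_path)

    # Store the transcript next to the audio file, with a .txt extension.
    stem, _extension = posixpath.splitext(key)
    transcript_key = stem + ".txt"
    upload(bucket, transcript_key, text)

    return {"bucket": bucket, "transcript_key": transcript_key}


# In a Flask app, a POST route would parse the JSON body and pass it to
# handle_invocation together with real implementations, for example an S3
# download, a Whisper model's transcription call, and an S3 upload.
```

Injecting the three steps also makes the flow easy to exercise locally with stand-in functions before building the container image.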

Script to invoke the endpoint

The script invokes an Amazon SageMaker endpoint for asynchronous inference. The asynchronous inference is performed on the input audio file that resides in an Amazon S3 bucket, and the transcription is returned.
Before running the script, you need to have AWS CLI set up on your local machine or server and your AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN) properly configured.
For more information, see whisper-invoke/whisper.py.

GitHub Repo
