Using Node.js Buffers to transcribe an audio file with OpenAI's Whisper service

Published at
8/1/2023
Categories
openai
whisper
node
Author
hectorleiva
I was having a difficult time attempting to do the following:

  • Retrieve an audio file from AWS S3
  • Send the audio file, held in memory, to OpenAI's Whisper transcription service, get the transcribed text back, and save the result.

I've seen some examples where users tried to send a ReadStream instead, but that didn't work for me.

So after a Sunday's worth of effort figuring out how to send this data, I did the following (using this specific version of OpenAI's Node.js API: 4.0.0-beta7):

s3.service.ts

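import { GetObjectCommand, S3Client } from "@aws-sdk/client-s3";
import { Readable } from "stream";

// Assumption: a module-level client stands in for the original `this.awsS3`.
// Region and credentials are read from the environment.
const awsS3 = new S3Client({});

// Download an object from S3 and collect its body into a single Buffer.
export async function returnBufferFromS3(params: { key: string }): Promise<Buffer> {
  const data = await awsS3.send(new GetObjectCommand({
    Bucket: 'some-bucket',
    Key: params.key
  }));

  // In Node.js the response body is a Readable stream.
  const stream = data.Body as Readable;

  return new Promise<Buffer>((resolve, reject) => {
    const chunks: Buffer[] = [];
    stream.on('data', (chunk) => chunks.push(chunk));
    stream.on('error', (error) => reject(error));
    stream.on('end', () => resolve(Buffer.concat(chunks)));
  });
}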
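The manual `data` / `error` / `end` wiring can also be written with async iteration, since Node.js Readable streams are async iterable. The following is only a sketch of that idea: `streamToBuffer` is a hypothetical helper name, and `Readable.from` is used here to stand in for the S3 response body so the snippet runs without AWS.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper: collect every chunk of a readable stream into one Buffer.
async function streamToBuffer(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Buffer.from accepts both strings and byte arrays, so either chunk kind works.
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Stand-in for the S3 body: a stream that emits two chunks.
const demoStream = Readable.from([Buffer.from("hello "), Buffer.from("whisper")]);

streamToBuffer(demoStream).then((buffer) => {
  console.log(buffer.toString("utf8")); // prints "hello whisper"
});
```

A stream error surfaces as a rejected promise here, which matches the `reject(error)` behaviour of the event-based version.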

openAITranscription.ts

import OpenAI from "openai";
import { isUploadable } from "openai/uploads";
import { returnBufferFromS3 } from "s3.service";
// ...

const openAI = new OpenAI({
  apiKey: 'some-api-key',
  organization: 'some-org-id'
});

async function returnTranscriptionsBasedOnKey(key: string) {
  const buffer = await returnBufferFromS3({ key });
  // Wrap the Buffer in a Blob; `type` should be a MIME type, not a file name.
  const blob = new Blob([buffer], { type: 'audio/mp4' });
  // Add the `name` and `lastModified` fields that make the Blob file-like.
  const fileLikeBlob = Object.assign(blob, {
    name: 'test.mp4',
    lastModified: new Date().getTime()
  });

  if (!isUploadable(fileLikeBlob)) {
    throw new Error(`Unable to upload file to OpenAI because it is not uploadable: ${fileLikeBlob.name}`);
  }

  const openAIResponse = await openAI.audio.transcriptions.create({
    file: fileLikeBlob,
    model: 'whisper-1'
  });

  return openAIResponse;
}

This seems like an incredibly hacky way of completing this task (Buffer -> Blob -> "FileLikeBlob").

I'm not proud of it, but I didn't enjoy the time I spent trying to make this work either.

If anyone in the comments has a better way to accomplish this, I welcome any critique that improves it and makes this information more readily available to other users.
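One possible simplification of the Buffer -> Blob -> "FileLikeBlob" chain is to build a `File` directly, since a `File` already carries `name` and `lastModified`. This is a sketch under assumptions: it relies on the `File` class exported by `node:buffer` (available in recent Node.js releases), `bufferToFile` is a hypothetical helper name, and whether the resulting object passes `isUploadable` in 4.0.0-beta7 has not been checked here.

```typescript
import { Buffer, File } from "node:buffer";

// Hypothetical helper: wrap an in-memory Buffer in a File so that it
// has a name, a MIME type, and a lastModified timestamp in one step.
function bufferToFile(buffer: Buffer, fileName: string, mimeType: string): File {
  return new File([buffer], fileName, { type: mimeType });
}

// Placeholder bytes stand in for the audio downloaded from S3.
const audioFile = bufferToFile(Buffer.from("fake audio bytes"), "test.mp4", "audio/mp4");

console.log(audioFile.name, audioFile.type, audioFile.size);
```

Recent releases of the `openai` package also document a `toFile` helper for turning a Buffer into an uploadable file; whether it is available in the beta version used above is worth checking against the installed package.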
