Boost your app accessibility with AI Speech Recognition (Deepgram)!

Published at
4/10/2022
Categories
hackwithdg
react
showdev
javascript
Author
Paweł Ciosek
Overview of My Submission

The goal is to provide an extra way to enter a value. It could be really helpful for people with disabilities, or anyone who has problems typing on a keyboard. You can fill any input using a pointer and your voice! Cool!

Submission Category:

Accessibility

Link to Code on GitHub

pavelee / react-deepgram-example

DEV hackathon project, usage of Deepgram AI Speech Recognition, boost your app accessibility

Boost your react app accessibility with AI Speech Recognition (Deepgram)!

What's that?

It's an example of integrating Deepgram with React.

Deepgram?

Deepgram is an external service that transcribes speech from audio! (using AI, crazy stuff!)

Read more here: https://deepgram.com

Purpose

The purpose is to use speech transcription to improve a React app's accessibility. We provide an extra way to enter a value!

  • Help people with disabilities provide input
  • Speed up form filling
  • Share the experience across any device
    • any device supporting a modern browser
    • react-native (mobile, TV, desktop) as well!

Why?

The project was made as a submission to the DEV hackathon, read more here

Post here: post

How does it work?

The project is built from two parts: deepgram-proxy and deepgram-react

deepgram-proxy

We need some backend to upload audio file…

Additional Resources / Info

I am using React for the frontend app.

It's important to remember that a React frontend is not required. You can integrate the proxy with any other type of app. It's just a REST API!

The process is pretty simple.

  • Display the Deepgram component
  • Ask for microphone permission (browser)
  • Record your voice
  • Send the audio to the proxy
  • The proxy asks Deepgram for a transcription
  • The proxy responds with a transcription, an error, or a warning that there is no transcription (e.g. the user needs to repeat louder)
  • The user receives the transcription and applies it to the input
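The browser side of the steps above could look roughly like the sketch below. It is not the code from the repository: the function names (recordAudio, requestTranscript, applyTranscript), the fixed recording duration and the "recording.webm" file name are assumptions made for illustration. Only the "file" and "language" field names and the transcript field of the reply come from the proxy endpoint shown in this post.

```javascript
// Ask for microphone permission and record for a fixed duration (browser only).
async function recordAudio(durationMs) {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const recorder = new MediaRecorder(stream);
    const chunks = [];
    recorder.ondataavailable = (event) => chunks.push(event.data);
    const stopped = new Promise((resolve) => {
        recorder.onstop = resolve;
    });
    recorder.start();
    setTimeout(() => recorder.stop(), durationMs);
    await stopped;
    // Release the microphone once recording is done.
    stream.getTracks().forEach((track) => track.stop());
    return new Blob(chunks, { type: recorder.mimeType });
}

// Send the audio to the proxy as multipart form data and return the parsed reply.
// fetchImpl can be swapped out, for example in tests.
async function requestTranscript(proxyUploadUrl, audioBlob, language, fetchImpl = fetch) {
    const form = new FormData();
    form.append("file", audioBlob, "recording.webm"); // file name is an assumption
    form.append("language", language);
    const response = await fetchImpl(proxyUploadUrl, { method: "POST", body: form });
    if (!response.ok) {
        throw new Error(`Proxy returned HTTP ${response.status}`);
    }
    return response.json();
}

// Apply the transcript to the input through the setter.
// Returns false when nothing was recognized, so the caller can ask the user to repeat.
function applyTranscript(result, setValue) {
    const transcript = result && typeof result.transcript === "string" ? result.transcript.trim() : "";
    if (transcript === "") {
        return false;
    }
    setValue(transcript);
    return true;
}
```

A caller would chain the three functions: record, send the blob to the proxy URL, then pass the reply and the input's setter to applyTranscript, showing a "please repeat" hint when it returns false.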

process gif

The proxy from a technical perspective.

It's Node.js + Express.js to handle API requests and communicate with Deepgram (using the Deepgram SDK)

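// endpoint to upload an audio file and transcribe it
// (upload.single("file") is middleware that stores the uploaded "file" field and exposes it as req.file)
app.post("/audiotranscript", upload.single("file"), async (req, res) => {
    const filepath = req.file.path; // path of the uploaded audio file
    const language = req.body.language; // language sent by the client
    const transcript = await deepgramTranscript(deepgramApiKey, filepath, language);
    res.send({ transcript: transcript });
});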
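The post does not show how the proxy tells a real transcription apart from "no transcription". As a rough sketch, assuming the proxy receives Deepgram's pre-recorded JSON response (where the text sits under results.channels[0].alternatives[0].transcript), a helper could build the reply like this. The helper name toProxyReply, the warning field and its message are hypothetical, not taken from the repository.

```javascript
// Hypothetical helper: turn a Deepgram pre-recorded response into the proxy's reply.
function toProxyReply(deepgramResponse) {
    // Take the first alternative of the first channel, if present.
    const alternative = deepgramResponse?.results?.channels?.[0]?.alternatives?.[0];
    const transcript = (alternative?.transcript ?? "").trim();
    if (transcript === "") {
        // Nothing was recognized: reply with a warning so the client can ask the user to repeat.
        return { transcript: "", warning: "No speech recognized, please repeat louder." };
    }
    return { transcript };
}
```

Keeping this step in a small pure function makes the empty-transcript case easy to check without calling Deepgram.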

The frontend from a technical perspective.

It's a simple handler: you pass it the proxy URL and a setter for your value.

I prepared two usage examples. I am using Ant Design (antd) as the component library.

Using a wrapper that adds a popover to any React component.

<DeepgramHandlerPopover
    setValue={setNotepadValue}
    proxyUploadUrl={proxyUploadUrl}
>
    <Input.TextArea
        rows={20}
        value={value}
        onChange={(ev) => {
            setValue(ev.target.value);
        }}
    />
</DeepgramHandlerPopover>

popover

Using a modal (small-device friendly)

<DeepgramHandlerModalButton
    setValue={setNotepadValue}
    proxyUploadUrl={proxyUploadUrl}
    buttonProps={{
        type: "primary",
    }}
/>

modal

If you have another idea, you can easily wrap the handler yourself:

<MyCoolComponent>
    <DeepgramHandler
        setValue={setValue}
        proxyUploadUrl={proxyUploadUrl}
    />
</MyCoolComponent>

Final thoughts

I really appreciate what Deepgram has created. It works really well, even with my unclear English. It's amazing how the tech industry is improving our lives! <3
