HarmonyOS Next Audio Collection in Audio and Video Practice
Background
In the process of application development, there are audio collection requirements in many scenarios, such as the voice sending function in chat features, real-time speech-to-text conversion function, real-time voice calls, and real-time video calls. On the Android and iOS platforms, the system provides two forms:
- Real-time audio stream collection
- Audio file recording
The system also provides different forms of APIs. For example, on Android:
- AudioRecord Java interface
- MediaRecorder Java interface
- OpenSL ES C++ interface
- AAudio C++ interface
Adapting an application to HarmonyOS brings the same need for audio collection. In this article, we will implement the audio collection function step by step.
Introduction to Audio Recording Interfaces
HarmonyOS provides two audio collection interfaces, one for TS and one for C++:
- AudioCapturer (TS)
- OHAudio (C++)
The two APIs are introduced in turn.
AudioCapturer
Using AudioCapturer to record audio involves creating an AudioCapturer instance, configuring audio collection parameters, starting and stopping the collection, and releasing resources. The state diagram in the official documentation marks each method and the state transition it triggers.
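The lifecycle described above can be modelled as a small state machine. The following is a minimal sketch in standard C++, not the SDK's implementation: the class `CapturerLifecycle`, the enum `CapturerState`, and the exact set of allowed transitions are assumptions made for illustration, simplified from the create, start, stop, and release steps named in the text.

```cpp
#include <cassert>

// Simplified lifecycle states. The names are illustrative and are NOT the SDK's enum values.
enum class CapturerState { Prepared, Running, Stopped, Released };

// Hypothetical model of the capturer lifecycle: create -> start -> stop -> release.
class CapturerLifecycle {
public:
    CapturerState state() const { return state_; }

    // Assumption: start is accepted from Prepared (fresh instance) or Stopped (restart).
    bool Start() {
        if (state_ != CapturerState::Prepared && state_ != CapturerState::Stopped) {
            return false;
        }
        state_ = CapturerState::Running;
        return true;
    }

    // Assumption: stop is accepted only while Running.
    bool Stop() {
        if (state_ != CapturerState::Running) {
            return false;
        }
        state_ = CapturerState::Stopped;
        return true;
    }

    // Release is terminal. This model accepts it from any state that is not yet Released.
    bool Release() {
        if (state_ == CapturerState::Released) {
            return false;
        }
        state_ = CapturerState::Released;
        return true;
    }

private:
    CapturerState state_ = CapturerState::Prepared;  // State right after creation.
};
```

Rejecting a call instead of silently ignoring it makes misuse visible, for example calling start on an instance that has already been released.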
createAudioCapturer
Creating a capturer mainly involves parameter configuration:
import { audio } from '@kit.AudioKit';
let audioStreamInfo: audio.AudioStreamInfo = {
samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_48000, // Sampling rate
channels: audio.AudioChannel.CHANNEL_2, // Number of channels
sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE, // Sampling format
encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW // Encoding format
};
let audioCapturerInfo: audio.AudioCapturerInfo = {
source: audio.SourceType.SOURCE_TYPE_MIC,
capturerFlags: 0
};
let audioCapturerOptions: audio.AudioCapturerOptions = {
streamInfo: audioStreamInfo,
capturerInfo: audioCapturerInfo
};
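audio.createAudioCapturer(audioCapturerOptions, (err, data) => {
  if (err) {
    console.error(`Failed to create AudioCapturer. Code is ${err.code}, message is ${err.message}`);
  } else {
    // Keep this instance somewhere the later start/stop/release calls can reach it
    let audioCapturer = data;
  }
});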
The parameters consist of two main parts:
- AudioStreamInfo: Audio format configuration information
  - samplingRate: Sampling rate
  - channels: Number of channels
  - sampleFormat: Sampling format
  - encodingType: Audio encoding type. Currently, only the ENCODING_TYPE_RAW configuration for PCM is supported.
- AudioCapturerInfo: Collection configuration information
  - source: Audio source type, including:
    - SOURCE_TYPE_INVALID: Invalid audio source
    - SOURCE_TYPE_MIC: Microphone audio source
    - SOURCE_TYPE_VOICE_RECOGNITION: Voice recognition source
    - SOURCE_TYPE_PLAYBACK_CAPTURE: Audio source for recording the playback audio stream (internal recording)
    - SOURCE_TYPE_VOICE_COMMUNICATION: Audio source for voice call scenarios
    - SOURCE_TYPE_VOICE_MESSAGE: Audio source for short voice messages
  - capturerFlags: Audio capturer flag. 0 represents the audio capturer.
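The format parameters above determine how much raw PCM data the capturer produces. The following sketch works through that arithmetic for the configuration used in this article (48000 Hz, 2 channels, 16-bit samples). The struct `PcmFormat` and the helper names are hypothetical and are not part of any HarmonyOS API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a raw PCM stream (not an SDK type).
struct PcmFormat {
    int32_t sampleRate;      // Frames per second, e.g. 48000.
    int32_t channels;        // Samples per frame, e.g. 2 for stereo.
    int32_t bytesPerSample;  // 2 for a 16-bit format such as S16LE.
};

// One frame holds one sample for every channel.
constexpr int64_t BytesPerFrame(const PcmFormat& f) {
    return static_cast<int64_t>(f.channels) * f.bytesPerSample;
}

// Raw data rate of the stream in bytes per second.
constexpr int64_t BytesPerSecond(const PcmFormat& f) {
    return static_cast<int64_t>(f.sampleRate) * BytesPerFrame(f);
}

// Playback duration, in whole milliseconds, of a buffer of lengthBytes bytes.
constexpr int64_t BufferDurationMs(const PcmFormat& f, int64_t lengthBytes) {
    return lengthBytes * 1000 / BytesPerSecond(f);
}
```

With this configuration one frame is 4 bytes and the stream carries 192000 bytes per second, so a 3840-byte buffer holds 20 ms of audio.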
on('readData')
The on('readData') method subscribes to the callback that delivers the captured audio data:
let readDataCallback = (buffer: ArrayBuffer) => {
// Process the audio stream
};
audioCapturer.on('readData', readDataCallback);
start
The start method is used to start recording:
import { BusinessError } from '@kit.BasicServicesKit';
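audioCapturer.start((err: BusinessError) => {
  if (err) {
    console.error(`Failed to start the capturer. Code is ${err.code}, message is ${err.message}`);
  } else {
    console.info('Capturer started.');
  }
});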
stop
The stop method is used to stop recording:
import { BusinessError } from '@kit.BasicServicesKit';
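audioCapturer.stop((err: BusinessError) => {
  if (err) {
    console.error(`Failed to stop the capturer. Code is ${err.code}, message is ${err.message}`);
  } else {
    console.info('Capturer stopped.');
  }
});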
release
The release method destroys the instance and releases resources:
import { BusinessError } from '@kit.BasicServicesKit';
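audioCapturer.release((err: BusinessError) => {
  if (err) {
    console.error(`Failed to release the capturer. Code is ${err.code}, message is ${err.message}`);
  } else {
    console.info('Capturer released.');
  }
});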
OHAudio
OHAudio is a set of C APIs introduced by the system in API version 10. The API has a unified design and supports both the normal audio path and the low-latency path. It only supports the PCM format and is suitable for scenarios where audio input is implemented at the Native layer. Many audio encoding libraries are implemented in C/C++. After migrating them to the HarmonyOS platform, using the OHAudio C++ interface on the collection side avoids the overhead of passing data between the TS layer and the C++ layer and improves efficiency.
OHAudio depends on the libohaudio.so dynamic library. By including the <native_audiostreambuilder.h> and <native_audiocapturer.h> header files, you can use the APIs related to audio recording.
Creating the Builder
OH_AudioStreamBuilder* builder;
OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);
Configuring Audio Stream Parameters
You can refer to the following example:
// Set the audio sampling rate
OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
// Set the number of audio channels
OH_AudioStreamBuilder_SetChannelCount(builder, 2);
// Set the audio sampling format
OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
// Set the encoding type of the audio stream
OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
// Set the working scenario of the input audio stream
OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);
These parameters play the same roles as their AudioCapturer counterparts.
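With two channels and the S16LE sample format, each frame occupies four bytes. The sketch below assumes the usual interleaved PCM layout (left sample, then right sample, each a little-endian signed 16-bit integer) and splits a raw byte buffer into per-channel samples. `StereoSamples`, `DecodeS16LE`, and `Deinterleave` are hypothetical helper names, and the interleaved layout is an assumption to verify against the data your device delivers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for de-interleaved stereo samples.
struct StereoSamples {
    std::vector<int16_t> left;
    std::vector<int16_t> right;
};

// Decode one little-endian signed 16-bit sample from two bytes.
inline int16_t DecodeS16LE(const uint8_t* p) {
    const uint16_t raw = static_cast<uint16_t>(p[0] | (p[1] << 8));
    return static_cast<int16_t>(raw);
}

// Split an interleaved stereo S16LE buffer (L R L R ...) into two channels.
// Trailing bytes that do not form a complete 4-byte frame are ignored.
inline StereoSamples Deinterleave(const uint8_t* data, size_t lengthBytes) {
    const size_t kFrameBytes = 4;  // 2 channels x 2 bytes per sample.
    const size_t frames = lengthBytes / kFrameBytes;
    StereoSamples out;
    out.left.reserve(frames);
    out.right.reserve(frames);
    for (size_t i = 0; i < frames; ++i) {
        const uint8_t* frame = data + i * kFrameBytes;
        out.left.push_back(DecodeS16LE(frame));
        out.right.push_back(DecodeS16LE(frame + 2));
    }
    return out;
}
```

Decoding byte by byte keeps the result independent of the host CPU's endianness, which a plain pointer cast to `int16_t*` would not.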
Setting the Audio Callback Functions
// Custom read data function
int32_t MyOnReadData(
OH_AudioCapturer* capturer,
void* userData,
void* buffer,
int32_t length)
{
// Take out the recording data with the length from the buffer
return 0;
}
// Custom audio stream event function
int32_t MyOnStreamEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Event event)
{
// Update the recorder state and interface according to the audio stream event information represented by event
return 0;
}
// Custom audio interruption event function
int32_t MyOnInterruptEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioInterrupt_ForceType type,
OH_AudioInterrupt_Hint hint)
{
// Update the recorder state and interface according to the audio interruption information represented by type and hint
return 0;
}
// Custom exception callback function
int32_t MyOnError(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Result error)
{
// Make corresponding processing according to the audio exception information represented by error
return 0;
}
OH_AudioCapturer_Callbacks callbacks;
// Configure callback functions
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
callbacks.OH_AudioCapturer_OnError = MyOnError;
// Set the callback for the audio input stream
OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);
Configure the callback functions through the OH_AudioStreamBuilder_SetCapturerCallback function.
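The example above passes nullptr as the last argument of OH_AudioStreamBuilder_SetCapturerCallback. Assuming that argument is what arrives as userData in every callback, it can carry a pointer to your own context so the read callback has somewhere to put the data. The sketch below uses only standard C++: `CaptureContext` and `AppendPcm` are hypothetical names, and `AppendPcm` mirrors the read callback's parameters without the capturer handle so that it compiles without the OHAudio headers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-stream context that would be passed as userData.
struct CaptureContext {
    std::vector<uint8_t> pcm;   // All bytes received so far.
    int64_t callbackCount = 0;  // Number of buffers delivered.
};

// Same parameters as the read callback, minus the capturer handle.
// Returns 0 on success and -1 when the arguments are unusable.
inline int32_t AppendPcm(void* userData, const void* buffer, int32_t length) {
    if (userData == nullptr || buffer == nullptr || length <= 0) {
        return -1;
    }
    auto* ctx = static_cast<CaptureContext*>(userData);
    const auto* bytes = static_cast<const uint8_t*>(buffer);
    // Copy the data out: the buffer is assumed to be valid only during the callback.
    ctx->pcm.insert(ctx->pcm.end(), bytes, bytes + length);
    ++ctx->callbackCount;
    return 0;
}
```

In a real callback, MyOnReadData would forward its userData, buffer, and length to a helper like this. Growing a vector can allocate memory, so a latency-sensitive application may prefer a pre-allocated ring buffer.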
Constructing the Recording Audio Stream
OH_AudioCapturer* audioCapturer;
OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);
Using the Audio Stream
- OH_AudioStream_Result OH_AudioCapturer_Start(OH_AudioCapturer* capturer): Start recording.
- OH_AudioStream_Result OH_AudioCapturer_Pause(OH_AudioCapturer* capturer): Pause recording.
- OH_AudioStream_Result OH_AudioCapturer_Stop(OH_AudioCapturer* capturer): Stop recording.
- OH_AudioStream_Result OH_AudioCapturer_Flush(OH_AudioCapturer* capturer): Discard cached data.
- OH_AudioStream_Result OH_AudioCapturer_Release(OH_AudioCapturer* capturer): Release the recording instance.
Releasing the Builder
OH_AudioStreamBuilder_Destroy(builder);
Audio Recording Best Practices
Taking MP3 recording as an example, this section walks through the full audio collection process.
Permission Application
Audio collection requires dynamic permission application. Declare the permissions in module.json5:
"requestPermissions": [
{
"name": "ohos.permission.MICROPHONE",
"reason": "$string:reason",
"usedScene": {
"abilities": [
"FormAbility"
],
"when": "inuse"
}
}
],
Apply for permissions dynamically:
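import { abilityAccessCtrl, common, Permissions } from '@kit.AbilityKit';
import { BusinessError } from '@kit.BasicServicesKit';
function reqPermissionsFromUser(permissions: Array<Permissions>, context: common.UIAbilityContext): void {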
let atManager: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();
// requestPermissionsFromUser will determine the authorization status of permissions to decide whether to pop up a window
atManager.requestPermissionsFromUser(context, permissions).then((data) => {
let grantStatus: Array<number> = data.authResults;
let length: number = grantStatus.length;
for (let i = 0; i < length; i++) {
if (grantStatus[i] === 0) {
// User has authorized, and you can continue to access the target operation
} else {
// User has refused authorization. Prompt the user that authorization is required to access the functions on the current page and guide the user to open the corresponding permissions in the system settings.
return;
}
} // Authorization is successful
}).catch((err: BusinessError) => {
console.error(`Failed to request permissions from user. Code is ${err.code}, message is ${err.message}`);
})
}
Call the permission request method in aboutToAppear and start recording after the authorization succeeds:
// The permission declared in module.json5 above
const permissions: Array<Permissions> = ['ohos.permission.MICROPHONE'];
const context: common.UIAbilityContext = getContext(this) as common.UIAbilityContext;
reqPermissionsFromUser(permissions, context);
Configuring the C++ Project
After creating a C++ module, configure the dependency on the ohaudio dynamic library:
cmake_minimum_required(VERSION 3.5.0)
project(audiorecorderdemo)
set(NATIVERENDER_ROOT_PATH ${CMAKE_CURRENT_SOURCE_DIR})
if(DEFINED PACKAGE_FIND_FILE)
include(${PACKAGE_FIND_FILE})
endif()
include_directories(${NATIVERENDER_ROOT_PATH}
${NATIVERENDER_ROOT_PATH}/include)
add_library(capture SHARED napi_init.cpp)
target_link_libraries(capture PUBLIC libace_napi.z.so)
target_link_libraries(capture PUBLIC libohaudio.so)
Configure the napi methods:
static napi_value start(napi_env env, napi_callback_info info)
{
return nullptr;
}
static napi_value stop(napi_env env, napi_callback_info info)
{
return nullptr;
}
EXTERN_C_START
static napi_value Init(napi_env env, napi_value exports)
{
napi_property_descriptor desc[] = {
{ "start", nullptr, start, nullptr, nullptr, nullptr, napi_default, nullptr },
{ "stop", nullptr, stop, nullptr, nullptr, nullptr, napi_default, nullptr }
};
napi_define_properties(env, exports, sizeof(desc) / sizeof(desc[0]), desc);
return exports;
}
EXTERN_C_END
Implementing the Start of Recording
// Custom read data function
int32_t MyOnReadData(
OH_AudioCapturer* capturer,
void* userData,
void* buffer,
int32_t length)
{
//TODO Take out the recording data with the length from the buffer
return 0;
}
// Custom audio stream event function
int32_t MyOnStreamEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Event event)
{
//TODO Update the recorder state and interface according to the audio stream event information represented by event
return 0;
}
// Custom audio interruption event function
int32_t MyOnInterruptEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioInterrupt_ForceType type,
OH_AudioInterrupt_Hint hint)
{
//TODO Update the recorder state and interface according to the audio interruption information represented by type and hint
return 0;
}
// Custom exception callback function
int32_t MyOnError(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Result error)
{
//TODO Make corresponding processing according to the audio exception information represented by error
return 0;
}
static napi_value start(napi_env env, napi_callback_info info)
{
OH_AudioStreamBuilder* builder;
OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);
// Set the audio sampling rate
OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
// Set the number of audio channels
OH_AudioStreamBuilder_SetChannelCount(builder, 2);
// Set the audio sampling format
OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
// Set the encoding type of the audio stream
OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
// Set the working scenario of the input audio stream
OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);
OH_AudioCapturer_Callbacks callbacks;
// Configure callback functions
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
callbacks.OH_AudioCapturer_OnError = MyOnError;
// Set the callback for the audio input stream
OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);
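// In a real project, keep builder and audioCapturer in storage that outlives this call
// (for example file-scope statics) so that stop() can reach them
OH_AudioCapturer* audioCapturer;
OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);
// Start recording
OH_AudioCapturer_Start(audioCapturer);
return nullptr;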
}
Best Practice 1:
To avoid unexpected behaviors, when setting audio callback functions, please ensure that each callback in OH_AudioCapturer_Callbacks is initialized with a custom callback method or a null pointer. For example:
OH_AudioCapturer_Callbacks callbacks;
// Configure callback functions. If you need to listen, assign values.
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
// (Required) If you don't need to listen, initialize with a null pointer.
callbacks.OH_AudioCapturer_OnStreamEvent = nullptr;
callbacks.OH_AudioCapturer_OnError = nullptr;
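An alternative that avoids forgetting a member is to value-initialize the whole struct first, which sets every function pointer to null, and then assign only the callbacks you need. The sketch below shows the idea on a stand-in struct so that it compiles without the OHAudio headers. `DemoCallbacks` and its members are hypothetical and only imitate the shape of OH_AudioCapturer_Callbacks.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a C callback table such as OH_AudioCapturer_Callbacks (hypothetical type).
struct DemoCallbacks {
    int32_t (*onReadData)(void* userData, void* buffer, int32_t length);
    int32_t (*onStreamEvent)(void* userData, int32_t event);
    int32_t (*onInterruptEvent)(void* userData, int32_t type, int32_t hint);
    int32_t (*onError)(void* userData, int32_t error);
};

// Example handler for the only callback this sketch listens to.
inline int32_t DemoOnReadData(void* /*userData*/, void* /*buffer*/, int32_t /*length*/) {
    return 0;
}

inline DemoCallbacks MakeCallbacks() {
    // "= {}" value-initializes the struct: every pointer member becomes nullptr.
    // A plain "DemoCallbacks callbacks;" would leave the members indeterminate.
    DemoCallbacks callbacks = {};
    callbacks.onReadData = DemoOnReadData;
    return callbacks;
}
```

Members added to the struct in a later SDK version would also start out as null with this pattern, without any change to the calling code.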
Best Practice 2:
For devices that support the low-latency mode, in scenarios with strict latency requirements (such as voice calls), you can configure the audio stream builder for low-latency mode to obtain a better audio experience:
OH_AudioStream_LatencyMode latencyMode = AUDIOSTREAM_LATENCY_MODE_FAST;
OH_AudioStreamBuilder_SetLatencyMode(builder, latencyMode);
Audio File Processing
In the audio callback, we can process the audio data. It can be handed over to ASR or directly written to a file. In the next article, we will implement the practice of encoding it into MP3 and writing it to a file.
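Before an MP3 encoder is in place, a quick way to check the captured data is to wrap the raw PCM in a WAV container, which only needs a 44-byte header in front of the samples. This is a substitute technique for verification, not the MP3 pipeline this article series builds. The function names below are hypothetical, and the layout is the canonical PCM WAV header.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the low `bytes` bytes of value in little-endian order.
inline void AppendLE(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Append a 4-character chunk tag such as "RIFF".
inline void AppendTag(std::vector<uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Build the 44-byte header of a PCM WAV file that will hold dataBytes bytes of samples.
inline std::vector<uint8_t> BuildWavHeader(uint32_t sampleRate, uint16_t channels,
                                           uint16_t bitsPerSample, uint32_t dataBytes) {
    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;
    std::vector<uint8_t> header;
    header.reserve(44);
    AppendTag(header, "RIFF");
    AppendLE(header, 36 + dataBytes, 4);  // File size minus the first 8 bytes.
    AppendTag(header, "WAVE");
    AppendTag(header, "fmt ");
    AppendLE(header, 16, 4);              // Size of the fmt chunk for PCM.
    AppendLE(header, 1, 2);               // Audio format 1 = uncompressed PCM.
    AppendLE(header, channels, 2);
    AppendLE(header, sampleRate, 4);
    AppendLE(header, byteRate, 4);        // Bytes per second.
    AppendLE(header, blockAlign, 2);      // Bytes per frame.
    AppendLE(header, bitsPerSample, 2);
    AppendTag(header, "data");
    AppendLE(header, dataBytes, 4);       // Number of sample bytes that follow.
    return header;
}
```

For the configuration used here, `BuildWavHeader(48000, 2, 16, totalBytes)` followed by the collected PCM bytes gives a file that common audio players can open.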
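Stopping Recording and Destroying the Instance
// Stop recording
OH_AudioCapturer_Stop(audioCapturer);
// Release the recording instance
OH_AudioCapturer_Release(audioCapturer);
// Release the builder
OH_AudioStreamBuilder_Destroy(builder);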
Summary
This article introduced the two audio collection methods provided by HarmonyOS, AudioCapturer at the TS layer and OHAudio at the C++ layer, and implemented real-time audio collection using the OHAudio interface.