How To Increase Listen Time In Google Speech Api?
Solution 1:
SpeechRecognition is a pretty good library, but I too have had to fight with recording lengths. Here's how I've worked around the problem:
Saving Audio to Disk
import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile('path/to/audiofile.wav') as source:
    audio = r.record(source)  # read the whole file into an AudioData object
Pros: Recording to an audio file and then sending longer chunks to Google has given me more consistent recording lengths, compared to streaming.
Cons: Depending on the size of the audio file, this can lengthen the response time to a couple of seconds, which might be unusable in your case.
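If one long file makes the response too slow, a possible middle ground is to cut the recording into fixed-length chunks and send them one at a time. The sketch below uses only the standard `wave` module and assumes a mono PCM WAV file; `split_wav` and the 30-second default are illustrative names and values, not part of the speech recognition library.

```python
import wave

def split_wav(path, chunk_seconds=30.0):
    """Yield (frames, sample_rate, sample_width) for consecutive chunks.

    Hypothetical helper: assumes a mono PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample
        frames_per_chunk = int(rate * chunk_seconds)
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            # Each tuple could be wrapped as sr.AudioData(frames, rate, width)
            # before being passed to a recognizer.
            yield frames, rate, width
```

The last chunk is simply shorter than the others, so no audio is dropped.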
Minimizing Noise Floor
You're likely already very aware that a better signal-to-noise ratio will give better STT accuracy, but I've also found it critical for getting good chunk sizes with the speech recognition library.
Double check that your noise floor is easily distinguishable from your source. Recording the audio also helps you troubleshoot this. Sometimes the audio can cut off prematurely with the speech recognition library because it doesn't clearly detect that you are speaking.
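One rough way to check this on a saved recording is to print the loudness of each short window and compare the quiet stretches with the spoken ones. This is a minimal sketch using only the standard library; it assumes a 16-bit mono WAV file, and `rms_dbfs` and `window_levels` are hypothetical helper names.

```python
import array
import math
import sys
import wave

def rms_dbfs(samples):
    """RMS level of 16-bit samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / 32768.0)

def window_levels(path, window_seconds=0.5):
    """Return one dBFS level per window of the file."""
    levels = []
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch expects a 16-bit mono WAV file")
        frames_per_window = int(wav.getframerate() * window_seconds)
        while True:
            frames = wav.readframes(frames_per_window)
            if not frames:
                break
            samples = array.array("h", frames)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV data is little-endian
            levels.append(rms_dbfs(samples))
    return levels
```

If the windows where you are speaking are only a few dB above the silent ones, the recognizer will likely struggle to tell where a phrase starts and ends.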
If improving the quality or proximity of your microphone isn't possible, there is a tool included in the library which calibrates audio levels for optimal signal-noise distinction.
To use this feature, call it on the source before listening. It does not return audio, so it cannot replace the line:
audio=r.listen(source)
Instead, use:
r.adjust_for_ambient_noise(source)
audio = r.listen(source)
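Be aware that calibration adds latency: by default it samples the source for one second before listening starts (the duration parameter controls this). Separately, r.listen(source) can keep listening indefinitely if you feed it noisy audio, because it never detects the silence that ends a phrase.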
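To lengthen the listen time while still guarding against an endless listen, the library's `pause_threshold` attribute and the `timeout` and `phrase_time_limit` arguments of `listen()` can be combined. The wrapper below is only a sketch: `listen_with_limits` and its default values are assumptions chosen for illustration, and `recognizer` is expected to be an `sr.Recognizer` instance.

```python
def listen_with_limits(recognizer, source, calibrate_seconds=1.0,
                       pause_seconds=1.5, start_timeout=5.0,
                       max_phrase_seconds=30.0):
    """Calibrate, then listen with explicit upper bounds (hypothetical helper)."""
    # Sample background noise first; this call returns nothing and only
    # updates the recognizer's energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # Allow longer silences inside a phrase before it counts as finished
    # (the library default is 0.8 seconds).
    recognizer.pause_threshold = pause_seconds
    # timeout bounds the wait for speech to start; phrase_time_limit bounds
    # the phrase itself, so noisy input cannot keep the call open forever.
    return recognizer.listen(source, timeout=start_timeout,
                             phrase_time_limit=max_phrase_seconds)
```

It would be called inside a `with sr.Microphone() as source:` block. Note that `listen()` raises `sr.WaitTimeoutError` if no speech starts within `timeout`, so the caller may want to catch that.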
Combining it All
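with sr.AudioFile('path/to/audiofile.wav') as source:
    r.adjust_for_ambient_noise(source)  # calibrates on (and consumes) the first second of the file
    audio = r.record(source)  # records the remainder of the file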
Here's a great guide for this library - The Ultimate Guide To Speech Recognition With Python