So I have the API working, in the sense that I can send audio files and get text back, but what I'm looking for is a robust way to get streaming functionality. For example, after a short period of silence it should stop recording and send the audio to the API.
Is there a Python library that does this?
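To make the idea concrete, here is a minimal sketch of the "stop on silence" logic using only the standard library. It assumes the audio arrives as fixed-size frames of 16-bit mono PCM (for example from a microphone library such as PyAudio or sounddevice), and it uses a plain energy threshold rather than a real voice activity detector. The names `rms`, `split_on_silence`, `threshold` and `max_silent_frames` are made up for this illustration, and the default values are guesses that would need tuning. A dedicated VAD package such as webrtcvad will likely be more robust than a fixed threshold.

```python
import math
from array import array
from typing import Iterable, Iterator, List


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of one 16-bit mono PCM frame."""
    samples = array("h", frame)  # native-endian signed 16-bit samples
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_on_silence(
    frames: Iterable[bytes],
    threshold: float = 500.0,      # assumed RMS level separating speech from silence
    max_silent_frames: int = 10,   # assumed: 10 frames of 30 ms is about 300 ms
) -> Iterator[bytes]:
    """Yield one bytes object per utterance, closing it after a run of silent frames."""
    buffer: List[bytes] = []
    silent_run = 0
    for frame in frames:
        if rms(frame) >= threshold:
            # Speech: start or continue the current utterance.
            buffer.append(frame)
            silent_run = 0
        elif buffer:
            # Silence inside an utterance: keep it in case speech resumes.
            buffer.append(frame)
            silent_run += 1
            if silent_run >= max_silent_frames:
                # Long enough pause: emit the utterance without the trailing silence.
                yield b"".join(buffer[: len(buffer) - silent_run])
                buffer, silent_run = [], 0
        # Silence before any speech is simply dropped.
    if buffer:
        # Flush whatever is left when the stream ends.
        yield b"".join(buffer[: len(buffer) - silent_run])
```

Each chunk yielded by `split_on_silence` could then be wrapped in a WAV header and sent to the transcription API the same way a whole file is sent now.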
Just stumbled upon this fast one: https://github.com/sanchit-gandhi/whisper-jax
And this one for word-level timestamps: https://github.com/m-bain/whisperX