Going online on SC Voice, sadhu!! I'll make sure all of Bhante's work so far is up as Opus and MP3 this weekend; they will be different files with different timings. Are you working from the JSON, VTT, or SRT pulled live, or storing it on the SC Voice server?
Will change the URLs if it makes it easier! Just matches the naming of the FLACs.
I can tell the Opus files are reducing the sampling rate in any case. I've just checked, and 20 kHz is what it's going down to (Opus (audio format) - Wikipedia). I'll set the encoder to 8 kHz and see if there's any discernible difference for chanting. Good to check, thanks! If reducing the sampling rate means less CPU is needed for playback, that's good too.
The only reason I say 48 kHz is that most things use 48 kHz; 44100 Hz is really a standard from the CD days. As I said, it doesn't matter much, but since 44100 Hz is basically considered legacy and there will be other things to plug these files into in the future, 48 kHz is the more common choice. I chose 44100 Hz originally because some of the processing tools wanted it.
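For anyone wanting to try the sample rates discussed above, here is a minimal sketch of building an ffmpeg command for Opus encoding. It assumes ffmpeg with the libopus encoder is the tool in use, which may not match the encoder actually used for these recordings; the file names and the 32k bitrate are placeholders. Opus itself works at 8, 12, 16, 24, or 48 kHz, so 44100 Hz is rejected here.

```python
from typing import List

# Sample rates the Opus codec works at internally.
OPUS_SAMPLE_RATES = (8000, 12000, 16000, 24000, 48000)


def opus_encode_command(source: str, target: str,
                        sample_rate: int = 48000,
                        bitrate: str = "32k") -> List[str]:
    """Build (but do not run) an ffmpeg command that encodes source to Opus.

    The bitrate default is an illustrative assumption, not a recommendation.
    """
    if sample_rate not in OPUS_SAMPLE_RATES:
        raise ValueError(f"Opus does not support {sample_rate} Hz")
    return [
        "ffmpeg", "-i", source,
        "-c:a", "libopus",        # encode audio with libopus
        "-b:a", bitrate,          # target audio bitrate
        "-ar", str(sample_rate),  # output sample rate in Hz
        target,
    ]


# Hypothetical file names, for illustration only.
print(" ".join(opus_encode_command("mn1.flac", "mn1.opus", sample_rate=8000)))
```

The list can be passed to `subprocess.run` to do the actual encode, so the 8 kHz and 48 kHz versions can be compared by ear.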
Amazing to hear! Anything that can be tweaked will be great to hear about.
I'm not sure how tapping a button while you talk requires "a gap"?
Really? Is that so these days? Well, I guess I shouldn't be too surprised: storage is so cheap!
CDs and DATs and even a reel-to-reel machine! My studio experience was long ago and is now, apparently, obsolete, so I guess I'll just go back to lurking and learning from y'all.
I had a reel-to-reel in college for listening to hi-fi music. Now I listen to MP3s. Maybe I should choose a middle way. Maybe… Opus.
Thank you. Deleting those zeroes will be a big help.
The zeroes are complicated because Voice users do not type zeroes and one never knows exactly how many zeroes to use. It's not always one. Sometimes it's two. Sometimes it's none. Etc.
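As a rough illustration of the zero problem, a lookup could normalize whatever padding a file name uses down to what a user would type. This is only a sketch: the padded forms in the example (`mn001`, `sn01.002`) are guesses at what padded names might look like, not the actual file names.

```python
import re


def strip_leading_zeroes(sutta_id: str) -> str:
    """Remove zero padding from every number in a sutta ID.

    Zeroes are dropped only when they start a run of digits and are
    followed by another digit, so "10" and a lone "0" are left alone.
    """
    return re.sub(r"(?<!\d)0+(?=\d)", "", sutta_id.lower())


# Hypothetical padded names, for illustration only.
for padded in ("mn001", "sn01.002", "an10.100", "DN02"):
    print(padded, "->", strip_leading_zeroes(padded))
```

Going the other way (adding the zeroes back) is the hard direction, since the padding width is not predictable, which is why removing them from the file names is simpler.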
For now, I'll adopt the crudest implementation, which is to simply provide an HTML link to your DigitalOcean .opus files. The full audio files can be heard directly in the browser, so I think all I'd need to provide is a Pali link and an English link for each uploaded sutta. Aminah is thinking about changing the Voice UI a bit to make the Other Resources more prominently available. It will therefore take us a few release iterations before we settle on how to fetch and present Bhante's full audio recordings.
In a later release, I'll be downloading your segmented audio files into dedicated audio caches on the Voice server. This will provide a smooth user listening experience without jarring pauses for retrieval. This latter task is much more work and has dependencies on VSM technology that I haven't implemented yet. The segmented human audio will be quite interesting to work on and experience, since we'll be able to combine, for example, Bhante's Pali audio segments with Vicki's German voice for the German translation of a sutta while showing the text for both at the same time, segment by segment!
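To make the segment-by-segment combination concrete, here is a small sketch of how two sets of cached segment audio might be interleaved into one playlist. The dictionaries, file paths, and function name are all hypothetical; Voice's actual cache layout is not described in this thread.

```python
from typing import Dict, List, Tuple


def interleave_segments(pali: Dict[str, str],
                        translation: Dict[str, str],
                        translation_lang: str = "de") -> List[Tuple[str, str, str]]:
    """Return (segment_id, language, audio_path) entries in playback order.

    Each Pali segment is followed by the matching translated segment,
    if one exists. Dicts keep insertion order, so the Pali order is used.
    """
    playlist = []
    for seg_id, pali_audio in pali.items():
        playlist.append((seg_id, "pli", pali_audio))
        if seg_id in translation:
            playlist.append((seg_id, translation_lang, translation[seg_id]))
    return playlist


# Hypothetical cache contents, for illustration only.
pali_audio = {"mn1:1.1": "cache/pli/mn1_1.1.opus", "mn1:1.2": "cache/pli/mn1_1.2.opus"}
german_audio = {"mn1:1.1": "cache/de/mn1_1.1.mp3"}
for entry in interleave_segments(pali_audio, german_audio):
    print(entry)
```

Keying both sides by the same segment ID is what lets a human recording and an AI voice be mixed freely.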
I guess explosions and bullets and jets would need the higher frequencies!
But it never occurred to me that one would lower the pitch of recorded human audio! Wow! That certainly would require higher frequency recording. I had the oddest mental flash on the suttas chanted by Bhante slowed down to Mongolian throat singer pitches.
Although we do lower the pitch of Aditi by 10% to make it more accessible to users with high frequency hearing loss, we would probably not need to do that with human voices since we already have adequate low frequency voice coverage of all supported suttas.
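For a sense of scale, a 10% pitch reduction can be expressed in semitones. The sketch below assumes "10% lower" means every frequency is multiplied by 0.9, which is one reading of the figure above and not a statement of how Voice configures Aditi.

```python
import math


def pitch_shift_semitones(frequency_ratio: float) -> float:
    """Convert a frequency ratio to a pitch shift in equal-tempered semitones."""
    return 12 * math.log2(frequency_ratio)


# Assumption: "10% lower" means frequencies are scaled by 0.9.
shift = pitch_shift_semitones(0.90)
print(f"{shift:.2f} semitones")  # prints "-1.82 semitones"
```

That is a little under two semitones, far from the octave or more it would take to reach throat-singing territory.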
Because words in a sentence normally run on from one to the next. We don't speak like this; wespeaklikethis. Cutting a run-on of words in real time, reliably and accurately, is practically impossible. It's even more difficult if it is to be done without disturbing concentration or altering the natural phrasing. Anyway, there is no need: that's the whole reason for using software to do it in post-processing. The software works fine; it is just being tweaked.
Now that we are able to create Voice Sound Modules (VSMs) for AI voices to provision sound caches in production, it's time for me to swing back to human voices and pick up the work on Aeneas segmentation of Bhante's recordings. Let me know if there are any changes to the workflow or plan.
I'll be working initially with the Pali recordings rather than the English recordings because I'm guessing the Pali recordings will be stable given that the Mahasangiti text isn't changing. This will allow more time for Bhante Sujato to re-record any English translations should the need arise. It will also provide an alternative to Aditi for users listening to Pali.
No worries, all looking good. I've been slow on recording the last couple of weeks due to completing the paragraph task. I'll be back at it very soon.
While it's true to say the Pali text will be more stable, I do not expect the English text to change drastically, especially the human-recorded version. I will not be re-recording this; it is a one-time thing. So I am making sure I am happy with the text before recording. Of course there will be small changes and adjustments as time goes on, and I will always be open to fixing mistakes. But the result of that will be a slight drift between the written text and the recorded version. Again, I fully anticipate that this will only happen very slowly and in a very small number of cases.
@MichaelH I just noticed that the segment URLs do not have author or reader fields. This is fine for now, but would be a problem if we support other authors or readers. Currently sujato/sujato are implicit.
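One way to picture making the implicit fields explicit is a URL builder where author and reader default to the current values. The path layout, base URL, and parameter names below are invented for illustration and do not describe the existing segment URLs.

```python
def segment_url(base: str, sutta_id: str, segment_id: str, lang: str,
                author: str = "sujato", reader: str = "sujato") -> str:
    """Build a segment audio URL with explicit author and reader parts.

    The path layout is a hypothetical example. The defaults mirror the
    sujato/sujato pairing that is currently implicit.
    """
    return (f"{base.rstrip('/')}/{lang}/{author}/{reader}/"
            f"{sutta_id}/{segment_id}.opus")


# Hypothetical base URL, for illustration only.
print(segment_url("https://example.org/audio", "mn1", "mn1:1.1", "en"))
```

With defaults like these, existing callers keep working while a second reader or translator can be added later without renaming files.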
With the segmentation of Bhante's audio, users will be able to listen to the above combination of voices as well as others. Many thanks to MichaelH and Bhante Sujato for making this possible!
@MichaelH, would you join us on this thread when you have a moment? We have some questions on the extent and perhaps location of segmented audio uploads. We will soon release Bhante's currently recorded segmented Pali audio in Voice and need to check back with you…
Since Aeneas produces subtitles, I've written some code that outputs videos and subtitles for any sutta in bilara-data and adds them to a YouTube playlist named after the Nikaya chapter. Here are some tests for perusal:
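For readers unfamiliar with subtitle files, this sketch shows how per-segment timings map onto the SRT format. It is not the code mentioned above and not Aeneas's own output routine; the `(begin, end, text)` tuples and the sample timings are assumed for illustration.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(fragments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (begin, end, text) fragments as numbered SRT cues."""
    cues = []
    for index, (begin, end, text) in enumerate(fragments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(begin)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


# Made-up timings, for illustration only.
print(to_srt([(0.0, 2.5, "So I have heard."), (2.5, 7.04, "At one time...")]))
```

WebVTT cues look almost the same, except that the millisecond separator is a period and the file starts with a `WEBVTT` header line.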
If you think this drone noise is okay, please let me know. I can also easily take away the big SuttaCentral header or the sc-voice logo. The visuals are just slightly random images from a pixabay.org search for "forest". Perhaps other pics might be better.
Once I have a machine set up that can do rendering, over the next fortnight, I can render all of the recordings straight into a YouTube playlist for each chapter. It would be good to get some feedback over the next few weeks, as rendering takes a while. I'd create a whole new channel for Early Buddhist text recordings. If SC should have its own YouTube channel, I'm happy to set one up for SC instead.