Microsoft Sam TTS Generator is an online interface for part of Microsoft Speech API 4.0, which was released in 1998. Note that the BonziBUDDY voice is actually "Adult Male #2" with a specific pitch and speed. All voices have lower and upper pitch and speed limits. Wait for the generated audio to appear in the audio player. It should be done nearly instantly, as the interface tries to generate audio at x16777215 real-time. To save the generated audio, right-click on the audio player and press "Save audio as...".

I told you recently how I was using AI to help me make podcast show notes better. I'm using Grammarly Go to help me write the show notes, which are summaries of my podcast. Again, I'm not using it to write my content. I really want to do that myself. But instead, I was using AI to get good show notes and get some good hot topics for social media.

But I failed to mention, I think, the very first step in all of this. The first thing I need is a good transcription.

At first, I tried other services, and they started reducing my confidence that I was any good at podcasting at all. When I saw the transcription, the words were quite different than what I actually said.

Then back in September of 2022, Auphonic added a new feature. Auphonic is the app that I use (and Allison uses) to help process the audio file. It smooths things out when it comes to noise levels and loudness levels, and it makes the whole podcast better. They came out with integration with Whisper ASR (Automatic Speech Recognition) to do audio transcripts. This means that when you activate that, you get a subtitle file, a transcript, what is called a VTT file, an HTML file if you want to put the transcript on your website, or a JSON file. This allows you to exchange that data with other types of websites and applications too. That did a great job, and it did a little bit better than the apps I was trying to use. Trying to clean up a 3,000-word transcription that's not very good isn't going to help me very much. I really need that good, solid transcript to start this whole AI process. But Whisper with Auphonic did a great job and it got me a long way.

Then I needed to go through and start doing my older podcasts. I have 148 of them under my belt. Auphonic gives you two free hours of transcription time a month, so putting all my old podcasts into their app would have used up a lot of my time. So I was looking for some other solution where I could start dropping those audio files into something and get good transcripts out of it.

It turns out that OpenAI owns the rights to the Whisper software, and a port of it, whisper.cpp, was produced by a fellow named Georgi Gerganov. According to OpenAI, they used 680,000 hours of audio to initially train Whisper to do transcription, which is part of this whole AI movement for sure. And back in September of 2022, OpenAI released the whole Whisper code into open source.
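To make the transcript formats mentioned above a little more concrete, here is a minimal sketch of how a JSON transcript could be turned into a VTT subtitle file. It is only an illustration: the `segments`, `start`, `end`, and `text` field names are assumptions made for this example, not a description of what Auphonic actually exports, and the helper names are hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments: list[dict]) -> str:
    """Build a WebVTT document from segments with start, end, and text keys."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")
        lines.append(seg["text"].strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical JSON transcript; the field names are assumed for illustration.
transcript_json = """
{"segments": [
  {"start": 0.0, "end": 2.5, "text": " Welcome to the show."},
  {"start": 2.5, "end": 6.0, "text": " Today we talk about transcripts."}
]}
"""

vtt = segments_to_vtt(json.loads(transcript_json)["segments"])
print(vtt)
```

The point is just that the JSON file carries the same words and timings as the subtitle file, which is why it is handy for passing a transcript along to other websites and applications.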