How to transcribe audio to text for FREE - Riverside’s new AI transcription tool

Joey /// VP Land
21 Mar 2023 · 07:04

TLDR

Riverside has launched a new AI-powered transcription tool that transcribes audio and video files in over 100 languages for free. Built on OpenAI's Whisper technology, the tool is available at riverside.fm/transcription and does not require an account. Users upload a file, watch the transcript appear in real time, and can then copy the text or download it as a text file or as an SRT file for captioning. The output is raw text with timecode markers; the tool does not identify speakers or allow editing. Accuracy is generally high, with minor punctuation and proper-noun errors. The tool cannot play back audio alongside the transcript to check accuracy, and it does not accept custom vocabularies. Despite these limitations, it is a valuable free resource for transcription.

Takeaways

  • 🆓 Riverside has launched a free AI transcription tool that can transcribe audio or video files in over 100 languages.
  • 🌐 The tool is built on Whisper, an OpenAI technology, and is accessible without the need for an account at riverside.fm/transcription.
  • ⏳ The transcription process is real-time, and the tool processed a 50-minute interview in approximately 1-2 minutes.
  • 📄 The output can be copied to clipboard with timecode markers but lacks speaker labels.
  • 🔄 Users can download the transcript in two formats: a text file and an SRT file for video captioning.
  • 🚫 The free service does not allow for audio preview or text editing within the tool.
  • 📈 In a test against a proofread transcript, the tool's output was 88% similar, with most differences being punctuation and a few proper nouns.
  • 📌 Proper names and specific terms may not be accurately transcribed, indicating a lack of custom vocabulary support.
  • 📁 The tool does not currently integrate with the Riverside app, but an announcement hints at future integration.
  • 🎉 The transcription tool is highly rated for its accuracy and utility, especially considering it's free to use.
  • 📝 For higher accuracy, it's recommended to proofread the transcript and use additional tools for speaker identification if needed.
  • 🌟 The tool is a great starting point for podcasters and video podcasters looking to transcribe their content for various platforms.

Q & A

  • What is the new AI transcription tool released by Riverside?

    -Riverside has released a new AI transcription tool that is completely free and can transcribe audio or video files in over 100 different languages.

  • What is the underlying technology used by Riverside's transcription tool?

    -The tool is built on top of Whisper, which is another tool from OpenAI, the same company that developed ChatGPT, GPT-3, and GPT-4.

  • How does the transcription tool work?

    -The transcription tool has a simple interface where users can drag and drop their audio or video files to start the transcription process. It processes the file in real time and generates a text with timestamps.

  • What are the limitations of file size or duration for the transcription tool?

    -The video does not state any specific file size or duration limits, though the presenter expects that some exist. The example file was a little under 5 gigabytes and contained an interview of about 50 minutes.

  • How long did it take to transcribe the example interview using the tool?

    -The transcription of the roughly 50-minute interview was completed in approximately 1 to 2 minutes.

  • What are the options for using the transcribed text?

    -The transcribed text can be copied to the clipboard with timecode markers, downloaded as a text file, or exported as an SRT file for video captioning.

  • Does the transcription tool identify speakers in the audio or video?

    -No, the transcription tool does not identify speakers or provide speaker labels. For speaker identification, the transcript would need to be loaded into another tool like Descript.

  • What are the downsides of using the free transcription service?

    -The downsides include the inability to preview the audio for accuracy testing, the inability to edit the text directly within the tool, and the lack of integration with the Riverside app for podcast episodes.

  • How accurate is the transcription compared to a proofread transcript?

    -The overall similarity between the AI-transcribed text and a proofread transcript was 88%, with most differences being punctuation and a few proper nouns.

  • Can you use the transcription tool without signing up for an account?

    -Yes, the transcription tool does not require an account or login, and it is completely separate from the Riverside app.

  • What is the process for obtaining a more accurate transcription with speaker identification?

    -To obtain a more accurate transcription with speaker identification, the user would need to load the transcript into a tool like Descript, which can handle speaker identification.

  • What is the recommended next step for those interested in podcasting and video podcasting?

    -The recommended next step is to check out a course on how to use Riverside for podcasting and video podcasting, as YouTube is growing in podcast popularity.

Outlines

00:00

🚀 Introduction to Riverside's Free AI Transcription Tool

Riverside has launched a free AI-powered transcription tool that can transcribe audio and video files in over 100 languages. It is built on Whisper, a speech recognition technology from OpenAI, the same company behind ChatGPT and the GPT models, and it offers a simple interface that does not require an account. The example file in the video was a little under 5GB; no limit on file size or duration is mentioned, although some limit is expected. The tool processes files in real time, displaying text and timestamps as it works. It does not identify speakers or provide speaker labels. The transcribed text can be copied or downloaded as a text file or as an SRT file for captioning, but it cannot be edited within the tool. The tool is separate from the Riverside app, but an integration is anticipated with an upcoming app update.

05:02

📊 Accuracy Test and Comparison of Riverside's Transcription Tool

The video compares the Riverside transcription tool's output against a proofread transcript using an online comparison tool. The overall similarity was 88%, with most differences being punctuation and a few proper nouns. The tool transcribed 'GPT-3' incorrectly as 'GTB-3' and got a name wrong. It also has no way to load a custom vocabulary library for specific proper nouns. Despite these minor inaccuracies, the tool performed well, correctly capitalizing certain words and providing a high-quality transcription for free. The video concludes by encouraging viewers to use the tool, subscribe for more content, and check out a course on podcasting with Riverside.

Keywords

💡AI transcription tool

An AI transcription tool refers to a software application that uses artificial intelligence to convert spoken language from audio or video files into written text. In the context of the video, Riverside has released a free AI transcription tool that can transcribe audio and video in over 100 languages, showcasing its utility for podcasters and content creators.

💡Whisper

Whisper is a technology developed by OpenAI that is used for transcribing and translating speech. It is mentioned in the video as the underlying technology that powers Riverside's new transcription tool, highlighting its role in enabling accurate transcription services.

💡ChatGPT, GPT-3, GPT-4

These terms refer to different language models developed by OpenAI. ChatGPT is a conversational AI model, while GPT-3 and GPT-4 are successive generations of its language models with increasing capabilities in understanding and generating human-like text. They are mentioned to establish OpenAI's expertise in AI language processing, which is relevant to the transcription tool's capabilities.

💡Transcription accuracy

Transcription accuracy refers to how well a transcription tool can convert spoken words into written text without errors. The video discusses the accuracy of Riverside's tool by comparing it to a proofread transcript, noting that it achieved an 88% similarity, which is considered highly accurate for a free service.

💡Timecode markers

Timecode markers are timestamps that indicate specific points in a video or audio file. They are important for locating sections of the media. The video script mentions that the transcription includes timecodes, which is useful for referencing parts of the interview or podcast.
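As a rough illustration of how timecode markers relate to positions in a recording, the Python sketch below turns second offsets into HH:MM:SS markers placed in front of each line of text. The bracketed marker style and the function names are assumptions made for this example; the video does not describe the exact format Riverside's tool uses.

```python
def format_timecode(seconds: float) -> str:
    """Format an offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def with_timecodes(segments):
    """Prefix each (start_seconds, text) pair with a bracketed timecode.

    The "[HH:MM:SS] text" layout is a hypothetical format for illustration.
    """
    return [f"[{format_timecode(start)}] {text}" for start, text in segments]


if __name__ == "__main__":
    for line in with_timecodes([(0, "Welcome to the show."), (302, "Now the accuracy test.")]):
        print(line)
```

A marker such as 00:05:02 then points to the moment 5 minutes and 2 seconds into the recording.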

💡Speaker labels

Speaker labels are identifiers used to denote which speaker is talking at any given time in a transcript. The video points out that Riverside's tool does not automatically identify speakers, which means that for transcripts requiring speaker identification, additional software like Descript would be necessary.

💡Descript

Descript is a transcription and editing tool that offers features such as speaker identification and transcription editing. In the context of the video, it is suggested for users who need more advanced transcription features, including identifying speakers and editing the text for accuracy.

💡SRT file

An SRT file is a SubRip subtitle file format used to create captions for videos. The video script explains that users can download an SRT file from Riverside's tool to add captions to their videos, which can then be uploaded to various platforms like YouTube or Vimeo.
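For readers who have not opened an SRT file before, the Python sketch below builds one from a list of timed segments. Each caption is a numbered block with a `start --> end` line in `HH:MM:SS,mmm` form, followed by the caption text and a blank line. The segment data and function names are made up for this example and do not come from Riverside's tool.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Captions are numbered from 1; blocks are separated by a blank line.
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segments, for illustration only.
    print(to_srt([(0.0, 2.5, "Welcome to the show."), (2.5, 5.0, "Let's get started.")]))
```

Because the format is plain text, a downloaded SRT file can be opened and proofread in any text editor before it is uploaded with a video.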

💡Free online comparison tool

A free online comparison tool is a service that allows users to compare two pieces of text to identify differences. In the video, the presenter uses such a tool to compare the accuracy of the AI-generated transcript with a manually proofread version, demonstrating the tool's effectiveness.
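The video does not name the comparison tool or explain how it computes its similarity figure. As a stand-in, the Python sketch below uses the standard library's `difflib.SequenceMatcher` to score two transcripts word by word and list the spans that differ. The sample sentences are invented, and the score it produces is not expected to match the percentage reported in the video.

```python
import difflib


def similarity(ai_text: str, proofread_text: str) -> float:
    """Return a word-level similarity ratio between 0.0 and 1.0."""
    return difflib.SequenceMatcher(None, ai_text.split(), proofread_text.split()).ratio()


def word_differences(ai_text: str, proofread_text: str):
    """List (ai_words, proofread_words) pairs for every span that differs."""
    a, b = ai_text.split(), proofread_text.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Invented sample text, for illustration only.
    ai = "we tested GTB-3 and it worked well"
    proofread = "We tested GPT-3, and it worked well."
    print(f"similarity: {similarity(ai, proofread):.0%}")
    for ai_words, fixed_words in word_differences(ai, proofread):
        print(f"{ai_words!r} -> {fixed_words!r}")
```

Note that a word-level comparison counts punctuation and capitalization changes as differences, which is one reason a mostly correct transcript can still score well below 100%.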

💡Custom vocab library

A custom vocab library is a collection of specific words and terms that a transcription tool can use to improve its accuracy for particular users or industries. The video notes that Riverside's tool does not allow users to upload their own custom vocab library, which could be a limitation for handling industry-specific jargon or proper nouns.
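Since the tool offers no custom vocabulary, one possible workaround is to correct known mistakes after downloading the transcript. The Python sketch below applies a small find-and-replace glossary to the text. This is a proofreading aid under stated assumptions, not a feature of Riverside's tool; the function name is hypothetical, and the 'GTB-3' entry is taken from the error mentioned in the video.

```python
import re


def apply_glossary(text: str, corrections: dict) -> str:
    """Replace known mis-transcriptions with their correct spellings.

    Matches whole terms only and ignores case, so "gtb-3" is also fixed
    while a longer token such as "GTB-30" is left alone.
    """
    for wrong, right in corrections.items():
        pattern = re.compile(rf"(?<!\w){re.escape(wrong)}(?!\w)", flags=re.IGNORECASE)
        # A function replacement keeps backslashes in `right` literal.
        text = pattern.sub(lambda _match, right=right: right, text)
    return text


if __name__ == "__main__":
    glossary = {"GTB-3": "GPT-3"}
    print(apply_glossary("We asked GTB-3 about it.", glossary))
```

A glossary like this only catches errors that are already known, so the transcript still needs a proofreading pass for new names and terms.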

💡Video podcasting

Video podcasting refers to the creation and distribution of podcast content in video format, which is hosted on platforms like YouTube. The video encourages viewers to explore video podcasting, suggesting that it is a growing medium due to the increasing popularity of YouTube for podcast consumption.

Highlights

Riverside has released a free AI transcription tool that can transcribe audio or video in over 100 languages.

The tool is built on Whisper technology from OpenAI, the creators of ChatGPT, GPT-3, and GPT-4.

Users can upload their files to riverside.fm/transcription without needing an account.

The tool handles large files; the example was a podcast episode of just under 5GB.

Transcription is done in real-time, with the interface showing text and timestamps as it processes.

The transcription process for a 50-minute interview took approximately 1-2 minutes.

The output can be copied to clipboard with timecode markers but without speaker labels.

For speaker identification, the transcript needs to be loaded into another tool like Descript.

The transcript can be downloaded in text file format or as an SRT file for video captioning.

The video mentions a discount for new Riverside users with the code JOEY30.

The transcription accuracy is impressive, with an 88% similarity to a proofread transcript.

Most inaccuracies are minor, such as punctuation and capitalization differences.

The tool does not support uploading a custom vocab library for specific proper nouns.

Despite being free, the tool provides one of the most accurate transcriptions available.

The transcription can serve as a great starting point before proofreading for accuracy.

Riverside's transcription tool is a standalone service, separate from the Riverside app.

An upcoming Riverside app update is expected to integrate the transcription tool.

The tool does not currently allow for audio preview or text editing within the platform.

For video podcasters, Riverside offers a course on how to use their platform effectively.