Adobe recently added a potential game-changer to the Premiere Pro subscriber arsenal of creative tools.
Previously available only at the enterprise level, Speech to Text, Automated Transcription and Caption Styling arrived on the Creative Cloud platform with the July 2021 release (version 15.4). Powered by Adobe Sensei, these tools are scary-good in terms of accuracy and destined to save aspiring content creators a ton of time and third-party expense.
Manually transcribing speech to text can be the most tedious and time-consuming part of any project, and it is easy to neglect, if it gets done at all. Automated transcription puts an end to the tedium and serves as a powerful ally in helping others locate answers more precisely and discover your work at the same time.
Let’s take a quick run through the process, starting with a few words on the difference between a transcript and a caption and how they’re beneficial in regard to search engine optimization (SEO).
Transcript vs. caption
A transcript is a written record of any narrative content present in an audio, video or film production. A caption is the graphic overlay of a transcript synchronized to move with the pace of discourse. Combine the two and we have what is commonly referred to as a subtitle.
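To make the distinction concrete, here is a minimal Python sketch. It is not anything Premiere exposes; the `Caption` class and `to_transcript` helper are hypothetical names invented for illustration. The idea is simply that a caption is text plus a start and end time, while a transcript is the same text with the timing dropped.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """One timed caption (hypothetical type): text plus when it appears and disappears."""
    start_seconds: float
    end_seconds: float
    text: str


def to_transcript(captions):
    """Join caption text in time order, discarding the timing."""
    ordered = sorted(captions, key=lambda c: c.start_seconds)
    return " ".join(c.text for c in ordered)


# Made-up sample data, deliberately out of order.
captions = [
    Caption(2.5, 5.0, "Today we look at Speech to Text."),
    Caption(0.0, 2.5, "Welcome back to the channel."),
]
print(to_transcript(captions))
```

Going the other way, from transcript to captions, is the hard part, because the timing has to come from somewhere. That is the work the automated tools take on.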
SEO loves subtitles
As the years progress, I’ve grown to rely on subtitles and enjoy using them whenever possible. A welcome solution in busy and volume-sensitive environments, subtitles provide an even more useful service to filmmakers and online content creators.
In a world where content is indeed king, a reliable transcription can now work in tandem with SEO to apply a quantifiable search metric to the intangible wavelength we call audio.
Think of a transcript as a continuous string of descriptive tags and keyword phrases, all relevant to your specific video content and all searchable from the transcript text directly in more than a dozen languages.
That’s some serious firepower and now accessible within a few short steps.
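As a rough illustration of the "string of keyword phrases" idea, the sketch below counts the most frequent meaningful words in a transcript. It assumes the transcript is available as plain text; the `top_keywords` function and the tiny stop-word list are hypothetical and say nothing about how any search engine actually ranks content.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {
    "a", "an", "and", "the", "to", "of", "in", "is", "it",
    "we", "for", "on", "with", "this", "that",
}


def top_keywords(transcript, limit=5):
    """Return the most frequent non-stop-words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(limit)


# Made-up sample text standing in for an exported transcript.
sample = (
    "Captions help viewers follow the video. "
    "Captions also give search engines text to index, "
    "and a transcript gives them even more text."
)
print(top_keywords(sample, limit=2))
```

Running something like this over an exported transcript is a quick way to check whether the words you want to be found for are actually spoken in the video.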
Transcribing a sequence
Step 1: Navigate to Window > Text to access the speech-to-text interface and select Transcribe sequence to create a new transcript.
Step 2: Select Audio on track and choose from the pull-down menu the audio track containing the narrative to be transcribed. In this instance, our target audio is track one.
Step 3: Select a target Language and Transcribe to start the process. Selected tracks are transferred to Adobe Sensei to be analyzed and processed, so an internet connection is required.
Processing time will vary depending on the length and number of audio tracks. In this instance, our 6-7 minute audio file finished processing in about a minute and a half on a standard Lenovo IdeaPad laptop with 16 GB of memory and no CUDA hardware acceleration.
Step 4: When the audio is analyzed, the transcribed text is displayed according to timecode and ready for proof. Here we can update the speaker name(s) by navigating to Unknown > Edit speakers …
If multiple voices are detected, each voice is automatically synced and positioned according to timecode as well.
Step 5: Our transcript is now ready for proof and correction of small errors, with the software adapting as we go. Simply click on any word in the transcript and the video timeline advances in tandem, providing a vastly improved editing experience.
Step 6: With our transcript proofed and errors corrected, we’re now ready for export. Let’s have a quick look at the available options.
- Re-transcribe sequence … enables users to save existing transcripts in a variety of languages.
- Export transcript … creates a .prtranscript file that can be accessed via the Import transcript … option.
- Display pauses as [ …] displays longer gaps between spoken words as ellipses.
- Export to text file … enables users to export a .txt file for client sharing and subtitle uploads for existing YouTube projects.
- Disable auto-scrolling holds the current position in the edit panel while scrubbing through the timeline.
Step 7: With an accurate transcript in hand, we have the option to re-transcribe and further export in additional languages. To do so, simply repeat the export process for each language and enjoy the added SEO potential.
To add subtitles (captions) prior to export, select Create captions to generate automatically from the current transcript, synchronized and positioned according to timecode.
On selecting Create captions, a dialog presents options for styling the captions and controlling how they appear on-screen.
When everything looks good, select Create. Premiere will automatically generate a Subtitles track and position each word on the timeline sequence according to its timecode.
Step 8: Utilizing the same handy advance-and-correct tools discussed previously, our captions are ready for final proof and correction.
Step 9: And finally, users can opt to Export to SRT file … (SRT is short for SubRip Subtitle), a standard text format common to most media players and video capture software.
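For anyone curious what is inside an SRT file, it is plain text: each cue has a running number, a timecode line in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, one or more lines of text, and a blank line before the next cue. The sketch below builds that layout in Python from a list of timed cues. The function names are hypothetical, and this only illustrates the file format, not how Premiere writes its export.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(blocks)


# Made-up sample cues.
cues = [
    (0.0, 2.5, "Welcome back to the channel."),
    (2.5, 5.0, "Today we look at Speech to Text."),
]
print(to_srt(cues))
```

Because the format is this simple, an exported SRT can be opened and corrected in any text editor if a typo slips through the proofing pass.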
It’s important to note that the YouTube subtitle interface does not currently support SRT upload, but this is likely to change in the future. It does handle text transcripts nicely, in my experience, when retrofitting existing projects.
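For cases where only an SRT file is on hand and a plain text transcript is what is needed, the conversion is straightforward to script. The sketch below is a minimal example under the assumption that the file follows the usual SRT layout (cue number, timecode line, text, blank line); `srt_to_text` is a hypothetical helper, not part of Premiere or YouTube.

```python
import re

# Matches an SRT timecode line such as "00:00:02,500 --> 00:00:05,000".
TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt):
    """Strip cue numbers and timecodes from SRT text, keeping the spoken lines."""
    kept = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        block_lines = block.splitlines()
        # Keep everything after the timecode line; blocks without one are skipped.
        for i, line in enumerate(block_lines):
            if TIMECODE.match(line.strip()):
                kept.extend(t.strip() for t in block_lines[i + 1:] if t.strip())
                break
    return "\n".join(kept)


# Made-up sample SRT content.
sample_srt = """1
00:00:00,000 --> 00:00:02,500
Welcome back to the channel.

2
00:00:02,500 --> 00:00:05,000
Today we look at Speech to Text.
"""
print(srt_to_text(sample_srt))
```

In practice, Export to text file … from the transcript panel gets the same result without any scripting, so this is mainly useful when the original project is no longer available.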
Step 10: All that remains is to start reloading those existing projects into Premiere, repeat the steps above and reap the rewards of offering viewers a quality viewing experience while enhancing your SEO performance at the same time.
The best part is you don’t even have to render. Simply open a project in Premiere, verify audio and video are in sync and start transcribing.
And there we have it. Looks like we can officially say goodbye to the monotonous play-pause-type days of old and reclaim a little creative sanity.
Whether baking in captions on the front end or generating transcripts to add after the fact, Adobe Speech to Text is a time-saving beast and a welcome addition to the arsenal of creative assets.
Have you tried these new features? If so, what did you think? Share your thoughts in the comments and join us over at the Photofocus Community to keep the conversation going.