Captioning Your Videos - Massachusetts Cultural Council

If you’ve made a video but haven’t added captions, have you really said enough?

You’ve created a video. It could be a work of art itself, or a documentation of another work (a dance or interactive installation, for instance), or a reading of a literary work, or an explanatory video, or any number of the ways creative people utilize moving images. If your video includes sound of any kind, we believe it’s essential to add captions to make the experience more accessible.

Why
The most important thing to take from this article isn’t how to caption videos (though we hope to help on that front), but that you should. Access is about equity (in June, our agency’s governing Council unanimously reaffirmed our commitment to advance diversity, equity, and inclusion). What’s more, accessibility is integral to the work of an artist. Artists create work to reach an audience, and accessibility efforts – like captions – increase that reach. Captions open up your work to more audiences, including those who are hard-of-hearing, deaf, or in any way restricted in hearing or processing language.

We understand that artists are exacting about the way their work appears in the world. We encourage artists to make accessibility, including captions in videos, a priority, and then be exacting about the way they do it.

Technology is always changing, and by the time you read this, some of the software or other tools we mention may have evolved. But the importance of increasing the accessibility (and reach) of your work remains constant.

How
If you have the resources, there are services that will caption your videos for you. Type “video captioning service” into a search engine and you’ll surely find some. But this article will focus on doing it yourself.

There are benefits to the DIY approach. First, when transcribing sound, it’s important to stay as close to the “heard” experience as possible. But there may be tricky decisions about how best to convert different moments into text, and wouldn’t you rather be the one to make those decisions? Second, transcribing and creating captions will force you to do a lot of close listening to your work. You may make discoveries you hadn’t noticed before – who knows, it may lead to positive changes in the work itself.

Transcription and Timing
Creating captions involves both transcribing language and sounds from video and aligning that transcription with the right timecodes. There are several tools you can use to do both at once. One option is to create timed captions in YouTube: upload your video to YouTube, then select “Subtitles” in the YouTube Studio menu. Even if YouTube isn’t the final (or only) destination of your video, you can always create the captions there and export a captions file once you’re done (more on that later).

If you’d rather not transcribe from scratch, you can use YouTube’s automatically generated captions and edit them. Speech recognition tools used by major platforms are improving rapidly – and you may find that you don’t have to change much. Still, you’ll want to fix mistakes, capitalization, punctuation, awkward line breaks, and anything else that might hinder comprehension or that doesn’t match the intent of the moment.

Alternatively, you can create captions using software called CADET (which stands for Caption and Description Editing Tool). CADET is free, downloadable caption-authoring software created by the National Center for Accessible Media (NCAM) at WGBH.

We’ll be blunt: no matter which tool you use, the process of transcribing can be time-intensive and taxing. You’ll play, pause, type; play, pause, type, over and over again, until you reach the end of the video. Then you can watch and see what adjustments you need to make – for instance, making sure text stays on screen long enough to be read, but doesn’t linger so long as to be distracting.

Every tool or platform has its strengths and quirks. Because tools are always changing, we won’t dive into the how-to’s of using any one tool. But here are a few things to keep in mind while transcribing:

1. You may be tempted to correct grammar or other issues in the spoken text. Generally, it’s considered best practice to keep your transcription as close to the spoken text as possible, mistakes and all. If you absolutely have to make adjustments, try to keep them minor and unobtrusive.

2. Be sure to caption not just language, but other sounds. Caption sounds in brackets, lowercase, for example: [laughing] or [playful music].

3. Captions should be synchronized with the speech or sound they describe. But each text block should also stay on screen long enough to be read.

4. As with any new skill, there is going to be a learning curve. You’re likely to make mistakes. So be patient with yourself. And don’t hesitate to use the “Help” feature of each tool – or even do an Internet search – if and when questions arise.
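
If you’re comfortable running a short script, the timing advice above can be partly automated. Here is a minimal Python sketch, assuming your captions are saved as an SRT file, that flags captions that may disappear too quickly to read. The threshold of 17 characters per second is our own assumption, not a rule from any tool mentioned here, and the sample captions are made up for illustration.

```python
import re

# Assumed comfortable reading rate; adjust it to suit your audience.
MAX_CHARS_PER_SECOND = 17

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four parts of an SRT timestamp to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def find_rushed_captions(srt_text, max_cps=MAX_CHARS_PER_SECOND):
    """Return (caption number, characters per second) for captions that may be too fast."""
    rushed = []
    # Captions in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # not a number + timing + text block
        match = TIMING.match(lines[1])
        if not match:
            continue
        parts = match.groups()
        duration = to_seconds(*parts[4:]) - to_seconds(*parts[:4])
        text = " ".join(lines[2:])
        rate = len(text) / duration if duration > 0 else float("inf")
        if rate > max_cps:
            rushed.append((lines[0].strip(), rate))
    return rushed


# Hypothetical sample captions, for illustration only.
SAMPLE = """1
00:00:01,000 --> 00:00:04,000
[playful music]

2
00:00:04,500 --> 00:00:05,000
Thank you all for coming tonight.
"""

print(find_rushed_captions(SAMPLE))  # prints [('2', 66.0)]
```

A script like this only catches pacing problems. It can’t tell you whether a caption matches the intent of the moment, so you’ll still want to watch the video through.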

Moving Platforms
Let’s say you’ve worked hard and created captions for your video using CADET or YouTube. Your work may not be finished. If you want to show the video on another platform (such as Facebook or Vimeo), you’ll need to create a caption file compatible with that platform. Both CADET and YouTube allow you to export caption files in a range of formats, including SRT and WebVTT. We’ve found it useful to search the “Help” area of different platforms to find their recommended caption file format.
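
To give a sense of what these files contain: SRT and WebVTT are both plain text, and for simple captions they differ mainly in a header line and in how milliseconds are punctuated. The sketch below, in Python, converts basic SRT text to WebVTT. It assumes plain captions with no styling or positioning, the function name is our own, and the sample caption is made up. In most cases, exporting the format you need directly from your captioning tool is the simpler route.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_webvtt(srt_text):
    """Convert SRT caption text to WebVTT text (basic cues only)."""
    # Normalize Windows line endings and drop a byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # WebVTT writes milliseconds after a period rather than a comma.
    text = SRT_TIMING.sub(r"\1.\2 --> \3.\4", text)
    # A WebVTT file begins with the line "WEBVTT" followed by a blank line.
    return "WEBVTT\n\n" + text + "\n"


# Hypothetical one-caption SRT file, for illustration only.
sample_srt = "1\n00:00:01,000 --> 00:00:04,000\n[playful music]\n"
print(srt_to_webvtt(sample_srt))
```

The caption numbers from the SRT file are left in place, since WebVTT allows an optional identifier line before each timing line.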

But what if you want to screen your video offline, directly from a file? Let’s say you are projecting video in a gallery or screening a video at an event, and you are uncertain about consistent Internet connectivity. In this case, it may be beneficial to “hardcode” the captions into your video file.

Think of it this way: tools like CADET and YouTube Studio help you create captions for your video by creating a separate “captions” file to augment the video file. Platforms like YouTube or Facebook draw from both files. Hardcoding those captions means you’re combining the original video file and the caption file into one, so any media player software can play the video with captions already embedded.

There are no doubt numerous ways to do this, but a tool we’ve used to hardcode captions is a free software program called HandBrake. To hardcode captions using HandBrake, you will have to save your captions file in a file type HandBrake can process, such as SRT.

Once you open HandBrake, click “Open Source” to open the video file. Then, select the “Subtitles” tab, click “Import Subtitle,” select the SRT file, check the “Burn in” option, and start encoding.
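
If you have many videos to process, the same steps can be scripted. The Python sketch below builds a command for HandBrakeCLI, the command-line version of HandBrake. Note the assumptions: HandBrakeCLI is a separate download, the `--srt-file` and `--srt-burn` options are as we understand them from HandBrake’s command-line documentation (run `HandBrakeCLI --help` to confirm them for your version), and the file names are hypothetical.

```python
import shlex


def build_burn_in_command(video_path, srt_path, output_path):
    """Build (but do not run) a HandBrakeCLI command that burns SRT captions into a video."""
    return [
        "HandBrakeCLI",
        "-i", video_path,        # source video
        "-o", output_path,       # new file with captions burned in
        "--srt-file", srt_path,  # caption file to import
        "--srt-burn",            # burn the imported captions into the picture
    ]


# Hypothetical file names, for illustration only.
command = build_burn_in_command(
    "interview.mp4", "interview.srt", "interview-captioned.mp4"
)
print(shlex.join(command))
```

The sketch only prints the command so you can review it first. To actually run it, you could pass the list to `subprocess.run(command, check=True)`.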

The video embedded above, about Yary Livan (Traditional Arts Fellow ’12), includes hardcoded captions. If you watch it, you’ll notice that we didn’t follow all of our own best practices. While the captions include all of the spoken language, there are no captions for the orchestral music by Scott Wheeler (Music Composition Fellow ’17, ’05), which is a vital component of the experience.

Which just reinforces the point that there’s a learning curve to this process. We’re still learning, too. To that end, please feel free to leave comments about your own best practices in captioning.

Related reading:
What Is Access Now? by Charles Baldwin, Program Officer, Mass Cultural Council UP Designation, Innovation and Learning Network
Read the U.S. General Services Administration’s guidelines on creating accessible video, audio, and social media posts
Listen to a reading and interview with Sara Hendren, a writer, artist, design researcher, and scholar of disability studies

Image and media: “NUF CED” pin, produced by Michael T. McGreevey around the time of the 1903 World Series, from the Boston Public Library Flickr page; video by Mass Cultural Council about Yary Livan (Traditional Arts Fellow ’12).

Disclaimer: Any reference to specific service, software, or online platform is meant for informational purposes and not as an endorsement of any service, software, or platform.
