Show and Translate YouTube Captions

Video captions are interesting. For example, if you subtitle a video in the same language as the video, you can help people with low literacy improve their reading skills. Or if you’re in a meeting, you could watch a video silently and read the captions.

The TED conference is also thinking about subtitles. I think they’ve translated several TED talks into 25 different languages. They also provide interactive transcripts — click on a sentence and the video will jump to the right spot. Cool stuff.

YouTube is not standing still either. They recently added an option to turn captions on for embedded videos. For example, take this recent video:

By adding “&cc_load_policy=1” in a couple places in the embed code, I turned captions on by default. If you click in the bottom right, you can toggle closed captioning on and off.
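
As a rough illustration of that tweak, here is a minimal Python sketch that sets `cc_load_policy=1` on an embed URL. The helper name `with_captions_on` and the example URL are hypothetical, and the sketch assumes the URL carries its parameters in a standard `?key=value` query string. An embed snippet that repeats the URL in more than one place would need the same change applied to each copy.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_captions_on(embed_url: str) -> str:
    """Return embed_url with cc_load_policy=1 set in its query string."""
    parts = urlsplit(embed_url)
    # Keep the existing parameters, dropping any earlier cc_load_policy value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "cc_load_policy"
    ]
    params.append(("cc_load_policy", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Hypothetical embed URL; VIDEO_ID is a placeholder, not a real video.
    print(with_captions_on("http://www.youtube.com/v/VIDEO_ID?hl=en&fs=1"))
```

Parsing the query string instead of concatenating text avoids adding the parameter twice if it is already present.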

There’s also one more neat feature that you might not have seen. Did you see that Google Translate can now translate between 41 different languages? Well, you can auto-translate subtitles on videos as well. Click in the bottom right, then click the arrow by the “CC”. It looks like this:

Click to show CC options

Choose “Translate…” and then just select a language to translate the captions into. The Google Translate team just added seven new languages including Turkish, so let’s translate into Turkish:

Select Turkish

and in just a few seconds, you can watch my video and read the subtitles in Turkish!

Turkish subtitles

I’m sure the translation isn’t perfect, but it’s much better than the Turkish that I would write. 🙂

Learn more about YouTube’s captions and subtitles in their help center. There is also a project to host caption files with a Creative Commons license.

44 Responses to Show and Translate YouTube Captions (Leave a comment)

  1. Amazing development in YouTube nowadays 🙂

    And I hope the translation will help many non-English users and won’t contain many mistakes.

    Cheers

  2. Great addition, and definitely useful for those at work who can’t turn on the volume (or listen on headphones) and want to learn about a video without actually listening to it. It also improves reading comprehension, I guess.

    I must say that the quality of translation is good enough (I only tried Tagalog) knowing that this is an auto-translate function.

  3. Hi Matt,

    That’s excellent news!

    if you subtitle a video in the same language as the video, you can help people with low literacy improve their reading skills. Or if you’re in a meeting, you could watch a video silently and read the captions.

    Don’t forget captions are also extremely important for accessibility (ie so people with hearing disabilities can still access the content in a meaningful way). As someone who runs the web team at a government department, I can tell you that if we don’t have captions, we can’t put a video up.

    Anyway, a really positive move by Google and YouTube.

  4. Too bad that machine translation between English and a lot of Asian languages (Chinese, for instance) is still pretty much not working. It would be nice to have English subtitles translated to Chinese / Japanese for a change.

  5. Matt, how can I search for just videos containing, say original Chinese captions?

  6. Matt,

    That’s really interesting. Some days back, when I was browsing through videos on YouTube, I found this innovation of adding text captions.
    I hope it can help us add versions in other languages to a video that we’ve uploaded.
    Hang on, Matt, a question just came to mind about the duplicate content issue.
    What if a company launches its single video with country/language based video with captions. This would be a single website with language based captions i.e. may be in French, English, Portuguese etc. Would that be a spam ? or Google will hang it up as duplicate content ?
    I hope the discussion will be fruitful.

    Waiting for your reply Matt.
    Regards

  7. Mistake in first post ….. Pardon me please.

    What if a company launches its single video with country/language based video with captions. The company would be generating versions of a single video with different languages. The original thing is video which is common in all but caption languages are for example in French, English, Portuguese etc. Would that be a spam ? Will Google hang it up as duplicate content ?

    I hope the discussion will be fruitful.

    Waiting for your reply Matt.
    Regards

  8. Aqeel, you can actually add multiple language tracks to a single video on YouTube, so there’s no need to duplicate the video itself, and everyone can use the same link. Check out this video which has subtitles in several languages:
    http://www.youtube.com/watch?v=QRS8MkLhQmM

    I wouldn’t worry about translations (subtitles or otherwise) being viewed as duplicate content in any negative/spammy sense if it fits the needs of your users.

  9. Thanks, Matt, for this tip…I haven’t seen this yet, but have a variety of projects this could be used on.

    I’ll introduce the elephant in the room as I’m sure everyone is wondering…will the various translations be indexed by Google? For example, will these videos show up in universal search results based on if the search term the user is using appears in the translations?

    As for the duplicate content questions above, I’d assume Google has taken measures to avoid that (especially if this is on Youtube) considering the recent adoption of the rel=”canonical” tag for websites. I’ll let Matt answer specifically, but I’m personally not concerned especially if the “duplicate content” (the translated text) is on the same video and not “multiple URLs”.

    Thanks again,
    Shane

  10. Matt,

    I can see the search benefits of youtubers captioning their content. Ultimately search engines will be able to index all audio and video text automatically. Do you think it will be pertinent to assign different values to the different kinds of content i.e. less value for video/audio/translated content than the old fashioned traditional text.

    I can imagine in the not too distant future that people like the escapist http://www.escapistmagazine.com/videos/view/zero-punctuation could make a thousand pages of content a minute with their motor mouths. Surely not as valuable as http://www.wired.com/wired/archive/14.11/sixwords.html ?

    In other words my question is: at some point in the future will there be a need to implement structures that assign values to content and different media types in order to eliminate abuse of automatic translators/speech recognition etc…

  11. I have a solar site where I do lots of video interviews and I hate transcribing them. Once I have the video transcribed this tool would be a great addition.

    Does anyone know of a good speech to text transcription tool for audio from videos that works well?

  12. Very nice feature Matt!

    If there is one thing Google does very well, it is focusing on serving different languages and bringing non-English speakers onto the web. Hats off to G!

    You’ve mentioned several times that the risk of duplicate content in translated material is minimal. It would be appreciated if you could share your thoughts on these questions:
    1. Will this influence Google’s crawling capabilities for videos in languages other than English?
    2. Will the video appear for English search results as well as Spanish with the captioning on? Or does turning captioning on have no effect in multilingual SERPs?
    3. How would that affect the ranking of a video regardless of its main language?
    4. Will it be possible to manually add the CC for better translation purposes instead of depending on Google’s translation tools?

    Since we are on the subject of automated translation tools, there are many translation plugins that create tons of poorly translated content. However, those pages are still indexed and show up for some queries. In fact, some people use these tools with the sole purpose of creating more content, thereby giving their sites an extra punch. How is Google dealing with those automated pages? Do you consider them pseudo-spam, since the quality is so poor that it is difficult for a native speaker to understand?

    Thanks again,

    Augusto Ellacuriaga

  13. I don’t know of one off-hand, Dave, but I agree that would be wicked cool.

  14. Good question, Philipp. This url will search for the word “china” in videos with closed captions:

    http://www.youtube.com/results?closed_captions=1&search_query=china

    But I don’t know of a way to restrict the search by closed caption language right now.

  15. I wish I could get Om to read this post. Just a few weeks ago he wrote an article about Google not coming up with anything innovative anymore. I mean, this is a feat. Can you imagine turning the world’s biggest repository of videos into a more accessible medium for those who may have difficulty hearing? Not to mention the translated captions are just the most awesome idea ever.

    Oh wait, do I hear karaoke? Hahaha. This new technology will now allow me to sing with some of the music videos on Youtube without having to pull out a page from a lyric site.

    Matt, will you give those people who worked on this a pat on the back for me. This simple thing is really helpful and amazing. Send my sincere thanks.

  16. Matt, this is just cool. I’ve been criticizing Google’s expansion into the browser market and other areas on your blog, but this post and this development just make me say “this is where I want to see Google heading”. Google should be easing people’s lives, not stepping on the toes of other markets like Firefox, Android, the cellphone market, Internet service providers etc. Standing behind smart guys and making them successful is humble, but trying to take everything under Google’s control just makes me and people like me suspicious about Google’s future.

    Some nice words about Turkish translation: people are now able to read English content. This is absolutely great. I used to translate your posts / messages and explain them to non-English speakers, but now I am just sending links to the translation page. As you said, it is not perfect, but it is 60-70% understandable and people will fill in the blanks for the rest.

    Out of subject, I was really hoping to meet you next week in Austin but I had to change plans. I know, I will regret but hopefully will get the chance to meet you in Vegas.

  17. Stanford’s eCorner has also been doing this for a while. For example, see a talk from Marissa Mayer:

    http://ecorner.stanford.edu/authorMaterialInfo.html?mid=1532

    you can choose the language via Subtitles drop-down, as well as browse the video via transcripts (click on Transcript link next to Description).

    More features, slicker interface for captions/subtitles, and features similar to what TED is planning to do are coming soon.

  18. Dave, regarding your question about transcribing audio in video…the new Premiere CS4 does that. Try grabbing the demo from Adobe and seeing how it works for you. They showed it at Adobe Max back in October and it’s pretty sweet. It basically does it automatically and allows you to edit the transcription (since it’s not perfect). They even showed how to make the text searchable, so you could type in a word and it would allow you to jump to various places in the video where that word (or phrase) is spoken. LOTS of possibilities with that.

    -Shane

  19. Very interesting Matt! 🙂

    Three days ago I learnt how to watch a specific part of a video on YouTube. Now another one: translation options for the video captions. Quite interesting. Looking forward to more like this in upcoming posts.

    Thank you 🙂

  20. I would think Dragon’s voice recognition software could be used in a way for transcription... it will need to be edited for sure, but it is better than having to type the whole thing...

  21. Dude, i spent the last 20 minutes watching this canonical link thing … YOU’RE AWESOME!!!

    Thanks a lot.

  22. I was tinkering with YouTube subtitles too the other day – and came up with a proof-of-concept way of generating subtitles from Twitter streams…

    … the use case being – generating tweets as subtitles to overlay conference presentations, where there’s a twitter backchannel:

    http://ouseful.wordpress.com/2009/03/08/twitter-powered-subtitles-for-conference-audiovideos-on-youtube/

    Ideally the route needs automating, but I leave that as an exercise for the lazyweb, or at least until such a time as I have a couple of spare hours free…

  23. Matt
    You gave an example of a url that "will search for the word “china” in videos with closed captions" [ http://www.youtube.com/results?closed_captions=1&search_query=china ] so I’m thinking – if live tweets from an event can be associated with a video of an event (maybe because the video is posted with a link to a (now out of date!) upcoming record for that event in order to anchor it in time) then being able to search the tweets as captions/subtitles provides a crib for deeplink searching into the video? (But then, I guess the Goog is looking at audio indexing anyway?)

  24. this is excellent. I think this is going to help a lot of people such as travelers, churches and really any business. we just need to get all these to start adding captions.

  25. BTW, what kind of video recorder do you use? I’ve been thinking of creating a few commercials

  26. Hi Matt,
    Video subtitling is extremely useful for non-native speakers.

    I wrote about this a few months ago on my blog:
    http://vizualbod.com/articles/online-video-subtitling-is-not-just-for-deaf/

    The YouTube implementation is very close to my post 🙂
    Nice to see, that we as developers think so similarly about the best practices.

  27. Hi Matt, your blog is always a great source of information. This is an awesome breakthrough- how do the clickable captions figure under an SEO standpoint?

  28. Hi Matt, it’s really a useful tool. But I have a question for you: when can we listen to your videos in Turkish (not subtitles)?
    I mean transcription and dubbing with a computer voice. I’m 23 now, can I see it? 🙂

  29. Is there also embed code for translations?
    I’ve some videos with english track and wish embed code for some translations.
    Thanks in advance.

  30. I have to second Gand’s request. I want to know if there is a parameter I can pass in the embed code that will display whichever closed captions track I want for the video. This is an important requirement for me.

  31. @ Ilya
    If your video already has a subtitle track in your language, you can use "&hl=(your language)".
    I mean: my videos have English and Italian subtitles. With "&hl=en" I get English subtitles, with "&hl=it" I get Italian subtitles. But if I use "&hl=fr" I don’t get any translation to French.

    @Matt
    I’ve a question, having 2 or more subtitles tracks, is there a way to fetch the browser language without using “&hl=” tag? Thanks.

  32. It is a great idea, but the right-to-left languages, like Hebrew, aren’t written in RTL, so numbers and English words break the sentence structure.

  33. Thanks for all the instructions! Would it be too much if I ask you where in the You Tube embed did you insert the code you mention so that the captions appear automatically? Thanks again
    Carina

  34. I think this is going to help a lot of people such as travelers, churches and really any business. we just need to get all these to start adding captions.

  35. Matt I found a site that does do machine speech to text, it is called subply.com and I have done a few videos with them. It is free but their site is a bit buggy and doesn’t seem to be working right now.

  36. Dear Matt,

    I have a question regarding subtitles on YouTube videos. Let’s say I have a video with English, German and French subtitles. Is there any way to define a parameter in the URL or embedding code that will immediately select the right subtitle language?

    Many users are not familiar with where and how to change the language. If such a parameter is available, it would be much easier for me to present my multilanguage videos.

    Thank you in advance,
    Thomas

  37. PS: I have now seen the post with the "&hl=xx" parameter, but this doesn’t work …

    For example
    http://www.youtube.com/watch?v=Cm9onOGTgeM&feature=player_embedded&hl=es
    will not show the Spanish subtitles.

    Thomas

  38. How about HTML5 and H.264 videos as there are no more subtitles?
    Is there a way to show subtitles with H.264 codec and HTML5?
    Thanks in advance.

  39. Dear Sir,

    Google is offering auto-captioning services. It is pretty good but as you know, it is not quite perfect.

    I suggest that instead of over-relying on machines to do the job, we should really use the strength of ‘volunteer captioners’ to do the job.

    Just like in Google translate, you have the option to ‘Submit a better (human) translation’, there should be this option to ‘Submit a better caption(done by humans)’.

    You can even go one step further. Make the contributions of people who volunteer captions open and available on the web.

    Then let users or people who watch the video and captions vote for the most accurate captions.

    The most highly rated captions would ‘rise to the top’, as is happening to the most highly-rated comments in Youtube videos now.

    The way people speak and the accents of people around the world are just amazingly different, sometimes it is best to have a Mexican caption what another Mexican has just said in English.

  40. Matt, it is all great in theory to translate with Google Translate, but I think many of us still find Google translate not accurate in most of the cases… Any chance, you might enlighten us if it is gonna change in future?

  41. Hi Matt,

    I have got the same question as Fatih ARAMA. Is Google going to improve its Translate? I mean it is sometimes very useless. I sometimes regret losing time to attempt to translate an article into another language. I hope you guys will figure something out.

    Thanks

    Best

  42. Thanks for clearing this up Matt. I’ve been hunting for info on Youtube captions and how they relate to non-English markets, so this has been a great aid.

  43. If you want to explicitly select a CC language, you can use cc_lang_pref url parameter.
    Not sure why this is not better documented, but there is this:
    http://www.google.com/support/youtube/bin/answer.py?answer=140174

    -Jonathan

  44. @Jonathan … thanks man! You’re a lifesaver!
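
For anyone who wants to try the URL parameters from the comments above (`closed_captions=1` with `search_query` from comment 14, and `cc_lang_pref` from comment 43), here is a rough Python sketch that builds both kinds of URL. The function names are hypothetical. The `/embed/` URL form and the pairing of `cc_lang_pref` with `cc_load_policy=1` are assumptions not confirmed anywhere in this thread, so treat the result as something to test, not as a documented recipe. The video ID is the one from comment 37.

```python
from urllib.parse import urlencode


def caption_search_url(term: str) -> str:
    """Build a YouTube search URL restricted to videos with closed captions."""
    query = urlencode({"closed_captions": "1", "search_query": term})
    return f"http://www.youtube.com/results?{query}"


def captioned_embed_url(video_id: str, caption_lang: str) -> str:
    """Build an embed URL that asks for captions in caption_lang.

    Assumption: the /embed/ URL form, and that cc_load_policy=1 is
    needed alongside cc_lang_pref for the captions to show by default.
    """
    query = urlencode({"cc_load_policy": "1", "cc_lang_pref": caption_lang})
    return f"http://www.youtube.com/embed/{video_id}?{query}"


if __name__ == "__main__":
    print(caption_search_url("china"))
    print(captioned_embed_url("Cm9onOGTgeM", "es"))
```

As comment 31 reports for `&hl=`, the requested language may only take effect when the video already has a caption track in that language.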
