easyjasub

easyjasub is a tool that adds furigana and in-line translation to Japanese subtitles, for language learning. It takes text subtitles in Japanese and in another language (say English) and combines them into picture-based subtitles.

It is designed to work as a command-line utility or as a Java library that can be integrated into other applications, such as a custom UI.

easyjasub is still in the alpha development stage: you may find defects and it may be difficult to use. Do not hesitate to ask for help by opening a new ticket.

Visit the project blog to get support and to discuss the usage and development of easyjasub.

The automatic parsing of Japanese is not always accurate, so you may see incorrect words in the subtitles. This is unfortunately unavoidable.

Links

Release packages and support pages are available on SourceForge.net:

Source code is available on GitHub:

If you are a developer and want to use easyjasub as part of your application, it is deployed on oss.sonatype.org, where you can also find all releases and development snapshots.

Setup

This application requires third-party software to run. Please make sure you install:

Not everything in the above list is mandatory, but please try to install each item in its default installation directory; otherwise you may face issues with the tool.

Subtitles are rendered as HTML pages. Below is an example text that you can use to verify that the Cinecaption and GT200001 fonts are available:

まほー せかい
そこ 魔法 世界

The rendersnake Java library is used to write HTML files, and the Kurikosu library is used for Japanese text conversion.

Additional software may be required: you may need to edit or produce text-based subtitles, and to play your video with an external subtitle file.

Basic usage

  1. As input, you need to provide Japanese and translated subtitles in text format; both must be in sync with your video. Supported formats for input text subtitles are: EBU's STL, .SCC, .ASS/.SSA, .SRT, TTML (the subtitleConverter library is used to read the text subtitles). Make sure each subtitle file name has an extension matching one of the supported formats above, and copy the files into a new empty directory.
  2. The generated subtitles use the timing of the Japanese subtitles; the secondary subtitles may be repeated across multiple lines if their timings do not match perfectly (this is fine, since sentences are normally split or joined in translation). Synchronization of the input text subtitles is very important to associate them properly. Be sure that both subtitles are in sync with the video; if they are not, you need to edit the subtitles (try Aegisub). easyjasub has been tested using only UTF-8 encoding, so you may face issues if the subtitle files are not in UTF-8; in that case, edit the subtitles and save them using UTF-8 encoding.
  3. Download the easyjasub release package (choose .tar by default, or .zip if you are on Windows) and unpack it in the same directory.
  4. Open a terminal or a command prompt (on Windows, execute "cmd") and run:
    java -jar easyjasub-cmd.jar -ja subtitle.ja.srt -tr subtitle.en.srt
    Where subtitle.ja.srt is the filename of the Japanese subtitles and subtitle.en.srt is the filename of the translated subtitles. On Windows you can use the executable file:
    easyjasub.exe -ja subtitle.ja.srt -tr subtitle.en.srt
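
Step 2 notes that easyjasub has only been tested with UTF-8 input, so a quick encoding check before running the command may save a failed run. The following is a minimal sketch, not part of easyjasub: the helper name is_utf8 is hypothetical, and it assumes the standard iconv utility is installed.

```shell
# Hypothetical helper (not part of easyjasub): succeeds if the file is valid UTF-8.
# It relies on iconv failing when it meets a byte sequence that is not valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use, with the file names from the command above:
#   for f in subtitle.ja.srt subtitle.en.srt; do
#       is_utf8 "$f" || echo "$f is not valid UTF-8; re-save it as UTF-8 first"
#   done
#
# If the original encoding is known (Shift_JIS is only an example here),
# iconv can also do the conversion:
#   iconv -f SHIFT_JIS -t UTF-8 subtitle.ja.srt > subtitle.ja.utf8.srt
```

Plain ASCII files also pass this check, since ASCII is a subset of UTF-8.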

A new folder is created to store output and intermediate files, and if the process is successful, a command to run BDSup2Sub and convert the subtitles is suggested.

Subtitles look like this: Sample easyjasub subtitle picture

If something fails or you want to try again with different options, be sure to delete all files that easyjasub produced. The tool does not overwrite existing files but reuses them.

wkhtmltoimage is searched for in its default installation folder. If it is not found, you can use the -wk option to specify the full path to the executable.

java -jar easyjasub-cmd.jar -ja subtitle.ja.srt -tr subtitle.en.srt -wk /usr/local/bin/wkhtmltoimage
easyjasub.exe -ja subtitle.ja.srt -tr subtitle.en.srt -wk D:\wkhtmltopdf\bin\wkhtmltoimage.exe
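
If wkhtmltoimage is installed outside its default folder but is on your PATH, its full path can be looked up rather than typed by hand. The following is a minimal sketch under that assumption; find_tool is a hypothetical helper, not an easyjasub feature.

```shell
# Hypothetical helper: print the location of a program found on PATH,
# or fail (non-zero exit status) if it is not there.
find_tool() {
    command -v "$1" 2> /dev/null
}

# Example use: pass the discovered path to the -wk option described above.
#   if WK=$(find_tool wkhtmltoimage); then
#       java -jar easyjasub-cmd.jar -ja subtitle.ja.srt -tr subtitle.en.srt -wk "$WK"
#   else
#       echo "wkhtmltoimage is not on PATH; pass its full path with -wk" >&2
#   fi
```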

You can customize the elements displayed in the subtitles: for example, you can show romaji, hide the translation, or even show only the Hiragana reading while hiding the Kanji. To do so, use the following options:

Here is a sample without translation and Furigana but with Romaji: Sample easyjasub subtitle picture

And here is a sample without Kanji: Sample easyjasub subtitle picture

Using JMdict

easyjasub can use a JMdict (http://www.edrdg.org) file to add a rough in-line translation of Japanese words. You can enable dictionary usage with the following options:

You need to download a JMdict file to your system. Only English is supported for the dictionary translation.

Using MeCab

Starting from version 0.3, easyjasub uses the Lucene Kuromoji Analyzer for parsing Japanese text. If you want to use the MeCab tool instead, you can download it from the MeCab download page on Google Code. Be sure to select UTF-8.

MeCab is searched for in its default installation folder. If it is not found, you can use the -mc option to specify the full path to the executable.

java -jar easyjasub-cmd.jar -ja subtitle.ja.srt -tr subtitle.en.srt -mc /opt/mecab/mecab
easyjasub.exe -ja subtitle.ja.srt -tr subtitle.en.srt -mc D:\MeCab\bin\mecab.exe

Using online converter

As an alternative to MeCab, you can use the online Kanji Converter available at nihongo.j-talk.com. You need to use the Kanji Converter website manually:

With the online converter you get an automatic translation for some of the words: Sample easyjasub subtitle picture

Kanji Converter internally uses ChaSen, so the accuracy of Japanese parsing may differ from what you obtain using MeCab or Lucene.

Advanced usage

easyjasub has many options to tune subtitle generation; run it with the -h option to view them. Many options, if left unspecified, get a default value that depends on the other options you have set.

Set the width and height of the subtitles with the --width and --height options to best fit your screen, or to adapt to the video:

easyjasub <opts> --width 1024 --height 768

Try out subtitle generation on a subset of the subtitle lines with --select-lines, for example to process lines up to line 10:

easyjasub <opts> --select-lines -10

Or you can select an interval:

easyjasub <opts> --select-lines 5-20

Nearly all options accept the special value "disabled", which suppresses the corresponding feature; for example, to not produce the text file with transcribed subtitles:

easyjasub <opts> --output-text disabled

Or you can disable the usage of wkhtmltoimage even if it is found in your system:

easyjasub <opts> --wkhtmltoimage disabled

You can fine-tune the matching of subtitle lines by reducing or increasing the difference in milliseconds that you accept between Japanese and translated timestamps:

easyjasub <opts> --match-diff 500 --approx-diff 400

Note that the matching algorithm is currently not very sophisticated.
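
To give a feel for what a millisecond tolerance means, the sketch below converts SRT timestamps to milliseconds and checks whether two of them fall within a given difference. It only illustrates tolerance-based comparison in general: it is not easyjasub's matching algorithm, the helper names srt_to_ms and within_ms are hypothetical, and the exact roles of --match-diff and --approx-diff are not modeled.

```shell
# Convert an SRT timestamp (HH:MM:SS,mmm) to milliseconds.
srt_to_ms() {
    printf '%s\n' "$1" | awk -F'[:,]' '{ print ($1 * 3600 + $2 * 60 + $3) * 1000 + $4 }'
}

# Succeed if two millisecond values differ by at most the given tolerance.
within_ms() {
    delta=$(( $1 - $2 ))
    if [ "$delta" -lt 0 ]; then
        delta=$(( 0 - delta ))
    fi
    [ "$delta" -le "$3" ]
}

# Example: a Japanese line and its translation start 400 ms apart.
ja=$(srt_to_ms "00:01:02,500")
en=$(srt_to_ms "00:01:02,900")
if within_ms "$ja" "$en" 500; then
    echo "within tolerance"
else
    echo "too far apart"
fi
```

With a tolerance of 500 ms the two timestamps above count as matching; with a tolerance of 300 ms they would not.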

easyjasub performs the following actions:

  1. Reads the Japanese subtitles file
  2. Converts the Japanese subtitles to plain text and writes them to a .txt file
  3. Reads the translated subtitles file
  4. Parses the Japanese text
  5. Creates the output folder to store files
  6. Writes a CSS file to style the subtitles
  7. Writes one HTML file for each subtitle line
  8. Renders the HTML files to PNG pictures
  9. Writes the BDN XML file

easyjasub never overwrites an existing file, so make sure you manually delete the generated files if you need to run the tool again. You can exploit this behavior to customize some of the files: run the tool, edit the files you want to customize, delete all the others, and run the tool again. This is useful, for example, to edit the CSS file and change the text style of the generated subtitles.
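
The edit-and-rerun cycle described above can be sketched as a small helper that keeps the CSS file and removes every other generated file. This is only a sketch: keep_only_css is a hypothetical name, the directory is whatever output folder easyjasub created on your system, and it assumes that all files to be regenerated live inside that folder. It deletes files, so check the path before running it.

```shell
# Hypothetical helper: delete every regular file under a directory
# except CSS files, so that a second run regenerates the rest.
keep_only_css() {
    dir=$1
    # Refuse to run on anything that is not an existing directory.
    [ -d "$dir" ] || return 1
    find "$dir" -type f ! -name '*.css' -exec rm -f {} +
}

# Example use (replace the placeholder with the real output folder):
#   keep_only_css path/to/output-folder
#   java -jar easyjasub-cmd.jar -ja subtitle.ja.srt -tr subtitle.en.srt
```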

Contributing

This software is in an early alpha stage. If you find problems or need help, do not hesitate to open a new ticket. If a particular subtitle line is inaccurate, attach the .html file that was produced for it.

If you find the tool interesting and would like to extend it, go to the easyjasub project page on GitHub and join!
