Checking your translation for errors using Translate Toolkit (TTK) and/or Virtaal

Updated 6 years ago

    Using the Translate Toolkit

    There are tools that you can use to check for hard-to-spot errors, such as an extra space at the end of a line, a typo in an acronym, a comma instead of a full stop, etc. One such tool is the Translate Toolkit (TTK) by TranslateHouse (formerly Locamotion, formerly Translate.org.za).

    TTK works by checking your translation file for specific types of errors, and then exporting detected errors to a separate file. For example, if we run a check for incorrect trailing spaces, TTK will extract all segments that contain potentially incorrect trailing spaces and save those segments in a separate file.
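    To make the idea concrete, here is a minimal sketch of what a trailing-space check does in principle. It is not TTK's own code: the function name, the sample TS content and the exact rule (flag a segment when source and translation end with a different number of spaces) are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def trailing_space_mismatches(ts_xml: str):
    """Return (source, translation) pairs whose trailing spaces differ."""
    root = ET.fromstring(ts_xml)
    flagged = []
    # Qt TS files keep each segment in a <message> element inside a <context>.
    for message in root.iter("message"):
        source = message.findtext("source") or ""
        translation = message.findtext("translation") or ""
        if not translation:
            continue  # skip untranslated segments
        src_trailing = len(source) - len(source.rstrip(" "))
        tgt_trailing = len(translation) - len(translation.rstrip(" "))
        if src_trailing != tgt_trailing:
            flagged.append((source, translation))
    return flagged


# Made-up sample data, only for demonstration.
SAMPLE_TS = """<TS version="2.1" language="af">
  <context>
    <name>Demo</name>
    <message>
      <source>Save as: </source>
      <translation>Stoor as:</translation>
    </message>
    <message>
      <source>Cancel</source>
      <translation>Kanselleer</translation>
    </message>
  </context>
</TS>
"""

flagged = trailing_space_mismatches(SAMPLE_TS)
```

    Here only the first segment is flagged, because the source ends with a space and the translation does not. TTK does the same kind of thing for many more checks and writes the flagged segments to a new file instead of a list.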

    (TTK is actually intended for translators who are able to upload corrected versions of their files, but we can use TTK as well, even though we have to make edits manually in Transifex. TTK is designed to work with PO files, which is why one would normally convert one's TS file to PO first, but TTK can work directly on TS files, which is fine for us, since we're just interested in seeing the errors.)

    TTK is also built into a Windows program called Virtaal, which can also open TS files, but it's more productive to just use TTK directly. The method described below requires Windows (via a command window or by using a .BAT batch file), but one can also use TTK on Linux and Mac.

    Step 1: Download the latest user version of TTK

    https://sourceforge.net/projects/translate/files/Translate%20Toolkit/1…

    Step 2: Install TTK

    TTK will attempt to install in the Program Files folder, but that is not ideal, because you will want to work on translatable files inside the folder where TTK is installed, and Windows restricts changes to files in the Program Files folder. So, install TTK in an accessible location such as D:\TTK.

    Step 3: Download the two BAT files and put them in the same folder as TTK

    https://musescore.org/sites/musescore.org/files/2019-01/pofilterbats_0…

    Step 4: Get the latest translations from Transifex as a TS file

    a) Go to https://www.transifex.com/musescore/musescore/content/
    b) Click the file you want to download (e.g. QT - MuseScore)
    c) Click your language
    d) Click "Download for use"

    or, since there are currently only three files, use these links, customised for your language:
    https://www.transifex.com/musescore/musescore/musescore/LANGUAGECODE/do…
    https://www.transifex.com/musescore/musescore/tours/LANGUAGECODE/downlo…
    https://www.transifex.com/musescore/musescore/instruments/LANGUAGECODE/…

    Step 5: Rename the downloaded file

    Rename to "musescore.ts" and copy it to the folder where TTK and the BAT files are stored.

    Step 6: Run TTK

    ... by double-clicking either "all filters for TS.bat" or "all filters for PO.bat". The "for PO" version will take a little longer because it will first convert the TS file to a PO file.

    The .BAT file will run a number of checks, and if there are any flagged segments, they will be stored in separate files. For example, all segments with acronym errors will be exported to a file named "musescore [acronyms].ts" (or ".po").
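    If you are curious what "one check, one output file" looks like as commands, the sketch below builds one pofilter command per check. It is only an approximation: I am assuming that the BAT files call TTK's pofilter tool once per check, the list of check names is a small sample, and the function name is made up. The real BAT files may differ.

```python
from pathlib import Path

# Sample check names for illustration; the real BAT files may use a different set.
CHECKS = ["acronyms", "endwhitespace", "simplecaps"]


def build_pofilter_commands(input_file: str, checks=CHECKS):
    """Build one pofilter command per check, each with its own output file."""
    source = Path(input_file)
    commands = []
    for check in checks:
        # e.g. "musescore.ts" becomes "musescore [acronyms].ts"
        output = source.with_name(f"{source.stem} [{check}]{source.suffix}")
        commands.append(
            ["pofilter", "-t", check, "-i", str(source), "-o", str(output)]
        )
    return commands


commands = build_pofilter_commands("musescore.ts")
```

    Each entry in the list can be passed to subprocess.run() to actually run the check, provided that pofilter is installed and can be found from the current folder.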

    Step 7: Check the exported files

    ... and fix the errors (if any) in Transifex. Both TS and PO files can be opened in a text editor, or you can view the files in a viewer offline or online, e.g. https://localise.biz/free/poeditor (handles both PO and TS files).

    Notes

    1. Don't fix the errors by editing the exported files. Think of the exported files as "error reports". You need to fix the errors via Transifex online, in the same way as you would usually edit translations in Transifex online.

    2. The fact that TTK flags something as a potential error doesn't mean it really is an error. For example, if you translated "OK" as anything else, TTK will flag it as an acronym error.

    3. If you're curious about the meanings of the checks, read here:
      http://docs.translatehouse.org/projects/translate-toolkit/en/latest/com…

    4. You can run multiple checks in a single command, which creates a single export file for multiple types of issues, but I find that this is not as useful as it may sound.

    5. It is likely that the very latest version of TTK contains extra checks that may be useful to us, but the versions after version 1.9.0 require advanced computer skills to make them work. Version 1.9.0 is the most recent easily installable version.

    6. If you do use PO files, remember that PO files encode certain characters that TS files don't (e.g. line breaks in PO files are shown as "\n"). In fact, it's best to just open the files in a suitable viewer, not in a text editor, unless you're sufficiently familiar with the underlying formats.
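    To illustrate note 6, here is a small sketch of how a translation with a line break ends up looking inside a PO file. The function names are made up, and real PO writers (including TTK) may also wrap long strings over several quoted lines, which this sketch does not do.

```python
def po_escape(text: str) -> str:
    """Escape a string the way it appears inside a PO msgid or msgstr."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes stay intact
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )


def po_entry(source: str, translation: str) -> str:
    """Format a single, unwrapped PO entry."""
    return f'msgid "{po_escape(source)}"\nmsgstr "{po_escape(translation)}"\n'


entry = po_entry("Line one\nLine two", "Reël een\nReël twee")
print(entry)
```

    In the printed entry, the line break inside each string shows up as the two characters \n, whereas a TS file would store it as a real line break inside the element.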

    Using Virtaal

    TTK is also built into a Windows program called Virtaal, which can also open TS files (simply drag and drop the TS file onto Virtaal).

    Several features in Virtaal work differently from other tools (they experimented a bit...). Filtering in Virtaal is called "Navigation", and when a navigation mode is selected, pressing ENTER jumps to the next segment that matches the selected filter. In other words, when a filter is active, you can still see all segments (the idea behind this was to allow users to see context).

    virtaal view1.png

    In the image above, the "Quality Checks" filter is enabled, and we have chosen the "Simple capitalization" check. Virtaal also shows how many segments match the check, and the name of the check is also mentioned in the active segment (in this case, the segment got flagged by both "Acronyms" and "Simple capitalization" checks).

    virtaal view2.png

    In the image above, the "Search" filter is enabled. Unfortunately, you can't specify source or target text. I find this to be the most useful feature of Virtaal for us, for it allows one to search existing translations much faster than is possible in Transifex (although Transifex allows more options).

    Attachment: pofilterbats.zip (1007 bytes)