The Power of SDLXLIFF Toolkit and Perfect Match

No matter how powerful your computer is, sooner or later a huge translation file will make a busy day nerve-racking. I am talking about files that contain more than 100,000 words with only 1,000 new words to translate. Normally you would finish this amount of work in two hours or less, but with a file this big your computer slows down and the job takes longer.

There is a tool that can slice these 1,000 new words into a separate file: SDLXLIFF Toolkit. I can see that around 3,500 people have downloaded it, but I do not know whether everyone uses its slicing feature, so I thought it would be useful to talk about it.

We just need to specify the segments to be sliced and then click the Slice button to export them into a new .sdlxliff file. We can extract fuzzy segments, new segments, 100% matches or only unlocked strings. I will slice the not-translated strings in my file with the options below.
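If it helps to think of slicing in code terms, here is a rough Python sketch of the filtering idea. It is not how SDLXLIFF Toolkit is implemented and it does not read real .sdlxliff files. The `Segment` class, the `slice_segments` function, the status names and the sample strings are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical, simplified stand-in for one segment of a bilingual file."""
    seg_id: int
    source: str
    target: str = ""
    status: str = "not translated"
    locked: bool = False


def slice_segments(segments, statuses=None, unlocked_only=False):
    """Return only the segments that pass the chosen filters."""
    picked = []
    for seg in segments:
        # Skip segments whose status was not selected for slicing.
        if statuses is not None and seg.status not in statuses:
            continue
        # Optionally skip locked segments.
        if unlocked_only and seg.locked:
            continue
        picked.append(seg)
    return picked


# A tiny made-up "big file": mostly translated, a few new strings.
big_file = [
    Segment(1, "Hello", "Bonjour", status="translated", locked=True),
    Segment(2, "New feature"),
    Segment(3, "Goodbye", "Au revoir", status="translated"),
    Segment(4, "Another new string"),
]

sliced = slice_segments(big_file, statuses={"not translated"})
print([seg.seg_id for seg in sliced])
```

The point is only that the sliced file holds a small subset of the original segments, which is why it opens and saves so much faster.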

Slicing takes longer as the file grows, but it is worth the wait. The resulting file will look like this.

Once you have translated these segments, you could of course update your TM with them and use the pre-translate function to apply your translations to the original file. I assume you know how to do that. In this post, I will use the Perfect Match function (only available in the Pro and higher versions of Studio), which is a safer method. With Perfect Match, you can be sure that only the related segments are updated and that the status of those segments is under your control. For example, I set the newly translated strings’ status to signed-off and I want to see that status in the original file, like this.

We just right-click the original file and choose Apply Perfect Match under Batch Tasks. We match it with the extracted and translated file, remembering to choose the “Use the original translation origin and status” option so that the related segments keep the same status.
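To show why this is safer than a blanket pre-translate, here is a small Python sketch of the idea. It is only a conceptual model, not Studio's actual Perfect Match logic, which the steps above do not describe. I am assuming that a segment counts as a match when its ID and source text are identical. The `Segment` class, the `apply_by_id` function, the `keep_status` flag and the fallback status "translated" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical, simplified stand-in for one segment of a bilingual file."""
    seg_id: int
    source: str
    target: str = ""
    status: str = "not translated"


def apply_by_id(original, translated, keep_status=True):
    """Copy translations from the sliced file back into the original.

    Assumption: a segment matches when both its ID and its source text agree.
    Segments without a match are left untouched.
    """
    by_id = {seg.seg_id: seg for seg in translated}
    updated = 0
    for seg in original:
        match = by_id.get(seg.seg_id)
        if match is None or match.source != seg.source:
            continue
        seg.target = match.target
        # keep_status=True loosely mirrors keeping the original status;
        # the fallback value is an assumption made for this sketch.
        seg.status = match.status if keep_status else "translated"
        updated += 1
    return updated


original = [
    Segment(1, "Hello", "Bonjour", status="translated"),
    Segment(2, "New feature"),
    Segment(3, "Another new string"),
]
sliced_and_translated = [
    Segment(2, "New feature", "Nouvelle fonction", status="signed off"),
    Segment(3, "Another new string", "Une autre chaîne", status="signed off"),
]

count = apply_by_id(original, sliced_and_translated)
print(count, [seg.status for seg in original])
```

Only the segments that came from the sliced file change, and they carry over the status I gave them, while everything else in the original file stays exactly as it was.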

In the original file, the result will look like this.

I wish the Perfect Match feature were available in all versions of Studio. It would help freelance translators in cases like these. However, I am sure this guide will also help PJMs and translators working in the Pro versions of Studio. Of course, freelancers can use the pre-translate function and review the results afterwards to make sure everything is in place. Feel free to reach out with questions and comments.

  • Nice article Sinan... thanks for sharing it. Nice use case with the splitting of files and in many ways I prefer it to the Split and Merge tool because of the problems you encounter there sometimes. I like the toolkit too... that's why we developed it :-)
  • The point is that SDLXLIFF Toolkit lacks the functionality the Split and Merge tool is mainly used for - splitting into equally sized pieces, splitting by segment numbers, etc. Basically, its purpose is completely different ;-).
  • I don't know if this is the point Evzen, but if the toolkit had the ability to split with similar options would it be preferable or do you still prefer the ability to merge the original split files back together again? If you don't then maybe it makes sense to add these options to the toolkit.
  • Hi Paul,
    The way we've been using Split & Merge (i.e. splitting files by segment numbers to have smaller files that allow the reviewer to start reviewing while the translator is still translating the other parts, and then merging the split files), the Split & Merge tool indeed offers a lot more than the Toolkit. It's not just about slicing based on filters in this case.
    For us, the Merge option is very important as well, as in some/most cases it is not a good idea to simply do a pre-translate on the entire file (e.g. what if segments were merged, what if some Repetitions required different translations, etc.).
    So at our company, we are eagerly awaiting the 2017 version of the Split & Merge tool as it was...
    Best regards,
  • Of course it's necessary to have the ability to reconstruct the original file from the individual pieces. Because the client of course wants the original file, not the pieces. And engineers/PMs don't want to waste time on some crazy harakiri with e.g. perfectmatching the original file from the bilingual pieces, or translating the original file from a TM created from the bilingual pieces. We need a transparent and fluent process.
    In fact, the Split and Merge functionality should have been a natural built-in feature in Studio ages ago. The current splitting option available during translation package creation (which allows splitting only at the whole-SDLXLIFF-file level) is half useless.
  • Evzen Polenka said:
    Of course it's necessary to have the ability to reconstruct the original file from the individual pieces. Because the client of course wants the original file, not the pieces.

    Well that's not exactly the process we are discussing here Evzen, of course nobody is going to send back pieces! I merely asked whether having this option to split in the toolkit would be useful. The toolkit method is different and pretty simple when you use Perfect Match as described in the original post.

    I'm not discussing the problems with split and merge... try and stick to the point ;-)