File splitting - SDL XLIFF split/merge improvement/change request

Hi, all - I'm sorry if this is not the right place to put this; there doesn't seem to be a specific area for posting Open Exchange app queries.


I at times need to split files out for large projects across many translators - as all PMs must do, but it is not my usual pastime. There may be a simple solution that I am just not aware of; all lessons gratefully received. My present solution is to work out the splits in Word, manually and tediously, and create separate Word docs (which messes up the numbering, oh oh oh). I could provide the full file to everyone and tell them what their page numbers are, but I have found that this causes issues. Colour-coding is also an option, i.e. highlight Fred's word count in green, Jenny's in blue etc. Still tedious.


Meanwhile, I have investigated XLIFF split/merge, and it really doesn't do what I'm hoping for.


Scenario: 210K words (approx.) across 2 files, to be split across multiple translators, each of whom can do a different number of words. The 2 files need to be in the same project because there is cross-file repetition etc.

A few issues with this app:

1) The generated split-file names: I really, really need the file names to be at least partly human-readable! Conversations with translators are impossible - 'my file 001_79-37-D7-94-3E-49-F7-F7-6B-A6-C4-E4-69-40-9E-E7-DE-80-CB-C3' - yeah right. And imagine you are using this for multiple projects - you have to be able to know which file people are talking about! I suggest letting me supply x chars of 'my' file name (at least 15), which you could incorporate into your automated file name, e.g. 001.myname_info_your-number-stuff. Does that sound possible? Or let me rename the files and have your rebuild register pick up the updated names - open to so much human error, a support nightmare, but an option?

2) When splitting, I would really need to be able to define varied word-count splits that add up to my file's total word count (all approximate/rounded of course), e.g. 15000, 80000, 20000, 40000, 30000, 11000, 18000, 5000, balance. An alternative would be a 2-stage process whereby I could create a set of e.g. 5K splits and then re-join them, but obviously your split/merge process can't deal with that at present. I really don't want to send someone e.g. sixteen 5K files (with incomprehensible names...) for their 80K word quota! And if I did - as a translator, I'm sure I'd merge them into one in Trados, which of course can be done, and what then happens when they send them back to me? Plus, in my scenario, if I want to create a project that contains all the translators' sdlxliffs, I can no longer tell which of the two source files each generated .sdlxliff comes from. Am I missing something?

3) Splitting by segment number is only useful if I have already calculated (in Word) where I want the splits to be, then created the project, then found the corresponding segment numbers in the project and noted them... there is no gain.
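
To make points 1 to 3 a bit more concrete, here is a rough Python sketch of the behaviour I have in mind. It is not how the app works today, and I have no idea how it is built internally. It assumes the per-segment word counts are available as a plain list, and the names plan_splits and split_file_name, along with their parameters, are made up purely for illustration.

```python
import re
from bisect import bisect_left
from itertools import accumulate


def plan_splits(segment_words, quotas):
    """Turn approximate word quotas into 1-based segment ranges.

    segment_words: word count of each segment, in document order (assumed input).
    quotas: approximate quota per translator, in order; whatever is left over
            after the last quota becomes a final "balance" part.
    Returns a list of (first_segment, last_segment, words) tuples.
    """
    totals = list(accumulate(segment_words))  # running total after each segment
    plan = []
    start = 0   # index of the first segment not yet assigned
    target = 0  # cumulative word target, so rounding does not drift
    for quota in quotas:
        if start >= len(segment_words):
            break  # nothing left to hand out
        target += quota
        # First unassigned segment whose running total reaches the target.
        end = min(bisect_left(totals, target, lo=start), len(segment_words) - 1)
        words = totals[end] - (totals[start - 1] if start else 0)
        plan.append((start + 1, end + 1, words))
        start = end + 1
    if start < len(segment_words):  # the "balance" part
        words = totals[-1] - totals[start - 1] if start else totals[-1]
        plan.append((start + 1, len(segment_words), words))
    return plan


def split_file_name(part_number, source_name, unique_id, keep=15):
    """Build a name like 001.myname_info_<id> from my own file name."""
    stem = source_name.rsplit(".", 1)[0]                # drop the extension
    readable = re.sub(r"[^A-Za-z0-9_-]+", "_", stem)[:keep]
    return f"{part_number:03d}.{readable}_{unique_id}"
```

With something like this I would only type in the quotas; the segment numbers would be worked out for me, and every part would carry a recognisable piece of the source file name, so I could still tell which of the two source files it came from. Splits land on whole segments, so the word counts per part are approximate, which is fine.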

I did try it with a team last year, and we got nowhere; I think we had a lot of issues with file names being incorrectly saved etc. Chaos ensued. But it still seems like a good idea.

Thoughts, suggestions and completely different solutions gratefully received. Thanks!