The title is a bit misleading, but I can't express the weirdness in a more meaningful way :-\. Either I'm missing some fundamental concept, or there is something weird happening in Studio Alignment (or in Studio in general)...
I need to align a MadCap Flare project containing ~1200 files in total (1100+ HTML files, the rest being MadCap internal files... all of which normally ARE localizable, but the Studio built-in file type ignores them... probably yet another piece of unfinished work :( ). It's an online help, i.e. it contains zillions of hyperlinks. The hyperlinks are localized, i.e. different in each language.
And I have found that TMs created by aligning such files somehow 'lose' the hyperlinks in the TM lookup window.
The problem can be easily simulated on trivial HTML files:
<html><head></head><body>Click <a href="http://www.example.com/en-us/">here</a> to continue.</body></html>
<html><head></head><body>Klicken Sie <a href="http://www.example.com/de-de/">hier</a> um fortzufahren.</body></html>
Aligning these two files using an empty TM with default settings results in this (no surprise here, all as expected):
Saving the alignment as SDLXLIFF and examining the SDLXLIFF content shows that the hyperlink tags are present in both source and target (again, no surprise here... the file was re-formatted and some tags folded for better visibility):
The only potential "issue" here is the different IDs of the <g> elements, see below...
Importing this SDLXLIFF into the empty TM results in this TM content... again showing the different IDs:
And trying to translate the original source file using that TM (and Alignment penalty set to 0, for the information completeness) shows this - the hyperlink tags are completely missing in the TM lookup pane and are NOT inserted in the translation!
Now this is a BIG issue, since the intention is to pre-translate files automatically from the TM (using 0 Alignment penalty, thus e.g. expecting ALL source files originally used for the alignment to be fully translated from the TM).
What is wrong here?!
I kind of understand the reason for the different IDs - the content of the tags is not identical, so they get different IDs - but how am I supposed to approach the task then?
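For anyone who wants to check their own file pairs before aligning, here is a minimal Python sketch that lists the tags whose attributes differ between a source and a target file. It says nothing about how Studio actually assigns tag IDs; it only pairs the start tags by position and compares them. The names InlineTagCollector, collect_tags and differing_tags are made up for this illustration, and the two sample strings are the trivial files from this thread.

```python
from html.parser import HTMLParser


class InlineTagCollector(HTMLParser):
    """Collect (tag, attributes) for every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def collect_tags(html_text):
    collector = InlineTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags


def differing_tags(source_html, target_html):
    """Pair start tags by position and return the pairs that are not identical."""
    pairs = zip(collect_tags(source_html), collect_tags(target_html))
    return [(src, tgt) for src, tgt in pairs if src != tgt]


source = ('<html><head></head><body>Click '
          '<a href="http://www.example.com/en-us/">here</a> to continue.</body></html>')
target = ('<html><head></head><body>Klicken Sie '
          '<a href="http://www.example.com/de-de/">hier</a> um fortzufahren.</body></html>')

# Only the <a> pair is reported, because only its href differs.
for src, tgt in differing_tags(source, target):
    print(src, "!=", tgt)
```

Pairing by position is a simplification: it assumes both files contain the same tags in the same order, which will not always hold for real help projects.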
If I extract the "href" attribute content for translation, I will get a huge amount of extra word count... and no one is going to pay for that extra "translation"!
Acronym auto-substitution does not work for URLs either...
No one? I don't really expect any translators to comment on it, but I would expect something clever from some SDL person...
Even extracting the "href" attribute doesn't help much... while this allows localizing the actual hyperlinks, the problem of the segment translations not being applied from the TM (and the hyperlink tags being lost after applying a match manually) still persists! :(
Looks like a bug Evzen. I haven't seen this before so will report it to make sure it is known. I can get a 100% match if I edit the tags in an exported TMX, but I don't know why it's happening.
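A rough way to find the affected units in an exported TMX is sketched below in Python. This is only an illustration under assumptions: it relies on the standard TMX inline elements (bpt, ept, ph, it), the sample TMX and its "type" values are invented and may not look like a real Studio export, and the function names are hypothetical. It does not reproduce the edit described above; it only reports translation units whose language variants disagree on their inline tags.

```python
import xml.etree.ElementTree as ET

# Standard TMX inline element names.
INLINE_TAGS = {"bpt", "ept", "ph", "it"}

# Invented sample; attribute values in a real export may differ.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="en-US" datatype="xml"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Click <bpt i="1" type="5"/>here<ept i="1"/> to continue.</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Klicken Sie <bpt i="1" type="9"/>hier<ept i="1"/> um fortzufahren.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def inline_signature(seg):
    """Return the inline elements of a <seg> as comparable tuples."""
    return [
        (el.tag, tuple(sorted(el.attrib.items())), el.text or "")
        for el in seg.iter()
        if el.tag in INLINE_TAGS
    ]


def units_with_mismatched_tags(tmx_text):
    """Yield (index, signatures) for units whose variants disagree on inline tags."""
    root = ET.fromstring(tmx_text)
    for index, tu in enumerate(root.iter("tu")):
        signatures = [inline_signature(seg) for seg in tu.iter("seg")]
        if any(sig != signatures[0] for sig in signatures[1:]):
            yield index, signatures


for index, signatures in units_with_mismatched_tags(SAMPLE_TMX):
    print("unit", index, "has differing inline tags:", signatures)
```

For a real export you would read the file with ET.parse() instead of a string, and decide per unit whether making the target tags match the source is acceptable, given that the localized link is then no longer stored in the TM.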
Oh and BTW, there is another very similar bug.
If the source file contains the following element
<foo attrib="bar" />
and the target file contains the following element
<foo attrib="bar"/>
(i.e. identical, just without the space before the slash), the elements again get different IDs during alignment.
And the consequence is again the same - the TM match is not applied and the tag is missing after applying the translation.
Not to mention the problems with segmentation I described in another thread...
That's exactly why I get mad about SDL implementing fancy bells and whistles with ridiculous marketing names instead of fixing serious problems in the elementary functionality of an "industry standard" tool :(
Both are caused because the attributes are different. If you translated the files then the tags would be exactly the same in source and target and everything would be fine. But because you are aligning tags that don't match it causes this problem.
So if you had this in source for example:
<body>Click <a href="http://www.example.com/en-us/">here</a> to continue.</body>
And this in target:
<body>Klicken Sie <a href="http://www.example.com/en-us/">hier</a> um fortzufahren.</body>
Then it'll give you the desired result. Same goes for this example:
<foo attrib="bar" />
You don't even have a tricky workaround because if it worked and they were placeholders in the TM, then when you translated the source you would of course get the same attributes that were in the source and not the ones that were aligned from the target file.
This is really not a simple thing to resolve as it touches on some of the fundamental benefits of using Studio, which is the ability to use placeholder tags to improve leverage from the contents of your TM. Unfortunately in your case, where you are aligning non-matching tags, it's working against you.
Having said all this I'm not even sure if this is a bug anymore. It's a real grey area for me because you are asking the software to behave in a completely different way to the way it's designed. I have left it with support/development but I'm afraid it won't get fixed NOW. I am interested in your thoughts on this Evzen, hopefully you'll see the problem.
Paul, my point is that it's 2017, so I'm surely not the first person on this planet to encounter this problem... website localization has been around for quite a few decades and there are thousands of sites using localized links... so there simply MUST be a way to achieve such a trivial thing as aligning two HTML files with localized links and getting TM content that can produce a translated file identical to the file used to create the TM.
That's a very simple requirement, just following simple common sense.
What I'm asking from software claiming to be the industry leader is to allow me, in the year 2017, to do this. Or, to be precise, it's actually the clients and my managers who expect this to be possible... because hey, why would such a simple thing not be possible, right?!
And honestly, I'm sick and tired of trying to explain to all of them that such a simple and trivial thing is not possible using software they paid tens of thousands of euros for... and I'm even more sick and tired of trying to explain to them that it's not my fault.
What I'm asking from the software is to be smart enough in the alignment process to produce exactly the same TM content that I get by manually translating the file using the same file type settings, like this:
(I translated the file in one Studio project and updated the TM... and then created a new project and used the previously created TM in it)
Generated target file:
<html><head><META http-equiv="content-type" content="text/html; charset=utf-8"></head><body>Klicken Sie <a href="http://www.example.com/de-de/">hier</a> um fortzufahren.</body></html>
And TM content:
All perfectly working as expected, still following "the way it's designed".
What I'm expecting from a company claiming to be the industry leader is to get this elementary functionality of its expensive software into proper shape in the first place and ONLY THEN start introducing fancy half-useless bells and whistles.
And I truly believe that it's a pretty sensible expectation.
And regarding the <foo attrib="bar" /> you're not correct - the syntax with and without a space before the slash does NOT mean two different tags.
The XML 1.0 specification defines the syntax of empty-element tags: https://www.w3.org/TR/REC-xml/#sec-starttags (scroll a few lines down to the "Tags for Empty Elements" part) - whitespace between the attribute and "/>" is allowed, i.e. a tag with or without the space is the identical tag.
The same goes for the XML 1.1 specification: https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-STag
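A quick way to convince yourself, sketched with Python's standard XML parser (this only shows what a conforming XML parser sees, it is not a statement about Studio's internals): both spellings parse to the same element, and their canonical forms are identical. ET.canonicalize() needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

with_space = '<foo attrib="bar" />'
without_space = '<foo attrib="bar"/>'

a = ET.fromstring(with_space)
b = ET.fromstring(without_space)

# Same element name, same attributes, same (absent) text content.
same_element = (a.tag, a.attrib, a.text) == (b.tag, b.attrib, b.text)
print(same_element)

# Canonical XML (C14N 2.0) serializes both spellings to the same string.
same_canonical = ET.canonicalize(with_space) == ET.canonicalize(without_space)
print(same_canonical)
```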
Evzen, I'm not arguing with you here. I'm only questioning the problem in Studio because of the way Studio works and wanted to hear your thoughts on this because it all helps to raise the profile of this issue in development.
I tested it in WinAlign and get a 100% match (less alignment penalty here):
But look at the tags. These are incorrect since you end up with the placeholder problem I mentioned earlier and the target is translated with the contents of the source tag. This I think is the correct behavior based on how Studio works. You can of course resolve the content of the target file like this by changing the href attribute to translatable and then handle the files during translation:
But I'm sure you were aware of this already and probably use this option for your website translation. So perhaps WinAlign or LFAligner is the way to go for now until we do something about the way the Studio aligner handles these things.
Neither I nor anyone else would disagree that the Alignment tool needs addressing in many ways. It just hasn't managed to get a high enough priority with other things going on. So to help with this we really need to see more constructive input from our users on the importance of this feature for more than a few. Look at the ideas, for example: there are half a dozen or so ideas raised on this feature and the highest number of votes for any one is 8. And that's only for a feature in alignment... it doesn't come close to addressing the more fundamental issues you and I both know are in this tool:
When we sit down and prioritise the things we have to do the Alignment feature always makes it on the list. In fact when the list is prepared it's often right at the top. But after going through the scoring we use to make sure we can address as many things as possible with the time/resources available it always drops down and doesn't get addressed. 8 votes from our users doesn't help!
I know we get rational and often heated posts on the alignment tool, but again 8 votes doesn't help. If it was the most voted for idea because people wanted an improved alignment editor then that would help us focus where you need it. But it doesn't seem to be. We'll continue to make sure the alignment editor is on the list but if it's really so important for our user base then it would be good to see the votes!
Paul Filkin said: I'm only questioning the problem in Studio because of the way Studio works and wanted to hear your thoughts on this because it all helps to raise the profile of this issue in development.
Paul Filkin said: So perhaps WinAlign or LFAligner is the way to go for now until we do something about the way the Studio aligner handles these things.
Paul Filkin said: If you look at the ideas for example. There are half a dozen or so ideas raised on this feature and the highest number of votes for any one is 8. And that's only for a feature in alignment...
Paul Filkin said: When we sit down and prioritise the things we have to do the Alignment feature always makes it on the list. In fact when the list is prepared it's often right at the top. But after going through the scoring we use to make sure we can address as many things as possible with the time/resources available it always drops down and doesn't get addressed. 8 votes from our users doesn't help!
Besides, basing the priority on the absolute number of user votes is, well, pretty silly... simply because the spectrum of users coming to the forum is very narrow, not to mention further aspects like the kind of users actually making suggestions (and thus further 'distorting' the kind of suggestions).
For example, I would bet that the ratio of "how do I start translation" noobs to users really 'fully' using Studio features is something like 9:1 (where the "9" basically represents "lay translators" and only the "1" represents "power users" like engineers)... so the type of ideas you get is very biased (or whatever the proper English expression is).
Plus, the number of skilled people like engineers in the forums is VERY limited by the fact that they don't have their own SDL account, since they work for agencies where the account is owned by some manager and they never get the credentials. So again, because of this pretty stupid limitation you lose the opportunity to get important feedback from MANY people using the Pro features and/or using Studio for more advanced tasks than some trivial Word translation.
Paul Filkin said: I do wish you'd stop arguing about everything. It's not helpful at all and I don't bother reading your posts properly as they're too long and too irritating.
Paul Filkin said: Yes, that one is 21 votes... not sure how I missed that. Or maybe it just received a bunch of votes. The point I was making is that even with 20 or 30 votes out of our entire userbase this is a drop in the ocean.
One cannot compare it to the entire userbase, of course... it should rather be compared to "active forum users" or something similar, where "active" would mean something like "coming to the forum at least twice a week in the last 30-day window" (since one-time visitors asking a single question, then reading the answers and never coming back are just distorting the figures). Or even look at which particular user either brought up the idea or voted/commented on it. That would IMO give a way more relevant picture.
Paul Filkin said: If I was the sole voice behind what was worked on next I would have the alignment tool right at the top of my list.
IMO it's kind of a catch-22... users would use the voting more if they actually saw their voices being heard - i.e. a) something actually being implemented... AND the fact that it was implemented based on the ideas voting being appropriately advertised, too! b) it being implemented in a sensible timeframe!
Plus, it seems to me that the entire thing is just heavily overvalued (or overestimated? you know what I mean...) - only a small fraction of users go to a forum (especially if it's SO difficult to get around as this one... sorry, but it IS true) and only a fraction of them bother to participate (especially if they see the catch-22 that nothing will be implemented anyway unless there are a couple of hundred (or thousand?) votes).
IMO the voting is currently useful only for showing RELATIVE interest in various features/improvements... like e.g. that alignment rework is about as important (or even more if you count all the related/similar ideas) to users as having more comprehensive error messages (which is just being implemented, as opposed to alignment improvements) and that it's way more important to users than embedded content support in bilingual Excel (which has even been already implemented).
EDIT: BTW, I also see some flaws in the voting, e.g. "Choose file order for Analyze Files batch task" - the idea is marked as delivered, but that's clearly not true, as the person marking it did not get the point of the idea at all (and even after re-explanation and re-phrasing of the idea did not bother to come back and change the status)... Now, what is an average user coming to the forum supposed to think and do?
Paul Filkin said: Maybe Evzen... but we can try it or just keep discussing it. I'd rather try it.