What is the difference between two TMX records of the same segment coming from different sources?

Hello everybody,

Due to a format change on the customer's side we have to deal with a loss of matching quality. That much is expected, but there is one thing I'm really wondering about. Please look at the following segments (TMX format):

<seg><ph x="1" type="255" /> allows you to solicit feedback from other employees...</seg>

<seg><ph x="1" type="325" /> allows you to solicit feedback from other employees...</seg>

The first one was stored with FrameMaker MIF sources in Studio, the second one with DITA sources in Studio. What is the difference between these copies of the same source segment from two different versions of the original source files (MIF and DITA)?

When trying to leverage existing translations from the MIF version for the new DITA version, I didn't get a 100% match because of the tag. Studio was able to identify the right translation in the TM but only inserted the text without the leading tag. To get an idea of the differences, I created a new, empty TM and stored this segment from the old MIF source and the same segment from the DITA source in it. The type number is the only difference, but one is a perfect match and the other is lower than 100%.
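
The comparison described above can also be done mechanically. Below is a minimal Python sketch that parses the two `<seg>` fragments quoted earlier and reports which placeholder attributes differ. The helper names (`placeholder_attrs`, `plain_text`, `diff_placeholders`) are made up for illustration, and the script only compares what is written in the TMX; it assumes nothing about how Studio interprets the `type` value.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Tuple


def placeholder_attrs(seg_xml: str) -> List[Dict[str, str]]:
    """Return the attributes of every <ph> in a <seg> fragment, in document order."""
    seg = ET.fromstring(seg_xml)
    return [dict(ph.attrib) for ph in seg.iter("ph")]


def plain_text(seg_xml: str) -> str:
    """Return the segment text with all inline tags removed."""
    return "".join(ET.fromstring(seg_xml).itertext())


def diff_placeholders(
    seg_a: str, seg_b: str
) -> List[Tuple[int, str, Optional[str], Optional[str]]]:
    """List (placeholder index, attribute name, value in a, value in b) for each mismatch."""
    a, b = placeholder_attrs(seg_a), placeholder_attrs(seg_b)
    diffs = []
    for i in range(max(len(a), len(b))):
        # A missing placeholder on one side is treated as having no attributes.
        pa = a[i] if i < len(a) else {}
        pb = b[i] if i < len(b) else {}
        for name in sorted(set(pa) | set(pb)):
            if pa.get(name) != pb.get(name):
                diffs.append((i, name, pa.get(name), pb.get(name)))
    return diffs


# The two segments quoted in this post (MIF-based and DITA-based).
mif_seg = '<seg><ph x="1" type="255" /> allows you to solicit feedback from other employees...</seg>'
dita_seg = '<seg><ph x="1" type="325" /> allows you to solicit feedback from other employees...</seg>'

print(plain_text(mif_seg) == plain_text(dita_seg))  # True
print(diff_placeholders(mif_seg, dita_seg))         # [(0, 'type', '255', '325')]
```

For these two segments the text is identical and the only reported difference is the `type` attribute of the first placeholder.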

As far as I can see, the content of the tag is not stored in the TM. My question: What kind of information allows Studio to know what's coming from DITA and what's coming from MIF? Or the other way round: How can I get a 100% match from the MIF segment?

Thanks a lot in advance.

Best
Franz

  • Please show the complete TU, i.e. both the source and target segment. I guess the numbers won't be the same in source and target in the bad TMX.

    This looks like the problem I had two years ago: https://community.sdl.com/product-groups/translationproductivity/f/studio/13675/alignment-of-html-with-localized-urls-results-in-tm-segments-losing-the-hyperlink-tags-in-target-segment
    As can be seen on the screenshots, the problematic TUs have different tag IDs in the source and target, which effectively causes such TUs to not be 100% match.

  • Hi Evzen,

    Thanks a lot for your reply. I'll check the link to the old problem. It would be wonderful if I could grab a solution from this link.

    As mentioned yesterday, I created a TM containing only 2 TUs. Please find the complete TMX file here:

    <?xml version="1.0" encoding="utf-8"?>
    <tmx version="1.4">
      <header creationtool="SDL Language Platform" creationtoolversion="8.0" o-tmf="SDL TM8 Format" datatype="xml"
              segtype="sentence" adminlang="en-US" srclang="en-US" creationdate="20190904T101610Z" creationid="NILAKAZIE">
        <prop type="x-Recognizers">RecognizeAll</prop>
        <prop type="x-TMName">1031 Test (en-US 2 de-DE)</prop>
        <prop type="x-TokenizerFlags">DefaultFlags</prop>
        <prop type="x-WordCountFlags">DefaultFlags</prop>
      </header>
      <body>
        <tu creationdate="20190705T150414Z" creationid="NILAKAZIE" changedate="20190904T101728Z" changeid="NILAKAZIE" lastusagedate="20190904T101728Z">
          <prop type="x-Context">0, 0</prop>
          <prop type="x-Origin">TM</prop>
          <prop type="x-ConfirmationLevel">Translated</prop>
          <tuv xml:lang="en-US">
            <seg><ph x="1" type="7" /> allows you to solicit feedback from other employees...</seg>
          </tuv>
          <tuv xml:lang="de-DE">
            <seg><ph x="1" type="7" /> ermöglicht Ihnen, von anderen Mitarbeitenden Feedback...</seg>
          </tuv>
        </tu>
        <tu creationdate="20190705T150414Z" creationid="NILAKAZIE" changedate="20190904T102115Z" changeid="NILAKAZIE" lastusagedate="20190904T102115Z">
          <prop type="x-Context">0, 0</prop>
          <prop type="x-Origin">TM</prop>
          <prop type="x-ConfirmationLevel">Translated</prop>
          <tuv xml:lang="en-US">
            <seg><ph x="1" type="325" /> allows you to solicit feedback from other employees...</seg>
          </tuv>
          <tuv xml:lang="de-DE">
            <seg><ph x="1" type="325" /> ermöglicht Ihnen, von anderen Mitarbeitenden Feedback...</seg>
          </tuv>
        </tu>
      </body>
    </tmx>

    Only one of these TUs results in a 100% match when working on the same DITA file. This is really strange.

    Best
    Franz

  • Hmmm, this is a different problem... It's caused by the different type of the placeholder.

    When it's not a 100% match, what is it then, and what penalty does it get?
    I guess it's something like a 99% fuzzy match due to a formatting difference or something similar...

    So you just need to remove the corresponding penalty and you will get a 100% match then.

    signsandsymptomsoftranslation.com/.../

  • Hi Evzen,

    The issue you describe in the old thread is well known to me, but unfortunately it's different from my current problem. I still hope I'll find a way to solve it.

    Best
    Franz

  • Thanks again, but unfortunately there are no penalties defined. In the meantime I found that only tags representing variables such as a product name etc. are concerned.

  • You are making it very difficult to help by not providing relevant info, screenshots, etc... :-\

    What do you see in the Translation Results pane? What score does the segment get if it's not 100%? What icons do you see next to the score, if any? What does the infotip say?

    Is the placeholder actually part of the segment in Studio (is the purple mark displayed at the start of the line) in the new DITA format, or could it be that it's actually taken outside of the segment?

    Are you absolutely sure about the penalties? Unless you explicitly changed the settings, the default would apply, which is 1% for missing/different formatting and also 1% for multiple translations.