QA for forbidden terms

Hi there,

I want QA to check that "million" is never translated as "billion" in any segment.

I added the term to my TB, marking its correct equivalent as "Right" and the prohibited term as "Wrong", both chosen from the picklist values I had preset for this descriptive field.

In the Terminology Verifier I ticked the checkbox "Check for terms which may have been set as forbidden" and selected "Wrong" as the forbidden value (leaving "Right" unchecked).

Now, the QA runs and throws an error message for all the segments where million or billion is used, even if used correctly (i.e. with the term marked as "Right").

Any idea?
Thanks, Levente

  • Hi

    I use this feature in MultiTerm / Studio and it works fine for me. Can you post a couple of screenshots showing where this issue happens in Studio? That way I can try to reproduce it and maybe help.

    Also a screenshot of the relevant entry in MultiTerm, so I can see your structure, something like:

     

    In Studio I see:

     

     

    Almudena

  • Hi,

    I just moved your post into the MultiTerm forum, but it just occurred to me that perhaps you have used the Language Cloud (Beta) for your termbase? If you have, then I think this is probably why the verification doesn't work, as I don't think this is supported with the Language Cloud Terminology solution.
  • Hi Almudena Ballester,

    Thank you for the quick reply.

    So, in the meantime I have tried all possible combinations of ticking and unticking boxes and found out that, apparently, the order of the picklist items matters, though I really don't see the logic behind this.

    I have redesigned the picklist so that "forbidden" comes before "allowed", and now it works fine, as you can see in the pictures below.

    Can you confirm my assumption about the order of picklist items? I really don't want to inadvertently rely on some hidden setting and get into trouble when working on another computer.

    Thanks, Levente

  • Thank you, Paul, for moving the topic to where it belongs.
    Actually, I don't use the Language Cloud yet, so the issue has to do with normal desktop TB usage.
  • Hi again Levente:

    I don't think the order has anything to do with it in this case. I have exactly the opposite order and it works:

     

    I also realized, when configuring the termbase, that there is no need to mark the "good" translations as preferred or allowed; it is enough to mark the forbidden ones. In fact, all other translations in an entry are "allowed" by default. You can save yourself a lot of field-filling if you have only two categories.

    But you have to be aware of this: if you mark a term as forbidden, Studio will not check the whole entry, just this field. So Studio can throw an error message for a term that is wrong as a translation of one source term but correct as a translation of a different source term.

    As an example, let's say that "burdeos" (ES) should be translated as "bordeaux" (PT) and should not be translated as "grená". In the same termbase, "granate" should indeed be translated as "grená". Studio will flag the word "grená" in the target as a mistranslation, no matter the source :(

    That's really a shortcoming of the Studio Terminology Verifier, in my view. Terminology in translation works in pairs: each source term corresponds to a target term. Absolute lists of forbidden words are only partially useful, perhaps to avoid some specific words (rude ones, copyrighted ones or the like). But for terminology verification, it is the pair of words that must be checked.

    So... in your case, "billion"... will always be forbidden.
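
    To illustrate the difference, here is a minimal Python sketch. It is not how Studio is implemented; the dictionary and both function names are hypothetical and only model the behaviour described above, using a simple whitespace split instead of real tokenization.

```python
# Hypothetical model of a termbase: source term -> forbidden target terms.
FORBIDDEN_BY_SOURCE = {
    "burdeos": {"grená"},  # "burdeos" must not be translated as "grená"
    "granate": set(),      # for "granate", "grená" is the correct translation
}


def target_only_check(target: str) -> bool:
    """Flag the segment if any forbidden term occurs in the target,
    ignoring the source (the behaviour described above)."""
    all_forbidden = set().union(*FORBIDDEN_BY_SOURCE.values())
    target_words = target.lower().split()
    return any(term in target_words for term in all_forbidden)


def pair_aware_check(source: str, target: str) -> bool:
    """Flag the segment only if a forbidden target term occurs together
    with the source term it is forbidden for."""
    source_words = source.lower().split()
    target_words = target.lower().split()
    return any(
        src in source_words and bad in target_words
        for src, forbidden in FORBIDDEN_BY_SOURCE.items()
        for bad in forbidden
    )


# "granate" -> "grená" is correct, yet the target-only check still flags it.
print(target_only_check("vestido grená"))                    # True
print(pair_aware_check("vestido granate", "vestido grená"))  # False
```

    A pair-aware check flags "grená" only when the source contains "burdeos", which is what a terminology check would ideally do.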

     

    Regards,

    Almudena

  • Hi Almudena,
    So, to put it simply:
    The Verifier will always throw an error message whenever it finds a forbidden term, no matter the source segment...
    Splendid. So what is the whole point of storing this field information in a term-pair relation instead of just adding the terms to a blacklist of never-use words (if this option were available)?
    Also, could you suggest any workaround to my problem? Million-billion mistakes are show-stoppers in any translation.
    Thanks,
    Levente
  • Hi Levente,

    I am just a Multiterm/Studio advanced user trying to help & get help in this community :) Just figuring out what the problem could be and giving feedback to developers in order to improve apps and plugins.

    Sometimes I know the answer to a question based on my experience, sometimes it is just an approach.

    Maybe Luis Lopes could give us a hand here.

    Regards,

    Almudena

  • Levente Péter Nagy said:
    The Verifier will always throw an error message whenever it finds a forbidden term no matter the source segment...

    Hi both,

    I reproduced this problem this morning and sent a small package to support. I'm not sure if this is already on our list of things to fix, but it does look like undesirable behaviour to me. I'll update you when I have some feedback on this later, if Luis doesn't respond sooner.

  • Hi Levente,

    As a workaround to get those segments flagged, you could set up a regex QA check where you set billion as the source and its undesired (wrong) translation as the target and tell Studio to flag any segments where that combination is used.
  • Hi Nora,
    Your suggested workaround is actually a perfect solution for my problem! :)
    Thank you so much!
    Levente
  • Hello,

    I realise this post is more than 3 years old, but it seems to describe exactly the same problem as I'm having.

    1. If I set a term as "Forbidden" in my Termbase, it gets flagged up as forbidden in every target segment where it occurs, irrespective of whether or not the source segment contains that term. Is that the intended behaviour of the term verification settings?

    2. You mentioned a regex workaround to flag up accidental errors like "billion" instead of "million". Where in the settings do you enter this? Would it be possible to give me more details?

    Many thanks,

    Hayley

  • Hello Hayley,

    To use Norah's solution follow these steps:

    Options > Verification > QA Checker 3.0 > Regular Expressions

    Check "Search regular expressions"
    Description: Give a friendly name to the RegEx pattern
    RegEx source: Add source segment term
    RegEx target: Add forbidden target segment term
    Condition: "Report if both target and source RegEx patterns match"
    Check "Ignore case"
    Action: "Add item"

    You can add as many patterns as you want.

    With "millió" as the source pattern and "billion" as the target pattern, this tells QA Checker to raise a warning whenever "millió" appears in the source segment and "billion" in the target one. It does the job, although it will also (wrongly) flag segments where both "million" and "billion" legitimately appear in the target, but this is quite rare and I can live with it.
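
    The logic of the "both patterns match" condition can be sketched in Python as follows. This is only an illustration under assumptions: Studio evaluates the patterns with its own (.NET) regex engine, and the function name `both_patterns_match` and the sample segments are made up for the example.

```python
import re


def both_patterns_match(source_pattern: str, target_pattern: str,
                        source: str, target: str,
                        ignore_case: bool = True) -> bool:
    """Mimic the condition 'report if both target and source RegEx
    patterns match' for a single source/target segment pair."""
    flags = re.IGNORECASE if ignore_case else 0
    source_hit = re.search(source_pattern, source, flags) is not None
    target_hit = re.search(target_pattern, target, flags) is not None
    return source_hit and target_hit


# Mistranslation: "millió" in the source, "billion" in the target.
print(both_patterns_match("millió", "billion",
                          "A bevétel 5 millió euró volt.",
                          "Revenue was 5 billion euros."))   # True

# Correct translation: the target pattern does not match.
print(both_patterns_match("millió", "billion",
                          "A bevétel 5 millió euró volt.",
                          "Revenue was 5 million euros."))   # False

# The rare false positive: both numbers correctly present.
print(both_patterns_match("millió", "billion",
                          "2 millió felhasználó és 3 milliárd megtekintés",
                          "2 million users and 3 billion views"))  # True
```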

    Also, try to experiment with the options in the Conditions dropdown list!

    Best,

    Levente

  • The regex pattern can be extended to avoid false positives when "million" and "billion" are both present, by adding a negative lookahead to the target pattern (asserting the absence of the correct term):

    (?!.*million)^.*billion

    But as you mention, this effort might not be "cost-effective".
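
    For anyone who wants to try the pattern outside Studio first, here is a small Python sketch. It assumes Python's `re` module treats this pattern the same way as Studio's regex engine, which should hold for a simple lookahead like this; the function name and sample sentences are hypothetical.

```python
import re

# Target pattern from above: matches only when "billion" occurs and
# "million" occurs nowhere in the segment.
TARGET_PATTERN = re.compile(r"(?!.*million)^.*billion", re.IGNORECASE)


def target_is_suspicious(target: str) -> bool:
    """Return True if the target contains 'billion' without 'million'."""
    return TARGET_PATTERN.search(target) is not None


print(target_is_suspicious("Revenue was 5 billion euros."))         # True
print(target_is_suspicious("2 million users and 3 billion views"))  # False
print(target_is_suspicious("Revenue was 5 million euros."))         # False
```

    The lookahead is checked at the start of the segment, so a "million" anywhere in the target suppresses the match, whether it comes before or after "billion".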

  • Thank you very much. I've set mine up to flag any target segment that contains thousand, million, billion and Euros, regardless of what's in the source. These words only occur occasionally in the kinds of text I translate, so that's sufficient for me.