Hello all,
I want Trados Studio to split Excel cell contents into segments based on the embedded HTML codes, e.g.:
Product Features
Throws are acrylic knitted
Product size : 130x170 cm
Product colour is beige.
Washing Recommendations
Washable at 30 degrees.
Do not bleach.
Do not iron.
--------
This is a sample: sample-br.xlsx
Thanks
The default rules will handle this and should give you an idea of how to improve it if you wish:
Maybe this is an 'overkill' solution, but I can see that your sample also seems to contain non-HTML columns. And while the Embedded Content solution provided by Paul does the trick of identifying HTML tags, it does not recognise HTML character codes (if you have any in your files).
So, just my two cents:
For such files, we use the XML options in Excel's Developer tab:
1. Use Notepad++ to create a simple file with the names of the Excel columns, which look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<File xmlns:xsi="www.w3.org/.../XMLSchema-instance">
  <Element>
    <Column1>a</Column1>
    <Column2>a</Column2>
  </Element>
  <Element>
    <Column1>a</Column1>
    <Column2>a</Column2>
  </Element>
</File>
2. Use the Source button in Developer > XML in Excel to add this XML map to the Excel file for translation. Then drag and drop each of the XML elements from the XML map onto the respective Excel column heading (which will create a table).
3. Click Export in Developer > XML to export your table to an XML file.
4. In Studio, create specific XML file type settings. This way you can configure which elements / columns need to be translated (maybe not all Excel columns need to be translated) and you can add document structure information to elements. You can then use this document structure information to have HTML content processed using Studio's embedded "Html Embedded Content 5 188.8.131.52" processor, which will recognise both HTML tags and HTML character codes. Non-HTML columns will not be processed as containing HTML.
5. After translation of the XML file, just open the Excel file and click Developer > XML > Import to import the translated XML file.
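For anyone who would rather script step 1 than type the file in Notepad++, here is a minimal sketch in Python. The function name `build_xml_map` and the output file name are made up for illustration, and the `xmlns:xsi` attribute from the sample is left out because its URL is truncated in the post. The record is written twice, as in the sample above. Column names must be valid XML element names (no spaces).

```python
import xml.etree.ElementTree as ET


def build_xml_map(columns, rows=2, placeholder="a"):
    """Return XML text with `rows` sample <Element> records, one child per column.

    Hypothetical helper: it mirrors the hand-written sample, nothing more.
    """
    root = ET.Element("File")
    for _ in range(rows):
        element = ET.SubElement(root, "Element")
        for name in columns:
            # Each Excel column becomes one child element with dummy content.
            ET.SubElement(element, name).text = placeholder
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n' + body


xml_text = build_xml_map(["Column1", "Column2"])

# Write the map to disk so it can be loaded via Developer > XML > Source.
with open("excel_map.xml", "w", encoding="utf-8") as handle:
    handle.write(xml_text)
```

The output is not indented, which should not matter to an XML parser; open the file in an editor and pretty-print it if you want to check it by eye.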
Bit of a long process, I know, but once you know how it works, we have found the results to be worth the effort.
Won't work either for Excel files with multiple languages, of course...
Wow... I did say the default approach would help give the user the idea they needed to improve it for the content they have. So adding a few rules here and there for this specific content is trivial.
Evzen Polenka said: I find this regex-based approach rather amateurish, cumbersome and, most importantly, failing big time with just slightly more complex HTML code, not to mention anything more complicated (containing entities, comments, inline scripts, etc.)
Well... in this case the file is not more complex. I'm a firm believer in economy of accuracy and in this case your more professional approach is not needed. Interesting discussion though and for other files it is a sensible way to go given the lack of a better embedded content handler for Excel in Studio.
Paul said: in this case your more professional approach is not needed
This is questionable. I've seen such decisions made on the basis of a short sample (or of just a few lines of one file, without bothering to look thoroughly through the WHOLE file), and they made people's lives hell because just a few pages down (or in another file) there was messy, complex HTML code.
So I prefer robust solutions that work reliably in all cases, rather than endlessly solving issues caused by simple solutions.
So, as we all seem to agree, it all depends on the contents of the entire file.
And, in reply to aazzoma khateeb: if your file only contains BR and P tags, Paul's solution is definitely the best way to go as it will do the trick perfectly. If, however, the rest of the file also contains many other tags and tag pairs and HTML codes instead of characters, an Excel-to-XML solution may be the better option as you can then profit from Studio's integrated HTML processor.
In any event, no disrespect intended to anyone here from my side...
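To make the "it depends on the whole file" point checkable, here is a rough sketch in Python that scans every cell text for anything beyond BR and P tags. It assumes the cell contents have already been read into a list of strings; the function name `needs_html_processor` and the sample strings are invented for illustration, and the regexes are a quick heuristic, not an HTML parser.

```python
import re

# Tags that the simple embedded-content rules are assumed to cope with.
SIMPLE_TAGS = {"br", "p"}

# Opening, closing or self-closing tag; group 1 captures the tag name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>")

# Named, decimal or hexadecimal character references such as &nbsp; or &#160;.
ENTITY_RE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def needs_html_processor(cells):
    """Return True if any cell holds entities, comments or tags other than br/p."""
    for text in cells:
        if ENTITY_RE.search(text) or "<!--" in text:
            return True
        for match in TAG_RE.finditer(text):
            if match.group(1).lower() not in SIMPLE_TAGS:
                return True
    return False


simple_cells = ["Washable at 30 degrees.<br/>Do not bleach.", "<p>Do not iron.</p>"]
entity_cells = ["Product size&nbsp;: 130x170&#160;cm"]
link_cells = ['See <a href="care.html">care guide</a>']

print(needs_html_processor(simple_cells))  # False
print(needs_html_processor(entity_cells))  # True
print(needs_html_processor(link_cells))    # True
```

Run it over all cells of all files, not just a sample, before choosing between the two approaches.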
Thanks, Lieven Lannoo. I will try your solution. Paul's method resolved the tags completely. Regards.
Hi aazzoma khateeb
What about this:
Would that work for you?
PS: Here is your sample file:
Dear @Daiel Hug, I really appreciate your solution. How did you do the segmentation? @Paul's solution, though, resolves the contents without any tags. Thanks to all members who shared their experiences. Best regards
Hi aazzoma khateeb
I just used Okapi (https://okapiframework.org/wiki/index.php/Main_Page) out-of-the-box, so to say.
(no custom filter configurations)
This is the only custom setting I used (an additional segmentation rule to segment at all the variations of <br/> tags that exist in your document):
That's all. So you end up with an XLF file, but Okapi will convert it back into the source format once you're done. So it's XLSX -> XLF (Okapi) -> SDLXLIFF (Studio) -> XLF (Studio) -> XLSX (Okapi). I've never had that fail so far.
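The rule itself is not shown in the post, so the following is only a guess at the idea, sketched in Python rather than in Okapi's own rule format: one case-insensitive regex that covers the usual spellings of the tag (`<br>`, `<br/>`, `<br />`, `<BR>`), used here to split a cell into segments. The name `split_at_br` and the sample text are made up for illustration.

```python
import re

# Matches <br>, <br/>, <br /> and upper-case variants.
BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)


def split_at_br(text):
    """Split text at every <br> variant and drop empty segments."""
    return [part.strip() for part in BR_RE.split(text) if part.strip()]


cell = "Washable at 30 degrees.<br>Do not bleach.<BR/>Do not iron.<br />"
segments = split_at_br(cell)
print(segments)
# ['Washable at 30 degrees.', 'Do not bleach.', 'Do not iron.']
```

Unlike this sketch, a real segmentation rule keeps the tag in the file so it can be written back on export; the point here is only the pattern that catches all the variants.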
Thanks a million. Okapi is a new tool to me.
I will try it. Regards