Problems reconverting XML files with embedded HTML code

Hi everybody,

I searched for a solution on my own and then in the community without finding what I need, so I am asking whether somebody can help.

I have an XML file with embedded HTML code. Creating a file type was the easy part, and the file looks great in Studio. There are some links with http... and some segments beginning with { that I can easily block in Studio.

The problem happens when I finalize the file. Some HTML entities are converted correctly, others are not. I tried different options in the file type (both in the XML and in the embedded HTML configuration), but could not find a solution. The only thing I can do is open the file in an editor and find and replace the characters by hand (with a macro). I also get a bunch of tabs appearing from time to time...

In this screenshot I compared the file before and after (original file is on the right, the file on the left is created by Studio after finalization):

I already tried a converter in my editor, but it converts all entities, and some of them must remain unchanged.


This part created by Studio

<en><script>window.product_id = 'VC-WPI'; window.dataLayer = window.dataLayer || []</script></en>

should be

<en>&lt;script&gt;window.product_id = 'VC-WPI'; window.dataLayer = window.dataLayer || []&lt;/script&gt;</en>

If I convert the file in the editor I get

<en><script>window.product_id = 'VC-WPI'; window.dataLayer = window.dataLayer || []</script></en>

I sent the last one to my customer, but he will not accept it; the output must match the yellow-marked line.
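For anyone scripting this clean-up instead of using an editor macro, here is a minimal Python sketch of selective re-escaping. It assumes that only the script tags inside `<en>` elements need to be turned back into entities and that everything else should stay untouched. The function name `escape_script_tags` and both regular expressions are made up for illustration; nothing here is part of Studio.

```python
import re

# Hypothetical helper, not a Studio feature.
# Matches the content of every <en>...</en> element (non-greedy, across lines).
EN_BLOCK = re.compile(r"(<en>)(.*?)(</en>)", re.DOTALL)
# Matches an opening or closing script tag, e.g. <script> or </script>.
SCRIPT_TAG = re.compile(r"<(/?script\b[^>]*)>", re.IGNORECASE)


def escape_script_tags(xml_text: str) -> str:
    """Re-escape only script tags inside <en> elements; leave the rest as is."""

    def fix_en(match: re.Match) -> str:
        open_tag, body, close_tag = match.groups()
        # Turn <script> into &lt;script&gt; and </script> into &lt;/script&gt;.
        body = SCRIPT_TAG.sub(r"&lt;\1&gt;", body)
        return open_tag + body + close_tag

    return EN_BLOCK.sub(fix_en, xml_text)


studio_output = (
    "<en><script>window.product_id = 'VC-WPI'; "
    "window.dataLayer = window.dataLayer || []</script></en>"
)
print(escape_script_tags(studio_output))
```

Run on the line created by Studio, this prints the line shown under "should be". Other tags or entities that need the same treatment would each need their own pattern, so this only narrows the manual find and replace; it does not fix the file type itself.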

I attached my filetype settings.

Could somebody help me?

Thank you very much.



Test setting XML embedded HTML.rar

10 Replies Latest Replies: 13 Feb 2019 3:23 PM by Daniel Hug
  • Hi
    Sorry, I don't have unrar, so I can't open your settings file.
    Your parser settings must be wrong. Try /sitecore/phrase/en as the parser setting. As you have to decide early on in the game, I'd consider using a file type with an embedded content processor (HTML5).
    A screenshot of your parser settings and of the "embedded content" page of the file type might help to understand more.
  • In reply to Daniel Hug:

    Test setting XML embedded


    Hi Daniel, thank you very much. I uploaded a ZIP file (*.settings file is not allowed). You will find the settings for the XML file and the embedded HTML5 code.

    Thank you for your help



  • In reply to Daniel Hug:

    ...also, you extract a lot of non-translatable stuff. If you know the names of the fields that contain translatable text ("fieldid=...") you can extract just those. Daniel
  • In reply to Daniel Hug:

    Thank you for the file. To get at least basic functionality, delete the rules "//phrase" and "//sitecore". "//en" should be tag type "structure".
    You can tell the parser to only extract the stuff you want if you have a parser rule like
    //phrase[@fieldid="Meta Keywords"]/en
    In this case you will need a rule for each field.
    I want to warn you that the language of the output will still be <en>, which tells Sitecore that it's English text. You will have to replace all that with the code for the target language. The xml lang attribute value setting of your settings file will not have any effect here.
    Hope that helps.
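To check outside Studio what a per-field rule would pick up, and to swap the `<en>` element for a target language afterwards, a small Python sketch could look like this. The sample XML, the field name "Internal ID", the target code `de` and both function names are assumptions for illustration; only the `sitecore/phrase/en` structure and the `fieldid` attribute come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped like the structure discussed above.
SAMPLE = """<sitecore>
  <phrase fieldid="Meta Keywords"><en>shoes, boots</en></phrase>
  <phrase fieldid="Internal ID"><en>{ABC-123}</en></phrase>
</sitecore>"""


def select_field(root: ET.Element, field_id: str) -> list:
    """Return the <en> elements of phrases whose fieldid equals field_id."""
    # Same idea as the parser rule //phrase[@fieldid="Meta Keywords"]/en
    return root.findall(f".//phrase[@fieldid='{field_id}']/en")


def retag_language(root: ET.Element, source: str = "en", target: str = "de") -> int:
    """Rename every <source> element to <target>; return how many were renamed."""
    elements = list(root.iter(source))  # copy first, then modify
    for elem in elements:
        elem.tag = target
    return len(elements)


root = ET.fromstring(SAMPLE)
print([e.text for e in select_field(root, "Meta Keywords")])
print(retag_language(root), "elements renamed")
print(ET.tostring(root, encoding="unicode"))
```

One rule (or one `select_field` call) is needed per translatable field, which matches the note above. The renaming step is only a stand-in for the manual replacement of `<en>` in the finished file; whether Sitecore expects `de` or another code depends on the project.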
  • In reply to Daniel Hug:

    Hi Daniel, thank you for the help. I will try and let you know.