XPATH to include tags that follow a certain character

Hi! I am currently trying to optimize our XML parser rules for WorldServer V11.3.5.4758. Since this is an XPATH question, I thought I would post it here as I'm having no joy figuring this out by myself.

Basically, I have the following situation. Embedded in our XML files are some HTML tags, which we have handled up until now using the parser rules for our XML file type in WorldServer. I would like to keep it this way, if possible. So we are seeing tags like <B>, <U>, etc. What I want to optimize are the <BR/> tags.

Basically, I want them to break but allow a possible merge. This is no problem and is easily handled by setting //BR to "Inline" and "Exclude". However, if a <BR/> tag immediately follows a comma, I would like the units on either side of it to be merged automatically, as the comma is a strong indicator that the units belong together.

Example:

<field attribute="f1463782524627-art">
<value>
<U>
<B>Sicherer digitaler Eingang: </B>
<space/>
</U>
<BR/>
Typ B, Sink Beschaltung, einstellbarer SW-Eingangsfilter<BR/>
<U>
<B>Sicherer analoger Eingang: </B>
<space/>
</U>
<BR/>
Typ B, Messbereich 0 bis 10 V / 0 bis 32 V / 0 bis 20 mA<BR/>
<U>
<B>Digitaler Eingang (ohne Diagnose): </B>
<space/>
</U>
<BR/>
Digitale Eingänge, Sink/Source Beschaltung pro Kanal konfigurierbar, einstellbarer SW-Eingangsfilter,<BR/>fix oder ratiometrisch einstellbare Schaltschwelle, Drahbruch und Kurzschlusserkennung<BR/>
<U>
<B>Digitaler Eingang (mit Diagnose):</B>
<space/>
</U>
<BR/>Digitale Eingänge, einstellbarer SW-Eingangsfilter, Drahtbruch und Kurzschlusserkennung<BR/>
<U>
<B>Analoger Eingang:</B>
<space/>
</U>
<BR/>Analoge Eingänge, Messbereich 0 bis 10 V / 0 bis 32 V / 0 bis 20 mA / 4 bis 20 mA / 1 bis 50 kΩ / Temperatureingänge, einstellbarer Analogfilter, <BR/>einstellbare Rampenbegrenzung, einstellbare Schwellenwerte, integrierter Eingangsschutz</value>
</field>

Sorry, I know it's not well-indented and probably not well-formed, but I hope it gets my point across (I've included all of this to show the full context). I would like this unit to be automatically merged:

<BR/>
Digitale Eingänge, Sink/Source Beschaltung pro Kanal konfigurierbar, einstellbarer SW-Eingangsfilter,<BR/>fix oder ratiometrisch einstellbare Schaltschwelle, Drahbruch und Kurzschlusserkennung<BR/>

so that it looks like this:

Digitale Eingänge, Sink/Source Beschaltung pro Kanal konfigurierbar, einstellbarer SW-Eingangsfilter,<BR-TAG>fix oder ratiometrisch einstellbare Schaltschwelle, Drahbruch und Kurzschlusserkennung

I've tried this, but it's not working:

//BR[ends-with(preceding::text()[1],',')]

(I've also tried preceding-sibling with no joy.)

I hope someone can help show me the way!
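
A note on the expression above: ends-with() is an XPath 2.0 function and does not exist in XPath 1.0. If WorldServer evaluates parser rules as XPath 1.0 (an assumption, not confirmed here), the predicate cannot work as written. In addition, preceding::text()[1] can pick up a whitespace-only text node or one with a trailing space after the comma. One possible XPath 1.0 formulation, untested in WorldServer, would be:

//BR[preceding-sibling::node()[1][self::text()][substring(normalize-space(.), string-length(normalize-space(.))) = ',']]

The Python sketch below does not use WorldServer or an XPath engine. It only emulates the same test ("the node directly before the BR is text, and that text ends with a comma once whitespace is trimmed") with the standard library, on a shortened, hypothetical sample modeled on the XML above. The function name br_after_comma is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample modeled on the <value> element in the post.
SAMPLE = (
    "<value>"
    "<U><B>Digitaler Eingang (ohne Diagnose): </B><space/></U>\n"
    "<BR/>\n"
    "Sink/Source Beschaltung pro Kanal konfigurierbar, einstellbarer SW-Eingangsfilter,<BR/>"
    "fix oder ratiometrisch einstellbare Schaltschwelle<BR/>"
    "einstellbarer Analogfilter, <BR/>"
    "einstellbare Rampenbegrenzung"
    "</value>"
)


def br_after_comma(root):
    """Return (text_before, text_after) for each BR that is directly
    preceded by text ending in a comma (trailing whitespace ignored)."""
    hits = []
    for parent in root.iter():
        # Text directly before the first child is stored on the parent.
        before = parent.text or ""
        for child in parent:
            if child.tag == "BR" and before.rstrip().endswith(","):
                hits.append((before.strip(), (child.tail or "").strip()))
            # Text directly before the next sibling is this child's tail.
            before = child.tail or ""
    return hits


matches = br_after_comma(ET.fromstring(SAMPLE))
for before, after in matches:
    print(f"merge candidate: ...{before[-30:]} | {after[:30]}...")
```

On this sample, the BR after "SW-Eingangsfilter," and the BR after "Analogfilter, " are reported, while the BR that follows </U> and the one after "Schaltschwelle" are not. Whether WorldServer accepts the XPath 1.0 expression as a merge rule would still need to be checked in the product itself.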
