I am trying to get counts of the conrefs, variables, images, and topics included in a baseline so I can present some metrics about the amount of reuse (some factors are ignored). My approach is to parse the fields:
For each baseline, I get the logical ID and version of each map or topic. The DocumentObj.RetrieveLanguageMetadata endpoint only allows you to specify one version, so I make multiple calls per baseline, one for each version included in the baseline. This is very slow, since I have to do this for thousands of baselines.
Do you know of a faster way to gather this information?
You are only showing part of the puzzle, but for a full report you have to organize your data differently, in a *slave* database. It is better to get the link metadata for all versions of the objects and then store it in a more optimized form so you can count.
Doing this directly on the CMS simply puts a lot of logic in your API application and has performance issues. At some point we shared an idea of how it could be done in a webinar; see https://community.sdl.com/product-groups/sdl-tridion-dx/tridion-docs/m/videos/3165 around minute 45.
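To make the "store it in a more optimized way so you can count" idea concrete, here is a minimal sketch using SQLite from the Python standard library. The table names, column names, and link types are hypothetical; they are not taken from the CMS or from the webinar. The sketch assumes you have already exported the baseline contents and the link metadata by some other means.

```python
import sqlite3

def build_store(conn):
    # Hypothetical reporting schema: one row per baseline entry,
    # one row per outgoing link of a specific object version.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS baseline_item (
            baseline_id TEXT,
            logical_id  TEXT,
            version     TEXT,
            PRIMARY KEY (baseline_id, logical_id)
        );
        CREATE TABLE IF NOT EXISTS link (
            source_id      TEXT,
            source_version TEXT,
            link_type      TEXT,  -- assumed values: conref, variable, image, topic
            target_id      TEXT
        );
    """)

def reuse_counts(conn, baseline_id):
    # Count links per type, restricted to the versions pinned by the baseline.
    rows = conn.execute("""
        SELECT l.link_type, COUNT(*)
        FROM baseline_item b
        JOIN link l
          ON l.source_id = b.logical_id
         AND l.source_version = b.version
        WHERE b.baseline_id = ?
        GROUP BY l.link_type
    """, (baseline_id,))
    return dict(rows.fetchall())

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_store(conn)
    conn.executemany(
        "INSERT INTO baseline_item VALUES (?, ?, ?)",
        [("BL1", "GUID-A", "1"), ("BL1", "GUID-B", "2")],
    )
    conn.executemany(
        "INSERT INTO link VALUES (?, ?, ?, ?)",
        [
            ("GUID-A", "1", "conref", "GUID-C"),
            ("GUID-A", "1", "image", "GUID-IMG"),
            ("GUID-B", "2", "conref", "GUID-C"),
            ("GUID-B", "1", "variable", "GUID-V"),  # version not in BL1, ignored
        ],
    )
    print(reuse_counts(conn, "BL1"))
```

Because the link rows are stored once per object version, adding another baseline only adds rows to `baseline_item`; the counting itself never goes back to the CMS.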
Thanks. I'll watch. Generally I've been using graph databases and SPARQL for reporting. But it's getting the data for the metrics out of the CMS that I'm asking about here. Is there a faster way to find each variable, conref, image, or topic referenced in many baselines than running a baseline report and then calling DocumentObj.RetrieveLanguageMetadata, possibly multiple times per baseline?
Hi Kendall - I'm interested in your graph db brand and usage :)
On the API part, I don't know if we can do all this over written correspondence. Let's see how this evolves...
A baseline contains LogicalIds (typically GUIDs) and Versions. But the metadata you need is on Language level (typically the source language or the publication working language). So instead of retrieving objects one by one using LogicalId plus Version, you can ask for a report on the baseline that gives you back Language Card Ids.
OK. Apparently I misunderstood the meaning of the reportitem elements. I am using Baseline25.GetReport, but each object element only has the logical ID and version number. The first reportitem, however, has the ishlngref, which I took to be a link rather than the object itself. Thanks!
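A rough sketch of collecting those language card IDs from the reports could look like the following. Only the `reportitem` element and `ishlngref` attribute names come from this thread; the surrounding XML layout in the sample, the function names, and the batch size are assumptions for illustration, so check them against a real report. The actual bulk retrieval call is left as a callable you supply, since it depends on your API version.

```python
import xml.etree.ElementTree as ET

def language_card_ids(report_xml):
    # Collect ishlngref values from every reportitem, keeping first-seen order.
    root = ET.fromstring(report_xml)
    seen = set()
    ids = []
    for item in root.iter("reportitem"):
        ref = item.get("ishlngref")
        if ref and ref not in seen:
            seen.add(ref)
            ids.append(ref)
    return ids

def unique_ids_across(reports):
    # Many baselines share the same objects, so de-duplicate across reports
    # and fetch each language card only once.
    seen = set()
    ids = []
    for report_xml in reports:
        for ref in language_card_ids(report_xml):
            if ref not in seen:
                seen.add(ref)
                ids.append(ref)
    return ids

def chunked(seq, size):
    # Yield fixed-size batches so each metadata call covers many cards.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def fetch_all(reports, fetch_batch, batch_size=500):
    # fetch_batch is a hypothetical callable wrapping your bulk metadata call.
    results = []
    for batch in chunked(unique_ids_across(reports), batch_size):
        results.extend(fetch_batch(batch))
    return results
```

With thousands of baselines, the de-duplication step is likely where most of the savings come from, since the number of distinct language cards is usually far smaller than the number of baseline entries.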
I am currently using Blazegraph to store RDF, and I use SPARQL to query the triples. I've also used openrdf's native storage implementation with their API. Another approach, when trees made more sense than graphs, was to use XQuery, so I stored the data in BaseX.
Many of the problems I have had to solve when working with DITA CMS systems come from the fact that the CMS (SDL or other) stores XML as text. Inventing my own logic to deal with the text has usually looked like more work than using one of the standards-based solutions like XML or RDF, so I've dealt with that by transferring data out of the CMS into an XML database or graph database in order to work with it.