Today I needed to munge some dirty XML data. I still haven't taught myself XSL/XPath 2.0 yet, so I was limited to XSL 1.0 for now. The data I had looked like this, only much, much worse.
<Subject>Value 1|Value 2</Subject>
<Subject>Value 1|Value 2</Subject>
<Subject>Value 1|Value 2</Subject>
<Time>Time Value 1</Time>
<Time>Time Value 1</Time>
<Time>Time Value 1</Time>
<Subject>Value 3|Value 4</Subject>
<Subject>Value 3|Value 4</Subject>
<Subject>Value 3|Value 4</Subject>
<Time>Time Value 2</Time>
<Time>Time Value 2</Time>
<Time>Time Value 2</Time>
I wanted two things out of that series of elements: unique strings and the value before the |. As I type this, I realize there may be a bit of a bug here, but I'll have to test it out. Here's what I did for the series of subject elements. Can you spot the bug? ;-)
<xsl:variable name="subjects" select="/fragment/index-only-subjects//Subject[not(text()=preceding-sibling::Subject/text())]/text()"/>
<xsl:for-each select="$subjects">
<xsl:sort select="." data-type="text"/>
<subject>
<xsl:choose>
<xsl:when test="contains(.,'|')">
<xsl:value-of select="substring-before(.,'|')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</subject>
</xsl:for-each>
I'm pretty sure there's a way to do this with xsl:keys / key(), but I got this solution working first.
No comments:
Post a Comment