Friday, July 31, 2009

Unique Attribute Values Across Multiple Documents using XQuery

It's a little slow, but here's one way to get a list of all the unique attribute values across multiple XML documents using XQuery.


let $raw-values :=
  for $book in collection("abc")/(gbook|set)[@type='oeb']
  return
    element { "book" }
    {
      for $value in distinct-values($book//node()/@class)
      return element { "class" } { $value }
    }
for $item in distinct-values($raw-values//class)
order by $item
return element { "uniques" } { $item }
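
Since distinct-values() already removes duplicates across its whole input sequence, the intermediate "book" and "class" elements may not be needed. Here is a minimal sketch of a single-pass variant, assuming the same "abc" collection and the same gbook/set elements as the query above. I have not timed it, so treat any speed gain as a guess rather than a measurement.

```xquery
(: Gather every class attribute on descendant elements of the matching
   books, then deduplicate once across the whole collection. :)
for $item in distinct-values(
  collection("abc")/(gbook|set)[@type='oeb']//*/@class
)
order by $item
return element { "uniques" } { $item }
```

The path uses //*/@class so that, like $book//node()/@class in the original, it only looks at descendants of each book and skips a class attribute on the book element itself.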

Friday, July 10, 2009

MarkLogic XCC Layer File Open Errors

If you have library modules you're importing, the query may work fine in cq, but if you try to use the same query via the XCC layer you may get "File Open Error" messages.

One cause of this for me was the path in the import statement. cq seems to resolve a relative path while XCC does not, at least in MarkLogic 4.1.

I needed to change from...

import module namespace my = "http://blah.com" at "search-parser-xml.xqy",
"search-snippet.xqy";

...to...

import module namespace my = "http://blah.com" at "/search-parser-xml.xqy",
"/search-snippet.xqy";

Monday, July 6, 2009

MarkLogic, cq and Namespaces

If you import an XQuery library in cq and declare the namespace, cq gets fussy if you then try to declare your own functions. I know there are clear reasons for this, but here's what I do so I can use my own functions during testing.


xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

declare namespace my="http://www.my-web-site.com/xquery";

declare variable $options-title :=
  <options xmlns="http://marklogic.com/appservices/search">
    <searchable-expression>
      collection("abc123")//(div)
    </searchable-expression>
    <transform-results apply="snippet">
      <per-match-tokens>30</per-match-tokens>
      <max-matches>1</max-matches>
      <max-snippet-chars>200</max-snippet-chars>
      <preferred-elements/>
    </transform-results>
  </options>;

declare function my:do-search()
{
  search:search("food", $options-title, (), 25)
};

my:do-search()

Saturday, July 4, 2009

Saxon, Command Line, C#, and XSL 2.0

I've used Xalan/Xerces for command line XSL transformations for years, but as I've moved further away from Java I wanted something that was both .NET compatible and XSL 2.0 compatible. I finally switched to Saxon.

I normally use the standard XML objects in my ASP.NET apps, but I'll switch to Xalan command line tools when I need the "write" extension. I can do the same with Saxon now.

C:\Program Files\Saxon.NET>bin\Transform SaxonTest.xml SaxonTest.xsl


Here is the C# code to call an XSL transformation using Saxon. This one may seem a little odd because the code never saves an output file itself. The stylesheet does that, splitting a large XML file into multiple small files using <xsl:result-document>.

using System;
using Saxon.Api;

// "file" (the source document location) and "myStream" (the stylesheet)
// are defined elsewhere in the calling code.

// Create a Processor instance.
Processor p = new Processor();
// Load the source document.
XdmNode node = p.NewDocumentBuilder().Build(new Uri(file));
// Create a transformer for the stylesheet.
XsltTransformer transformer = p.NewXsltCompiler().Compile(myStream).Load();
// Set the root node of the source document to be the initial context node.
transformer.InitialContextNode = node;
// BaseOutputUri is only necessary for xsl:result-document.
transformer.BaseOutputUri = new Uri(file);
// Create a serializer for the principal output.
Serializer serializer = new Serializer();
transformer.Run(serializer);


Here is the stylesheet I used to leverage the XSL 2.0 equivalent of xalan:write.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:fn="http://www.w3.org/2005/xpath-functions"
  exclude-result-prefixes="fo xs fn">

  <xsl:output method="xml" indent="yes" encoding="UTF-8" name="xmlFormat"/>

  <xsl:template match="text()" />

  <xsl:template match="/">
    <xsl:for-each select="//node()[@fragment='true']">
      <xsl:variable name="filename" select="concat( /gpg-book/@local-id, '/', @local-id, '.xml' )"/>
      <xsl:result-document href="{$filename}" format="xmlFormat">
        <pcu-gpg-book>
          <xsl:copy-of select="/gpg-book/taxonomy_pcu"/>
          <xsl:copy-of select="/gpg-book/content-metadata"/>
          <xsl:copy-of select="/gpg-book/print-pub-metadata"/>
          <xsl:copy-of select="parent::node()"/>
        </pcu-gpg-book>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>