Wednesday, November 26, 2008

Thanksgiving, 2008

As we come to Thanksgiving, here are some things I'm thankful for this year.
  • My wife, children, and I had another year of good health.
  • The country managed to get through the election without another legal fiasco.
  • I found a new job (rather, a new job found me) relatively quickly after being laid off.
  • We made the decision to not buy a new car or commit to major home renovations mere months before being laid off and this current economic zaniness.
  • My sister made it through her 3rd tour of duty in Iraq, her 4th overseas, in one piece.
Happy Thanksgiving!

Monday, November 24, 2008

Loading XML into eXist Using XQuery and the Sandbox

This past weekend I was tinkering with the eXist XML database. The installation went fine and some of their sample queries ran fine. My next step was to load some of my content into it.

Rather than use their web interface or desktop client, I wanted to load the documents using XQuery through their sandbox application. I thought this would be quick and easy and would allow me to compare some features of eXist to MarkLogic Server.

There is quite a bit of documentation for eXist, but the XQuery API is light on specific usage examples. I also ran into some non-obvious gotchas. Here is the XQuery code that I used to load a document into a specific collection, along with some notes below.
declare namespace xmldb="";
declare variable $file as xs:string {
"file:///C:/Program%20Files/eXist/samples/mattio/sample.xml" };
declare variable $name as xs:string { "sample.xml" };
declare variable $collection as xs:string { "/db/test/" };

let $collection-status :=
  if (not(xmldb:collection-exists($collection))) then
    xmldb:create-collection("", $collection)
  else "Collection already exists."
let $load-status := xmldb:store($collection, $name, xs:anyURI($file))
return (
  <collection-status>{ $collection-status }</collection-status>,
  <load-status>{ $load-status }</load-status>
)
When I started the path with C:\ or left out xs:anyURI(), I got a misleading error that implied there was something wrong with my document. The error was:

XMLDB reported an exception while storing documentorg.xmldb.api.base.XMLDBException: fatal error at (1,1) : Content is not allowed in prolog. [at line 120, column 21] In call to function: sandbox:exec-query(xs:string) [134:10]

Here are some other notes.

  1. Note that the xmldb namespace needs to be declared.

  2. Note the syntax of $file. This is how you reference a document on your file system, including encoding the path to use %20 instead of a space.

  3. Note that $file must be wrapped in xs:anyURI() when used in xmldb:store() in order to force it to be considered a URI and not a simple string.
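Getting from a Windows path to a file URI in the shape shown above is easy to get wrong by hand. Here is a rough sketch of the conversion, written in Python rather than XQuery purely for illustration. The helper name windows_path_to_file_uri is made up for this example and is not part of eXist, and it assumes a local drive-letter path rather than a network share.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Turn a Windows drive-letter path into a file:/// URI.

    Hypothetical helper for illustration only; UNC shares are not handled.
    """
    # Backslashes become forward slashes, as in the $file value above.
    forward = path.replace("\\", "/")
    # Percent-encode everything except slashes and the drive colon,
    # so a space becomes %20.
    return "file:///" + quote(forward, safe="/:")


uri = windows_path_to_file_uri(
    r"C:\Program Files\eXist\samples\mattio\sample.xml"
)
print(uri)
```

The printed value is the string you would then wrap in xs:anyURI() before handing it to xmldb:store().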

Thanks to Dannes and Wolfgang for their help with this. They were on the exist-open list on a Saturday.

Next up I'll load about 50 large documents to build some basic queries to review index tuning.

Friday, November 21, 2008

Exporting XML Files from Textml

I had a case where documents were being created and stored dynamically in Textml Server by an application, but we wanted the physical files exported. I had a ContentServer class already in place for selecting all documents in a collection and for selecting a document by file name, which would make this easier. This was going nowhere near a production server, so reusing what I had to get this done quickly was my primary concern.

There are some ways to clean this up, but the general approach should be helpful in similar situations.

Here is the method that will return a list of all documents in a collection. This gives me the file name, which I use to get the individual documents.

private List<ListItem> SelectAllTextml()
{
    List<ListItem> myList = new List<ListItem>();
    string textmlStandardHeader = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><query VERSION=\"3.6\" RESULTSPACE=\"RGuideAdmin\">\n";
    string textmlStandardFooter = "</query>";
    string textmlCollection = "<property NAME=\"collection\"><elem>" +
        this.ContextAdditionalName + "</elem></property>";
    string textmlFile = "<property NAME=\"NAME\"><elem><anystr/></elem></property>";
    string textmlQuery = textmlStandardHeader + "<andkey>" +
        textmlCollection +
        textmlFile +
        "</andkey>" + textmlStandardFooter;
    IxiaClientServices IxiaCS = new IxiaClientServices();
    IxiaServerServices IxiaSS = IxiaCS.ConnectServer(this.ContextServer);
    IxiaDocBaseServices IxiaDS = IxiaSS.ConnectDocBase(this.ContextContainer);
    IxiaSearchServices IxiaSearchS = IxiaDS.SearchServices;
    IxiaResultSpace textmlResultSpace = IxiaSearchS.SearchDocuments(textmlQuery);
    if (textmlResultSpace.Count > 0)
    {
        for (int i = 0; i < textmlResultSpace.Count; i++)
        {
            ListItem documentItem = new ListItem();
            IxiaDocument document;
            document = textmlResultSpace.Item(i);

            MemoryStream xmlStream = new MemoryStream();
            xmlStream.Position = 0;
            XPathDocument textmlXmlDocument = new XPathDocument(xmlStream);
            XPathNavigator textmlXmlNav = textmlXmlDocument.CreateNavigator();
            documentItem.Text =
                textmlXmlNav.SelectSingleNode("descendant::title[1]").ToString() +
                " (" + document.Collection + ")";
            documentItem.Value =
                GetAttribute("id", "");

        }
    }
    return myList;
}

Here is the method that will return a document (or documents) by file name, limited to a collection.

private List<XmlDocument> SelectTextml(string fileName)
{
    List<XmlDocument> myList = new List<XmlDocument>();
    string textmlStandardHeader = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><query VERSION=\"3.6\" RESULTSPACE=\"RGuideAdmin\">\n";
    string textmlStandardFooter = "</query>";
    string textmlCollection = "<property NAME=\"collection\"><elem>" +
        this.ContextAdditionalName + "</elem></property>";
    string textmlFile = "<property NAME=\"NAME\"><elem>" + fileName + "<anystr/></elem></property>";
    string textmlQuery = textmlStandardHeader + "<andkey>" +
        textmlCollection +
        textmlFile +
        "</andkey>" + textmlStandardFooter;
    IxiaClientServices IxiaCS = new IxiaClientServices();
    IxiaServerServices IxiaSS = IxiaCS.ConnectServer(this.ContextServer);
    IxiaDocBaseServices IxiaDS = IxiaSS.ConnectDocBase(this.ContextContainer);
    IxiaSearchServices IxiaSearchS = IxiaDS.SearchServices;
    IxiaResultSpace textmlResultSpace = IxiaSearchS.SearchDocuments(textmlQuery);
    if (textmlResultSpace.Count > 0)
    {
        for (int i = 0; i < textmlResultSpace.Count; i++)
        {
            IxiaDocument document = textmlResultSpace.Item(i);
            MemoryStream xmlStream = new MemoryStream();
            xmlStream.Position = 0;
            XmlDocument textmlXmlDocument = new XmlDocument();

        }
    }
    return myList;
}

Here's the method I used to go through each document returned in the collection list and save each to an XML file.

myContentServer = new ContentServer(Server.MapPath("~/App_Data/" +
    ddlProduct.SelectedValue + ".xml"));
List<ListItem> myGuides = myContentServer.SelectAll();
if (myGuides.Count > 0)
{
    if (Directory.Exists(Server.MapPath(exportDirectory + "/" +
        ddlProduct.SelectedValue)))
    {
        // Delete the directory and anything existing in it.
        Directory.Delete(Server.MapPath(exportDirectory + "/" +
            ddlProduct.SelectedValue), true);
    }
    Directory.CreateDirectory(Server.MapPath(exportDirectory + "/" +
        ddlProduct.SelectedValue));
    foreach (ListItem guide in myGuides)
    {
        List<XmlDocument> myGuide = myContentServer.Select(guide.Value);
        XmlDocument document = myGuide[0];
        // Save the file to the export directory.
        document.Save(Server.MapPath(exportDirectory + "/" +
            ddlProduct.SelectedValue + "/" + guide.Value + ".xml"));
        divExportList.InnerHtml += "<br/>" + exportDirectory + "/" +
            ddlProduct.SelectedValue + "/" + guide.Value + ".xml";
    }
}

Tuesday, November 18, 2008

Weird Bug While Porting Textml Server Code from JSP to ASP.NET

This morning I was porting an old search results page accessing Textml Server from JSP to ASP.NET. One feature implemented there is search within results. We execute this by storing the original query in the session and then, when a user asks to search within results, we pull it out and re-run it so the second query can reference the first.

We have a line like this in the JSP page...
IxiaResultSpace originalResults = 
...followed a few lines later by a line like this...
"<include TYPE=\"ResultSpace\">" + sessionID + "-ALL</include>"
All was well.

The logic of the page overall is more than a bit wonky, but we decided to port first and revise later. When done, I was getting an error that said

"vrn2nc55cxej5knnemwyzvqv-ALL is not a valid ResultSpace include /query/andkey/include at Ixiasoft.TextmlServer.ResultSpace.ExecuteQuery() at Ixiasoft.TextmlServer.ResultSpace.get_Count() at searchresults.RunSearch() in c:\Greenwood Web Sites\devsite\searchresults.aspx.cs:line 357."

What's that now?

After some trips through the debugger, poking around the documentation and some googling (no one blogs on this thing) I went back to the old method of just writing out strings to the page. Nothing jumped out as an error and nothing worked.

By sheer chance, I decided to see what the string value of the original query was so I added...
string originalQuery = originalResults.TextmlQuery;
...to the page with the intent of displaying it somewhere for review. Suddenly the error stopped being thrown and the code functioned as expected. After making sure I had made no other changes, I tested it again. I commented out that line and the error was thrown. I put the line back in and the page ran fine. A co-worker asked if reading the Count property forced it to work as well, and it does.

I can't explain this one.

Tuesday, November 11, 2008

Logins Fail after SQL Server Restore

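After restoring a SQL Server database, logins can fail because the database users are no longer linked to their server logins. This script re-links each user to the login of the same name, creating the login with the given password if it doesn't already exist.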

EXEC sp_change_users_login 'Auto_Fix','UserOne', null, 'pwd1'
EXEC sp_change_users_login 'Auto_Fix','UserTwo', null, 'pwd2'
EXEC sp_change_users_login 'Auto_Fix','UserThree', null, 'pwd3'

To list the orphaned users:

EXEC sp_change_users_login 'Report', null, null, null

Wednesday, October 8, 2008

Add XQuery Support to UltraEdit

Leave it to the team that works on UltraEdit to make adding XQuery support easy.

Here's a tutorial on adding a language.

Here's the XQuery wordfile they provide.

Done in about 15 seconds.  Nice.

Sunday, September 28, 2008

Stop a Long-Running Query in MarkLogic Server

If you're like me, every once in a while you'll be working on a query in cq, you'll do something stupid in XQuery, run the query, and it will run forever. MarkLogic Server has built-in timeouts, but you can stop a long-running query rather than waiting. Here's how you can do it using the default admin console in 3.2.
  1. In the left-hand column, click on Groups.
  2. Click on Default.
  3. Click on App Servers.
  4. Click on the app server cq is connected to.
  5. Click the Status tab.
  6. Click the Show More button.
  7. Scroll to the bottom and you should see a request with the /cq/ path referenced.
  8. Click the cancel link.
  9. Confirm that you want to cancel the query.

The Last Game at old Yankee Stadium

I was at the last game at the old Yankee Stadium last Sunday, September 21 with my father. I'm not going to get sentimental. There's a tremendous amount of history in that stadium, but the physical structure was due to be replaced.  

Here are some things I'll enjoy remembering about the last game:
  1. During the pregame ceremonies, Yogi Berra standing behind home plate, looking very small and frail, but strangely looking like he belonged right there.
  2. The huge cheers given to Bernie Williams during the pregame ceremonies, his first time back to the stadium.
  3. The grass field well-worn with the season.
  4. Over 54,000 fans chanting "Der-ek Je-ter!" demanding that he take a bow at least 3 times during the game. Every time it was so loud that you could not hear the person next to you.
  5. Over 54,000 fans singing "God Bless America" along with Ronan Tynan. The guy behind us, who was being a Bronx-born tough guy all night, called his wife at the end to tell her he was enjoying the game, but ended the call abruptly while telling her Tynan just finished singing because he was getting choked up describing it.
  6. Derek Jeter, at the end of the game, leading the team around the field waving to the fans. His speech was great. The lap around the stadium was a nice tribute and appreciated. But what I noticed was how Jeter was leading, but about half-way around, a good part of the team wasn't following him that closely any more.  I thought that was an accurate visual comment on this season and the character of this year's team.
  7. Sitting in my seat, not wanting to leave, but being forced to because we had to rush to catch our Metro-North train home.
Here's the text of Derek Jeter's speech:
"Excuse me.  Excuse me. For all of us up here, it's a huge honor to put this uniform on everyday and come out here and play.
And every member of this organization, past and present, has been calling this place home for 85 years. There’s a lot of tradition, a lot of history and a lot of memories. Now the great thing about memories is you’re able to pass it along from generation to generation.
And although things are going to change next year, we’re going to move across the street, there are a few things with the New York Yankees that never change. That’s pride, that's tradition, and most of all we have the greatest fans in the world.
And we are relying on you to take the memories from this stadium, add them to the new memories that come at the new Yankee Stadium and continue to pass them on from generation to generation. So on behalf of the entire organization, we just want to take this moment to salute you, the greatest fans in the world."

Here's the new stadium as seen from the upper deck of the old stadium.

Here's bp for the last time.

This is the original 1922 pennant.  Very cool.

Saying thanks to the fans with Monument Park in the background.

The ump ringing up the last out ever at the old stadium.
Cody Ransom handing the ball to Mariano Rivera.

The video tribute done by Yogi Berra was moving as well, especially for a diehard fan. Here's the YouTube link and the ESPN link.

Monday, September 15, 2008

Creating or Updating an htpasswd File

Don't laugh.

I do this maybe twice a year.

To create a new htpasswd file:
  1. Using the command line / terminal, go to the directory you want to protect then type:
    user$ htpasswd -c .htpasswd YourUsername
  2. Type the password twice
The tool you're using is htpasswd. The -c switch is the instruction to create a new file. The file name will be .htpasswd. Note that in most systems, a file that starts with a . is hidden by default.

To update an existing htpasswd file with a new password for an existing user:
  1. Using the command line / terminal, go to the directory where the .htpasswd file exists then type:
    user$ htpasswd .htpasswd YourUsername
  2. Type the password twice
The difference here is the lack of the -c switch, which means the existing file will be updated.

Thursday, September 11, 2008

New XQuery Component for SyntaxHighlighter

I created an XQuery component for Alex Gorbatchev's SyntaxHighlighter, which you see in use with my XQuery snippets on this blog.

I've asked Alex to add it to the download package, but you can also download it from my site.

Tuesday, September 9, 2008

Add Syntax Highlighting to Your Blogger Blog

Here are a couple of good links to help you add syntax highlighting to your Blogger blog.
  1. The source in Google's code base.
  2. A "Yet Another Coding Blog" article.
  3. An alternate blog post in case one goes dark.

Monday, August 18, 2008

Atari 400 and My First Dose of FAIL

I'm not sure what made me think about this now, but I was just daydreaming and thought about my very first programming experience.

I had gotten an Atari 400 (may have been XL?) as a kid and it came with a BASIC cartridge and this binder full of programs you could write. I remember looking at them and thinking it was cool that if I typed these odd-looking lines that didn't make sense to me, the machine would do something. I was fixated on one that I think was shorter, a program that would display an American flag on the TV screen.

One night my father and I typed out the program. If I remember correctly, I typed it out the first time and it didn't run. FAIL. Nice first experience. My father tried it. FAIL. The keyboard was this awful touch-button setup -- not actual keys but areas you would press. Brutal typing. I think this is the model.

I tried again. And again. And again. FAIL. FAIL. FAIL.

I remember ... not being irritated or mad, but disappointed. I wanted it to run and it wouldn't and no one could tell me why. I remember sitting with my face right next to the TV screen going character by character, line by line and checking my work. Never did get it to run.

Hmm. Wonder if I can find that machine somewhere in my parents' basement?

Thursday, July 3, 2008

Read in a String of XML to an XPathDocument in ASP.NET

Had a string representation of an XML file today that needed to be read into an XPathDocument object.

// results was the string XML file representation
StringReader sReader = new StringReader(results);
XmlReader xReader = new XmlTextReader(sReader);
XPathDocument myXpathDocument = new XPathDocument(xReader);

Tuesday, June 24, 2008

Saving an XPathDocument or XmlDocument to a File in ASP.NET

Today I was working on a class that was returning an XPathDocument representation of an XML document and I needed to save it to a file. I switched the class to return an XmlDocument ... and the reason for that should be obvious from the two code samples below.

Here's how I saved an XPathDocument to an XML file:

// myItems[0] is from a generic List<XPathDocument> list
XPathDocument document = myItems[0];
// Feels like there should be an easier way to do this.
XPathNavigator documentNav = document.CreateNavigator();
XmlTextWriter writer = new XmlTextWriter(
Server.MapPath(temporaryFiles + "/" +
fileId + ".xml"), System.Text.Encoding.UTF8);
writer.Formatting = Formatting.Indented;
writer.Indentation = 3;
writer.WriteNode(documentNav, true);
// Close the writer so the output is flushed to disk and the file is released.
writer.Close();

Now here's how I saved an XmlDocument to an XML file:

// myItems[0] is from a generic List<XmlDocument> list
XmlDocument document = myItems[0];
document.Save(Server.MapPath(temporaryFiles + "/" + fileId + ".xml"));

Monday, June 23, 2008

Load an XML File from the File System into Textml with ASP.NET

I'm building an internal application that needs to load an XML file from the file system into a specific repository path in Textml Server. Here's the method I used within a ContentServer class I created.

The various this.* properties referenced below are set elsewhere in the same class, where I read the values from a configuration file for the staging and production servers.
public bool Publish(string fileUri)
{
    try
    {
        ArrayList documents = new ArrayList(1);
        IxiaClientServices IxiaCS = new IxiaClientServices();
        IxiaServerServices IxiaSS = IxiaCS.ConnectServer(this.ContextServer);
        IxiaDocBaseServices IxiaDS = IxiaSS.ConnectDocBase(this.ContextContainer);
        IxiaDocumentServices ds = IxiaDS.DocumentServices;
        IxiaDocument document = IxiaDocument.getInstance();
        FileInfo file = new FileInfo(fileUri);

        document.Name = file.Name;
        document.MimeType = "text/xml";
        document.Collection = this.ContextAdditionalName;
        document.Content = IxiaDocument.MakeBinaryContent(file.FullName);

        IxiaTextmlServerError [] err = ds.SetDocuments(documents,
            (int)IxiaDocumentServices.TextmlSetDocuments.TextmlAddDocument |
            (int)IxiaDocumentServices.TextmlSetDocuments.TextmlReplaceDocument |

        // If there is more than one item, and that first item is
        // not null or empty, return false.
        if (err.Length > 1 && !String.IsNullOrEmpty(err[0].ToString()))
        {
            // TODO: Log each in the EventViewer
            return false;
        }
        else return true;
    }
    catch (Exception ex)
    {
        // TODO: Log in the EventViewer
        return false;
    }
}

I need to do the same into MarkLogic so I'll post that snippet here as soon as it's done.

Tuesday, June 17, 2008

Using the ASP.NET AdRotator Control for Text-Only Ads

Here's a pathetic little hack to leverage the ASP.NET AdRotator control to generate text-only ads.

In your ASPX page:



In your code-behind page:

public void AdRotator1_CustomAdCreated(object sender, AdCreatedEventArgs e)
{
    AdRotator1.Visible = false;
    HyperLink1.Text = e.AlternateText;
    HyperLink1.NavigateUrl = e.NavigateUrl;
}

In your XML file:

<?xml version="1.0" encoding="utf-8" ?>
<Advertisements>
  <Ad><AlternateText>Did You Know Item 1</AlternateText></Ad>
  <Ad><AlternateText>Did You Know Item 2</AlternateText></Ad>
  <Ad><AlternateText>Did You Know Item 3</AlternateText></Ad>
</Advertisements>

Sunday, June 15, 2008

Getting Server and Other HTTP Information in a Class in an ASP.NET App

Another tidbit I can never remember when I need it. You're working on an ASP.NET website and you have a class that needs to access the Request or Response information, for example. Use HttpContext.Current as in the sample below.
_name = HttpContext.Current.Request.ServerVariables["SERVER_NAME"];

Wednesday, June 11, 2008

Removing Noise Words from a String with XQuery

MarkLogic doesn't offer a way to do stop words (a/k/a suppression lists a/k/a noise words) by default for various reasons -- and I didn't want to block them from being used in searches -- but I was asked to remove them from consideration when using hit highlighting. Here's the code I used to remove a fixed set of noise words from a user's search string.

define variable $NOISE_WORDS as xs:string*
{
  (: \b is a word boundary. This catches beginning,
     end, and middle of string matches on whole words. :)
  ('\bthe\b', '\bof\b', '\ban\b', '\bor\b',
   '\bis\b', '\bon\b', '\bbut\b', '\ba\b')
}

define function remove-noise-words($string, $noise)
{
  (: This is a recursive function. :)
  if(not(empty($noise))) then
    remove-noise-words(
      replace($string, $noise[1], '', 'i'),
      (: This passes along the noise words after
         the one just evaluated. :)
      $noise[position() > 1]
    )
  else normalize-space($string)
}

let $source-string1 := "The Tragedy of King Lear"
let $source-string2 := "The Tragedy OF King Lear These an"
let $source-string3 :=
"The Tragedy of the an of King Lear These of"
let $source-string4 := "The of an of"
(: Need to handle empty result if all noise words,
as in #4 above. :)
let $final :=
remove-noise-words($source-string1, $NOISE_WORDS)
return $final
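For comparison, here is a minimal sketch of the same whole-word removal in Python's re module instead of XQuery. The function name and the choice to fall back to the original string when every word is a noise word are assumptions made for this example; the post only notes that the all-noise case needs handling, not how.

```python
import re

NOISE_WORDS = ("the", "of", "an", "or", "is", "on", "but", "a")

# One case-insensitive pattern; \b keeps "the" from matching inside "These".
_NOISE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in NOISE_WORDS) + r")\b",
    re.IGNORECASE,
)


def remove_noise_words(text: str) -> str:
    """Strip noise words and collapse the leftover whitespace."""
    cleaned = " ".join(_NOISE_PATTERN.sub(" ", text).split())
    # Assumed behaviour: if nothing survives, keep the original string
    # so the caller still has something to highlight.
    return cleaned if cleaned else text


print(remove_noise_words("The Tragedy of King Lear"))
print(remove_noise_words("The of an of"))
```

Building a single alternation avoids the recursion, at the cost of fixing the word list when the pattern is compiled.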

Wednesday, May 14, 2008

Controlling Which Button Fires When the User Hits Enter in ASP.NET Web Apps

If you need to control which button fires when a user hits Enter on the keyboard in an ASP.NET web application, there are two simple ways to do this.

On an ASPX page with no MasterPage and only one Button control to worry about, set the DefaultButton property of the form tag to the ID of the Button. Nice and simple.

In my specific situation, I had a MasterPage and 3 different buttons, 1 for quick search at the top, 1 for a short signup form, and 1 for a login form if the user had already signed up. Depending on which group of fields had focus, I wanted the Enter key to fire the right Button event. The solution? Wrap each in a Panel control and then set the DefaultButton property on each Panel. Again, nice and simple.

More from Bean Software where I originally saw this hint.

Here's the MSDN documentation for it as well.

Friday, May 9, 2008

Get a List of Embedded Resource Names Within a .NET Application

I have a .NET Windows application and I wanted to embed an XSL file into it permanently and then reference it in code. Embedding it is just a quick setting change, but figuring out how to reference it stumped me. Here's how you can loop through all your embedded resources and see how you should reference them.
Assembly myAssemblyList = Assembly.GetExecutingAssembly();
string[] myResources = myAssemblyList.GetManifestResourceNames();
foreach (string resource in myResources)
txtMessage.Text += resource + "\r\n";
Once I had that information, here's how I could call the embedded XSL file.
Assembly embeddedXsl = Assembly.GetExecutingAssembly();
Stream myEmbeddedXslFile;
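myEmbeddedXslFile =
    embeddedXsl.GetManifestResourceStream("MyApp.xsl.MyXslDocument.xsl");
XmlDocument myStylesheet = new XmlDocument();
myStylesheet.Load(myEmbeddedXslFile);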
It seems obvious now, but "MyApp" was the assembly name. "xsl" was the directory I had the file stored in. "MyXslDocument.xsl" was the file name.

Using an Enum Instead of "Magic Values" in an ASP.NET Website

I have a nasty habit of using "magic values" in some complex web pages in ASP.NET sites ... you know, passing type=document or type=legal or type=address in the URL string and then having the page react to it. I've grown to dislike it because it can be a pain to debug on a complex page and I always have to remember the values I'm expecting. I'm experimenting with using an enum instead. Here's what I'm trying.

At the top of the page or in a base page class, I'll set up my enum with the valid page types defined.
enum PageType { address, legal, document, undefined };
I'll also put a method in there that will map a string to one of the predefined page types. I suppose I could do this as a property instead of a method.
protected PageType SetPageType(string thisType)
{
    // Is the value defined in the enum?
    if (Enum.IsDefined(typeof(PageType), thisType))
    {
        // Convert the string value to a defined PageType enum.
        return (PageType)Enum.Parse(typeof(PageType), thisType, true);
    }
    // Let's make sure we always have something set.
    return PageType.undefined;
}
On the page itself, I'll define a default PageType.
protected PageType currentType = PageType.undefined; // The default.
In Page_Load, I'll get the query string parameter value and convert it to a valid PageType.
string type = (String.IsNullOrEmpty(Request.QueryString["type"])) ? "" :
    Request.QueryString["type"].ToLower();
currentType = SetPageType(type);
Note that we force the value to lower case. Now, after all that, instead of matching strings I can say anywhere on my page:
if(Enum.Equals(currentType, PageType.document))
It may seem like a lot of work, but it truly does save development time and debug time in complex situations.
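The same pattern translates to other languages. As a rough sketch in Python (not the C# above), an Enum plus one parsing function keeps the string handling in a single place. The names PageType and parse_page_type are chosen for this example only.

```python
from enum import Enum
from typing import Optional


class PageType(Enum):
    ADDRESS = "address"
    LEGAL = "legal"
    DOCUMENT = "document"
    UNDEFINED = "undefined"


def parse_page_type(raw: Optional[str]) -> PageType:
    """Map a query-string value to a PageType, never raising."""
    # Force lower case, as the page does, and treat a missing value as "".
    value = (raw or "").strip().lower()
    try:
        return PageType(value)
    except ValueError:
        # Anything unrecognised falls back to the default.
        return PageType.UNDEFINED


current_type = parse_page_type("Document")
if current_type is PageType.DOCUMENT:
    print("render the document view")
```

Everything after the parse step compares enum members, so a misspelled page type fails loudly in code rather than silently in a string comparison.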

Accessing XML Elements with Namespaces using the ASP.NET XPathNavigator

If you need to access XML elements that have a namespace associated with them using the XPathNavigator object, here's how you can create an XmlNamespaceManager and use it. This is part of a bit of code where I was processing an RSS post.

StringBuilder myString = new StringBuilder();
XPathDocument myDoc = new XPathDocument(myDocumentUri);
XPathNavigator myNav = myDoc.CreateNavigator();
XmlNamespaceManager myManager = new XmlNamespaceManager(myNav.NameTable);
// Here you add the name and URI for all the namespaces you need to handle.
myManager.AddNamespace("content", "");
myManager.AddNamespace("dc", "");

XPathNodeIterator myPosts = myNav.Select("//post");
if (myPosts.Count > 0)
{
    while (myPosts.MoveNext())
    {
        // Add the post author.
        // You're passing the XmlNamespaceManager to SelectSingleNode
        XPathNavigator author = myPosts.Current.SelectSingleNode(
            "descendant::dc:creator[1]", myManager);
        if (author != null)
            myString.AppendLine("by " + author.ToString());
    }
}
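The prefix-to-URI mapping idea is not specific to .NET. Here is a small sketch of the same lookup using Python's standard xml.etree.ElementTree, where a plain dictionary plays the role of the XmlNamespaceManager. The sample feed is invented for this example; the dc URI is the standard Dublin Core elements namespace, which is an assumption about what the original feed used.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map, the counterpart of AddNamespace() calls.
NAMESPACES = {"dc": "http://purl.org/dc/elements/1.1/"}

# Invented sample data for this sketch.
SAMPLE_FEED = """<posts xmlns:dc="http://purl.org/dc/elements/1.1/">
  <post><title>First</title><dc:creator>Alice</dc:creator></post>
  <post><title>Second</title></post>
</posts>"""


def post_authors(xml_text: str) -> list:
    """Return a 'by <author>' line for each post that has a dc:creator."""
    root = ET.fromstring(xml_text)
    lines = []
    for post in root.iter("post"):
        # The namespace map is passed along with the prefixed path.
        author = post.find(".//dc:creator", NAMESPACES)
        if author is not None and author.text:
            lines.append("by " + author.text)
    return lines


print(post_authors(SAMPLE_FEED))
```

As in the C# version, the prefix used in the query only has to match the map you pass in, not the prefix used in the document itself.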

Thursday, May 1, 2008

NATO Phonetic Alphabet

Letter Phonetic Equivalent

A Alfa
B Bravo
C Charlie
D Delta
E Echo
F Foxtrot
G Golf
H Hotel
I India
J Juliett
K Kilo
L Lima
M Mike
N November
O Oscar
P Papa
Q Quebec
R Romeo
S Sierra
T Tango
U Uniform
V Victor
W Whiskey
X X-ray
Y Yankee
Z Zulu


Tuesday, April 15, 2008

Creating a Summary from a MarkLogic Search Result

It's called a summary, or a snippet, or context. It's the string beneath each search result that shows you some words around your search term(s) in the document that was returned.

There's a good one in lib-search if you're using it. I'm not ... yet. At first I tried to use just the relevant functions, but it wasn't doing quite what I wanted and it seemed pretty heavy, especially returning 25 documents per page. The additional things I wanted it to do were to allow me to ignore certain elements and to cross element boundaries. So, even though I'm an XQuery and MarkLogic rookie, I decided to try and roll my own!

module ""
default function namespace=""
declare namespace gpg=""

(: Take a search result and create a snippet of text based on the first hit
in the file. Exclude selected elements when generating the snippet. If the
hit is in an element that is removed, it will use the next available hit
or default to the first string of words available. Element boundaries are
ignored, which is a perceived benefit. :)

define variable $gpg:START-TEXT as xs:string { "ML-HIT-START" }
define variable $gpg:END-TEXT as xs:string { "ML-HIT-END" }

define function gpg:get-summary($node as node(), $cts-query as cts:query, $word-buffer as xs:integer) as node()
{
  let $myHighlight as node() := cts:highlight( $node, $cts-query, ($gpg:START-TEXT, $cts:text, $gpg:END-TEXT) )
  let $mySummary as node() := <summary> { gpg:remove-elements($myHighlight) } </summary>
  let $mySnippet as xs:string := gpg:create-snippet($mySummary, $word-buffer)
  (: Yes, we are running cts:highlight twice. The advantage is that it greatly simplifies
     the logic for getting the snippet text and has minimal impact on performance when
     compared to that alternative. It's the lesser of two evils. :)
  return cts:highlight(<summary> { $mySnippet } </summary>, $cts-query, <span class="hit"> { $cts:text } </span> )
}

define function gpg:create-snippet($node as node(), $word-buffer as xs:integer) as xs:string
{
  let $myString := normalize-space(string($node))
  let $myTokenizedString := tokenize($myString, "\s")
  (: If the sequence contains the start of the search indicator use it, else use 1. :)
  (: index-of() can return a sequence of hits, so just grab the first. :)
  let $myStartHit := if(index-of($myTokenizedString, $gpg:START-TEXT)[1] castable as xs:integer) then
      index-of($myTokenizedString, $gpg:START-TEXT)[1]
    else 1
  (: If starting the buffer's number of words before the hit is less than 1,
     start at 1, otherwise start at the first hit minus the buffer. :)
  let $myStart := if( ($myStartHit - $word-buffer) < 1 ) then 1 else ($myStartHit - $word-buffer)
  let $myEnd := $word-buffer*2
  (: Subsequence does not really care if you feed it negative numbers or numbers that
     extend beyond the source sequence's actual size, which is very useful here.
     Negative numbers can have odd results, though. :)
  let $myTokenizedStringSmall := subsequence($myTokenizedString, $myStart, $myEnd)
  (: Join the sequence back together as a string with spaces between each item. :)
  let $myUneditedString := string-join($myTokenizedStringSmall, " ")
  (: Delete the placeholder text completely. :)
  let $myEditedStringStart := replace($myUneditedString, $gpg:START-TEXT, '')
  let $myEditedStringEnd := replace($myEditedStringStart, $gpg:END-TEXT, '')
  (: When this is returned, run cts:highlight on it to get highlighting in the snippet.
     Or don't if it's not needed. :)
  return $myEditedStringEnd
}

(: This group of functions is used to remove selected nodes recursively. This means we can
remove hits on head or metadata elements, which might look odd in a snippet. :)
define function gpg:remove-elements($node as node()) as node()*
{
  for $i in $node/node() return gpg:removal($i)
}
(: This function removes nodes or passes them to the correct handler for processing. :)
define function gpg:removal($node as node()) as node()*
{
  typeswitch ($node)
    case text() return gpg:text-handler($node)
    case element(content-metadata) return () (: This is one that is removed. :)
    case element(head) return ()
    case element(entry-head) return ()
    case element(taxonomy) return ()
    case processing-instruction() return ()
    default return gpg:default-handler($node) (: The default is to return the node and recurse. :)
}
define function gpg:text-handler($node as node()?) as node()*
{
  if(empty($node)) then ()
  else (text {$node})
}
define function gpg:default-handler($node as node()?) as element()*
{
  element { local-name($node) }
    { $node/@*, gpg:remove-elements($node) }
}

If you're reading this on a narrower screen resolution, you may be losing the right-hand side of the code. Copy and paste it out to see it better.

UPDATE: This is painfully slow, primarily I think because my documents are too large to process like this. I'm working to make this leaner.
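The windowing step is the heart of the snippet logic, so here it is on its own as a rough Python sketch rather than XQuery. It assumes the hit markers arrive as separate whitespace-delimited tokens, and the function name create_snippet and its defaults are chosen for this example only.

```python
def create_snippet(text: str, word_buffer: int,
                   start_marker: str = "ML-HIT-START",
                   end_marker: str = "ML-HIT-END") -> str:
    """Return up to 2 * word_buffer tokens around the first hit marker."""
    tokens = text.split()
    # Use the first hit if there is one, otherwise start from the top.
    try:
        hit = tokens.index(start_marker)
    except ValueError:
        hit = 0
    # Back up word_buffer tokens, but never before the first token.
    start = max(hit - word_buffer, 0)
    window = tokens[start:start + 2 * word_buffer]
    # Drop the placeholder markers from the visible snippet.
    return " ".join(t for t in window if t not in (start_marker, end_marker))


sample = "one two three ML-HIT-START four ML-HIT-END five six seven eight"
print(create_snippet(sample, 2))
```

Because the markers count toward the window before they are dropped, the visible snippet can come out a token or two shorter than 2 * word_buffer.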

LibraryThing's JSON API

This morning I was tinkering with LibraryThing's JSON API just to demonstrate a proof of concept to some people internally. I used Firebug to take a look at what the service was actually returning. Here's the bare-bones script.

<title>LibraryThing Tests</title>
<p>This is an example of information we can get from LibraryThing.</p>
<h3>Christmas on Television</h3>
<script language="javascript" type="text/javascript">
function LTHandler(LTData)
{
  for (var i in LTData)
  {
    var book = LTData[i];
    // Just display all the data we know might be there.
    if (book.id) document.write("<b>ID:</b> " + book.id);
    if (book.type) document.write(" (" + book.type.toUpperCase() + ")<br />");
    else document.write("<br />");
    if (book.work) document.write("<b>LibraryThing (LT) work id:</b> " + book.work + "<br />");
    if (book.link) document.write("<b>LT link:</b> <a target='_blank' href='" + book.link + "'>" + book.link + "</a><br />");
    if (book.copies) document.write("<b>LT copies:</b> " + book.copies +
      "<br />");
    if (book.reviews) document.write("<b>LT reviews:</b> " + book.reviews +
      "<br />");
    if (book.rating) document.write("<b>LT rating:</b> " + book.rating);
    if (book.rating_img) document.write(" or <img src='" + book.rating_img + "'/>");
  }
}
</script>
<script src=""></script>

Friday, March 28, 2008

Setting and Getting Document Quality in MarkLogic Server

I had a request to influence the score of documents returned by searches based on the year of publication. Since a year isn't used as part of most searches, it seemed like the best approach was to set the document quality to the pub year. Then I could use that value in the scoring calculations. Take a look at the Developer's Guide for how to do that.

Here's how I looped through all the documents in a collection and set document quality using a value stored in the XML already.

(: Set document quality :)
for $i in collection('myCollection')/book
let $myYear := $i/metadata/publication-date
let $myBaseUri := base-uri($i)
(: I don't know about you, but I don't trust my XML vendors.
This tests casting the data to an int first. :)
let $myDocumentQuality :=
if($myYear castable as xs:integer) then
$myYear cast as xs:integer
else 1990 (: This is a default setting in case of bad data. :)

return xdmp:document-set-quality($myBaseUri, $myDocumentQuality)

Depending on how many documents you have stored, you may need to modify this to set the document quality in smaller batches because it is quite intensive.

Once that's run, you can go back and review all your document quality settings. I originally had these two queries run together, but I think it takes a minute or two (depending on your system) for the settings to actually be indexed.

(: Get document quality :)
for $i in collection('myCollection')/book
return xdmp:document-get-quality(base-uri($i))

Monday, March 17, 2008

Bad CodepointIterator::_next Error from MarkLogic Server

Getting a "Bad CodepointIterator::_next" error from a query in MarkLogic? Check your configuration at the app server level.

We were seeing this error on our production instance, but not our development instance. In our ASP.NET app (through XCC) we would see a generic message, but when we ran the query in cq we saw the more specific version. After a wasted day or so trying to track down the root cause, we thought we had it narrowed down to UTF-8 characters, but we couldn't explain why it was fine on one server but not the other. Then on a whim I compared the configurations looking for anything out of sync and noticed that the "collation" field on the server where we had the error was set to a different value than on our development instance. We set them both to use /codepoint and the problem was solved.

If you're using both the HTTP and XDBC layers, make sure to check them both. We fixed it on our HTTP layer, which eliminated the bug in cq, but we were still getting it in the ASP.NET application until we changed the setting in both.

Sunday, March 16, 2008

Basic Information on MarkLogic Collections

I wanted to get some quick information about all of my collections, starting with a list of names and how many documents were in each. This isn't rocket science, but I'll add to this post as the query expands.

(: Set "collection lexicon" to true in the index definition. :)
let $collections := cts:collections() return
<root count="{ fn:count($collections) }">
{
  for $collection in $collections
  return
    <collection>
      <name> { $collection } </name>
      <size> { fn:count(fn:collection($collection)) } </size>
    </collection>
}
</root>

Wednesday, March 12, 2008

Hiding the Finish Button on an ASP.NET Wizard Control

This is a Class 1 stupid hack. I think it's one of those things where there probably is an easier way to do this, but I haven't found it yet. On an ASP.NET Wizard control, I wanted to hide the Finish button. Here's one way.
myWizard.FinishCompleteButtonType = ButtonType.Link;
myWizard.FinishCompleteButtonText = String.Empty;
myWizard.FinishCompleteButtonStyle.CssClass = "";
myWizard.FinishCompleteButtonStyle.BorderColor = System.Drawing.Color.White;
Another way is to create a CSS class that will hide the button and then assign that to the CssClass property.

Save to Word in ASP.NET

I was looking around for a code snippet to allow a user to save / export a print version of a web page to a Word file on the fly. This didn't need to be complex at all, but everything I found was either overwrought or would result in a Word document that displayed HTML tagging.

Here's an oversimplified method that works. You can build this up as needed. NOTE: Be sure to eliminate the extra space in the HTML tags below. Blogger's not behaving for me, so I had to insert them.

protected void Button1_Click(object sender, EventArgs e)
{
    // Clear the response of any output
    Response.Clear();
    // Buffer this so the document comes down in one piece
    Response.Buffer = true;
    // You can do Excel, too, but I haven't tried that yet
    Response.ContentType = "application/msword";
    // You can use a variable instead of MyDoc, of course
    Response.AddHeader("Content-Disposition", "inline;filename=MyDoc.doc");

    StringBuilder myBuilder = new StringBuilder();
    // This is the key bit versus other examples I've seen
    // Adding the opening < html> etc forces Word to open this in Web Layout view
    // If you use inline CSS, you can do much more
    myBuilder.Append("< html>< head>< /head>< body>");
    myBuilder.Append("< h1>Document Head< /h1>");
    myBuilder.Append("< p>< strong>" + Label1.Text + "< /strong>< /p>");
    myBuilder.Append("< p>" + TextBox1.Text + "< /p>");
    // Do whatever else you need and append it as a string
    myBuilder.Append("< /body>< /html>");

    // Send the markup to the browser and end the response
    Response.Write(myBuilder.ToString());
    Response.End();
}


Saturday, March 1, 2008

Save an ASP.NET TreeView Control to an XML File

Here are two methods to help you take a TreeView web control and save / export / serialize it to a file. Add these two statements to the top of the file:
using System.Xml;
using System.Text;
These two methods take the TreeView and recursively loop through all the TreeNode controls.
/// <summary>
/// Recurses through a TreeView web control and exports the results
/// to an XML file nested to match the TreeView. Properties are
/// saved as attributes when a value is set.
/// </summary>
/// <param name="myTreeView">A TreeView web control.</param>
/// <param name="myFile">A file path and name with extension.</param>
protected void TreeViewToXml(System.Web.UI.WebControls.TreeView
    myTreeView, string myFile)
{
    XmlTextWriter xmlWriter = new XmlTextWriter(myFile, Encoding.UTF8);
    try
    {
        xmlWriter.Formatting = Formatting.Indented;
        xmlWriter.Indentation = 3;
        // Opens <TreeView>, the root node.
        xmlWriter.WriteStartDocument();
        xmlWriter.WriteStartElement("TreeView");

        // Go through child nodes.
        ProcessNodes(myTreeView.Nodes, xmlWriter);

        // </TreeView>, the root node.
        xmlWriter.WriteEndElement();
        xmlWriter.WriteEndDocument();
    }
    finally
    {
        // Closes the object and saves the document to disk.
        // Any exceptions still travel up the line to the caller.
        xmlWriter.Close();
    }
}

/// <summary>
/// Recursively processes each TreeNode within a TreeNodeCollection
/// from a TreeView web control.
/// </summary>
/// <param name="myTreeNodes">A TreeNodeCollection from a TreeView web
/// control.</param>
/// <param name="myWriter">A XmlTextWriter being used to hold the XML
/// results from the TreeViewToXml method.</param>
protected void ProcessNodes(TreeNodeCollection myTreeNodes,
    XmlTextWriter myWriter)
{
    foreach (TreeNode thisNode in myTreeNodes)
    {
        myWriter.WriteStartElement("TreeNode"); // <TreeNode>.

        // Go through each property and set it as an attribute if it exists.
        myWriter.WriteAttributeString("Text", thisNode.Text);
        if (!String.IsNullOrEmpty(thisNode.ImageToolTip))
            myWriter.WriteAttributeString("ImageToolTip", thisNode.ImageToolTip);
        if (!String.IsNullOrEmpty(thisNode.ImageUrl))
            myWriter.WriteAttributeString("ImageUrl", thisNode.ImageUrl);
        if (!String.IsNullOrEmpty(thisNode.NavigateUrl))
            myWriter.WriteAttributeString("NavigateUrl", thisNode.NavigateUrl);
        if (!String.IsNullOrEmpty(thisNode.SelectAction.ToString()))
            myWriter.WriteAttributeString("SelectAction", thisNode.SelectAction.ToString());
        if (!String.IsNullOrEmpty(thisNode.Target))
            myWriter.WriteAttributeString("Target", thisNode.Target);
        if (!String.IsNullOrEmpty(thisNode.ToolTip))
            myWriter.WriteAttributeString("ToolTip", thisNode.ToolTip);
        if (!String.IsNullOrEmpty(thisNode.Value))
            myWriter.WriteAttributeString("Value", thisNode.Value);
        if (!String.IsNullOrEmpty(thisNode.ValuePath))
            myWriter.WriteAttributeString("ValuePath", thisNode.ValuePath);
        if (thisNode.ShowCheckBox.HasValue)
            myWriter.WriteAttributeString("ShowCheckBox", thisNode.ShowCheckBox.Value.ToString());
        if (thisNode.Expanded.HasValue)
            myWriter.WriteAttributeString("Expanded", thisNode.Expanded.Value.ToString());

        // Recurse through any child nodes.
        if (thisNode.ChildNodes.Count > 0)
            ProcessNodes(thisNode.ChildNodes, myWriter);

        myWriter.WriteEndElement(); // </TreeNode>.
    }
}

Friday, February 22, 2008

Apache-Style Web Application Security (Almost) in ASP.NET with Web.config

I say "almost" because if it really was Apache-style, it would be easy. But this is Windows we're talking about, so it isn't.

What I wanted to do was pretty simple: launch a web app with anonymous access except for one directory containing administrative controls, which I wanted to protect with the standard Login control and values stored in the Web.config file.

For the Web.config settings, what I saw was all over the map in terms of what goes where, when, and why. Here's what I ended up using:

<configuration>
  <!-- ... -->
  <system.web>
    <authentication mode="Forms">
      <!-- When authentication is triggered, the login page will
           be /EditLogin.aspx and the user will be sent to
           /edit/default.aspx after a successful login attempt. -->
      <forms loginUrl="EditLogin.aspx" defaultUrl="edit/default.aspx">
        <!-- Putting this into Web.config isn't exactly a security
             best practice, but at least we'll use SHA1 hashing. -->
        <credentials passwordFormat="SHA1">
          <user name="user1" password="AAAAAAAAAAAAAAAAAAAAAAAAAAAA"/>
          <user name="user2" password="BBBBBBBBBBBBBBBBBBBBBBBBBBBBB"/>
        </credentials>
      </forms>
    </authentication>
    <authorization>
      <!-- This says that all users can get access to the app. -->
      <allow users="*" />
    </authorization>
  </system.web>
  <!-- Anonymous access will be true for all directories except
       this /edit directory. user1 and user2 can get access, but
       all others will be denied. -->
  <location allowOverride="false" path="edit">
    <system.web>
      <authorization>
        <allow users="user1, user2"/>
        <deny users="?"/>
      </authorization>
    </system.web>
  </location>
</configuration>
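The password values in the credentials block need to be SHA1 hashes rather than the plain passwords. Here is a quick sketch for producing one outside of ASP.NET, using Node's crypto module. It assumes the stored value is the hex-encoded SHA1 digest of the UTF-8 password, so check the output against FormsAuthentication.HashPasswordForStoringInConfigFile before relying on it. The sha1ForConfig name is made up for this example.

```javascript
var crypto = require("crypto");

// Hash a password as uppercase hex SHA1 for pasting into Web.config.
// Assumption: ASP.NET expects hex-encoded SHA1 of the UTF-8 bytes.
function sha1ForConfig(password) {
  return crypto
    .createHash("sha1")
    .update(password, "utf8")
    .digest("hex")
    .toUpperCase();
}

// Example only; never use a password like this one.
console.log(sha1ForConfig("password"));
```

A SHA1 digest is 20 bytes, so the hex string is always 40 characters long.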

Some documentation claims you can use ~/ in the loginUrl and defaultUrl and path definitions. Whenever I included it, the authentication settings were ignored. There are articles out there that start or end their path definitions with a /. This always threw a compile error for me. There are also articles that state you can nest an <authentication> element within <location>/<system.web>, but I always got a compile error on that, too.

I added a Login control to EditLogin.aspx, but I was initially under the impression that on submit the Web.config values would be used and the user would be redirected. This was incorrect. You need to set the Authenticate event handler in your login page.

protected void Login1_Authenticate(object sender, AuthenticateEventArgs e)
{
    if (FormsAuthentication.Authenticate(Login1.UserName, Login1.Password))
        e.Authenticated = true;
}

You'll also need to add OnAuthenticate="Login1_Authenticate" to the Login control.

I may have missed something obvious where I ran into errors, but I hope this prevents someone else from wasting a few hours of their day!

Wednesday, February 6, 2008

Free Information Retrieval Book Available

Introduction to Information Retrieval published by Cambridge University Press is available free online. Many thanks to the authors for this. They say it will remain free after the book is published. I may grab a copy of the PDFs...just in case.

Just reading the introduction makes me wish I paid more attention in math. I wonder if there are any math textbooks available on USENET? ;-)

Saturday, February 2, 2008

The Time Stamp of Your IIS Logs is Not Wrong

Looking at your IIS server logs and wondering why the time stamp doesn't seem to reflect the actual server time? If you have IIS set to write the logs in W3C extended log file format, the date is always in GMT. Microsoft does have a knowledge base article on this.
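If you need to line log entries up with local events, the conversion is easy to script. This is a minimal sketch that assumes you have already pulled the date and time fields out of a W3C extended log line. The parseW3CLogTime name and the sample values are made up for illustration.

```javascript
// Parse the date and time fields of a W3C extended log line as UTC.
function parseW3CLogTime(dateField, timeField) {
  // "2008-02-02" + "14:30:05" becomes an ISO 8601 string with an explicit Z (UTC).
  return new Date(dateField + "T" + timeField + "Z");
}

var stamp = parseW3CLogTime("2008-02-02", "14:30:05");
console.log(stamp.toISOString()); // the same instant, still expressed in UTC
console.log(stamp.toString());    // the same instant in the machine's local time zone
```

The explicit Z suffix is the important part. Without it, the string would be read as local time and the offset would be applied twice in your head instead of once by the runtime.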

Here's a GMT conversion chart for the United States.

Saturday, January 26, 2008

Error Rendering Control with ASP.NET WizardStep and Accordion Controls

If you're using the .NET Framework 2.0 with Visual Studio 2005 and ASP.NET AJAX 1.0, you may run into odd errors when you use the Wizard control and you try to embed an Accordion control inside a WizardStep.

When I did this, it was all in the ASPX code view and the code-behind file. The page and the project built cleanly and the run-time tests were fine, as well. But when another developer then tried to switch the ASPX to design view he saw an error for the entire Wizard control that said "Error Rendering Control" and then the project failed to build throwing errors complaining about the Accordion. We carefully tested this to make sure no other edits were made, that we could reproduce it, and that nothing in our configurations changed. When we reverted that file back to the original, the project built cleanly again and worked fine when deployed.

After some searching, all I could find was one post on the AJAX forums that said it was a known issue in VS 2005 that was fixed in VS 2008. We haven't purchased VS 2008 yet and we're not prepared to upgrade to Framework 3.5 on our production servers yet either, so the suggestions here are not that helpful.

If anyone has a fix for this using Framework 2.0 + VS 2005 + ASP.NET AJAX 1.0, I'd love to hear it.

Thursday, January 24, 2008

Modify an XML Fragment Retrieved by MarkLogic Server

Last week I was experimenting with a scenario where I wanted to retrieve a section of a larger XML document from MarkLogic Server, but modify it before passing it up to the web application layer (we're using MarkLogic -> XCC -> ASP.NET). This XQuery sample recurses through all of the element and text nodes, modifies what's needed, and passes the rest back unchanged. In the end we decided on a different approach, so this query is not tuned but the premise is interesting. I think there's probably a better way to structure the default handler, but for now ...

define variable $myId as xs:string external

define function recurse($node as node()) as node()*
{
  for $i in $node/node() return modify-fragment($i)
}

(: Here we define which elements we want to modify and declare
a handler for them. :)
define function modify-fragment($node as node()) as node()*
{
  typeswitch ($node)
    case text() return text-handler($node)
    case element(ref) return ref($node)
    case element(see) return see($node)
    case element(see-also) return see-also($node)
    default return default-handler($node)
}

define function text-handler($node as node()?) as node()*
{
  if(empty($node)) then ()
  else (text {$node})
}

define function default-handler($node as node()?) as element()*
{
  element { local-name($node) }
  { $node/@*, recurse($node) }
}

define function see($element as element(see)) as element()
let $destination-node := $element/ancestor::book/
descendant::node()[@local-id=$element/@seeref] return
[@fragment='true'][1]))) then
< see>
{ $element/@* }
attribute {"destination-node"}
attribute {"fragment-local-id"}
{ recurse($element) }
< /see>
else recurse($element)

define function see-also($element as element(see-also)) as element()
let $destination-node := $element/ancestor::book/
descendant::node()[@local-id=$element/@seeref] return
[@fragment='true'][1]))) then
< see-also>
{ $element/@* }
attribute {"destination-node"}
attribute {"fragment-local-id"}
{ recurse($element) }
< /see-also>
else recurse($element)

define function ref($element as element(ref)) as element()
if($element/@type='local-id') then
let $destination-node := $element/ancestor::book/
descendant::node()[@local-id=$element/@value] return
[@fragment='true'][1]))) then
< ref>
{ $element/@* }
attribute {"destination-node"}
attribute {"fragment-local-id"}
{ recurse($element) }
< /ref>
else recurse($element)
else recurse($element)

(:let $myId := 'ID1234':)

for $i in (//chapter[@local-id = $myId] | //entry[@local-id = $myId])[1]

< fragment imagepath="{$i/property::imagepath/text()}">
< /fragment>

Monday, January 21, 2008

My New Kakadu

About two weeks ago I was packing up my stuff and getting ready to leave work when I noticed something stunk. It was my 6-year-old black Eddie Bauer shoulder bag and it was not pleasant.

I had a little Christmas cash in my pocket, so I shopped around a little. After looking at bags from Timbuk2, Chrome, Manhattan Portage and many others I decided that a) they're all over-priced and b) they're all dangerously close to man purses.

I turned to various Army/Navy store outlets and companies that produced similar bags. Most were either way too small, way too big, or had some obnoxious logo on them. Then I found Kakadu Imports. At $49.75, their satchel bag is inexpensive, rugged, and cool. The bag is solidly constructed, the straps are sturdy, and the hardware is nice and heavy...heavy enough that I could club a purse-snatcher soundly. =)

If you're looking for a new bag, check them out. I'm definitely happy with mine.

Tuesday, January 15, 2008

XQuery Support in MarkLogic Server

It's buried a little deep, but here's how to find out which version of the XQuery spec your version of MarkLogic supports.

Go to the MarkLogic developer site and find the documentation. Find the release notes. Find the section called "Compatibility with XQuery Drafts."

Here's the information for 3.2.
This release implements the XQuery language, functions and operators specified in the May 02, 2003 W3C XQuery Working Group Draft Recommendations:
Additionally, much of the added functionality in the January 2007 W3C XQuery Recommendation is implemented in MarkLogic Server 3.2.

Friday, January 4, 2008

Vista Downgrade

As I mentioned in an earlier post, I have a computer with Microsoft Vista. Had. After several months of trying to live with it, I decided to downgrade to XP Pro. I want the OS to stay out of the way and just work and perform reasonably. Is that really so much to ask?

Getting all the drivers straight was a bit of a pain, but the machine is back up and running now. All I have to say is ... ahhhhhhh.

Tuesday, January 1, 2008

Get a Random Item from an Array in ASP.NET

Here's a quick and easy way to get a random item from an array in ASP.NET.
string[] myImages = { "/images/one.gif", "/images/two.gif", "/images/three.gif" };
string randomItem = myImages[ new Random().Next( 0, myImages.GetLength(0) ) ];
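This works because the upper bound passed to Random.Next is exclusive, so the array length is a safe maximum. If the pick happens in the browser instead, the same idea looks like this in JavaScript. It is only a sketch: pickRandom is a made-up name, and the optional second argument exists just so the random source can be swapped out when testing.

```javascript
// Pick one item from an array.
// `random` is optional and must return a number in [0, 1), like Math.random.
function pickRandom(items, random) {
  var rnd = random || Math.random;
  // Math.floor keeps the index in 0..items.length - 1.
  return items[Math.floor(rnd() * items.length)];
}

var myImages = ["/images/one.gif", "/images/two.gif", "/images/three.gif"];
console.log(pickRandom(myImages));
```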

"The state information is invalid for this page and might be corrupted" Using ASP.NET AJAX

When you're using ASP.NET AJAX and the user accesses your site using Firefox, the user may see an error message that says "The state information is invalid for this page and might be corrupted." This seems to be caused by Firefox's methods for saving session information in its cache. There are a couple of ways to make sure this doesn't happen, depending on how extensively you want to disable the cache. In my case, I just wanted to do it on one page, so at the top of Page_Load I added:
You could also set this in the page declaration with:

<%@ Page Language="C#" MasterPageFile="~/MasterPage.master" AutoEventWireup="true" CodeFile="account.aspx.cs" Inherits="Main" Title="My Account" EnableViewStateMac ="false" EnableSessionState="True" EnableEventValidation ="false" ValidateRequest ="false" ViewStateEncryptionMode ="Never" %>

You could also add something like the below to Web.config:

<pages validateRequest="false" enableEventValidation="false" viewStateEncryptionMode="Never" />