Rendition Protocol: May 2007

Thursday, May 31, 2007

Update a Table Based on Values from a Separate Table

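Here's a script to update a table based on values from a separate table, where the two tables can be joined by an ID field. The UPDATE ... FROM form shown here works in SQL Server (T-SQL); other databases may need different syntax.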

UPDATE destinationtable
SET destinationtable.path = sourcetable.path
FROM sourcetable
WHERE destinationtable.id = sourcetable.id

Sunday, May 27, 2007

Web Services in PHP 4

I have a situation where I need to use a web service in PHP 4. I found a free PHP library called NuSOAP and put all of its code in /lib/nusoap on this site. Here are some helpful snippets for accessing the web service's methods and data.

require_once('lib/nusoap/nusoap.php');
$wsdl = "http://service.domain.com/ws.asmx?WSDL";
$client = new soapclient($wsdl, 'wsdl');
$proxy = $client->getProxy();

$authResponse = $proxy->authenticate_token(array('token_id'=>$cToken, 'site_code'=>$WS_SITE_CODE));
if($authResponse['authenticate_tokenResult'] == 'true')
{
  setcookie("mytoken", $cToken, 0, "/", ".mydomain.com");
  // Modified to work on www and non-www usage of the domain.
}

$authInfoResponse = $proxy->authenticate_passive(array('ip'=>$cIP, 'referrer'=>$cRef));
$authArrayResponse = $authInfoResponse['authenticate_passiveResult'];
$token_id = $authArrayResponse['token_id'];
if ($token_id != "0" && $token_id != "")
{
  setcookie("mytoken", $token_id, 0, "/", ".mydomain.com");  
}

Accessing an IxiaDocument Object from Textml in JSP

This is actually split across an application listener object and a JSP view page after a search result item is clicked on, which is why you'll see an Object being pulled from the session. You'll also see a reference to SearchUtilities, a search helper object.

String textmlRmiUrl = "rmi://servername:1099";
String textmlDomain = "DOMAIN";
String textmlUser = "username";
String textmlPassword = "password"; 
String textmlServer = "servername";
String textmlDocbase = "docbasename";

HashMap parms = new HashMap(1);
parms.put("ServerURL", textmlRmiUrl);

ClientServices cs = com.ixia.textmlserver.ClientServicesFactory.getInstance("RMI", parms); 
cs.Login(textmlDomain, textmlUser, textmlPassword);
// Note that there can be only one login per application

ByteArrayInputStream inputStream = null;

Object sessionResults = session.getAttribute( SearchUtilities.translateTab(tab));
IxiaServerServices ss = cs.ConnectServer(textmlServer); // Get the server Services
IxiaDocBaseServices docbase = ss.ConnectDocBase(textmlDocbase); // then, the DocbaseServices
IxiaSearchServices search = docbase.SearchServices(); // then, the SearchServices
IxiaResultSpace result = null; // then initialize the results space

result = (IxiaResultSpace)sessionResults;

IxiaDocument.Content ixiacontent = result.Item(searchDocId,"highlight").GetContent(); 
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ixiacontent.SaveTo(outputStream);
inputStream = new ByteArrayInputStream(outputStream.toByteArray());

File xslreader = new File(xslpath);
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer(new StreamSource(xslreader));
transformer.setParameter("book",x); // XSL parameter
transformer.transform(new StreamSource(inputStream), new StreamResult(out));
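
The transform step at the end does not depend on Textml itself. Here is a minimal, self-contained sketch of the same javax.xml.transform calls. The XslSketch class, the inline stylesheet, and the sample XML are all assumptions made up for illustration; the snippet above loads its XSL from a file and writes to the JSP's out instead.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XslSketch {
    // Hypothetical stylesheet, inlined so the example runs without any files.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:param name=\"book\"/>"
      + "<xsl:template match=\"/doc\">"
      + "<xsl:value-of select=\"$book\"/><xsl:text>: </xsl:text><xsl:value-of select=\"title\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms XML held in memory (as it would be after SaveTo) and returns the result as a string.
    static String transform(byte[] xml, String book) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            transformer.setParameter("book", book); // XSL parameter, as in the snippet above
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrown unchecked to keep the sketch short; real code would handle this properly.
            throw new IllegalStateException("XSL transform failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<doc><title>Intro</title></doc>".getBytes(StandardCharsets.UTF_8);
        System.out.println(transform(xml, "Guide"));
    }
}
```

In a real page you would probably build the Transformer once (or cache a Templates object) rather than recompiling the stylesheet on every request.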

Accessing an IxiaDocument Object from Textml in ASP.NET

Here is just one way to access a Textml IxiaDocument object returned after doing a search.

Various parts of this are wrapped in try/catch blocks and split into methods where practical.

IxiaClientServices IxiaCS = new IxiaClientServices();
IxiaServerServices IxiaSS = IxiaCS.ConnectServer(textmlServer);
IxiaDocBaseServices IxiaDS = IxiaSS.ConnectDocBase(textmlDocbase);
IxiaSearchServices IxiaSearchS = IxiaDS.SearchServices;
IxiaQueryAnalyzer TextmlQueryAnalyzer = new IxiaQueryAnalyzer();
String queryEdited = TextmlQueryAnalyzer.GetXMLQueryString(queryWordsAll, "words");
String querySubmitted = textmlStandardHeader +
                        "<query VERSION=\"3.6\" RESULTSPACE=\"ALL\">" +
                          "<" + topLevelKey +">" +
                            textmlCollectionLae +
                            queryEdited +
                          "</" + topLevelKey + ">" +
                        textmlStandardSort +
                        textmlStandardFooter;
// Several variables defined elsewhere.

IxiaResultSpace rs = IxiaSearchS.SearchDocuments(querySubmitted); 
// The query is parsed elsewhere

// This section would be part of a loop
IxiaDocument doc = rs.Item(i, "highlight"); 
// Hits marked with a span of the class "highlight"
MemoryStream xmlStream = new MemoryStream();
doc.Content.SaveTo(xmlStream);
xmlStream.Position = 0;

XPathDocument textmlXmlDocument = new XPathDocument(xmlStream);
XslCompiledTransform textmlTransform = new XslCompiledTransform();
textmlTransform.Load(this.Server.MapPath("xsl/" + xslFile)); 
// xslFile defined elsewhere
StringWriter textmlWriter = new StringWriter();
XsltArgumentList textmlXslArg = new XsltArgumentList();
textmlXslArg.AddParam("documentLink", "", link); // XSL parameters
textmlTransform.Transform(textmlXmlDocument, textmlXslArg, textmlWriter);

divXml.InnerHtml = textmlWriter.ToString();

Friday, May 11, 2007

Building and Iterating Over a LinkedHashMap in Java

Building a Java LinkedHashMap and iterating over it always comes in handy. In a class you can have something like:

private Map myLinks = new LinkedHashMap();
public Map getLinks() { 
  return myLinks; 
}
public void setMyLinks(String key, String value) { 
  this.myLinks.put(key, value); 
}
...
for(int j=0; j<links.getLength(); j++)
{
  setMyLinks( ((Element)links.item(j)).getAttribute("url"), 
              getText(links.item(j)) );
}

Then in a JSP you can do something like:

Map thisStateLinks = ThisState.getLinks();
for (Iterator it=thisStateLinks.keySet().iterator(); it.hasNext(); ) 
{
  Object key = it.next();
  Object value = thisStateLinks.get(key);
  out.println("<li><a href=\"" + key.toString() + "\">" + 
               value.toString() + "</a></li>");
}
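
For comparison, here is a rough sketch of the same idea on Java 7 or later, using generics and entrySet() so no casts or second lookups are needed. The LinkList class and its method names are assumptions for illustration, not the class from the snippets above, and the sketch does no HTML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LinkList {
    // LinkedHashMap iterates in insertion order, so links come out in the order they were added.
    private final Map<String, String> links = new LinkedHashMap<>();

    void addLink(String url, String text) {
        links.put(url, text);
    }

    // Builds one <li> per link; values are not HTML-escaped in this sketch.
    String toHtml() {
        StringBuilder html = new StringBuilder();
        for (Map.Entry<String, String> entry : links.entrySet()) {
            html.append("<li><a href=\"").append(entry.getKey()).append("\">")
                .append(entry.getValue()).append("</a></li>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        LinkList list = new LinkList();
        list.addLink("http://example.com/b", "Added first");
        list.addLink("http://example.com/a", "Added second");
        System.out.print(list.toHtml());
    }
}
```

The point of choosing LinkedHashMap over a plain HashMap is that the list items keep the order in which the links were added.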

Thursday, May 10, 2007

Textml vs. MarkLogic, Part 1

I've been working with Ixiasoft's Textml Server for several years now. Recently I've also started working with Mark Logic's MarkLogic Server and I'm starting to notice differences -- some in Textml's favor and some in Mark Logic's favor.

Some of my knocks on MarkLogic compared to Textml are:

MarkLogic lacks a query parser. A simple set of expressions should be defined and accepted by a parser -- AND, OR, NOT, a near operator, some sort of frequency and priority operators, would be fine. If you need something more complex then you have to build your own, but give me something. (Truth be told, the one in Textml is a little flaky.)
MarkLogic lacks a common way to not index or search stopwords. Add the ability to define a list of stop words on the forest or database level.
MarkLogic lacks a document-focused admin interface. Textml's version of this comes in quite handy.
MarkLogic lacks result set counts that are both fast and accurate. I should not need to worry about whether I should use xdmp:estimate(), cts:remainder(), or fn:count() to know how many items are in my cts:search(). Just tell me. A database can do it. Textml can do it. MarkLogic needs to as well.

Some of my knocks on Textml compared to MarkLogic are:

Textml lacks the ability to accept a large document and search/return only part of it as a result of a search. If I have a book to load, I have to figure out what my display unit in the application is going to be (an entire chapter, a smaller section of a chapter) and break up the file ahead of time. There are all sorts of reasons why that's a problem.
Textml lacks XQuery support. I'm just learning XQuery now, but it's pretty darn powerful. Where's the support for it?
Textml lacks improvements. Maybe it's just me, but the development of new features seems stagnant.

Bottom line so far: MarkLogic Server is significantly more powerful than Textml Server. MarkLogic is more complex, and therefore more demanding for development, but it has a huge upside.

I'll do another post on this as I learn more about MarkLogic, if necessary.

UPDATE: MarkLogic recently released a new, very powerful search library. If you're reading this, you have to check out lib-search.

Sunday, May 6, 2007

Get a Document's Properties by Attribute Value from MarkLogic Server

Here's a query to get a document's properties from MarkLogic Server using an attribute value. The attribute name is "id" and it is on the node named "document." I used the cts:element-attribute-value-query() function because I can set the case sensitivity and other options. The entire <prop:properties> node is returned.


define variable $myId as xs:string external
(:let $myId := 'ID1234':)

for $i in cts:search(//document,
 cts:element-attribute-value-query(xs:QName("document"),
   xs:QName("id"),
   $myId,
   "case-insensitive"
 )
)
return $i/property::node()/..

You can pass in a variable from an external app or you can define it in the query using let. You can also tack on some additional metadata, like the URI of the document and any collections it belongs to. Change the return block for this to something like:


return
 <result>
   { $i/property::node()/.. }
   <uri> { base-uri($i) } </uri>
   <collections> {  xdmp:document-get-collections(base-uri($i)) } </collections>
 </result>

Thursday, May 3, 2007

Using cts:search to Search More Than One Node Level in MarkLogic Server

I had a relatively simple requirement for building a search application running against MarkLogic Server: search multiple levels of the XML hierarchy for all files in the repository and return each level as a document; the search must be case- and diacritic-insensitive. Here's the beginning of the query that did the trick:


define variable $ORIGINAL_QUERY as xs:string external

for $i in cts:search( //(chapter |
                        div |
                        entry |
                        section),
                     cts:word-query(
                                     $ORIGINAL_QUERY,
                                     ("case-insensitive", "diacritic-insensitive")
                                   )
                   )
return

<result id="{ $i/@local-id }">
 {
   $i/( content-metadata/title ),
   $i/( content-metadata/subtitle ),
   $i/( content-metadata/label ),
   $i/( head ),
   $i/( entry-head ),
   $i/( content-metadata/contributor/display-name ),
   $i/( content-metadata/copyright/display-date )
 }
</result>

There's a lot more work to do on this -- tokenizing words, tokenizing quoted strings as phrases, accepting Boolean terms, returning different values based on the node, etc., etc., etc.
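
As a rough sketch of the tokenizing step, assuming it is done in a Java front end before the query string reaches MarkLogic (the QueryTokenizer class is hypothetical, and this is only one possible approach), a single regular expression can split the input into bare words and quoted phrases:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryTokenizer {
    // Either a double-quoted phrase (group 1, quotes dropped) or a run of non-whitespace (group 2).
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Splits a raw query into words and phrases. An unmatched quote simply
    // stays attached to its word; Boolean terms are not handled here.
    static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(query);
        while (m.find()) {
            String token = m.group(1) != null ? m.group(1) : m.group(2);
            if (!token.isEmpty()) {
                tokens.add(token); // skip empty phrases such as ""
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("cat \"black dog\" fish"));
    }
}
```

Each token could then be passed on as its own search term, with phrases kept intact.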

Thanks to the people on the Mark Logic developer email list for helping with this.
