Some of my knocks on MarkLogic compared to Textml are:
- MarkLogic lacks a query parser. It should ship with one that accepts a simple, well-defined set of expressions: AND, OR, NOT, a near operator, and some sort of frequency and priority operators would be fine. If you need something more complex, then you build your own, but give me something. (Truth be told, the one in Textml is a little flaky.)
- MarkLogic lacks a standard way to keep stopwords out of indexing and searching. It should let me define a stopword list at the forest or database level.
- MarkLogic lacks a document-focused admin interface. Textml's version of this comes in quite handy.
- MarkLogic lacks result set counts that are both fast and accurate. I should not need to worry about whether to use xdmp:estimate(), cts:remainder(), or fn:count() to find out how many items my cts:search() matched. Just tell me. A database can do it. Textml can do it. MarkLogic needs to as well.
And some of my knocks on Textml compared to MarkLogic are:
- Textml lacks the ability to load a large document and then return only the matching part of it as a search result. If I have a book to load, I have to figure out what my display unit in the application is going to be (an entire chapter, a smaller section of a chapter) and break up the file ahead of time. There are all sorts of reasons why that's a problem.
- Textml lacks XQuery support. I'm just learning XQuery now, but it's pretty darn powerful. Where's the support for it?
- Textml lacks improvements. Maybe it's just me, but the development of new features seems stagnant.
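To show how little I'm asking for with the query parser knock above, here is a rough Python sketch of a parser for AND, OR, NOT, and parentheses. It is not MarkLogic or Textml code, and the names (`tokenize`, `parse`, `matches`) are made up for illustration. It assumes operators are written in upper case and leaves out the near, frequency, and priority operators entirely.

```python
import re


def tokenize(query):
    # Split into parentheses and runs of non-space, non-paren characters.
    return re.findall(r"\(|\)|[^\s()]+", query)


def parse(query):
    """Parse a query string into a nested tuple tree.

    Precedence, loosest to tightest: OR, AND, NOT.
    """
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():
        if peek() == "NOT":
            take()
            return ("NOT", parse_not())
        return parse_atom()

    def parse_atom():
        tok = peek()
        if tok is None or tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {tok!r}")
        take()
        if tok == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return node
        # Anything else is a search term; compare case-insensitively.
        return ("TERM", tok.lower())

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree


def matches(tree, words):
    """Evaluate a parsed query against a set of lower-cased words."""
    op = tree[0]
    if op == "TERM":
        return tree[1] in words
    if op == "NOT":
        return not matches(tree[1], words)
    if op == "AND":
        return matches(tree[1], words) and matches(tree[2], words)
    return matches(tree[1], words) or matches(tree[2], words)
```

Calling `parse("xml AND (database OR NOT search)")` gives a small tree that `matches` can check against the set of words in a document. A real product would translate that tree into its own index queries instead of testing word sets, but the grammar itself is about this big.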
I'll do another post on this as I learn more about MarkLogic, if necessary.
UPDATE: MarkLogic recently released a new, very powerful search library. If you're reading this, you have to check out lib-search.