xquery4j in action

In my previous article, I introduced xquery4j (http://github.com/rafalrusin/xquery4j), a wrapper library for Saxon.
Here I will show how to use it to build Article, an XHTML article generator written in Java and XQuery. You can download it here: http://github.com/rafalrusin/Article. It is a simple DSL for article generation.

I think it is worth a look, because the whole project took very little time to implement and still has some interesting features:

  • syntax highlighting of embedded code for many programming languages (using the external program highlight),
  • automatic href entries for links, so you don’t need to type a URL twice,
  • native integration with XHTML constructs.

This is an example of the input it takes:

<a:article xmlns='http://www.w3.org/1999/xhtml' xmlns:a="urn:article">
Some text
<a:code lang="xml"><![CDATA[

]]></a:code>
</a:article>

The XHTML output is generated from it with the command:

./run <input.xml >output.xhtml

The interesting thing is that the XQuery expression for this transformation is very simple to write with Saxon. This is the complete code:

declare namespace a="urn:article";
declare default element namespace "http://www.w3.org/1999/xhtml";

declare function a:processLine($l) {
  for $i in $l/node()
  return
    typeswitch ($i)
      (: turn a:link into an XHTML anchor; the URL is used as both href and link text :)
      case element(a:link, xs:untyped) return <a href="{$i/text()}">{$i/text()}</a>
      default return $i
};

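declare function a:articleItem($i) {
  typeswitch ($i)
    (: a text line, followed by a line break (the <br/> is an assumption) :)
    case element(a:l, xs:untyped) return (a:processLine($i), <br/>)

    (: highlighted code, followed by a line break (the <br/> is an assumption) :)
    case element(a:code, xs:untyped) return
      (a:highlight($i/text(), $i/@lang)/body/*, <br/>)

    default return "error;"
};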

<html xmlns="http://www.w3.org/1999/xhtml">

<br /> a.xml

<link rel="stylesheet" type="text/css" href="highlight.css"/>

{
  for $i in a:article/*
  return
    a:articleItem($i)
}
</html>

Inside this expression, a:highlight is bound to a Java function that takes two strings as input (the code and its language) and returns a DOM Node containing the XHTML output of the highlight command.
Since manipulating DOM with xquery4j takes little effort, the a:highlight function can be as simple as this:

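// Needs java.io.IOException, java.io.OutputStream and org.w3c.dom.Node,
// plus IOUtils (Apache Commons IO), Validate (Apache Commons Lang) and DOMUtils (xquery4j).
public static class Mod {
    public static Node highlight(final String code, String lang) throws Exception {
        Validate.notNull(lang);
        // Run the external highlight program; its XHTML output is parsed below.
        final Process p = new ProcessBuilder("highlight", "-X", "--syntax", lang).start();
        // Feed the code to the process on a separate thread.
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    OutputStream out = p.getOutputStream();
                    IOUtils.write(code, out);
                    out.flush();
                    out.close();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        });
        t.start();
        // Meanwhile, collect everything the process writes to its standard output.
        String result = IOUtils.toString(p.getInputStream());
        t.join();
        return DOMUtils.parse(result).getDocumentElement();
    }
}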

Note that a separate thread for feeding input into the highlight command is required: the pipes connecting the Java process to the child process have limited buffers, so writing all of the input before reading any output could lead to a deadlock. The output of the spawned Process therefore has to be collected concurrently.
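
To make the deadlock argument concrete, here is a minimal JDK-only sketch of the same pattern. It is not the code from Article: the class name PipeDemo and the method pipeThrough are made up for illustration, and it assumes a Unix-like system where the cat command is available to play the role of highlight.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of xquery4j or Article.
final class PipeDemo {

    /** Sends input to the command's stdin and returns everything it writes to stdout. */
    static String pipeThrough(String input, String... command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // stderr is not piped, so it cannot block
                .start();

        // Writer thread: feeds stdin and closes it so the child sees end-of-input.
        Thread writer = new Thread(() -> {
            try (OutputStream out = p.getOutputStream()) {
                out.write(input.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // Calling thread: drains stdout at the same time, so neither pipe can fill up.
        String result;
        try (InputStream in = p.getInputStream()) {
            result = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        writer.join();
        p.waitFor();
        return result;
    }

    public static void main(String[] args) throws Exception {
        // About 1 MB of text, well beyond a typical OS pipe buffer.
        String big = "0123456789abcdef\n".repeat(64 * 1024);
        String echoed = pipeThrough(big, "cat"); // assumes "cat" is on the PATH
        System.out.println(echoed.equals(big));  // true if the round trip preserved the text
    }
}
```

If the write happened on the calling thread before any read, a large enough input would probably block: the child stops reading once its own output pipe is full, and nobody is draining that pipe yet.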
At the end, the String result has to be converted to DOM; xquery4j’s DOMUtils.parse(result) does this in a single call.
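
For readers who want to see what such a conversion involves, the sketch below parses a String into a DOM using only the JDK's JAXP classes. It is an illustration of the idea, not xquery4j's actual implementation of DOMUtils.parse; the class name DomParseSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a String-to-DOM helper; not xquery4j's DOMUtils.
final class DomParseSketch {

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness matters: the query matches elements in the XHTML namespace.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><pre>x</pre></body></html>";
        Document doc = parse(xhtml);
        System.out.println(doc.getDocumentElement().getLocalName());    // html
        System.out.println(doc.getDocumentElement().getNamespaceURI()); // http://www.w3.org/1999/xhtml
    }
}
```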
