Java Diary: February 2010

There are plenty of websites about the Java HotSpot VM. As you may know, the JVM comes in two binary distributions: the Server JVM and the Client JVM.
The difference between them is that the Server JVM is best suited to long-running server-side applications and is tuned to maximize operating speed, while the Client JVM is tuned for client-side applications and is more memory-friendly. So far, so good.
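
A quick way to confirm which of the two VMs a run actually used is to print the standard java.vm.name system property. This is only a minimal sketch: the class name VmInfo is illustrative, and the exact strings reported vary by vendor and version.

```java
// Hypothetical helper (not from the original post): reports which VM is running.
final class VmInfo {

    // Builds a one-line description from standard system properties.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " (" + System.getProperty("java.vm.version") + ")";
    }

    public static void main(String[] args) {
        // Run as `java -server VmInfo` and `java -client VmInfo`, then compare.
        System.out.println(describe());
    }
}
```

Some JDK builds, 64-bit ones in particular, ship only the Server VM and ignore -client, so this check is worth doing before comparing timings.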

I have a question. Where is XML data processed more often: on the server side or on the client side? I guess you will reply, “On the server side, of course.”

However, although server-side applications process XML data more often than client-side applications, and although the Server JVM is claimed to be better suited to server-side applications, the Client JVM performed about twice as fast as the Server JVM at XML processing in my test. Surprising, isn’t it?

You can check this claim by running the following code with each of the two JVM options, -server and -client:

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

public class Test {

    public static void main(String[] args) throws InterruptedException {
        String xmlData = //read from file

        int count = 999;
        long start, stop;

        // Time `count` consecutive parses of the same XML string.
        start = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            newDocument(xmlData);
        }
        stop = System.currentTimeMillis();
        System.out.println("\n-- Elapsed Time for echo1: " + (stop - start));
    }

    // Parses the XML string into a dom4j Document; returns null if parsing fails.
    public static Document newDocument(String message) {
        Document doc = null;
        try {
            doc = DocumentHelper.parseText(message);
        } catch (DocumentException e) {
            e.printStackTrace();
        }
        return doc;
    }
}
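
One thing worth controlling for when repeating this measurement is JIT warm-up: a short loop timed from a cold start includes interpretation and compilation time, and the two VMs compile on different schedules. The sketch below runs an untimed warm-up pass first. It is an assumption-laden illustration, not the original benchmark: the class WarmedUpTimer, the method timeParses, the sample XML and the iteration counts are all made up, and it uses the JDK's built-in DOM parser instead of dom4j so that it runs without extra libraries.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical timing harness; JDK DOM parser stands in for dom4j here.
final class WarmedUpTimer {

    // Parses `xml` `count` times and returns the elapsed time in nanoseconds.
    static long timeParses(String xml, int count) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                builder.parse(new InputSource(new StringReader(xml)));
            }
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><item>book</item><item>pen</item></order>";
        timeParses(xml, 20000);               // warm-up pass, result discarded
        long nanos = timeParses(xml, 20000);  // measured pass
        System.out.println("Elapsed ms after warm-up: " + nanos / 1000000);
    }
}
```

Running it once with -server and once with -client gives a comparison in which both VMs have had a chance to compile the hot code.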

Thanks for reading
Aydın Karaman

XML data processing involves two phases:
1. Parsing phase (identifying the infoset items in an XML source, such as an XML string)
2. Processing phase (retrieving and modifying those infoset items)

For these purposes, there are basically two parsing and processing models:
1. Tree-based parsers (DOM-style implementations such as dom4j)
2. Event-based parsers (streaming implementations such as SAX and StAX)
a. Streaming push parsers (SAX)
b. Streaming pull parsers (StAX)
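
The difference between the two streaming styles is easiest to see side by side. The sketch below counts elements once with SAX, where the parser calls your handler, and once with StAX, where your loop asks the reader for the next event. The class and method names are illustrative, and it uses only the parsers bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class contrasting the push and pull streaming models.
final class PushVsPull {

    // Push model (SAX): the parser drives and calls the handler per element.
    static int countWithSax(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    // Pull model (StAX): the caller's loop drives and requests each event.
    static int countWithStax(String xml) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parse failed", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<order><item>book</item><item>pen</item></order>";
        System.out.println(countWithSax(xml) + " " + countWithStax(xml)); // prints "3 3"
    }
}
```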

Every XML parsing model has advantages and disadvantages. Depending on your project's needs, you may prefer one over the others. But before deciding, it helps to remember that you don’t need to use the same parser for both of the phases mentioned above. That is, you may use a SAX implementation for the parsing phase and a DOM implementation for the processing phase.

You would be right to say that in event-based parsers the two phases run together: you must process the XML infoset items at the moment you parse them. So event-based parsers are not suitable for the processing phase alone. On the other hand, you can use an event-based parser for the parsing phase and a tree-based parser for the processing phase, in order to benefit from the advantages of DOM parsers.
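
To illustrate what processing at parse time looks like, here is a small sketch that gathers the text of selected elements inside SAX callbacks, because the events are not available again once the parse has moved on. The class TextCollector and its method textOf are illustrative names, and the sketch assumes the target elements contain only text (no nested elements of the same name).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example of doing the processing inside the parse callbacks.
final class TextCollector {

    // Returns the text of every element named `name`, gathered during the parse.
    static List<String> textOf(String xml, final String name) {
        final List<String> result = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder buffer; // non-null only inside a matching element

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals(name)) {
                    buffer = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text node in several chunks, so append.
                if (buffer != null) {
                    buffer.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals(name) && buffer != null) {
                    result.add(buffer.toString());
                    buffer = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<order><item>book</item><item>pen</item></order>";
        System.out.println(textOf(xml, "item")); // prints [book, pen]
    }
}
```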

You may be wondering why you should use a streaming parser for the first phase if you are going to process the XML infoset with a DOM parser anyway. Here is the answer: for better PERFORMANCE!

DOM implementations (I have actually tried this only with dom4j) are not good at the parsing phase: in my tests they were about two times slower than event-based parsers. You can compare the performance of both parsers by running the two functions below in a loop.

//DOM example
public static Document newDocument(String message) {
    Document doc = null;
    try {
        doc = DocumentHelper.parseText(message);
    } catch (DocumentException e) {
        e.printStackTrace();
    }
    return doc;
}

//STREAMING example
public static void newDocumentBySax(String message) {
    XMLReader parser = null;
    try {
        parser = XMLReaderFactory.createXMLReader();
        // No content handler is set, so this run only parses and builds nothing.
        // parser.setContentHandler(new NodeCounter());
        InputSource in = new InputSource(new ByteArrayInputStream(message.getBytes()));
        parser.parse(in);
    } catch (Exception e) {
        System.err.println(e.toString());
    }
}

If a DOM parser suits your project better, then for performance reasons use a StAX or SAX parser for the parsing phase and build a Document object from its events. The code snippet below does this:

public static Document newDocumentByStax(String message) {
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    InputStream input = new ByteArrayInputStream(message.getBytes());
    XMLStreamReader parser;
    Document doc = null;
    try {
        parser = inputFactory.createXMLStreamReader(input);
        doc = process(parser);
    } catch (XMLStreamException e1) {
        e1.printStackTrace();
    }
    return doc;
}

// Builds a dom4j tree from StAX events. Attributes and namespaces are not copied.
private static Document process(XMLStreamReader reader)
        throws XMLStreamException {

    Document doc = DocumentHelper.createDocument();
    Element current = null;
    Stack<Element> elements = new Stack<Element>();

    while (reader.hasNext()) {
        int eventType = reader.getEventType();
        switch (eventType) {
            case XMLStreamConstants.START_ELEMENT:
                Element element = new DOMElement(reader.getLocalName());
                if (current == null) {
                    current = element;
                    doc.setRootElement(current);
                } else {
                    current.add(element);
                }
                // Remember the parent so END_ELEMENT can return to it.
                elements.push(current);
                current = element;
                break;

            case XMLStreamConstants.END_ELEMENT:
                current = elements.pop();
                break;

            case XMLStreamConstants.CHARACTERS:
                // Append instead of replacing: text can arrive in several chunks.
                current.addText(reader.getText());
                break;
            default:
                break;
        }
        reader.next();
    }
    return doc;
}
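
The same idea can be extended to carry attributes across as well. The following is a sketch under stated assumptions rather than a drop-in replacement: it builds a JDK org.w3c.dom tree instead of a dom4j Document so that it runs without extra libraries, the class StaxTreeBuilder and its build method are illustrative names, and namespaces are still ignored.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical builder: StAX does the parsing, a W3C DOM tree is the result.
final class StaxTreeBuilder {

    static Document build(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Deque<Node> open = new ArrayDeque<Node>();
            open.push(doc); // the document itself is the parent of the root element
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        Element element = doc.createElement(reader.getLocalName());
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            element.setAttribute(reader.getAttributeLocalName(i),
                                    reader.getAttributeValue(i));
                        }
                        open.peek().appendChild(element);
                        open.push(element);
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        open.pop();
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // Only elements may hold text; skip anything outside the root.
                        if (open.size() > 1) {
                            open.peek().appendChild(doc.createTextNode(reader.getText()));
                        }
                        break;
                    default:
                        break;
                }
            }
            reader.close();
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("build failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = build("<order id=\"7\"><item>book</item></order>");
        System.out.println(doc.getDocumentElement().getAttribute("id")); // prints 7
    }
}
```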

Thanks For Reading
Aydın Karaman

Monday, 22 February 2010

Java HotSpot VM: -server vs -client

How to parse xml strings or files faster?
