DOM parsing vs serialization
Dan Higgins
higgins at nceas.ucsb.edu
Mon Mar 3 09:22:12 PST 2003
Hi All,
Some added information on my initial comparison of
parsing/serialization times. I did a few quick comparisons and found
that simply reading the bytes in a file and doing nothing else takes
about the same time as reading a serialized file of the same size! In
other words, it looks like the bottleneck is reading a series of bytes
from disk, not either parsing or recreating a serialized object.
Serialized DOMs are 2-3 times larger than the original XML, so they take
longer to read than the original XML!!
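
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch, not the code actually used for the numbers above. ReadTimer, its
method names and the two file arguments are all made up for illustration.
It assumes the serialized file was written earlier with ObjectOutputStream,
and it uses whatever JAXP parser is on the classpath rather than Xerces-J
specifically. Wall-clock timing like this is crude (OS file caching and JIT
warm-up will skew it), so treat any figures it prints as rough.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration; not the code behind the numbers in this thread.
class ReadTimer {

    /** A unit of work that may throw, so I/O and parser calls can be timed directly. */
    interface Task {
        Object run() throws Exception;
    }

    /** Runs the task `repeats` times and returns the mean wall-clock time in milliseconds. */
    static double meanMillis(int repeats, Task task) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e6 / repeats;
    }

    /** Baseline: pull the bytes off disk and do nothing with them. */
    static byte[] readRaw(Path file) throws Exception {
        return Files.readAllBytes(file);
    }

    /** Builds a DOM by parsing the XML text with the default JAXP parser. */
    static Document parseXml(Path file) throws Exception {
        try (InputStream in = Files.newInputStream(file)) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        }
    }

    /** Rebuilds an object from a file written earlier with ObjectOutputStream. */
    static Object readSerialized(Path file) throws Exception {
        try (ObjectInputStream in =
                new ObjectInputStream(new BufferedInputStream(Files.newInputStream(file)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path xml = Paths.get(args[0]); // XML text document
        Path ser = Paths.get(args[1]); // serialized object file (assumed to exist already)
        int repeats = 20;              // arbitrary choice for the sketch

        System.out.printf("raw read, XML file:        %.2f ms%n",
                meanMillis(repeats, () -> readRaw(xml)));
        System.out.printf("raw read, serialized file: %.2f ms%n",
                meanMillis(repeats, () -> readRaw(ser)));
        System.out.printf("parse XML into DOM:        %.2f ms%n",
                meanMillis(repeats, () -> parseXml(xml)));
        System.out.printf("deserialize object:        %.2f ms%n",
                meanMillis(repeats, () -> readSerialized(ser)));
    }
}
```

Timing the raw read of both files separately is the point: if the raw read
of the serialized file already costs about as much as deserializing it, the
extra file size is what hurts, not the object reconstruction.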
Dan
Chad Berkley wrote:
>On a slightly related aside, I was looking stuff up in the XPathAPI the
>other day and I came across another class called CachedXPathAPI that
>basically does the same thing, but it doesn't use static methods so it
>doesn't have to load the document every time you run xpath. It's
>supposed to be faster according to the documentation but it does have a
>warning about it not updating the cached document unless you
>reinstantiate the class.
>
>See the docs here:
>http://xml.apache.org/xalan-j/apidocs/org/apache/xpath/CachedXPathAPI.html
>
>I started using it in the stuff that I was working on and it does seem
>faster. I think you just have to be careful if you update the document
>then do another xpath query.
>
>If you already were enlightened or astute enough to know about this
>class, ignore this email :).
>
>chad
>
>On Fri, 2003-02-28 at 15:36, Dan Higgins wrote:
>
>
>>Hi All,
>>
>> I did some simple comparisons of the time required to read a
>>serialized DOM tree (Xerces-J parser) versus the time to create the DOM
>>by parsing the XML text document. For very small docs, the times were
>>about the same. However, for XML text docs of ~5K or larger, parsing the
>>XML is 3-4 or more times faster than reading a serialized version of the
>>DOM from disk!!! (Also, the serialized file is 3-4 times bigger than
>>the original XML text.)
>>
>> It thus looks like my idea of storing EML docs in serialized form on
>>disk for Morpho is NOT a good one. (Caching the DOM in RAM does help
>>performance, however.)
>>
>>Dan
>>
>>--
>>*******************************************************************
>>Dan Higgins higgins at nceas.ucsb.edu
>>http://www.nceas.ucsb.edu/ Ph: 805-892-2531
>>National Center for Ecological Analysis and Synthesis (NCEAS)
>>735 State Street - Room 205
>>Santa Barbara, CA 93195
>>*******************************************************************
>>
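
The remark in the quoted message that caching the DOM in RAM helps could be
sketched roughly as below. This is only an illustration under my own
assumptions, not Morpho code: DomCache, get, invalidate and parseCount are
invented names, the cache is a plain HashMap with no size limit and no
thread safety, and it uses the default JAXP parser. As with the warning on
CachedXPathAPI, a cached document goes stale if the file changes, so the
caller has to invalidate the entry after an update.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical in-memory DOM cache; names and behavior are assumptions for illustration.
class DomCache {
    private final Map<Path, Document> cache = new HashMap<>();
    private int parseCount = 0;

    /** Returns the cached DOM for the file, parsing it only on first request. */
    Document get(Path file) throws Exception {
        Path key = file.toAbsolutePath().normalize();
        Document doc = cache.get(key);
        if (doc == null) {
            doc = parse(key);
            cache.put(key, doc);
        }
        return doc;
    }

    /** Drops the cached DOM so the next get() re-parses the file from disk. */
    void invalidate(Path file) {
        cache.remove(file.toAbsolutePath().normalize());
    }

    /** Number of times a file was actually parsed (useful for checking cache hits). */
    int parseCount() {
        return parseCount;
    }

    private Document parse(Path file) throws Exception {
        parseCount++;
        try (InputStream in = Files.newInputStream(file)) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        }
    }
}
```

Every caller gets the same Document instance back, so a change made through
one reference is visible to all of them until invalidate() is called.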
--
*******************************************************************
Dan Higgins higgins at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Ph: 805-892-2531
National Center for Ecological Analysis and Synthesis (NCEAS)
735 State Street - Room 205
Santa Barbara, CA 93195
*******************************************************************