

Comments

Rob Sanderson

Thom,

Check out lxml -- it's a wrapper around libxml2/libxslt, and as such it supports full XPath, XSLT, etc., while remaining as true to the ElementTree API as is feasible.

http://codespeak.net/lxml/

The lxml folks did some benchmarking and found it to be at least as fast as cElementTree in most circumstances (full details should be on the lxml site).

-- Azaroth
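
As a rough illustration of the API compatibility Rob describes, the sketch below runs the same ElementTree-style calls against lxml when it is installed, and against the standard library's xml.etree.ElementTree otherwise. The sample document and the element names `record` and `title` are made up for illustration and do not come from the post.

```python
# Prefer lxml if it is installed; otherwise fall back to the standard library.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample data, not taken from the post.
SAMPLE = (
    "<collection>"
    "<record><title>First</title></record>"
    "<record><title>Second</title></record>"
    "</collection>"
)

def titles(xml_text):
    """Return the text of every record/title element under the root."""
    root = etree.fromstring(xml_text)
    # findall() with a simple relative path behaves the same in both libraries.
    return [el.text for el in root.findall("record/title")]

print(titles(SAMPLE))
```

With lxml, elements additionally offer an `xpath()` method for full XPath 1.0 expressions; the standard library's ElementTree supports only a limited subset of XPath through `find()` and `findall()`.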

Nicolas

About XML and UNIMARC, see BiblioML at
http://90plan.ovh.net/~adnx/biblioml/doku.php?id=en:introduction

Marc Weeber

Hi Thom,

It was great meeting you last week in Bloomington. I just checked your blog, and found this gem. At Knewco, we use Python for most of our stuff, and Zope for our web applications. For XML parsing we also use cElementTree (for Medline XML), and are pleasantly surprised by the speed. My previous experience was with Perl/expat, but Python/cElementTree has decreased both development and processing time.
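
For readers who want to try this kind of parsing on a large file, here is a minimal sketch of streaming with `iterparse`, assuming a file made of repeated `record` elements (a hypothetical tag name, not the actual Medline schema). It uses xml.etree.ElementTree because, since Python 3.3, that module picks up the C accelerator automatically and the separate cElementTree module is no longer needed.

```python
import io
import xml.etree.ElementTree as ET

def count_records(source, tag="record"):
    """Stream through an XML source and count elements named `tag`."""
    count = 0
    # "end" events fire once an element and all its children are fully parsed.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's children and text so memory use stays low.
            # The emptied element itself remains attached to its parent.
            elem.clear()
    return count

# Hypothetical in-memory sample; a real run would pass a file name or file object.
data = io.BytesIO(
    b"<collection>"
    b"<record><title>A</title></record>"
    b"<record><title>B</title></record>"
    b"</collection>"
)
print(count_records(data))
```

Any per-record work (extracting fields, writing output) would go just before the `elem.clear()` call.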

Shailen Karur

FWIW, Thom:
I ran a quick benchmark for my input XML file
[size:4303244746Bytes::uncompressed]
[numberOfRecords:1691425]
Benchmark:
[Seconds:566.2::RecordsPerSec:2988]
Environment:
[Ruby,libxml::Full XPath support] on my [MacBookPro:2.4 GHz Intel Core 2 Duo::RAM:4GB]
[Development time:marginal]
I would imagine that dealing with compressed XML would help increase the speed. However, I've thrown 200 XPath expressions at the XML file and have seen a marked degradation in performance (up to 25 minutes for the run). While Ruby typically gets bashed for its speed - and justifiably so in a few contexts - it excels at being able to gather linked data from multiple sources, an activity I see myself increasingly doing, to enrich the database...
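
To put the figures above in comparable terms, the short sketch below recomputes throughput from the reported size, record count, and elapsed time. The function name and the choice of decimal megabytes are illustrative assumptions; the three input numbers are the ones quoted in the comment.

```python
def throughput(total_bytes, records, seconds):
    """Return (records per second, megabytes per second) for a run."""
    records_per_sec = records / seconds
    # Decimal megabytes (10**6 bytes); an arbitrary choice for this sketch.
    mb_per_sec = total_bytes / seconds / 1e6
    return records_per_sec, mb_per_sec

# Figures as reported in the comment above.
rec_per_s, mb_per_s = throughput(4303244746, 1691425, 566.2)
print(f"{rec_per_s:.0f} records/s, {mb_per_s:.1f} MB/s")
```

This works out to roughly 3,000 records per second, in line with the reported figure, and a little under 8 MB of uncompressed XML per second.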
