Monday, April 17, 2006

XML Parser that Uses Pointers to Locations in the File

This reminds me of something I did for Mercator, back when I was working on the transformation engine that was part of their integration broker. I can't get into too much detail, since that product is still in existence, although under a different company and a different product name. The idea is a good one when dealing with large XML files, since you don't have to load the whole document into memory. I can certainly see how this reduces memory requirements, but I'm a bit skeptical about the performance, especially if you have to do file I/O whenever you need to get some attribute or element. Take a look at the Xerces-C code. If it hasn't changed too drastically since I last looked at it (about three years ago), it shouldn't be that difficult to build some mods that hook into its scanner classes to do this as well. That way you might get the improved memory usage as well as conformance to the W3C XML specs.
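To make the trade-off concrete, here is a rough sketch of the general idea in Python. It is not the parser being discussed, and it has nothing to do with the Mercator code or with Xerces-C. The names `build_index` and `read_attribute` are made up for illustration, and the scanner is a naive regular expression that assumes a start tag never spans a line break and that attribute values are double-quoted, so it is nowhere near conformant. It streams through the file once, keeps only (name, offset, length) for each start tag, and goes back to the file whenever an attribute value is requested.

```python
import os
import re
import tempfile

# Naive start-tag matcher (an assumption for this sketch, not a real XML scanner).
# It skips end tags, comments and processing instructions because the character
# after "<" must be a letter or underscore.
START_TAG = re.compile(rb"<([A-Za-z_][\w.\-]*)([^<>]*)>")


def build_index(path):
    """Stream the file once and record (name, byte offset, byte length) per start tag."""
    index = []
    offset = 0  # byte offset of the current line within the file
    with open(path, "rb") as f:
        for line in f:  # only one line is held in memory at a time
            for m in START_TAG.finditer(line):
                name = m.group(1).decode("ascii")
                index.append((name, offset + m.start(), m.end() - m.start()))
            offset += len(line)
    return index


def read_attribute(path, entry, attr):
    """Seek back into the file and pull one double-quoted attribute value on demand."""
    _name, offset, length = entry
    with open(path, "rb") as f:  # this is the file I/O cost per lookup
        f.seek(offset)
        tag = f.read(length)
    pattern = rb"\s" + re.escape(attr.encode("ascii")) + rb'\s*=\s*"([^"]*)"'
    m = re.search(pattern, tag)
    return m.group(1).decode("utf-8") if m else None


if __name__ == "__main__":
    sample = b'<library>\n  <book id="b1" title="Dune"/>\n</library>\n'
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "wb") as f:
        f.write(sample)
    try:
        index = build_index(path)
        for entry in index:
            print(entry, read_attribute(path, entry, "id"))
    finally:
        os.remove(path)
```

The index stays small no matter how big the document is, which is the memory win. The cost shows up in `read_attribute`: every lookup opens the file, seeks and reads, which is exactly the performance concern above. Caching the file handle or recently read tags would be the obvious next step.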
