Archive for the ‘XML’ Category

SlashXML (C++ version)

The C++ code for SlashXML is now up, it’s in the /cpp folder within the repository. The C++ version supports most of the functionality the C# version does; maintaining multiple codebases is very difficult!

The C++ version also has 2 dependencies: TinyXML, an awesome, light weight XML parser, and AString, a not-so-lightweight UTF-8 string class. I know not everyone might be crazy about using AString, especially as it’s fairly bloated; I fucked around with things like custom numeric to string conversions with this class – unnecessary stuff, and it’s probably slower than using sprintf. However, it’s well encapsulated in it’s own namespace. I’ll try to have something more elegant in the future.


Load an XML document from a file

SlashXMLReader    xmlreader;
xmlreader.LoadXMLFile(
"test.xml");


Load an XML document from a string

SlashXMLReader    xmlreader;
std::vector<
unsigned char> xmlData;
// copy octets from string into xmlData here
xmlreader.LoadXMLData(xmlData);


Retrieve data from an element

SlashXMLString data = xmlreader.GetData("root/child");


Retrieve data from an indexed element

SlashXMLString data = xmlreader.GetData("root/child[2]");


Get the number of child nodes under an element

int numKids = xmlreader.GetNumChildren("root", "child");

The SlashXMLReader::LoadXMLData() method is kinda messed up and bizarre. I don’t even remember what was going thru my mind there.

SlashXML

I recently put up the SlashXML library on bitbucket. SlashXML provides a way to retrieve data from an XML document via path names to elements. It’s somewhat in the same vein as XPath, but much simpler. I wrote this way back when because I was annoyed at the more conventional method of having to extract data during the depth-first traversal of the document tree – this is a fine way to parse the document and pull in the data but not necessarily the most elegant way to query it. I haven’t written any documentation so, as I did with the bluetooth stuff, I’ll post a quick-and-dirty tutorial here.

The C# code files and .NET DLL are up now (/cs folder). I do have a version of this in C++ which uses TinyXML, but there’s a major bug that needs to be resolved before I upload it.


Load an XML document from a file

SlashXML.SlashXMLReader xmlReader = new SlashXML.SlashXMLReader();
xmlReader.LoadXMLFile(filepath);


Load an XML document from a string

SlashXML.SlashXMLReader xmlReader = new SlashXML.SlashXMLReader();
xmlReader.LoadXMLString(xmlString);


Retrieve data from an element

string data = xmlReader.GetDataAt("root/child");

If the element doesn’t exist, has no data, or contains other elements instead of data, the return value will be null.


Retrieve data from an indexed element

string data = xmlReader.GetDataAt("root/child[2]");

Indexing on data read in occurs when there are multiple elements with the same name.


Get the number of child nodes under an element

int numChildren = xmlReader.GetNumChildren("root", "child");


Note that underlying SlashXML.SlashXMLReader is SlashXML.StringTreeUtility.StringTree which actually holds the data read in and which can be queried directly for the data. Also, SlashXML reads everything into memory – do not use this for massive files (e.g. 1GB+ documents)! In fact, you should probably reconsider your use of XML if your files are that massive.

You’ll notice a major missing feature here – retrieving data in attributes! It’s not too difficult to do, but I haven’t had time to do it. It’ll be in a future version.