Extra XML Support

If you want to load an XML file there is a great blog post by Jonathan Beebe here:

http://blog.anscamobile.com/2011/07/how-to-use-xml-files-in-corona/

With XML files that don’t have complex elements (like CDATA), the XML library module he uses is nice and fast and produces an easy-to-navigate table structure. In short, each element is stored in a table: its element properties are listed as key/value pairs in a table called ‘.properties’, its child elements are numerically indexed in a table called ‘.child’, and its textual body content, if present, is stored in ‘.value’.
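To make that layout concrete, here is a hand-built table in the shape described above, plus a small walker that lists its elements. The field names `name`, `properties`, `child` and `value` follow the description in this post; the sample document and the `describe` helper are made up for illustration and are not part of the library.

```lua
-- Hand-written equivalent of what loadFile is described as returning for
-- this hypothetical document:
--   <book id="42"><title>Some title I thought of</title></book>
local book = {
    name = "book",
    properties = { id = "42" },
    child = {
        {
            name = "title",
            properties = {},
            child = {},
            value = "Some title I thought of",
        },
    },
}

-- Hypothetical helper: collects one line per element, indented by depth.
local function describe( node, depth, lines )
    depth = depth or 0
    lines = lines or {}
    local line = string.rep( "  ", depth ) .. node.name
    if node.value then
        line = line .. " = " .. node.value
    end
    lines[#lines + 1] = line
    for _, childNode in ipairs( node.child or {} ) do
        describe( childNode, depth + 1, lines )
    end
    return lines
end

local lines = describe( book )
print( table.concat( lines, "\n" ) )
```

Note how the title can only be reached through `book.child[1]`, which is the numeric indexing the rest of this post tries to avoid.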

There are just two problems with this library:

1: It does not save changes to the XML table content back to file

2: I prefer to name my variables so that I don’t need to index them by number all the time.

To implement solutions to these I decided to extend the library code. First is a ‘saveFile’ function which takes the original format table (as returned from loadFile) and writes it back to a regular XML file. The one caveat is that you must tell saveFile what the root XML element name is because that element gets represented by the returned table from loadFile.
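As a rough idea of what writing the table back out involves, here is a minimal serializer sketch. It is not the ‘saveFile’ code from the library: `toXml` and `escape` are hypothetical names, it assumes the `.properties`, `.child`, `.value` and `.name` fields described above, and it ignores text that sits alongside child elements. The root name is passed in by the caller, mirroring the caveat above.

```lua
-- Escape the characters that are not allowed in XML text or attribute values.
local function escape( text )
    local replacements = {
        ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;", ['"'] = "&quot;",
    }
    return ( tostring( text ):gsub( '[&<>"]', replacements ) )
end

-- Hypothetical serializer: turns one element table into an XML string.
-- `name` is supplied by the caller for the root and read from each child.
local function toXml( node, name, indent )
    indent = indent or ""
    local parts = { indent, "<", name }

    -- Sort property names so the output is stable between runs.
    local keys = {}
    for key in pairs( node.properties or {} ) do
        keys[#keys + 1] = key
    end
    table.sort( keys )
    for _, key in ipairs( keys ) do
        parts[#parts + 1] = string.format( ' %s="%s"', key, escape( node.properties[key] ) )
    end

    local children = node.child or {}
    if #children == 0 then
        if node.value then
            parts[#parts + 1] = ">" .. escape( node.value ) .. "</" .. name .. ">\n"
        else
            parts[#parts + 1] = " />\n"
        end
    else
        -- Assumption: an element with children has no text of its own.
        parts[#parts + 1] = ">\n"
        for _, childNode in ipairs( children ) do
            parts[#parts + 1] = toXml( childNode, childNode.name, indent .. "  " )
        end
        parts[#parts + 1] = indent .. "</" .. name .. ">\n"
    end
    return table.concat( parts )
end

-- Usage with a hand-built table; the root name is given explicitly.
local root = {
    properties = { id = "42" },
    child = {
        { name = "title", properties = {}, child = {}, value = "A & B" },
    },
}
local xml = toXml( root, "book" )
print( xml )
```

In Corona the resulting string could then be written with `io.open` on a path obtained from `system.pathForFile`.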

Second is a pair of functions for converting the original format table into a more DIY format and back again. This means that if your XML contains an element with another single child element, for example:

[lua]

<title>Some title I thought of</title>

[/lua]

Then the child is no longer accessed like this:

[lua]local thetitle = myxmltable.child[1].value[/lua]

It is now accessed like this:

[lua]local thetitle = myxmltable.title.value[/lua]

Of course, most XML files will go much deeper than that, which is where the power lies. If you are used to using XPath this will be a more natural form, too.

With the new ‘saveFile’, ‘simplify’ and ‘desimplify’ functions you can now load an XML file, access its values naturally, update it (even adding new structures and properties) and save it again.

It bears repeating, however, that the table returned by the loadFile function directly represents the root XML element. Therefore, its element name is accessed by:

[lua]local thetitle = myxmltable.name[/lua]

Using the new ‘simplify’ function loses this value because element names are now used as the table names. The properties were previously stored in ‘properties’ under each child table but are now accessed directly by key. The root element’s name would therefore have incorrectly become a property, so it is dropped. To support this, the new functions all take a parameter that names the root element.

This is explained with sample code and a new ‘deep’ dump() function in this Code Exchange post:

https://developer.anscamobile.com/code/much-improved-dump-function-and-xml-simplify

Hey Horace,

I’m trying to read some XML on Corona and I figure I have to use XPath to decode the pointers such as 

instance_of="//@users.0/@posts.1/@saved_elements.0/@elements.0 //@users.0/@posts.1/@saved_elements.0/@elements.1"

I want to decode the XML and get some regular Lua table that I can access directly. I mean, if I accessed the “instance_of” parameter, I’d get the two tables, not a string…

Can you help me out setting a project like that up? I checked your code and it seems it helps with XML but not with XPath, how do you go about with that?

thanks

ps: my example .xmi is here: https://dl.dropboxusercontent.com/u/6671584/data.xmi

I’m afraid the code I provided does not implement XPath, so you are literally turning the XML into Lua tables and then navigating the document.
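For what it is worth, navigating the loaded table by hand for references like the one quoted could look roughly like the sketch below. It rests on assumptions that have not been checked against the linked .xmi file: each `@name.N` step is taken to mean "the N-th (zero-based) child element called `name`" beneath the current node, starting at the root, and the attribute is taken to hold space-separated references. `nthChildNamed`, `resolveOne` and `resolveAll` are hypothetical helpers working on the `.name` and `.child` fields of the original table format, not part of the library.

```lua
-- Return the child of `node` that is number `zeroBasedIndex` among the
-- children with the given element name, or nil if there is none.
local function nthChildNamed( node, name, zeroBasedIndex )
    local seen = 0
    for _, childNode in ipairs( node.child or {} ) do
        if childNode.name == name then
            if seen == zeroBasedIndex then
                return childNode
            end
            seen = seen + 1
        end
    end
    return nil
end

-- Resolve one reference such as "//@users.0/@posts.1" starting at the root.
-- Steps that do not look like "@name.N" are ignored by this sketch.
local function resolveOne( root, reference )
    local node = root
    for name, index in reference:gmatch( "@([%w_]+)%.(%d+)" ) do
        node = nthChildNamed( node, name, tonumber( index ) )
        if not node then
            return nil
        end
    end
    return node
end

-- Resolve every space-separated reference in an attribute value.
-- References that cannot be resolved are left out of the result.
local function resolveAll( root, attributeValue )
    local results = {}
    for reference in attributeValue:gmatch( "%S+" ) do
        results[#results + 1] = resolveOne( root, reference )
    end
    return results
end

-- Usage with a small hand-built table in the loadFile format.
local root = {
    name = "root",
    child = {
        {
            name = "users",
            child = {
                { name = "posts", child = {}, value = "first post" },
                { name = "posts", child = {}, value = "second post" },
            },
        },
    },
}
local found = resolveAll( root, "//@users.0/@posts.1 //@users.0/@posts.0" )
print( found[1].value, found[2].value )
```

With something like this, reading the attribute would give you the referenced tables rather than the raw string.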

@horacebury,

I posted this thread (http://forums.coronalabs.com/topic/45523-xml-files-and-table-concatenation/) about concatenating tables built from an XML file.  Just saw that you have a different XML parser that would likely also do what I want.

My question for you is after simplifying the XML data into a table, how could I easily concatenate two of these tables into one?

Here is what my XML looks like:

[lua]
<?xml version="1.0" encoding="utf-8"?>
<library>
    <gametype1>
        <category name="">
            <answer></answer>
            <answer></answer>
            <answer></answer>
        </category>
        <category name="">
            <answer></answer>
            <answer></answer>
            <answer></answer>
        </category>
    </gametype1>
    <gametype2>
        <category name="">
            <answer></answer>
            <answer></answer>
            <answer></answer>
        </category>
        <category name="">
            <answer></answer>
            <answer></answer>
            <answer></answer>
        </category>
    </gametype2>
</library>
[/lua]

Using your XML parser, here is my code for reading the data:

[lua]
gameAnswers1 = xmlapi:loadFile( "file1.xml", system.ResourceDirectory )
gameAnswers2 = xmlapi:loadFile( "file2.xml", system.ResourceDirectory )
gameAnswers1 = xmlapi:simplify( gameAnswers1 )
gameAnswers2 = xmlapi:simplify( gameAnswers2 )

gameAnswers1Gametype1 = {}
gameAnswers1Gametype1 = gameAnswers1.gametype1
gameAnswers2Gametype1 = {}
gameAnswers2Gametype1 = gameAnswers2.gametype1

dump(gameAnswers1Gametype1)
[/lua]

So far, so good, I think.  The dump tool shows that the category node is the first in the table and I can correctly get the number of category nodes using #gameAnswers1Gametype1.category.
 

But how do I combine gameAnswers1Gametype1 and gameAnswers2Gametype1 into one table so that I can access them as one set of data?  I’ve tried some different pieces of code people recommended, but I either end up with just the entries from the first table, or no entries at all.

Your help would be most appreciated.

@horacebury,

So with the help of someone on Stack Overflow, I have figured out how to do this using your XML parser.  Mostly.

Here is the code I am using:

[lua]
local function concatMe(fromTable, intoTable)
    for i = 1,#fromTable do
        table.insert(intoTable, fromTable[i])
    end
end

gameAnswers1 = xmlapi:loadFile( "file1.xml", system.ResourceDirectory )
gameAnswers2 = xmlapi:loadFile( "file2.xml", system.ResourceDirectory )
gameAnswers1 = xmlapi:simplify( gameAnswers1 )
gameAnswers2 = xmlapi:simplify( gameAnswers2 )

concatMe(gameAnswers2.gametype1.category, gameAnswers1.gametype1.category)
[/lua]

So this works great if the XML file (structure noted in previous post) that I am copying from has more than one category node.  If it only has one category node, then the #fromTable resolves as 0.  Add another category node and it resolves as 2.

Any idea why that would be?

Yes, the simplify function decides that if you have a list of elements then you need a table of data, but if you have a single item it is probably not a list and makes it available as properties. If you pass the last parameter to simplify as true it will dump out the XML structure as it sees it and you should see just where all your values are being put.
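One way to live with that behaviour is to normalise a value before treating it as a list. The sketch below is a workaround under stated assumptions, not something the library provides: `asList` and `concatInto` are hypothetical helpers, the two sample tables are hand-written stand-ins for what ‘simplify’ is described as returning, and the check assumes a real list always has an entry at index 1.

```lua
-- Hypothetical helper: always hand back a numerically indexed list.
--   nil            -> {}
--   existing list  -> returned unchanged
--   single item    -> wrapped in a one-element list
local function asList( value )
    if value == nil then
        return {}
    end
    if type( value ) == "table" and value[1] ~= nil then
        return value
    end
    return { value }
end

-- Append every entry of fromList to intoList and return intoList.
local function concatInto( fromList, intoList )
    for i = 1, #fromList do
        intoList[#intoList + 1] = fromList[i]
    end
    return intoList
end

-- Stand-ins for two simplified files: the first has two categories,
-- the second only one, so its category is not a list.
local file1 = { gametype1 = { category = { { name = "Cat 1" }, { name = "Cat 2" } } } }
local file2 = { gametype1 = { category = { name = "Cat 3" } } }

local combined = {}
concatInto( asList( file1.gametype1.category ), combined )
concatInto( asList( file2.gametype1.category ), combined )

print( #combined )          -- 3
print( combined[3].name )   -- Cat 3
```

Collecting into a fresh `combined` table also avoids modifying either of the loaded tables.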

I’ve taken your XML file and loaded it once with this code:

[lua]
local function dump(t)
    for k,v in pairs(t) do
        print(k,v)
    end
end

local json = require("json")
local xmlapi = require( "xml" ).newParser()

local gameAnswers1 = xmlapi:loadFile( "file1.xml", system.ResourceDirectory )
gameAnswers1 = xmlapi:simplify( gameAnswers1, nil, nil, true ) -- ( xml, tbl, indent, dumpToConsole )

gameAnswers1Gametype1 = {}
gameAnswers1Gametype1 = gameAnswers1.gametype1

dump(gameAnswers1Gametype1)
[/lua]

I slightly modified your XML, by changing the number of child elements in the second category:

[lua]
<?xml version="1.0" encoding="utf-8"?>
<library>
    <gametype1>
        <category name="Cat 1">
            <answer>Answer One</answer>
            <answer>Answer Two</answer>
            <answer>Answer Three</answer>
        </category>
        <category name="Cat 2">
            <answer>Answer 4</answer>
        </category>
    </gametype1>
    <gametype2>
        <category name="Cat 3">
            <answer>Answer Five</answer>
            <answer>Answer Six</answer>
            <answer>Answer Seven</answer>
        </category>
        <category name="Cat 4">
            <answer>Answer Eight</answer>
            <answer>Answer Nine</answer>
            <answer>Answer Ten</answer>
        </category>
    </gametype2>
</library>
[/lua]

Which dumps out this:

[lua]
library {
    gametype1 {
        category .name = {
            answer
            answer ,
            answer ,
        }
        category .name = {
            answer
        } ,
    }
    gametype2 {
        category .name = {
            answer
            answer ,
            answer ,
        }
        category .name = {
            answer
            answer ,
            answer ,
        } ,
    }
}
category table: 02EF3B20
[/lua]

To explain my post above:

If (as the code has) you do:

[lua]
dump(gameAnswers1Gametype1)
print(#gameAnswers1Gametype1)
print(#gameAnswers1Gametype1.category)
print(" ")
dump(gameAnswers1Gametype1.category[2])
[/lua]

you will see:

[lua]
category        table: 02EDC400
0
2
 
answer  Answer 4
name    Cat 2
__special       table: 02EDC718
[/lua]

in the terminal.

The first terminal line is because ‘gameAnswers1Gametype1’ contains one named table entry called ‘category’.

The second terminal line is because ‘gameAnswers1Gametype1’ does not contain numerically indexed items ( [1], [2], [3], etc. ) 

The third terminal line is because  ‘gameAnswers1Gametype1.category’ is a table which contains two items.

Terminal lines 5,6,7 are the contents of the second table item found in the ‘gameAnswers1Gametype1.category’ table.

Btw, ‘__special’ is for internal working data in my ‘simplify’ function. Please don’t mess with that.

I believe I understand, and I certainly couldn’t write this kind of code myself to build and simplify tables from XML data. But when you are loading data whose quantity and number of nodes you might not know in advance, it seems odd that it wouldn’t build the data structure consistently for all nodes so that everything could be referenced the same way.

I don’t believe this is just an issue with your XML parser.  I believe the other one that I was using likely had the same problem as well.

It just seems very odd to me that with how useful this code really is, there is that one big gotcha in it.  But maybe it’s a gotcha that only I would ever run across.  I seem to do that a lot when coding with Corona.  “Wait, you want to use it how?”  :)

Thanks, @horacebury, for your help.  It is much appreciated.

Yes, I understand your point. If you ignore the simplify functions in my library you have Jonathan Beebe’s XML library - I just added a couple of functions.

Beebe’s library does a very good job of loading an XML file into table format. From there you can treat it exactly as I believe you want to: every element structure is stored as a table.

What I wanted was a direct representation of collections versus values. I decided that if an element appeared only once then it deserved dot notation access, whereas if it appeared more than once then it was indexed numerically.
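That rule can be illustrated with a short sketch. This is not the ‘simplify’ function from the library: `simplifySketch` is a hypothetical stand-in that only shows the once-versus-many decision, keeps text under `.value` as in the first post, and lets a child element overwrite a property of the same name. The real function’s output may differ in its details.

```lua
-- Hypothetical sketch of the grouping rule only.
local function simplifySketch( node )
    local result = {}

    -- Properties become plain keys on the element's own table.
    for key, value in pairs( node.properties or {} ) do
        result[key] = value
    end
    if node.value then
        result.value = node.value
    end

    -- Collect the simplified children under their element name.
    local groups = {}
    for _, childNode in ipairs( node.child or {} ) do
        groups[childNode.name] = groups[childNode.name] or {}
        table.insert( groups[childNode.name], simplifySketch( childNode ) )
    end

    for name, group in pairs( groups ) do
        if #group == 1 then
            result[name] = group[1]   -- appears once: dot notation access
        else
            result[name] = group      -- appears several times: numeric index
        end
    end
    return result
end

-- Usage: two categories, the second with a single answer.
local gametype = {
    name = "gametype1",
    child = {
        {
            name = "category", properties = { name = "Cat 1" },
            child = {
                { name = "answer", value = "Answer One" },
                { name = "answer", value = "Answer Two" },
            },
        },
        {
            name = "category", properties = { name = "Cat 2" },
            child = {
                { name = "answer", value = "Answer 4" },
            },
        },
    },
}
local simple = simplifySketch( gametype )

print( #simple.category )                    -- 2
print( simple.category[1].answer[2].value )  -- Answer Two
print( simple.category[2].answer.value )     -- Answer 4
print( #simple.category[2].answer )          -- 0, the single item is not a list
```

The last line is the gotcha discussed above: the same element name yields a list in one category and a single table in the other.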

Perhaps what you really want is to represent your data in a JSON format. This is built into Corona with the json.* library (see docs) and converts very, very nicely (not least, natively!) back and forth between files and tables. This is my choice where possible these days.

Thanks, maybe I will consider doing JSON.
