Suggestions on how to parse this string 12 -23.12 14.4 -1?

DiscWiz · January 19, 2011, 6:13am

I have been fighting for a couple of days to parse this simple string into its sub-parts. The output I am looking for is:

12
-23.12
14.4
-1
Any suggestions?

Dave

DiscWiz · January 19, 2011, 7:49am

“12 -23.12 14.4 -1”

jhocking · January 19, 2011, 7:50am

what string?

DiscWiz · January 19, 2011, 8:11am

I tried this with no luck. Sorry, I'm very new to this.

local string
for x in string.split("a,b,c", ",") do
print(x)
end

but I get a runtime error:

Runtime error
/Corona/main.lua:5: attempt to index local 'string' (a nil value)
stack traceback:
[C]: ?
/Corona/main.lua:5: in main chunk
Runtime error: /Corona/main.lua:5: attempt to index local 'string' (a nil value)
stack traceback:
[C]: ?
/Corona/main.lua:5: in main chunk
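
For anyone hitting the same message: a bare `local string` declares a new local variable whose value is nil, and that local hides Lua's global `string` library for the rest of the scope. Separately, stock Lua has no `string.split`, so the call would fail even without the shadowing. A minimal sketch of the shadowing effect (the function name `shadowed` is made up for illustration):

```lua
-- Hypothetical demo function; it reproduces the "attempt to index" failure safely.
local function shadowed()
  local string  -- nil: hides the global `string` table inside this function
  -- pcall catches the error instead of stopping the program.
  return pcall(function() return string.len("abc") end)
end

local ok, err = shadowed()
print(ok, err)            -- ok is false; err holds the "attempt to index" message
print(string.len("abc"))  -- outside that scope the global library still works
```

Deleting the `local string` line removes this particular error.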

jhocking · January 19, 2011, 8:36am

Look up the split() command.

ADDITION: Oops, forgot this is Lua; too used to Python. Whatever, it’s still the first result when you look up that command:
http://lua-users.org/wiki/SplitJoin

Ludicrous_Software · January 19, 2011, 8:41am

Hi,

Try this:

[lua]local s = "12 -23.12 14.4 -1"

-- Collect each run of non-space characters into the table t.
local t = {}
for w in string.gmatch(s, "%S+") do
  t[#t+1] = w
end[/lua]

(This is adapted from sample code in the Lua reference manual.)

Every substring you want to pull out of your string will be added to the table t. So once this is all done:

[lua]t[1] = "12"
t[2] = "-23.12"[/lua]

and so on.

"%S+" collects runs of non-space characters: %S on its own matches a single non-space character, and the '+' tells Lua to match one or more of them, which is how you get whole words.
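
Note that the entries in t are still strings. If numeric values are wanted, as the original question suggests, each token can be passed through `tonumber`. A minimal sketch, assuming tokens that do not parse as numbers should simply be skipped (the name `parseNumbers` is made up for illustration):

```lua
-- Hypothetical helper: split on whitespace and keep only tokens that parse as numbers.
local function parseNumbers(s)
  local numbers = {}
  for token in string.gmatch(s, "%S+") do
    local n = tonumber(token)  -- returns nil when the token is not a valid number
    if n then
      numbers[#numbers + 1] = n
    end
  end
  return numbers
end

local values = parseNumbers("12 -23.12 14.4 -1")
for i, v in ipairs(values) do
  print(i, v)
end
```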

jhocking · January 19, 2011, 8:55am

The point of that link is to describe how to do the same thing as the split() command from other languages like Python and Perl.

In other words, Lua doesn’t have a split() command, my bad.
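
For reference, a small split() lookalike can be built on gmatch. This is only a sketch, not one of the implementations from the wiki page: the name `split` is made up, and it assumes a single-character separator and drops empty fields.

```lua
-- Hypothetical helper, not part of Lua's standard library.
local function split(text, separator)
  -- Escape punctuation so characters like "." or "-" are taken literally inside the set.
  local escaped = separator:gsub("%p", "%%%0")
  local parts = {}
  -- Each match is a run of characters that are not the separator.
  for piece in string.gmatch(text, "[^" .. escaped .. "]+") do
    parts[#parts + 1] = piece
  end
  return parts
end

for _, x in ipairs(split("a,b,c", ",")) do
  print(x)
end
```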

jhocking · February 1, 2011, 6:29pm

That is a really useful bit of code, thanks!

For everyone else, here is more information about Lua’s string commands:
http://lua-users.org/wiki/StringLibraryTutorial

ADDITION: It’s also pretty handy to use commas as a separator; the pattern for that is "[^,]+".
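
One thing to watch with "[^,]+": it silently skips empty fields, because '+' needs at least one character. A short sketch of the difference, with a variant that keeps empty fields by appending a trailing comma (the sample data and variable names are made up for illustration):

```lua
local csv = "12,-23.12,,14.4"

-- "[^,]+" needs at least one character per match, so the empty field disappears.
local nonEmpty = {}
for field in string.gmatch(csv, "[^,]+") do
  nonEmpty[#nonEmpty + 1] = field
end

-- Appending a comma and capturing "([^,]*)," keeps empty fields as "".
local allFields = {}
for field in string.gmatch(csv .. ",", "([^,]*),") do
  allFields[#allFields + 1] = field
end

print(#nonEmpty, #allFields)
```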

ADDITION2: That code using gmatch() is impressively concise, but I just noticed in Code Exchange there seems to be a more robust version of this same idea.

http://developer.anscamobile.com/code/explode-function

jhocking · May 18, 2011, 4:53am

I just found an extensive pattern matching library for Lua called LPeg:
http://www.inf.puc-rio.br/~roberto/lpeg/

It looks way more complex than is needed for this problem (Lua’s built-in gmatch is plenty) but the fact that it has an add-on module to work with regular expressions makes it very appealing:
http://www.inf.puc-rio.br/~roberto/lpeg/re.html

Even though it’s overkill, I’d be tempted to use the latter method simply to keep in sync with how other languages (Perl and Python especially) do regular expressions.

ADDITION:

I found another related post online:

http://stackoverflow.com/questions/6033807/split-a-string-on-a-separator

ewing · May 19, 2011, 1:12pm

LPeg requires a C module to be included so you can’t currently use it in Corona.

We recently started using LPeg internally in Corona to help with the Android file and package name validation rules. But LPeg is currently not exposed, so you can’t use it in your scripts.

LPeg is very impressive and seems a lot more elegant and powerful than regex. I also like it because it seems to be much more readable and maintainable than regex patterns. With regex, I write a pattern once, and a week later I forget how it works. If you have to fix it or change it, it can be very difficult sometimes.

For our Android package name identifier, we need to make sure package names follow certain rules (similar to the Java rules but, to our annoyance, not identical). They must look something like:
com.ansca.ourapp
where there are at least two word segments separated by a single period, alphanumeric characters and underscore are allowed for each word segment but the first character of each word may not be a number or underscore, and you are not allowed to have any Java reserved words as a word segment.

Using the LPeg re module, our grammar looks like this:

local packageIdentifierGrammar = re.compile([[
 PackageIdentifer (<dot><identifier>)+ !. }
 Dot Identifier <baseidentifier>
 BaseIdentifier JavaReservedToken ![A-Za-z0-9_]+
 JavaReservedWord
]])
I consider this generally readable and maintainable, while just thinking about what the regex pattern would need to be gives me a headache (particularly the complement parts about not allowing Java reserved words). I suspect we would end up writing multiple post-processing functions to do each step in pieces instead, but with LPeg it’s all there in that code snippet.
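
Since LPeg is not available to Corona scripts, the package name rules described above can be approximated with plain Lua patterns. This is a rough sketch and not the grammar we use internally: the function name `isValidPackageName` is made up, and the reserved word table is a deliberately tiny subset for illustration.

```lua
-- Illustrative subset only; a real check would list every Java reserved word.
local RESERVED = {
  class = true, int = true, new = true, package = true,
  public = true, static = true, void = true,
}

-- Hypothetical validator for names such as "com.ansca.ourapp".
local function isValidPackageName(name)
  -- Reject empty segments: a leading dot, a trailing dot, or two dots in a row.
  if name == "" or name:find("^%.") or name:find("%.$") or name:find("%.%.") then
    return false
  end
  local count = 0
  for segment in name:gmatch("[^.]+") do
    -- A letter first, then any mix of letters, digits, and underscores.
    if not segment:match("^[A-Za-z][A-Za-z0-9_]*$") then
      return false
    end
    if RESERVED[segment] then
      return false
    end
    count = count + 1
  end
  -- At least two word segments are required.
  return count >= 2
end

print(isValidPackageName("com.ansca.ourapp"))
print(isValidPackageName("com.class.ourapp"))
```

It works, but the rules end up spread across several separate checks, which is the piecemeal style the single grammar avoids.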

The only downside to LPeg is that it is relatively new so there aren’t a lot of tutorials or documentation out there.

Anyway, I think there is a lot of potential for Corona users if we make it available. Things like parsing XML or JSON or even Lua come to mind. And I liked reading this LPeg tutorial which made me think of both parsing map directions and writing a classic adventure/mud game.
http://www.gammon.com.au/forum/?id=8683

But so far nobody in the Corona community has requested LPeg.
