String Searching Question

Hello, I’m currently having trouble understanding how Lua’s string functions work and how to use them to solve this problem.

function main()
    local webpage = [=[[html]

Blah blah blah blah

<h1><a href="http://www.example.com/test.htm">...</a></h1>

<h1>...</h1>
[/html]]=]
    -- Loop through the webpage string and get the text between "<h1>" and "</h1>" for every occurrence, then get the URL in the anchor tag for each one
    -- Use gsub? string.find? Not sure what to do
end

main()

As you can see, I’m trying to loop through the webpage string, get the text between each h1 and /h1 pair, and then use that to get the URL in the anchor tag (the http://www.example.com/test.htm part). You may be asking what the point is of narrowing it down to the anchor tag within the h1 tag. It’s because the real webpage I’m using has many anchor tags, and I only want the URLs of the ones inside h1 tags.

I’ve looked at some XML parsers for Lua, but frankly they seem like overkill for something like this, and I’m not even sure they’re supported by Corona.

Regards and thanks for any help.

Hello stragy!

Is it necessary for you to use Lua to parse this HTML? I have done something similar using a PHP script that returns the parsed content in JSON format. The JSON is then very easy to decode into a Lua table in your app.

You could use, for example, Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/), which worked nicely for my needs. Of course, you need a web server that can host PHP scripts.

Br,

Kalle

Wow, thanks for the fast reply, hytka81. Yes, I will definitely use that instead if this kind of thing is too clunky for Lua.

HTML is closely related to XML, and if the page is well-formed you might be able to use an XML parser to break the tags apart. The parsing is done with the string.gsub() function and pattern matching. There are several XML parsers for Lua out there. We blogged about one last year here:

http://www.coronalabs.com/blog/2011/07/29/how-to-use-xml-files-in-corona/
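If a full parser feels like overkill, the pattern-matching idea can be sketched directly. This is only a rough sketch, not a general HTML parser: it uses string.gmatch rather than string.gsub (we only need to read the matches, not replace them), and it assumes lowercase tags, double-quoted href values, and h1 elements that are not nested. The sample markup, the extra URLs, and the helper name extractHeadingLinks are made up for illustration.

```lua
-- Sample page; the markup and the extra URLs are hypothetical stand-ins.
local webpage = [[
<html>
<p>Blah blah blah blah</p>
<a href="http://www.example.com/ignored.htm">Not inside an h1</a>
<h1><a href="http://www.example.com/test.htm">First link</a></h1>
<h1><a href="http://www.example.com/test2.htm">Second link</a></h1>
</html>
]]

-- Hypothetical helper: collect the href of every anchor found inside an <h1> element.
local function extractHeadingLinks(html)
    local urls = {}
    -- "<h1[^>]*>" matches the opening tag, with or without attributes;
    -- "(.-)" lazily captures everything up to the nearest "</h1>".
    for heading in html:gmatch("<h1[^>]*>(.-)</h1>") do
        -- Inside each heading, capture the double-quoted href value of each anchor.
        for url in heading:gmatch('<a[^>]-href="([^"]*)"') do
            urls[#urls + 1] = url
        end
    end
    return urls
end

local urls = extractHeadingLinks(webpage)
for i, url in ipairs(urls) do
    print(i, url)
end
```

The outer loop narrows the search to the h1 blocks first, so the anchor outside any h1 is never looked at. Real-world pages with uppercase tags, single-quoted attributes, or malformed markup would need looser patterns or a proper parser.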

