Extract Image URL from a String

I’m using string.match to extract the image URL from an RSS feed into my app.  But I’m just not knowledgeable enough about regular expressions to do it properly. 

My feed is stored in a table called stories[1].content_encoded (thank you, Rob Miracle for the super RSS reader code).  So the string looks kinda like this: 

Lorem ipsum dolor sit amet, consectetur adipiscing elit. <img src="http://www.myDomain/wp-content/2013/08/myImage.jpg" alt="my image" />Nunc commodo, metus ornare convallis blandit, arcu tortor hendrerit sapien, vel porttitor tortor magna non urna. Mauris ut tincidunt massa. Sed bibendum pharetra aliquam.

I got as far as this:

local imgURL = string.match(stories[1].content_encoded, (" src=.* " ))

but obviously it matches the whole line, including my alt attribute.

I just want to pull out the URL, without the quotes, so I can store it in a local variable and pass that to display.loadRemoteImage. I’m sure it can be done with a RegEx; I just don’t know how.

Why don’t you use an XML parser?

http://developer.coronalabs.com/code/much-improved-dump-function-and-xml-simplify

I get errors left and right when I use that code.  Which is why I went with:

https://github.com/robmiracle/Corona-SDK-RSS-Reader

Which works great.  

Thank you for the suggestion but I just need a RegEx to extract the image URL from my string.  

Any takers?

I think the following should work:

[lua]

local imgURL = string.match(stories[1].content_encoded, ( [[img.-src="(.-)"]] ))

[/lua]

You have to use a capture (the part in parentheses) to get just that portion of the string. The .- is the non-greedy counterpart of .*, so the match stops at the first closing quote instead of running on to the end of the alt attribute. I also used Lua’s [[ ]] long-string notation so that I could use " within the pattern without having to escape it.
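To see the difference between the greedy and non-greedy versions, here is a minimal standalone sketch. The `content` string is just a shortened copy of the sample from the question (an assumption standing in for stories[1].content_encoded), and the variable names `greedy` and `missing` are made up for illustration. It also assumes the src attribute is wrapped in straight double quotes.

```lua
-- Stand-in for stories[1].content_encoded (shortened sample from the question).
local content = [[Lorem ipsum. <img src="http://www.myDomain/wp-content/2013/08/myImage.jpg" alt="my image" />Nunc commodo.]]

-- Greedy: .* grabs as much as it can, so the capture runs to the LAST
-- double quote in the string and drags the alt attribute along.
local greedy = string.match(content, [[src="(.*)"]])

-- Non-greedy: .- grabs as little as it can, so the capture stops at the
-- FIRST double quote after src=" and holds only the URL.
local imgURL = string.match(content, [[<img.-src="(.-)"]])

print(greedy)
print(imgURL)

-- string.match returns nil when the pattern is not found, so it is worth
-- checking the result before handing it to anything that loads the image.
local missing = string.match("no image in this story", [[<img.-src="(.-)"]])
if missing == nil then
    print("no image found")
end
```

If some stories in the feed have no image, checking for nil like this before loading avoids passing a nil URL along.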

  • Andrew

Thanks a million!  I would have struggled with that for a while. 
