Error handling in Lua/Corona - pro question

Hello,

I have come to the very last point of the app development (no, it’s not doing the docs),
which is decent error handling.

If something fails at runtime (for whatever reason), Corona/Lua will throw an error, leave the current function, and go on with execution. This can be a problem, since that function might have needed to do something important, and skipping it then causes other errors or ugly graphic glitches such as flickering transitions or a shattered stage.

Using pcall or xpcall does not really help in Corona, since inner functions and nested (async) calls will not get handled properly. Furthermore, with classes and OO the code becomes quite insane. I am using middleclass for OO, which has been a great help but does not work well together with pcall. So again, placing pcall everywhere might work most of the time (despite the overhead), but it is not a very good solution and has problems with functions that have return values and with nested functions.

Since “real” exception handling is not something we can do with Lua[1][2] to handle errors appropriately, the only thing I want to do is:

  1. Catch the Runtime error (catch-all)
  2. Save an error.log with device, os, app state, stacktrace and the error.
  3. If there is network, send the error.log (if the user is OK with it) to my server so I can examine it (or send it next time)
  4. Open an alert() window telling the user that it all went to hell.
  5. Perform os.exit() to close the app to prevent any unexpected behavior.
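The five steps above could be sketched roughly like this in plain Lua. This is only a sketch under assumptions: `guard`, `writeLog`, `buildReport` and the `opts` hooks (`logPath`, `getState`, `onFatal`) are hypothetical names, not anything Corona provides, and only pcall/xpcall, debug.traceback, io and os come from standard Lua.

```lua
local unpack = table.unpack or unpack -- Lua 5.1 (Corona) vs. 5.2+

-- Step 2: build the report text. `appState` is whatever the app passes in.
local function buildReport(message, trace, appState)
    return table.concat({
        "time:  " .. os.date("!%Y-%m-%d %H:%M:%S") .. " UTC",
        "state: " .. tostring(appState),
        "error: " .. tostring(message),
        tostring(trace),
    }, "\n")
end

-- Step 2: append the report to a log file; returns true on success.
local function writeLog(path, report)
    local file = io.open(path, "a")
    if not file then return false end
    file:write(report, "\n\n")
    file:close()
    return true
end

-- Step 1: wrap `fn` so that any error raised inside it is caught.
-- opts.logPath, opts.getState and opts.onFatal are hypothetical hooks.
local function guard(fn, opts)
    local function handler(err)
        -- Runs before the stack unwinds, so the traceback is still useful.
        return { message = err, trace = debug.traceback(tostring(err), 2) }
    end
    local function finish(ok, ...)
        if ok then return ... end -- pass all return values through
        local info = ...
        local state = opts.getState and opts.getState() or "unknown"
        local report = buildReport(info.message, info.trace, state)
        if opts.logPath then writeLog(opts.logPath, report) end
        if opts.onFatal then opts.onFatal(report) end -- steps 3-5 go here
    end
    return function(...)
        local args = { n = select("#", ...), ... }
        return finish(xpcall(function()
            return fn(unpack(args, 1, args.n))
        end, handler))
    end
end
```

The catch is the one described earlier in this post: the wrapper only covers the function it wraps, so every listener and timer callback would need to be passed through `guard` individually. In Corona, `onFatal` would presumably be the place to call native.showAlert and os.exit, and the log path would come from system.pathForFile.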

Why do I want to do this? First of all, it is very hard to find a bug from 1-star “frequent crashes” ratings, and second, I don’t want unexpected behavior, which will look very unprofessional to the user. Also, a crash with no message might leave a bitter taste. Of course, if something fails hard and results in a segfault there might be no holding back, but no solution is perfect.

So the question is: Is there a way to catch errors in Corona? Ideal would be to register a runtime event listener that listens for errors. I guess this would improve the quality of the apps made with corona significantly.

Any ideas on this?

Alex

[1] http://www.lua.org/wshop06/Belmonte.pdf
[2] http://failboat.me/2010/lua-exception-handling


Those depend on functions in Lua that are sandboxed in the Corona SDK version of Lua, so that level of error handling cannot be achieved. However, you can use pcall to have some control over your errors.
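For what it's worth, here is a minimal sketch of what that pcall-style control can look like while keeping multiple return values and a stack trace. The helper name `try` and the `divide` example are made up for illustration; xpcall and debug.traceback are standard Lua, assuming the debug library is available in your build.

```lua
local unpack = table.unpack or unpack -- Lua 5.1 (Corona) vs. 5.2+

-- Hypothetical helper: call fn(...) in protected mode.
-- On success it returns true plus all of fn's results;
-- on failure it returns false plus a traceback string.
local function try(fn, ...)
    local args = { n = select("#", ...), ... }
    return xpcall(function()
        return fn(unpack(args, 1, args.n))
    end, function(err)
        return debug.traceback(tostring(err), 2)
    end)
end

-- Example function with two return values and one failure case.
local function divide(a, b)
    if b == 0 then error("division by zero") end
    return a / b, a % b
end

local ok, quotient, remainder = try(divide, 7, 2)
-- ok == true, quotient == 3.5, remainder == 1

local ok2, trace = try(divide, 1, 0)
-- ok2 == false; trace holds the message plus a stack trace
```

This only protects the call made through `try`. Errors raised later inside listeners or timer callbacks registered by that function are not covered, which is the limitation raised in the original question.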

cheers,

:)

Just wondering, what kinds of operations are you doing that might cause crashes? For me the best way to deal with exceptions is to try to prevent the major causes of them in the first place: force global declarations, use lots of asserts, avoid complex inheritance hierarchies with middleclass, use state machines to more cleanly isolate transitional code and minimize race conditions, and TEST.
Besides that, I rarely use pcall unless it’s to perform some sort of I/O, like requiring a file that may not exist, such as sprite sheets of a certain size.
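A minimal sketch of the "force global declarations" idea, assuming a metatable on `_G` is acceptable in your project. `lockGlobals` and `declareGlobal` are hypothetical names, and the erroring `__index` can break library code that probes for optional globals, so treat it as a starting point rather than a drop-in.

```lua
-- Hypothetical helper: after lockGlobals() runs, reading or writing an
-- undeclared global raises an error instead of silently yielding nil.
local function lockGlobals()
    setmetatable(_G, {
        __newindex = function(_, name)
            error("assignment to undeclared global '" .. tostring(name) .. "'", 2)
        end,
        __index = function(_, name)
            error("read of undeclared global '" .. tostring(name) .. "'", 2)
        end,
    })
end

-- Declare a global on purpose, bypassing the metatable.
-- The value must be non-nil, otherwise nothing is actually declared.
local function declareGlobal(name, value)
    rawset(_G, name, value)
end

lockGlobals()
declareGlobal("score", 0)
score = 10 -- fine: declared above

-- A typo such as `scroe` now fails loudly instead of creating a new global.
local ok, err = pcall(function() scroe = 11 end)
```

Here `ok` is false and `err` names the misspelled variable, so the mistake shows up at the line that made it rather than as a mysterious nil somewhere else.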

Also, just from experience, there are a couple of things in Corona that are crash/error prone, such as:

  • Sending bad data into the sprite API calls. I use Spriteloq… actually I wrote Spriteloq :) so the API does stuff to shield me from sending bad data in.
  • The event dispatching system also throws mysterious errors and behaves poorly. I’ve documented this here: http://developer.anscamobile.com/forum/2011/08/01/possible-addeventlistener-bugs
  • I get errors when trying to dispose of sprite sheets that have recently had their sprite instances call removeSelf. I had to add a timer to delay the removal.

Anyways, that’s enough ranting, but as much as Corona makes me productive I’d like to see these warts cleared up. So I just want to give people a heads up.

Thanks for the reply,

well, it’s not that my app is unstable; in fact I would rate it as rock solid. I even made test scripts (okay, it’s not unit testing, but still) to test for regression bugs. Using middleclass, the inheritance hierarchy is pretty flat with just a little inheritance (AbstractGameSituation - ConcreteGameSituation), since copy-pasting is somewhat error-prone. All the very sensitive parts you mentioned (like opening a file) are sanity-checked before referencing anything. I am not trying to write banana-ware. My fear is that someone somewhere has some kind of device (probably Android) and something happens that I have not thought of. That’s why I was trying to build a last line of defense, so I can provide quality work and not just stand there scratching my forehead ;) Besides that, exception handling is always a good idea; even Symbian OS has TRAPD().

For your problems:
-Sending bad data into the sprite API calls. I use Spriteloq… actually I wrote Spriteloq :) so the API does stuff to shield me from sending bad data in.

BIG thanks for the Spriteloq tip. I won’t need it for this project, but it’s going to make things so much easier on the next one. Right now I am using the Corona sprite API since I am downloading “game” packages and can’t download additional code (the sprite definition Lua file). And yeah, that “sequence number out of …” error is something I saw frequently until I made a class which deals with that.

-The event dispatching system also throws mysterious errors and behaves poorly.

I can partly confirm this. It’s a bit strange that the callback method, listener method, or whatever it’s called will stay alive even when the function/object holding it is destroyed. And adding an event listener twice will result in the callback function getting called twice, which is a bit odd.
For example, if you add it 100 times (because a loop goes wrong) to perform a transition to x=0 in the listener, you might not even notice that this transition is called 100 times. I found out that this is where a lot of errors come from, because it happens so easily. Also, catching the event the first time by returning true won’t help here. Also, calling events from a callback/listener method of another event does not work.
If you write a class/module that keeps track of the active event listeners, it helps a lot (give it 3 states: active, deactivated and removed). For a quick test, just add a global variable to the callback and count how many times it’s called. Since I started doing that, I am fine with events. But adding and removing event listeners all the time is a problem, even more so if it happens fast in loops. I even implemented a message system with events (similar to a command pattern) that drives the main game state machine to avoid race conditions and unhandled situations. So the user cannot move somewhere he does not belong, and if the watchdog finds that something goes wrong (positions, dialogs, network) I can bring the game back on track.
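To make the listener-tracking idea concrete, here is a rough sketch of such a module with the three states. `ListenerRegistry` and its method names are hypothetical; the only assumption about the wrapped object is that it offers addEventListener/removeEventListener the way Corona's Runtime and display objects do.

```lua
-- Hypothetical ListenerRegistry: refuses to attach the same listener twice
-- and remembers whether each one is "active", "deactivated" or "removed".
local ListenerRegistry = {}
ListenerRegistry.__index = ListenerRegistry

function ListenerRegistry.new(target)
    return setmetatable({ target = target, entries = {} }, ListenerRegistry)
end

function ListenerRegistry:state(eventName, listener)
    local byEvent = self.entries[eventName]
    return byEvent and byEvent[listener] or nil
end

-- Attach the listener unless it is already active. Returns true if attached.
function ListenerRegistry:add(eventName, listener)
    local byEvent = self.entries[eventName]
    if not byEvent then
        byEvent = {}
        self.entries[eventName] = byEvent
    end
    if byEvent[listener] == "active" then
        return false -- duplicate registration, e.g. from a loop gone wrong
    end
    self.target:addEventListener(eventName, listener)
    byEvent[listener] = "active"
    return true
end

-- Detach the listener if it is active and record the new state.
local function detach(self, eventName, listener, newState)
    if self:state(eventName, listener) == "active" then
        self.target:removeEventListener(eventName, listener)
    end
    if self.entries[eventName] then
        self.entries[eventName][listener] = newState
    end
end

function ListenerRegistry:deactivate(eventName, listener)
    detach(self, eventName, listener, "deactivated")
end

function ListenerRegistry:remove(eventName, listener)
    detach(self, eventName, listener, "removed")
end
```

In a Corona project this would presumably be used as `ListenerRegistry.new(Runtime)`, with all registrations going through `add` so a listener can never be attached 100 times by accident.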
I watched your video and I can’t make it out; above is just what I experienced, maybe it helps. (Btw. I LOVE the bunny ;))

I get errors when trying to dispose of sprite sheets…
Yes. Same here.
This error handling thing is the very last thing I would like to have before I release, since so many things can go wrong and I don’t have 1000 devices here to test. Also, automated test clouds like DeviceAnywhere are quite expensive. Maybe I can request that error handling feature from Corona… =)

Cheers!
Alex


You bring up a very good point about having to test on a bunch of different devices. There’s a bunch of issues you might run into on Android: sound, texture memory, I/O differences, screen resolution differences.

Supposedly Ansca has to be testing their stuff on actual devices, so really they’re our first line of defense against device-specific errors with regard to the API. If you’re in the area, it might be possible to visit them to test your app.

But some sort of crash log would be nice. If you request it, I’ll second it. :)

I even implemented a message system with events (similar to a command pattern) that drives the main game state machine to avoid race conditions and unhandled situations.

I’ve implemented my own hierarchical state machine based off of the framework here: and using it for my games’ entities always leads to more robust code that, when it does fail, helps me easily track down the location of failures. It can be tedious and a pain in the ass to set up, though, given the nature of Lua and its lack of actual names for functions.
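As a much-simplified illustration (flat, not hierarchical, and not the framework mentioned above), a state machine with named states can report exactly which state failed to handle which message. `StateMachine` and its fields are hypothetical names.

```lua
-- Hypothetical flat state machine: each state is a named table of handlers.
local StateMachine = {}
StateMachine.__index = StateMachine

function StateMachine.new(states, initial)
    assert(states[initial], "unknown initial state: " .. tostring(initial))
    return setmetatable({ states = states, current = initial }, StateMachine)
end

-- Deliver a message to the current state. A handler may return the name of
-- the next state. Unhandled messages and unknown target states raise errors
-- that name the state involved, which makes failures easy to locate.
function StateMachine:send(message, ...)
    local handler = self.states[self.current][message]
    if not handler then
        error(string.format("state '%s' cannot handle message '%s'",
            self.current, tostring(message)), 2)
    end
    local nextState = handler(self, ...)
    if nextState ~= nil then
        if not self.states[nextState] then
            error(string.format("state '%s' requested unknown state '%s'",
                self.current, tostring(nextState)), 2)
        end
        self.current = nextState
    end
    return self.current
end

-- Example: a two-state entity.
local machine = StateMachine.new({
    idle    = { start = function() return "running" end },
    running = { stop  = function() return "idle" end },
}, "idle")

machine:send("start") -- now in "running"
local ok, err = pcall(machine.send, machine, "start") -- not valid while running
```

Since the states are keyed by name, the error message carries the names that Lua functions themselves lack.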

I watched your video and I can’t make it out; above is just what I experienced, maybe it helps. (Btw. I LOVE the bunny ;))
Hmm, you should be able to view the videos at 720p if things are fuzzy. Hopefully the bunny will make it into one of our games. :)

Cheers!

I “third” this request…we need a handler for errors…

Has a runtime error event been implemented that we can pass to a handler?

+1 Error handler request!
:)