com.ansca.corona.CoronaService ANR affects 20k users

Hey Troy. I don’t have a good list of the widget bugs that crash and turn into the reports you are seeing, so for #1 I don’t have a good answer for you.

And for #2, yes, we’ve made changes to our version of Lua when we’ve needed to address things, but what you’re asking for would be a fundamental change in how Lua works. In other words, I seriously doubt we could or would change how Lua handles errors in chunks, since there is an expectation that Lua works that way.

The Corona team is busy at the moment, and it may not seem like it, but addressing these Android crashes is the highest priority of our engineering team right now.

There seem to be two core problems: 1. OpenAL audio crashes. 2. Changes to Google Play core services libraries.

In many cases, people are seeing jumps in ANRs and crashes even though they haven’t updated their apps in a long time. Corona can’t magically change your apps once they are on a device, so we have to look at what else may have changed. Many of these crashes, as you note, are happening on Android 7 and Android 8, which points to updates to Google Play. Many plugins touch Google Play’s base libraries, so we pretty much have to rebuild every plugin we are in control of. We also have to get third parties to rebuild theirs now that we’ve updated our dependency library to use a more modern version of the Google Play libraries. This simply takes time and testing to make sure we don’t make the problem worse. It’s possible today, depending on what plugins you use, that you could push a new version and start seeing a decline in these reports. Have we gotten them all yet? No, but we are spending a great amount of engineering effort to address them as fast as we can.

The other crashes are around Android’s horrible support for the OpenAL library. Not only does that generate a bunch of crashes, it performs horribly as well. We have a new under-the-hood audio library in beta now for Android that is free of OpenAL and is call-compatible with audio.*, so when it’s released you should just need to rebuild, resubmit, and wait for your community to update. But it still has bugs we are working through, and we don’t want to make things worse by shipping it before we are comfortable that it’s ready to go.

I can assure you we are working to get these addressed. As far as hardening Lua, I’m not sure what we can do there. We always look to harden Corona and we use plugins already to help maintain the integrity of the engine. But your device operating systems are a moving target and we always have to be agile to adapt to these changes. 

Rob

Thank you for the update Rob, really appreciate it!

I don’t speak for everyone, but I guess one of our main concerns is also how fast we can deliver the fix to our players. Do you have any ETA on the fixes so that we can at least communicate this to our players?

In my case, my game is on Google Play’s Early Access and I’m in the process of coordinating my game launch with them but Google will not feature it unless I trim down the ANRs and crashes.

(Updated: This issue RE: runtime error caused by widget.lua has been moved to https://forums.coronalabs.com/topic/72782-runtime-error-caused-by-widgetlua/)

@Rob, thank you for the complete explanation. I understand your point about how hardening Lua could actually create new behavior, making things worse. As for the widget.lua line 27, can you tell me if the new version in development will resolve this issue that nick_sherman and I have both experienced?

Here is a list of plugins that have been fixed: Revmob, AdColony, InMobi, Facebook Audience Network (Trying to determine if it’s the rev-share version, paid version or both), Unity Ads and Chartboost.

If you’re using these, it might be worth producing a release.

Rob

Troy, I am unaware of where any changes to the widget library stand on our priority list. It’s in the queue of things that cause Android crashes, and we are working to get to them as quickly as we can.

Rob

This thread is wandering, and I don’t wish to minimize the impact of the various runtime errors caused by errant code that produces “attempt to index nil”-type errors. Indeed, those should be addressed wherever found, BUT…

I hope your engineering team knows the difference between runtime errors (typically caused by Lua, whether the dev’s own or Corona’s libs like widget/etc, and/or their Java-Lua interfaces) and low-level ANRs (typically caused by flaws in native-code libs, and/or their Java-native interfaces).

To some degree a dev can work around the first class of higher-level problems, even if buried within Corona libs (by monkey-patching them, or using something else), but there’s usually no practical way to work-around the second class of lower-level problems at a dev’s level.

It’s these low-level native-lib “true” ANRs that are having the greatest impact (at least, for those who’ve already done everything else they can do at higher levels to eliminate runtime errors). So that’s where I hope your engineering team is focusing its efforts.

@Rob, Technically, if I wanted to fix the widget errors while waiting for Corona to fix them, I could move it into my project folder and make the changes myself. So, I agree with Dave’s point that priority should be on bugs we (as devs) are unable to reach ourselves. And my CRASH / ANR rate (<1%) is only impacting the traffic Google sends me, which really should be my own responsibility, too. But please, if possible, put widget.lua on the priority list just below this.

Can you point me to where I can download the latest widget folder?

The widget library can be found here:  https://github.com/coronalabs/framework-widget

However, it’s a few months out of sync with our internal version. I’m trying to see what’s involved in getting it updated so everyone has access to the latest version.

Rob

A few months?   @Rob most core files are at least 2 years out of date!

@troy, framework crashes are totally out-of-scope (for us) and need Corona to urgently fix. 

Some progress is being made, so this is at least positive, but IMHO this has been a growing issue since way before Christmas, and HTML builds should really have been paused to concentrate on live apps that are suffering right now.

Today I saw my CRASH count rise again and the number of my downloads go down, again.

The crash seems to impact Android 7 and 8 the most.

Can anyone refer me to a link where I can learn more about how to read this report? Is the top item where it stopped (at com.ansca.corona.Controller.stop (Controller.java:263) - waiting to lock <0x0abd65e7> (a com.ansca.corona.Controller) held by thread 11), or is it the one at the bottom?

@SGS, would you classify this as a framework crash, also?

Here is the main report…

Broadcast of Intent { act=android.intent.action.SCREEN_OFF flg=0x50000010 launchParam=MultiScreenLaunchParams { mDisplayId=0 mBaseDisplayId=0 mFlags=0 } (has extras) }

"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 obj=0x75d0d6a8 self=0xf1305400
  | sysTid=22014 nice=0 cgrp=default sched=0/0 handle=0xf413a534
  | state=S schedstat=( 0 0 0 ) utm=1969 stm=897 core=1 HZ=100
  | stack=0xff66f000-0xff671000 stackSize=8MB
  | held mutexes=
  at com.ansca.corona.Controller.stop (Controller.java:263)
  - waiting to lock <0x0abd65e7> (a com.ansca.corona.Controller) held by thread 11
  at com.ansca.corona.CoronaActivity.requestSuspendCoronaRuntime (CoronaActivity.java:2005)
  at com.ansca.corona.CoronaActivity.onPause (CoronaActivity.java:1828)
  at android.app.Activity.performPause (Activity.java:6894)
  at android.app.Instrumentation.callActivityOnPause (Instrumentation.java:1323)
  at android.app.ActivityThread.performPauseActivityIfNeeded (ActivityThread.java:3791)
  at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3768)
  at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3742)
  at android.app.ActivityThread.handlePauseActivity (ActivityThread.java:3716)
  at android.app.ActivityThread.-wrap16 (ActivityThread.java)
  at android.app.ActivityThread$H.handleMessage (ActivityThread.java:1516)
  at android.os.Handler.dispatchMessage (Handler.java:102)
  at android.os.Looper.loop (Looper.java:154)
  at android.app.ActivityThread.main (ActivityThread.java:6247)
  at java.lang.reflect.Method.invoke! (Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run (ZygoteInit.java:872)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:762)
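
On the question of how to read a report like this: frames in a Java stack trace are listed innermost first, so the top "at" line is where the thread is stopped right now and the bottom line (ZygoteInit.main) is where the thread started. The "waiting to lock ... held by thread 11" line says the main thread cannot continue until thread 11 releases the Controller monitor, so the stack of the thread with tid=11 in the full ANR dump is the next thing to look at. Below is a minimal sketch, in plain Java, of pulling those two facts out of a pasted trace. The class AnrTraceReader and its methods are hypothetical helpers written for this illustration; they are not part of Corona or the Android SDK.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Corona or Android API.
final class AnrTraceReader {
    // Matches e.g. "waiting to lock <0x0abd65e7> (a com.ansca.corona.Controller) held by thread 11"
    private static final Pattern WAITING = Pattern.compile(
        "waiting to lock <(0x[0-9a-fA-F]+)> \\(a ([\\w.$]+)\\) held by thread (\\d+)");

    /** The first "at ..." line is the innermost frame: where the thread is stopped now. */
    static Optional<String> topFrame(String trace) {
        for (String line : trace.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("at ")) {
                return Optional.of(trimmed.substring(3));
            }
        }
        return Optional.empty();
    }

    /** The tid of the thread that holds the lock this thread is waiting for, if any. */
    static Optional<Integer> lockHolderTid(String trace) {
        Matcher m = WAITING.matcher(trace);
        return m.find() ? Optional.of(Integer.parseInt(m.group(3))) : Optional.empty();
    }

    public static void main(String[] args) {
        // First three lines of the trace posted above.
        String trace =
            "at com.ansca.corona.Controller.stop (Controller.java:263)\n"
          + "- waiting to lock <0x0abd65e7> (a com.ansca.corona.Controller) held by thread 11\n"
          + "at com.ansca.corona.CoronaActivity.requestSuspendCoronaRuntime (CoronaActivity.java:2005)\n";
        System.out.println("stopped at: " + topFrame(trace).orElse("(no frame found)"));
        System.out.println("lock held by tid: "
            + lockHolderTid(trace).map(String::valueOf).orElse("(none)"));
    }
}
```

The Play Console view only shows the main thread by default, so finding what thread 11 was doing may require the full report with all threads expanded.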

@Rob, thank you for looking into an update to widget.lua

@troy yes it is.  It is most likely caused by switching to an ad or other plugin that is taking control of the runtime.
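
As background on what the "Blocked" (or "MONITOR" on older Android versions) state in these reports means: the thread is trying to enter a synchronized section whose monitor another thread already holds. If the main thread stays stuck like that for several seconds, Android reports an ANR. Here is a minimal plain-Java sketch that reproduces the same thread state off-device. All names in it (BlockedMainSketch, controllerLock, "holder", "fake-main") are made up for illustration; this is an assumption about the general mechanism, not Corona's actual Controller code.

```java
import java.util.concurrent.CountDownLatch;

// Illustration only: one thread holds a monitor, a second thread blocks trying to enter it.
final class BlockedMainSketch {
    /** Returns the state observed on the thread that is waiting for the monitor. */
    static Thread.State observeBlockedState() {
        final Object controllerLock = new Object();          // stands in for the contended object
        final CountDownLatch lockTaken = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Plays the role of "thread 11": takes the lock and keeps it until told to release.
        Thread holder = new Thread(() -> {
            synchronized (controllerLock) {
                lockTaken.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        // Plays the role of the main thread calling into a synchronized stop().
        Thread fakeMain = new Thread(() -> {
            synchronized (controllerLock) {
                // pause/stop work would run here once the lock is free
            }
        }, "fake-main");

        try {
            holder.start();
            lockTaken.await();                               // make sure the holder owns the lock first
            fakeMain.start();
            Thread.State seen;
            do {                                             // wait until the JVM reports it as blocked
                Thread.sleep(5);
                seen = fakeMain.getState();
            } while (seen != Thread.State.BLOCKED);
            release.countDown();                             // let the holder finish, unblocking fakeMain
            holder.join();
            fakeMain.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing thread state", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fake-main state while waiting: " + observeBlockedState());
    }
}
```

In the sketch the holder releases the lock on request. In the reports above it evidently does not release it in time, which is why the fix has to come from whatever code runs on the holding thread rather than from the app's own Lua.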

@SGS, no ads but numerous plugins

I think @Rob’s response was just re the widget repository being out of date.

However, to your point, yes, many of these “framework” issues go way back (for example) - it’s just that no one seemed to care about them until last May, when mandatory reporting and vitals started punishing your ranking.

@dave yeah, fixes are a long time coming re ANRs (which would hopefully help you as well).

But I meant the widgets repo is 2 years out of date (well that is what it shows for me).

Just launched a new version on Android and in the first couple of hours, I see this…
ANR

"main" prio=5 tid=1 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x418abd08 self=0x417e1898
  | sysTid=32385 nice=-11 sched=0/0 cgrp=[fopen-error:2] handle=1074307412
  | state=S schedstat=( 0 0 0 ) utm=278 stm=84 core=1
  at com.ansca.corona.CoronaActivity.requestSuspendCoronaRuntime (CoronaActivity.java:2005)
  at com.ansca.corona.CoronaActivity.onPause (CoronaActivity.java:1828)
  at android.app.Activity.performPause (Activity.java:5563)
  at android.app.Instrumentation.callActivityOnPause (Instrumentation.java:1239)
  at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3351)
  at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3320)
  at android.app.ActivityThread.handlePauseActivity (ActivityThread.java:3298)
  at android.app.ActivityThread.access$1100 (ActivityThread.java:172)
  at android.app.ActivityThread$H.handleMessage (ActivityThread.java:1316)
  at android.os.Handler.dispatchMessage (Handler.java:102)
  at android.os.Looper.loop (Looper.java:146)
  at android.app.ActivityThread.main (ActivityThread.java:5598)
  at java.lang.reflect.Method.invokeNative (Native Method)
  at java.lang.reflect.Method.invoke (Method.java:515)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run (ZygoteInit.java:1283)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1099)
  at dalvik.system.NativeStart.main (Native Method)

Wow! I just looked at my crash rate: it spiked in January at around 500 and is currently around 350 a month. It used to be around 30 a month, and that is with Corona 3162 and no updates in more than six months!

I use Admob, Widget Candy UI, and Badger IAP libraries, that’s it!

and most of the crashes are happening on S7’s and S8’s…

Does Corona 2018.3274 fix these crashes yet? I see that this build and a previous one fix 2 ‘rare’ Android bugs. It would be nice to get a bit more info on exactly what those bugs relate to.

Following up as well. Anyone from Corona who can verify please?

I am also in the same boat as OP (a huge number of ANRs and crashes that are not related to my code). I believe this affects all developers’ projects, but only the ones with larger userbases detect the 1% crashes (which is significant in Google terms and can dramatically lower your ranking).

@Corona Team

Please keep us informed about related findings. Fixing ranking punishment is much much more important than new features!