com.ansca.corona.CoronaService ANR affects 20k users

Btw, guys. In 3306 we hopefully fixed the OpenAL ANRs. We changed the memory model so that OpenAL is not unloaded when the game is paused or quitting.

Feel free to check our changes in 3306 and see how it goes.

Hey vlad

I am still seeing OpenAL issues with 3306.

Here is an example stack trace from a crash…

signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) libopenal.so
#00 pc 00000000000a2f94 /data/app/com.spheregamestudios.designercity-1/lib/arm/libopenal.so
#01 pc 0000000000016874 /data/app/com.spheregamestudios.designercity-1/lib/arm/libopenal.so (alcCreateContext+436)
#02 pc 0000000000005ff7 /data/app/com.spheregamestudios.designercity-1/lib/arm/libalmixer.so (ALmixer_Init+214)
#03 pc 0000000000130f2c /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#04 pc 000000000010f684 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#05 pc 000000000000cc1c /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#06 pc 000000000001ce30 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#07 pc 000000000000d068 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#08 pc 000000000000c374 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#09 pc 000000000000d1e0 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#10 pc 00000000000055b8 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so (lua_pcall+88)
#11 pc 0000000000007010 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#12 pc 000000000000cc1c /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#13 pc 000000000001ce30 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#14 pc 000000000000d068 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#15 pc 000000000000c374 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#16 pc 000000000000d1e0 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so
#17 pc 00000000000055b8 /data/app/com.spheregamestudios.designercity-1/lib/arm/liblua.so (lua_pcall+88)
#18 pc 0000000000106fb8 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#19 pc 00000000000e9150 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#20 pc 0000000000140c68 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#21 pc 0000000000142cb0 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#22 pc 0000000000141120 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#23 pc 0000000000141edc /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#24 pc 000000000002bd48 /data/app/com.spheregamestudios.designercity-1/lib/arm/libcorona.so
#25 pc 00000000008d4e8d /data/app/com.spheregamestudios.designercity-1/oat/arm/base.odex
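Native traces pasted into a forum post often lose their line breaks, which makes them hard to compare between reports. Here is a minimal Python sketch for splitting such text back into frames and counting which library each frame belongs to. The function names (`parse_frames`, `frames_per_library`) are made up for illustration, and the regex only assumes the `#NN pc ADDRESS PATH (symbol+offset)` layout seen in this thread.

```python
import re
from collections import Counter

# One native frame: "#NN pc <hex address> <path> (<symbol+offset>)".
# The symbol part is optional because stripped libraries have none.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-fA-F]+)\s+(?P<path>\S+)"
    r"(?:\s+\((?P<symbol>[^)]+)\))?"
)


def parse_frames(trace):
    """Return a list of frame dicts from a (possibly flattened) native trace."""
    # Forum software sometimes escapes underscores as "\_"; undo that first.
    cleaned = trace.replace("\\_", "_")
    frames = []
    for match in FRAME_RE.finditer(cleaned):
        frames.append({
            "index": int(match.group("index")),
            "pc": match.group("pc"),
            "library": match.group("path").rsplit("/", 1)[-1],
            "symbol": match.group("symbol"),  # None when no symbol is present
        })
    return frames


def frames_per_library(frames):
    """Count how many frames belong to each shared library."""
    return Counter(frame["library"] for frame in frames)
```

Calling `parse_frames` on the pasted text and looking at the first few entries shows the crashing library and the nearest named symbols without re-reading the whole paste.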

@SGS - ok, so crashes still, but did it clear up any ANRs?  (aka PackageStateChangeService?)

I haven’t yet tried 3306, but it’s the ANRs that I’m mostly concerned with, not the crashes.

I do get that SIGSEGV crash from libopenal too, but only occasionally; it’s easily 10X less frequent than the PackageStateChangeService ANR.

My biggest crash offender is “signal 5 (SIGTRAP), code 1 (TRAP_BRKPT)” in libc.so.  The SIGSEGV in libopenal.so is #3 on my list, at about only 1/3 the rate of the #1.  All told, cumulative crash rate is still only 0.16%, well under bad behavior threshold, and 99.8% daily crash-free.

But that single ANR source (PackageStateChangeService, libopenal) is so rampant that it’s enough all on its own to cross the bad behavior threshold for ANR’s!!  So, if the ANR is fixed then that’d be a huge deal for me, I could wait on a separate investigation/fix for the crash.

Only just started a rollout today… so far figures look positive

Crashes: 7
Crashes per 1,000 devices: 1.5 (-70% vs All)
Install events: 4.8K

time will tell

Globally I am at:

ANR-free daily sessions: 99.6%

Crash-free daily sessions: 99.7%

Btw, this is a different crash; what we tried to deal with in 3306 is the ANR.

Could you tell us the trend, i.e. whether you started to get more or fewer crashes and/or ANRs?

For me all ANRs are gone with 3306, so it definitely helped a lot.

Best regards!

Also, please use pastebin for crash reports. Pasted here, they end up badly formatted and hard to copy.

Hi vlads,

Like I mentioned in another thread, this seems to be helping with those ANR issues. However, it brings a new problem: stuck partial wake locks (background) related to AudioMix.

Please see my response in the other thread:

https://forums.coronalabs.com/topic/72856-lots-of-anrs-on-android-packagestatechangedservice-androidintentactionscreen-off/?p=383107

Best regards!

I’m seeing the same thing. Possibly related: I checked my phone today, and one of my apps (which I used on Friday last week) is still draining battery in the background. It doesn’t show up in the task switcher, but I can get a PID from adb:

$ adb shell ps -A

u0_a170      26283   707 1974136  24308 0                   0 S se.appfamily.puzzle.super2.free

And logcat shows audio stuff going on in the background (to re-iterate, I’ve not opened the app for 3 days):

$ adb logcat | grep 26283

06-04 12:58:45.657 26283  8331 W AudioTrack: dead IAudioTrack, PCM, creating a new one from processAudioBuffer()

06-04 12:58:45.665 26283 26296 W AudioSystem: ioConfigChanged() closing unknown output 453

06-04 12:58:45.679 26283  8331 W AudioTrack: AUDIO_OUTPUT_FLAG_FAST denied, rates do not match 44100 Hz, require 48000 Hz

06-04 12:58:45.684 26283  8331 D AudioTrack: Client defaulted notificationFrames to 3592 for frameCount 10776

06-04 12:58:46.000 26283  8503 W AudioTrack: dead IAudioTrack, PCM, creating a new one from getPosition()

06-04 12:58:46.004 26283  8503 D AudioTrack: Client defaulted notificationFrames to 3675 for frameCount 11025
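A plain `grep 26283` also matches lines where that number happens to be a thread ID or part of the message. As a rough sketch (assuming the default `threadtime` logcat layout shown above; `parse_logcat_line` and `tags_for_pid` are hypothetical helper names), this Python snippet filters on the PID column only and counts which tags are active, which makes it easy to see that the activity is all audio-related. On newer Android versions `adb logcat --pid=<pid>` should do the filtering on the device side as well.

```python
import re
from collections import Counter

# threadtime layout: "MM-DD HH:MM:SS.mmm  PID  TID LEVEL TAG: message"
LINE_RE = re.compile(
    r"^(?P<date>\d\d-\d\d)\s+(?P<time>\d\d:\d\d:\d\d\.\d+)\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEFA])\s+"
    r"(?P<tag>.*?)\s*:\s(?P<msg>.*)$"
)


def parse_logcat_line(line):
    """Return a dict of logcat fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["tid"] = int(fields["tid"])
    return fields


def tags_for_pid(lines, pid):
    """Count log tags for lines whose PID column equals `pid`."""
    counts = Counter()
    for line in lines:
        fields = parse_logcat_line(line)
        if fields is not None and fields["pid"] == pid:
            counts[fields["tag"]] += 1
    return counts
```

Feeding it the output of `adb logcat -d` split into lines, together with the PID taken from `adb shell ps -A`, gives a per-tag summary for just that process.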

@Perflubron, 3311 seems to fix that.

My testing seems to back this up.  I see no backgrounded Corona processes eating battery.

On a personal note, recent daily builds seem to break other areas, and that makes me sad.

They seem rushed rather than properly tested.  As (paying) users of the Corona platform we shouldn’t be alerting you to bugs that could easily be found by basic automated testing.

Things like “Oh this recent daily build doesn’t work on Android 4” shouldn’t really be a thing with modern automated testing tools.

For us devs this ends up with 1 star reviews and angry customers :(

Stability is critical!

SGS, I second that. Stability is key. Having said that, daily builds are afaiu not guaranteed to be suitable for production use.

In some situations, however, daily builds are necessary due to things like GDPR. And the problem is made worse for people who know nothing about these forum bug posts, and just pick the latest daily to ‘catch-up’ with all recent things. We as users have no way of knowing if a daily will be good or not. They might suddenly get all these issues introduced by an experimental fix, and have no idea what happened.

This leads me to suggest that there should be two types of builds:

  • Daily builds. Least stable, can contain experimental stuff.

  • Interim releases / tested daily builds. Created when events in between releases demand that a daily build be used until the next release. Medium stability, no experimental stuff. Tested to a reasonable extent.

IMHO this doesn’t need to be a complicated setup with new download areas or such. Just a note in a daily build’s description that this particular daily is more stable / tested.

A second suggestion is for Corona to use custom builds and work with specific pilot users when testing experimental fixes. It is inefficient that SGS, davebollinger, Bjoern, myself, etc all act as guinea pigs when Corona want to know if a certain bugfix works or not. So, before putting a fix in daily builds, put it in a custom build which your pilot users can use to update an app and verify if the fix worked or not. Once it is verified as working, deploy it to daily builds.

It might feel to the other users like the fixes take a bit longer to implement, but I believe they’d be happier if the fixes worked first time.

3311 appears to improve the situation, but wake locks are still above the threshold.

Our ANRs are almost NIL thanks to the Corona team’s recent builds.

May I know which daily build version is everyone using now?

@Rob & @CoronaLabs, I’m happy to report that my new game update, built with one of the most recent Corona builds, has resulted in ZERO (0) ANRs after a week and about 1000 installs. This is fantastic news. THANK YOU! I will start a new thread about the 2 crashes that remain.

I’m experiencing the same. Any idea or fix for this? Calling Corona support.

Our engineers are working on finding a solution.

Rob

Our installs were at 150 a day, and with these similar errors based upon com.ansca.corona.Controller, Google has demoted us down to about 90 a day, quickly too.

So please, do what can be done. With plenty of RAM and Android versions 5.1, 7.0 and 8.0, this is especially discouraging. I might expect this from slower, old devices.

V20 (elsa): 2 (40.0%)
LG Stylo 3 Plus (sf340n): 1 (20.0%)
Galaxy S8 (dreamqltesq): 1 (20.0%)
Galaxy J7 Prime (j7popeltetmo): 1 (20.0%)

Broadcast of Intent { act=android.intent.action.SCREEN_OFF flg=0x50000010 launchParam=MultiScreenLaunchParams { mDisplayId=0 mBaseDisplayId=0 mFlags=0 } (has extras) }

Apr 6, 12:32 PM on app version 273
LGE V20 (elsa), 4096MB RAM, Android 7.0

"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 obj=0x75d0d6a8 self=0xf1305400
  | sysTid=22014 nice=0 cgrp=default sched=0/0 handle=0xf413a534
  | state=S schedstat=( 0 0 0 ) utm=1969 stm=897 core=1 HZ=100
  | stack=0xff66f000-0xff671000 stackSize=8MB
  | held mutexes=

at com.ansca.corona.Controller.stop (Controller.java:263)

  - waiting to lock <0x0abd65e7> (a com.ansca.corona.Controller) held by thread 11

at com.ansca.corona.CoronaActivity.requestSuspendCoronaRuntime (CoronaActivity.java:2005)

at com.ansca.corona.CoronaActivity.onPause (CoronaActivity.java:1828)

at android.app.Activity.performPause (Activity.java:6894)

at android.app.Instrumentation.callActivityOnPause (Instrumentation.java:1323)

at android.app.ActivityThread.performPauseActivityIfNeeded (ActivityThread.java:3791)

at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3768)

at android.app.ActivityThread.performPauseActivity (ActivityThread.java:3742)

at android.app.ActivityThread.handlePauseActivity (ActivityThread.java:3716)

at android.app.ActivityThread.-wrap16 (ActivityThread.java)

at android.app.ActivityThread$H.handleMessage (ActivityThread.java:1516)

at android.os.Handler.dispatchMessage (Handler.java:102)

at android.os.Looper.loop (Looper.java:154)

at android.app.ActivityThread.main (ActivityThread.java:6247)

at java.lang.reflect.Method.invoke! (Native method)

at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run (ZygoteInit.java:872)

at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:762)
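The key line in this trace is the "waiting to lock ... held by thread 11" entry: the main thread is blocked in `Controller.stop` until thread 11 releases the `Controller` monitor, so thread 11's stack from the same ANR report is the one worth posting. Below is a small Python sketch for pulling that information out of a pasted dump. The helper name `find_lock_waits` is made up for illustration, and the regexes only assume the trace layout shown in this post.

```python
import re

# Thread header, e.g.: "main" prio=5 tid=1 Blocked  (straight or curly quotes)
HEADER_RE = re.compile(
    r'^["“](?P<name>[^"”]+)["”]\s.*?tid=(?P<tid>\d+)\s+(?P<state>\w+)'
)
# Stack frame, e.g.: at com.ansca.corona.Controller.stop (Controller.java:263)
FRAME_RE = re.compile(r"^at\s+(?P<method>\S+)\s*\(")
# Monitor wait, e.g.: waiting to lock <0x0abd65e7> (a some.Class) held by thread 11
WAIT_RE = re.compile(
    r"waiting to lock <(?P<lock>0x[0-9a-fA-F]+)> "
    r"\(a (?P<cls>[\w.$]+)\) held by thread (?P<holder>\d+)"
)


def find_lock_waits(dump):
    """Return one dict per 'waiting to lock' line found in a thread dump."""
    waits = []
    thread = None      # most recent thread header seen
    last_frame = None  # most recent "at ..." frame within that thread
    for raw in dump.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            thread = header.groupdict()
            last_frame = None
            continue
        frame = FRAME_RE.match(line)
        if frame:
            last_frame = frame.group("method")
            continue
        wait = WAIT_RE.search(line)
        if wait and thread is not None:
            waits.append({
                "thread": thread["name"],
                "tid": int(thread["tid"]),
                "blocked_in": last_frame,
                "lock_class": wait.group("cls"),
                "held_by_tid": int(wait.group("holder")),
            })
    return waits
```

Running it over a full ANR report lists every blocked thread together with the thread ID that holds the lock, which tells you whose stack trace to look at next.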