CBench - Benchmarking Tool (Testers Needed)

I recently put together a Benchmarking Tool that can test Corona ‘code snippets’ for these kinds of metrics:

  • OPS - Raw operations per second, i.e. how many of some ‘thing’ can be done per second?
  • FPS - While doing some ‘thing’, what kind of min, max, and average sustained FPS is Corona capable of?
  • MEM - How much main memory (not video mem) does it cost to execute or make some ‘thing’?
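
To make the OPS idea concrete, here is a rough sketch of how a snippet can be timed in Corona with system.getTimer().  This is only my illustration of the general approach; the measureOPS helper below is hypothetical, and the real CBench harness does quite a bit more (multiple runs per test, reporting, and so on):

[lua]
-- Rough sketch of an OPS-style measurement (not the actual CBench harness).
-- Times how many iterations of a snippet complete per second.
local function measureOPS( snippet, iterations )
    iterations = iterations or 100000

    local startTime = system.getTimer()        -- milliseconds since launch
    for i = 1, iterations do
        snippet()
    end
    local elapsedMS = system.getTimer() - startTime

    return iterations / ( elapsedMS / 1000 )   -- operations per second
end

-- Example: how many table inserts per second?
local ops = measureOPS( function()
    local t = {}
    t[#t + 1] = 1
end )
print( "OPS: " .. math.floor( ops ) )
[/lua]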

How To Help

I need your help to test this concept.  Specifically, I need you to run the benchmark(s) and help me collect data.  

I have prepared two Android builds of the tool.

You’ve got the latest build of the benchmarks if they look like this:

cbench.jpg

If you have an Android device of any type and want to help, please do the following:

Step #1 - Download and install both versions of the app:

Build 833 (alternate link)

Build 1142 (alternate link)

Note: Build 833 required a special binary for Kindle Fire Gen 1 and Nook Gen 1:  Build 833 f1 (alternate link)

Step #2 - Run each app at least once. (Runs in about 8 minutes on a Nexus 7.)

Step #3 - When the app tries to send a report (to me), choose your preferred e-mail client and then click Send.

Everyone who:

  • Participates in this exercise,
  • Runs both versions (833 and 1142) on at least one device, AND 
  • Sends me the report for both runs…

Will get $10 off any tool or template I sell here: RG Tools and Templates

I will e-mail all participants at the end of the study (July 2nd 2013) using BCC to protect your anonymity and privacy.

Origin Of This Idea

I wrote this because I am a performance geek and have always been fascinated by the above kinds of questions.  I also wrote this because I wondered, “Hmm… CoronaLabs releases builds frequently.  I wonder if there is a lot of performance change between releases?”

I tried twice in the past to make this tool with limited success, but this time (my third attempt) I have managed to balance the complexities of measuring, running, and writing individual benchmarks.  I have also found some (what I consider clever) techniques for getting the measuring framework out of the way while running benchmarks.

Additional Notes

  • When running the apps you may want to disable any automatic sleep and/or screen-saver apps/settings.
  • Both apps will work forever, but they will only attempt to collect and send data until the end of July 01, 2013, i.e. the collection window closes in two weeks.
  • The app has built-in crash detection.  If for some reason the app crashes, run it again and the app will offer to send a crash report so I can figure out what went wrong.
  • When the app is done, you will be presented with the results and can page through them by swiping your finger.
  • When the collection period is complete I will produce a report and release it here for all participants to read and examine.
  • Be aware, the results of this first series of runs may not be a perfect reflection of Corona and its performance.  This is the ‘shake down’ run to find flaws in my process, collection methods, and analysis.
  • Will I be selling this and/or parts of it, or otherwise providing the code from this tool?  Yes.  I have not ironed out exactly what will come out of this, but for sure I will release code and tools for everyday Corona SDK users so they can measure their own ‘snippets’ and answer burning questions like: “Is this too slow?” and “Which way is better?”

More To Come

I will be posting to this thread again soon.  I have already collected some data and will show you the kind of information that this app will produce.

Be Sure To Brag

You are encouraged to post your run times to this thread (see a few posts down for my Nexus 7 and soon to be updated Kindle Fire Gen 1 run times).

Thanks!

Ed M.

(aka The Roaming Gamer)

I failed to mention this above, but the benchmarks run as follows:

  1. The app loads and checks for a prior ‘crash’.  If found, it asks to send a report.

  2. You are presented with a menu.  You have one choice. ‘Run’.

  3. Clicking run starts the benchmarking harness.  This then runs 32 tests.  Each test is run 3 times in a row for a total of 96 test runs.

  4. At the end, the app asks to send a report to me.

  5. Finally, you are presented with a listing of the test runs and their results.

Of course, 32 tests is not enough to get a full picture, but like I said in my last post, this is a ‘shake down’ run for the tool, not a true evaluation of Corona SDK.

You can get this tool and run it yourself, but if you want to see a fast run (on the simulator) here it is:

http://www.youtube.com/watch?v=ZsQ04F5wtTs

[hr]

I talked about this on today’s Corona Geek (#43) and threw  together a quick set of slides to show some results (from my own Nexus 7 device).

Here are the slides and the data from them.

Unless otherwise specified, the orange bars and lines are build 1142 and blue is 833.

#1 - There is a small set of FPS measurement tests.  These tests create N circles every frame, colorize them, and then remove them in the next frame.  These tests increase the load from 100 circles to 500 circles to see where FPS starts to drop off from the target frame rate of 60 FPS.  Note: All circles are on screen.

Result: 1142 is a little faster than 833.

https://www.dropbox.com/s/kc04giogh85m1p7/cbench1.jpg
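
If you are curious what a test like this might look like, here is a rough sketch of the pattern (my own illustration, not the actual CBench test code):

[lua]
-- Sketch of an FPS-load test: create N circles each frame, colorize them,
-- and remove them on the next frame.  (Illustration only.)
local N = 100                    -- raise toward 500 to increase the load
local circles = {}

local function onFrame( event )
    -- remove last frame's circles
    for i = 1, #circles do
        display.remove( circles[i] )
    end
    circles = {}

    -- create this frame's circles (all on screen)
    for i = 1, N do
        local c = display.newCircle(
            math.random( display.contentWidth ),
            math.random( display.contentHeight ), 10 )
        -- color channels are 0..255 on these 2013-era builds (0..1 on newer builds)
        c:setFillColor( math.random( 255 ), math.random( 255 ), math.random( 255 ) )
        circles[i] = c
    end
end

Runtime:addEventListener( "enterFrame", onFrame )
[/lua]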

#2. There is also a small set of tests that examine the memory cost of making different types of Corona and Lua objects.  (No details on the objects are provided on this slide.  Sorry.)  

Result: 1142 is the same as 833.

https://www.dropbox.com/s/5rxr5jh73m9dmib/cbench2.jpg
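
For illustration, a Lua-side memory measurement can be done with collectgarbage( "count" ), which reports the Lua heap size in kilobytes.  The measureMemKB helper below is my own, hypothetical sketch; it only captures Lua-side memory, so the actual CBench tests may measure things differently:

[lua]
-- Sketch of a MEM-style measurement (illustration only).
-- collectgarbage( "count" ) returns Lua memory usage in kilobytes.
local function measureMemKB( makeThing, count )
    count = count or 1000
    local things = {}

    collectgarbage( "collect" )                    -- settle the heap first
    local before = collectgarbage( "count" )

    for i = 1, count do
        things[i] = makeThing()
    end

    local after = collectgarbage( "count" )
    local perObject = ( after - before ) / count   -- KB per object

    -- clean up: display objects need removeSelf(); plain tables just get collected
    for i = 1, count do
        if things[i].removeSelf then
            things[i]:removeSelf()
        end
    end

    return perObject
end

print( "KB per empty table:    " .. measureMemKB( function() return {} end ) )
print( "KB per display circle: " .. measureMemKB( function()
    return display.newCircle( 0, 0, 10 )
end ) )
[/lua]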

#3. There are 20 tests that focus on raw speed (OPS).

Result:  1142 is generally the same or faster than 833.

https://www.dropbox.com/s/08ru3bs24cceci3/cbench3.jpg

#4. These two slides answer the question: "How much faster is direct table iteration vs. using Lua iterators (ipairs and pairs)?"

Result:  Direct iteration is roughly 35 times faster than ipairs(), and the advantage over pairs() is slightly larger.

These results are for build 1142 even though the bars are blue.

https://www.dropbox.com/s/glyervks9qgrf72/cbench4.jpg

https://www.dropbox.com/s/socpym6k8qi2lx1/cbench5.jpg
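
For reference, the three iteration styles being compared look roughly like this (my own illustration of the pattern, not the exact CBench snippets):

[lua]
-- The three iteration styles compared above (illustration only).
local t = {}
for i = 1, 10000 do t[i] = i end

-- 1) Direct numeric iteration: the fast path
local sum = 0
for i = 1, #t do
    sum = sum + t[i]
end

-- 2) ipairs(): roughly 35x slower in the run above
sum = 0
for i, v in ipairs( t ) do
    sum = sum + v
end

-- 3) pairs(): slower still
sum = 0
for k, v in pairs( t ) do
    sum = sum + v
end
[/lua]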

#5. These two slides answer the question: "How much faster is comparing (x * x) == y vs. x == math.sqrt( y)?"

Result:  Using a squared-length comparison is ~24% faster than calculating a square root and then comparing.

These results are for build 1142 even though the bars are blue.

https://www.dropbox.com/s/dvl0gf4x16tmkf1/cbench6.jpg

https://www.dropbox.com/s/vbcq404eg65n6ec/cbench7.jpg
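
The two comparison styles being measured look roughly like this (again, my own illustration rather than the exact CBench snippet):

[lua]
-- The two comparison styles measured above (illustration only).
local x, y = 5, 25

-- 1) Take a square root, then compare
local equal1 = ( x == math.sqrt( y ) )

-- 2) Square the other side instead; no sqrt call needed
--    (about 24% faster in the run above)
local equal2 = ( x * x == y )

-- The same trick applies to distance checks: compare a squared distance
-- against a squared radius instead of calling math.sqrt() every time.
[/lua]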

#6. These two slides answer the question: "How much faster is a localized math function over a direct call?"

Result:  Localizing math.sqrt (using file-level localization) is 26% faster than a call via the library handle.

These results are for build 1142 even though the bars are blue.

https://www.dropbox.com/s/727x6sarucrmji9/cbench8.jpg

https://www.dropbox.com/s/qci5rbli3fkvd9e/cbench9.jpg
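
The localization pattern being measured is the familiar one below (my own illustration; the exact CBench snippet may differ).  The same idea applies to any frequently called library function, such as math.random() or table.insert().

[lua]
-- File-level localization of a library function (the pattern measured above).
-- Caching the function in a local avoids re-indexing the math table on every call.
local sqrt = math.sqrt          -- do this once, near the top of the file

local function slowWay( n )
    return math.sqrt( n )       -- looks up 'sqrt' in the math table each call
end

local function fastWay( n )
    return sqrt( n )            -- direct local access (~26% faster in the run above)
end
[/lua]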

Hopefully, you enjoyed these insights into the kinds of questions that CBench can answer.  I hope that you will help me improve this tool and gather more data by downloading both versions and running them on all of your Android devices.  In fact, run as many times as you want.  More data is better!

Note: Build 833 required a special binary for Kindle Fire Gen 1 and Nook Gen 1:  Build 833 f1 (alternate link)

Thanks Ed! The RR Super Meter is great and this project will provide some great insights. I’m running Build 833 now on my Kindle Fire HD. I’ll run Build 1142 once it finishes. I’m looking forward to seeing the results.

Testing…

Unfortunately I’m only testing on iOS hardware so I can’t be helpful on this front.

I use #table iteration mostly because ipairs still confuses me slightly. I’m glad to know I’ve defaulted into an approach that performs so much better!

I use math.random() with some frequency in my work. So it looks like I should localize this? Great tip, thanks :)

Hello again everyone.  I have updated both of the binaries to make them smaller and give them better icons.  They also report the duration of the test run so you can brag here about your devices.

!!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!  

Some folks have been sending me reports without attachments. I think what is happening is that you may have accidentally clicked the delete attachment icon when going to click send.  Some e-mail clients make that way too easy to do.

So, please be careful and only click send when the dialog pops up!

!!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!   !!WARNING!!  

Again, you are encouraged to brag!  Please download the binaries, run them both, send the reports and then post back here, telling us your device type and the benchmark duration reported by CBench.

Build 833 (alternate link)

Build 1142 (alternate link)

 

Note: Build 833 required a special binary for Kindle Fire Gen 1 and Nook Gen 1:  Build 833 f1 (alternate link)

 

I’ll go first!

Build   833 - Nexus 7: 8 minutes 25.95 seconds

Build 1142 - Nexus 7: 8 minutes 20.99 seconds

1142 is about 5 seconds faster, but this is not a surprise given the small size of this test set.

Build   833 - Kindle Fire Gen 1: 8:36.13…

Build 1142 - Kindle Fire Gen 1: 8:42.47

Interesting… the runs are similar in length.  This is because total run time is dominated by the framework’s ‘between measurement’ overhead, where it sleeps for a while between runs to give Lua and Corona time to do housekeeping.
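
In case anyone wonders what that ‘housekeeping’ might look like, the general idea is something like the sketch below.  This is my assumption of the pattern (including the made-up 500 ms delay), not the actual harness code:

[lua]
-- Sketch: give Lua and Corona a breather between test runs (illustration only).
local function runNextTest()
    -- start the next benchmark here (hypothetical placeholder)
end

local function betweenRuns()
    collectgarbage( "collect" )                 -- full garbage-collection pass
    timer.performWithDelay( 500, runNextTest )  -- wait ~0.5 s, then continue
end
[/lua]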

Hello again folks.  It has been five days since I started this thread.  In that time several of you have run the benchmarks and I have learned a few things.  

  • #1 - Although many of you have helped, I am not getting nearly as many runs as I had hoped for.  So, I will sweeten the deal:

Everyone who:

  • Participates in this exercise,
  • Runs both versions (833 and 1142) on at least one device, AND 
  • Sends me the report for both runs…

Will get $10 off any tool or template I sell here: RG Tools and Templates

I will e-mail all participants at the end of the study (July 2nd 2013) using BCC to protect your anonymity and privacy.

  • #2 - I learned that sending attachments is sometimes hit-or-miss.  I think this is primarily due to the many ways e-mail clients are set up on Android devices and not due to the way Corona handles it.  Whatever the case, I have received a few e-mails without the data attached.

Today I updated all three binaries to reduce the size of the report, and the report is now both attached and sent inline.  

Build 833 (alternate link)

Build 1142 (alternate link)

Note: Build 833 required a special binary for Kindle Fire Gen 1 and Nook Gen 1:  Build 833 f1 (alternate link)

The latest build of these benchmarks looks like this:

cbench.jpg

Remember, participate as per the rules above and get up to $10 off a tool or template (which could mean FREE).

Downloading and testing on a Nexus 4 when I get a couple of minutes…

I’d be really interested in the data you find from this work, 

I’ve been trying to think of a way to benchmark the different cross platform dev frameworks…

Looks like my Nexus 4 is dog slow, slower than the Nexus 7, which isn’t what I expected…

Build 1142 - Nexus 4: 11 minutes 59 seconds

Running 833 now

Build 833 - Nexus 4: 12 minutes 6 seconds

Second Run

Build 1142 - Nexus 4 - 11.56

Build 833 - Nexus 4 - 11.54

Cheers

Rob


Hello all.  This exercise/study is now over.  

I want to thank those of you who helped by participating.  I have just sent e-mails to each of you with details on how to acquire your $10 discount on a module, template, or tool.

I will leave the links to CBench active for some time and you may continue to download and run it, but the tool will not send any more reports.

Cheers,

Ed M.