Which call would be tougher on performance?

Blex · June 8, 2018, 11:47pm

Hey there, this is just a question, kind of out of the blue… but let’s say you have two scenarios.

One runtime listener with a for loop that runs 3 times.

Three runtime listeners.

This is assuming both scenarios have listeners that do the same things.

Case 1:

    function hi()
        print("hey there")
    end

    function print1()
        for i = 1, 3 do
            hi()
        end
    end

    Runtime:addEventListener("enterFrame", print1)

Case 2:

    function hi()
        print("hey there")
    end

    Runtime:addEventListener("enterFrame", hi)
    Runtime:addEventListener("enterFrame", hi)
    Runtime:addEventListener("enterFrame", hi)

roaminggamer · June 9, 2018, 12:18am

First, both are negligible with regard to performance.

*UPDATE* Michael makes a good point below. While the cost of a single enterFrame listener may be tiny, it is a cumulative cost. The same goes for the global lookup.

Second, the cost of the enterFrame calling mechanism will always be dominated by the workload in the enterFrame listener function.

Having said that, my guess is they are about the same. At first I thought you were avoiding some function lookups in case 1, but you’re not. In case 1 it is a global lookup (slower). In case 2 it may be a native lookup (faster).

A better and clearer winner for case 1 would be:

    function print1()
        for i = 1, 3 do
            print("hey there")
        end
    end

    Runtime:addEventListener("enterFrame", print1)

Can you tell me why you want to know?

Are you debating the value of having one enterFrame listener that does a lot of work versus multiple smaller ones that each do a small amount of work?

Please elaborate if you can.

roaminggamer · June 9, 2018, 1:16am

Just for giggles I made a test to check which is better.

I modified your test a little to make it more benchmark-able.

https://github.com/roaminggamer/RG_FreeStuff/raw/master/AskEd/2018/06/performance.zip

I removed the print.
I switched the order: my test 2 is your case 1, and vice versa:

    io.output():setvbuf("no")
    display.setStatusBar(display.HiddenStatusBar)
    -- =====================================================
    require "ssk2.loadSSK"
    _G.ssk.init( { measure = false } )
    ssk.meters.create_fps(true)
    --ssk.meters.create_mem(true)
    --ssk.misc.enableScreenshotHelper("s")
    -- =====================================================
    local workloadCount = 30000

    -- No payload
    function hi()
    end

    function print1()
        for i = 1, workloadCount do
            hi()
        end
    end

    local function test1()
        for i = 1, workloadCount do
            Runtime:addEventListener("enterFrame", hi)
        end
    end

    local function test2( iter)
        Runtime:addEventListener("enterFrame", print1)
    end

    -- 1. Run NO test and check FPS average

    -- 2. Run ONLY test 1 and adjust workloadCount till FPS drops below average from step #1
    test1()

    -- 3. Run ONLY test 2 and compare FPS to step 2
    --test2()

My findings are that doing the work in a loop inside a single enterFrame listener is better than registering one enterFrame listener per loop iteration.

On my machine, the FPS dropped from 60 to 55 for test1 at 35,000 enterFrame listeners, but test2() ran 35,000 loop iterations at 60 FPS.

Michael_Flad · June 9, 2018, 7:00am

Such things can matter a lot if you write code that has to be fast, but of course you need to do all the required and known things to get a real speedup.

The right way here is obviously to loop over an integer range *and* to make the inner function a local, so you avoid the repeated global lookups. This will make the difference much bigger than what you measured.
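To get a rough feel for the global versus local lookup difference described here, a plain-Lua micro-benchmark along these lines can help. It is only a sketch: `hiGlobal`, `hiLocal`, `loopGlobal`, `loopLocal`, and `timeIt` are made-up names, it measures raw call overhead with `os.clock()` rather than FPS inside Corona, and the numbers will vary by machine and Lua build.

```lua
-- Sketch: the same loop calling a global function vs. a local one.
-- All names here are illustrative and do not come from the thread.

local callCount = 0

-- Global function: found by name in the globals table on every call.
function hiGlobal()
    callCount = callCount + 1
end

-- Local function: captured as an upvalue by the loop below, no table lookup.
local function hiLocal()
    callCount = callCount + 1
end

-- Runs fn(n) and returns the elapsed CPU time in seconds.
local function timeIt(fn, n)
    local start = os.clock()
    fn(n)
    return os.clock() - start
end

local function loopGlobal(n)
    for i = 1, n do
        hiGlobal()
    end
end

local function loopLocal(n)
    for i = 1, n do
        hiLocal()
    end
end

local n = 1000000
local globalSeconds = timeIt(loopGlobal, n)
local localSeconds = timeIt(loopLocal, n)

print(string.format("global: %.4fs  local: %.4fs", globalSeconds, localSeconds))
```

Run it a few times and compare the two timings; small differences between runs are most likely noise.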

Also, there’s quite a hidden cost to adding an event listener. There has to be an allocation in the event dispatcher. This won’t be huge for a single entry, but if you use it for thousands of objects and countless event types, for instance in an entity system, it’s going to be noticeable and it *will* have a negative impact on your code. Depending on the situation and dataset the cost might be huge, but you won’t notice it because you lose performance gradually with each little addition. If you’re not an expert, you’ll almost never find the issues, because there are many small ones and not a single big one.

Blex · June 9, 2018, 9:30pm

roaminggamer:

First, both are negligible with regard to performance.

*UPDATE* Michael makes a good point below. While the cost of a single enterFrame listener may be tiny, it is a cumulative cost. The same goes for the global lookup.

Second, the cost of the enterFrame calling mechanism will always be dominated by the workload in the enterFrame listener function.

Having said that, my guess is they are about the same. At first I thought you were avoiding some function lookups in case 1, but you’re not. In case 1 it is a global lookup (slower). In case 2 it may be a native lookup (faster).

A better and clearer winner for case 1 would be:

    function print1()
        for i = 1, 3 do
            print("hey there")
        end
    end

    Runtime:addEventListener("enterFrame", print1)

Can you tell me why you want to know?

Are you debating the value of having one enterFrame listener that does a lot of work versus multiple smaller ones that each do a small amount of work?

Please elaborate if you can.

I asked this for an approach to the enemy AI I am making. I wanted to do it globally using groups instead of adding a new listener for each enemy made, but I wanted to make sure it would be positive on performance.
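The single-listener approach described here could look roughly like the sketch below. It assumes the enemies are kept in a plain Lua table rather than a display group, and `enemies`, `addEnemy`, `updateEnemy`, and `onEnterFrame` are hypothetical names; the movement code is only a stand-in for real AI.

```lua
-- Sketch: one enterFrame listener that updates every enemy in a list,
-- instead of registering a separate listener per enemy.

local enemies = {}

-- Creates a minimal enemy record and appends it to the shared list.
local function addEnemy(x, speed)
    local enemy = { x = x, speed = speed }
    enemies[#enemies + 1] = enemy
    return enemy
end

-- Placeholder AI: move the enemy along x. Real behaviour would go here.
local function updateEnemy(enemy)
    enemy.x = enemy.x + enemy.speed
end

-- Called once per frame; loops over all enemies with local function calls.
local function onEnterFrame(event)
    for i = 1, #enemies do
        updateEnemy(enemies[i])
    end
end

-- Runtime exists only inside Corona; the guard lets the sketch load in plain Lua too.
if Runtime then
    Runtime:addEventListener("enterFrame", onEnterFrame)
end
```

An enemy that is destroyed would also need to be removed from `enemies`, which this sketch leaves out.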

roaminggamer · June 9, 2018, 9:51pm

I suggest you choose the design method you are most familiar with and least likely to have bugs in. Both will suit your needs. You simply aren’t going to have enough enemies for this to make a difference.

Blex · June 10, 2018, 5:08pm

Who knows, maybe I will have 35,000 enemies on a single level.

I have already started using this approach, and it is working swimmingly so far, so I will stick to it.

roaminggamer · June 10, 2018, 5:29pm

FYI: I didn’t mean that in a condescending way, it is just that 35,000 is a lot of display objects.

Blex · June 12, 2018, 8:17pm

I know.

anon63346430 · June 12, 2018, 10:41pm

Can I just add that tens of thousands is not a lot of display objects if handled correctly. My games handle many more than 10k.

Note: this is a **** lot of work to make performant.
