How do you test efficiency of functions?

nameless1 · November 16, 2024, 9:22pm

Take an example like t={1,2,3,4,5}. Let’s say I have 3 methods to print the elements: all(), pairs(), cor_all(). What is the best way to determine which method to use? I tried looking at the memory usage (negligible). I tried dt and “avgDt”, i.e. (ttl_dt/event.frame), a ratio you want to keep under 1. My understanding is that dt shows how well the code runs on my machine. If I run the code 1000 times in a frame I can balloon the value and see which one gives a lower avgDt… but even then the numbers weren’t different or consistent enough to show a real difference.

local function all(t)
	local i, n = 0, #t
	return function()
		i = i + 1
		if i <= n then return t[i] end
	end
end

local function cor_all(tbl)
	return coroutine.wrap(function()
		for i = 1, #tbl do
			coroutine.yield(tbl[i])
		end
	end)
end


local dt, ttl_dt, last_ms = 0, 0, 0
local mspf = 1000 / display.fps -- milliseconds per frame at the configured FPS

local function info(event)
	local stage = display.getCurrentStage()
	dt = event.time - last_ms
	last_ms = event.time
	--[[if dt>20 then dt=20 end]]
	dt = dt / mspf
	ttl_dt = ttl_dt + dt
	if event.frame % 100 == 0 then
		print(event.frame, " |avgDt:" .. (ttl_dt / event.frame)) -- ttl_dt as a whole value should always be less than event.frame, i.e. a ratio under 1
	end
end

Runtime:addEventListener("enterFrame", function(event)
	info(event)
	local t={1,2,3,4,5}
	for i=1,20000 do --frame 500: .93  on its own
		-- for obj in all(t) do end --frame 500: 1.17
		-- for obj in cor_all(t) do end --frame 500: 2.37
		-- for _,v in pairs(t) do end --frame 500: .96
	end
end)

I ballooned it to 20,000 iterations and more sensible differences appeared. So is this method valid? It’s about what I would expect, honestly. If all() were built into the engine like pairs, I assume it would run faster than pairs… coroutines are always slow, so that checks out. I don’t need more links to optimization guides; I need to actually see the differences at some point. If you balloon to 60k iterations, it shows that for i=1,#t do end runs ~30% faster than pairs(), which matches the docs saying pairs is slower.

depilz · November 16, 2024, 11:18pm

Testing efficiency is a tough task.

The way I usually test the efficiency of different methods is by just registering the time before and after execution.

As simple as:

local t = {1, 2, 3, 4, 5}
local timeStart = system.getTimer()

for i = 1, 20000 do
    for obj in all(t) do end
    -- for obj in cor_all(t) do end
    -- for _, v in pairs(t) do end
end

print("TIME: ", system.getTimer() - timeStart)

That’s generally the formula. Now, if you also want to compare the RAM consumption, you can use:

collectgarbage("collect")
collectgarbage("collect")
collectgarbage("collect") -- 3 times, there are always leftovers.

print("MEMORY: ", collectgarbage("count")) -- Lua memory in use, in KB
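A single before/after timing can be noisy from run to run. One way to steady it, sketched below under some assumptions, is to repeat the measurement and keep the fastest run. The `bench` helper and its parameters are hypothetical names for illustration, not part of Solar2D, and the `os.clock` fallback (CPU time rather than wall-clock time) is only there so the sketch also runs in plain Lua. Every candidate is wrapped in a function, so each one pays the same small call overhead.

```lua
-- Use Solar2D's millisecond timer when available, otherwise fall back to
-- os.clock (CPU seconds) converted to milliseconds.
local now = (system and system.getTimer)
    or function() return os.clock() * 1000 end

-- Hypothetical helper: call fn `iterations` times, repeat that `runs` times,
-- and keep the fastest run, which is the one least disturbed by other work.
local function bench(label, fn, iterations, runs)
    local best = math.huge
    for _ = 1, runs do
        collectgarbage("collect") -- start each run from a clean heap
        local start = now()
        for _ = 1, iterations do fn() end
        local elapsed = now() - start
        if elapsed < best then best = elapsed end
    end
    print(label, string.format("best of %d runs: %.3f ms", runs, best))
    return best
end

local t = {1, 2, 3, 4, 5}
bench("pairs", function() for _, v in pairs(t) do end end, 20000, 5)
bench("numeric for", function() for i = 1, #t do local v = t[i] end end, 20000, 5)
```

Running each candidate through the same helper keeps the loop overhead identical, so the remaining difference is closer to the cost of the method itself.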

There are many considerations you must take into account, but I’ll leave the most important here:

Optimize each method/approach:
For example, globals are expensive, and accessing methods nested in tables can also be slower.
To avoid disadvantages, consider this optimization:

local wrap = coroutine.wrap
local yield = coroutine.yield
local resume = coroutine.resume
local function cor_all(tbl)
    return wrap(function()
        for i = 1, #tbl do
            yield(tbl[i])
        end
    end)
end

Run each method separately:
Memory pressure and heat can make subsequent tests run slower. While this isn’t strictly necessary, it helps improve reliability.

Don’t rely on display.fps:
The display.fps value only shows the configured FPS, not the actual frame rate.

Always localize globals:
Same point as above. Accessing globals repeatedly is inefficient.

Test performance only when necessary:
Do not test performance unless you want to learn about it or you really need to. Most languages are already highly optimized. In my experience, performance issues usually come from poorly optimized or incorrect use of broader systems, or from features that are known to be slow but that people use anyway.
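As a small illustration of the "localize globals" advice, the sketch below computes the same sum twice: once looking up `math.sin` on every iteration, and once through a cached local. The function names `sumGlobal` and `sumLocal` are made up for this example, and how much time the cached version saves depends on the Lua version and device, so it is worth measuring rather than assuming.

```lua
-- Looks up the global table `math`, then its field `sin`, on every iteration.
local function sumGlobal(n)
    local total = 0
    for i = 1, n do total = total + math.sin(i) end
    return total
end

-- Caches the function in a local once, outside the loop.
local sin = math.sin
local function sumLocal(n)
    local total = 0
    for i = 1, n do total = total + sin(i) end
    return total
end

-- Both versions perform the same arithmetic; only the lookup cost differs.
print(sumGlobal(1000) == sumLocal(1000))
```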

If for whatever reason you want even more precise results, keep in mind that:

• Results vary across environments: Factors like OS, temperature, and RAM pressure can significantly affect outcomes.
Even the order in which you declare your variables can make a difference. (An illusion)
To ensure accuracy, test across multiple devices, operating systems, and inputs.

• Understand the engine and language deeply: Without thorough knowledge, you may end up testing under misleading conditions.
Reading, exploring, testing, and reading again is key to avoiding this.

StarCrunch · November 17, 2024, 12:03am

Also worth noting is that any time you use the function keyword, you’re creating a closure object. (The local function cases are done when loading the file, and saved to variables, and thus are each one-time costs.) So you’ll get dinged for those each time, and it also becomes garbage after the call, since nothing is holding onto it. It probably won’t be huge, but it’s not nothing, in particular in the context of a test like this where you have 20,000 iterations. (This will also mean some occasional garbage collection going on.) In the case of wrap() you’ll also be doing the same for a coroutine, which is a little more heavyweight. (Probably not up to it tonight, but I’ll chime in later on your post for those. I’m the author of the article, as well.)
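The closure cost described above can be made visible with a rough sketch like the one below. It reuses the all() iterator from the first post, shows that every call returns a distinct function object, and then estimates the garbage produced by 20,000 iterations by pausing the collector and comparing `collectgarbage("count")` before and after. The exact figure will vary by Lua version and platform, so treat it as an estimate, not a benchmark.

```lua
local function all(t)
    local i, n = 0, #t
    return function()
        i = i + 1
        if i <= n then return t[i] end
    end
end

local t = {1, 2, 3, 4, 5}

-- Each call builds a fresh closure, so two iterators are never the same object.
print(all(t) == all(t)) -- false

-- Pause the collector, create 20,000 iterators that nothing holds on to,
-- and compare the Lua heap size (in KB) before and after.
collectgarbage("collect")
collectgarbage("stop")
local before = collectgarbage("count")
for _ = 1, 20000 do
    for _ in all(t) do end
end
local after = collectgarbage("count")
collectgarbage("restart")

print(string.format("garbage created: %.1f KB", after - before))
```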