OpenGL draw calls - more info if possible

Hey guys,

Just going through performance optimisation stuff. I noticed the info on draw call batching on this page:

https://docs.coronalabs.com/guide/basics/optimization/index.html

I’m not 100% sure I follow. Is there any more information on this? I’d like to know what to avoid, and whether there’s anything I can do to reduce the number of draw calls.

Thanks in advance.

Rob,

Hi Rob. I’m seeing if I can get a clarification for you.

Rob

As I read that advice, the takeaway I got was… in a nutshell… if you are doing draw operations that share resources, make sure to do them consecutively in order to have the possibility of a speed up.

For example.

If you have 5 objects (let’s call them type A) that use the same spritesheet/texture and 5 objects (type B) that use another spritesheet/texture, you are more likely to get a speed-up if you use this draw order:

BETTER: AAAAABBBBB

Than if you do this one:

WORSE: ABABABABAB

This concept is true for any shared resources and/or hardware features.

Of course, the overhead you would need in a complex game with many possible types/sharing to re-order draws might negate the ‘possible’ speedup you’d get from being in order, so be careful.

I think this kind of advice should only be followed when your game already lends itself to repeated calls matching this strategy.
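
To put rough numbers on the two orderings above, here is a minimal sketch in plain Lua. It assumes the simplest possible model: every run of consecutive objects with the same texture costs one draw call. The function name `countBatches` is made up for illustration, and the real renderer has more conditions than this (shaders, masks), so treat it as an estimate only.

```lua
-- Count how many batches a draw order needs, ASSUMING one batch per run of
-- consecutive objects that share the same texture (one letter per object).
local function countBatches( order )
    local batches = 0
    local previous = nil
    for i = 1, #order do
        local key = order:sub( i, i )
        if key ~= previous then
            -- The texture changed, so a new batch has to start here
            batches = batches + 1
            previous = key
        end
    end
    return batches
end

print( countBatches( "AAAAABBBBB" ) )  -- 2
print( countBatches( "ABABABABAB" ) )  -- 10
```

Same ten objects in both cases; only the ordering changes the count.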

This is rather simple: when using image sheets, sprites are batched instead of each being drawn in a separate draw call. For example, say we have 2 imageRects, with sprite A and sprite B from the same image sheet. For them to be rendered in the same draw call (basically all vertices are grouped together and sent to the GPU at once, instead of sending 2 sets of vertices), they have to use the same material (same shader + same image sheet), they shouldn’t use masks (because mask culling can’t be batched easily), and they have to be rendered one after another. Pretty much any number of objects can be batched. Btw, you don’t have to use image sheets: drawing the same image several times is batched as well, without much performance impact. So you can draw 1000 identical objects in a row, but having 500 of one type and 500 of another would slow things down.

I made a little project to demonstrate the issue. Tap on the FPS counter to toggle between batched and individual sprites.

https://gist.github.com/Shchvova/3d7fecc371f728e66264690a416d2acc

It gives me double the FPS with batched calls compared to individual ones. Here’s a link to the zip file.

One question that came to mind when examining Vlad’s example was, “Will I see a reduction or difference in batching efficiency if my objects are rendered to different groups?”

Answer, “No.”  You still get the same speedup.

Thanks so much for these replies, so useful!

Where do shapes (circles, rects, polygons) fall into this?

Also, is there a way of printing the number of draw calls? This would be incredibly useful!

  1. Unfilled primitives (newRect(), newCircle()) are not affected by this, so there is no appreciable delta.

  2. You can override the display.* functions and add your own ‘calls accumulation’ system.

You can verify statement 1 by changing Vlad’s example to render primitives:

    if mode == "individual" then
        for i = 1, numberOfImages do
            display.newRect( 0, 0, 30, 30 ):translate( math.random( display.contentWidth ), math.random( display.contentHeight ) )
        end
        mode = "batched"
    else
        for i = 1, numberOfImages/2 do
            display.newRect( 0, 0, 30, 30 ):translate( math.random( display.contentWidth ), math.random( display.contentHeight ) )
            display.newCircle( 0, 0, 30 ):translate( math.random( display.contentWidth ), math.random( display.contentHeight ) )
        end
        mode = "individual"
    end

PS - Modified for legibility. Also you’ll need to set the label fill color to black.

Yes, all shapes are batched, except meshes. Meshes are batched only if “strip” mode is used.

  1. Thanks! That’s really clear.

  2. Could you explain this a bit more? I understand how to override functions to include my own code in there, but how would I determine when there is a draw call? That’s the bit I’m confused about. In the example above with “ABABABABAB”, would that mean that there are 10 draw calls, whereas “AAAAABBBBB” would be two? So am I just looking for when I change image sheets?

I really appreciate all your help with this, thanks so much. Sorry if I’m being a bit simple. I’ve never looked into this stuff before (probably should have!!)

  1. Override the calls (two for this example; up to you what to accumulate):

    local data = {}

    local display_newCircle = display.newCircle
    function display.newCircle( ... )
        local obj = display_newCircle( ... )
        data[#data+1] = { type = "circle" }
        return obj
    end

    -- Does NOT account for the case of same-named textures sourced from different baseDirs
    local display_newImageRect = display.newImageRect
    function display.newImageRect( ... )
        local args = { ... }
        local obj = display_newImageRect( ... )
        local rec = { type = "imageRect" }
        data[#data+1] = rec

        -- Is the first argument a group object, or the texture name / image sheet?
        local idx = 1
        if type( args[1] ) == "table" and args[1].removeSelf ~= nil then
            idx = 2 -- first argument was a parent group, so skip it
        end

        -- Capture the 'texture' source ID; could be a discrete texture or an image sheet
        rec.resourceID = args[idx]
        rec.isDiscrete = ( type( args[idx] ) == "string" ) -- string = discrete texture, otherwise image sheet
        return obj
    end

  2. Use a runtime listener to track current frame and to either flush the last frame data or summarize it.

    local accum = {}
    local onEnterFrame = function()
        for i = 1, #data do
            -- do something or some set of 'accumulation' calculations
            local rec = {
                -- Your calculations go here.
            }
            accum[#accum+1] = rec
        end
        data = {} -- clear data for the next frame's accumulation
    end
    Runtime:addEventListener( "enterFrame", onEnterFrame )

I can’t really demo this properly without doing it for real and since I don’t know exactly what you need, I hope this is clear enough.

Unless the implementation is really strange, under the hood most of these objects will boil down to the same thing, namely some configuration of triangles. A shape, then, is just one of these with a dummy texture (a single white pixel, say) that has basic shaders on it (emit the position unchanged; tint the pixel with the fill color).

@vlads What if several objects share a mask?

Also, what about strokes? I noticed when trying to shade some that the texture coordinates were all 0, so I’ve wondered if anything different is going on.

> What if several objects share a mask?

That works fine. What doesn’t batch is when objects are separated by a mask pop/push.

Strokes would batch if they’re using the same material, which is rarely the case - most of the time strokes just use a plain colour material.

By the way, under the hood Corona transforms every batch into a strip-formation mesh, which is sent to GL in one call. Shapes are connected with degenerate triangles (lines), which are invisible.

Also, while the potential performance boost is there, it is rarely the problem solver; it won’t give too much…
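
For anyone curious about the degenerate-triangle trick mentioned above, here is a small plain-Lua sketch of the general technique. To be clear, this is not Corona's actual code, and `joinStrips` and `countVisibleTriangles` are names invented for this illustration. Vertices are just labels here; a real renderer would use positions and texture coordinates.

```lua
-- Join two triangle strips into one by inserting repeated vertices.
-- The repeats create zero-area (degenerate) triangles, which produce no
-- pixels, so the two shapes still look separate on screen.
local function joinStrips( a, b )
    local out = {}
    for i = 1, #a do out[#out+1] = a[i] end
    out[#out+1] = a[#a]          -- repeat the last vertex of the first strip
    if #a % 2 == 1 then
        out[#out+1] = a[#a]      -- extra repeat keeps the winding order of b
    end
    out[#out+1] = b[1]           -- repeat the first vertex of the second strip
    for i = 1, #b do out[#out+1] = b[i] end
    return out
end

-- Count the triangles in a strip that have three distinct vertices.
local function countVisibleTriangles( strip )
    local count = 0
    for i = 1, #strip - 2 do
        local p, q, r = strip[i], strip[i+1], strip[i+2]
        if p ~= q and q ~= r and p ~= r then
            count = count + 1
        end
    end
    return count
end

-- Two quads, each described as a 4-vertex strip.
local quadA = { "a1", "a2", "a3", "a4" }
local quadB = { "b1", "b2", "b3", "b4" }
local joined = joinStrips( quadA, quadB )

print( #joined )                          -- 10
print( countVisibleTriangles( joined ) )  -- 4
```

The joined strip has 10 vertices instead of 8, but it can go to the GPU as a single strip, and only the 4 original triangles are visible.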

Thanks again for your help with this!

Just to clarify, regarding this:

BETTER: AAAAABBBBB

Than if you do this one:

WORSE: ABABABABAB

Is this referring to the order in which sprites are created, or the order in which they are displayed (z-index-like)? For example, if I had layered groups and created sprites in those groups in different orders.

Still trying to get my head around this! 

If it’s to do with the order sprites are actually created, then I could potentially create all the sprites I’ll need (in the correct order) in one go, and then just make them visible and put them in the correct group and location when I need them.

Thanks!

Displayed. Btw, the whole idea of batching is that you can batch if you use image sheets. Basically, sprites from the same image sheet will be batched.
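
Since display order is what matters, one rough way to estimate batches is to walk the group hierarchy in draw order and count how often the texture changes. The sketch below uses plain tables as stand-ins for display objects so it runs in any Lua interpreter. It relies on assumptions: `batchKey` is a hypothetical field you would set yourself when creating each object (for example the image sheet or file name), `group` and `sprite` are mock constructors, and masks and shaders are ignored entirely. Corona does not report draw calls, so this is only an estimate.

```lua
-- Collect the batch key of every visible leaf object, in draw order
-- (child 1 of a group is drawn first, groups are walked depth-first).
-- ASSUMPTION: `batchKey` is a custom field set by you; it is not a Corona API.
local function collectKeys( object, keys )
    keys = keys or {}
    if object.isVisible == false then
        return keys  -- hidden objects (and their children) are skipped
    end
    if object.numChildren then
        for i = 1, object.numChildren do
            collectKeys( object[i], keys )
        end
    else
        -- Untagged objects use themselves as the key, so they never batch
        keys[#keys+1] = object.batchKey or object
    end
    return keys
end

local function estimateBatches( root )
    local keys = collectKeys( root )
    local batches = 0
    for i = 1, #keys do
        if i == 1 or keys[i] ~= keys[i-1] then
            batches = batches + 1
        end
    end
    return batches
end

-- Mock objects for the demo (stand-ins for real groups and sprites)
local function sprite( key ) return { batchKey = key } end
local function group( ... ) return { numChildren = select( "#", ... ), ... } end

local sorted = group( group( sprite( "A" ), sprite( "A" ) ), group( sprite( "A" ), sprite( "B" ) ) )
local mixed  = group( group( sprite( "A" ), sprite( "B" ) ), group( sprite( "A" ), sprite( "B" ) ) )

print( estimateBatches( sorted ) )  -- 2
print( estimateBatches( mixed ) )   -- 4
```

In a real project you could try passing `display.getCurrentStage()` as the root, provided you tag each object with a `batchKey` when you create it. Note that the groups themselves do not affect the count; only the flattened order does.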

Would be great to have such details in the docs. I do create some meshes at the moment and, even though I don’t have any performance probs, it’s great if the info is there, if required.

To be honest, not everything requires batching. If your meshes are large by themselves, you probably won’t get any performance boost from batching them.

Yeah, that’s true - but it can’t hurt to know it’s possible. For instance, just like the info that changing customvertexdata for the shaders does not stop the draws from being batched. So, for instance, even without the option to use separate vertex colors for meshes at the moment, I might still implement per-character colors by just splitting up the mesh into as many smaller submeshes as required, even if it might be hundreds. Sure, there’s still some overhead, but the fewer unknowns there are, the better .)
