Batch Rendering

Dell · March 26, 2011, 7:54am

Hi All,

I’m considering using Corona for my next project but wanted to check something first. I couldn’t find a good sub-forum to put this in, so I thought I’d turn it into a feature request of sorts…

Is there any form of batch rendering in Corona? For example, if I have 50 sprites on screen all using the same texture/spritesheet, is that 1 draw call or 50?

If there isn’t, are there any plans to add it? Would it fall under the “Improve Sprite API” on the roadmap?

Thanks

Dell

jhocking · March 28, 2011, 11:13am

For the less technically inclined here, he is referring to the low-level calls made to OpenGL. How exactly Corona makes draw calls to OpenGL can have a massive effect on performance.

I sure hope that is what Corona does, maybe an engineer from Ansca can tell us for sure. In the meantime, your question really just boils down to “can corona draw 50 identical sprites really fast?” In which case, you should do some performance tests. Downloading the free trial and then learning how to display a bunch of sprites is something you would want to do anyway in evaluating a game engine.

Also, take a look at this thread for a method to have huge tile maps:
http://developer.anscamobile.com/forum/2011/01/29/object-culling-render-process-when-not-content-area#comment-26658

Dell · March 28, 2011, 9:46pm

Thanks for the reply, jhocking. I’ve just done some quick tests, but I’m not sure of the results so far…

For ease I modified the Fishies sample, got rid of the background and replaced the image with a 16x16 PNG to try and rule out fill rate.

On an iPod Touch 2G I get a fairly steady 30fps with 75 sprites, which is more than I would expect a 2G to be able to handle if that was 75 draw calls.

However, if I up it to 300 sprites, for example, and test on a 3GS, it manages <20fps. Now, 300 sprites is quite a lot, but if they were all drawn in 1 call I would expect a 3GS to be able to handle that quite comfortably…

EDIT:

I’ve just set up a similar scene using cocos2d’s batch node, and I’m getting 60fps with 500 sprites on the 3GS, so perhaps they are individual calls in Corona? Or could the bottleneck be iterating through an lua array with that many objects?
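For anyone repeating this kind of test, a frame-rate counter makes the numbers easier to compare between runs. The following is only a minimal sketch: `newFpsCounter` is a hypothetical helper name (not part of Corona), and the hookup at the bottom assumes Corona’s `enterFrame` runtime event with its `event.time` field in milliseconds. That hookup is skipped when the file is run in a plain Lua interpreter.

```lua
-- Hypothetical helper (not a Corona API): averages the frame rate over a
-- window of frames, given one timestamp in milliseconds per frame.
local function newFpsCounter(windowSize)
    local count, windowStart, fps = 0, nil, nil
    return function(timeMs)
        if not windowStart then
            windowStart = timeMs      -- the first frame only starts the clock
            return fps
        end
        count = count + 1
        if count >= windowSize then
            fps = count * 1000 / (timeMs - windowStart)
            count, windowStart = 0, timeMs
        end
        return fps                    -- nil until the first window completes
    end
end

-- Assumed hookup inside Corona; does nothing in a plain Lua interpreter.
if Runtime then
    local tick = newFpsCounter(30)
    local last = nil
    Runtime:addEventListener("enterFrame", function(event)
        local fps = tick(event.time)
        if fps and fps ~= last then
            print(string.format("%.1f fps", fps))
            last = fps
        end
    end)
end
```

Averaging over a window rather than timing single frames smooths out jitter, which matters when comparing something like 75 sprites against 300.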

jmp909 · March 29, 2011, 10:36am

Just to check… you are actually using the [lua]sprite.newSpriteSheet[/lua] etc. commands?
http://developer.anscamobile.com/reference/index/sprite

I only ask because people might refer to a normal bitmap image loaded with [lua]display.newImage[/lua] as a “sprite”.

j

Dell · March 29, 2011, 1:19pm

I’ve tried both methods and see little if any difference between them. That would lead me to believe they are being treated the same, other than the fact that the spritesheet shares memory…

jhocking · March 29, 2011, 1:48pm

Those testing results you had seem to suggest that Corona does batch the draw calls but that there is some other bottleneck from how Corona is handling the instances. Would be helpful for an Ansca engineer to pipe up at this point.

“Or could the bottleneck be iterating through an lua array with that many objects?”

Wait, what is your code doing that you need to loop through an array? For a draw calls test I would expect that your code doesn’t actually do anything with the objects other than just place them on the screen at startup. If the code is doing anything more (e.g. moving the objects around) then you would expect that to be slower because Lua is slower than C, and that has nothing to do with draw calls. Please post your code.

Dell · March 29, 2011, 2:30pm

My code is essentially the Fishies sample that comes with Corona. Perhaps the problem, then, is my understanding, which we may be getting to the edge of… I assumed that the buffer would not be cleared if nothing on screen changed. If it is the case that the sprites are re-drawn every frame regardless of whether they move, then there must be some batching after all, as I can get a fairly steady 30fps with 300 static sprites on the 2G Touch…

On the down side, if that is the case, does that mean Lua is not fast enough to handle an array of that size on a 3GS? Admittedly there are a few if statements etc. as well in that sample, but nothing major…
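One way to check whether plain Lua iteration is the bottleneck is to time the update loop on its own, with no rendering involved. The sketch below does that in standard Lua. The fish fields (`x`, `vx`), the 320-pixel width and the names `makeFish`, `updateFish` and `timeUpdates` are all made up for illustration and only loosely mimic what the Fishies sample does; the figure is only meaningful when measured on the device in question.

```lua
-- Illustrative stand-ins for display objects: plain tables with a position
-- and a velocity. Nothing here touches the renderer.
local function makeFish(n, width)
    local fish = {}
    for i = 1, n do
        fish[i] = { x = (i * 37) % width, vx = (i % 2 == 0) and 2 or -2 }
    end
    return fish
end

-- One "frame" of movement: advance each fish and bounce it off the edges.
local function updateFish(fish, width)
    for i = 1, #fish do
        local f = fish[i]
        f.x = f.x + f.vx
        if f.x < 0 then
            f.x, f.vx = -f.x, -f.vx
        elseif f.x > width then
            f.x, f.vx = 2 * width - f.x, -f.vx
        end
    end
end

-- Average CPU time per frame in milliseconds, measured with os.clock.
local function timeUpdates(fish, width, frames)
    local start = os.clock()
    for _ = 1, frames do
        updateFish(fish, width)
    end
    return (os.clock() - start) * 1000 / frames
end

local fish = makeFish(300, 320)
local msPerFrame = timeUpdates(fish, 320, 1000)
print(string.format("300 objects: %.4f ms per frame", msPerFrame))
```

If the reported time is a small fraction of the roughly 33 ms available per frame at 30fps, the array loop by itself is unlikely to explain the slowdown, and the cost is more likely in what happens each time a display object’s properties are set.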

rocket5tim · April 12, 2011, 5:25pm

I have the same question and am hoping someone from Ansca can chime in.

Would the following = 1 draw call or 3?

local but1 = display.newImageRect( "sprites/buttonAclosed.png", 32, 32 )
local but2 = display.newImageRect( "sprites/buttonAclosed.png", 32, 32 )
local but3 = display.newImageRect( "sprites/buttonAclosed.png", 32, 32 )

mike4 · April 12, 2011, 8:39pm

In an old podcast I listened to, a conversation with Carlos and Walter, they explicitly stated that when you create an image multiple times from the same image file, Corona will reuse the image bits and only load the image into memory once.

Info on the podcast here: http://mobileorchard.com/corona-easy-to-implement-high-performance-native-iphone-apps-written-in-lua/

Direct download here: http://podcast.mobileorchard.com/podcast/019-Corona.mp3

J_A_Whye · April 13, 2011, 12:16am

Earlier today I was playing with Fishies iPad and put 2000 fish swimming back and forth on the screen. On an iPad 2 I got 8 FPS. Cut the number of fish in half and I got about 16 FPS; cut it in half again – 500 swimming fish – and got 30 FPS.

I was hoping for more, but there may be many places where that code can be optimized – it was probably written for clarity more than speed.

And just for the fun of it I “unrolled the loop” – instead of doing the same chunk of code 500 times I put in 500 chunks of code. Each fish had its own code. No speed up, but I didn’t really expect any with that few iterations.

Jay

PS - You young whippersnappers have it easy. Back in the day we put graphics on the screen a pixel at a time, and while unrolling the loops expanded the size of your code, we could get noticeable speed increases from doing that.
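Dividing the frame time by the fish count puts the Fishies figures from this post on a common scale. This is plain arithmetic on the reported numbers; it says nothing about where the time actually goes, and the 30 FPS reading may simply be a frame-rate cap (an assumption, the post doesn’t say).

```lua
-- Per-fish cost implied by each reported measurement on the iPad 2.
local results = {
    { fish = 2000, fps = 8 },
    { fish = 1000, fps = 16 },
    { fish = 500,  fps = 30 },
}

for _, r in ipairs(results) do
    r.frameMs = 1000 / r.fps            -- time spent on one frame
    r.perFishMs = r.frameMs / r.fish    -- that time shared across all fish
    print(string.format("%4d fish: %6.1f ms/frame, %.4f ms per fish",
        r.fish, r.frameMs, r.perFishMs))
end
```

The first two rows both work out to 0.0625 ms per fish, so the frame time grows in direct proportion to the number of fish. That is what a fixed per-object cost would look like, whether that cost sits in the Lua code or in the renderer.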

mike4 · April 13, 2011, 7:38am

I am well aware of that. But the question of a single draw call versus multiples in this case (the fishies) is implicit in the example.

According to the podcast, they are only loading the image once into memory, and reusing the OpenGL id of the bits, with the slight overhead of redrawing the fish (position, rotation, scale), a single draw call is to be expected.

Now, I guess the real question is “is” it actually making a single draw call, or is there something else going on in the code that “breaks” the batch?

I know in all the other engines I have used over the years, batching is very dependent on the requirements, to ensure a batch. If one little thing is wrong, the batch breaks.

Has anyone tested changing the Fishies example to actually load in a sprite sheet instead of the current display.newImage?
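To illustrate what “breaking the batch” means in general: a renderer that draws in display order can only merge neighbouring sprites that share a texture, so one odd sprite in the middle splits a run into three. The sketch below is a toy model in plain Lua, not Corona’s renderer (whose behaviour is exactly what this thread is trying to find out); `countBatches` and the texture names are invented for the example.

```lua
-- Toy model: sprites are drawn in list order, and a new draw call starts
-- whenever the texture differs from that of the previous sprite.
local function countBatches(sprites)
    local batches, current = 0, nil
    for _, sprite in ipairs(sprites) do
        if sprite.texture ~= current then
            batches = batches + 1
            current = sprite.texture
        end
    end
    return batches
end

local sameSheet = {}
for i = 1, 50 do
    sameSheet[i] = { texture = "fish_sheet" }
end
print(countBatches(sameSheet))       -- prints 1: fifty sprites, one texture

local mixed = {}
for i = 1, 50 do
    mixed[i] = { texture = "fish_sheet" }
end
mixed[25] = { texture = "bubble" }   -- one sprite from a different image
print(countBatches(mixed))           -- prints 3: the run is split in two
```

Real engines usually add more conditions than the texture (blend mode, for instance), which is why a batch can break over a seemingly small difference.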

mike4 · April 13, 2011, 8:47am

While I haven’t written any OpenGL code myself, I do understand how it works, and my assumptions of a single draw call are based entirely on Carlos and Walter’s statements. That assumption may be completely wrong, but it is based on what they have said.

Without actually either hearing from Carlos, Walter or another Ansca engineer, or a method in Corona to get the number of draw calls, we are just speculating, based on our tests.

I wanted to find out how many draw calls were going on in my app, to verify that the sprite sheets were actually working as I expected, and was told there is no method to get the number of draw calls from Corona.

Do you have a method of getting the draw calls? Because that would be a good number to have, or see.


jhocking · April 13, 2011, 8:48am

“According to the podcast, they are only loading the image once into memory, and reusing the OpenGL id of the bits, with the slight overhead of redrawing the fish (position, rotation, scale), a single draw call is to be expected.”

I still don’t think you’re getting it. Have you ever written anything in OpenGL? Not used rendering engines based on OpenGL, but actually made calls to OpenGL in your own code.

It is perfectly possible to use the same OpenGL id over and over in multiple draw calls. The code would look something like this (warning: massively simplified pseudo-code):

-- one draw call per polygon: four calls, all reusing the same texture
tex = opengl.load("fish.png")
opengl.draw(poly1, tex)
opengl.draw(poly2, tex)
opengl.draw(poly3, tex)
opengl.draw(poly4, tex)

as opposed to

-- all four polygons submitted in a single draw call
tex = opengl.load("fish.png")
polys = {poly1, poly2, poly3, poly4}
opengl.draw(polys, tex)

Every call to opengl.draw (which isn’t what the command actually looks like, I’ve made it look like Corona here) has a small bit of overhead. The fewer times you call that command the better.
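The same contrast can be made runnable with a stand-in renderer that does nothing except count calls. Everything here is a mock: `newMockGL`, `load` and `draw` are invented names, not OpenGL or Corona functions, and the 50 polygons just echo the number from the original question.

```lua
-- Mock renderer: records how many draw calls and polygons it receives.
local function newMockGL()
    local gl = { drawCalls = 0, polygons = 0 }
    function gl.load(filename)
        return { name = filename }          -- pretend texture handle
    end
    function gl.draw(polys, tex)
        gl.drawCalls = gl.drawCalls + 1     -- fixed overhead paid per call
        gl.polygons = gl.polygons + #polys
    end
    return gl
end

local polys = {}
for i = 1, 50 do
    polys[i] = { x = i * 10, y = 0 }
end

-- Unbatched: one call per polygon, all sharing the same texture.
local unbatched = newMockGL()
local tex = unbatched.load("fish.png")
for _, poly in ipairs(polys) do
    unbatched.draw({ poly }, tex)
end

-- Batched: the same polygons submitted in a single call.
local batched = newMockGL()
batched.draw(polys, batched.load("fish.png"))

print(unbatched.drawCalls, unbatched.polygons)   -- 50 calls, 50 polygons
print(batched.drawCalls, batched.polygons)       -- 1 call, 50 polygons
```

Both versions load the texture once and draw the same 50 polygons; only the number of calls differs, and that is the number nobody outside Ansca can currently see.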

jhocking · April 13, 2011, 8:56am

No, I don’t have any way of knowing that. That’s why in my very first post in this thread I said “I sure hope that is what Corona does, maybe an engineer from Ansca can tell us for sure.”

As for our speculations, I was simply disagreeing with you that the statement you quoted from that podcast has any relevance to draw calls.

jhocking · April 13, 2011, 8:58am

“Corona will reuse the image bits and only load the image into memory once.”

Loading the image isn’t what we’re talking about; we’re talking about how the code renders the image after it’s been loaded. It’s a subtle technical point that only people with low-level experience in OpenGL will understand, but you can draw multiple instances of the same image with a single draw call. OpenGL is optimized to handle lots of data in a single draw call faster than the same amount of data spread out over multiple draw calls.

This is one of the myriad reasons spritesheets are more efficient than separate images; if you have multiple sprites sharing a spritesheet then OpenGL will allow you to render all of those sprites with a single draw call. The question here is, does Corona do that?

mike4 · April 13, 2011, 9:00am

No worries! I like discussions like this, because I learn by having to solidify my knowledge into words. Sometimes it proves me wrong and I like that. After 15 years making games, there is always something, on a basic level, for me to learn.

I would like an Ansca person to chime in on this as well.

What I really want is a call to be able to see how many draw calls there actually are.

snarla · May 4, 2011, 4:35pm

If we told you, we’d have to eliminate you.

mike4 · May 4, 2011, 5:17pm

Now that’s just not funny.

seth.dev.ios · May 5, 2011, 1:20am

Why do Ansca staff like to reply with super-funny (not) answers when we’re expecting them to give us a valuable one? Is that all we get for being subscribers besides the game engine: “jokes” on a forum?

rocket5tim · May 5, 2011, 12:50pm

“If we told you, we’d have to eliminate you.”

Responses like that might eliminate me as a customer. Is that what you meant?

Seriously, Dell’s question should already be covered in the docs; we shouldn’t need to ask basic questions like this in the forums. And certainly we shouldn’t have to wait over a month for a response.