Just can't seem to get smooth performance...?

mike470/gtt: “Move the tiles” is one of the bigger “duh” moments in my life. Can’t believe I didn’t think of that one first. Beats the hell out of sprite creation (70ms!). (FWIW, display.newSprite and :setSequence are roughly 70% of the CPU workload of my newTile() function.)

Now, gtt, you suggest using translate here, but is it really faster if you have to do the math first to figure out where to translate to? i.e., which one of these is faster?

[code]-- 1. Calculate and translate
local distance = map.vis.btm[map.vis.btm.numChildren][32].y - currentRow.y
currentRow:translate(0, distance)
distance = nil

-- 2. Set x/y
currentRow.y = map.vis.btm[map.vis.btm.numChildren][32].y[/code]
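One way to settle a question like this is to time both variants. The following is only a rough sketch in plain Lua: `timeIt` and `newMockRow` are hypothetical helpers, and the mock row is a plain table rather than a real display group, so the numbers say nothing about Corona's renderer. On a device you would run the same loop against real rows and could swap `os.clock()` for `system.getTimer()`.

```lua
-- Hypothetical micro-benchmark helper; not part of the original project.
-- Runs fn `iterations` times and returns the elapsed CPU time in milliseconds.
local function timeIt(fn, iterations)
    local start = os.clock()
    for _ = 1, iterations do
        fn()
    end
    return (os.clock() - start) * 1000
end

-- Stand-in for a display group: a plain table with x/y and a translate method.
local function newMockRow(y)
    local row = { x = 0, y = y }
    function row:translate(dx, dy)
        self.x = self.x + dx
        self.y = self.y + dy
    end
    return row
end

local targetY = 640

-- Option 1: compute the distance, then translate.
local rowA = newMockRow(0)
local function calcAndTranslate()
    rowA:translate(0, targetY - rowA.y)
end

-- Option 2: assign y directly.
local rowB = newMockRow(0)
local function setDirectly()
    rowB.y = targetY
end

local msTranslate = timeIt(calcAndTranslate, 100000)
local msAssign = timeIt(setDirectly, 100000)
print(("translate: %.2f ms, assign: %.2f ms"):format(msTranslate, msAssign))
```

Both options leave the row at the same y, so whichever is faster on the target device can be used without changing behavior.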

Each tile is a newSprite(), specifically because I need to be able to animate them. Likewise, it means I use setSequence instead of setFrame. And as luck would have it I’m already using 2 tiles of overdraw on each side so that I can scroll without tiles disappearing on-screen. (Likewise, :prepare is deprecated so wouldn’t use that.)

Results : Well…

  • Performance on Simulator (2011 Air OSX) is even worse visually; tons of hiccups. At a glance I’d say the framerate has dropped at least 25% below my previous solution.
  • Performance on Device (iPhone4) is clearly better than before. Instead of constant hiccups there is now just a big 300ms hiccup when you change directions, but even that seems to go away after a while.
  • In the profiler, the move function now takes roughly 18% less time (from 1080ms to 890ms).

I’d say there is still something significantly wrong here, but I’ll attach some code and take some suggestions. :wink:

Zwonkie : imageGroup culling has a performance benefit but it does not actually cull the tiles per se. I imagine it’s just using some clever memory sharing. (I’m sure Corona Labs has a better explanation.)

32x32 tile grid (1024 tiles, average Dragon Quest town size): Smooth framerate
64x64 tile grid (4096 tiles, large dungeon): Serious performance issues
256x256 tile grid (65k tiles, world map): Crashes the device

The key takeaway here is that it does *not* let you skip culling in your code solution. You simply cannot hold so many tiles in memory, let alone translate the entire thing. It’s for this reason that I can recommend Lime for, say, average sidescroller levels but cannot recommend it for an RPG.
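To make the culling point concrete, here is a minimal sketch of working out which tiles need to exist as display objects for a given camera position. The function name, the parameters and the 480x320 screen size are all assumptions for illustration, not code from my project.

```lua
-- Returns the inclusive range of tile columns/rows that should exist as display
-- objects, including `overdraw` extra tiles on each side, clamped to the map bounds.
-- All names and parameters here are assumptions made for this sketch.
local function visibleTileRange(cameraX, cameraY, screenW, screenH, tileSize, overdraw, mapW, mapH)
    local firstCol = math.floor(cameraX / tileSize) + 1 - overdraw
    local firstRow = math.floor(cameraY / tileSize) + 1 - overdraw
    local lastCol = math.floor((cameraX + screenW - 1) / tileSize) + 1 + overdraw
    local lastRow = math.floor((cameraY + screenH - 1) / tileSize) + 1 + overdraw
    return math.max(1, firstCol), math.max(1, firstRow),
           math.min(mapW, lastCol), math.min(mapH, lastRow)
end

-- Camera at the top-left corner of a 256x256 map, 32px tiles, 2 tiles of overdraw.
local c1, r1, c2, r2 = visibleTileRange(0, 0, 480, 320, 32, 2, 256, 256)
local count = (c2 - c1 + 1) * (r2 - r1 + 1)
print(c1, r1, c2, r2, count)
```

Under these assumptions the world map needs a couple of hundred live tiles at any moment instead of 65k, which is the whole argument for culling in your own code.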


Sample Code: (as promised…)

EDIT: This version seems 3x faster than the old version. newRow() and newColumn() now take the majority of the hit.

[code]local function checkMapPosition(map)
    -- How many tiles worth of movement before redraw == true?
    local pixels_over = 96 --local pixels_over = (map.overdraw - 1 or 2) * tileSize

    -- Localize the map data
    local vis = map.vis

    if vis.originX - vis.offsetX >= pixels_over and vis.btm[1][1].xTile > 1 then

        return true, "left"

    elseif vis.offsetX - vis.originX >= pixels_over and vis.btm[1][vis.btm[1].numChildren].xTile < map.width then

        return true, "right"

    elseif vis.originY - vis.offsetY >= pixels_over and vis.btm[1][1].yTile > 1 then

        return true, "up"

    elseif vis.offsetY - vis.originY >= pixels_over and vis.btm[vis.btm.numChildren][1].yTile < map.height then
        return true, "down"
    end
end --/checkMapPosition() -------------------------------------------------------------------------[/code]

And here’s what my newRow() code looks like now as a result of the suggestion. I imagine there is a way to optimize code so it doesn’t mirror so much but I’m more worried about getting down the execution time and keeping it readable.

[code]-- FUNCTION: Moves an old row of tiles into a new row and then changes their appearance to match. -
function newRow(map, angle)

    -- Shortcut tables
    local top, btm = {}, {}

    if angle == "up" then

        btm.row = map.vis.btm[map.vis.btm.numChildren]
        btm.pre = map.vis.btm[1]
        top.row = map.vis.top[map.vis.btm.numChildren]
        top.pre = map.vis.top[1]

        -- Move the row to its new position above pre.
        btm.row:setReferencePoint(display.TopLeftReferencePoint)
        btm.row.y = btm.pre.y - tileSize
        --btm.row.x = btm.pre.x

        top.row:setReferencePoint(display.TopLeftReferencePoint)
        top.row.y = top.pre.y - tileSize
        --top.row.x = top.pre.x

        -- Change the position of the group within the displayGroup array.
        map.vis.btm:insert(1, btm.row)
        map.vis.top:insert(1, top.row)

    elseif angle == "down" then

        btm.row = map.vis.btm[1]
        btm.pre = map.vis.btm[map.vis.btm.numChildren]
        top.row = map.vis.top[1]
        top.pre = map.vis.top[map.vis.btm.numChildren]

        -- Move the row to its new position below pre.
        btm.row:setReferencePoint(display.TopLeftReferencePoint)
        btm.row.y = btm.pre.y + tileSize

        top.row:setReferencePoint(display.TopLeftReferencePoint)
        top.row.y = top.pre.y + tileSize

        -- Change the position of the group within the displayGroup array.
        map.vis.btm:insert( btm.row )
        map.vis.top:insert( top.row )

    end

    local adjust = { up=-1, down=1 }

    -- Update each tile in the row to its new identity.
    for i = 1, btm.row.numChildren do

        -- Update the xTile and yTile
        -- btm.row[i].xTile = btm.pre[i].xTile
        btm.row[i].yTile = btm.pre[i].yTile + adjust[angle]

        -- top.row[i].xTile = top.pre[i].xTile
        top.row[i].yTile = top.pre[i].yTile + adjust[angle]

        -- Update the tile ID and type
        btm.row[i].id = map.properties.btm[btm.row[i].yTile][btm.row[i].xTile].id
        btm.row[i].type = map.properties.btm[btm.row[i].yTile][btm.row[i].xTile].type

        top.row[i].id = map.properties.top[top.row[i].yTile][top.row[i].xTile].id
        top.row[i].type = map.properties.top[top.row[i].yTile][top.row[i].xTile].type

        -- Update the visual appearance
        btm.row[i]:setSequence( map.tileset.properties[btm.row[i].type ].name )
        btm.row[i]:play()

        top.row[i]:setSequence( map.tileset.properties[top.row[i].type ].name )
        top.row[i]:play()

    end

    -- Clean up
    table.remove(top)
    table.remove(btm)
    top, btm = nil, nil

end --/newRow() -----------------------------------------------------------------------------------[/code]

Try to localize. For example, in the innermost loop above, you have multiple references to btm.row[i] and top.row[i]. Make those local vars and refer to them. That will speed things up. Same with btm.row[i].yTile and top.row[i].xTile.
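As a rough illustration of that suggestion (a sketch only, using mock tiles and a hypothetical `updateRow` helper rather than the real display groups), each tile and each map cell is looked up once per iteration and then reused. Plain tables are used here, so the loop bound is `#row` instead of `numChildren`.

```lua
-- Mock tile standing in for a sprite; setSequence/play only record what was asked.
local function newMockTile(xTile, yTile)
    local tile = { xTile = xTile, yTile = yTile }
    function tile:setSequence(name) self.sequence = name end
    function tile:play() self.playing = true end
    return tile
end

-- Hypothetical helper: re-skin a recycled row based on the row next to it.
local function updateRow(row, pre, properties, tilesetProps, delta)
    for i = 1, #row do
        local tile = row[i]                    -- one lookup instead of many
        local yTile = pre[i].yTile + delta
        local cell = properties[yTile][tile.xTile]
        local cellType = cell.type

        tile.yTile = yTile
        tile.id = cell.id
        tile.type = cellType
        tile:setSequence(tilesetProps[cellType].name)
        tile:play()
    end
end

-- Tiny 2x2 map and tileset, invented for the example.
local properties = {
    { { id = 1, type = "grass" }, { id = 2, type = "water" } },  -- map row 1
    { { id = 3, type = "water" }, { id = 4, type = "grass" } },  -- map row 2
}
local tilesetProps = { grass = { name = "grassAnim" }, water = { name = "waterAnim" } }

local pre = { newMockTile(1, 1), newMockTile(2, 1) }  -- row currently at the edge
local row = { newMockTile(1, 0), newMockTile(2, 0) }  -- row being recycled
updateRow(row, pre, properties, tilesetProps, 1)
print(row[1].id, row[1].sequence, row[2].id, row[2].sequence)
```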

Other than that, it is hard to figure out what to optimize without having the whole project in front of me. There is a pretty good sticky: http://developer.coronalabs.com/forum/2011/12/03/tips-optimization-101 - take a look and follow its advice. For example, newRow seems to be a global function - make it local.


I see what you mean, although top and btm are already local to the function, so I’m not sure further localization will save more time than it costs to declare the variable. Will give it a try!

I’ve been following that thread pretty closely, actually. And using Profiler to test some of its assumptions.

newRow is not actually global. The variable is set ahead of time for forward referencing, and then I add it to the module output later. :slight_smile:
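For anyone unfamiliar with the pattern, a minimal sketch looks like this. The module table `M` and the `scroll` function are made up for the example; only the forward-declaration idea comes from my code.

```lua
local M = {}

-- Forward declaration: newRow is a local, so functions defined before its body can call it.
local newRow

local function scroll(direction)
    return newRow(direction)
end

-- Because a local named newRow is already in scope, this assigns to that local
-- rather than creating a global.
function newRow(direction)
    return "row recycled: " .. direction
end

-- Expose the functions through the module table.
M.newRow = newRow
M.scroll = scroll

print(M.scroll("up"))
-- In a real module file the last line would be `return M`.
```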

Anyway, I think it’s about as fast as it’s going to get, some minor localization aside. I’ll continue on and start adding the NPC layer, but I’m guessing I’ll have another performance thread coming up soon, if only because the simulator performance in this project is just so poor. Thanks for your advice! :slight_smile:

The simulator performance depends quite a bit on your desktop’s CPU and graphics card. Mine, for example, performs a LOT better than the devices.

Some notes from my experience building my first tile-based application in Corona:
* Corona is not well suited for tile-based games. These games are graphics-demanding compared to other games, and that is not one of Corona's strong sides compared to native development. Despite that, it is possible with a lot of tweaking to get reasonable performance, especially using Build 841 and above, which solved some offscreen culling issues.

Some things I ended up doing:
* I pre-create the entire map (4-10k tiles, depending on the level) with all 7 layers before the game starts. As mike said, if you can reuse tiles instead of pre-creating them, even better.
* In enterFrame, try to do as little as possible. I mostly move the NPCs, enemies and the scene view. Specifically, try to avoid creating new things there.
* If possible, calculate some things only every n frames, which saves a lot of time in between.
* Localize everything and avoid OO methodology where you can without making the code unmaintainable; function calls across modules have penalties in Lua.
* Reduce the number of event listeners. Use one for the main UI.
* When using physics:

  • Use collision filters.
  • Set friction to 0.0 unless specifically needed; otherwise it can cause stuttering.
  • I’ve found that on iOS it’s good to use time step = 1/fps for a smooth game, whereas on Android that does not work well and the time step needs to be set to 0.
  • Let the physics engine do most of the checks for you (for example, collisions between NPCs, walls etc.) and use events instead of checking yourself with Lua functions.

* Note that there are BIG differences between iOS and Android. On Android the game will suffer much more from lag, and on low-end devices it will be much slower.
* Single-core Android devices also suffer from severe audio problems. I still do not know how to avoid these completely, but I try to resample music to lower rates and optimize SFX. Another problem is that you cannot detect whether a device is single core, so you will end up hurting at least some of your potential customers either way.

* On a side note, personally, I think it was a mistake to make my first game a tile-based game. It’s one of the more complicated kinds of game to develop, and Corona is not the best tool for it. It took me around 5 months to complete, graphics and all.
* Despite the above note, if you’re not under money/time pressure and develop mostly for fun, I totally understand the joy in building a tile-based game! It was fun. Enjoy.
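The "only every n frames" tip above can be sketched roughly like this. `everyNFrames` is a hypothetical helper, and the loop at the end just simulates 60 frames in plain Lua; in Corona the returned function would be registered with `Runtime:addEventListener("enterFrame", onEnterFrame)`.

```lua
-- Hypothetical helper: wraps fn so that it only runs on every n-th call.
local function everyNFrames(n, fn)
    local frame = 0
    return function(event)
        frame = frame + 1
        if frame % n == 0 then
            fn(event)
        end
    end
end

-- Example: some expensive check that only needs to run 6 times per 60 frames.
local runs = 0
local onEnterFrame = everyNFrames(10, function()
    runs = runs + 1
end)

-- Simulate 60 frames.
for _ = 1, 60 do
    onEnterFrame({})
end
print(runs)
```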

Interesting stuff, rune7.
* I hadn’t considered using physics for my collision since there’s so little of it to actually measure, but I may try that later. I do have an existing function based method that seems to work fairly well since there’s no measurement involved. Will see how it does combined with this culling solution.

* My current performance plan is just to use the i4 as a baseline. I’m well aware that Android could have a variety of problems, but I’d much rather have a working iOS game first. :slight_smile:

* Not really sure what you mean by time-step…I assume that’s some sort of “if” filter within the enterFrame listener? i.e., within enterFrame don’t do anything until X amount of time has passed?

* My current plan is to create and store NPCs based on when they enter view, but you’re right, I may just have to create them all at the start and have them hang out in space somewhere. I tried to do a 4k tile load once but that just wasn’t working well, hence my current solution.

* mike470, I agree, but I would say that 400 tiles giving serious problems on my MBA strikes me as a bit odd…maybe tiles really are that hard for Corona to handle. (You’re right though, although localizing the xTile/yTile would have been overkill, just localizing again within the loop seemed to cut 50ms!)