Better way to calculate associative table sizes?

Sometimes it just makes sense to use associative tables, but # can’t accurately return the length of assoc tables that contain ‘holes’ so I’ve been using a custom function for that instead:

module.tableLength = function(t) local iteration = 0 for k,v in pairs(t) do iteration = iteration + 1 end return iteration end

For my own needs so far, this approach has worked fine, but I’m conscious that if given a huge table it’s going to take a considerable time to churn over it and return a size.

Therefore just wondering if anybody has a better approach to this? I considered using custom wrapper functions to add to and remove from assoc tables which could then increment/decrement an internal counter to return instead of having to actually count when asked, but this approach seems risky - sods law I’ll just forget to use the wrapper one time and skew the count. Additionally the overhead of updating a counter every single time a change is made might counter the increased speed of the occasional count anyway…?

you need to know your use case before you can optimize it.  fe, what is the primary operation:  access via key? or counting? ever need access by index??  (btw, failure to use your wrapper should not be counted against that approach - plus there are ways to write such a wrapper that would completely privatize the raw data)

odds are though if you write something to improve the performance of counting, then you’ll decrease the performance elsewhere (insertion, deletion, etc).  often it’s a telltale sign that there’s something not-quite thought out thoroughly if you frequently need the count of a hash-keyed table.  other approaches include maintaining two tables (one by key, one by index) trading memory for performance of counting/indexing, though big hit to deletion.  always tradeoffs.

No specific use case to be honest… for what I’ve done this for so far in Lua, performance isn’t really an issue, but I’m coming at Lua as an otherwise PHP developer and assoc arrays make up just about everything in PHP.

One example that I can think of is this - say you have a game where each player has a username - maybe it’s an online game… some kind of an RPG where everybody has a map position, skill stats, etc. I.e. a bunch of players all containing properties that all other players need to be aware of in some way.

My instinct is to handle this as assoc arrays/tables. So “ric” and “dave” load in to an assoc table as keys, and then each of those table elements explode out into a tree of all the properties belonging to each player. Then later in the game code you can easily call for the properties of a player by doing something like players[“dave”].positionX or players[“ric”].positionX.

For something like a 2 player billiards game this is perfect and zero performance impact. And for something like an RPG where there might be say… 100 players within the near-distance that need handling, this is a really quick way to find the properties of any one player, but if you want to then do something like outputting how many players are near-by, #players will only work if there are no holes, but “ric” might drop offline and be killed from this table thus leaving a hole.

Now in this example it doesn’t take long to iterate a 100 key table, but what if you’re talking 10,000 players? Granted storing a three dimensional tree of properties for 10,000 players is a bit excessive and going to require lots of memory capacity, but if plenty of memory is available this is still a faster way to handle data than using a database (unless you require search capabilities of course) but iterating 10,000 keys to do a count every time you want one is going to be a problem, and that’s where I’m wondering if there’s a better solution?

aside:   lua is very like php in that the only data structure is an associative array.  lua has some internal “tweaks” to make the special-case of using numbers as keys perform better, but they’re still just keys in an associative table at the programming level, tho that special treatment of numeric keys can lead to important differences.  that, plus nils.  at the root of it lies the fact that lua can’t distinguish between:

local t1 = {} -- count of t1 is zero, got it, duh local t2 = { nonsense = nil } -- what is the "count" of t2? -- zero?? wtf?!?! -- didn't i just explicitly set a key?!?! but i WANT to store a literal nil there!!!!! -- friggin lua... can't even count a hashed table's keys, what a piece of s

:smiley:

back on topic:  again, hard to optimize a fictitious scenario, but…

players[“ric”].x apparently being a very frequent operation, then yes, hash-keyed tables to address your primary use

players *occassionally* dropping in/out of a network game would seemingly be far secondary, the query “how many total players do i have” being far less frequent (and far more static) than fe position queries, so a candidate for a “counter wrapper” approach (take the counter hit once on insert/delete, don’t iterate to count)

you need to know your use case before you can optimize it.  fe, what is the primary operation:  access via key? or counting? ever need access by index??  (btw, failure to use your wrapper should not be counted against that approach - plus there are ways to write such a wrapper that would completely privatize the raw data)

odds are though if you write something to improve the performance of counting, then you’ll decrease the performance elsewhere (insertion, deletion, etc).  often it’s a telltale sign that there’s something not-quite thought out thoroughly if you frequently need the count of a hash-keyed table.  other approaches include maintaining two tables (one by key, one by index) trading memory for performance of counting/indexing, though big hit to deletion.  always tradeoffs.

No specific use case to be honest… for what I’ve done this for so far in Lua, performance isn’t really an issue, but I’m coming at Lua as an otherwise PHP developer and assoc arrays make up just about everything in PHP.

One example that I can think of is this - say you have a game where each player has a username - maybe it’s an online game… some kind of an RPG where everybody has a map position, skill stats, etc. I.e. a bunch of players all containing properties that all other players need to be aware of in some way.

My instinct is to handle this as assoc arrays/tables. So “ric” and “dave” load in to an assoc table as keys, and then each of those table elements explode out into a tree of all the properties belonging to each player. Then later in the game code you can easily call for the properties of a player by doing something like players[“dave”].positionX or players[“ric”].positionX.

For something like a 2 player billiards game this is perfect and zero performance impact. And for something like an RPG where there might be say… 100 players within the near-distance that need handling, this is a really quick way to find the properties of any one player, but if you want to then do something like outputting how many players are near-by, #players will only work if there are no holes, but “ric” might drop offline and be killed from this table thus leaving a hole.

Now in this example it doesn’t take long to iterate a 100 key table, but what if you’re talking 10,000 players? Granted storing a three dimensional tree of properties for 10,000 players is a bit excessive and going to require lots of memory capacity, but if plenty of memory is available this is still a faster way to handle data than using a database (unless you require search capabilities of course) but iterating 10,000 keys to do a count every time you want one is going to be a problem, and that’s where I’m wondering if there’s a better solution?

aside:   lua is very like php in that the only data structure is an associative array.  lua has some internal “tweaks” to make the special-case of using numbers as keys perform better, but they’re still just keys in an associative table at the programming level, tho that special treatment of numeric keys can lead to important differences.  that, plus nils.  at the root of it lies the fact that lua can’t distinguish between:

local t1 = {} -- count of t1 is zero, got it, duh local t2 = { nonsense = nil } -- what is the "count" of t2? -- zero?? wtf?!?! -- didn't i just explicitly set a key?!?! but i WANT to store a literal nil there!!!!! -- friggin lua... can't even count a hashed table's keys, what a piece of s

:smiley:

back on topic:  again, hard to optimize a fictitious scenario, but…

players[“ric”].x apparently being a very frequent operation, then yes, hash-keyed tables to address your primary use

players *occassionally* dropping in/out of a network game would seemingly be far secondary, the query “how many total players do i have” being far less frequent (and far more static) than fe position queries, so a candidate for a “counter wrapper” approach (take the counter hit once on insert/delete, don’t iterate to count)