Thanks, Joshua. That looks similar to this lua function I found on another thread here:
-- returns the number of bytes used by the UTF-8 character at byte i in s -- also doubles as a UTF-8 character validator function utf8charbytes (s, i) -- argument defaults i = i or 1 local c = string.byte(s, i) -- determine bytes needed for character, based on RFC 3629 if c \> 0 and c \<= 127 then -- UTF8-1 return 1 elseif c \>= 194 and c \<= 223 then -- UTF8-2 local c2 = string.byte(s, i + 1) return 2 elseif c \>= 224 and c \<= 239 then -- UTF8-3 local c2 = s:byte(i + 1) local c3 = s:byte(i + 2) return 3 elseif c \>= 240 and c \<= 244 then -- UTF8-4 local c2 = s:byte(i + 1) local c3 = s:byte(i + 2) local c4 = s:byte(i + 3) return 4 end end
And here are the utf8len and utf8sub functions, as well as a utf8 character replace function, also found in another thread here
-- returns the number of characters in a UTF-8 string function utf8len (s) local pos = 1 local bytes = string.len(s) local len = 0 while pos \<= bytes and len ~= chars do local c = string.byte(s,pos) len = len + 1 pos = pos + utf8charbytes(s, pos) end if chars ~= nil then return pos - 1 end return len end -- functions identically to string.sub except that i and j are UTF-8 characters -- instead of bytes function utf8sub (s, i, j) j = j or -1 if i == nil then return "" end local pos = 1 local bytes = string.len(s) local len = 0 -- only set l if i or j is negative local l = (i \>= 0 and j \>= 0) or utf8len(s) local startChar = (i \>= 0) and i or l + i + 1 local endChar = (j \>= 0) and j or l + j + 1 -- can't have start before end! if startChar \> endChar then return "" end -- byte offsets to pass to string.sub local startByte, endByte = 1, bytes while pos \<= bytes do len = len + 1 if len == startChar then startByte = pos end pos = pos + utf8charbytes(s, pos) if len == endChar then endByte = pos - 1 break end end return string.sub(s, startByte, endByte) end -- replace UTF-8 characters based on a mapping table function utf8replace (s, mapping) local pos = 1 local bytes = string.len(s) local charbytes local newstr = "" while pos \<= bytes do charbytes = utf8charbytes(s, pos) local c = string.sub(s, pos, pos + charbytes - 1) newstr = newstr .. (mapping[c] or c) pos = pos + charbytes end return newstr end
I tried replaced all the string.len and string.sub in the Text Candy Library with calls to these functions instead, and it actually seemed to work somewhat. That is, it didn’t crash, but the non-ascii characters in my bitmap font were still not being displayed, and I don’t understand the workings of the TC library enough to know why they didn’t. And as I mentioned, the functions killed my framerate, even on my Mac desktop. I’ll try to wrap my head around the bitop plugin, but at first glance I’m not sure how it could be used to speed up these functions.