Getting string.upper() to work with international characters

I’m struggling with getting string.upper() to work with international characters. 

local text = "ÆæØøÅå"
text = string.upper(text)
label = display.newText(text, display.contentWidth*0.5, display.contentHeight*0.5, native.systemFontBold, 40)

Running the code above gives the output below:

ÆæØøÅå

(I would love to show you a screen dump, but for some unfathomable reason I’m not allowed to upload files larger than 9.26 KB…)

So why is this happening, and is there a way to make this work? I’d rather not have to convert every special character manually.

@runewinse,

I’m not 100% sure, but I think you’re dealing with UTF-8 characters, i.e. non-ASCII characters that are encoded as more than one byte.

AFAIK Lua doesn’t have any built-in libraries for dealing with UTF-8 symbols. You’re probably going to need to hunt down a pure-Lua solution for dealing with them.
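
For anyone landing here later, a minimal sketch of what such a pure-Lua approach could look like. It assumes the source file is saved as UTF-8; the `UPPER` table and the `upperUtf8` name are made up for illustration and only cover a handful of Norwegian and Spanish letters, so treat it as a starting point rather than a complete uppercaser.

```lua
-- Hypothetical lookup table (not from any library); extend it with every
-- accented letter your languages need. The source file must be saved as UTF-8.
local UPPER = {
    ["æ"] = "Æ", ["ø"] = "Ø", ["å"] = "Å",                  -- Norwegian
    ["á"] = "Á", ["é"] = "É", ["í"] = "Í", ["ó"] = "Ó",
    ["ú"] = "Ú", ["ü"] = "Ü", ["ñ"] = "Ñ",                  -- Spanish
}

local function upperUtf8(s)
    -- Step 1: uppercase plain ASCII letters without relying on the C locale.
    s = s:gsub("[a-z]", function(c) return string.char(c:byte() - 32) end)
    -- Step 2: match each multi-byte UTF-8 sequence (lead byte 194-244 followed
    -- by continuation bytes 128-191) and swap it via the table. Sequences with
    -- no table entry are left unchanged.
    s = s:gsub("[\194-\244][\128-\191]*", UPPER)
    return s
end

print(upperUtf8("ÆæØøÅå"))
```

The table lookup works because `string.gsub` accepts a table as the replacement and keeps the original match when the key is missing.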

@all - Folks if I’ve got this wrong, please correct me as I want to learn more about this myself.

You may also get some hints for dealing with multibyte characters here, thanks to @ingemar’s code.
https://forums.coronalabs.com/topic/42019-split-utf-8-string-word-with-foreign-characters-to-letters/#entry320172
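
As a rough illustration of the idea (this is not @ingemar’s code, just a sketch under the assumption that the input is valid UTF-8), splitting a string into whole characters instead of bytes can be done with a single pattern. The function name `utf8Chars` is made up for this example.

```lua
-- Split a UTF-8 string into a table of characters (not bytes).
-- Assumes valid UTF-8 input; NUL bytes are skipped by this pattern.
local function utf8Chars(s)
    local chars = {}
    -- One ASCII byte (1-127) or one lead byte (194-244), followed by any
    -- number of continuation bytes (128-191).
    for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
        chars[#chars + 1] = ch
    end
    return chars
end

local letters = utf8Chars("ÆæØ")
print(#letters, #"ÆæØ")  -- character count versus byte count
```

The difference between the two counts is the reason byte-oriented functions like string.upper() and string.sub() misbehave on these strings.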

It may or may not be what you’re looking for, but don’t forget os.setlocale. There’s a nearly identical question here that gives an example of how to use it.

  • Caleb

I don’t find any mention of os.setlocale in the Corona docs at all:

https://docs.coronalabs.com/daily/api/library/os/index.html

How is this function supposed to be used? Can the locale be changed while the app is running? My app happens to be a spelling trainer for kids that supports a slew of different languages, and I have to be able to uppercase texts in Spanish and Norwegian at the same time.

Hi @runewinse,

These are not documented by us because, for one reason, they’re strictly Lua APIs, not Corona APIs. Atop that, the C/C++ function that os.setlocale() is bound to in Lua changes the locale globally within the app (not the system) on all threads, which might have a negative impact with native UI. Atop even that, os.setlocale() no-ops on Android because Google didn’t implement it in the Android NDK. So, on Android, C/C++ code (including Lua) always uses the “us-en” locale, regardless of the locale you set in the phone/tablet (at least for Google’s official devices, but various forks of Android may do things differently).
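
For reference, a hedged sketch of how the standard Lua call is typically used. The locale name below is only an example and may not exist on a given device; `trySetCtype` is a made-up helper name. Note also that Lua’s string.upper() converts one byte at a time, so even where a locale is accepted it can only help with single-byte encodings, not with multi-byte UTF-8 text.

```lua
-- Try to switch the character-classification locale.
-- Returns whether the switch succeeded, plus the locale that was active before.
local function trySetCtype(name)
    local previous = os.setlocale(nil, "ctype")   -- nil queries the current locale
    local applied = os.setlocale(name, "ctype")   -- returns nil if the request fails
    return applied ~= nil, previous
end

-- "nb_NO.ISO8859-1" is an assumed example name; availability is platform-specific.
local ok, previous = trySetCtype("nb_NO.ISO8859-1")
if ok then
    -- string.upper() would now follow that locale for single-byte text.
    os.setlocale(previous, "ctype")               -- restore, since the change is app-wide
else
    print("locale not available on this platform")
end
```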

Stepping back from the engineering speak, have you considered just using an “all caps” custom font where all characters are inherently styled as if uppercase?

Best regards,

Brent

Hmm… that just might work for my purpose! I haven’t tried loading custom fonts into a Corona app yet, but there’s a first time for everything :slight_smile:

I’ll google around a bit and see if there are any “all caps” TTF fonts out there that have a rich character set. Any tips are appreciated!

Thanks!

-Rune

Thanks Brent!

Using an all caps TTF font worked just beautifully. I’d love to show how the app looked, but unfortunately there is an incomprehensible 9kB limit to the attachments on this forum…

Anyway, it works just fine now. Still incredibly annoying not to be able to fix it the “right way” though…  :stuck_out_tongue:

@roaminggamer

@g.sciacchitano

There surely is a UTF-8 problem here. I’ve already included the algorithm from @ingemar, but unfortunately I haven’t found any good “UTF-8 uppercaser”.

While googling, I found this:

https://github.com/starwing/luautf8

But I’m not even sure this is something that works or is possible for me to use.

Hi @runewinse,

You’ll be happy to learn that we just released a UTF-8 plugin, based on the same “luautf8” library that you found. This should help you solve all of your string manipulation needs. :slight_smile:

https://coronalabs.com/blog/2016/03/21/introducing-the-utf-8-string-plugin/

https://docs.coronalabs.com/plugin/utf8/index.html
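
A minimal sketch of how this might look in the original example, assuming the plugin has been enabled in build.settings as described in the plugin documentation. The `toUpper` wrapper is a made-up helper; it falls back to the ASCII-only string.upper() when the plugin cannot be loaded.

```lua
-- Load the UTF-8 plugin if it is available; pcall keeps the script running
-- in environments where the plugin is missing.
local ok, utf8 = pcall(require, "plugin.utf8")

local function toUpper(s)
    if ok and type(utf8) == "table" and utf8.upper then
        return utf8.upper(s)      -- UTF-8 aware uppercasing from the plugin
    end
    return string.upper(s)        -- fallback: only handles ASCII letters
end

print(toUpper("ÆæØøÅå"))
```

The result of toUpper() can then be passed to display.newText() exactly as in the first post.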

Take care,

Brent

Thanks a million Brent!

How are you trying to attach things?
