How to read a file in UTF-8 mode

Hello!

I have some trouble. I have never worked with Lua before.

I have a simple file with the following content:

«Counting Stars» ― OneRepublic

So I wrote the following code to read this file:

BotSite = ""
local open = io.open
local function read_file(path)
    local file = open(path, "r")
    if not file then return nil end
    local content = file:read "*all"
    file:close()
    return content
end
BotSay = UserFrom .. ", " .. read_file("D:/Snip/Snip.txt")

After executing that code, I get:

nickname, «Counting Stars» â OneRepublic

As I understand it, this is a UTF-8 problem. How do I fix it?

It’s an issue with how you’re opening the file. On Windows, you have to tell the OS whether to read the file as text or binary, and when opening it as text, you have to specify which text encoding.

Since you open the file with “r”, Windows will use its default file-open setting, which is ANSI text mode. This is not what you want, since your file is encoded in UTF-8.

The simplest solution would be to open the file in binary mode with “rb”, which will read the file’s bytes as-is, just like how it works on the Unix-like platforms (Apple and Android).
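As a minimal sketch of that suggestion, assuming the host program exposes the standard `io` library, the read helper could open the file with “rb” and return the raw bytes. The name `read_bytes` is made up for illustration.

```lua
-- Read a whole file as raw bytes.
-- Returns the content, or nil plus an error message if the file cannot be opened.
local function read_bytes(path)
    local file, err = io.open(path, "rb")  -- "rb": binary mode, no newline translation
    if not file then
        return nil, err
    end
    local content = file:read("*a")        -- "*a" reads everything up to end of file
    file:close()
    return content
end
```

Returning the error message alongside nil makes it easier to tell a missing file from a file that is locked by another program.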

Yeah, I use Windows 10.

So I tried rb mode, but it has the same effect and one more behavior: I can’t edit this file. The file is used by another application.

In fact, I use a special program for Twitch, and this program can execute any Lua file:

https://yadi.sk/d/ShvzqZMivQSpV

The full content of the file is in the first post. I just need to read the first line as a string. That is all.
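Reading only the first line might look like the sketch below. It assumes the standard `io` library is available in the host program, and `read_first_line` is a hypothetical name. The file is opened and closed right away, so the handle is held only briefly.

```lua
-- Return the first line of a file as raw bytes, without the line ending.
-- Returns nil plus an error message if the file cannot be opened,
-- and nil alone if the file is empty.
local function read_first_line(path)
    local file, err = io.open(path, "rb")
    if not file then
        return nil, err
    end
    local line = file:read("*l")   -- "*l" reads one line and drops the "\n"
    file:close()
    if line then
        -- In binary mode a Windows line ending leaves a trailing "\r"; remove it.
        line = line:gsub("\r$", "")
    end
    return line
end
```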

Are you sure your text file is using UTF-8 text encoding?

If you created the text file with Windows Notepad, then it’ll use an ANSI encoding instead, which is not what you want.

Most code editing applications have a Save As option where you can select the text encoding. You want UTF-8 without BOM (i.e., without a signature).
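If the file cannot be re-saved without a BOM, the signature can also be dropped after reading. The UTF-8 BOM is the three bytes EF BB BF at the very start of the file. A small sketch, with `strip_bom` as an illustrative name:

```lua
-- The UTF-8 byte order mark: bytes EF BB BF, written as decimal escapes.
local BOM = "\239\187\191"

-- Return the text without a leading UTF-8 BOM, if one is present.
local function strip_bom(text)
    if text:sub(1, 3) == BOM then
        return text:sub(4)
    end
    return text
end
```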

https://yadi.sk/d/D4I0PWvnvQYVM

This must be an issue with how you’re outputting the text then.

What does it look like when you output the file’s text via the Lua print() function?

Also, to help isolate the issue, try outputting the string received from the file unmodified.
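One way to isolate the issue without relying on how the host displays text is to output the bytes as hex. This is a rough sketch, and `to_hex` is a made-up helper. In a UTF-8 file, « is stored as C2 AB and ― as E2 80 95. In a Windows-1252 (ANSI) file, « is the single byte AB.

```lua
-- Format every byte of a string as two hex digits, separated by spaces.
local function to_hex(s)
    local parts = {}
    for i = 1, #s do
        parts[i] = string.format("%02X", s:byte(i))
    end
    return table.concat(parts, " ")
end
```

Assigning the result of `to_hex` on the file content to the output variable shows what was actually read, independent of any later encoding conversion.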

And you definitely need to use “rb” when opening the file too.

I don’t have access to the executing program.

I just have the Lua file. The program is written in C#, as far as I know.

The minimum code I must write:

BotSite = ""
BotSay = "text lalala"

So I need to assign a string to the BotSay variable.

The string is the file content.

In binary mode the file can’t be written to ;[ It is periodically rewritten.

Oh hold on.  You’re not using the Corona SDK and Corona Simulator then?

If so, then you’re in the wrong developer forum.

I say this because the official *unmodified* Lua library does not support UTF-8 on Windows.  Most SDKs and game engines (such as Corona) modify the native Lua library code to add UTF-8 support to it on Windows.  (Technically the issue is that Lua calls the Win32 char* APIs which only support ANSI text encoding, not UTF-8, and you have to transcode the UTF-8 string in Lua to UTF-16 and call Win32’s wchar* APIs instead.)
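To illustrate only the transcoding step mentioned above, here is a pure-Lua sketch that turns a UTF-8 string into UTF-16LE bytes. It assumes well-formed UTF-8 input and does no validation. The function names are invented for this example, and it does not call any Win32 API, which would need native code.

```lua
-- Decode a well-formed UTF-8 string into a list of code points.
local function utf8_to_codepoints(s)
    local cps, i = {}, 1
    while i <= #s do
        local b = s:byte(i)
        local cp, n                      -- n = number of continuation bytes
        if b < 0x80 then cp, n = b, 0
        elseif b < 0xE0 then cp, n = b - 0xC0, 1
        elseif b < 0xF0 then cp, n = b - 0xE0, 2
        else cp, n = b - 0xF0, 3 end
        for j = 1, n do
            -- Each continuation byte contributes its low 6 bits.
            cp = cp * 64 + ((s:byte(i + j) or 0x80) - 0x80)
        end
        cps[#cps + 1] = cp
        i = i + n + 1
    end
    return cps
end

-- Encode a list of code points as UTF-16, little-endian, without a BOM.
local function codepoints_to_utf16le(cps)
    local out = {}
    local function put(unit)             -- append one 16-bit unit, low byte first
        out[#out + 1] = string.char(unit % 256, math.floor(unit / 256))
    end
    for _, cp in ipairs(cps) do
        if cp < 0x10000 then
            put(cp)
        else
            -- Code points above U+FFFF become a surrogate pair.
            local v = cp - 0x10000
            put(0xD800 + math.floor(v / 0x400))
            put(0xDC00 + v % 0x400)
        end
    end
    return table.concat(out)
end
```

Plain arithmetic is used instead of bit operators so the sketch does not depend on a particular Lua version.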

Oh my God ;[ Likely. So we are living in 2016 and it doesn’t support UTF-8. Great xD

So can you tell me how I can get my string in UTF-8?

chances are

After the script executed, the file was changed despite “read mode”: https://yadi.sk/d/rmerk8WIvTYnV

These forums are for developers using the Corona SDK. Since you’re not using Corona, it’s up to you to find a solution to this. I will tell you that there are open source *forks* of the Lua library on the Internet where developers have added Unicode support to the Win32 build of the Lua library. But this is a task for you.
