Why does a base64-encoded binary file become corrupted when I send it with network.request()?

Hi fellow Corona devs,

I’m trying to send a .pdf / docx (binary) file which I have base64 encoded as part of the network.request() call.

Here is the code I’m using to test:

local fileName = "document.docx"
local pathToImport = system.pathForFile( fileName, system.DocumentsDirectory )
local fileHandle = io.open( pathToImport, "rb" )
local mime = require "mime"
local encodedFile = mime.b64( fileHandle:read( "*a" ) )
local body = "fileName=" .. fileName .. "&encodedFile =" .. encodedFile
local postParams = {}
postParams.body = body
network.request( "http://www.domain.com/file.php", "POST", uploadCompleted, postParams )

Within the PHP file I am base64-decoding the encoded string.  I can see the decoded string present (since I output it for testing purposes).  So all the data I expect to be available via the request is present.

Once I base64 decode the string I save it to disk.  All appears to be working as expected.

However, when I try to view the saved docx file in Word, it complains that the file is corrupt or not of the right format.  It’s also happening to .png images, .pdf files etc.  Basically any binary data I am sending across the network.

Has anybody experienced the same issue with regards to base64 encoding and the file becoming corrupted?  If so I’d be very interested to know what you did about it.

I’ve read on some *older* (2013) forum posts that network.request has / had a bug where it corrupts encoded base64 values?  Is this still true?

I’ve also seen this forum post: https://forums.coronalabs.com/topic/70098-help-how-to-covert-image-to-base64/

where the solution is to load a binary (image) file, save it as a binary file and reload it as a binary file.  I tried something similar but that didn’t work.  Or perhaps I did it incorrectly.

I’ve read that I could also use network.upload() but in the docs, it says that network.upload simply calls network.request eventually anyway.  Perhaps I’m missing some request header info?

I’m really at a loss as to what I should try next.  So any help would be greatly appreciated.

Thanks for reading.

For those that might have the same issue I think I solved it.

I’m replacing every occurrence of a space (" ") with a plus ("+") character in the PHP before decoding.  I remember reading another forum post where this was done.

Like so: 

$binaryData = str_replace(" ", "+", $binaryData);
file_put_contents($fileName, base64_decode($binaryData));

and the resulting file is no longer corrupted.

Hi mate,

I haven’t done docs, only images, and I remember that to get it to work I first had to save the file as a binary (using the "b" flag), then read it back before encoding it.

I can’t explain why it works like that but it works.

Seeing your original code at the top, you could try to remove the "b" flag (io.open(pathToImport, "r")) when reading the file the first time, then save it with the "b" flag, then load it again with the "b" flag and encode.

This behaviour should not be related to using network.request.

I haven’t tested this so please let me know if it works or not.

It could be that this is just a different way of doing what you did with replacing spaces.

There seem to be multiple topics to explore here.

  1. network.request() itself does not corrupt Base64 strings. There was a bug in the old mime library’s b64() function that had a string length limit where after a certain size, it wouldn’t encode correctly. That bug was fixed a long time ago.

  2. "rb" and "wb" are important, in particular on Windows PCs. If you leave the "b" off, the file is opened in text mode and line endings may get converted in unexpected ways.  Windows marks the end of a text line with a CRLF (Carriage Return / Line Feed) pair. That’s a CTRL-M CTRL-J sequence. Mac OS 9 and earlier used a single CTRL-M (CR). Unix has always used a single CTRL-J (LF) character. If you open a file in text mode, the operating system under the hood attempts to convert these line endings. Because a CTRL-M (ASCII 13) and a CTRL-J (ASCII 10) can be perfectly valid bytes in binary data like an image file, they shouldn’t be treated as line endings there. This is why opening images in binary mode ("rb", "wb") is important.  macOS, iOS and Android are all variants of Unix. Many servers are running either Unix or Linux too. Because of this, you can generally get away without using binary mode, but some servers run Windows and end users can be running Windows too, so it’s always safer to open binary files in binary mode to avoid confusion.

  3. Data sent to a web server generally needs to honor URL encoding. Spaces are not permitted in URLs. When you type a URL into your web browser, the web browser will convert that space to a + sign or a %20 which is the hex value for an ASCII space. I don’t understand why POST data is impacted by this, but maybe PHP is doing something on the server side that’s expecting the POST data to be encoded. This may be why you’re running into issues. This isn’t a specific issue with network.request. It’s just sending the data.
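To make point 3 concrete: base64 output can contain "+", "/" and "=", and if the POST body is treated as application/x-www-form-urlencoded (which is what PHP needs in order to fill $_POST), the server decodes every "+" back into a space. That would explain why the str_replace workaround helps. A minimal sketch of fixing it on the Corona side instead is to percent-encode each value before building the body. The urlEncode helper below is my own illustration, not part of the Corona API, and the sample string stands in for the real mime.b64() output.

```lua
-- Hypothetical helper: percent-encode every character that is not
-- an RFC 3986 "unreserved" character (letters, digits, - . _ ~).
local function urlEncode( str )
    return ( str:gsub( "[^%w%-%._~]", function( c )
        return string.format( "%%%02X", string.byte( c ) )
    end ) )
end

-- Stand-in for the result of mime.b64( fileContents ).
local encodedFile = "ab+/cd=="
local fileName = "document.docx"

-- Build the form body with both values encoded (note: no space before "=").
local body = "fileName=" .. urlEncode( fileName ) ..
             "&encodedFile=" .. urlEncode( encodedFile )

print( urlEncode( encodedFile ) )  --> ab%2B%2Fcd%3D%3D
print( body )
```

With the body encoded like this, the "+" arrives as %2B and PHP should hand the original base64 string to base64_decode() without any str_replace step.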

Rob

If you are uploading a file, why not use network.upload?

https://docs.coronalabs.com/api/library/network/upload.html

Also, in your code you never close the file; you read and encode it but the handle stays open. I don’t know if that will corrupt the data.
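Something like this is what I mean, as a rough sketch only. readBinaryFile is a name I made up, and the temp file part is plain Lua just so the snippet runs on its own; in a Corona project you would pass the path from system.pathForFile() instead.

```lua
-- Hypothetical helper: read a whole file as raw bytes and close the handle.
local function readBinaryFile( path )
    local handle, err = io.open( path, "rb" )  -- "b" = binary mode
    if not handle then
        return nil, err
    end
    local data = handle:read( "*a" )
    handle:close()
    return data
end

-- Demo: write a few bytes, including CR, LF and NUL, then read them back.
local tmpPath = os.tmpname()
local out = assert( io.open( tmpPath, "wb" ) )
out:write( "PK\3\4\13\10\0\255" )
out:close()

local data = assert( readBinaryFile( tmpPath ) )
print( #data )  --> 8
os.remove( tmpPath )
```

The string that comes back can then go straight into mime.b64().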

@caroloscosta, just as an FYI, network.upload() is just a wrapper around network.request() and only works with certain server scripts that can accept PUT data. Most server scripts are expecting Multi-part MIME forms. The server script may also require the data to be base64 encoded instead of accepting binary data.

People developing projects that require uploading data need to fully understand the server’s expectations and code the Corona side accordingly.

Rob

Rob, I know it’s a wrapper. I’m not saying that the problem in @support_pz’s code will be resolved by using network.upload. I’m just asking: if it’s doing an upload, why not use the "simple" version of it? If the code is badly implemented elsewhere, it will still not work, of course.

About base64-encoded files, I’m using them right now in a project and have never had a problem. Sending with the POST method, receiving in PHP.

And like I said, the network.request call @support_pz made doesn’t look right to me.

*edit* Also, I think the PUT method (idempotent) is better than POST for handling file uploads.

By the way, I’m using base64-encoded JSON files, not binary. If I were, I would post my code here that I know works.
