Issue with readfile() and download on at least one mobile browser

LLLLLLL · December 30, 2013

I've written software that lets a user download a file with a click. There is one customer site where downloads are failing, but only on the Android browser (and possibly other mobile browsers?). Both mp3s and PDFs fail to download, and both file types stop downloading at about 14.61 KB. To clarify, my Android phone shows that the mp3 was downloaded, with a file size of 14.61 KB, even though the actual file is 3MB or so.

The code is pretty standard stuff:

header( "Content-Description: File Transfer" );
header( "Content-Type: application/force-download");
header( "Content-Length: " . filesize( $filename ) );
header( 'Content-Disposition: attachment; filename="' . $dp->download_filename . '"' );
readfile( $filename );

Since this code works on hundreds of sites, and since there's no mobile issue on other sites, a code issue seems unlikely. I think there might be a server-side issue but I don't know what to check. Download issues are often php.ini limits on file size, but since this same file downloads on desktop browsers without issue, a php.ini issue seems unlikely.

Any ideas on why this would fail on mobile?

MDCode · December 30, 2013

Try adding another header to specify the type of the file being downloaded. I had an issue with that before.

LLLLLLL · December 30, 2013

I will try that, but why would those files download...

1) Successfully from a desktop browser

2) Successfully from my server on a mobile browser (instead of using the customer's site)

3) Unsuccessfully from his server and mobile.

...?

#2 makes it seem like it's a server thing. The same code runs on my dev server as the customer's server. The same files download for me without issue.

LLLLLLL · December 30, 2013

Didn't help, by the way. Still stumped.

mac_gyver · December 31, 2013

when you open this 14k file after it has been downloaded, in a programming editor, what does it contain? is it the correct first 14k of the actual file or is it some error messages from the server?

Edited December 31, 2013 by mac_gyver

LLLLLLL · December 31, 2013

It's a corrupt file. It's nothing. I'm not sure how I'd know if it's "the first 14k" of the file.

Edited December 31, 2013 by timneu22

mac_gyver · December 31, 2013

by opening the full correct file in a programming editor too, to see if the start of the file is the same.

edit: also, did you scroll to the end of the 14k file when you had it open in your programming editor to see if there were any error messages at the end of it?

Edited December 31, 2013 by mac_gyver

LLLLLLL · December 31, 2013

Well this is interesting, the file being downloaded is actually the raw HTML text of the download page.

Any explanation on that?

MDCode · December 31, 2013

I had a problem with that but forgot how I fixed it. All I know is it had something to do with the file being requested twice for some reason. On the second request, one of the session variables I used to query the database for the file location had already been unset, and when it was not set the script assumed the current page was the file I was sending headers for.

LLLLLLL · December 31, 2013

I really have to think this is a server issue. The error doesn't occur on anyone else's site, and it only occurs on this site with mobile. Something's wrong with the server.

MDCode · December 31, 2013

When downloading, it can be browsers as well. Just as in older versions of FF, if you don't specify the size of the file, it returns an incorrect value.

LLLLLLL · January 1, 2014

Right, but it's not a browser issue. The same browser can download the same file from a different server. So it's a server issue, but are there specific Apache directives or some craziness that would make something fail only on mobile?

mac_gyver · January 1, 2014

what's your complete force download script?

best guess is the $filename variable is being set to the page the download link was on, perhaps due to some url rewriting or similar.

also, afaik "application/force-download" is a non-existent type and causes browsers to perform a raw/binary download because the browsers don't know what to do for a non-existent type. afaik, the content type for a file download should be application/octet-stream. This may also have something to do with the devices where this isn't working as they may be choosing to do something different for a type they don't understand.
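A minimal sketch of the header set suggested above, using application/octet-stream. The helper name build_download_headers() and its arguments are hypothetical and not part of the posted script; it only builds the header strings so they can be inspected or logged before being sent.

```php
<?php
// Hypothetical helper (not from the original script): builds the headers
// for a forced download using the generic application/octet-stream type.
function build_download_headers(string $path, string $downloadName): array
{
    // Keep only the file name, then drop characters that would break the
    // quoted filename parameter or allow header injection.
    $safeName = str_replace(['"', "\r", "\n", '\\'], '', basename($downloadName));

    return [
        'Content-Type: application/octet-stream',
        'Content-Length: ' . filesize($path),
        'Content-Disposition: attachment; filename="' . $safeName . '"',
    ];
}

// Usage (assumes $filename and $dp as in the thread):
// foreach (build_download_headers($filename, $dp->download_filename) as $h) {
//     header($h);
// }
// readfile($filename);
```

Building the headers in one place makes it easy to log exactly what was sent for a failing request.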

edit: at this point, i would be logging the actual data your download script is receiving as input values so that you can look to see if the problem is in the values it is receiving or in something after the code runs.

Edited January 1, 2014 by mac_gyver

LLLLLLL · January 1, 2014

There's no problem with $filename, because again, this works from a desktop browser. I have already examined the variables with logging and of course they are correct, since it's the exact same code from desktop or mobile browser.

I've tried changing to octet-stream and I've tried dozens of other variations. This may be useful, but it doesn't change the fact that this code works on other servers. It looks like octet-stream is a better choice, and I'll do that, but again, this code works on other servers to the same device and browser.

mac_gyver · January 2, 2014

it doesn't matter how many servers or browsers your code may have worked on, what matters is what is exactly happening on the one server with the client(s) where it doesn't work and with the files or file types where it doesn't work.

this is something going on, on the server. that means everything from the request it receives from the client, through all the code/settings involved, to the last byte of data that is sent back at the end of the download. without seeing what the http request(s) are in the server's access log file (you may be getting multiple ones from the client), what sort of url rewriting there might intentionally or accidentally be, what your full php force download code is (less any database credentials), what values the code is actually using when it doesn't work (they may have looked correct and normal to you, but they would provide a clue to someone here, such as how the server could be sending the html contents of a page when it should be sending the contents of a file stored somewhere), ... there's no way to eliminate anything on the server as the possible cause of the problem.

when you opened the 14k file in your editor and saw the html of the download page, were there any php errors at the start or end of it? do you have php's error_reporting set to E_ALL and either display_errors set to ON or log_errors set to ON so that you would know if there are any php detected errors? have you looked at the web server's access log to see what the requests are? is there only the one expected request to your force download script or are there more than one from the client? have you looked at the web server's error log to see if there is any relevant information in it?

and since this may have something to do with php's output buffering, what does the output from a phpinfo() statement show for the output_buffering setting?

are you intentionally or perhaps accidentally using any sort of output handlers/call-back functions?
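One way to rule output buffering in or out is to discard every active buffer just before streaming the file. The sketch below is an assumption about how that could be done; discard_output_buffers() is a hypothetical name and is not in the posted script.

```php
<?php
// Hypothetical helper: closes and discards all active output buffers so
// that nothing buffered earlier (stray HTML, whitespace) is sent with the
// file. Returns how many buffers were discarded.
function discard_output_buffers(): int
{
    $discarded = 0;
    while (ob_get_level() > 0) {
        // Stop if a buffer cannot be removed, to avoid looping forever.
        if (!@ob_end_clean()) {
            break;
        }
        $discarded++;
    }
    return $discarded;
}

// Usage: call once after the headers are set and before readfile(), and
// log the returned count to see whether any buffer was active.
```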

edit: i would also be logging the integer length that the readfile() statement returns.
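A rough sketch of that logging, assuming a hypothetical wrapper named send_and_measure(). readfile() returns the number of bytes read, or false on failure, so comparing it with filesize() shows whether the server side actually streamed the whole file.

```php
<?php
// Hypothetical wrapper: streams the file and reports what readfile() did.
function send_and_measure(string $path): array
{
    $expected = filesize($path);
    $sent     = readfile($path); // int byte count, or false on failure

    return [
        'expected' => $expected,
        'sent'     => $sent,
        'aborted'  => connection_aborted() === 1,
        'complete' => $sent !== false && $sent === $expected,
    ];
}

// Usage (assumes $filename as in the thread):
// $result = send_and_measure($filename);
// error_log('download result: ' . json_encode($result));
```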

Edited January 2, 2014 by mac_gyver

LLLLLLL · January 2, 2014

I found that the script is getting called twice, and this is the problem: after the file is downloaded once, it's not allowed to be downloaded again. The script checks for this, and if the download limit was reached, it redirects to the original page with a GET parameter so the appropriate error is displayed. The HTML that gets returned has the message "No more download attempts are allowed", which is the correct response in that situation. But why the script gets called twice is beyond me.

I have the server people looking at the logs to get this information. I've long suspected some htaccess or Apache or other rewrite that somehow makes the URL request or response get screwed up. That's probably the case for the code getting hit twice, but it's odd to me that the HTML would be returned as an attachment; the normal behavior is just to be directed to the page (the page that's made up of the HTML that's being returned).

Output buffering is 4096, for what it's worth.

I'm awaiting information from the server people.

kicken · January 2, 2014

The device in question may be doing something odd such as a HEAD request prior to the GET request, or it may do the GET, realize it's a download when it sees the content-type and cancel that request and forward the URL to some other service which then re-issues the request. If dual requests are the cause, then you may just have to modify your script to allow them. Rather than disallow any future downloads immediately on the first request, maybe set a timestamp and any additional requests received within x seconds(or minutes) will still be allowed.
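The grace-period idea could look something like the sketch below. Everything here is an assumption for illustration: may_download(), the 60 second default, and a stored "last attempt" timestamp are hypothetical names and values, not taken from the original script.

```php
<?php
// Hypothetical policy: once the attempt limit is reached, still allow
// requests that arrive within $graceSeconds of the last counted attempt,
// so a duplicate request from the same click is not rejected.
function may_download(
    int $status,
    int $allowed,
    ?string $lastAttemptTime,
    int $now,
    int $graceSeconds = 60
): bool {
    if ($status < $allowed) {
        return true; // limit not reached yet
    }
    if ($lastAttemptTime === null || $lastAttemptTime === '') {
        return false;
    }
    $last = strtotime($lastAttemptTime);
    if ($last === false) {
        return false; // unparseable timestamp
    }
    $age = $now - $last;
    return $age >= 0 && $age <= $graceSeconds;
}

// Usage: if (!may_download($status, $limit, $lastAttempt, time())) { /* reject */ }
```

The window length is a trade-off: long enough to cover the device's second request, short enough that the download limit still means something.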

When trying to debug issues like this, I find it is usually good to set up a log file and pepper the code with calls that write data to the log file so you can track the script's progression. Along with any specific variables you want to log, I usually write the values of the $_POST, $_GET, and $_SERVER super-globals to the log as well so I can see what data is being sent to the script.

With such a setup you would have been able to tell right away you were dealing with multiple-requests (as there'd be two sets of log entries) and possibly a reason why if there are any clues in one of the super globals listed.
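A minimal sketch of that kind of request logging. The function name log_request_snapshot() and the log path are hypothetical, and for brevity it records a few $_SERVER keys rather than the whole array. Two log lines for a single click would show the duplicate request immediately.

```php
<?php
// Hypothetical helper: appends one JSON line per call describing the
// current request, so repeated requests show up as repeated entries.
function log_request_snapshot(string $logFile, string $label): void
{
    $entry = [
        'time'   => date('c'),
        'label'  => $label,
        'method' => $_SERVER['REQUEST_METHOD'] ?? null,
        'uri'    => $_SERVER['REQUEST_URI'] ?? null,
        'agent'  => $_SERVER['HTTP_USER_AGENT'] ?? null,
        'range'  => $_SERVER['HTTP_RANGE'] ?? null,
        'get'    => $_GET,
        'post'   => $_POST,
    ];

    file_put_contents($logFile, json_encode($entry) . PHP_EOL, FILE_APPEND | LOCK_EX);
}

// Usage: log_request_snapshot('/path/to/download-debug.log', 'start of script');
```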

mac_gyver · January 2, 2014

since you are redirecting around all over the place based on conditions, it's likely you have some combination of a logic error/value problem/redirect loop/trying to use $_SERVER['HTTP_REFERER'] (which you cannot rely on) and/or don't have an exit statement after one or more header redirects.

here's my current guess - your logic is trying to redirect back to a page (you would like it to be the original download page) via a value from either a session variable or HTTP_REFERER, for some condition such as the requested file not being found, and your logic has already set the session variable that states a download has already occurred. for the devices where this doesn't work, the url used in the redirect ends up pointing at the force download script (i.e. the session variable or HTTP_REFERER value used as the target page is likely empty, so the redirect goes to the current page, which is the force download script). that url now has a get parameter on the end that is not the filename to download but the url of the download page. the original download page is therefore requested (and since the session variable stating the download has already been used is set, the content on that page shows that message), and the html of the original download page is read and output as the content of the forced download file.

some problem with just posting your code? it would save a bunch of time and cut out the guesswork.

of course, if what i have theorized is actually happening, it means your force download script is not validating the supplied filename and your script will allow any file on the server to be requested and downloaded, such as the file holding your database connection details.
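If a download script did accept a file name from the request, one common guard is to resolve the path and confirm it stays inside the intended directory. This is only a sketch under that assumption; resolve_inside() is a hypothetical helper and nothing here is taken from the original script.

```php
<?php
// Hypothetical guard: returns the real path of $relative only if it is an
// existing file located inside $baseDir, otherwise null.
function resolve_inside(string $baseDir, string $relative): ?string
{
    $base = realpath($baseDir);
    $full = realpath($baseDir . DIRECTORY_SEPARATOR . $relative);

    if ($base === false || $full === false) {
        return null; // directory or file does not exist
    }

    // Compare against the base path plus a trailing separator so that
    // "/data/files-other" is not accepted for base "/data/files".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;

    if (strncmp($full, $prefix, strlen($prefix)) !== 0 || !is_file($full)) {
        return null;
    }
    return $full;
}

// Usage: $path = resolve_inside($source_dir, $requested); if ($path === null) { /* reject */ }
```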

LLLLLLL · January 2, 2014

I'm not "redirecting all over the place", and I never use HTTP_REFERER. I do appreciate the help but what you're suggesting here is not correct.

1) There's a GET parameter passed that references a database row

2) That row knows the file, knows if it was downloaded before, knows other security settings related to the download

3) That file gets sent to the browser

4) All logic to redirect (in #2) goes through a single function that calls header() and then calls exit.

Posting the whole file isn't something that will help because the issue only occurs on one server. There's too much application logic on there anyway. Maybe I can send it to you privately, but at this point I'm awaiting the web host.

I should have probably posted this in the Apache forum or something. The web host is unable to get back to me with the reason for the multiple requests on the file.

mac_gyver · January 3, 2014

LLLLLLL wrote: "The web host is unable to get back to me with the reason for the multiple requests on the file."

because it is highly likely that it's your force download code where the problem is at.

if you look at the questions in any programming help forum, 99.5% of the time, the problem is due to a lack of or a misinterpretation of the core information. the other .5% are due to actual bugs in the underlying system.

at this point, since you don't know yourself, don't believe that it could be your code causing the problem, we cannot help you further without specific information that you have refused to simply post, bye.

jazzman1 · January 3, 2014

mac_gyver wrote: "at this point, since you don't know yourself, don't believe that it could be your code causing the problem, we cannot help you further without specific information that you have refused to simply post, bye."

I think he isn't distinguishing between downloading binary data through the browser and through the download manager on Android.

The download manager doesn't care about the MIME type of the script.

LLLLLLL wrote: "There is one customer site where downloads are failing but only on Android browser (and possibly other mobile?)"

Exactly which browser are you using on your phone, and which Android version is it running?

MDCode · January 3, 2014

https://code.google.com/p/android/issues/detail?id=1978

As stated about half way down, it is an intended part of the android system. It has nothing to do with not working on one server. Not sure why you are not experiencing it on another server, but it is intended. You will have to rewrite your code to comply with android.

Edited January 3, 2014 by SocialCloud

LLLLLLL · January 3, 2014

It was not my intention to keep the code from you. I have been away all day. But saying it's a coding issue when the code clearly works on all other sites and browsers, but not on this site only with mobile, seems completely illogical. Here's the modified/simplified code. Almost all of it is application-specific and won't help you, but here it is anyway.

<?php
common::sanitize_gets();

if ( !common::gx( 'guid' ) ) {
	echo translated_text::find( "Download.NoFileGuid" );
	exit;
}

session_start();
$guid = common::g( 'guid' );

// get the file location from preferences
{ phpfreaks...  read some preferences from DB }
$source_dir        = 
$root_download_dir = 
$enforce_ip        = 
$retry_mins        = 
$download_attempts_allowed = 

$data = { php freaks... read info about file }
// these error numbers come from download_product constants
if ( $data === false ) {
	exit_w_msg( "error=681" );
}

// read the product
$dp = new download_product();
$dp->read( $data );

// need the order guid and the customer name.
$data = { php freaks ... get customer name and order name }
$order_guid = 
$oid        = 
$cname      = 

//  php freaks... this is where the log shows we reach this code twice
//	common::debug_to_log( $dp );
//	common::debug_to_log( $download_attempts_allowed );

// check status
if ( $dp->status >= $download_attempts_allowed ) {
	exit_w_msg( "id=$order_guid&error=" . download_product::ERR_ALREADY );
}

// if an attempt has already been made on the file, verify the download rules
if ( !empty( $dp->attempt_time ) ) {	
	if ( $enforce_ip && $_SERVER[ 'REMOTE_ADDR' ] != $dp->ip_address ) {
		exit_w_msg( "id=$order_guid&error=" . download_product::ERR_WRONG_IP );
	}
}

// find the source file to copy. we copy to a temporary directory
// for safety; it's apparently better to stream from a directory that
// doesn't actually exist except for temporarily. plus we may
// alter the file before we stream it.
$source_file = $source_dir . $dp->filename;

if ( !file_exists( $source_file ) ) {
	exit_w_msg( "id=$order_guid&error=" . download_product::ERR_NO_SOURCE );
}

// find out where we need to copy the files.
if ( !file_exists( $root_download_dir ) ) {
	exit_w_msg( "id=$order_guid&error=" . download_product::ERR_NO_DIR );
}

// the guid folder should NOT already exist. it should be deleted from
// a previous download attempt. but if it DOES exist, just go ahead and
// use it.
if ( !file_exists( $root_download_dir . $guid ) ) {
	// now create the new directory where we will copy the file
	$dir_made = mkdir( $root_download_dir . $guid );
	
	if ( !$dir_made ) {
		exit_w_msg( "id=$order_guid&error=" . download_product::ERR_DIR_CREATE );
	}
}

// so create the filename, and then copy the file.
$filename = $root_download_dir . $guid . "/" . $dp->filename;

// that last line isn't always right. if the dp filename is
// a subfolder/filename, remove the subfolder.
if ( strpos( $dp->filename, "/" ) !== false ) {	
	$just_filename = basename( $source_file );
	$filename = $root_download_dir . $guid . "/" . $just_filename;
}

$copy_ok = copy( $source_file, $filename );

if ( !$copy_ok ) {
	exit_w_msg( "id=$order_guid&error=" . download_product::ERR_FILE_COPY );
}

// check for existence. would be odd, wouldn't it?
if ( !file_exists( $filename ) ) {
	exit_w_msg( "id=$order_guid&error=" . download_product::ERR_NO_FILE );
}

//
////////////////////////////////// all of the below is up for debate
//

//header( "Pragma: public" ); // required
//header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
//header("Cache-Control: private",false); // required for certain browsers 

header( "Content-Description: File Transfer" );

/*
header('Content-Transfer-Encoding: binary'); // test
header('Expires: 0'); // test
header('Cache-Control: must-revalidate'); // test
header('Pragma: public'); // test
header( "Content-Type: application/octet-stream"); // test
*/

header( "Content-Type: application/octet-stream" );  // commented. test
header( "Content-Length: " . filesize( $filename ) );
header( 'Content-Disposition: attachment; filename="' . $dp->download_filename . '"' );

ob_clean(); // test
flush(); // test

readfile( $filename );
//
///////////////////////////////////// all of this is up for debate. ^^^^^^^^^^
//


// we're done streaming. do some cleanup.
unlink( $filename );
$rm_dir_ok = rmdir( $root_download_dir . $guid );

// now update the download status:
$dp->status = $dp->status + 1;
$dp->ip_address = $_SERVER[ 'REMOTE_ADDR' ];
$d = date( 'Y-m-d H:i:s' );

if ( empty( $dp->attempt_time ) )
	$dp->attempt_time = $d;
	
$dp->update();

// this function logs ALL attempts (ip, time) on a file. security.
$dp->log_attempt( $_SERVER[ 'REMOTE_ADDR' ], $d, $guid );


function exit_w_msg( $msg ) {
	header( "Location:download.php?" . $msg );
	exit;
}

?>

Edited January 3, 2014 by timneu22

LLLLLLL · January 3, 2014

MDCode wrote: "https://code.google.com/p/android/issues/detail?id=1978

As stated about half way down, it is an intended part of the android system. It has nothing to do with not working on one server. Not sure why you are not experiencing it on another server, but it is intended. You will have to rewrite your code to comply with android."

I don't think this has anything to do with the current topic.

Edited January 3, 2014 by timneu22

MDCode · January 3, 2014

LLLLLLL wrote: "I don't think this has anything to do with the current topic."

Just because it is cgi doesn't mean it has nothing to do with it. The issue they stated was that it loads the URL twice.

Later down, it was posted that android intended for that to happen. One request to see if it is a download. If it is, send another request to the download manager.
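If the device really does send a probe request followed by a second request from the download manager, one possible adjustment is to count an attempt only when a GET request delivered the whole file. The sketch below is an assumption, not a confirmed fix; should_count_attempt() is a hypothetical helper, and detecting an aborted connection from PHP is not always reliable.

```php
<?php
// Hypothetical rule: only count a download attempt when a GET request
// streamed the full file and the client did not disconnect.
// $bytesSent is the return value of readfile(): an int, or false on failure.
function should_count_attempt(string $method, $bytesSent, int $fileSize, bool $aborted): bool
{
    if (strtoupper($method) !== 'GET') {
        return false; // e.g. a HEAD probe
    }
    if ($aborted) {
        return false; // client cancelled before the transfer finished
    }
    return is_int($bytesSent) && $bytesSent === $fileSize;
}

// Usage (assumes $filename and $dp as in the thread):
// $size = filesize($filename);
// $sent = readfile($filename);
// if (should_count_attempt($_SERVER['REQUEST_METHOD'], $sent, $size, connection_aborted() === 1)) {
//     $dp->status = $dp->status + 1;
// }
```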
