
Reading large remote xml file


seavers


Hi,

 

I'm attempting to read a large XML file that is hosted remotely; I'm guessing it's around 250MB in total. I have attempted to use xml_reader to parse the data, but that fails with "XML Error: invalid end of document" at around the 65MB mark.

 

So, I attempted to fread the file and write it locally, but that fails at about the same point (65MB).

 

Here's my simple fread code:

 

$handle = fopen("http://www.example/largefile.xml", "r");
$contents = '';
if ($handle) {
    while (!feof($handle)) {
        // set execution time no limit
        set_time_limit(0);
        $contents .= fread($handle, 8192);
    }
    fclose($handle);
    // now write the accumulated contents to a local file
    set_time_limit(0);
    $fp = fopen('xmlinfo.txt', 'w');
    fwrite($fp, $contents);
    fclose($fp);
}

 

Should I be able to read and write the whole content of a very large file?

 

If not, what is the best way to get large remote xml files to process locally?
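For what it's worth, I also wondered whether a straight stream copy would behave any differently, something along these lines (an untested sketch; the URL and output filename are just placeholders):

<?php
// Untested sketch: let PHP copy the remote stream straight to a local file,
// so the whole ~250MB never has to be held in a string in memory.
set_time_limit(0);

$in  = fopen('http://www.example.com/largefile.xml', 'rb');
$out = fopen('xmlinfo.txt', 'wb');

if ($in && $out) {
    stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
}
?>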

 

Help is much appreciated.

 

James


hi,

 

looks to me like this is a php.ini configuration issue...

 

I'm not quite sure which parameter is the culprit here, but you might want to start by looking at "memory_limit".

 

In both cases, xml_reader and fread, I think PHP tries to load the whole file into memory...

 

Also, pay attention to the "max_execution_time" parameter...
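For instance (just a rough sketch, the values here are placeholders and not from the original post), you could try raising both at the top of the script and then echoing what is actually in effect, since some hosts ignore runtime overrides for these settings:

<?php
// Rough sketch: try to raise the limits at runtime and report what actually took effect.
// The values below are placeholders; some hosts lock these settings down and ignore ini_set().
ini_set('memory_limit', '512M');
ini_set('max_execution_time', '0');   // 0 = no limit, equivalent to set_time_limit(0)

echo 'memory_limit: ',       ini_get('memory_limit'), "\n";
echo 'max_execution_time: ', ini_get('max_execution_time'), "\n";
?>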


Hi,

 

Adjusting the memory_limit and execution time seemed to have no effect.

 

I decided not to use fread and instead used fgets. I thought this would read the remote file in one-line chunks and therefore avoid any memory errors. Here's my new code:

 

<?php

$handle = fopen("http://www.example.com/largefile.xml", "r");

if ($handle) {
    while (!feof($handle)) {
        // set execution time no limit
        set_time_limit(0);
        // read one line from the remote file
        $contents = fgets($handle, 4096);
        // append it to the local file
        $fp = fopen('c:\\users\\workjames\\xmlinfo.txt', 'a');
        fwrite($fp, $contents);
        fclose($fp);
        flush();
    }

    fclose($handle);
}

?>

However, this still only returns a portion of the remote file, a similar amount to before.

 

I don't understand how the memory is being used up (it's set to 128M); can anyone enlighten me? Could it be the size of the file that I'm opening to write the data to?
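For reference, I'm thinking of adding some logging inside the loop to see whether memory really is the problem, roughly like this (untested sketch, same placeholder URL and output path as above):

<?php
// Untested diagnostic sketch: log bytes written and PHP's memory use as the loop runs,
// to see whether the 128M memory_limit is actually being approached or the stream just stops early.
$handle = fopen("http://www.example.com/largefile.xml", "r");
$bytes  = 0;
$lines  = 0;

if ($handle) {
    $fp = fopen('xmlinfo.txt', 'a');
    while (!feof($handle)) {
        set_time_limit(0);
        $line = fgets($handle, 4096);
        if ($line === false) {
            break;
        }
        $bytes += strlen($line);
        fwrite($fp, $line);

        // every 1000 lines, note how far we got and how much memory PHP is really using
        if (++$lines % 1000 === 0) {
            error_log(sprintf("lines=%d bytes=%d mem=%d peak=%d",
                $lines, $bytes, memory_get_usage(true), memory_get_peak_usage(true)));
        }
    }
    fclose($fp);
    fclose($handle);
}
?>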

 

Kind regards,

 

James


  • 2 weeks later...

I have tried to read the remote file line by line in PHP using fread, but it still only returns a portion of the file.

 

I have tried using cURL, but again only a portion of the remote file is returned.

 

I think it may be a time limit set at the remote end, as I have tested further by reading a larger file from one of my own remote servers without any problem.
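If it is a time limit at the remote end, one thing I'm considering (a rough, untested sketch, assuming the remote server honours HTTP range requests; the URL and output path are placeholders) is resuming the download from wherever it stopped across repeated cURL calls:

<?php
// Untested sketch: keep re-requesting the file from the point it last stopped,
// which only works if the remote server supports HTTP Range / resumed transfers.
$url  = 'http://www.example.com/largefile.xml';
$dest = 'xmlinfo.txt';

set_time_limit(0);

do {
    clearstatcache();
    $already = file_exists($dest) ? filesize($dest) : 0;

    $fp = fopen($dest, 'ab');                      // append to whatever we already have
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);           // write the response body straight to the file
    curl_setopt($ch, CURLOPT_RESUME_FROM, $already);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    clearstatcache();

    // retry only if the transfer was cut short but we still made some progress
} while (!$ok && filesize($dest) > $already);
?>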

 

So, before I go back to the client and tell them I cannot access the file for long enough to download it, am I missing a quicker method to access and read the file?

