[SOLVED] merge files


drifter

Recommended Posts

OK, I have 2 files that are | delimited - one is about 26MB and holds image names, the other is 48MB and holds records.

Each record has a code that corresponds to a line in the image names file (each file is about 40,000 lines, and they do NOT match 1 to 1).

From the records file
data|1234|more|data|other|stuff...........................

from the image file
1234|image|names|here

So I currently start by looping through the image file, exploding each line, and writing it to an image array:
$photoarray[$id]['pic1']=$lineelement[5];
$photoarray[$id]['pic2']=$lineelement[7];

So I get this whole 26 MB file in an array

Then I loop through the records file and match each line with the right element in the array.
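A minimal sketch of the array-based approach described above (the filenames, the position of the id field, and the column indices 5 and 7 are placeholders taken from the snippet - adjust them to the real file layout):

[code]
<?php
// Build the whole image file into an in-memory array keyed by code.
$photoarray = array();
$fh = fopen('images.txt', 'r'); // hypothetical filename
while (($line = fgets($fh)) !== false) {
    $lineelement = explode('|', rtrim($line, "\r\n"));
    $id = $lineelement[0]; // assumed: code is the first field
    $photoarray[$id]['pic1'] = isset($lineelement[5]) ? $lineelement[5] : '';
    $photoarray[$id]['pic2'] = isset($lineelement[7]) ? $lineelement[7] : '';
}
fclose($fh);

// Then each record line is matched against the array by its code.
$record = explode('|', 'data|1234|more|data|other|stuff');
$code = $record[1];
if (isset($photoarray[$code])) {
    // ...merge $record with $photoarray[$code]['pic1'], ['pic2']...
}
[/code]

The memory cost here is the whole 26MB file (plus PHP's per-element array overhead, which is several times the raw data size), which is what causes the problem described next.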

Now the problem - this is killing my memory. The script runs at 50-60% memory, and any bump in traffic makes things back up, compound, and crash.

I am calling unset() on everything that I use - every photo element that has already been matched is unset just to save memory...

So are there any other ways of doing this?

Are there any good ways to scan the image file rather than saving it in an array?

Just as a note: I use this to slow things down when the server gets busy, but sleep() only frees up CPU, not memory.

[code]
$load = sys_getloadavg();
if ($load[0] > 4) {
    echo "Busy server - sleeping 30 seconds<br>";
    sleep(30);
} elseif ($load[0] > 2) {
    echo "Busy server - sleeping 5 seconds<br>";
    sleep(5);
} elseif (time_nanosleep(0, 30000000) === true) {
    // default throttle: 30,000,000 nanoseconds = 0.03 seconds
    echo "Slept for 0.03 seconds.<br>";
}
[/code]

Load one line into an array to search with. Then loop through the entire other file -- overwriting the same array until you find what you're looking for.

Works great if only one line in each file matches one line in the other. If multiple matches are possible, you might have to make the array multi-dimensional, but it should still work.

It'll be slow, though. Want faster? MySQL was built for relational data.
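The streaming approach suggested above can be sketched like this - hold only one record line in memory and rescan the image file for each one (filenames and the code's field position are assumptions):

[code]
<?php
$records = fopen('records.txt', 'r'); // hypothetical filename
while (($recLine = fgets($records)) !== false) {
    $record = explode('|', rtrim($recLine, "\r\n"));
    $code = $record[1]; // assumed: code is the second field

    // Rescan the image file for this one record's code.
    $images = fopen('images.txt', 'r'); // hypothetical filename
    while (($imgLine = fgets($images)) !== false) {
        if (strpos($imgLine, $code . '|') === 0) {
            $image = explode('|', rtrim($imgLine, "\r\n"));
            // ...merge $record and $image here...
            break; // stop early if only one match is possible
        }
    }
    fclose($images);
}
fclose($records);
[/code]

Memory use stays at one line per file regardless of file size; the trade-off is the nested scan discussed below.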

Do you think there are performance problems with looping through a 40,000-line file 40,000 times, though? I really have no idea. Although on average the line I am seeking would be at line 20,000, so I would only be looping through an average of 20,000 lines 40,000 times (not counting records that have no match).

And I do not care about speed - this is a background process - I care about memory and CPU usage

Hey - I switched it to the loop-through-the-file approach and I think I like it. I am amazed how fast it can loop through that many lines and take a substr to find a match.

I see the CPU usage is way up, but my memory is down to about 10%. I have that code in there to make the script sleep whenever there is traffic, so I am not worried so much about the CPU.
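The substr match mentioned here can be done by comparing only the leading code field instead of exploding every candidate line - a sketch, with the filename and code value as placeholders:

[code]
<?php
$code = '1234';
$needle = $code . '|'; // include the delimiter so '123' doesn't match '1234'
$len = strlen($needle);

$fh = fopen('images.txt', 'r'); // hypothetical filename
while (($line = fgets($fh)) !== false) {
    if (substr($line, 0, $len) === $needle) {
        // found the matching image line; explode it only now
        $image = explode('|', rtrim($line, "\r\n"));
        break;
    }
}
fclose($fh);
[/code]

Skipping explode() on non-matching lines is what keeps the scan fast: most lines are rejected after comparing only a few leading bytes.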

Thanks
