
optimizing is_dir()


nobodyk


I have approximately 50k-100k files in a directory. I'm running a script to check whether each file is used by the DB and, if not, delete it. The problem is that a quick test on a directory of just 1k files dies. Is there a way to optimize it? I know the script works; it just takes too long to run, even with just 1k files. I'm pretty sure it's the is_dir that's taking its sweet time. Any ideas?

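<?php
require_once 'db_connect.php';

$default_dir = "storage/2011/";

if (!($dp = opendir($default_dir))) die("Cannot open $default_dir.");

// Compare against false so a file named "0" does not end the loop early.
while (false !== ($file = readdir($dp)))
{
    // readdir() returns bare names, so test the full path, not just $file.
    if (is_dir($default_dir.$file))
    {
        continue;
    }
    else if ($file != '.' && $file != '..')
    {
        // Escape the name so a quote in a filename cannot break the query.
        $name = mysql_real_escape_string($file);
        $query = "SELECT * FROM images WHERE filename = '".$name."' OR thumbname = '".$name."'";
        $dbResult = mysql_query($query);
        $num_rows = mysql_num_rows($dbResult);
        if ($num_rows == 0) {
            unlink($default_dir.$file);
            echo $file."<br />";
        }
    }
}
closedir($dp);
?>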
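
A side note on the directory test: readdir() returns bare entry names, so is_dir() or is_file() only checks the intended entry when the directory is prefixed to the name. The following is a minimal sketch of that idea, not the poster's code; the helper name list_plain_files is hypothetical.

```php
<?php
// Hypothetical helper (not from the original post): returns the names of
// regular files located directly inside $dir.
function list_plain_files($dir)
{
    $dir = rtrim($dir, '/') . '/';
    $names = array();
    $dp = opendir($dir);
    if ($dp === false) {
        return $names;
    }
    // Strict comparison so an entry literally named "0" does not end the loop.
    while (false !== ($entry = readdir($dp))) {
        // Prefix the directory: a bare name would be resolved against the
        // script's working directory instead of $dir.
        if (is_file($dir . $entry)) {
            $names[] = $entry;
        }
    }
    closedir($dp);
    return $names;
}
```

Using is_file() on the full path also skips '.' and '..' without a separate check, since both are directories.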

 

Link to comment
https://forums.phpfreaks.com/topic/228578-optimizing-is_dir/
Share on other sites

Yeah, I have access to the server, though I'm not much of an expert. I just know the basics. The web seems easier for me :/

What do you mean by making a list? I thought I already made one.

The script scans the dir and cross-references it against the DB; if a file is not used in the DB, it gets deleted. I also added "LIMIT 1" to the query, but that didn't seem to help much.



As you have discovered, you are executing a new query for each file. The best thing to do would be to execute one query, save the results, then loop through those results and cross reference your files.
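
One way to picture this is the rough sketch below. It assumes the rows have already been fetched from a single "SELECT filename, thumbname FROM images" query into a plain array; the function names build_used_set and find_unused are made up for illustration. Storing every referenced name as an array key means each file is checked with one isset() call instead of a scan over all rows.

```php
<?php
// Build a set of every name the database references. $rows is assumed to be
// the fetched result of one "SELECT filename, thumbname FROM images" query.
function build_used_set(array $rows)
{
    $used = array();
    foreach ($rows as $row) {
        // Names become array keys, so later lookups are hash lookups.
        $used[$row['filename']] = true;
        $used[$row['thumbname']] = true;
    }
    return $used;
}

// Return the entries of $files that do not appear in the set.
function find_unused(array $files, array $used)
{
    $unused = array();
    foreach ($files as $file) {
        if (!isset($used[$file])) {
            $unused[] = $file;
        }
    }
    return $unused;
}
```

The list returned by find_unused() would then be the candidates for unlink(); printing them first, as a dry run, is a cheap safety check.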


OK, so I did what you recommended and it works better than the old one. I tried this on 1k files and it took about 3 minutes to run. I'm pretty sure it could be optimized; I just don't know what I can trim. This time MySQL didn't take a hit, but my PHP process was at 100%.

Any room for improvement?

<?php
require_once 'db_connect.php';

$default_dir = "storage/2011/";

// Declare
$query = mysql_query("SELECT filename, thumbname FROM images");
$values_filename = array();
$values_thumbname = array();
$value_files = array();
$counter = 0;
$file_counter = 0;
$found = false;

// Store DB info in arrays (array keys must be quoted strings)
while ($row = mysql_fetch_array($query)) {
    $values_filename[$counter] = $row['filename'];
    $values_thumbname[$counter] = $row['thumbname'];
    $counter++;
}

if (!($dp = opendir($default_dir))) die("Cannot open $default_dir.");

// Store dir files in array
while (false !== ($file = readdir($dp)))
{
    // readdir() returns bare names, so test the full path, not just $file.
    if (is_dir($default_dir.$file))
    {
        continue;
    }
    else if ($file != '.' && $file != '..')
    {
        $value_files[$file_counter] = $file;
        $file_counter++;
    } // end elseif
} // end while
closedir($dp);

// Process files
for ($index = 0; $file_counter > $index; $index += 1) {
    for ($nest_index = 0; $counter > $nest_index; $nest_index += 1) {
        if (($value_files[$index] === $values_filename[$nest_index]) || ($value_files[$index] === $values_thumbname[$nest_index])) {
            $found = true;
            break; // exit inner loop
        }
    }
    if (!$found) echo $value_files[$index]."<br />"; // this is where the delete code goes. Just testing for now.
    $found = false; // reset it for next file.
}
?>
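
The nested loops above compare every file against every row, so the work grows with files × rows. As a hedged alternative, and assuming the three arrays are built exactly as in the script, PHP's built-in array_diff() can produce the same list in one call. The wrapper name unreferenced_files is hypothetical.

```php
<?php
// Hypothetical helper: files on disk whose name matches neither column.
function unreferenced_files(array $files, array $filenames, array $thumbnames)
{
    // array_diff() returns the values of $files that appear in none of the
    // other arrays. It preserves the keys of $files, so reindex the result.
    return array_values(array_diff($files, $filenames, $thumbnames));
}
```

With the script's variables this would be called as unreferenced_files($value_files, $values_filename, $values_thumbname).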
