sangoku Posted March 31, 2010
Hi, I am making a script that should go through the files in a directory and do some stuff with them, but the script has an execution time limit. I want to be able to continue where I left off. How can I tell the file handler which file to start the directory reading from?
ignace Posted March 31, 2010
set_time_limit(0);
// ..read directory
// ..do stuff
exit(0);
Psycho Posted March 31, 2010
ignace's solution should allow the script to run until completion. But if this is on a hosted server, your host may have other means in place to limit execution time, and/or the host may not take kindly to you running memory-intensive scripts for long periods.

If that solution does not work for you, here is another option which I have used in processing a lot of files (e.g. scanning my mp3 collection to read the metadata of every file). I created a two-part process. The first step was to create a script to read all the folders and subfolders in the root of the music directory and put them into an array. At the end I would run a single query to create a scan queue. The second step was a script which would get the next folder in the queue and read/process the files. I used AJAX on the "processing" page to keep calling the script until the process was complete. This also allowed me to create a progress bar.

If you have a large number of files in a single directory, you could still do something similar to have x number of files processed on each call to the script.
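The "x number of files on each call" idea can be sketched roughly like this. It is only an illustration under assumptions: `takeBatch` is a hypothetical helper name, and the queue is a plain in-memory array standing in for the database scan queue described in the post.

```php
<?php
// Hypothetical helper: split the first $batchSize items off the front of a queue.
// Returns [items to process now, items still waiting].
function takeBatch(array $queue, int $batchSize): array
{
    $batch = array_slice($queue, 0, $batchSize);
    $remaining = array_slice($queue, $batchSize);
    return [$batch, $remaining];
}

// Each loop pass stands in for one AJAX call to the processing script.
$queue = ['a.mp3', 'b.mp3', 'c.mp3', 'd.mp3', 'e.mp3'];
$total = count($queue);
$calls = 0;
while ($queue) {
    [$batch, $queue] = takeBatch($queue, 2);
    // ... process the files in $batch here ...
    $calls++;
    // Value a progress bar could display after this call.
    $percentDone = (int) round(100 * ($total - count($queue)) / $total);
}
```

In a real setup the remaining queue would have to be persisted between requests (for example in a database table, as in the post), since each call starts with a fresh script.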
ignace Posted March 31, 2010
"The first step was to create a script to read all the folders and subfolders in the root of the music directory and put them into an array."

This does assume your script will finish reading all directories without hitting the memory_limit or the time_limit.
sangoku Posted March 31, 2010 Author
I only need a means of storing the file pointer.... I am comparing the file names against a DB and making some manipulations against it. I ONLY need a way to store the current location, nothing more, nothing less.

The only way I can think of is to store the whole dir content in an array, then loop through it and store where I stopped..... any other solutions?

And I have already written the rest, the manipulation and other stuff, looping through files, bla bla bla ... that is not difficult. I just don't want to have to store such GIANT arrays in sessions.... because of the server limits.... is there a way to store the file pointer?
Psycho Posted March 31, 2010
"This does assume your script will be finished reading all directories without hitting the memory_limit or the time_limit"

Yes. I was able to read several thousand folders and add them to the queue.

"I only need a means of storing the file pointer.... I am comparing the file names against a DB and making some manipulations against it. I ONLY need a way to store the current location, nothing more, nothing less. The only way I can think of is to store the whole dir content in an array, then loop through it and store where I stopped..... any other solutions?"

If you are processing against an array, I suppose you could iterate through the array using something like this:

foreach($filesList as $idx => $value)
{
  $_SESSION['idx'] = $idx;
  //Do something with the value
}

But that is terribly inefficient, since you have to rebuild the array each time the page loads. Plus, it is problematic if any folders change between page loads. You could probably avoid that by storing the folder path in the session and doing an array_search to find the last record processed.

I have a feeling you may be making this more complicated than it needs to be. Without knowing exactly what "manipulations" you are doing I can't say for sure, but it's possible you could run a single query using the array of filenames.
sangoku Posted March 31, 2010 Author
nope I'm not XD

The thing is this: I am making a cleanup script for a friend who has a forum with a GIANT mass of attachments. But his forum engine has leaks, and sometimes when he deletes stuff it crashes etc... files get out of sync, and he now has a mass of files that are not in his DB. He wants me to move them into a folder or delete them. Now I have his attachment folder, and I loop through it and see if each file is listed in the DB, and if not I move/delete it. The thing is, I don't want to load the server, so I added a sleep function and a low priority select query, which results in giant script execution times... now I need a way to store the point where I stopped the comparison when the execution time ends...

I could load the file names into a .ini file and parse it on load.... that would be faster I think, but I am not sure how I should store the whole folder in it... this one has over 4TB of attachments in it.... I am wondering, is it possible to open a file and then go to the next one?

Or I could use stream_get_contents to store the remaining files in the file....
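A directory handle is a resource and cannot be kept in a session, so one possible way to "store the position" is to remember only the name of the last file handled and, on the next run, skip every name that sorts at or before it. The following is a minimal sketch under assumptions, not the thread's solution: `nextFiles`, `loadCursor` and `saveCursor` are hypothetical helper names, names are compared as plain byte strings, and the directory listing is still re-read on every run (only the cursor is persisted, not the listing).

```php
<?php
// Return up to $limit regular files in $dir whose names sort after $lastName.
// Pass null as $lastName to start from the beginning.
function nextFiles(string $dir, ?string $lastName, int $limit): array
{
    $names = scandir($dir, SCANDIR_SORT_NONE);
    if ($names === false) {
        return [];
    }
    // Sort explicitly as strings so the order matches the strcmp() check below.
    sort($names, SORT_STRING);

    $batch = [];
    foreach ($names as $name) {
        if ($lastName !== null && strcmp($name, $lastName) <= 0) {
            continue; // already handled in an earlier run
        }
        // is_file() needs the full path, not just the bare name.
        if (!is_file($dir . DIRECTORY_SEPARATOR . $name)) {
            continue; // skips ".", ".." and subdirectories
        }
        $batch[] = $name;
        if (count($batch) >= $limit) {
            break;
        }
    }
    return $batch;
}

// Read the saved cursor (last processed file name), or null if none exists yet.
function loadCursor(string $stateFile): ?string
{
    if (!is_file($stateFile)) {
        return null;
    }
    $name = file_get_contents($stateFile);
    return ($name === false || $name === '') ? null : $name;
}

// Persist the name of the last processed file.
function saveCursor(string $stateFile, string $name): void
{
    file_put_contents($stateFile, $name, LOCK_EX);
}
```

Each run would call `loadCursor`, fetch a batch with `nextFiles`, handle each file, and call `saveCursor` after each one, so an interrupted run resumes right after the last completed file. Because the cursor is a name rather than an index, files that are moved or deleted between runs do not shift the position.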
sangoku Posted March 31, 2010 Author
nope..... stream_get_contents does not work... it should but it does not.... QQ
Psycho Posted March 31, 2010
Ok, if I understand you correctly, you are trying to identify all the files which are no longer associated with existing forum posts. I am guessing that you are doing a db query against each file individually to see if the file is still listed in the db. If so, running many individual queries is very inefficient. Are you also generating an array of all the physical files each time? There is definitely a more efficient method. Here is what comes to mind:

1. Get an array of ALL the physical files on the server using glob()
2. Run a single query to get ALL the file names that are in the database
3. Use the db results to generate an array of valid files
4. Use array_diff() to generate an array of ALL the files that do not exist in the database
5. Run whatever process you need to against the values in the list of invalid files. If this list is too long to complete in a single run, then add the array of invalid files to a temporary table where you can process x number of records at a time.

There is no need to manually check each file individually. Example code:

<?php

//Get array of ALL files on the server
$filesOnServer = glob("path/to/files/*.*");

//Create array of ALL files in the DB
$filesInDB = array();
$query = "SELECT filename FROM post_files";
$result = mysql_query($query) or die(mysql_error());
while($record = mysql_fetch_assoc($result))
{
  $filesInDB[] = $record['filename'];
}

//Create array of ALL files that exist on server
//but do not exist in DB
$filesNotInDB = array_diff($filesOnServer, $filesInDB);

?>
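One detail worth checking in the comparison step: glob() returns paths that include the directory part of the pattern, so if the database column holds bare file names, the two lists need to be normalised before they are compared. A small sketch of that comparison, assuming bare names in the database; `findOrphans` is a hypothetical helper name, and the lookup table built with array_flip() is a swapped-in alternative to array_diff() that keeps each check cheap on very large lists:

```php
<?php
// Return the on-disk paths whose base name does not appear in the DB list.
// Assumption: $namesInDb holds bare file names (no directory part).
function findOrphans(array $pathsOnDisk, array $namesInDb): array
{
    // name => index map, so each membership check is a single key lookup.
    $known = array_flip($namesInDb);

    $orphans = [];
    foreach ($pathsOnDisk as $path) {
        if (!isset($known[basename($path)])) {
            $orphans[] = $path;
        }
    }
    return $orphans;
}

// Example with made-up names: only b.zip is missing from the DB list.
$orphans = findOrphans(
    ['att/a.zip', 'att/b.zip', 'att/c.zip'],
    ['a.zip', 'c.zip']
);
```

If the database stores full paths instead, the basename() call should be dropped so that both sides use the same form.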
sangoku Posted March 31, 2010 Author
nice idea, but I have something more advanced in mind. Here is my current version of the file; the only problem now is how to store the data read so far.... and I think I should store the data of the scandir somewhere too... I am thinking of the ini file as temp storage.... what do you think??? yeah I know the code is pretty messy....

<?php
  /**
  * @copyright at ©sinisaculic@gmail.com, all rights reserverd 2009-2010
  * @author Siniša Čulić
  * @version 1.0
  * @created 31-mar-2010 13:51:17
  */
  //////////change this pasword to your personal one////
  $password = 'password1';
  //////////change this pasword to your personal one////
  /**
  *
  */
  $comparingValues = array();
  $counter;
  if ($password =='password'){
    echo '<div class="alert">change the pasword!!!!!!</div>';
    return false;
  }
  session_name('cleanup');
  if(isset($_SESSION['cleanup_in_progress'])){
  }else{
    if(!isset( $_POST['DBname'],$_POST['host'],$_POST['DBusername'],$_POST['DBpassword'],$_POST['directory'],$_POST['pagePassword'],$_POST['exectutionTime'],$_POST['delayTime'],$_POST['attachment'])){
      $_SESSION['cleanup_in_progress'] = true;
      $_SESSION['start_time'] = time();
      $_SESSION['execution_time'] = $_POST['delayTime'];
      $dir=$_POST['directory'];
      if($_POST['mesurment'] =='minutes'){
        $time = intval($_POST['exectutionTime']) * 60;
      }else{
        $time = intval($_POST['exectutionTime']);
      }
      $partitoning = $_POST['partitioning'];
      $_SESSION['partitioning'] = $partitoning;
      set_time_limit($time);
      $dir = $_POST['directory'];
      if(is_dir($dir)){
        $dir_content = scandir($dir);
        $link = mysql_connect($_POST['host'],$_POST['DBusername'],$_POST['DBpassword']) or die('<div class="alert">cant connect to mysql!!!</div>');
        mysql_select_db($_POST['DBname'],$link) or die ('<div class="alert">incorect DB name</div>');
        if ($_POST['attachment'] !==''){
          $name = $_POST['attachment'] ;
        }else{
          $name = 'attachment';
        }
        foreach($dir_content as $file){
          if(is_file($file)){
            if(bufferQuerry($file)){
              /// need to put a storage way in here !!!!!!!!
            }else{
              $_SESSION['eror'] = '<div class="alert">there was an folder reading eror</div>';
              echo'<div class="alert">there was an folder reading eror</div>';
              die;
            }
          }
        }
      }
    }
  }
  function bufferQuerry($file){
    global $comparingValues;
    global $counter;
    global $partitoning;
    if ($partitoning >= $counter){
      $sql ="SELECT `filename` FROM `attachment` WHERE";
      foreach($comparingValues as $name){
        $sql.= " `filename` = $name or ";
      }
      trim($sql,' or');
      $result = mysql_query($sql,$link);
      if ($result){
       return checkResultsOfTheQuery($result);
      }else return false;
    } else {
      $counter++ ;
      $comparingValues[] = $file;
      return true;
    }
  }
  function checkResultsOfTheQuery($result){
    global $comparingValues;
    if(!$result) return false;
    foreach (mysql_fetch_array($result) as $row){
      $result[] = $row[0];
    }
    $diference = array_diff($result,$comparingValues);
    if($diference){
      return MissingFIle($diference);
    } else return true;
    $comparingValues = array();
  }
  function MissingFIle($files){
    $option = $_POST['delete'];
    $moveFolder = $_POST['movingFolder'];
    $originalFolder = $_POST['directory'];
    if(!is_dir($moveFolder)){
      $_SESSION['eror'] ='<div class="alert">The specified rouge folder is not a folder</div>';
      echo '<div class="alert">The specified rouge folder is not a folder</div>';
      die;
    }
    if(!is_writable($moveFolder)){
      $_SESSION['eror'] ='<div class="alert">cant write to the rouge files folder</div>';
      echo '<div class="alert">cant write to the rouge files folder</div>';
      die;
    }
    if($option == 'move'){
      foreach ($files as $filename){
        rename($originalFolder.'/'.$filename, $moveFolder.'/'.$filename);
      }
    }else{
      foreach ($files as $filename){
        unlink($originalFolder.'/'.$filename);
      }
    }
  }
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
  <head>
    <title>Parameter setings</title>
    <link rel="stylesheet" type="text/css" href="my.css">
  </head>
  <body>
    <div class="data" <?php
        if(isset($_SESSION['cleanup_in_progress'])){
          echo "style='display:none'";
        }
      ?>
      >
      <form action="" method="post" >
        <div class="database data">
          Host name <input type="text" name="host" value="enter host name" size="45" ><br/>
          Database name <input type="text" name="DBname" value="Enter the DB name" size="45" ><br/>
          Database username <input type="text" name="DBusername" value="Enteher here your DB username" size="45" ><br/>
          Database password <input type="password" name="DBpassword" size="45" ><br/>
          I you changed the name of the table attachment then put its name here:<br/>
          <input type="text" size="10" name="attachment" value=""><br/>
          Else LEAVE the field empty!!!<br/><br/>
          Upload Directorry<br/> <input type="text" name="directory" value="Enter here the directory you want to be cleaned" size="45" ><br/>
        </div>
        Database Page Paswoord <br/><input type="password" class="pagePassword" name="pagePassword" size="45" ><br/>
        <div class="options">
          Script execution time <input type="text" name="exectutionTime" value="10" size="10" ><br/>
          <input type="radio" name="mesurment" value="seconds" checked="checked">in seconds<br/>
          <input type="radio" name="mesurment" value="minutes">in minutes<br/>
          Script delay time <input type="text" name="delayTime" value="in miliseconds" size="45" ><br/>
          select if you want to delete or just move the invalid attachments to a folder<br/>
          <input type="radio" name="delete" value="delete" checked="checked">delete
          <input type="radio" name="delete" value="move"> move <br/>
          folder for rougue files: <input type="text" name="movingFolder"><br/>
          input the SQL partitoning size<input type="text" name="partitioning" value="10"><br/>
          this value mesures how many files at once will be compared against the Database
          <br/>
          <br/>
        </div>
        <input type="submit" name="submit" value="exectute the script!" class="bt_register" />
      </form>
    </div>
    <div class="status" <?php
        if(!isset($_SESSION['cleanup_in_progress'])){
          echo "style='display:none'";
        }
      ?>
      >
      Remaining time of the cleanup: <br/>
      Duplicate files so far found are:<br/>
    </div>
  </body>
</html>

Do you have any ideas?
Psycho Posted March 31, 2010
That is exactly the problem I was referring to. You can make that much more efficient by not running a query for each file individually.
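If the file names are checked in batches rather than all at once, one query per batch can replace one query per file. A hedged sketch of building such a query with placeholders instead of pasting names into the SQL string; `buildBatchQuery` is a hypothetical helper name, the `attachment` table and `filename` column are taken from the script posted above, and the PDO usage shown in the trailing comment assumes an already connected `$pdo` object:

```php
<?php
// Build a single parameterized SELECT for a batch of file names.
// Returns [sql string, list of values to bind].
function buildBatchQuery(array $names): array
{
    if ($names === []) {
        throw new InvalidArgumentException('The batch must contain at least one name.');
    }
    // One "?" placeholder per name, e.g. "?, ?, ?" for three names.
    $placeholders = implode(', ', array_fill(0, count($names), '?'));
    $sql = "SELECT `filename` FROM `attachment` WHERE `filename` IN ($placeholders)";
    return [$sql, array_values($names)];
}

[$sql, $params] = buildBatchQuery(['a.zip', 'b.zip', 'c.zip']);

// With a connected PDO instance (assumed, not created here) this would run as:
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
//   $found = $stmt->fetchAll(PDO::FETCH_COLUMN);
// Names in the batch that are absent from $found are not in the database.
```

Binding the names as parameters also avoids the quoting problems that come from concatenating raw file names into the WHERE clause.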