The problem:

Three years ago I built a very big site, over 200 PHP files.

I started with myself and one other coder, then hired another coder to do the things I could not do.

Because of the different coders, I know there are files that are no longer in use and are just taking up space.

 

Is it possible in PHP to scan the site, follow every link in it, and print a list of the files that are actually used? Doing it one file at a time would take too long, because each file links to 3-4 other PHP files and some of them are cross-linked.

 

Or could someone suggest something? I've searched high and low, and all I come up with is link checkers, which do not work the way I need.

 

I could really use a point in the right direction.

 

I've looked into sitemapping programs and tried a few. The problem is that some files are not visible unless you are logged in, and even then some files I know are in use did not show up, such as little function files and includes.

 

That's what led me to post here in search of help.

 

But thanks for trying.

I am pretty sure there is no existing utility. It would require reading the source files and determining all referenced files, whether they are referenced through the file system or through URLs. The utility would need to take into account relative and absolute URLs and all the possible methods of referencing a file: href links, header(), include, require, the _once versions of those, file(), fopen(), file_get_contents()...

 

You would be better off making a copy of the site on a development system with full PHP error reporting turned on, starting with just the known files (from a site map), and then going through all possible links to see what errors occur.
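
A minimal sketch of what "full error reporting" could look like on the development copy. The log file location is only an assumption for illustration; the same settings can also be made in php.ini.

```php
<?php
// Development copy only: report every notice, warning and error.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Also write errors to a log, so warnings from missing includes can be
// reviewed after clicking through the site.
ini_set('log_errors', '1');
ini_set('error_log', __DIR__ . '/php-errors.log'); // hypothetical log location
```

With this in place a missing include shows up as a warning and a missing require as a fatal error, which tells you which file has to be copied back.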

You may want to use PHP to crawl through your site.

 

First, use a recursive directory reader to build a list of every PHP file in the project folder.
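
One way to sketch the recursive directory reader is with PHP's SPL iterators. The function name listPhpFiles() is made up for this example, and the type declarations assume PHP 7 or later.

```php
<?php
/**
 * Return every .php file under $root, as paths relative to $root.
 */
function listPhpFiles(string $root): array
{
    $root = rtrim($root, '/\\');
    $files = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            // Keep paths relative to the project root, with forward slashes.
            $relative = substr($file->getPathname(), strlen($root) + 1);
            $files[] = str_replace('\\', '/', $relative);
        }
    }

    sort($files);
    return $files;
}
```

Calling listPhpFiles('/path/to/site') would give you the "every PHP file" list to compare against later.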

 

Then, starting with your main page, loop through every line of code and use regex to pull out strings containing '.php'. Use this to build another array... you may have to edit the results by hand to replace paths that are built from variables.
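
A rough sketch of that scan, under some loud assumptions: the function names are invented here, the regex only catches quoted strings ending in .php (optionally followed by a query string), and every reference is resolved relative to the directory of the file it appears in. Real includes can also resolve through include_path or the working directory, and paths built from variables are missed entirely, so treat the result as a starting list to check by hand.

```php
<?php
/** Pull out every quoted string in $source that names a .php file. */
function findPhpReferences(string $source): array
{
    preg_match_all('/[\'"]([^\'"\s?#]+\.php)(?:[?#][^\'"]*)?[\'"]/i', $source, $matches);
    return array_values(array_unique($matches[1]));
}

/** Collapse "." and ".." segments; null if the path would leave the root. */
function normalizePath(string $path): ?string
{
    $parts = [];
    foreach (explode('/', str_replace('\\', '/', $path)) as $part) {
        if ($part === '' || $part === '.') {
            continue;
        }
        if ($part === '..') {
            if (!$parts) {
                return null;
            }
            array_pop($parts);
        } else {
            $parts[] = $part;
        }
    }
    return implode('/', $parts);
}

/** Follow references from $entry (relative to $root) and list every file reached. */
function crawlReferences(string $root, string $entry): array
{
    $root = rtrim($root, '/\\');
    $seen = [$entry => true];
    $queue = [$entry];

    while ($queue) {
        $current = array_shift($queue);
        $source = file_get_contents("$root/$current");

        foreach (findPhpReferences($source) as $ref) {
            // Assumption: a leading slash means "from the site root"; anything
            // else is relative to the referring file's directory.
            $base = $ref[0] === '/' ? '' : dirname($current);
            $target = normalizePath($base . '/' . $ref);

            if ($target !== null && !isset($seen[$target]) && is_file("$root/$target")) {
                $seen[$target] = true;
                $queue[] = $target;
            }
        }
    }

    return array_keys($seen);
}
```

Because the crawl reads the source files directly, it also sees pages that are only reachable when logged in, which is where the sitemap tools fell short.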

 

Compare the arrays, delete any files you don't find. Browse your site and hit every link. If you run into an error, restore the file causing it.
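
The comparison itself can be sketched with array_diff(). The sample arrays below are made up, and instead of deleting outright this sketch moves the candidates into a separate folder (the quarantine() helper is hypothetical), which makes restoring a file a simple move back.

```php
<?php
// Hypothetical sample data standing in for the two lists described above.
$allFiles  = ['about.php', 'inc/db.php', 'index.php', 'old/report.php'];
$usedFiles = ['index.php', 'about.php', 'inc/db.php'];

// Files on disk that were never referenced.
$unused = array_values(array_diff($allFiles, $usedFiles));

/**
 * Move each candidate into $quarantineDir, keeping its relative path,
 * so it can be put back if the site breaks.
 */
function quarantine(string $root, string $quarantineDir, array $unused): void
{
    foreach ($unused as $relative) {
        $target = "$quarantineDir/$relative";
        if (!is_dir(dirname($target))) {
            mkdir(dirname($target), 0777, true);
        }
        rename("$root/$relative", $target);
    }
}
```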

 

Good luck :)

I'd say discomatt's idea is the best to go with. Perhaps store the results in a database and then query it for files not matched against the original file list? So, a table called "original_filelist" and one called "used_files"... as you come across a reference to a file, enter it into "used_files"... then obviously any files in "original_filelist" without a match in "used_files" are unused. Simple?
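
As a sketch of the two-table idea, here is a version using PDO with an in-memory SQLite database. That choice is only an assumption to keep the example self-contained (it needs the pdo_sqlite extension); MySQL or anything else would work the same way. The single "path" column and the sample rows are made up.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE original_filelist (path TEXT PRIMARY KEY)');
$db->exec('CREATE TABLE used_files (path TEXT PRIMARY KEY)');

// INSERT OR IGNORE (SQLite syntax) skips a file that was already recorded.
$insertOriginal = $db->prepare('INSERT OR IGNORE INTO original_filelist (path) VALUES (?)');
$insertUsed     = $db->prepare('INSERT OR IGNORE INTO used_files (path) VALUES (?)');

// Hypothetical sample data.
foreach (['index.php', 'about.php', 'old/report.php'] as $path) {
    $insertOriginal->execute([$path]);
}
foreach (['index.php', 'about.php', 'index.php'] as $path) {
    $insertUsed->execute([$path]);
}

// Files in original_filelist with no match in used_files.
$unused = $db->query(
    'SELECT o.path
       FROM original_filelist o
       LEFT JOIN used_files u ON u.path = o.path
      WHERE u.path IS NULL
      ORDER BY o.path'
)->fetchAll(PDO::FETCH_COLUMN);
```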

 

Adam
