
How to convert dynamic pages from an old site to a new site (static pages?)


hippypink


I have a site that was built with some custom coding, and I no longer have access to that code.

The pages are all in this structure:

 

index.php?filename=123

 

I want to copy those pages over and keep their existing file structure. I thought about copying the output HTML pages (saved as, say, file123.html) and then doing something with PHP to read each of those files somehow. I don't really know, and maybe there's something easier, but I do like having a separate file for each page on the site.
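A sketch of that "read each saved file with PHP" idea — this assumes the pages were saved as file123.html in a pages/ folder next to the script; the folder name, the file-naming scheme, and the front.php name are my assumptions, not something from the old site:

```php
<?php
// front.php (hypothetical name): serves a saved static page for URLs
// like front.php?filename=123, assuming the pages were saved as
// pages/file123.html.

function page_path(string $dir, string $id): ?string
{
    // Accept only digits, so values like "../../etc/passwd" are rejected.
    if (!ctype_digit($id)) {
        return null;
    }
    $path = $dir . "/file" . $id . ".html";
    return is_file($path) ? $path : null;
}

$path = page_path(__DIR__ . "/pages", $_GET["filename"] ?? "");

if ($path === null) {
    echo "Page not found";
} else {
    readfile($path);
}
```

This keeps the old index.php?filename=123 URLs working while each page lives in its own static file, which matches what you said you wanted.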

 

I know a very tiny bit of PHP, as I have recently begun to teach myself programming; but I can surely copy/paste all day long!

 

Any ideas?

The problem with that is that the page URLs contain a question mark ("?"), and that is not workable as a filename.

For example, HTTrack can spider the site and download the pages, but it changes all the filenames.

This is because Windows will not accept question marks in filenames (on Linux they are technically legal, but awkward to work with), which is why I am stuck.
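One way around the question-mark problem (a sketch only, not tested against the real site): let the spider save the files on Linux, where "?" in a filename is legal, then rename them to a Windows-safe scheme. The target naming here (file123.html) is my assumption:

```php
<?php
// rename_pages.php (hypothetical): renames files a spider saved as
// "index.php?filename=123" to "file123.html" in the current directory.

function target_name(string $saved): ?string
{
    if (preg_match('/^index\.php\?filename=(\d+)$/', $saved, $m)) {
        return "file" . $m[1] . ".html";
    }
    return null; // not one of the dynamic pages; leave it alone
}

foreach (scandir(".") as $entry) {
    $new = target_name($entry);
    if ($new !== null) {
        rename($entry, $new);
        echo "$entry -> $new\n";
    }
}
```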

Found a few tutorials (haven't read them all yet, so I'm not sure which ones are good), and this one seems like it might work:

 

http://everything2.com/title/How+to+replicate+a+dynamic+website+quickly+without+the+source+code+or+database

 

Will update shortly.

Here's the full instructions:

 

=======================================

 

Step 1:

 

Make a new folder to store the new web pages you are about to copy. On a Linux computer this will usually be somewhere within /var/www/html. Change to that directory.

 

Step 2:

 

Use wget to copy the website over to your computer.

 

wget -r -t5 http://foo.net/ -o download.log

This means: recursively download everything you can find on foo.net; if a fetch fails, keep trying up to 5 times; and record all the progress in a file called "download.log".

 

Step 3:

 

Set up your Apache virtual hosts file and your local DNS server (or /etc/hosts file) so that you can see the website you have just copied at a convenient URL on your computer. This makes it easy to find the copied website and to test the next step.
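For example (a sketch only — the server name and paths are placeholders, not from the thread), you might drop something like this into your Apache configuration and point the name at localhost in /etc/hosts:

```apache
# Example virtual host -- names and paths are placeholders.
<VirtualHost *:80>
    ServerName oldsite.local
    DocumentRoot /var/www/html/oldsite

    # AllowOverride All is needed so the .htaccess in step 4 is honoured.
    <Directory /var/www/html/oldsite>
        AllowOverride All
    </Directory>
</VirtualHost>
```

and in /etc/hosts: 127.0.0.1 oldsite.local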

 

Step 4:

 

In any folder where you can find HTML pages, add a .htaccess file that looks something like this:

#Beginning
RewriteEngine on
Options +FollowSymlinks

# Skip the handler itself so the rule doesn't rewrite its own target in a loop.
RewriteCond %{REQUEST_FILENAME} !page\.php$
RewriteRule (.+) page.php
#End

This says: if any page other than "/" (the default page) is requested from this directory, rather than attempting to display the specified page, just run a script called page.php. The RewriteCond excludes page.php itself from the rule.

 

Page.php needs to be something like this:

<?php

// This is the folder from which all relative URLs are derived.
// No trailing slash: REQUEST_URI already starts with "/".
$base_path = "/where/to/find/your/page";

// This is the filename to retrieve. REQUEST_URI includes the query
// string, which is part of the saved filename here, so keep all of it.
if (empty($_SERVER["REQUEST_URI"]))
    {
    die("No file to get");
    }
$file = $base_path . urldecode($_SERVER["REQUEST_URI"]);

// Refuse paths that try to climb out of the base directory.
if (strpos($file, "..") !== false)
    {
    die("Bad path");
    }

// Comment out this next line once you have it working, it's a security risk.
echo "<!-- This content was read from: " . $file . " -->";

if (!is_file($file))
    {
    die("File not found");
    }

// readfile() streams the whole file; use fread() with a limit instead
// if you want to cap pages at, say, 100 KB.
readfile($file);
?>

Step 5: Restart your apache server.

 

On a Red Hat Linux box, run:

 

/etc/init.d/httpd restart

Job done!

 

 

Archived

This topic is now archived and is closed to further replies.
