
How to convert dynamic pages from an old site to a new site (static pages?)


hippypink


I have a site that was built with some custom coding, and I no longer have access to that code.

The pages are all in this structure:

 

index.php?filename=123

 

I want to copy those pages over and keep their existing file structure. I thought about copying the output HTML pages (saved as, say, file123.html) and then doing something with PHP to read each of those files somehow. I don't really know, and maybe there's something easier, but I do like having a separate file for each page on the site.
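A sketch of that "read each saved file with PHP" idea — this assumes the pages were saved as file123.html in a pages/ folder next to the script; the folder name, the file-naming scheme, and the front.php name are my assumptions, not something from the old site:

```php
<?php
// front.php (hypothetical name): serves a saved static page for URLs
// like front.php?filename=123, assuming the pages were saved as
// pages/file123.html.

function page_path(string $dir, string $id): ?string
{
    // Accept only digits, so values like "../../etc/passwd" are rejected.
    if (!ctype_digit($id)) {
        return null;
    }
    $path = $dir . "/file" . $id . ".html";
    return is_file($path) ? $path : null;
}

$path = page_path(__DIR__ . "/pages", $_GET["filename"] ?? "");

if ($path === null) {
    echo "Page not found";
} else {
    readfile($path);
}
```

This keeps the old index.php?filename=123 URLs working while each page lives in its own static file, which matches what you said you wanted.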

 

I know a very tiny bit of PHP, as I have recently begun to teach myself programming; but I can surely copy/paste all day long!

 

Any ideas?

The problem with that is that the page URLs contain a question mark ("?"), and that is not workable as a filename.

For example, HTTrack can spider the site and download the pages, but it changes all the filenames.

This is because Windows will not accept question marks in filenames (on Linux they are technically legal, but awkward to work with), which is why I am stuck.
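One way around the question-mark problem (a sketch only, not tested against the real site): let the spider save the files on Linux, where "?" in a filename is legal, then rename them to a Windows-safe scheme. The target naming here (file123.html) is my assumption:

```php
<?php
// rename_pages.php (hypothetical): renames files a spider saved as
// "index.php?filename=123" to "file123.html" in the current directory.

function target_name(string $saved): ?string
{
    if (preg_match('/^index\.php\?filename=(\d+)$/', $saved, $m)) {
        return "file" . $m[1] . ".html";
    }
    return null; // not one of the dynamic pages; leave it alone
}

foreach (scandir(".") as $entry) {
    $new = target_name($entry);
    if ($new !== null) {
        rename($entry, $new);
        echo "$entry -> $new\n";
    }
}
```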

Found a few tutorials (haven't read them all yet, so I'm not sure which ones are good), and this one seems like it might work:

 

http://everything2.com/title/How+to+replicate+a+dynamic+website+quickly+without+the+source+code+or+database

 

Will update shortly.

Here's the full instructions:

 

=======================================

 

Step 1:

 

Make a new folder to store the new web pages you are about to copy. On a Linux computer this will usually be somewhere within /var/www/html. Change to that directory.

 

Step 2:

 

Use wget to copy the website over to your computer.

 

wget -r -t5 http://foo.net/ -o download.log

This means: recursively download everything you can find on foo.net; if a fetch fails, keep trying up to 5 times; and record all the progress in a file called "download.log".

 

Step 3:

 

Set up your Apache virtual hosts file and your local DNS server (or /etc/hosts file) so that you can see the website you have just copied at a convenient URL on your computer. This makes it easy to find the copied website and to test the next step.
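For example (a sketch only — the server name and paths are placeholders, not from the thread), you might drop something like this into your Apache configuration and point the name at localhost in /etc/hosts:

```apache
# Example virtual host -- names and paths are placeholders.
<VirtualHost *:80>
    ServerName oldsite.local
    DocumentRoot /var/www/html/oldsite

    # AllowOverride All is needed so the .htaccess in step 4 is honoured.
    <Directory /var/www/html/oldsite>
        AllowOverride All
    </Directory>
</VirtualHost>
```

and in /etc/hosts: 127.0.0.1 oldsite.local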

 

Step 4:

 

In any folder where you can find HTML pages, add a .htaccess file that looks something like this:

#Beginning
RewriteEngine on
Options +FollowSymlinks

# Skip the handler itself so the rule doesn't rewrite its own target in a loop.
RewriteCond %{REQUEST_FILENAME} !page\.php$
RewriteRule (.+) page.php
#End

This says: if any page other than "/" (the default page) is requested from this directory, rather than attempting to display the specified page, just run a script called page.php. The RewriteCond excludes page.php itself from the rule.

 

Page.php needs to be something like this:

<?php

// This is the folder from which all relative URLs are derived.
// No trailing slash: REQUEST_URI already starts with "/".
$base_path = "/where/to/find/your/page";

// This is the filename to retrieve. REQUEST_URI includes the query
// string, which is part of the saved filename here, so keep all of it.
if (empty($_SERVER["REQUEST_URI"]))
    {
    die("No file to get");
    }
$file = $base_path . urldecode($_SERVER["REQUEST_URI"]);

// Refuse paths that try to climb out of the base directory.
if (strpos($file, "..") !== false)
    {
    die("Bad path");
    }

// Comment out this next line once you have it working, it's a security risk.
echo "<!-- This content was read from: " . $file . " -->";

if (!is_file($file))
    {
    die("File not found");
    }

// readfile() streams the whole file; use fread() with a limit instead
// if you want to cap pages at, say, 100 KB.
readfile($file);
?>

Step 5: Restart your apache server.

 

On a Red Hat Linux box, run:

 

/etc/init.d/httpd restart

Job done!

 

 

Archived

This topic is now archived and is closed to further replies.
