
Is this possible and how?


ojsimon


Hi

I found the code below on a forum. Apparently it lets you grab content from any website, provided that you enter the starting and ending HTML points. I understand that this may be difficult and maybe pointless, but I am interested: would it be possible to modify it so that the administrator enters a website and then selects the section they want, almost like cropping a website? If so, where should I start? Which parts do I change? Is there any existing code that may help me out?

 

Thanks

<?php

// Mini-Fetch - Remote Content Retrieval System

//In this case, it fetches a tournament page from www.nespaintball.com.
$theLocation="http://www.nespaintball.com/pages/tournaments.php?tid=3";
//Below, at $startingpoint and $endingpoint, you'll enter the start and finish points in the remote HTML.

$startingpoint = "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">"; // replace inside the quotes with your unique start point in the source of the HTML page. It HAS to be unique.
$endingpoint = "</body>"; // replace with the unique finish point in the source of the HTML page 
//Don't forget to escape any " marks with a \ mark.
// Example: If the starting HTML is: <img src="images/something.jpg">
// You would tell Mini-Fetch: $startingpoint = "<img src=\"images/something.jpg\">";

//That's probably all you need to edit, unless you want to match and replace certain text or HTML.

// - "Don't touch this part..."
preg_match("/^(https?:\/\/)?([^\/]*)(.*)/i", "$theLocation", $matches);
$theDomain = "http://" . $matches[2];
$page = $matches[3];

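$fd = fopen($theDomain.$page, "r"); // needs allow_url_fopen; can change to "rb", on NT/2000 servers, if problems.
if ($fd === false) {
    die("Could not open " . $theDomain . $page);
}
$value = "";
while (!feof($fd)) {
    $value .= fread($fd, 4096);
}
fclose($fd);
$start = strpos($value, $startingpoint);
// Look for the end point only after the start point; stop if either one is missing.
$finish = ($start === false) ? false : strpos($value, $endingpoint, $start);
if ($start === false || $finish === false) {
    die("Start or end point not found in the remote page.");
}
$length = $finish - $start;
$value = substr($value, $start, $length);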
// end "don't touch this part"


// preg_replace with the /i flag, below, is a case-insensitive way to find, match, and replace variations of text that you define.
// (It stands in for eregi_replace, which was removed in PHP 7.)
//The following commands strip or replace HTML tags. 
//To NOT strip a certain HTML tag, add // before the line in question.
// the "", before the $value at the end of the line means replace the tag with an empty string, which effectively deletes the tag.

// $value = preg_replace( "/<img src=[^>]*>/i", "", $value ); // Remove all image tags. This is disabled until you remove the // in front of this line.
$value = preg_replace( "/<IMG alt=[^>]*>/i", "", $value ); // Remove all image alt="whatever" tags
$value = preg_replace( "/<class[^>]*>/i", "", $value ); // Remove all variations of <class> tags.
//$value = preg_replace( "/<table[^>]*>/i", "", $value ); // Remove ALL variations of <table> tags.
//$value = preg_replace( "/<tr[^>]*>/i", "", $value ); // Remove all variations of <tr> tags.
//$value = preg_replace( "/<td[^>]*>/i", "", $value ); // Remove all variations of <td> tags.
$value = preg_replace( "/Signed up teams[^>]*>/i", "", $value );



// Below - what's the difference, you ask, between a regex replace (preg_replace, formerly eregi_replace) and str_replace?
// str_replace is faster, by a long shot... The catch is that it can only be used
// to replace EXACT value matches, as you see below, and doesn't work well in huge files without using arrays.
$value = str_replace( "</font>", "", $value ); // Remove closing </font> tags.
//$value = str_replace( "</table>", "", $value ); // Remove closing </table> tags.
//$value = str_replace( "</tr>", "", $value ); // Remove closing </tr> tags.
//$value = str_replace( "</td>", "", $value ); // Remove closing </td> tags.
//$value = str_replace( "<center>", "", $value ); // Remove <center> tag...
//$value = str_replace( "</center>", "", $value ); // ...alignment calls.
$value = str_replace( "<b>", "", $value ); // Remove <b> tags.
$value = str_replace( "</b>", "", $value ); // Remove closing </b> tags...
//$value = str_replace( "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">", "<table align=\"center\" border=\"0\" cellpadding=\"4\" cellspacing=\"1\" class=\"alt1\" width=\"100%\">", $value );
$value = str_replace( "<td>No</td>", "", $value );
$value = str_replace( "<td style=\"font:12px Arial,sans-serif;color:#fff;\"><b>PAID</b></td>", "", $value );
$value = str_replace( "<td>No</td>", "", $value );
$value = str_replace( "<a href=", "<a", $value );
$value = str_replace( "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">", "<table>", $value );
$value = str_replace( "</body>", "", $value );
$value = str_replace( "<td style=\"font:12px Arial,sans-serif;color:#fff;\">", "", $value );
$value = str_replace( "PAID", "", $value );
$value = str_replace( "<td colspan=\"3\" style=\"font:12px Arial,sans-serif;color:#fff;\"></td>", "", $value );
$value = str_replace( "</td>DIV</td>", "", $value );
$value = str_replace( "TEAM NAME</td>", "", $value ); 



// More tags. Just take out the // in front and edit as you like.
//$value = preg_replace( "/Competitors name/i", "", $value ); // Remove certain text...
//$value = preg_replace( "/<javascript[^>]*>/i", "", $value ); //remove javascripts
//$value = preg_replace( "/<script[^>]*>/i", "", $value ); //remove scripts

// Open fetched links in a new window. One case-insensitive pass covers href= with or
// without quotes; running two passes would add target="_blank" twice.
// (preg_replace with the /i flag stands in for eregi_replace, which was removed in PHP 7.)
$value = preg_replace( "/href=/i", "target=\"_blank\" href=", $value ); 

$donstart = "<table class=\"tborder\" width=\"175\"><tr><td class=\"alt1\">";

$donend = "</td></tr></table>";

$FinalOutput = preg_replace("/(href=\"?)(\/[^\"\/]+)/", "\\1" . $theDomain . "\\2", $value);

echo $donstart ;
echo $FinalOutput ; //prints it to your page
echo $donend ;

flush (); //force output to your page faster

?>
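The core of the script above is the slice between the start and end points. A minimal sketch of that step as a reusable function might look like the following. The name `extractBetween` and the sample HTML are made up for illustration and are not part of Mini-Fetch.

```php
<?php
// Hypothetical helper (not part of the original Mini-Fetch script).
// Returns the part of $html that begins at $startMarker and stops just
// before $endMarker, or null when either marker cannot be found.
function extractBetween(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    // Only look for the end marker after the start marker.
    $finish = strpos($html, $endMarker, $start + strlen($startMarker));
    if ($finish === false) {
        return null;
    }
    return substr($html, $start, $finish - $start);
}

// Made-up sample page, so the sketch runs without a network request.
$sample = '<html><body><p>intro</p><table id="t"><tr><td>Team A</td></tr></table></body></html>';
echo extractBetween($sample, '<table id="t">', '</body>');
```

Returning null instead of slicing blindly means a changed remote page shows up as a missing section rather than as garbage output.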

https://forums.phpfreaks.com/topic/84855-is-this-possible-and-how/


No!

 

So you're telling me that if you enter http://www.phpfreaks.com and select the parts you want, the site will crop itself to that!? You can't do that.

 

If that's not what you mean, please elaborate. :)

 

No, what I want to do is modify this script, which lets you put sections of any site on your own site. Instead of entering the start and end HTML points, I want a selector that you drag over the section you want, and then you can use that section.

 

thanks


1. Republishing another site's content without permission may be copyright infringement.

2. PHP is a server-side language. "Dragging over a selection" is client-side, so you'll need a client-side language such as JavaScript for that part. Loading the page's HTML is something PHP can do, as the script above already does with fopen().
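
For the server-side half, here is a rough sketch of how PHP could pull out one element once the client side has told it which element was picked. It assumes the selection arrives as an XPath expression, which is only one possible design. The function name `extractByXPath` and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: returns the outer HTML of the first node that
// matches $xpath in $html, or null when nothing matches.
function extractByXPath(string $html, string $xpath): ?string
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $doc->saveHTML($nodes->item(0));
}

// Made-up sample page, so the sketch runs without a network request.
// In the setup discussed here, the XPath would be sent by the client-side selector.
$sample = '<html><body><p>intro</p><div id="teams"><p>Team A</p></div></body></html>';
echo extractByXPath($sample, '//div[@id="teams"]');
```

Working on the parsed DOM avoids depending on an exact, unique start string, which breaks as soon as the remote site changes a style attribute.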
