Hi,

I found the code below on a forum. Apparently it lets you grab content from any website, provided that you enter the starting and ending HTML points. I understand that this may be difficult and maybe pointless, but I am interested: would it be possible to modify it so the administrator enters a website and then selects the section they want, almost like cropping a website? If so, where should I start? Which parts do I change? Is there any existing code that may help me out?

Thanks

<?php

// Mini-Fetch - Remote Content Retrieval System

// $theLocation is the page to fetch; in this case, a tournaments page on www.nespaintball.com.
$theLocation="http://www.nespaintball.com/pages/tournaments.php?tid=3";
//Below, at $startingpoint and $endingpoint, you'll enter the start and finish points in the remote HTML.

$startingpoint = "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">"; // replace inside the quotes with your unique start point in the source of the HTML page. It HAS to be unique.
$endingpoint = "</body>"; // replace with the unique finish point in the source of the HTML page 
//Don't forget to escape any " marks with a \ mark.
// Example: If the starting HTML is: <img src="images/something.jpg">
// You would tell Mini-Fetch: $startingpoint = "<img src=\"images/something.jpg\">";

//That's probably all you need to edit, unless you want to match and replace certain text or HTML.

// - "Don't touch this part..."
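preg_match("/^(https?:\/\/)?([^\/]*)(.*)/i", $theLocation, $matches);
$scheme = ($matches[1] !== "") ? $matches[1] : "http://"; // keep https:// when the URL uses it
$theDomain = $scheme . $matches[2];
$page = $matches[3];

$fd = fopen($theDomain.$page, "r"); // needs allow_url_fopen enabled; can change to "rb", on NT/2000 servers, if problems.
if ($fd === false) {
die("Could not open " . $theDomain.$page);
}
$value = "";
while(!feof($fd)){
$value .= fread($fd, 4096); 
}
fclose($fd);
$start = strpos($value, $startingpoint);
if ($start === false) {
die("Starting point not found in the remote page.");
}
$finish = strpos($value, $endingpoint, $start); // look for the end only after the start
if ($finish === false) {
die("Ending point not found in the remote page.");
}
$length = $finish-$start;
$value = substr($value, $start, $length);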
// end "don't touch this part"


// preg_replace with the /i flag, below, is a case-insensitive way to find, match, and replace variations of text that you define. (It stands in for eregi_replace, which was removed in PHP 7.)
//The following commands strip or replace HTML tags. 
//To NOT strip a certain HTML tag, add // before the line in question.
// the "", before the $value at the end of the line means replace the tag with an empty string, which effectively deletes the tag.

// $value = preg_replace( "/<img src=[^>]*>/i", "", $value ); // Remove all image tags. This is disabled until you remove the // in front of this line.
$value = preg_replace( "/<IMG alt=[^>]*>/i", "", $value ); // Remove all image alt="whatever" tags
$value = preg_replace( "/<class[^>]*>/i", "", $value ); // Remove all variations of <class> tags.
//$value = preg_replace( "/<table[^>]*>/i", "", $value ); // Remove ALL variations of <table> tags.
//$value = preg_replace( "/<tr[^>]*>/i", "", $value ); // Remove all variations of <tr> tags.
//$value = preg_replace( "/<td[^>]*>/i", "", $value ); // Remove all variations of <td> tags.
$value = preg_replace( "/Signed up teams[^>]*>/i", "", $value );



// Below - what's the difference, you ask, between a regex replace (eregi_replace, or preg_replace on current PHP) and str_replace?
// str_replace is faster, by a long shot... The catch is that it can only be used
// to replace EXACT value matches, as you see below, and doesn't work well in huge files without using arrays.
$value = str_replace( "</font>", "", $value ); // Remove closing </font> tags.
//$value = str_replace( "</table>", "", $value ); // Remove closing </table> tags.
//$value = str_replace( "</tr>", "", $value ); // Remove closing </tr> tags.
//$value = str_replace( "</td>", "", $value ); // Remove closing </td> tags.
//$value = str_replace( "<center>", "", $value ); // Remove <center> tag...
//$value = str_replace( "</center>", "", $value ); // ...alignment calls.
$value = str_replace( "<b>", "", $value ); // Remove <b> tags.
$value = str_replace( "</b>", "", $value ); // Remove closing </b> tags...
//$value = str_replace( "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">", "<table align=\"center\" border=\"0\" cellpadding=\"4\" cellspacing=\"1\" class=\"alt1\" width=\"100%\">", $value );
$value = str_replace( "<td>No</td>", "", $value );
$value = str_replace( "<td style=\"font:12px Arial,sans-serif;color:#fff;\"><b>PAID</b></td>", "", $value );
$value = str_replace( "<td>No</td>", "", $value );
$value = str_replace( "<a href=", "<a", $value );
$value = str_replace( "<table style=\"width:100%;padding:4px;border:1px solid #000;background:#7e661c;color:#fff;\">", "<table>", $value );
$value = str_replace( "</body>", "", $value );
$value = str_replace( "<td style=\"font:12px Arial,sans-serif;color:#fff;\">", "", $value );
$value = str_replace( "PAID", "", $value );
$value = str_replace( "<td colspan=\"3\" style=\"font:12px Arial,sans-serif;color:#fff;\"></td>", "", $value );
$value = str_replace( "</td>DIV</td>", "", $value );
$value = str_replace( "TEAM NAME</td>", "", $value ); 



// More tags. Just take out the // in front and edit as you like.
//$value = preg_replace( "/Competitors name/i", "", $value ); // Remove certain text...
//$value = preg_replace( "/<javascript[^>]*>/i", "", $value ); //remove javascripts
//$value = preg_replace( "/<script[^>]*>/i", "", $value ); //remove scripts

// Make fetched links open in a new window. One pass covers both href=... and href="...";
// running a second replace for the quoted form would add target="_blank" twice.
$value = preg_replace( "/href=/i", "target=\"_blank\" href=", $value ); 

$donstart = "<table class=\"tborder\" width=\"175\"><tr><td class=\"alt1\">";

$donend = "</td></tr></table>";

$FinalOutput = preg_replace("/(href=\"?)(\/[^\"\/]+)/", "\\1" . $theDomain . "\\2", $value);

echo $donstart ;
echo $FinalOutput ; //prints it to your page
echo $donend ;

flush (); //force output to your page faster

?>
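
One possible first step toward the modification asked about here is to pull the start/end extraction out into a function, so the two markers can come from a form or a selection tool instead of being hard-coded. The following is only a minimal sketch under that assumption: `extractBetween` and the sample markup are hypothetical and not part of Mini-Fetch.

```php
<?php
// Hypothetical helper (not part of Mini-Fetch): returns the slice of $html that
// starts at $startMarker and stops just before $endMarker, or null if either
// marker is missing.
function extractBetween(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    // Search for the end marker only after the start marker.
    $finish = strpos($html, $endMarker, $start + strlen($startMarker));
    if ($finish === false) {
        return null;
    }
    return substr($html, $start, $finish - $start);
}

// Made-up sample markup, standing in for a fetched page.
$sample = '<p>intro</p><table><tr><td>Team A</td></tr></table></body>';
echo extractBetween($sample, '<table>', '</body>');
```

Returning null instead of calling die() leaves it to the caller to decide what to show when a marker is not found.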

https://forums.phpfreaks.com/topic/84855-is-this-possible-and-how/


No!

So you're telling me that if you enter http://www.phpfreaks.com and select the parts you want, the site will crop itself to that!? You can't do that.

If that's not what you mean, please elaborate. :)

 

No, what I want to do is modify this script, which lets you put sections of any site on your own site. Instead of entering the start and end HTML points, I want a selector that you drag over the section you want, and then you can use that section.

Thanks


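1. Republishing another site's content without permission can be copyright infringement.

2. PHP is a server-side language. "Dragging over a selection" happens client-side, in the browser, so you'll need another language such as JavaScript for that part. Loading the page's HTML is something PHP can do, as the script above already does with fopen().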
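
To make that split concrete: the browser-side selection could send back some identifier for the chosen element, for example an XPath expression, and PHP could then pull that element out of the fetched HTML. Below is a rough sketch of the PHP half only, assuming the client side supplies the XPath and that PHP's DOM extension is enabled. `extractNode` and the sample markup are hypothetical names for illustration.

```php
<?php
// Hypothetical helper: returns the outer HTML of the first node matching
// $xpath in $html, or null if nothing matches.
function extractNode(string $html, string $xpath): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $doc->saveHTML($nodes->item(0));
}

// Made-up sample markup, standing in for a fetched page.
$sample = '<html><body><p>intro</p><table id="teams"><tr><td>Team A</td></tr></table></body></html>';
echo extractNode($sample, '//table[@id="teams"]');
```

Compared with matching raw start and end strings, selecting by element keeps working when attributes or whitespace on the remote page change slightly, though it still breaks if the page structure itself changes.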
