
What Is Hindering the Page Fetch With cURL?


phpsane


Fellow Php'ers,

 

I'm a PHP learner, a beginner, but enthusiastic enough to try to build my own web proxy.

Here is how far I have got.

 

The following script uses cURL. It shows a text box where you type a URL, and cURL then fetches that URL.

 

1. The fetched page may contain links like these:

 

<a href="http://www.google.com">Google</a>

<a href="http://yahoo.com">Yahoo</a>

 

This script is supposed to prepend 'proxified_page_test.php?url_to_proxify=' to all links.

So the links on the proxified page (the page fetched by cURL) should look like this:

 

<a href="proxified_page_test.php?url_to_proxify=http://www.google.com">Google</a>

<a href="proxified_page_test.php?url_to_proxify=http://yahoo.com">Yahoo</a>

 

2. The fetched page may also contain a search box, like the Google search box, with form code like this:

"....action = http://google.com/q?"

Again, this script is supposed to prepend 'proxified_page_test.php?url_to_proxify=' to all links, including the action of forms that forward you to their processor.php.

So the form actions on the proxified page (the page fetched by cURL) should look like this:

"....action = proxified_page_test.php?url_to_proxify=http://google.com/q?"

That way, if you were viewing Google through the proxified page and did a search, the SERPs presented would list links prefixed with "proxified_page_test.php?url_to_proxify=", so that the result links are also proxified when clicked.
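The link rewriting described above could be sketched with PHP's DOMDocument instead of plain string replacement. This is only a minimal sketch: proxify_url() and proxify_html() are hypothetical helper names, only absolute http(s) URLs in a href and img src are rewritten, and the target URL is passed through urlencode(), so the result looks slightly different from the plain examples above. Form actions are left out here because a GET form needs extra handling.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: wraps an absolute URL in the proxy script's query string.
function proxify_url(string $url, string $proxy = 'proxified_page_test.php'): string
{
    return $proxy . '?url_to_proxify=' . urlencode($url);
}

// Hypothetical helper: rewrites <a href> and <img src> values that hold
// absolute http(s) URLs. Relative URLs are left untouched in this sketch.
function proxify_html(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $targets = ['a' => 'href', 'img' => 'src'];
    foreach ($targets as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $element) {
            $value = $element->getAttribute($attribute);
            if (preg_match('#^https?://#i', $value) === 1) {
                $element->setAttribute($attribute, proxify_url($value));
            }
        }
    }

    // saveHTML() returns a full document (doctype, html and body wrappers).
    return (string) $doc->saveHTML();
}

echo proxify_html('<a href="http://www.google.com">Google</a>');
```

On the proxy side, PHP decodes the value automatically, so $_GET['url_to_proxify'] would again hold the original URL.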

 

ISSUE

The problem is that if you type "http://www.google.com", the script does not fetch that page. What is hindering the fetch?

 

 

<?php
 
/*
ERROR HANDLING
*/
declare(strict_types=1);
ini_set('display_errors', '1');
ini_set('display_startup_errors', '1');
error_reporting(E_ALL);
 
/* STEP 2:
The IF gets triggered as soon as the "submit" button is clicked in the ui text box labeled: Url
Following IF code deals with GET method.
*/
 
if(isset($_GET["url_to_proxify"]) === TRUE)
{
    echo "IF got triggered!";
    $url_to_proxify = filter_input(INPUT_GET, 'url_to_proxify', FILTER_VALIDATE_URL);

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "$url_to_proxify");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 5);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $curl_result = curl_exec($ch);

    $domain = parse_url($url_to_proxify, PHP_URL_HOST);
    echo var_dump($domain);

    //Add proxy link on all links present on proxified page
    $pattern = array(
        "http://",
        "https://",
        "http://www.",
        "https://www.",
        "localhost"
    );
    $replace = array(
        "proxified_page_test.php?url_to_proxify=http://\".$domain\"",
        "proxified_page_test.php?url_to_proxify=https://\".$domain\"",
        "proxified_page_test.php?url_to_proxify=http://www.\".$domain\"",
        "proxified_page_test.php?url_to_proxify=https://www.\".$domain\"",
        "proxified_page_test.php?url_to_proxify=http://www.\".$domain\""
    );
    $string_replaced_data = str_replace($pattern, $replace, $curl_result);
    echo var_dump($string_replaced_data);

    //Add proxy link on all Image Links (Eg. Google Img File)
    $pattern = array(
        'src="',
        'src = "',
        'src= "',
        'src ="',
        "src='",
        "src = '",
        "src= '",
        "src='"
    );
    $replace = array(
        'src="proxified_page_test.php?url_to_proxify=\".$domain\""',
        'src = "proxified_page_test.php?url_to_proxify=\".$domain\""',
        'src= "proxified_page_test.php?url_to_proxify=\".$domain\""',
        'src ="proxified_page_test.php?url_to_proxify=\".$domain\""',
        "src='proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "src = 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "src= 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "src ='proxified_page_test.php?url_to_proxify=\".$domain\"'"
    );
    $string_replaced_data = str_replace($pattern, $replace, $curl_result);
    echo var_dump($string_replaced_data);

    //Add proxy link on all links presented by the searchengine result pages (SERPS). Eg. Google Search Pages (SERPs)
    $pattern = array(
        'action="',
        'action = "',
        'action= "',
        'action ="',
        "action='",
        "action = '",
        "action= '",
        "action='"
    );
    $replace = array(
        'action="proxified_page_test.php?url_to_proxify=\".$domain\""',
        'action = "proxified_page_test.php?url_to_proxify=\".$domain\""',
        'action= "proxified_page_test.php?url_to_proxify=\".$domain\""',
        'action ="proxified_page_test.php?url_to_proxify=\".$domain\""',
        "action='proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "action = 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "action= 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
        "action ='proxified_page_test.php?url_to_proxify=\".$domain\"'"
    );
    $string_replaced_data = str_replace($pattern, $replace, $curl_result);
    echo var_dump($string_replaced_data);

    print_r($curl_result);
    curl_close($ch);
}
else
{
    echo "ELSE got triggered!";
    //Html Form
?>
<html>
<body>
<form action = "<?php echo $_SERVER['PHP_SELF']; ?>" method = "GET">
Url: <input type = "text" name = "url_to_proxify" />
<input type = "submit" />
</form>
</body>
</html>

<?php
}
 
?>
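
The script above never checks what curl_exec() returned, so a failed fetch goes unnoticed; curl_exec() returns false on failure and curl_error() then describes why. The following is a minimal diagnostic sketch, not a confirmed fix: fetch_page() is a hypothetical helper, the timeout value is arbitrary, and the User-Agent string is an assumption (some sites may answer differently without one). It also sets CURLOPT_HEADER to false, since a truthy value such as 5 makes cURL include the response headers in the returned string.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: fetches a URL and reports why the fetch failed, if it did.
function fetch_page(string $url): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_HEADER, false);         // keep response headers out of the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);           // arbitrary value for this sketch
    // Assumption: a User-Agent may matter for some sites.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; proxy-test)');

    $body = curl_exec($ch);

    return [
        'body'   => $body === false ? null : $body,
        'error'  => $body === false ? curl_error($ch) : '',
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}

// filter_input() returns false for an invalid URL and null when the parameter is missing.
$url = filter_input(INPUT_GET, 'url_to_proxify', FILTER_VALIDATE_URL);
if (is_string($url)) {
    $page = fetch_page($url);
    if ($page['body'] === null) {
        echo 'cURL error: ' . htmlspecialchars($page['error']);
    } else {
        echo 'HTTP status: ' . $page['status'];
    }
}
```

Whatever message curl_error() prints (for example a certificate or DNS problem) would narrow down what is hindering the fetch on a given server.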

Php Gurus,

 

Why isn't the following code doing its job?

When I fetch the Google homepage with cURL, it gets fetched. So far, so good.

It also manages to prepend the following to all links present on the proxied page (the Google homepage fetched by cURL):

 

proxified_page_test.php?url_to_proxify=\".$domain\"

 

 

So now, a link like so (original link):

http://google.com/contactus.html

 

Would now look like this (proxied link):

proxified_page_test.php?url_to_proxify=http://google.com/contactus.html

 

Like I say: so far, so good.

But when you do a Google search on the proxied page, the links that the SERP (search engine result page) presents are the original links, like so:

 

http://cars.com/contactus.html

http://autos.com/contactus.html

 

And not like so (which is how they should look according to the code that follows):

 

proxified_page_test.php?url_to_proxify=http://cars.com/contactus.html

proxified_page_test.php?url_to_proxify=http://autos.com/contactus.html

//Add proxy link on all links presented by the searchengine result pages (SERPS). Eg. Google Search Pages (SERPs)
$pattern = array(
    'action="',
    'action = "',
    'action= "',
    'action ="',
    "action='",
    "action = '",
    "action= '",
    "action='"
);
$replace = array(
    'action="proxified_page_test.php?url_to_proxify=\".$domain\""',
    'action = "proxified_page_test.php?url_to_proxify=\".$domain\""',
    'action= "proxified_page_test.php?url_to_proxify=\".$domain\""',
    'action ="proxified_page_test.php?url_to_proxify=\".$domain\""',
    "action='proxified_page_test.php?url_to_proxify=\".$domain\"'",
    "action = 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
    "action= 'proxified_page_test.php?url_to_proxify=\".$domain\"'",
    "action ='proxified_page_test.php?url_to_proxify=\".$domain\"'"
);
$string_replaced_data = str_replace($pattern, $replace, $curl_result);
echo var_dump($string_replaced_data);

What is wrong? This is a mystery! Why isn't str_replace working?
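
Two things in the posted code can be checked in isolation. First, the replacement strings contain \".$domain\" inside the quotes, so no concatenation takes place. Second, every str_replace() call reads from $curl_result rather than from the previous result, so each call discards the one before it, and the final print_r() outputs the unmodified $curl_result. The sketch below demonstrates both effects on a made-up page fragment; the 'P?' prefix is only a short stand-in for the proxy prefix.

```php
<?php
declare(strict_types=1);

$domain = 'google.com';

// Inside single quotes nothing is interpolated: the backslashes, the dot
// and the text $domain all stay in the string literally.
$single = 'action="proxified_page_test.php?url_to_proxify=\".$domain\""';

// Inside double quotes $domain is interpolated, but \" becomes a literal
// quote and the dot is just a dot, not the concatenation operator.
$double = "proxified_page_test.php?url_to_proxify=http://\".$domain\"";

var_dump($single);
var_dump($double);

// Made-up page fragment for the demonstration.
$page = '<form action="/search"><img src="/logo.png"></form>';

// As in the posted code: each call starts again from the original string,
// so only the last replacement survives.
$restarted = str_replace('src="', 'src="P?', $page);
$restarted = str_replace('action="', 'action="P?', $page);

// Feeding each result into the next call keeps both replacements.
$chained = str_replace('src="', 'src="P?', $page);
$chained = str_replace('action="', 'action="P?', $chained);

var_dump($restarted);
var_dump($chained);
```

Note also that str_replace() with arrays applies the pairs one after another, so a later search string such as "http://www." can match text that an earlier replacement has just inserted.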


Folks!

 

I took the code you see in my opening post and my previous post from here:

 

https://www.sitepoint.com/community/t/curl-experiments/264321/242

https://www.sitepoint.com/community/t/curl-experiments/264321/240

 

Nearly 250 posts and no PHP guru at sitepoint.com could solve the issue! I believe we can be different here! What do you think?

I have noticed that the same issue exists in the popular Php-Proxy web proxy script, as I have been fiddling with that script lately!

So, how about we try solving this problem and finding a solution that others can benefit from (members of this forum, the sitepoint.com members who could not solve it, Php-Proxy users, PHP newbies all across the globe, etc.)?


Does it make a difference if you urlencode() the target link before putting it into the link?
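
To illustrate the question: PHP's built-in function is spelled urlencode(), and it matters once the target URL carries its own query string. The sketch below uses a made-up target URL and parse_str(), which splits a query string much as PHP does when it fills $_GET, to show what the proxy script would receive with and without encoding.

```php
<?php
declare(strict_types=1);

// Made-up target URL with its own query string.
$target = 'http://google.com/search?q=cars&hl=en';

$rawQuery     = 'url_to_proxify=' . $target;
$encodedQuery = 'url_to_proxify=' . urlencode($target);

// parse_str() splits a query string into an array, similar to $_GET.
parse_str($rawQuery, $fromRaw);
parse_str($encodedQuery, $fromEncoded);

// Unencoded: the "&" ends the value early and hl becomes a separate parameter.
var_dump($fromRaw);

// Encoded: the whole target URL arrives intact in url_to_proxify.
var_dump($fromEncoded);
```

So encoding does not change what the proxy fetches for simple URLs, but without it anything after the first "&" in the target is cut off.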

 

Mmm. I am not sure. As long as it does what I want, I don't think it should matter.

If I can have a look at the code snippet you have in mind, I can test it on my end and give you the results. Just one silly obstacle is making everyone and everything go round and round in a vicious circle, as if there is no end to it. I'm all ears for suggestions from anyone.

If your method manages to prepend the following to all links presented by any keyword search, such as a Google search, then it's working as expected. And most likely, problem solved at last! :)

 

"proxified_page_test.php?url_to_proxify=".



I researched this nearly a week ago and found others in the same boat. But they mentioned that urlencode() was not working.
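
If encoding alone does not help, one thing that may be involved is how browsers submit GET forms: the query string in the form's action is replaced by the form fields, so a url_to_proxify value placed in the action is dropped on submit. One possible workaround, sketched here under assumptions, is to point the action at the proxy script, carry the original action in a hidden url_to_proxify field, and let the proxy rebuild the target URL. build_target_url() is a hypothetical helper, not part of any existing script.

```php
<?php
declare(strict_types=1);

// Assumed shape of the rewritten search form on the proxied page:
//   <form action="proxified_page_test.php" method="get">
//     <input type="hidden" name="url_to_proxify" value="http://google.com/search">
//     <input type="text" name="q">
//   </form>

// Hypothetical helper: rebuilds the target URL from such a submission.
// $query is expected to look like $_GET: the hidden field plus the form's own fields.
function build_target_url(array $query): ?string
{
    $target = $query['url_to_proxify'] ?? null;
    if (!is_string($target) || filter_var($target, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    // Everything else was typed into the original form and belongs to the target.
    unset($query['url_to_proxify']);
    if ($query === []) {
        return $target;
    }

    $separator = strpos($target, '?') === false ? '?' : '&';
    return $target . $separator . http_build_query($query);
}

echo build_target_url(['url_to_proxify' => 'http://google.com/search', 'q' => 'cars']);
```

The rebuilt URL would then be fetched with cURL as before, and the links in the returned result page rewritten in the same way as on any other proxied page.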


Archived

This topic is now archived and is closed to further replies.
