phpsane

What Is Hindering The Page Fetch With cURL?

Fellow PHP'ers,

I'm a PHP learner, a beginner, but enthusiastic enough to try building my own web proxy.

Here is how far I have got.

The following script uses cURL. It shows a text box where you type a URL, and cURL then fetches that URL.

 

1. The fetched page may contain links like these:

 

<a href="http://www.google.com">Google</a>

<a href="http://yahoo.com">Yahoo</a>

 

This script is supposed to prepend 'proxified_page_test.php?url_to_proxify=' to every link.

So the links on the proxified page (the cURL-fetched page) should now look like this:

 

<a href="proxified_page_test.php?url_to_proxify=http://www.google.com">Google</a>

<a href="proxified_page_test.php?url_to_proxify=http://yahoo.com">Yahoo</a>

 

2. The fetched page may also contain a search box, like the Google search box, whose form markup looks like this:

"....action = http://google.com/q?"

Again, this script is supposed to prepend 'proxified_page_test.php?url_to_proxify=' to every link, including the actions of forms that forward you to their processor.php.

So the form action on the proxified page (the cURL-fetched page) should now look like this:

"....action = proxified_page_test.php?url_to_proxify=http://google.com/q?"

That way, if you were viewing Google through the proxy and ran a search, every link listed in the SERPs would have 'proxified_page_test.php?url_to_proxify=' prepended, so the result links would also be proxified when clicked.

 

ISSUE

The problem is that if you type "http://www.google.com", the script does not fetch that page. What is hindering the fetch?

 

 

<?php
 
/*
ERROR HANDLING
*/
declare(strict_types=1);
ini_set('display_errors', '1');
ini_set('display_startup_errors', '1');
error_reporting(E_ALL);
 
/* STEP 2:
The IF gets triggered as soon as the "submit" button is clicked in the ui text box labeled: Url
Following IF code deals with GET method.
*/
 
if(isset($_GET["url_to_proxify"]) === TRUE)
   {
echo "IF got triggered!";
$url_to_proxify = filter_input(INPUT_GET, 'url_to_proxify', FILTER_VALIDATE_URL);
 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "$url_to_proxify");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$curl_result = curl_exec($ch); 
 
$domain = parse_url($url_to_proxify, PHP_URL_HOST);
echo var_dump($domain);
 
//Add proxy link on all links present on proxified page
$pattern = array("http://", "https://", "http://www.", "https://www.", "localhost");
$replace = array("proxified_page_test.php?url_to_proxify=http://\".$domain\"", "proxified_page_test.php?url_to_proxify=https://\".$domain\"", "proxified_page_test.php?url_to_proxify=http://www.\".$domain\"", "proxified_page_test.php?url_to_proxify=https://www.\".$domain\"", "proxified_page_test.php?url_to_proxify=http://www.\".$domain\"");
$string_replaced_data = str_replace($pattern, $replace, $curl_result);
echo var_dump($string_replaced_data);
 
//Add proxy link on all Image Links (Eg. Google Img File) 
$pattern = array('src="', 'src = "', 'src= "', 'src ="', "src='", "src = '", "src= '", "src ='");
$replace = array('src="proxified_page_test.php?url_to_proxify=\".$domain\""', 'src = "proxified_page_test.php?url_to_proxify=\".$domain\""', 'src= "proxified_page_test.php?url_to_proxify=\".$domain\""', 'src ="proxified_page_test.php?url_to_proxify=\".$domain\""', "src='proxified_page_test.php?url_to_proxify=\".$domain\"'", "src = 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "src= 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "src ='proxified_page_test.php?url_to_proxify=\".$domain\"'");
$string_replaced_data = str_replace($pattern, $replace, $curl_result);
echo var_dump($string_replaced_data);
 
//Add proxy link on all links presented by the searchengine result pages (SERPS). Eg. Google Search Pages (SERPs)
$pattern = array('action="', 'action = "', 'action= "', 'action ="', "action='", "action = '", "action= '", "action ='");
$replace = array('action="proxified_page_test.php?url_to_proxify=\".$domain\""', 'action = "proxified_page_test.php?url_to_proxify=\".$domain\""', 'action= "proxified_page_test.php?url_to_proxify=\".$domain\""', 'action ="proxified_page_test.php?url_to_proxify=\".$domain\""', "action='proxified_page_test.php?url_to_proxify=\".$domain\"'", "action = 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "action= 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "action ='proxified_page_test.php?url_to_proxify=\".$domain\"'");
$string_replaced_data = str_replace($pattern, $replace, $curl_result);
echo var_dump($string_replaced_data);
 
print_r($curl_result);
curl_close($ch);          
}
else
    {
echo "ELSE got triggered!";
//Html Form
?>
<html>
<body>   
<form action = "<?php echo $_SERVER['PHP_SELF']; ?>" method = "GET">
Url: <input type = "text" name = "url_to_proxify" />
<input type = "submit" />
      </form>      
   </body>
</html>
 
<?php
}
 
?>
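
One way to find out what is hindering a fetch is to ask cURL itself rather than guess. The sketch below is a hedged illustration, not a fix confirmed anywhere in this thread: the function names validate_target_url() and fetch_page() and the 15-second timeout are assumptions made for the example. It relies on two documented behaviours: filter_var() with FILTER_VALIDATE_URL returns false for input that is not a valid URL, and CURLOPT_HEADER is an on/off switch that, when enabled, includes the response headers in the string returned by curl_exec().

```php
<?php
declare(strict_types=1);

// Hypothetical helper: accept only syntactically valid http(s) URLs.
function validate_target_url(?string $raw): ?string
{
    if ($raw === null) {
        return null;
    }
    $url = filter_var($raw, FILTER_VALIDATE_URL);
    if ($url === false) {
        return null;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $url : null;
}

// Hypothetical helper: fetch a page and report what cURL says went wrong, if anything.
function fetch_page(string $url): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HEADER         => false, // keep response headers out of the body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 15,    // assumed value for this sketch
    ]);
    $body   = curl_exec($ch);
    $error  = $body === false ? curl_error($ch) : '';
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);

    return [
        'body'   => $body === false ? '' : $body,
        'error'  => $error,   // empty string when the transfer succeeded
        'status' => $status,  // HTTP status of the final response
    ];
}

$url = validate_target_url($_GET['url_to_proxify'] ?? null);
if ($url !== null) {
    $result = fetch_page($url);
    if ($result['error'] !== '') {
        echo 'cURL error: ', htmlspecialchars($result['error']);
    }
}
```

If the error string is empty and the status is 200 but the page still looks wrong, the fetch itself is probably fine and the problem lies in what is done with the body afterwards.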


PHP Gurus,

Why isn't the following code doing its job?

When I fetch the Google homepage with cURL, it gets fetched. So far, so good.

It also manages to prepend the following to all links present on the proxied page (the cURL-fetched Google homepage):

 

proxified_page_test.php?url_to_proxify=\".$domain\"

 

 

So now, a link like so (original link):

http://google.com/contactus.html

 

Would now look like this (proxied link):

proxified_page_test.php?url_to_proxify=http://google.com/contactus.html

 

Like I say: so far, so good.

But when you run a Google search on the proxied page, the links that the SERP (search engine results page) presents are still the original links, like so:

 

http://cars.com/contactus.html

http://autos.com/contactus.html

 

And not like this, as they should be according to the code that follows:

 

proxified_page_test.php?url_to_proxify=http://cars.com/contactus.html

proxified_page_test.php?url_to_proxify=http://autos.com/contactus.html

//Add proxy link on all links presented by the searchengine result pages (SERPS). Eg. Google Search Pages (SERPs)
$pattern = array('action="', 'action = "', 'action= "', 'action ="', "action='", "action = '", "action= '", "action ='");
$replace = array('action="proxified_page_test.php?url_to_proxify=\".$domain\""', 'action = "proxified_page_test.php?url_to_proxify=\".$domain\""', 'action= "proxified_page_test.php?url_to_proxify=\".$domain\""', 'action ="proxified_page_test.php?url_to_proxify=\".$domain\""', "action='proxified_page_test.php?url_to_proxify=\".$domain\"'", "action = 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "action= 'proxified_page_test.php?url_to_proxify=\".$domain\"'", "action ='proxified_page_test.php?url_to_proxify=\".$domain\"'");
$string_replaced_data = str_replace($pattern, $replace, $curl_result);
echo var_dump($string_replaced_data);

What is wrong? This is a mystery! Why isn't the str_replace working?
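
Two details of the posted snippet may be worth checking, independent of anything Google does. First, every str_replace() call starts again from $curl_result, so each result overwrites the previous one, and the script finally prints the untouched $curl_result. Second, the text \".$domain\" sits inside a quoted string, so it is not a concatenation: in a single-quoted string it is emitted literally, and in a double-quoted string it becomes the domain wrapped in quote characters and a dot. The sketch below shows one way to chain the replacements. The function name rewrite_attributes() is made up for this example, and it assumes attribute values are root-relative paths such as /search, which will not hold for every page.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: prefix src and action attribute values with the proxy
// link, feeding the output of each replacement into the next one.
function rewrite_attributes(string $html, string $domain): string
{
    // Real concatenation: $domain is joined outside the quotes.
    $prefix = 'proxified_page_test.php?url_to_proxify=http://' . $domain;

    $out = $html;
    foreach (['src', 'action'] as $attr) {
        // Note the subject is $out, not the original $html.
        $out = str_replace($attr . '="', $attr . '="' . $prefix, $out);
    }
    return $out;
}

$page = '<form action="/search"><img src="/logo.png"></form>';
echo rewrite_attributes($page, 'google.com'), "\n";
```

Whatever the rewriting function returns is what has to be echoed to the browser. Printing the original cURL result afterwards sends the unmodified page.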


Folks!

 

I grabbed the code you see in my original post and previous post from here:

 

https://www.sitepoint.com/community/t/curl-experiments/264321/242

https://www.sitepoint.com/community/t/curl-experiments/264321/240

 

Nearly 250 posts, and no PHP guru at sitepoint.com could solve the issue! I believe we can be different here! What do you think?

I notice the same issue exists in the popular Php-Proxy web proxy script, as I have been fiddling with that script lately!

So, how about we try solving this problem and finding a solution that others can benefit from (members of this forum, the sitepoint.com members who could not solve it, Php-Proxy users, PHP newbies all across the globe, etc.)?


Does it make a difference if you urlencode() the target link before putting it into the link href?
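
For anyone wanting to try this, here is a minimal sketch of what encoding the target could look like. The function name build_proxy_link() and the ?a=1&b=2 query string are invented for the example. The point it illustrates is that without encoding, any "&" in the target URL would be read as the start of a new parameter of the proxy script, so part of the target would be lost.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: build a proxy link whose target survives intact,
// including the target's own query string.
function build_proxy_link(string $target, string $proxy = 'proxified_page_test.php'): string
{
    return $proxy . '?url_to_proxify=' . urlencode($target);
}

$link = build_proxy_link('http://cars.com/contactus.html?a=1&b=2');
echo $link, "\n";

// PHP decodes $_GET values automatically; parse_str() mimics that here.
parse_str((string) parse_url($link, PHP_URL_QUERY), $params);
echo $params['url_to_proxify'], "\n";
```

On the receiving side, $_GET['url_to_proxify'] already holds the decoded URL, so no extra urldecode() call should be needed there.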


Does it make a difference if you urlencode() the target link before putting it into the link href?

 

Mmm. I am not sure. As long as it does what I want, I don't think it should matter.

If I can have a look at the code snippet you have in mind, I can test it on my end and report the results. It is just one silly obstacle making everyone and everything go round and round in a vicious circle, as if there were no end to it. I'm happy to try any suggestion from anyone.

If your method manages to prepend the following to all links returned by a keyword search, such as a Google search, then it's working as expected. And most likely, problem solved at last! :)

 

"proxified_page_test.php?url_to_proxify=".


I thought you sounded like an annoying fucking twat I've heard before...

 

http://forums.devshed.com/php-development/978085-curl-experiments-post2977611.html#post2977611

 

What do you mean?

Did I not mention on post 4 that I grabbed the code from:

https://www.sitepoint.com/community/t/curl-experiments/264321/242



I researched this nearly a week ago and found others in the same boat, but they mentioned that urlencode() was not working.
