
I've been tracking incoming URL requests to my server, Baidu SEO?



Sorry for the crappy title.

 

So I wrote a PHP IP tracker that I include in all of my websites. It logs when the client connected to my server, what they requested, how many times they've connected, and their IP address. I use the following to get the last IP address even if they went through a proxy (I first thought of it as an "ip-bounce"). That's my understanding, anyway.

<?php

// This is only part of the script: how I get the IP address and the requested URL.

$client_ip = $_SERVER['REMOTE_ADDR'];

// If X-Forwarded-For lists more than one IP address, the last one is captured.
// Note: this header is supplied by the client or a proxy, so it can be spoofed.

if (array_key_exists('HTTP_X_FORWARDED_FOR', $_SERVER)) {
    // array_pop() takes its argument by reference, so store the explode() result first.
    $forwarded = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
    $client_ip = trim(array_pop($forwarded));
}

// URL requested

$url_requested = "http://{$_SERVER['HTTP_HOST']}{$_SERVER['REQUEST_URI']}";

?>

Anyway, my question: one of the logged requests came from the IP address 222.85.138.75, and when I look that up at ip-lookup.net it says China. The requested URL was http://www.baidu.comwww.baidu.com:443

 

I don't get why that is. How could an inbound request for that URL end up at my server? Is that a way to affect SEO? Does it make it look like my server was requesting that URL? I don't even know...

 


Sorry what does "...default host pointing to your application..." mean?

 

It means that your webserver accepts any Host header, regardless of whether there's actually a virtual host with that name. This is a matter of configuration. You can either reject all invalid hostnames, or you can map them to a default host. Both are valid and common. If you go with a default host, you'll have to live with the fact that your log contains nonsensical hostnames.
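
If changing the webserver configuration isn't an option, the "reject invalid hostnames" approach can also be approximated inside the PHP script itself. The following is only a rough sketch under that assumption: `is_known_host()` and the `$allowedHosts` list are hypothetical names, and `example.com` stands in for whatever domains the site really serves.

```php
<?php
// Hypothetical helper: true only if the Host header names a site we actually serve.
function is_known_host(string $hostHeader, array $allowedHosts): bool
{
    // Normalise case and drop an optional ":port" suffix before comparing.
    $host = strtolower(trim($hostHeader));
    $host = preg_replace('/:\d+$/', '', $host);
    return in_array($host, $allowedHosts, true);
}

// Assumed whitelist; replace with your own domains.
$allowedHosts = ['example.com', 'www.example.com'];

// Only enforce the check under a web server; CLI runs have no Host header.
if (PHP_SAPI !== 'cli') {
    $hostHeader = $_SERVER['HTTP_HOST'] ?? '';
    if (!is_known_host($hostHeader, $allowedHosts)) {
        http_response_code(400);
        exit('Unknown host');
    }
}
```

Placed before the logging code, this would keep requests for unrelated hosts such as www.baidu.com out of the tracker. Doing the same thing in the webserver's virtual host configuration is usually cleaner, since the request never reaches PHP at all.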

 

 

 

When you mention probes, is that like crawling? I've logged a few Baidu spiders before.

 

I mean automated scanners looking for vulnerabilities. Legitimate crawlers rarely request nonsense hosts.


Thanks for this information. I'm going to have to look into this more and decide. I'd probably go with the method that avoids nonsensical hostnames, as long as it doesn't break any functionality or access on my websites.

 

Vulnerabilities, yeah, that's always a concern. What did I miss?


Most likely someone was testing your server to see if it would operate as an open proxy by requesting a full URL rather than just a path on your site. For example, sending a request like:

GET http://www.example.com/ HTTP/1.1
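
To make requests like this stand out in a tracker's log, the request target can be classified before it is stored. This is just a sketch: `classify_request()` and its labels are made up for illustration, and it assumes the webserver hands the raw request target to PHP unchanged, which depends on the server and its configuration.

```php
<?php
// Hypothetical classifier for a request's method and raw target.
function classify_request(string $method, string $target): string
{
    if (strtoupper($method) === 'CONNECT') {
        return 'connect';        // e.g. CONNECT www.example.com:443
    }
    if (preg_match('#^https?://#i', $target) === 1) {
        return 'absolute-url';   // e.g. GET http://www.example.com/
    }
    if ($target !== '' && $target[0] === '/') {
        return 'path';           // ordinary request such as GET /index.php
    }
    return 'other';              // anything else, e.g. "*" for OPTIONS
}
```

In a logging script this could be called as `classify_request($_SERVER['REQUEST_METHOD'], $_SERVER['REQUEST_URI'])`. Anything labelled `connect` or `absolute-url` is worth a second look, although an absolute URL on its own is still a valid HTTP/1.1 request and does not prove a proxy probe.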

That's kind of cool. It seems malicious in intent unless you intend to be a proxy, but cool nonetheless.

 

Thanks.
