Google Blogsearch Ping PHP Script


mtorbin

Hey all,

 

I'm writing a small script that pings blogsearch.google.com via a REST call. The problem is that I don't think Google works as nicely as they'd like you to believe. Here's what's going on:

 

We have about seventy blogs that we're pinging Google with. For each one, the script calls file_get_contents on the ping URL, which returns either the response body (on a 200) or an error. Immediately after the pings finish, I download Google's changes.xml document to see whether any of our URLs have been accepted. Sometimes I get lucky and a few actually make it into changes.xml. However, when I search for any of our blogs in Blog Search itself, I don't get any results.
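To make the changes.xml check less manual, something like the following could compare the pinged URLs against the downloaded file. This is only a rough sketch: it assumes changes.xml follows the weblogs.com-style layout (a `weblogUpdates` root with `weblog` elements carrying a `url` attribute), and the function name `findAcceptedUrls` and the sample data are made up for illustration.

```php
<?php
// Assumption: changes.xml uses the weblogs.com-style layout, i.e. a
// <weblogUpdates> root with <weblog name="..." url="..."/> children.
function findAcceptedUrls(string $changesXml, array $pingedUrls): array
{
    $doc = simplexml_load_string($changesXml);
    if ($doc === false) {
        return array();
    }

    // Collect every url attribute listed in the feed (trailing slash ignored).
    $listed = array();
    foreach ($doc->weblog as $weblog) {
        $listed[rtrim((string) $weblog['url'], "/")] = true;
    }

    // Keep only the pinged URLs that show up in the feed.
    $accepted = array();
    foreach ($pingedUrls as $url) {
        if (isset($listed[rtrim($url, "/")])) {
            $accepted[] = $url;
        }
    }
    return $accepted;
}

// Hypothetical sample data, not real output from Google.
$sampleXml = '<weblogUpdates version="2" count="2">'
    . '<weblog name="Blog One" url="http://example.com/one/" when="10"/>'
    . '<weblog name="Blog Two" url="http://example.com/two" when="42"/>'
    . '</weblogUpdates>';

$accepted = findAcceptedUrls($sampleXml, array(
    "http://example.com/one",
    "http://example.com/three",
));
print(implode("\n", $accepted) . "\n");
```

In the real script, the first argument would be the contents of the downloaded changes.xml, and the second the list of blog URLs that were pinged.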

 

My guess is that Blog Search is being pinged so quickly and so frequently by everyone and their second cousins that it doesn't have time to ingest everything. Has anyone else had this experience? If you have any suggestions on how best to deal with this, please let me know.

 

Thanks,

 

  - MT

 

P.S. Here is my script, in case it helps any of you:

<?php
// Remove the changes.xml left over from the previous run.
if (file_exists("changes.xml")) {
	unlink("changes.xml");
}

$blogsURL = "[document containing blog links]";
// Links should be in the following format:
// <google>http://blogsearch.google.com/ping?name=[blog_name]&url=[blog_url]&changesURL=[blog_feed.rss]</google>
$pingBase = "http://blogsearch.google.com/ping?";
$counter = 0;

$blogsData = file_get_contents($blogsURL);
if ($blogsData === false) {
	exit("Could not read the blog list.\n");
}
preg_match_all("/(?<=<google>)(.*)(?=<\/google>)/isU", $blogsData, $linksFound);

foreach ($linksFound[0] as $link) {
	$counter++;

	// Strip the ping prefix, escape the query string, and rebuild the URL.
	$query = getEscapeChars(substr($link, strlen($pingBase)), "esc");
	$myURL = $pingBase . $query;

	// file_get_contents returns the response body, or false on failure.
	$myData = file_get_contents($myURL);

	// The blog name is the value between "name=" and "&url".
	$blogName = substr($query, 5, strpos($query, "&url") - 5);
	$blogName = str_replace(array("%21", "%2C", "%3A", "%27", "+"), array("!", ",", ":", "'", " "), $blogName);
	$blogName = getEscapeChars($blogName, "unesc");

	if ($myData === "Thanks for the ping.") {
		print("The url for '" . $blogName . "' was accepted.\n");
	} else {
		print("The url for '" . $blogName . "' was not accepted.\n");
	}
}
print("Total Blogs: " . $counter . "\n\n");

// Quote the URL so the shell does not interpret the "?".
exec("curl " . escapeshellarg("http://blogsearch.google.com/changes.xml?last=900") . " -o changes.xml");
print("\n");

// "esc" percent-encodes "+", "/" and ":"; anything else decodes them for
// display (note that "%2B" is turned into a space, not back into "+").
function getEscapeChars($myChars, $myProcess) {
	if ($myProcess == "esc") {
		return str_replace(array("+", "/", ":"), array("%2B", "%2F", "%3A"), $myChars);
	}
	return str_replace(array("%2B", "%2F", "%3A"), array(" ", "/", ":"), $myChars);
}
?>
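As a possible alternative to escaping the query string by hand, PHP's built-in `http_build_query` can encode each parameter. The sketch below is not part of the script above: `buildPingUrl` is a hypothetical helper, the example blog values are placeholders, and it assumes the ping endpoint accepts fully URL-encoded parameters (with spaces sent as `+`).

```php
<?php
// Hypothetical helper: builds one ping URL from its three parts.
// The parameter names (name, url, changesURL) come from the link format
// described in the script's comments.
function buildPingUrl(string $name, string $url, string $feed): string
{
    $query = http_build_query(array(
        "name"       => $name,
        "url"        => $url,
        "changesURL" => $feed,
    ), "", "&");
    return "http://blogsearch.google.com/ping?" . $query;
}

// Placeholder values for illustration only.
$pingUrl = buildPingUrl("Mike's Blog", "http://example.com/", "http://example.com/feed.rss");
print($pingUrl . "\n");
```

With this approach the blog list could store the name, URL, and feed separately, so nothing would have to be cut back out of the link with substr offsets.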
