This topic is now archived and is closed to further replies.

XeroXer

Checking over 100000 pages for code

Hi there.

I would like some help.
I am a member of a gaming site.
There, every member has their own profile page.
The member page addresses look like this:
http://www.example.com/member/profile/145031

The different members have a special status depending on their gaming behavior.
What I would like now is a count of how many members have a certain status.

The status is shown by a certain image. Let's call it:
http://www.example.com/images/powermember.gif

So I need a PHP script that searches the HTML code of every page from:
http://www.example.com/member/profile/1
to:
http://www.example.com/member/profile/500000
for:
http://www.example.com/images/powermember.gif
and then displays how many it found.

Can anyone help me with this?
I have done some PHP before, so with a few guidelines I might be able to do it myself.

Thankful for any help...

You would need to look into these two functions:

ereg()
file_get_contents()


Assuming you know how many users there are, you'd create a loop from 1 to that number using a counter variable,
build each profile URL from that counter,
and have it ereg each page for that gif.


something like this
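[code]
$parseUrl = "http://www.example.com/member/profile/";
$imageUrl = "http://www.example.com/images/powermember.gif";
$numUsers = 0;
// Profile IDs start at 1; 5000 stands in for the real member count.
for ($i = 1; $i <= 5000; $i++) {
   // Join the base URL and the ID with the "." operator.
   $parsePage = file_get_contents($parseUrl . $i);
   // file_get_contents() returns false on failure, so skip those pages.
   // strpos() does a plain substring search; ereg() was removed in PHP 7.0.
   if ($parsePage !== false && strpos($parsePage, $imageUrl) !== false)
       $numUsers++;
}

echo "There are " . $numUsers . " power users";[/code]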


EDIT

But using something like that would take FOREVER if there are a lot of pages

Please tell me that this is all generated from the same page and you just want to look at the status of a field in a database of all your users. Please? Otherwise, that's gotta be the worst design of a website I've ever heard of.

[quote name='ober' date='Mar 30 2006, 02:44 PM' post='360123']
Please tell me that this is all generated from the same page and you just want to look at the status of a field in a database of all your users.
[/quote]
I think, no, I'm pretty sure that he's trying to phish data from a database he doesn't have access to.
[quote name='ober' date='Mar 30 2006, 02:44 PM' post='360123']
Otherwise, that's gotta be the worst design of a website I've ever heard of.
[/quote]
These forums are laid out in such a way... for instance, the 'newbies', 'lurkers', and 'gurus' all have a distinct field value on their profile... it wouldn't be too hard to just do a loop, but it would take a very long time.

Well, I really don't know how the page is built, but I hope it comes from a database.

I tried your code like this:
[code]<?php
$parseUrl = "http://www.example.com/member/view/";
$numUsers = 3;
for($i=3; $i<143089; $i++)
{
    $parsePage = file_get_contents($parseUrl$i);
    if(ereg("http://www.example.com/pictures/supermember.gif", $parsePage))
    $numUsers++;
}
echo "There are " . $numUsers . " super members. ";
?>[/code]

The first member has the number 3 and the last 143089.
This code results in nothing.
The source becomes this:
[code]<html><body></body></html>[/code]
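
A blank page like this is often what PHP shows when the script fails to parse and error messages are switched off. In the code above, file_get_contents($parseUrl$i) puts two variables side by side with no operator between them, which PHP rejects as a syntax error, so nothing in the script runs. The counter also starts at 3 instead of 0, which would inflate the result. A minimal sketch of the corrected URL building follows; the helper name buildProfileUrl() is made up for illustration.

[code]<?php
// Hypothetical helper: joins the base URL and a member ID with the "." operator.
function buildProfileUrl($baseUrl, $memberId)
{
    return $baseUrl . $memberId;
}

$parseUrl = "http://www.example.com/member/view/";
$numUsers = 0; // the counter starts at 0; only the loop variable starts at 3

// Prints http://www.example.com/member/view/3
echo buildProfileUrl($parseUrl, 3), "\n";
[/code]

Turning PHP error messages on, if the host allows it, should make the parse error visible instead of an empty page.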

Might help if you would actually link us to the site so we could really help you out =P

Well can't really do that. Sorry...
Or maybe I can but I would much rather get this working.
What can make me end up with nothing?

Guest footballkid4
1) More than likely the site is not set up the way you think...they probably use mod_rewrite and get a feed from a single page

2) Linking us to the page would allow us to easily see what we're trying to do exactly

Gamers.nu: http://www.gamers.nu
That's the page.

I got the script working from one point of view.
It tries to read the pages correctly, but I get a LOT of errors.
Well, I get 150000 errors :-)

It's the file_get_contents() that doesn't work.
Probably because they generate the page from the database,
so there is no .php or .html file to read from.

This is the code that "worked":
[code]<?php
$siteurl = "http://www.gamers.nu/profile/show/";
$imgurl = "http://www.gamers.nu/_tpl/site/default/_img/flags/gold.gif";
$numusers = 0;

for($i = 0; $i < 150000; $i++)
{
    $sitepage = file_get_contents("$siteurl$i");
    if(ereg($imgurl, $sitepage))
    $numusers++;
}
?>
<html>
<head>
<title>Gnu members...</title>
</head>
<body>
<?php
// Swedish for: "There are N gold members on gamers.nu."
echo "Det finns " . $numusers . " guldmedlemmar på gamers.nu. ";
?>
</body>
</html>[/code]

You can see the page here:
Gnu - XeroXer.com: http://www.xeroxer.com/gnu.php
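
The errors are probably warnings that file_get_contents() raises each time a request fails, for example because a profile ID does not exist or because the host has URL access (the allow_url_fopen setting) disabled. That is a guess, since the messages themselves are not shown here. The sketch below hides the warning, checks the return value instead, and keeps fetching separate from counting so a small range can be tried first. The names fetchPage() and countMatches() are hypothetical.

[code]<?php
// Hypothetical helper: returns the page body, or null if the request failed.
function fetchPage($url)
{
    // "@" hides the warning; failure is detected through the false return value.
    $body = @file_get_contents($url);
    return $body === false ? null : $body;
}

// Hypothetical helper: counts pages whose HTML contains $needle.
// $fetch is any callable that takes a URL and returns a string or null.
function countMatches($baseUrl, $firstId, $lastId, $needle, $fetch)
{
    $found = 0;
    $failed = 0;
    for ($id = $firstId; $id <= $lastId; $id++) {
        $html = call_user_func($fetch, $baseUrl . $id);
        if ($html === null) {
            $failed++; // page could not be fetched
            continue;
        }
        if (strpos($html, $needle) !== false) {
            $found++;
        }
    }
    return array('found' => $found, 'failed' => $failed);
}

// Try a small range first instead of all 150000 profiles:
// $result = countMatches("http://www.gamers.nu/profile/show/", 1, 20,
//     "http://www.gamers.nu/_tpl/site/default/_img/flags/gold.gif", 'fetchPage');
// echo $result['found'] . " gold, " . $result['failed'] . " failed\n";
[/code]

If every single request ends up in the failed count, the problem is more likely the host's settings than the loop itself.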

Guest footballkid4
A page isn't just "generated" from the database. file_get_contents may or may not work if the host has mod_rewrite in use...you might want to try sockets or curl
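
For reference, a minimal cURL version of the page fetch might look like the sketch below. It assumes the cURL extension is available on the host, which nothing in this thread confirms, and fetchWithCurl() is a hypothetical name.

[code]<?php
// Hypothetical helper: fetches a URL with the cURL extension.
// Returns the body as a string, or null on any failure.
function fetchWithCurl($url, $timeoutSeconds = 10)
{
    if (!function_exists('curl_init')) {
        return null; // cURL extension is not installed
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($body === false || $status !== 200) {
        return null; // connection error or non-200 response
    }
    return $body;
}

// $html = fetchWithCurl("http://www.gamers.nu/profile/show/3");
[/code]

The return value can then be searched for the image URL the same way as the result of file_get_contents().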

[quote name='footballkid4' date='Mar 31 2006, 12:48 AM' post='360187']
A page isn't just "generated" from the database. file_get_contents may or may not work if the host has mod_rewrite in use...you might want to try sockets or curl
[/quote]

How do I get those working?
My PHP installation and .ini file are all at my web host. I can't really edit anything.

I can set the PHP version to PHP4 or PHP5.
I can turn PHP error messages on or off.
I can turn register globals on or off.

Please help... :-)

Well, it seems that curl needs me to install something extra, while sockets do not.
Which means sockets would be the better thing to try. :-)

I have never used sockets before.
Could anyone help me with how to use them?
And how to use them in the above script to get it working? :-)
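
A rough sketch of the socket approach, using PHP's built-in fsockopen(), could look like this. It sends a plain HTTP/1.0 GET request and returns the body. The function names are hypothetical, and the sketch does not handle redirects or HTTPS.

[code]<?php
// Hypothetical helper: splits a raw HTTP response into status code and body.
function parseHttpResponse($raw)
{
    $parts = explode("\r\n\r\n", $raw, 2);
    if (count($parts) < 2) {
        return null; // no header/body separator found
    }
    // The status line looks like "HTTP/1.0 200 OK".
    if (!preg_match('#^HTTP/\d\.\d (\d{3})#', $parts[0], $m)) {
        return null;
    }
    return array('status' => (int) $m[1], 'body' => $parts[1]);
}

// Hypothetical helper: fetches http://$host$path over a raw socket.
function fetchWithSocket($host, $path, $timeoutSeconds = 10)
{
    $fp = @fsockopen($host, 80, $errno, $errstr, $timeoutSeconds);
    if (!$fp) {
        return null; // could not connect
    }
    stream_set_timeout($fp, $timeoutSeconds);
    // HTTP/1.0 keeps the response simple: servers should not use chunked encoding for it.
    fwrite($fp, "GET $path HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $raw = '';
    while (!feof($fp)) {
        $line = fgets($fp, 4096);
        if ($line === false) {
            break; // read error or timeout
        }
        $raw .= $line;
    }
    fclose($fp);
    $response = parseHttpResponse($raw);
    if ($response === null || $response['status'] !== 200) {
        return null;
    }
    return $response['body'];
}

// $html = fetchWithSocket("www.gamers.nu", "/profile/show/3");
[/code]

In the script above, the file_get_contents() call would be replaced by fetchWithSocket(), with a check for null before searching the page for the image URL.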
