
[SOLVED] Help grabbing info from URL


Looktrne


I am writing some code to grab some info from a URL.

This is what I have so far:

 

$url="http://miscurl.com/search";
$code = file_get_contents($url);
$start= strpos($code, "<div class=\"lc\"><a href=\"");
$start=$start+25;
$finish= strpos($code, "#in");
$length= $finish-$start;
$code=substr($code, $start, $length);
echo $code;

 

Here is an example of what the page source at that URL will look like:

 

<div class="tsb"><div class="lc"><a href="member8029481.htm#in">BlueEyedAngel1988</a> 20
<div class="tsbalt"><div class="lc"><a href="member5158248.htm#in">mistressbater</a> 45
</div>
<div class="clear"> </div>

</div>



<div class="tsbalt"><div class="lc"><a href="member5240698.htm#in">KAT707</a> 27
<br><P>FSM - Friends

 

This code would echo "member8029481.htm".

How can I make it retrieve all 3 of the member####.htm links and store them in an array?

Thanks for any help on this. I am very new at extracting bits of a variable.

 

thanks

 

Paul

 

https://forums.phpfreaks.com/topic/112258-solved-help-grabbing-info-from-url/

This is just off the top of my head and hasn't been tested but I would do something like this:

 

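while (strpos($code, "member") !== false)
{
  $mem  = strpos($code, "member");
  // The length is the end of ".htm" minus the start position, not the end position itself.
  $link = substr($code, $mem, (strpos($code, ".htm", $mem) + 4) - $mem);
  // Drop everything up to and including this link so the next pass finds the next one.
  $code = substr($code, $mem + strlen($link));

  echo $link;
}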

 

This finds the position of the first occurrence of the word "member" and creates a substring from there to the end of the next ".htm" after it. The $code variable is then shortened to remove everything up to and including that link, so the next pass of the loop finds the next occurrence.
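The same idea can be sketched as a function that stores the links in an array, which is what the original question asked for. This is only a sketch: `extractMemberLinks` and `$sample` are made-up names, the sample markup is adapted from the first post, and the function walks the string with a `strpos` offset instead of shortening `$code` on every pass.

```php
<?php
// Hypothetical helper (not from the thread): collects every "member....htm"
// substring into an array by advancing a strpos() offset through the string.
function extractMemberLinks(string $html): array
{
    $links  = [];
    $offset = 0;

    while (($start = strpos($html, 'member', $offset)) !== false) {
        $end = strpos($html, '.htm', $start);
        if ($end === false) {
            break; // no ".htm" after this "member", so stop searching
        }
        $end += 4; // include the ".htm" itself
        $links[] = substr($html, $start, $end - $start);
        $offset  = $end; // continue searching after this link
    }

    return $links;
}

// Sample markup adapted from the first post.
$sample = '<div class="lc"><a href="member8029481.htm#in">BlueEyedAngel1988</a> 20'
        . '<div class="lc"><a href="member5158248.htm#in">mistressbater</a> 45'
        . '<div class="lc"><a href="member5240698.htm#in">KAT707</a> 27';

$member = extractMemberLinks($sample);
print_r($member);
```

Note that this still matches the word "member" anywhere in the page, not only inside an href, so on a real page it may need a more specific search string.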

 

Sorry if my explanation's not that easy to follow but it's the best I could muster at this late hour!

 

I've tried testing it but the URL you provided is not real so I can't!

$url="http://miscurl.com/search";
$code = file_get_contents($url);

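while (strpos($code, "member") !== false)
{
  $mem  = strpos($code, "member");
  // The length is the end of ".htm" minus the start position, not the end position itself.
  $link = substr($code, $mem, (strpos($code, ".htm", $mem) + 4) - $mem);
  // Drop everything up to and including this link so the next pass finds the next one.
  $code = substr($code, $mem + strlen($link));

  echo $link;
}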

 

I just provided the code to extract the member links and not the code to get the content. I figured you could copy and paste that part yourself. Anyway, try the above, it might work. That's the best I can offer without a link to the actual site you're trying to extract data from.

$url="http://www.plentyoffish.com/basicsearch.aspx?iama=m&seekinga=f&minage=18&maxage=99&imagesetting=0&searchtype=&starsign=&ethnicity=0&country=1&state=13&City=&z_code=&miles=100&cmdSearch=Search&Profession=&Interests=&save=1";
$code = file_get_contents($url);
$i    = 0;

while (strpos($code, "member") !== false)
{
  $mem  = strpos($code, "member");
  $link = substr($code, $mem, (strpos($code, ".htm", $mem)+4)-$mem);
  $code = substr($code, (strpos($code, $link)+strlen($link)));

  #if ($i%2==0)
  echo $link."<br />";

  $i++;

  // Stop after 10 matches while testing.
  if ($i == 10)
    break;
}

 

If you uncomment the if ($i%2==0) line, each link won't be echoed twice. They are echoed twice because there are two occurrences of each link on the page.
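The every-other-match trick only works if each link appears exactly twice in a row. A sketch of an alternative that does not depend on that: collect every match into an array first, then drop the repeats with `array_unique`. The `$found` array below is illustrative data standing in for whatever the loop collects.

```php
<?php
// Illustrative data: each link appears twice, as described for the real page.
$found = [
    'member8029481.htm', 'member8029481.htm',
    'member5158248.htm', 'member5158248.htm',
    'member5240698.htm', 'member5240698.htm',
];

// array_unique() keeps the first occurrence of each value;
// array_values() renumbers the keys so they run 0, 1, 2, ...
$member = array_values(array_unique($found));

foreach ($member as $link) {
    echo $link . "<br />";
}
```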

This is skipping the first member?

I modified it to grab all 15 on the page, but it is passing over the first one. Any ideas?

 

thanks

 

$url="http://www.plentyoffish.com/basicsearch.aspx?iama=m&seekinga=f&minage=18&maxage=99&imagesetting=0&searchtype=&starsign=&ethnicity=0&country=1&state=13&City=&z_code=&miles=100&cmdSearch=Search&Profession=&Interests=&save=1";
$code = file_get_contents($url);
echo $code;
$i     = 0;
$llink = 0;
while(!(strpos($code, "member") === false))
{
  $mem  = strpos($code, "member");
  $link = substr($code, $mem, (strpos($code, ".htm", $mem)+4)-$mem);
  $code = substr($code, (strpos($code, $link)+strlen($link)));

  if ($llink!=$link){
    $ct++;
    $member[$ct]=$link;
    echo $link."<br>";
  }
  $llink=$link;
  $i++;

  if ($i == 34)
    break;
}

Yeah, sorted it now. This exact code below should print each member link, including the first with no duplicates.

 

$url  = "http://www.plentyoffish.com/basicsearch.aspx?iama=m&seekinga=f&minage=18&maxage=99&imagesetting=0&searchtype=&starsign=&ethnicity=0&country=1&state=13&City=&z_code=&miles=100&cmdSearch=Search&Profession=&Interests=&save=1";
$code = file_get_contents($url);
$i    = 0;

while (strpos($code, "member") !== false)
{
  $mem  = strpos($code, "member");
  $link = substr($code, $mem, (strpos($code, ".htm", $mem)+4)-$mem);
  $code = substr($code, (strpos($code, $link)+strlen($link)));

  // Each link occurs twice in a row, so print only the even-numbered matches
  // (0, 2, 4, ...); this already includes the first one.
  if ($i%2==0)
    echo $link."<br />";

  $i++;
}
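For comparison, here is a sketch that swaps the strpos loop for a regular expression. This is not the approach used in the thread. It assumes the links always look like `href="member<digits>.htm#in"`, as in the sample markup from the first post, and `$html` is a made-up stand-in for the fetched page.

```php
<?php
// Stand-in for the fetched page; each link appears twice, as on the real page.
$html = '<a href="member8029481.htm#in"><img></a> <a href="member8029481.htm#in">BlueEyedAngel1988</a>'
      . '<a href="member5158248.htm#in"><img></a> <a href="member5158248.htm#in">mistressbater</a>'
      . '<a href="member5240698.htm#in"><img></a> <a href="member5240698.htm#in">KAT707</a>';

// Capture "member<digits>.htm" from each href that ends in "#in".
preg_match_all('/href="(member\d+\.htm)#in"/', $html, $matches);

// $matches[1] holds the captured group for every match; drop the repeats.
$member = array_values(array_unique($matches[1]));

print_r($member);
```

Because the pattern is tied to the href attribute, it would not pick up the word "member" elsewhere in the page.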

Oh, my bad. It looks like the code I added to check whether the link was the same as the last one was not letting the first one show.

Thanks for all your help.

Did you see my other thread about posting to a contact form? Perhaps you could help me with that one?

You're a big help, thanks :)
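One likely cause, assuming the script ran on a PHP version before 8.0: `$llink` started as the integer 0, and a loose `!=` between 0 and a non-numeric string such as "member8029481.htm" converts the string to 0, so the first link looked "the same as the last". A sketch of the same check using `null` and a strict comparison, which behaves the same on any PHP version (`$links` is illustrative data):

```php
<?php
// Illustrative data: each link appears twice in a row.
$links = [
    'member8029481.htm', 'member8029481.htm',
    'member5158248.htm', 'member5158248.htm',
];

$member = [];
$last   = null; // null instead of 0, so nothing can compare equal to it by accident

foreach ($links as $link) {
    // !== compares type and value, so no string-to-number conversion happens.
    if ($link !== $last) {
        $member[] = $link;
        echo $link . "<br>";
    }
    $last = $link;
}
```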

 

Paul

Archived

This topic is now archived and is closed to further replies.
