preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html,$posts,PREG_SET_ORDER);

Here are a few of the target links:

<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>

<p><a href="http://southcoast.craigslist.org/muc/1564061446.html">Seeking 5th member</a> - <font size="-1"> (RI/Southern, MA)</font></p>

 

<p><a href="http://southcoast.craigslist.org/muc/1563926651.html">Gigging cover band in search for new lead guitarist </a> - <font size="-1"> ((south shore))</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1563506659.html">Acoustic Guitarist Wanted</a> - <font size="-1"> (New Bedford/Fall River/Providence/East Bay area)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1563233552.html">Need Help Writing Raps?</a> - <font size="-1"> (Fall River, Ma)</font> <span class="p"> pic</span></p>

 

<h4>Wed Jan 20</h4>

<p><a href="http://southcoast.craigslist.org/muc/1562404109.html">drums and guitar looking for bass w/ vocals</a> - <font size="-1"> (taunton)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1562389093.html">wack ass egyptians need guitarist</a> - <font size="-1"> (quincy/whitman)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1561458375.html">Looking for a few good men - Bass/Baritones</a> - <font size="-1"> (Fall River Area)</font> <span class="p"> pic</span></p>

 

<h4>Tue Jan 19</h4>

<p><a href="http://southcoast.craigslist.org/muc/1561104614.html">singer/guitarist looking</a> - </p>

<p><a href="http://southcoast.craigslist.org/muc/1560864071.html">south shore cover band needs bass</a> - <font size="-1"> (plymouth,ma)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1559645835.html">Looking for Rhythm Guitarist</a> - <font size="-1"> (Taunton, Ma)</font></p>

 

<h4>Mon Jan 18</h4>

<p><a href="http://southcoast.craigslist.org/muc/1558191492.html">Working Cover Rock Band Looking for GOOD Lead Singer</a> - <font size="-1"> (SE MA/RI)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1557842807.html">wanted: guitar player (christian)</a> - <font size="-1"> (Dartmouth)</font></p>

 

 

Here's my code, which worked fine up until yesterday. Now it only matches when I strip the <font ...> part from the end of the pattern.

<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
$st = isset($_POST['submit']) ? $_POST['state'] : '';

$urls = array("http://" . $st . ".craigslist.org");
foreach ($urls as $url) {
    $html = file_get_contents("$url/muc/");

    preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html, $posts, PREG_SET_ORDER);
    //echo "<pre>";print_r($posts);
    $i = 1; //set start point;
    $limit = 60; //set limit;
    foreach ($posts as $post) {
        //print_r $post[0]; //HTML
        $post[2] = str_ireplace($url, "", $post[2]); //remove domain
        echo "<a href=\"$url{$post[1]}\" target=\"_blank\">{$post[2]}<font size=\"3\">{$post[3]}</font></a><br />";
        print "<BR />\n";

        if ($i == $limit) {
            break;
        }
        $i++;
    }
}
?>

When I remove <font size="-1">([^"]+)<\/font> it works, but it also displays some things I don't want. Before this, the regex worked perfectly.

Thanks in advance


That's because they added a " - " between the closing </a> and the <font> tag.

A simple regex update would be:

<a href="([^"]+)">([^<]+)<\/a> - (?:<font size="-1">([^"]+)<\/font>)?
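To see the difference, here is a minimal sketch that runs the old and the updated pattern against two of the rows pasted above, one with a location and one without. It is only an illustration under a few assumptions of mine: the `#` delimiter (so the slashes in the closing tags need no escaping), `[^<]+` instead of `[^"]+` for the location, and the variable names `$old`, `$new`, `$oldPosts`, and `$newPosts` are not from the original code.

```php
<?php
// Two sample rows copied from the listing in the question:
// the first has a <font> location block, the second does not.
$html = <<<'HTML'
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
HTML;

// Pattern from the question: expects <font> directly after </a>,
// so the new " - " separator prevents any match.
$old = '/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s';

// Updated pattern: allows the " - " separator and makes the <font> block optional.
// Assumption: [^<]+ for the location is my own tightening, not part of the suggestion above.
$new = '#<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^<]+)</font>)?#s';

$oldCount = preg_match_all($old, $html, $oldPosts, PREG_SET_ORDER);
$newCount = preg_match_all($new, $html, $newPosts, PREG_SET_ORDER);

foreach ($newPosts as $post) {
    // Group 3 may be missing when a row has no <font> block, so guard the access.
    $location = isset($post[3]) ? trim($post[3]) : '';
    echo $post[1], ' | ', $post[2], ' | ', $location, "\n";
}
```

Because the location group is optional, the loop in the original script would also need a guard like the `isset()` check before it prints `$post[3]`, otherwise rows without a location would raise a notice with `E_ALL` enabled.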

 

Also, I have pointed this out before, but do you have permission to collect this data? If you don't, it would be unlawful!
