This Reg no longer works?


Modernvox

preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html,$posts,PREG_SET_ORDER);

Here are a few of the target links:

<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>

<p><a href="http://southcoast.craigslist.org/muc/1564061446.html">Seeking 5th member</a> - <font size="-1"> (RI/Southern, MA)</font></p>

 

<p><a href="http://southcoast.craigslist.org/muc/1563926651.html">Gigging cover band in search for new lead guitarist </a> - <font size="-1"> ((south shore))</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1563506659.html">Acoustic Guitarist Wanted</a> - <font size="-1"> (New Bedford/Fall River/Providence/East Bay area)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1563233552.html">Need Help Writing Raps?</a> - <font size="-1"> (Fall River, Ma)</font> <span class="p"> pic</span></p>

 

<h4>Wed Jan 20</h4>

<p><a href="http://southcoast.craigslist.org/muc/1562404109.html">drums and guitar looking for bass w/ vocals</a> - <font size="-1"> (taunton)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1562389093.html">wack ass egyptians need guitarist</a> - <font size="-1"> (quincy/whitman)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1561458375.html">Looking for a few good men - Bass/Baritones</a> - <font size="-1"> (Fall River Area)</font> <span class="p"> pic</span></p>

 

<h4>Tue Jan 19</h4>

<p><a href="http://southcoast.craigslist.org/muc/1561104614.html">singer/guitarist looking</a> - </p>

<p><a href="http://southcoast.craigslist.org/muc/1560864071.html">south shore cover band needs bass</a> - <font size="-1"> (plymouth,ma)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1559645835.html">Looking for Rhythm Guitarist</a> - <font size="-1"> (Taunton, Ma)</font></p>

 

<h4>Mon Jan 18</h4>

<p><a href="http://southcoast.craigslist.org/muc/1558191492.html">Working Cover Rock Band Looking for GOOD Lead Singer</a> - <font size="-1"> (SE MA/RI)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1557842807.html">wanted: guitar player (christian)</a> - <font size="-1"> (Dartmouth)</font></p>

 

 

Here's my code, which worked fine up until yesterday. Now it only matches when I strip the <font ...> part from the end of the pattern.

<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
$st = isset($_POST['submit']) ? $_POST['state'] : '';

$urls = array("http://" . $st . ".craigslist.org");
foreach ($urls as $url) {
    $html = file_get_contents("$url/muc/");

    preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html, $posts, PREG_SET_ORDER);
    //echo "<pre>";print_r($posts);
    $i = 1; //set start point;
    $limit = 60; //set limit;
    foreach ($posts as $post) {
        //print_r $post[0]; //HTML
        $post[2] = str_ireplace($url, "", $post[2]); //remove domain
        echo "<a href=\"$url{$post[1]}\" target=\"_blank\">{$post[2]}<font size=\"3\">{$post[3]}</font></a><br />";
        print "<BR />\n";

        if ($i == $limit) {
            break;
        }
        $i++;
    }
}
?>

When I remove <font size="-1">([^"]+)<\/font> it works; however, it then also displays some things I don't want. Before this, the regex worked perfectly.

Thanks in advance

https://forums.phpfreaks.com/topic/191524-this-reg-no-longer-works/

That's because they added a " - " between the closing </a> and the <font> tag, so the two are no longer adjacent.

A simple regex update would be:

<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^"]+)</font>)?
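To check the update against the sample rows from the question, here is a minimal sketch. It makes a few assumptions that are not in the original posts: it uses `#` as the pattern delimiter so the slashes in `</a>` and `</font>` need no escaping, it swaps `[^"]+` for `[^<]+` in the last group so the capture cannot run past `</font>`, and it trims the captured location. The variable names are illustrative only.

```php
<?php
// Two sample rows copied from the question: one with a location, one without.
$html = <<<'HTML'
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
HTML;

// Original pattern: expects <font> directly after </a>, so the new " - " breaks it.
$old = '#<a href="([^"]+)">([^<]+)</a><font size="-1">([^"]+)</font>#s';

// Updated pattern: literal " - " followed by an optional <font> group.
// Assumption: [^<]+ is used instead of [^"]+ inside the <font> group.
$new = '#<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^<]+)</font>)?#s';

$oldCount = preg_match_all($old, $html, $oldPosts, PREG_SET_ORDER);
$newCount = preg_match_all($new, $html, $posts, PREG_SET_ORDER);

foreach ($posts as $post) {
    // Group 3 is missing when the listing has no <font> location.
    $location = isset($post[3]) ? trim($post[3]) : '';
    echo $post[1], ' | ', $post[2], ' | ', $location, "\n";
}
```

Because the `<font>` group is optional, the display loop in the question would need a similar `isset($post[3])` check before echoing the location; otherwise listings without one would raise an undefined-offset notice.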

 

Also, I have pointed this out before, but do you have permission to collect this data? If you don't, it would be unlawful!

Archived

This topic is now archived and is closed to further replies.
