Modernvox Posted February 9, 2010

[code]
preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html,$posts,PREG_SET_ORDER);
[/code]

Here are a few of the target URLs:

[code]
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
<p><a href="http://southcoast.craigslist.org/muc/1564061446.html">Seeking 5th member</a> - <font size="-1"> (RI/Southern, MA)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1563926651.html">Gigging cover band in search for new lead guitarist </a> - <font size="-1"> ((south shore))</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1563506659.html">Acoustic Guitarist Wanted</a> - <font size="-1"> (New Bedford/Fall River/Providence/East Bay area)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1563233552.html">Need Help Writing Raps?</a> - <font size="-1"> (Fall River, Ma)</font> <span class="p"> pic</span></p>
<h4>Wed Jan 20</h4>
<p><a href="http://southcoast.craigslist.org/muc/1562404109.html">drums and guitar looking for bass w/ vocals</a> - <font size="-1"> (taunton)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1562389093.html">wack ass egyptians need guitarist</a> - <font size="-1"> (quincy/whitman)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1561458375.html">Looking for a few good men - Bass/Baritones</a> - <font size="-1"> (Fall River Area)</font> <span class="p"> pic</span></p>
<h4>Tue Jan 19</h4>
<p><a href="http://southcoast.craigslist.org/muc/1561104614.html">singer/guitarist looking</a> - </p>
<p><a href="http://southcoast.craigslist.org/muc/1560864071.html">south shore cover band needs bass</a> - <font size="-1"> (plymouth,ma)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1559645835.html">Looking for Rhythm Guitarist</a> - <font size="-1"> (Taunton, Ma)</font></p>
<h4>Mon Jan 18</h4>
<p><a href="http://southcoast.craigslist.org/muc/1558191492.html">Working Cover Rock Band Looking for GOOD Lead Singer</a> - <font size="-1"> (SE MA/RI)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1557842807.html">wanted: guitar player (christian)</a> - <font size="-1"> (Dartmouth)</font></p>
[/code]

Here's my code, which worked fine up until yesterday. Now it only works when I strip the <font ...> part from the end of the pattern.

[code]
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);

$st = isset($_POST['submit']) ? $_POST['state'] : '';
$urls= array("http://" . $st . ".craigslist.org");

foreach ($urls as $url) {
    $html = file_get_contents("$url/muc/");
    preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html,$posts,PREG_SET_ORDER);
    //echo "<pre>";print_r($posts);
    $i = 1; //set start point;
    $limit = 60; //set limit;
    foreach ($posts as $post) {
        //print_r $post[0]; //HTML
        $post[2] = str_ireplace($url,"",$post[2]); //remove domain
        echo "<a href=\"$url{$post[1]}\" target=\"_blank\">{$post[2]}<font size=\"3\">{$post[3]}</font></a><br />";
        print "<BR />\n";
        if ($i == $limit) {
            break;
        }
        $i++;
    }
}
?>
[/code]

When I remove <font size="-1">([^"]+)<\/font> it works, but it also displays some things I don't want. Before this, the regex worked perfectly.

Thanks in advance.
MadTechie Posted February 10, 2010

That's because they added a " - " between the closing </a> and the <font> tag. A simple RegEx update would be:

<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^"]+)</font>)?

Also, I have pointed this out before, but do you have permission to collect this data? If you don't, it would be unlawful!
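For anyone wiring the suggested pattern into preg_match_all, here is a minimal sketch run against three of the sample lines from the first post. A few things in it are assumptions and not part of the reply above: it uses `#` as the pattern delimiter so the slashes in `</a>` and `</font>` need no escaping, it allows flexible whitespace around the dash with `\s*-\s*`, and it tightens the third group to `[^<]+` so it cannot run past the closing tag. The variable names `$title` and `$location` are only illustrative.

```php
<?php
// Sample markup copied from the listing in the first post.
$html = <<<'HTML'
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
<p><a href="http://southcoast.craigslist.org/muc/1563233552.html">Need Help Writing Raps?</a> - <font size="-1"> (Fall River, Ma)</font> <span class="p"> pic</span></p>
HTML;

// '#' delimiter: no need to escape '/' in the closing tags (assumption, not from the reply).
// \s*-\s* : tolerates whitespace changes around the dash (assumption).
// (?: ... )? : the <font> location block is optional, as in the suggested pattern.
$pattern = '#<a href="([^"]+)">([^<]+)</a>\s*-\s*(?:<font size="-1">([^<]+)</font>)?#';

preg_match_all($pattern, $html, $posts, PREG_SET_ORDER);

foreach ($posts as $post) {
    $href  = $post[1];
    $title = trim($post[2]);
    // With PREG_SET_ORDER, a trailing group that did not take part in the
    // match is left out of the array, so check before reading it.
    $location = isset($post[3]) ? trim($post[3]) : '';
    echo $title, ' | ', $location, ' | ', $href, "\n";
}
```

Posts without a location, such as "Pagan Musicians", still match; they simply have no third group, which is why the loop checks isset() before printing it.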