Nuv Posted February 28, 2011

Can someone please help me with a regex? I am trying to scrape some data. Below is the pattern of the data in the page source that I would like to scrape. I would like to get the Name, Address, City, State, Country, Pincode, Phone number and Category.

Pattern of data:

<div align="center"><center><table border="0" cellpadding="3" cellspacing="0" width="350">
<tr>
<td valign="top" width="350" bgcolor="#FFFFFF"<div align="center"><table border="1" cellpadding="12" cellspacing="0" width="100%" bgcolor="#FFFFE6" bordercolor="#000000" bordercolordark="#808080" bordercolorlight="#C0C0C0">
<tr>
<td colspan="2" width="350"><font size="2"> <b> Name </b><br>Address <br>City, State <br>Country, Pincode <BR><BR><font size="2"><img src="../images/ph.gif" align="left" hspace="4" alt = "AL0674"> - Phone # x-xxx-xxx-xxxx </font> <BR> <font size ="1"> <BR>Category: Something <BR>Something <BR>Something </font></td>
</tr>
</table>
</td>
</tr>
</table>
</center></div>

I tried it myself, but I am getting a warning.

Code:

<?php

$data = file_get_contents('http://xxx.com');
$regex = '/~<tr>\s+<td\s+colspan="2"\s+width="350"><font\s+size="2">\s+<b>(.*?)</b><br>(.*?<br>(.*?), (.*?)\s+<br>(.*?), (.*?) <BR><BR><font\s+size="2"><img\s+src="../images/ph.gif"\s+align="left"\s+hspace="4"\s+alt\s+=\s+"AL0674">\s+-\s+Phone # (.*?) </font>~/';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
?>

Warning message I am getting:

Warning: preg_match() [function.preg-match]: Unknown modifier 'b' in C:\Users\Boone\AppData\Roaming\NuSphere\PhpED\projects\scrapingtest.php on line 5
NULL
Jerred121 Posted March 1, 2011

First, you need to escape your slashes, i.e. </b> should be <\/b>. I'm not sure, but it probably wouldn't hurt to escape your <>'s as well.
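
To illustrate the advice above: PHP treats the first character of the pattern string as the delimiter, so in the pattern from the first post the leading / is the delimiter and the ~ characters become literal text to match. Below is a minimal sketch of two ways to avoid the "Unknown modifier" warning. The $html string is a hypothetical, shortened sample modelled on the markup in the question, and the variable names are only for illustration.

```php
<?php
// Hypothetical one-line sample modelled on the markup in the first post.
$html = '<font size="2"> <b> Name </b><br>Address <br>';

// Option 1: keep "/" as the delimiter and escape every literal "/".
// Unescaped, the "/" in "</b>" ends the pattern and "b" is read as a modifier.
$escaped = '/<b>\s*(.*?)\s*<\/b>/';

// Option 2: use "~" as the delimiter, so "</b>" can be written as-is.
$tilde = '~<b>\s*(.*?)\s*</b>~';

preg_match($escaped, $html, $m1);
preg_match($tilde, $html, $m2);

// Both patterns capture the same text between the <b> tags.
var_dump($m1[1] === $m2[1]);
```

Escaping < and > is not required here; neither is a special character in a PCRE pattern outside of group syntax.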
Nuv Posted March 1, 2011

Thank you. I escaped /b as \/b and it worked.
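
For anyone after the full set of fields, here is a rough sketch of one way the pattern could be written against the sample block from the first post. It is not the pattern the original poster ended up with. It assumes ~ as the delimiter, the i and s modifiers, and named groups; the group names and the $html sample are illustrative, and real pages may vary in whitespace or tag order.

```php
<?php
// Sample cell copied from the first post (placeholder values, not real data).
$html = <<<'HTML'
<td colspan="2" width="350"><font size="2"> <b> Name </b><br>Address <br>City, State <br>Country, Pincode <BR><BR><font size="2"><img src="../images/ph.gif" align="left" hspace="4" alt = "AL0674"> - Phone # x-xxx-xxx-xxxx </font> <BR> <font size ="1"> <BR>Category: Something <BR>Something <BR>Something </font></td>
HTML;

// "~" delimiters avoid escaping "/"; "i" matches <br> and <BR> alike,
// "s" lets "." span line breaks in the real page source.
$regex = '~<b>\s*(?<name>.*?)\s*</b><br>\s*'
       . '(?<address>.*?)\s*<br>\s*'
       . '(?<city>.*?),\s*(?<state>.*?)\s*<br>\s*'
       . '(?<country>.*?),\s*(?<pincode>.*?)\s*<br>'
       . '.*?Phone\s+#\s+(?<phone>.*?)\s*</font>'
       . '.*?Category:\s*(?<category>.*?)\s*</font>~is';

if (preg_match($regex, $html, $m)) {
    // Turn the <BR>-separated category lines into a comma-separated list.
    $m['category'] = preg_replace('~\s*<br>\s*~i', ', ', $m['category']);

    $fields = ['name', 'address', 'city', 'state', 'country', 'pincode', 'phone', 'category'];
    foreach ($fields as $key) {
        echo $key, ': ', $m[$key], "\n";
    }
}
```

If the page holds several listings, preg_match_all() with the PREG_SET_ORDER flag should return one such set of groups per listing.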