kevinkhan Posted October 29, 2009

Please have a look at this script and tell me what is wrong. I'm trying to extract the link and whatever is between <a href> and </a>, and I only want the ones inside an <li> with class="vehicle".

<?php
function getMatches($strMatch,$strContent)
{
    if(preg_match_all($strMatch,$strContent,$objMatches))
    {
        return $objMatches;
    }
    return "";
}

$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';

$strMatch "#<li class=\"vehicle\"><a href=\"(.*)\">(.*)</a>#";
$objListMatches = getMatches($strMatch,$strContent);
$sUrl = $objListMatches[1];
$sTitle = $objListMatches[2];
echo $sUrl;
echo "<br />";
echo $sTitle;

Any help will be greatly appreciated.
cags Posted October 29, 2009

'#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is';

Will work, assuming the HTML is actually the same as your example.
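For anyone reading later, here is a minimal sketch of why the bounded classes [^"]* and [^<]* are the safer choice. It assumes the list items sit on a single line, which fetched HTML often does; the sample string is a shortened copy of the one in the first post, and the variable names are only illustrative.

```php
<?php
// Sample markup from the first post, placed on one line (an assumption for this sketch).
$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a> '
            . '<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a> '
            . '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>';

// Original pattern: (.*) is greedy, so it runs on to the LAST "> it can find on the line.
$greedy = '#<li class="vehicle"><a href="(.*)">(.*)</a>#';

// Suggested pattern: each group stops at the first closing quote or the first "<".
$bounded = '#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is';

preg_match_all($greedy, $strContent, $greedyMatches);
preg_match_all($bounded, $strContent, $boundedMatches);

// The greedy pattern swallows all three items into a single match.
$greedyCount = count($greedyMatches[0]);

// The bounded pattern returns one URL and one title per "vehicle" item.
$urls   = $boundedMatches[1];
$titles = $boundedMatches[2];

print_r($urls);
print_r($titles);
```

Note that $urls and $titles are arrays, so print_r (or a foreach loop) is needed to see them; echo on an array only prints the word "Array".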
kevinkhan Posted October 29, 2009 (Author)

This is what I'm getting when I run the script:

Parse error: parse error in C:\Program Files\Apache Software Foundation\Apache2.2\htdocs\carzoneCrawler\preg_match.php on line 19
cags Posted October 29, 2009

And what does that line of code look like?
kevinkhan Posted October 29, 2009 (Author)

Sorry, I left out the equals sign. With that fixed it worked fine, but now I have changed the code to this, and when I run the script I get nothing:

<?php
$strContent = file_get_contents('http://www.carzone.ie/search/results?searchsource=browse&cacheBuster=1256829754944212');

function getMatches($strMatch,$strContent)
{
    if(preg_match_all($strMatch,$strContent,$objMatches))
    {
        return $objMatches;
    }
    return "";
}

$strMatch ='#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is';
$objListMatches = getMatches($strMatch,$strContent);
$sUrl = $objListMatches[1];
$sTitle = $objListMatches[2];
print_r($sUrl);
echo "<br />";
print_r($sTitle);
?>

Does the regular expression have to change now? I'm looking for the same piece of HTML in the carzone.ie page.
cags Posted October 29, 2009

Which is correct behavior. As far as I can tell, that page doesn't contain a single instance of <li class="vehicle">.
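One way to make a "no match" result visible instead of silent is sketched below. It is only an illustration: getMatchesOrEmpty is a hypothetical variant of the getMatches function from the thread, and $page is a hard-coded stand-in for the fetched HTML, not the real carzone.ie response.

```php
<?php
// Hypothetical variant of getMatches(): returns an empty array when nothing matches,
// so the caller can test for that case instead of indexing into an empty string.
function getMatchesOrEmpty(string $pattern, string $content): array
{
    $count = preg_match_all($pattern, $content, $matches);
    if ($count === false) {
        // false means the regex engine itself failed (for example, a backtrack limit).
        throw new RuntimeException('preg_match_all failed, error code ' . preg_last_error());
    }
    return $count > 0 ? $matches : [];
}

$pattern = '#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is';

// Stand-in for the downloaded page (assumed shape): the class is not exactly "vehicle".
$page = '<li class="vehicle-make-model"><a href="http://www.carzone.ie/">Alfa Romeo 146</a></li>';

$found = getMatchesOrEmpty($pattern, $page);

$message = $found === []
    ? 'No <li class="vehicle"> items found in the page'
    : count($found[0]) . ' item(s) found';

echo $message, "\n";
```

With the original getMatches, the "" return value is indexed with [1] and [2], which yields nothing printable, so the script appears to output nothing at all.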
kevinkhan Posted October 29, 2009 (Author)

Sorry, you are right. I changed it to this:

$strMatch ='#<li class="vehicle-make-model"><a href="([^"]*)">([^<]*)</a>#is';

Please check this. I'm looking for the titles of the ads on the page, which have a class of vehicle-make-model. I really want to know how to do this.
cags Posted October 29, 2009

As suggested in previous threads, you would probably be better off with DOMDocument. Interestingly, the source code received from file_get_contents is different from what you get when you view source in a browser, and that is probably what is causing your problems. When clicking view source in Firefox, we get this...

<li class="vehicle-make-model"><a title="Alfa Romeo 146 Alfa Romeo Alfa Romeo 146 1.4 5DR" href="http://www.carzone.ie/search/Alfa-Romeo/146/Alfa-Rom/200935195049580/advert?channel=CARS">Alfa Romeo 146 Alfa Romeo …</a></li>

When viewing the contents of file_get_contents, we get this format instead...

<li class="vehicle-make-model">
<a href="http://www.carzone.ie/search/Alfa-Romeo/146/1.4/200840190284999/advert?channel=CARS">
Alfa Romeo 146
</a>

Making a guess at which bits you want, you should try something like...

'~<li class="vehicle-make-model">\s*<a href="([^"]*)">(.*?)</a>~is'
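A rough sketch of both approaches side by side, in case it helps. The $html string is a hand-written sample in the shape reported for file_get_contents; the surrounding <ul> and the second <li class="other"> item are assumptions added for illustration, and the live page may differ.

```php
<?php
// Hand-written sample (assumed shape, not the real page).
$html = '<ul>
<li class="vehicle-make-model">
  <a href="http://www.carzone.ie/search/Alfa-Romeo/146/1.4/200840190284999/advert?channel=CARS">
    Alfa Romeo 146
  </a>
</li>
<li class="other"><a href="http://www.carzone.ie/">Home</a></li>
</ul>';

// --- DOMDocument + DOMXPath ---
libxml_use_internal_errors(true);   // keep warnings about sloppy real-world HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$domResults = [];
foreach ($xpath->query('//li[@class="vehicle-make-model"]/a') as $a) {
    $domResults[] = [
        'url'   => $a->getAttribute('href'),
        'title' => trim($a->textContent),   // strip the newlines and indentation around the text
    ];
}

// --- The regex suggested in the post above ---
$pattern = '~<li class="vehicle-make-model">\s*<a href="([^"]*)">(.*?)</a>~is';
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$regexResults = [];
foreach ($matches as $match) {
    $regexResults[] = ['url' => $match[1], 'title' => trim($match[2])];
}

print_r($domResults);
print_r($regexResults);
```

On this sample both approaches find the same single link. The XPath version does not care about attribute order or whitespace inside the tag, so it would also cope with the Firefox variant where title comes before href, whereas the regex expects href to follow <a directly.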
kevinkhan Posted October 29, 2009 (Author)

That's exactly what I want. Thanks for that. :)