shaunie Posted March 14, 2011

Hi,

I have a script that is scraping merchant names from Amazon Marketplace, for example:
http://www.amazon.co.uk/gp/offer-listing/B002PLB2F4/?condition=new

The following code works where the seller has a logo:

preg_match_all('/<ul class="sellerInformation">([\s]+)(.*?)<img src="(.*?)" width="(.*?)" alt="(.*?)" height="(.*?)" border="(.*?)" \/><\/a>/', $html, $merchants);

and this works where merchants don't have a logo:

preg_match_all('/<ul class="sellerInformation">([\s]+)<li><div class="seller"><span class="sellerHeader">Seller:<\/span>([\s]+)<a href="(.*?)"><b>(.*?)<\/b><\/a>/', $html, $merchants2);

How can I combine these regular expressions so that I just get one array of merchant names?

Many thanks for your advice...
sasa Posted March 15, 2011

try

<?php
$url = 'http://www.amazon.co.uk/gp/offer-listing/B002PLB2F4/?condition=new';
$html = file_get_contents($url);
// After each seller block, match whichever comes first: the logo's alt="..." text or the <b>...</b> seller name.
preg_match_all('~<ul class="sellerInformation">.*?(alt="|<b>)([^"<]+)("|<)~is', $html, $matchesarray);
// Group 2 holds the merchant names from both cases.
print_r($matchesarray[2]);
?>
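
For anyone who wants to try the combined pattern without fetching the live page, here is a minimal sketch run against sample markup. The HTML snippet, the seller names, and the helper name `extractMerchants` are assumptions made up for illustration, reconstructed from the two patterns in the question; the real Amazon page may look different. It is a variant of the pattern above, not the same one: it uses a PCRE branch-reset group `(?|...)` so the name lands in group 1 whichever alternative matches.

```php
<?php
// Hypothetical markup based on the two patterns in the question;
// the actual page structure may differ.
$html = <<<'HTML'
<ul class="sellerInformation">
  <li><a href="/shops/A1"><img src="logo.gif" width="120" alt="Example Seller A" height="30" border="0" /></a></li>
</ul>
<ul class="sellerInformation">
  <li><div class="seller"><span class="sellerHeader">Seller:</span>
  <a href="/shops/B2"><b>Example Seller B</b></a></div></li>
</ul>
HTML;

// Hypothetical helper: returns one flat array of merchant names.
function extractMerchants(string $html): array
{
    // (?|...) is a branch-reset group: both alternatives capture into group 1.
    // The "s" flag lets .*? cross newlines; "i" makes the match case-insensitive.
    $pattern = '~<ul class="sellerInformation">.*?(?|alt="([^"]+)"|<b>([^<]+)</b>)~is';

    // preg_match_all returns false on a regex error.
    if (preg_match_all($pattern, $html, $matches) === false) {
        return [];
    }

    // $matches[1] holds the captured name for every match.
    return array_map('trim', $matches[1]);
}

$merchants = extractMerchants($html);
print_r($merchants);
```

One caveat with either pattern: because `.*?` is allowed to cross newlines, a seller block that contains neither an `alt="..."` attribute nor a `<b>` tag would let the match run on into the next block and pick up that block's name instead.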