soltek Posted December 22, 2010

Hello again, I'm trying to scrape a table from another website using preg_match, specifically with this code:

<?php
$data = file_get_contents('http://tvcountdown.com/index.php');
$regex = '/[color=red]<table class="episode_list_table">[/color] (.+?) [color=red]</table>[/color]/';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[0];
?>

Here's the thing: it doesn't work. I think it's because the first and second anchors are HTML tags, because if I parse some other stuff without any tags, there's no problem. Any hints, mates? Thanks

https://forums.phpfreaks.com/topic/222385-parsing-an-entire-html-table/
BlueSkyIS Posted December 22, 2010

If you are actually looking for [ and ], you'll need to escape them, since they are special regex characters. The same goes for the forward slash if you are using it as your open/close delimiter. So your regex should be more like:

$regex = '/\[color=red\]<table class="episode_list_table">\[\/color\] (.+?) \[color=red\]<\/table>\[\/color\]/';
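As a side note to the escaping advice above, PHP's preg_quote() can do the escaping for you. This is only a minimal sketch: the $data string is a made-up sample rather than the real page, and it assumes the BBcode tags really are literal text in the input.

```php
<?php
// Hypothetical sample input: literal BBcode tags wrapped around the table markup.
$data = '[color=red]<table class="episode_list_table">[/color] rows [color=red]</table>[/color]';

// preg_quote() escapes regex metacharacters such as [ and ]; passing '/' as the
// second argument also escapes the delimiter used in the pattern below.
$open  = preg_quote('[color=red]<table class="episode_list_table">[/color]', '/');
$close = preg_quote('[color=red]</table>[/color]', '/');

$regex = '/' . $open . ' (.+?) ' . $close . '/';

$found = preg_match($regex, $data, $match);
echo $found, ' ', $match[1], "\n";
```

Switching the delimiter to something like # has a similar benefit for HTML patterns, because the / in closing tags no longer needs escaping.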
soltek Posted December 22, 2010 Author

Oh, I got it! Dumb me. Now I'm not getting any warning, but the array is still giving me nothing but «array(0) { }». I'm using this:

<?php
$data = file_get_contents('http://tvcountdown.com/index.php');
$regex = '/<table class="episode_list_table" style="width:848px;margin-top:6px"> (.+?) <\/table>/';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
?>

Maybe I'm doing something else wrong. I wanted to scrape everything from <table class="episode_list_table" style="width:848px;margin-top:6px"> to </table>. I'm looking at the source of http://tvcountdown.com/index.php and that data is there :s

<table class="episode_list_table" style="width:848px;margin-top:6px">
<tr>
<td class="c1 hr">Show</td>
<td class="c2 hr">Ep</td>
<td class="c3 hr">Title</td>
<td class="c4 hr">Countdown</td>
<td class="c5 hr" id="air_time_header"><div>Airtime (local)</div></td>
[...]
<td class="c3"><a href="http://www.tvrage.com/Conan/episodes/1064999912">Jason Segel, Reggie Watts</a></td><td class="c4"><div id="c442_S01E28"></div></td><td class="c5"><div id="cc442_S01E28"></div></td></tr></table>
soltek Posted December 23, 2010 Author

Anyone?
johnny86 Posted December 23, 2010

preg_match('#<table class="episode_list_table"[^>]+>([\w\W]*)</table>#i', file_get_contents('http://tvcountdown.com/index.php'), $match);

Try it =)
johnny86 Posted December 23, 2010

Whoops, I tried that myself. This works:

preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents('http://tvcountdown.com/index.php'), $match);
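To illustrate why the earlier attempts came back empty or too large, here is a small sketch run against a made-up two-table snippet (the $html string is an assumption, not the real page). It compares ".", which does not match newlines unless the s modifier is set, with greedy and lazy [\w\W].

```php
<?php
// Hypothetical two-table page standing in for the real site's HTML.
$html = <<<HTML
<table class="episode_list_table" style="width:848px">
<tr><td class="c1">Show</td></tr>
</table>
<table class="other"><tr><td>Footer</td></tr></table>
HTML;

// "." does not match newlines by default, so this finds nothing on multi-line markup.
$dot = preg_match('#<table class="episode_list_table"[^>]+>(.+?)</table>#i', $html, $m1);

// [\w\W] matches any character, newlines included; greedy * runs to the LAST </table>.
$greedy = preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*</table>#i', $html, $m2);

// Lazy *? stops at the FIRST </table>, which is the table we want.
$lazy = preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', $html, $m3);

var_dump($dot);                              // number of matches for the "." version
var_dump(substr_count($m2[0], '</table>'));  // closing tags inside the greedy match
var_dump(substr_count($m3[0], '</table>'));  // closing tags inside the lazy match
```

On this sample the "." pattern fails, the greedy pattern swallows both tables, and the lazy pattern returns only the first one.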
soltek Posted December 23, 2010 Author

Great, now I've got it parsed. Thanks! Is there any way to do this? I'm getting an error:

[color=red]$url = http://tvcountdown.com/index.php;[/color]
preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents('[color=red]$url[/color]'), $match);
johnny86 Posted December 23, 2010

What exactly do you want to do? Turn all the text in that table red, or what? Because then you have to go through the whole table and make each cell <td style="color: red">, or give the cells class="red" and add a .red rule to your site's stylesheet. But what do you want to do exactly? =)
soltek Posted December 23, 2010 Author

Sorry, I didn't notice that BBCode can't be used inside the code tag. The red parts were just to highlight what I intended to do.

$url = http://tvcountdown.com/index.php;
preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents('$url'), $match);
johnny86 Posted December 24, 2010

You asked me to read your last reply on this topic. Do you need more help with this? Please give me some specific information on what you want to do and what problems you are running into. Thanks =)
laffin Posted December 25, 2010

I think he's trying to scrape the page for the info in the tables. The problem I saw is that the final two columns (countdown and airtime) are filled in by JavaScript.
soltek Posted December 25, 2010 Author

Everything is working nicely, but I wanted to change this part:

preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents('http://tvcountdown.com/index.php'), $match);

to something like this:

$url = http://tvcountdown.com/index.php;
preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents('$url'), $match);

I tried that, but it didn't work very well. Thanks for passing by.
johnny86 Posted December 25, 2010

$url = "http://tvcountdown.com/index.php";
preg_match('#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i', file_get_contents($url), $match);

You were trying to load the literal string '$url', not http://tvcountdown.com/index.php. Single quotes do not interpolate variables, and the URL itself has to be a quoted string.
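For anyone landing here later, a short sketch of the quoting difference described above, plus one way the match could be wrapped up. The function name extract_episode_table is hypothetical (it is not from this thread), and the sketch assumes the table markup looks like the sample posted earlier.

```php
<?php
$url = "http://tvcountdown.com/index.php";

// Single quotes: no interpolation, so this is the literal four-character string $url.
$single = '$url';

// Double quotes interpolate the variable; passing $url directly is simplest.
$double = "$url";

var_dump($single === $url);
var_dump($double === $url);

// Hypothetical helper: return the first episode table found in $html,
// or null when the pattern does not match.
function extract_episode_table(string $html): ?string
{
    $pattern = '#<table class="episode_list_table"[^>]+>[\w\W]*?</table>#i';
    return preg_match($pattern, $html, $match) === 1 ? $match[0] : null;
}

// Usage against the live site (needs network access, so left commented out):
// $table = extract_episode_table(file_get_contents($url));
```

Keeping the download and the match separate makes it easier to check whether file_get_contents() returned false before running the regex.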