PutterPlace Posted January 28, 2008

I am writing a script in order to learn, in more depth, how PHP works. It's purely for learning purposes. I have written a script that uses cURL to get the HTML contents of another file on my host. The HTML contents are returned in the variable $content. What I would like to do now is extract certain information from the HTML. There are six pieces of information that need to be extracted, and each one lies between the same pair of tags. Example:

    <p class=top>This <b>is</b> the information that <i>needs</i> to be extracted.</p>

Everything between "<p class=top>" and "</p>" needs to be extracted and put into an array, or printed directly on the page (whichever is easier to accomplish). Any help with this would be greatly appreciated. Thanks in advance.

-PutterPlace

ziv Posted January 28, 2008

Use preg_match_all() to extract your pattern. (click here for preg syntax)
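
For readers following along, here is a minimal sketch of how preg_match_all() reports its results. The sample string and the `<b>` pattern are made up for illustration and are not the poster's real page. With the default flag (PREG_PATTERN_ORDER), `$matches[0]` holds the full matches and `$matches[1]` holds what the first parenthesised group captured.

```php
<?php
// Hypothetical sample input, chosen only to show the result layout.
$html = '<b>one</b> and <b>two</b>';

// [^<]* captures everything up to the next "<", so each <b>...</b>
// pair is matched separately. The return value is the match count.
$count = preg_match_all('/<b>([^<]*)<\/b>/', $html, $matches);

echo $count, "\n";      // number of full matches found
print_r($matches[0]);   // full matches, tags included
print_r($matches[1]);   // text captured by the first group only
```

In the scenario from the question, the second array is the one of interest, since it contains only the text between the delimiters.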

PutterPlace Posted January 28, 2008

I took a look at the URL you provided, and it somewhat confused me. Is there some way that you could simplify the expression for me? I kind of know how to use preg_match_all, but I'm not sure how to use it in this scenario. This is what I have right now (adapted from another source):

    function ExtractText($start, $end, $content)
    {
        preg_match_all('/' . preg_quote($start, '/') . '([^\.)]+)'. preg_quote($end, '/').'/i', $string, $m);
        return $m[1];
    }

    $start = "<p class=top>";
    $end = "</p>";
    $output = ExtractText($start, $end, $content);
    print_r($output);

The function worked fine with plain text, but I can't seem to get it to work with the HTML contents that I have. This is the plain-text version I tested it with:

    function ExtractText($start, $end, $string)
    {
        preg_match_all('/' . preg_quote($start, '/') . '([^\.)]+)'. preg_quote($end, '/').'/i', $string, $m);
        return $m[1];
    }

    $start = "StartTag";
    $end = "EndTag";
    $text = "Hello Everybody! StartTagMy name is Bob!EndTag. What is your name? Well...StartTagit doesn't really matterEndTag";
    $output = ExtractText($start, $end, $text);
    print_r($output);

The above code resulted in this:

    Array
    (
        [0] => My name is Bob!
        [1] => it doesn't really matter
    )

Why doesn't it do the same for the HTML code?
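
Two things in the posted code look like likely causes, though this is a reading of the snippet rather than something confirmed in the thread. First, the HTML version names its third parameter `$content` but passes `$string` to preg_match_all(), so inside the function the subject is an undefined variable. Second, the character class `[^\.)]+` only matches characters that are neither a period nor a closing parenthesis, and the sample paragraph ends with a period right before `</p>`. The sketch below isolates the second point; the function name and both test strings are hypothetical.

```php
<?php
// Same pattern-building logic as the posted function, with the parameter
// name made consistent so the subject string is actually searched.
function extractWithOriginalClass($start, $end, $string)
{
    $pattern = '/' . preg_quote($start, '/') . '([^\.)]+)' . preg_quote($end, '/') . '/i';
    preg_match_all($pattern, $string, $m);
    return $m[1];
}

// Hypothetical inputs: identical except for the period before </p>.
$withPeriod    = '<p class=top>Ends with a period.</p>';
$withoutPeriod = '<p class=top>No period here</p>';

$resultWithPeriod    = extractWithOriginalClass('<p class=top>', '</p>', $withPeriod);
$resultWithoutPeriod = extractWithOriginalClass('<p class=top>', '</p>', $withoutPeriod);

var_dump($resultWithPeriod);    // empty: the class cannot step over the "."
var_dump($resultWithoutPeriod); // one captured string
```

This would also explain why the StartTag/EndTag test passed: neither captured sentence contains a period or a closing parenthesis.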

PutterPlace Posted January 31, 2008

Anyone??

resago Posted January 31, 2008

Your regex would be '/<p class=top>(.*)<\/p>/'

PutterPlace Posted January 31, 2008

Thanks. It works perfectly.
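
A caveat on the pattern '/<p class=top>(.*)<\/p>/' for anyone reusing it: `.*` is greedy and, without the `s` modifier, does not match newlines. That is fine when each paragraph sits on its own line, which may well have been the case for the poster's page. The sketch below, on made-up input, compares it with a lazy `.*?` plus the `s` modifier; this variant is a suggestion, not something proposed in the thread.

```php
<?php
// Hypothetical input: two target paragraphs on one line,
// followed by one paragraph that spans two lines.
$content = "<p class=top>one</p><p class=top>two</p>\n<p class=top>three\nfour</p>";

// Greedy, as posted: .* runs to the LAST </p> on the same line
// and cannot cross the newline inside the third paragraph.
preg_match_all('/<p class=top>(.*)<\/p>/', $content, $greedy);

// Lazy quantifier stops at the first </p>; the s modifier lets
// the dot match newlines so multi-line paragraphs are captured.
preg_match_all('/<p class=top>(.*?)<\/p>/s', $content, $lazy);

print_r($greedy[1]);
print_r($lazy[1]);
```

On this input the greedy pattern returns a single element that swallows both paragraphs on the first line, while the lazy pattern returns the three paragraphs separately. For anything beyond simple, predictable markup, PHP's DOMDocument extension is generally a safer tool than a regular expression.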