I am writing a script to learn in more depth how PHP works. It uses cURL to fetch the HTML contents of another file on my host, and the contents are returned in the variable $content.

 

What I would like to do now is extract certain information from the HTML. There are 6 pieces of information to extract, and each one lies between the same pair of tags. Example:

 

<p class=top>This <b>is</b> the information that <i>needs</i> to be extracted.</p>

 

Everything between "<p class=top>" and "</p>" needs to be extracted and put into an array, or printed directly on the page (whichever is easiest to accomplish).

 

Any help with this would be greatly appreciated. Thanks in advance.

 

 

 

-PutterPlace
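
One way to approach the request above without a regular expression is a plain string search. This is only a sketch: the function name extractBetween and the sample $content are made up for illustration, and it assumes the start and end markers are non-empty and that the paragraphs are not nested.

```php
<?php
// Hypothetical helper (not from the thread): collects every substring that
// sits between $start and $end using plain string search.
// Assumes $start and $end are non-empty strings.
function extractBetween($start, $end, $content) {
    $results = array();
    $offset = 0;
    // stripos() is case-insensitive, so "<P CLASS=top>" is found as well.
    while (($from = stripos($content, $start, $offset)) !== false) {
        $from += strlen($start);               // first character after the start marker
        $to = stripos($content, $end, $from);  // nearest end marker after that point
        if ($to === false) {
            break;                             // no closing marker: stop scanning
        }
        $results[] = substr($content, $from, $to - $from);
        $offset = $to + strlen($end);          // resume after this match
    }
    return $results;
}

// Sample input standing in for the cURL result in $content.
$content = '<p class=top>This <b>is</b> the information that <i>needs</i> to be extracted.</p>'
         . '<p>ignored</p><p class=top>Second item.</p>';
print_r(extractBetween('<p class=top>', '</p>', $content));
```

The returned array keeps the inner tags (<b>, <i>) intact, so each element can be printed directly or processed further.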


I took a look at the URL you provided, and it confused me somewhat. Could you simplify the expression for me? I kinda know how to use preg_match_all, but I'm not sure how to apply it in this scenario. This is what I have right now (taken from another source):

 

function ExtractText($start, $end, $content) {
    preg_match_all('/' . preg_quote($start, '/') . '([^\.)]+)'. preg_quote($end, '/').'/i', $string, $m);
    return $m[1];
}
$start = "<p class=top>";
$end = "</p>";
$output = ExtractText($start, $end, $content);
print_r($output);

 

The above function worked fine with plain text, but I can't seem to get it to work with the HTML contents that I have. This is the plain text I tested it with:

 

function ExtractText($start, $end, $string) {
    preg_match_all('/' . preg_quote($start, '/') . '([^\.)]+)'. preg_quote($end, '/').'/i', $string, $m);
    return $m[1];
}

$start = "StartTag";
$end = "EndTag";
$text = "Hello Everybody! StartTagMy name is Bob!EndTag. What is your name? Well...StartTagit doesn't really matterEndTag";
$output = ExtractText($start, $end, $text);

print_r($output);

 

The above code produced this:

 

Array
(
    [0] => My name is Bob!
    [1] => it doesn't really matter
)

 

 

Why doesn't it do the same for the HTML code?
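
A likely explanation, judging only from the code posted above: the character class [^\.)]+ matches any run of characters except "." and ")", so the match fails whenever the text between the tags contains a period, as the sample paragraph does ("extracted."). The plain-text test worked because neither captured phrase contains a period or a closing parenthesis. Separately, the first version of the function names its third parameter $content but passes $string to preg_match_all, so it searches an undefined variable. Here is a minimal sketch of a corrected version, assuming a lazy (.*?) capture with the "s" modifier is acceptable for this HTML; the sample $content is a stand-in for the cURL result.

```php
<?php
// Sketch of a corrected ExtractText(): same signature as in the post,
// but with a lazy "match anything" capture instead of [^\.)]+.
function ExtractText($start, $end, $content) {
    // (.*?) captures as little as possible up to the next end marker.
    // "i" makes the match case-insensitive; "s" lets "." match newlines too,
    // so paragraphs that span several lines are still captured.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/is';
    preg_match_all($pattern, $content, $m);
    return $m[1]; // the text captured by the first group, one entry per match
}

$start = "<p class=top>";
$end = "</p>";
// Sample input standing in for the HTML fetched with cURL.
$content = "<p class=top>This <b>is</b> the information that <i>needs</i> to be extracted.</p>";
$output = ExtractText($start, $end, $content);
print_r($output);
```

With the sample input this returns a one-element array holding the paragraph's inner HTML, period included. For markup that is irregular or nested, a regular expression can still miss cases, so treat this as a starting point rather than a general HTML parser.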
