How to fetch only data from within certain html tags?


spaze

Hello,

 

I have an HTML response from which I need to extract the data between the <p id="myparagraph"> and </p> tags. For example:

 

<html>
  <head>
  </head>
  <body>
    <p id="myparagraph">
      sdfg dfgdfkjg dflkgj dflgkj dflgkdf g
      <br>
      dfgkjdflgkjdflgkjdflkgj
    </p>
    sdfsdfsdf
    sdfsdf
    fsdfsdf
    sdfsdf
    <p id="myparagraph">
      sdfg dfgdfkjg dflkgj dflgkj dflgkdf g
      <br>
      dfgkjdflgkjdflgkjdflkgj
    </p>
  </body>
</html>

 

So what I should get is:

 

sdfg dfgdfkjg dflkgj dflgkj dflgkdf g

dfgkjdflgkjdflgkjdflkgj

sdfg dfgdfkjg dflkgj dflgkj dflgkdf g

dfgkjdflgkjdflgkjdflkgj

 

Basically, I am creating my own custom RSS feed parser for a website that does not have RSS. All the paragraphs needed for the news content are within <p id="articleParagraph">....</p>, and there can be any number of these paragraphs per page.

Firstly, I should point out that using the same ID twice isn't valid (X)HTML.

 

However, to do this you can use a regular expression. Give this a try:

 

// (.*?) is a non-greedy capture; the "s" modifier lets "." match newlines too
if (preg_match_all('#<p id="myparagraph">(.*?)</p>#s', $source, $matches))
{
    print_r($matches);
}

 

The content you're interested in would be found in the $matches[1] array.
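
To get from $matches[1] to the plain lines shown in the question, something like the following might work. It is only a sketch: the sample markup and the $lines variable are made up for illustration, and it assumes the only inner markup you care about is <br>.

```php
<?php
// Made-up sample input, shaped like the HTML in the question.
$source = <<<EOT
<html>
  <body>
    <p id="myparagraph">
      first line
      <br>
      second line
    </p>
    other text
    <p id="myparagraph">
      third line
    </p>
  </body>
</html>
EOT;

$lines = []; // hypothetical name for the collected text lines
if (preg_match_all('#<p id="myparagraph">(.*?)</p>#s', $source, $matches)) {
    foreach ($matches[1] as $inner) {
        // Split each captured paragraph on <br>, <br/> or <br />.
        foreach (preg_split('#<br\s*/?>#i', $inner) as $piece) {
            // Drop any remaining tags and the surrounding whitespace.
            $text = trim(strip_tags($piece));
            if ($text !== '') {
                $lines[] = $text;
            }
        }
    }
}

print_r($lines);
```

With this sample, $lines should end up holding "first line", "second line" and "third line"; the "other text" outside the paragraphs is skipped.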

 

As I said at the start, though, I'd change the attribute to "class" rather than "id".
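
If the markup does use a class, a different approach from the regex above is to let PHP's DOM extension parse the page and select the paragraphs with XPath. This is a rough sketch, not a drop-in solution: it assumes the DOM extension is enabled, the sample HTML is invented, and class="articleParagraph" is only a guess at what the real page would use.

```php
<?php
// Invented sample; the class name is an assumption based on the question.
$source = <<<EOT
<html><body>
<p class="articleParagraph">First paragraph</p>
<div>unrelated</div>
<p class="articleParagraph">Second paragraph</p>
</body></html>
EOT;

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often not well formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($source);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$paragraphs = []; // hypothetical name for the extracted text
// Exact match on the class attribute; an element carrying several
// classes would need a contains()-style expression instead.
foreach ($xpath->query('//p[@class="articleParagraph"]') as $node) {
    $paragraphs[] = trim($node->textContent);
}

print_r($paragraphs);
```

The upside over the regex is that attribute order, extra whitespace and nested tags no longer matter; the downside is that textContent flattens the paragraph, so line breaks from <br> are lost unless you handle them separately.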
