
preg_match HELP


HaLo2FrEeEk


Hey, I'm trying to parse out the text of a weekly update from another site so I can convert it to BBCode and post it on my forum. An example of one of the weekly updates can be found here. My problem is that, if you look at the source, the start of the update is formatted like this:

 

 <div class="stdcontent" id="topStoryPreviewDiv">
<p>Frankie is super busy this week getting ready to show the game to a host of press doods who won’t be able to talk about jack or squat from Halo 3 until later this year (if they value their souls), 

 

So the defined start point of the preg_match would be the  <div class="stdcontent" id="topStoryPreviewDiv"> part, but the update is on the next line, and I can't seem to get my preg_match to read the next line.  This is my preg_match argument:

 

preg_match("| <div class=(.+?) id=(.+?)>\n\r <p>(.+?)|", $text, $match);

 

but when I do a print_r($match), it shows nothing.  If I remove the \n\r <p> from the argument, it shows the two values of the div tag (class and id), so I know that part isn't what's broken.  What can I do to fix this? Any help will be appreciated.

 

Thanks.


I'm pretty new here, so I'll give it a shot and try to help you. I'm not really a big fan of the preg functions, so I'll just show you a different way I would do it.

 

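// Fetch the page (requires allow_url_fopen to be enabled on the server).
$content = file_get_contents("http://www.bungie.net/News/content.aspx?type=topnews&cid=12562");
// Keep everything from the opening div onward.
$content = strstr($content, '<div class="stdcontent" id="topStoryPreviewDiv">');
// Measure the tail that starts at the first closing </div>, then trim it off.
$tail = strstr($content, '</div>');
print substr($content, 0, -strlen($tail));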

 

Good luck with it ^_^
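For anyone who already has the page in a string (fetched some other way), the same string-function idea can be sketched with strpos and substr so that a missing marker is reported instead of silently printing nothing. This is only a sketch: the helper name extractBetween and the sample $page are made up for illustration, not taken from the real page.

```php
<?php
// Hypothetical helper: returns the text between two markers, or null if either is missing.
function extractBetween(string $haystack, string $startMarker, string $endMarker): ?string
{
    $start = strpos($haystack, $startMarker);
    if ($start === false) {
        return null;
    }
    $start += strlen($startMarker);               // skip past the opening marker
    $end = strpos($haystack, $endMarker, $start); // first closing marker after it
    if ($end === false) {
        return null;
    }
    return substr($haystack, $start, $end - $start);
}

// Made-up sample with Windows (\r\n) line endings, standing in for the fetched page.
$page = "<html>\r\n <div class=\"stdcontent\" id=\"topStoryPreviewDiv\">\r\n"
      . "<p>Hello. </p>\r\n</div>\r\n</html>";

$body = extractBetween($page, '<div class="stdcontent" id="topStoryPreviewDiv">', '</div>');
echo $body === null ? 'No matching found.' : trim($body);
```

Because nothing here depends on line endings, it does not matter whether the div and its content sit on the same line. Like the snippet above, it stops at the first </div>, so it assumes there is no nested div inside the update.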


Well, for one, I can't use file_get_contents because my server is secured so that function is turned off, but I still have a way of getting the contents of the file by using the Snoopy library.  If you actually look at the source, though, you will notice that the beginning and end divs are on new lines, so I need either \n, \r, or both, but none of them work.  I tried your code but it didn't work. Thank you though, at least you replied.


If you want to match each line within the div, you need two regular expressions. The first matches the div and the second matches each line. Try this:

 

if (preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s', $htmlpage, $mth))
{
    preg_match_all('/<p>(.+?)<\/p>/s', $mth[1], $paragraphs);
    # if you want to exclude titles (span/strong lines)
    # preg_match_all('/<p>(?!<strong|<span)(.+?)(?:<br>)?<\/p>/s', $mth[1], $paragraphs);
    echo '<pre>'.print_r($paragraphs[1], true).'</pre>';
}
else echo 'No matching found.';


No, look at the site I linked to. I want everything in the actual post itself, whether it is bold, a list, or whatever; I'll do a str_replace to change the HTML to BBCode.  I also tried what you are using, rea|and, and it did not work for what I wanted, because the update text isn't between two div tags like this:

 

<div>UPDATE CONTENT</div>

 

It's like this:

 

 <div class="stdcontent" id="topStoryPreviewDiv">
<p>Frankie is super busy this week getting ready to show the game to a host of press doods who won’t be able to talk about jack or squat from Halo 3 until later this year (if they value their souls), so instead of a shimmering Atlas holding a golden pen this week, you get a tall(er) Ewok who needs a shave. </p>
<p>Your tears, let me lick them. </p>
(...)
<p>Today marks the conclusion of the long-running Halo-themed pimping of your ride in <em>Forza 2</em>. By the time you get to this line, the official thread will be locked, pictures will be harvested and awe will no doubt strike the faces of those who gaze at what the community created. Next week, we’ll announce three winners and coordinate the claiming of prizes as well as our receipt of the cars, because Frank and I couldn’t ever make anything that rad using a livery editor. We need cheats. </p><br>
<p></p>

</div>

 

The start and end divs are on different lines, which is why I thought I needed a \n and/or \r in the regex, but I can't get it to work.
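The str_replace step mentioned above (HTML to BBCode) could look roughly like the following once the update text has been extracted. The tag mapping and the helper name htmlToBbcode are assumptions for illustration; the real update may use other tags (lists, links, images) that would need their own entries.

```php
<?php
// Hypothetical helper: the tag mapping below is an assumed example, not a complete list.
function htmlToBbcode(string $html): string
{
    $search  = ['<p>', '</p>', '<br>', '<em>', '</em>', '<strong>', '</strong>'];
    $replace = ['',    "\n\n", "\n",   '[i]',  '[/i]',  '[b]',      '[/b]'];
    // str_replace applies each search/replace pair in order across the whole string.
    return trim(str_replace($search, $replace, $html));
}

// Shortened sample based on the markup quoted above.
$sample = "<p>Halo-themed pimping of your ride in <em>Forza 2</em>. </p><br>\n<p></p>";
echo htmlToBbcode($sample);
```

Plain str_replace only handles tags without attributes; anything like <a href="..."> would need a regex or a proper HTML parser instead.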


I ran that code against your link. Anyway, try using only the first preg_match; the s modifier lets it work on multiline strings, and it matches the div's content.

 

$htmlpage='your html code here';
if(preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s',$htmlpage,$mth))
    echo $mth[1];
else echo 'No matching found.';
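To see why the s modifier matters here, a small self-contained comparison may help. The sample string is made up and assumes the page uses Windows line endings (\r\n); the original pattern asks for \n\r (the reverse order) plus a space before <p>, so it would not match such input. The third pattern, using \s*, is a suggested variation rather than something posted in this thread.

```php
<?php
// Made-up sample mimicking the page source, assuming \r\n line endings.
$html = " <div class=\"stdcontent\" id=\"topStoryPreviewDiv\">\r\n"
      . "<p>First paragraph. </p>\r\n"
      . "<p>Second paragraph. </p>\r\n"
      . "\r\n</div>";

// Original attempt: requires LF then CR, then a space, directly before <p>.
$original = preg_match("| <div class=(.+?) id=(.+?)>\n\r <p>(.+?)|", $html, $m1);

// With the s modifier, "." also matches newlines, so the capture can span lines.
$fixed = preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">(.+?)<\/div>/s', $html, $m2);

// \s* absorbs whatever whitespace or line endings surround the content.
$flexible = preg_match('/<div class="stdcontent" id="topStoryPreviewDiv">\s*(.+?)\s*<\/div>/s', $html, $m3);

var_dump($original, $fixed, $flexible);
echo $m3[1], "\n";
```

Using \s* instead of hard-coding \n or \r means the pattern does not care which line-ending style the remote server sends.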
