
Parse html issue


nickjonnes


Hey all,

As you can see below, I have a large number of divs nested inside divs. I need to get all the information from the div with class "items". Sorry for posting such a large source, but I didn't want to post only half of it and waste everyone's time. I've spent a lot of time trying to figure this one out and have been googling a lot, but I've had no luck.

Can anyone please help?

 

<div class="content">      
    <div class="main">
        <div class="hero hero-small">
            <div id="slide-wrapper">
                <div class="items">
                    <div class="container-outer">
                        <div class="container-inner">

<img src= "www.example.com/picture.jpg"
class= "class" style="width: 300px; height: 200px;" /><a href= "www.importantsite.com/" target="_blank">
<h3>title</h3></a> <a href= "http://anotherimportantsite.com/"target="_blank">
<h4><span>blahblah</span></h4></a><p>blahblah<span>... <img src="http://example.com/blah.gif" alt="" class="example" /></span></p>

                        </div>
                    </div>
<div class="container-outer">
                        <div class="container-inner">

<img src= "www.example.com/picture.jpg"
class= "class" style="width: 300px; height: 200px;" /><a href= "www.importantsite.com/" target="_blank">
<h3>title</h3></a> <a href= "http://anotherimportantsite.com/"target="_blank">
<h4><span>blahblah</span></h4></a><p>blahblah<span>... <img src="http://example.com/blah.gif" alt="" class="example" /></span></p>

                        </div>
                    </div>
<div class="container-outer">
                        <div class="container-inner">

<img src= "www.example.com/picture.jpg"
class= "class" style="width: 300px; height: 200px;" /><a href= "www.importantsite.com/" target="_blank">
<h3>title</h3></a> <a href= "http://anotherimportantsite.com/"target="_blank">
<h4><span>blahblah</span></h4></a><p>blahblah<span>... <img src="http://example.com/blah.gif" alt="" class="example" /></span></p>

                </div>
            </div>
        </div>
    </div>
</div>


You could use a positive lookahead.

 

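// s modifier: "." also matches newlines; \s* allows whitespace between the closing tags
$pattern = '~<div class="items">(.*)</div>(?=\s*</div>\s*</div>\s*</div>\s*</div>)~s';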

 

This should capture everything inside the items div, up to a closing </div> that is followed by at least 4 more closing </div> tags, without including those 4 closing tags in the match. I haven't tested it though, so please tell me if it works ;D
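
For anyone who wants to try the lookahead idea, here is a minimal sketch of how it could be run with preg_match. The sample markup is a trimmed, hypothetical version of the snippet in the question, and the pattern assumes the items div sits inside exactly four wrapper divs and that whitespace may separate the closing tags. If the page nests things differently, the match will be wrong, so treat this as an experiment rather than a robust solution.

```php
<?php
// Hypothetical sample: a shortened, well-formed version of the posted markup.
$html = <<<HTML
<div class="content">
    <div class="main">
        <div class="hero hero-small">
            <div id="slide-wrapper">
                <div class="items">
                    <div class="container-outer">
                        <div class="container-inner"><h3>title</h3></div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
HTML;

// Greedy (.*) runs to the last </div> that is still followed by
// four more closing divs; the lookahead keeps those four out of the match.
$pattern = '~<div class="items">(.*)</div>(?=\s*</div>\s*</div>\s*</div>\s*</div>)~s';

// $items holds the inner HTML of the items div, or null when nothing matched.
$items = preg_match($pattern, $html, $m) ? trim($m[1]) : null;

echo $items === null ? 'no match' : $items, PHP_EOL;
```

The number of closing tags in the lookahead is tied to the nesting depth, so any layout change on the page would require editing the pattern.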


This is too complicated for RegEx to work well.

 

Use an HTML parser, or string functions. I can show you how to do it with a parser, but you'll have to give me the entire document you want to parse.
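
To illustrate the parser route, here is a rough sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath). This is only one possible parser and not necessarily the one meant above. The markup is a hypothetical, cut-down stand-in for the real page, and the variable names are made up for the example.

```php
<?php
// Hypothetical, shortened markup standing in for the real page source.
$html = '<div id="slide-wrapper"><div class="items">'
      . '<div class="container-outer"><div class="container-inner">'
      . '<a href="http://anotherimportantsite.com/">blahblah</a>'
      . '</div></div></div></div>';

// Real-world pages rarely validate, so collect parser warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every div whose class list contains the word "items".
$xpath = new DOMXPath($doc);
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " items ")]';

$links = [];
foreach ($xpath->query($query) as $items) {
    // Serialise the whole items div, nested divs included.
    echo $doc->saveHTML($items), PHP_EOL;

    // Collect the href of every link found inside it.
    foreach ($items->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }
}

print_r($links);
```

Unlike a regular expression, this does not depend on how many wrapper divs surround the items div or on the whitespace between tags.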


The document I'm trying to get this from is mail.com; it's their news slider. If you right-click and choose View Source, you'll see a huge page. I've already read through it all and worked out that this is the only part I need. It's the dynamic part; the rest is just fixed code, like the slider itself. Only the content changes.


Hey all,

I solved the issue. It took me ages and I learnt a lot, as I'm not a PHP coder, but here is what I did.

All from the power of Google ;)

Thanks to everyone who helped, and sorry for being so confusing.

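<?php
// simple_html_dom.php must be available next to this script
include 'simple_html_dom.php';

// Create a DOM from the remote page (file_get_html may return false on failure)
$dom = file_get_html('http://www.mail.com/');
if (!$dom) {
    exit('Could not load the page.');
}

// The news slider is the element with id="slide-wrapper"
$slider = $dom->getElementById('slide-wrapper');
// Alternative selector: $slider = $dom->find('div.items', 0);
echo $slider;
?>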
