
count childNodes in domDocument



Hi guys,

Reading this on php.net has got me a wee bit confused. Trying to implement it has got me doubly confused! My code:

        $dom = new DOMDocument;
        libxml_use_internal_errors(true);
        $dom->loadHTMLFile($parent_node);

        if($dom->childNodes <>0) {
            $kids = array (
                'url' => $parent_node,
                'No_of_kids' => count($dom->childNodes)
            ); 
        }

This results in: 'Notice: Object of class DOMNodeList could not be converted to int'

How the heck am I supposed to count the childNodes?


A (very) brief note about Countable objects

Classes implementing the Countable interface define and implement their own count() method.  The DOMNodeList class is one such class (as of PHP 7.2).

Instances of classes that implement the Countable interface can be passed to the count() function, and their own special count() method gets called.  In DOMNodeList's case, that method returns the number of nodes in the list.  There is nothing stopping you from calling the count() method on the object (e.g. $myobject->count()) rather than the count() function (e.g. count($myobject)), if that's what you want to do.
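To make the Countable idea concrete, here is a minimal sketch using a made-up class. Playlist and its $tracks property are hypothetical names invented for this example; they are not part of PHP or the DOM extension. It only shows that count($object) and $object->count() end up in the same method.

```php
<?php
// Hypothetical class, for illustration only: not part of PHP itself.
final class Playlist implements Countable
{
    /** @var string[] */
    private $tracks;

    public function __construct(array $tracks)
    {
        $this->tracks = $tracks;
    }

    // Called directly, or indirectly whenever count($playlist) is used.
    public function count(): int
    {
        return count($this->tracks);
    }
}

$playlist = new Playlist(['intro', 'verse', 'outro']);

$viaFunction = count($playlist);   // goes through Playlist::count()
$viaMethod   = $playlist->count(); // calls the same method directly

var_dump($viaFunction === $viaMethod);
```

DOMNodeList behaves the same way, with its count() method reporting how many nodes are in the list.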

 

How the heck am i supposed to count the childNodes?

Back to your original question.  There are several ways to get the number of nodes in a DOMNodeList (which is what your $dom->childNodes is).

1. $dom->childNodes->length
2. count($dom->childNodes)
3. $dom->childNodes->count()
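As a quick check that the three options agree, here is a small sketch. The HTML string is a made-up stand-in for a real page, and options 2 and 3 assume PHP 7.2 or later.

```php
<?php
// Made-up markup standing in for a real page.
$html = '<!DOCTYPE html><html><body><p>one</p><p>two</p></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();

$nodes = $dom->childNodes;        // a DOMNodeList

$byLength   = $nodes->length;     // 1. the length property
$byFunction = count($nodes);      // 2. the count() function
$byMethod   = $nodes->count();    // 3. the count() method

var_dump($byLength === $byFunction && $byFunction === $byMethod);
```

Comparing the list itself to a number (as in `$dom->childNodes <> 0`) is what triggers the notice; comparing `$dom->childNodes->length` to a number does not.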

 


Hmm,

        $dom = new DOMDocument;
        libxml_use_internal_errors(true) ;
        $dom->loadHTMLFile($parent_node) ;

        if($dom->childNodes->length <>0) {
            $kids[] = array (
                'url' => $parent_node,
                'No_of_kids' => count($dom->childNodes) 
            );   
        }
        echo '<pre>', print_r($kids, true), '</pre>';

Results in:

    [0] => Array
        (
            [url] => http://mysite.com/test/php/intro.pdo.html
            [No_of_kids] => 2
        )

    [1] => Array
        (
            [url] => http://mysite.com/test/php/pdo.setup.html
            [No_of_kids] => 2
        )

    [2] => Array
        (
            [url] => http://mysite.com/test/php/pdo.constants.html
            [No_of_kids] => 2

Pretty sure the answer ain't 2 every time, something fishy going on. Any ideas guys?

1 hour ago, ludo1960 said:

Pretty sure the answer ain't 2 every time, something fishy going on. Any ideas guys?

It looks like you're scraping pages from the PHP manual, so taking one of those as an example, the HTML looks like this (super-stripped down for simplicity):

<?php
$html = '<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
    <!-- lots more goes here -->
</html>';
$dom = new DOMDocument();
$dom->loadHTML($html);

var_dump($dom->childNodes->length);
foreach ($dom->childNodes as $childNode) {
    var_dump(get_class($childNode));
}

The above outputs the following:

int(2)
string(15) "DOMDocumentType"
string(10) "DOMElement"

This shows that the document ($dom) has two child nodes: 1. the document type (<!DOCTYPE html>) and 2. the "html" element.
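If the number you are really after is how many elements sit inside a particular part of the page, you need to ask a node further down the tree rather than the document itself. A rough sketch, assuming you want the element children of the body; the markup is invented for the example and the choice of body is only a guess at what you need:

```php
<?php
// Invented markup; a real page would be loaded with loadHTMLFile().
$html = '<!DOCTYPE html><html><head><title>t</title></head>'
      . '<body><p>one</p><p>two</p><p>three</p></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// Children of the document itself: the doctype and the html element.
$documentLevel = $dom->childNodes->length;

// Children of <body>, counting elements only (text and comment nodes are skipped).
$body = $dom->getElementsByTagName('body')->item(0);
$elementCount = 0;
foreach ($body->childNodes as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $elementCount++;
    }
}

var_dump($documentLevel, $elementCount);
```

On real pages childNodes usually includes whitespace text nodes between tags, which is why the loop checks nodeType instead of using length directly.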

Hope that helps.


One last question: if I'm crawling a site, e.g. index page -> level 1 page -> level 2 page etc. (no more child pages after this), how do I know I've reached the end point? Should I expect the number of childNodes to be 0? Or have I got the wrong end of the stick?


Depends, but the answer is probably that you have to figure it out for yourself, typically by storing a list of the URLs you've already hit and checking it whenever you want to crawl a new page.
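A rough sketch of that idea, not a finished crawler. The crawl() function, the $fetch callable, and the in-memory $pages array are all hypothetical names made up for this example; in real code $fetch would make an HTTP request, and you would also need to resolve relative links and stay on your own domain. The crawl ends when the queue of unvisited URLs runs dry, which has nothing to do with childNodes being 0.

```php
<?php
/**
 * Crawl from $startUrl until no unvisited links remain.
 * $fetch is a hypothetical callable: URL in, HTML string (or null) out.
 */
function crawl(string $startUrl, callable $fetch): array
{
    $visited = [];          // URLs already processed, used as a set
    $queue   = [$startUrl]; // URLs still waiting to be fetched

    while ($queue) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue;       // already crawled, skip it
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if ($html === null || $html === '') {
            continue;       // nothing to parse for this URL
        }

        $dom = new DOMDocument();
        libxml_use_internal_errors(true);
        $dom->loadHTML($html);
        libxml_clear_errors();

        // Queue every link that has not been visited yet.
        foreach ($dom->getElementsByTagName('a') as $anchor) {
            $href = $anchor->getAttribute('href');
            if ($href !== '' && !isset($visited[$href])) {
                $queue[] = $href;
            }
        }
    }

    return array_map('strval', array_keys($visited));
}

// Stand-in for real HTTP requests: a tiny in-memory "site".
$pages = [
    '/index' => '<html><body><a href="/a">A</a><a href="/b">B</a></body></html>',
    '/a'     => '<html><body><a href="/index">home</a><a href="/b">B</a></body></html>',
    '/b'     => '<html><body><p>No links here.</p></body></html>',
];
$fetch = function (string $url) use ($pages) {
    return $pages[$url] ?? null;
};

$result = crawl('/index', $fetch);
print_r($result);
```

The link back to /index on page /a is what the visited list protects against: without it the crawl would loop forever.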

This thread is more than a year old.
