ludo1960

  1. ludo1960

    count childNodes in domDocument

    My crystal ball says that there are some in_array() gymnastics coming my way! Cheers!
  2. ludo1960

    count childNodes in domDocument

    One last question: if I'm crawling a site, e.g. index page -> level 1 page -> level 2 page etc. (no more child pages after this), how do I know I've reached the end point? Should I expect the number of childNodes to be 0? Or have I got the wrong end of the stick?
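A rough sketch of one way to spot the end point: treat a page as a leaf when it contains no further links, rather than waiting for the document's childNodes count to reach 0. The in-memory $pages array, the file names and the helper functions extractLinks() and crawl() are all hypothetical stand-ins for real fetching code.

```php
<?php
// Hypothetical in-memory "site" (URL => HTML) so the sketch runs without network access.
$pages = [
    'index.html'  => '<ul class="chunklist"><li><a href="level1.html">Level 1</a></li></ul>',
    'level1.html' => '<ul class="chunklist"><li><a href="level2.html">Level 2</a></li></ul>',
    'level2.html' => '<p>No further links.</p>',
];

// Hypothetical helper: return the href of every <a> element in the markup.
function extractLinks(string $html): array
{
    libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Hypothetical helper: depth-first crawl that records pages with no outgoing links.
function crawl(string $url, array $pages, array &$visited, array &$leaves): void
{
    if (isset($visited[$url]) || !isset($pages[$url])) {
        return; // already seen, or not part of the example site
    }
    $visited[$url] = true;

    $links = extractLinks($pages[$url]);
    if ($links === []) {
        $leaves[] = $url; // no links left to follow: this is an end point
        return;
    }
    foreach ($links as $link) {
        crawl($link, $pages, $visited, $leaves);
    }
}

$visited = [];
$leaves = [];
crawl('index.html', $pages, $visited, $leaves);
print_r($leaves);
```

The $visited array also stops the crawl from looping when pages link back to each other.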
  3. ludo1960

    count childNodes in domDocument

    Thanks for your reply, lots of stuff to read up on, then it's play time
  4. ludo1960

    count childNodes in domDocument

    Hmm,

        $dom = new DOMDocument;
        libxml_use_internal_errors(true) ;
        $dom->loadHTMLFile($parent_node) ;
        if($dom->childNodes->length <>0) {
            $kids[] = array (
                'url' => $parent_node,
                'No_of_kids' => count($dom->childNodes)
            );
        }
        echo '<pre>',print_r( $kids ),'</pre>';

    Results in:

        [0] => Array ( [url] => http://mysite.com/test/php/intro.pdo.html [No_of_kids] => 2 )
        [1] => Array ( [url] => http://mysite.com/test/php/pdo.setup.html [No_of_kids] => 2 )
        [2] => Array ( [url] => http://mysite.com/test/php/pdo.constants.html [No_of_kids] => 2

    Pretty sure the answer ain't 2 every time, something fishy going on. Any ideas, guys?
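A likely explanation for the constant 2: the direct children of a DOMDocument loaded from HTML are normally just the doctype and the root html element, whatever the page contains. A minimal sketch, assuming what is really wanted is the number of links in the page; the inline markup and the file names in it are made up for illustration.

```php
<?php
// Assumption: a small inline page stands in for the real URL.
$html = '<!DOCTYPE html><html><body><ul class="chunklist">'
      . '<li><a href="a.html">A</a></li>'
      . '<li><a href="b.html">B</a></li>'
      . '<li><a href="c.html">C</a></li>'
      . '</ul></body></html>';

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);

// List what actually sits directly under the document node.
foreach ($dom->childNodes as $node) {
    echo $node->nodeType, ' ', $node->nodeName, PHP_EOL;
}

// Count the elements of interest anywhere in the tree instead.
$linkCount = $dom->getElementsByTagName('a')->length;
echo 'links: ', $linkCount, PHP_EOL;
```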
  5. ludo1960

    count childNodes in domDocument

    Surely you mean this link http://php.net/manual/en/domnodelist.count.php and not the link you sent http://php.net/manual/en/class.domnodelist.php which says the object is countable, but thanks for your non-answer anyway.
  6. Hi guys,

    Reading this from php.net has got me a wee bit confused. Trying to implement it has got me doubly confused! My code:

        $dom = new DOMDocument;
        libxml_use_internal_errors(true);
        $dom->loadHTMLFile($parent_node);
        if($dom->childNodes <>0) {
            $kids = array (
                'url' => $parent_node,
                'No_of_kids' => count($dom->childNodes)
            );
        }

    Results in 'Notice: Object of class DOMNodeList could not be converted to int'. How the heck am I supposed to count the childNodes?
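The notice comes from comparing the DOMNodeList object itself with 0. A minimal sketch of the usual fix, reading the list's length property instead; the inline markup and the 'inline-example' URL value are placeholders, not part of the original code.

```php
<?php
// Assumption: inline markup stands in for the page loaded from $parent_node.
$html = '<!DOCTYPE html><html><body><ul><li>One</li><li>Two</li><li>Three</li></ul></body></html>';

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);

// DOMNodeList exposes its size as an integer through the length property.
$topLevelCount = $dom->childNodes->length;

$kids = [];
if ($topLevelCount > 0) {
    $kids = [
        'url'        => 'inline-example', // hypothetical placeholder
        'No_of_kids' => $topLevelCount,
    ];
}
print_r($kids);
```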
  7. ludo1960

    getting started with domdocument

    Oops, all good now. Thanks for pointing me in the right direction, off to play now and I promise to read the manual.
  8. Hi guys,

    Just starting to play with PHP DOMDocument, only to fail at the very first step:

        <?php
        $html = 'test/php/somefile.html' ;
        if(!empty($html)){
            $dom_1 = new domDocument ;
            $dom_1->loadHTML($html) ;
            $links = $dom_1->getElementsByTagName('li') ;
            foreach ( $links as $link) {
                // echo $link ;
                echo $link->nodeValue, PHP_EOL;
            }
        }
        ?>

    When I visit it in a browser I get a WSOD, what am I missing?
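The blank page is most likely because loadHTML() expects a string of markup, so the file path itself gets parsed as the document and no li elements are found. A minimal sketch using loadHTMLFile(), which takes a path; the temporary file and its contents are invented so the example runs on its own.

```php
<?php
// Assumption: a temporary file stands in for test/php/somefile.html.
$path = tempnam(sys_get_temp_dir(), 'dom');
file_put_contents($path, '<html><body><ul><li>First</li><li>Second</li></ul></body></html>');

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTMLFile($path); // loadHTMLFile() reads from a path, loadHTML() parses a string

$items = [];
foreach ($dom->getElementsByTagName('li') as $li) {
    $items[] = $li->nodeValue;
}
unlink($path); // remove the temporary file again

echo implode(PHP_EOL, $items), PHP_EOL;
```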
  9. ludo1960

    count child nodes in phpsimpledom

    Good guess! Yeah I tried that, but some of the $lis2 don't have children, and I'm not sure how to deal with null results,

        $lis2 = $html2->find('.chunklist', 0)->children() ;

    results in:

        Fatal error: Uncaught Error: Call to a member function children() on null

    I've tried:

        for ( $j = 0 ; $j < count($lis2) ; $j++ ) {
            if ($lis2 > 0) { // and tried $lis2 <> 0
                $parent_term = $lis2[$j]->first_child()->innertext . ', ' ;
                $parent_node = $lis2[$j]->children[0]->attr['href'] . '<br>' ;
                // echo count( $parent_node ) ;
            } else {
                echo "no data found" ;
            }
        }

    kinda stuck as to what to try next?
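The fatal error says the lookup itself returned null, so the check has to happen before children() is called, not inside the loop. A sketch of that guard, written with the built-in DOMDocument and DOMXPath rather than PHP Simple DOM so it runs without the library; countChunklistItems() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: count the element children of the first .chunklist, or 0 if there is none.
function countChunklistItems(string $html): int
{
    libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    // Match any element whose class attribute contains the word "chunklist".
    $lists = $xpath->query('//*[contains(concat(" ", normalize-space(@class), " "), " chunklist ")]');

    $first = $lists->item(0); // null when nothing matched
    if ($first === null) {
        return 0; // guard first, instead of calling a method on null
    }

    $count = 0;
    foreach ($first->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $count++; // skip text nodes such as whitespace
        }
    }
    return $count;
}

echo countChunklistItems('<ul class="chunklist"><li>a</li><li>b</li></ul>'), PHP_EOL;
echo countChunklistItems('<p>no list here</p>'), PHP_EOL;
```

With PHP Simple DOM the same idea would be to store the result of find('.chunklist', 0) in a variable and test it for null before calling children() on it.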
  10. Hi guys,

    I'm using PHP Simple DOM, and thanks to the good folk here I'm making progress. The HTML I'm parsing has a bunch of links in a ul/li structure. I've managed to get the top layer of links extracted, and I would like to have a count of the number of child nodes in the layer below the main links. Here is my code:

        $html = file_get_html('test/php/book.html');
        if(!empty($html)){
            $lis = $html->find('.chunklist', 0)->children() ;
            for ( $i = 0 ; $i < count($lis) ; $i++ ) {
                $parent_term = $lis[$i]->first_child()->innertext . ', ' ;
                $parent_node = $lis[$i]->children[0]->attr['href'] . '<br>' ;
                //echo count($parent_node->children()) ; this gives error Warning: count(): Parameter must be an array or an object that implements Countable
                echo $parent_term . $parent_node ;
                $parent_node = $const . $parent_node ;
                echo $parent_node ;
                $html2 = file_get_html($parent_node) ;
                $lis2 = $html2->find('.chunklist', 0)->children() ;
            }
        }

    I don't see anything in the manual regarding counting nodes, any idea how to go about this?
  11. ludo1960

    array hierarchy and filter

    Thanks again, changed a bit of your code and it works great:

        for ( $i = 0 ; $i < count($li) ; $i++ ) {
            echo $li[$i]->children[0]->attr['href'] . '<br>' ;
            //echo $li[$i]->children[0]->children[0]->_[4] . '<br>' ; This was my effort lol!
            echo $li[$i]->first_child()->innertext ;
        }

    So now I have all I need to construct my associative array! Great answer! Thanks again, Maxxd
  12. ludo1960

    array hierarchy and filter

    Yeah, got lots to learn. Wouldn't want it any other way! At least I try, that's got to count for something?
  13. ludo1960

    array hierarchy and filter

    Already tried that:

        echo $li[$i]->children[0]->["parent"]->["_"]->[1]->[4] . '<br>' ;

    and

        echo $li[$i]->children[0]->["parent"]->['_']->[1]->[4] . '<br>' ;

    and lots of other guesses. Thanks for chipping in though!
  14. ludo1960

    array hierarchy and filter

    Trust me, I'm trying. $li[$i]->children[0]->attr['href'] is obvious now, but $li[$i]->children[0]->children[0]->_[4] ain't so obvious! I need a Prolific Member.....but I suppose we all do
  15. ludo1960

    array hierarchy and filter

    Halle friggen lujah!!! Am I using the right approach?

        for ( $i = 0 ; $i < count($li) ; $i++ ) {
            echo $li[$i]->children[0]->attr['href'] . '<br>' ;
            //echo $li[$i]->children[0]->children[0] . '<br>' ;
        }

    Gets me the child nodes on the page visited, all good and well, but I also need the text from the href. It's buried deeper in the array/object:

        object(simple_html_dom_node)#66 (9) {
          ["nodetype"]=> int(1)
          ["tag"]=> string(2) "li"
          ["attr"]=> array(0) { }
          ["children"]=> array(1) {
            [0]=> object(simple_html_dom_node)#67 (9) {
              ["nodetype"]=> int(1)
              ["tag"]=> string(1) "a"
              ["attr"]=> array(1) { ["href"]=> string(21) "pdo.requirements.html" }
              ["children"]=> array(0) { }
              ["nodes"]=> array(1) {
                [0]=> object(simple_html_dom_node)#68 (9) {
                  ["nodetype"]=> int(3)
                  ["tag"]=> string(4) "text"
                  ["attr"]=> array(0) { }
                  ["children"]=> array(0) { }
                  ["nodes"]=> array(0) { }
                  ["parent"]=> *RECURSION*
                  ["_"]=> array(1) { [4]=> string(12) "Requirements" }

    The last bit, "Requirements", comes just after the suspicious looking *RECURSION*. I can see now how the objects and arrays work at the top level, but how do I address the ["_"][4]?
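Judging by the dump, the text sits in a text node under ["nodes"], not under ["children"], which is why the children[0]->children[0] guesses come up empty. Rather than reaching into ["_"], it is usually easier to ask the link element for its text. A sketch of the href-to-text mapping using the built-in DOMDocument instead of PHP Simple DOM, so it runs without the library; the second link in the sample markup is made up.

```php
<?php
// Assumption: inline markup modelled on the dump; pdo.installation.html is a hypothetical second entry.
$html = '<ul class="chunklist">'
      . '<li><a href="pdo.requirements.html">Requirements</a></li>'
      . '<li><a href="pdo.installation.html">Installation</a></li>'
      . '</ul>';

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);

// Build an associative array of href => link text.
$map = [];
foreach ($dom->getElementsByTagName('a') as $a) {
    $map[$a->getAttribute('href')] = trim($a->textContent);
}
print_r($map);
```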