
ludo1960

Members
  • Posts

    123
  • Joined

  • Last visited

ludo1960's Achievements

Advanced Member

Advanced Member (4/5)

0

Reputation

  1. My crystal ball says that there are some in_array() gymnastics coming my way! Cheers!
  2. One last question: if I'm crawling a site, e.g. index page -> level 1 page -> level 2 page etc. (no more child pages after this), how do I know I've reached the end point? Should I expect no. of childNodes = 0? Or have I got the wrong end of the stick?
  3. Thanks for your reply, lots of stuff to read up on, then it's play time
  4. Hmm,

         $dom = new DOMDocument;
         libxml_use_internal_errors(true) ;
         $dom->loadHTMLFile($parent_node) ;
         if($dom->childNodes->length <>0) {
             $kids[] = array (
                 'url' => $parent_node,
                 'No_of_kids' => count($dom->childNodes)
             );
         }
         echo '<pre>',print_r( $kids ),'</pre>';

     Results in:

         [0] => Array ( [url] => http://mysite.com/test/php/intro.pdo.html [No_of_kids] => 2 )
         [1] => Array ( [url] => http://mysite.com/test/php/pdo.setup.html [No_of_kids] => 2 )
         [2] => Array ( [url] => http://mysite.com/test/php/pdo.constants.html [No_of_kids] => 2 )

     Pretty sure the answer ain't 2 every time, something fishy going on. Any ideas guys?
  5. Surely you mean this link http://php.net/manual/en/domnodelist.count.php and not the link you sent http://php.net/manual/en/class.domnodelist.php which says the object is countable, but thanks for your non-answer anyway.
  6. Hi guys, reading this from php.net has got me a wee bit confused. Trying to implement it has got me doubly confused! My code:

         $dom = new DOMDocument;
         libxml_use_internal_errors(true);
         $dom->loadHTMLFile($parent_node);
         if($dom->childNodes <>0) {
             $kids = array (
                 'url' => $parent_node,
                 'No_of_kids' => count($dom->childNodes)
             );
         }

     Results in 'Notice: Object of class DOMNodeList could not be converted to int'. How the heck am I supposed to count the childNodes?
  7. Oops, all good now. Thanks for pointing me in the right direction, off to play now and I promise to read the manual.
  8. Hi guys, just starting to play with PHP DOMDocument, only to fail at the very first step:

         <?php
         $html = 'test/php/somefile.html' ;
         if(!empty($html)){
             $dom_1 = new domDocument ;
             $dom_1->loadHTML($html) ;
             $links = $dom_1->getElementsByTagName('li') ;
             foreach ( $links as $link) {
                 // echo $link ;
                 echo $link->nodeValue, PHP_EOL;
             }
         }
         ?>

     When I visit it in a browser I get a WSOD, what am I missing?
  9. Good guess! Yeah I tried that, but some of the $lis2 don't have children, and I'm not sure how to deal with null results.

         $lis2 = $html2->find('.chunklist', 0)->children() ;

     results in:

         Fatal error: Uncaught Error: Call to a member function children() on null

     I've tried:

         for ( $j = 0 ; $j < count($lis2) ; $j++ ) {
             if ($lis2 > 0) { // and tried $lis2 <> 0
                 $parent_term = $lis2[$j]->first_child()->innertext . ', ' ;
                 $parent_node = $lis2[$j]->children[0]->attr['href'] . '<br>' ;
                 // echo count( $parent_node ) ;
             } else {
                 echo "no data found" ;
             }
         }

     Kinda stuck as to what to try next?
  10. Hi guys, I'm using PHP Simple DOM, and thanks to the good folk here I'm making progress. The HTML I'm parsing has a bunch of links in a li ul structure. I've managed to get the top layer of links extracted and I would like to have a count of the number of child nodes in the layer below the main links. Here is my code:

          $html = file_get_html('test/php/book.html');
          if(!empty($html)){
              $lis = $html->find('.chunklist', 0)->children() ;
              for ( $i = 0 ; $i < count($lis) ; $i++ ) {
                  $parent_term = $lis[$i]->first_child()->innertext . ', ' ;
                  $parent_node = $lis[$i]->children[0]->attr['href'] . '<br>' ;
                  //echo count($parent_node->children()) ; this gives error Warning: count(): Parameter must be an array or an object that implements Countable
                  echo $parent_term . $parent_node ;
                  $parent_node = $const . $parent_node ;
                  echo $parent_node ;
                  $html2 = file_get_html($parent_node) ;
                  $lis2 = $html2->find('.chunklist', 0)->children() ;
              }
          }

      I don't see anything in the manual regarding counting nodes, any idea how to go about this?
  11. Thanks again, changed a bit of your code and it works great:

          for ( $i = 0 ; $i < count($li) ; $i++ ) {
              echo $li[$i]->children[0]->attr['href'] . '<br>' ;
              //echo $li[$i]->children[0]->children[0]->_[4] . '<br>' ; This was my effort lol!
              echo $li[$i]->first_child()->innertext ;
          }

      So now I have all I need to construct my associative array! Great answer! Thanks again Maxxd
  12. Yeah, got lots to learn. Wouldn't want it any other way! At least I try, that's got to count for something?
  13. Already tried that:

          echo $li[$i]->children[0]->["parent"]->["_"]->[1]->[4] . '<br>' ;

      and

          echo $li[$i]->children[0]->["parent"]->['_']->[1]->[4] . '<br>' ;

      and lots of other guesses. Thanks for chipping in though!
  14. Trust me I'm trying. $li[$i]->children[0]->attr['href'] is obvious now; $li[$i]->children[0]->children[0]->_[4] ain't so obvious! I need a Prolific Member.....but I suppose we all do
  15. Halle friggen lujah!!! Am I using the right approach?

          for ( $i = 0 ; $i < count($li) ; $i++ ) {
              echo $li[$i]->children[0]->attr['href'] . '<br>' ;
              //echo $li[$i]->children[0]->children[0] . '<br>' ;
          }

      Gets me the child nodes on the page visited, all good and well, but I also need the text from the href. It's buried deeper in the array/object:

          object(simple_html_dom_node)#66 (9) {
            ["nodetype"]=> int(1)
            ["tag"]=> string(2) "li"
            ["attr"]=> array(0) { }
            ["children"]=> array(1) {
              [0]=> object(simple_html_dom_node)#67 (9) {
                ["nodetype"]=> int(1)
                ["tag"]=> string(1) "a"
                ["attr"]=> array(1) { ["href"]=> string(21) "pdo.requirements.html" }
                ["children"]=> array(0) { }
                ["nodes"]=> array(1) {
                  [0]=> object(simple_html_dom_node)#68 (9) {
                    ["nodetype"]=> int(3)
                    ["tag"]=> string(4) "text"
                    ["attr"]=> array(0) { }
                    ["children"]=> array(0) { }
                    ["nodes"]=> array(0) { }
                    ["parent"]=> *RECURSION*
                    ["_"]=> array(1) { [4]=> string(12) "Requirements" }

      The last bit, "Requirements", comes just after the suspicious looking *RECURSION*. I can see now how the objects and arrays work at the top level, but how to address the ["_"][4]?
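
Several of the posts above circle the same three problems: getting a link's href and text, counting the links below a list item, and coping with pages that have no list at all. Here is a minimal sketch of one way to do that with PHP's built-in DOMDocument and DOMXPath rather than Simple DOM. The sample markup, the file names in it, and the $links array layout are made up for illustration; only the 'chunklist' class and the 'url' / 'No_of_kids' keys come from the posts. It also hints at why $dom->childNodes keeps reporting 2: the document node usually has just the doctype and the html element as children, so the count has to be taken lower down the tree.

```php
<?php
// Hypothetical sample markup standing in for a page such as book.html.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<ul class="chunklist">
  <li><a href="intro.pdo.html">Introduction</a></li>
  <li><a href="pdo.setup.html">Installing/Configuring</a>
    <ul>
      <li><a href="pdo.requirements.html">Requirements</a></li>
      <li><a href="pdo.installation.html">Installation</a></li>
    </ul>
  </li>
</ul>
</body></html>
HTML;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);              // loadHTML() takes markup; loadHTMLFile() takes a path or URL
$xpath = new DOMXPath($dom);

$links = [];
// Top-level <li> items of the first <ul> whose class attribute contains "chunklist".
// If the page has no such list, the query is simply empty and the loop never runs.
$items = $xpath->query('(//ul[contains(@class, "chunklist")])[1]/li');
foreach ($items as $li) {
    $a = $xpath->query('./a', $li)->item(0);
    if ($a === null) {
        continue;                   // an <li> without a direct link: skip it
    }
    $links[] = [
        'text'       => trim($a->textContent),
        'url'        => $a->getAttribute('href'),
        // Links nested one level below this item; 0 means nothing deeper to crawl.
        'No_of_kids' => $xpath->query('./ul/li/a', $li)->length,
    ];
}

print_r($links);
```

A DOMNodeList exposes its size through ->length, so there is no need to compare the list object itself with 0. An item whose No_of_kids is 0 is a reasonable signal that the crawl has reached an end point, assuming the child links really do live in a nested ul as in this sample.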