ludo1960 Posted February 22, 2019

Hi guys, I'm trying to build an array to replicate the hierarchy in a menu:

    <ul>
      <li><a href="file1.html">text1</a></li>
      <ul>
        <li><a href="file2.html">text2</a></li>
        <li><a href="file3.html">text3</a></li>
        <li><a href="file4.html">text4</a></li>
        <li><a href="file5.html">text5</a></li>
      </ul>
    </ul>

And I would like the output to be:

    "text1"
    "text1", "text2"
    "text1", "text3"
    "text1", "text4"
    "text1", "text5"

Here is my loop to go through the HTML hierarchy:

    foreach ($html2->find('ul') as $ul) {
        foreach ($ul->find('li') as $li) {
            foreach ($li->find('a') as $a) {
                // need to filter out empty and index.html, tried if(!$->href = 'index.html) {do stuff} but didn't work
                $links2[] = $a->href;
                $taxo2[] = $a->plaintext;
            }
        }
    }

This finds all the links but not the hierarchy. Any ideas how to approach this? And also how to filter out blanks and references to index.html?
gw1500se Posted February 22, 2019

It is not clear what you mean. Your code certainly will not do what you ask. You don't save the text from the <li> that follows the first <ul>. It sounds like you want an associative array where the key is 'text1' and the value is an array of 'text2', 'text3', etc.
requinix Posted February 22, 2019

Yeah, unless there is a particular need for that 1/2, 1/3, 1/4, 1/5 list then it really should be an array containing one entry for 1, that itself has another array of 2-5.

    array(
        "text1" => array( "text2", "text3", "text4", "text5" )
    )

or

    array(
        array(
            "name" => "text1",
            "items" => array(
                array( "name" => "text2", "items" => array() ),
                array( "name" => "text3", "items" => array() ),
                array( "name" => "text4", "items" => array() ),
                array( "name" => "text5", "items" => array() )
            )
        )
    )

Also your HTML is incorrect: the UL needs to be within the parent LI. Outside it is invalid.

Keep in mind that ->find() will find all descendants, not just direct children, so ->find('a') on the text1 LI will find all five links. A better approach would be to find the A from the LI's set of immediate children, then from there go recursively into the immediate child UL if any.
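The recursive approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in PHP's built-in DOMDocument rather than Simple HTML DOM, it assumes the corrected markup (nested UL inside its parent LI), and `directChildren()` and `menuToArray()` are hypothetical helper names, not part of any library. It also skips blank links and references to index.html, as asked in the first post.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, not Simple HTML DOM.
// directChildren() and menuToArray() are hypothetical helper names.

/** Return the immediate element children of $node with tag name $tag. */
function directChildren(DOMNode $node, string $tag): array
{
    $found = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement && strtolower($child->tagName) === $tag) {
            $found[] = $child;
        }
    }
    return $found;
}

/** Convert one <ul> into a list of ['name', 'href', 'items'] entries. */
function menuToArray(DOMElement $ul): array
{
    $items = [];
    foreach (directChildren($ul, 'li') as $li) {
        $links = directChildren($li, 'a');
        if ($links === []) {
            continue; // this <li> has no anchor of its own
        }
        $href = trim($links[0]->getAttribute('href'));
        $name = trim($links[0]->textContent);
        // Skip blank links and references to index.html.
        if ($href === '' || $name === '' || basename($href) === 'index.html') {
            continue;
        }
        // Recurse into any <ul> that is an immediate child of this <li>.
        $children = [];
        foreach (directChildren($li, 'ul') as $subUl) {
            $children = array_merge($children, menuToArray($subUl));
        }
        $items[] = ['name' => $name, 'href' => $href, 'items' => $children];
    }
    return $items;
}

// Corrected markup: the nested <ul> sits inside its parent <li>.
$markup = '<ul><li><a href="file1.html">text1</a><ul>'
    . '<li><a href="file2.html">text2</a></li>'
    . '<li><a href="index.html">home</a></li>'
    . '<li><a href="file3.html">text3</a></li>'
    . '</ul></li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$root = $doc->getElementsByTagName('ul')->item(0); // outermost <ul>
$menu = menuToArray($root);
print_r($menu);
```

Because each level only looks at immediate children, a link is recorded once, under its own parent, instead of being picked up again by every ancestor.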
ludo1960 Posted February 23, 2019

@requinix

Quote:
    A better approach would be to find the A from the LIs set of immediate children, then from there go recursively into the immediate child UL if any.

Sorry if my request wasn't clear. It's just that I am confused as to how to traverse the ul's and the li's to get the immediate children a's. The help for the DOM parser doesn't make it clear to me how to find child nodes, if indeed there are any. The idea behind me wanting the output to be text1... is that I want the structure of the array to reflect the structure of the ul's and li's, for simplicity's sake.

Thank you both for taking the time to answer my post, your help is greatly appreciated.
requinix Posted February 24, 2019

Have you checked the main documentation page? Yes, you can get to a node's children.
ludo1960 Posted February 24, 2019

Hello again, yes I have read the docs, but there seems to be a tiny wee gap in my interpretation:

    foreach ($html->find('li') as $li) {
        $str1[] = $li->find('a')->first_child();
        // foreach ($li->find('a') as $a) {
        //     $a->find('#layout', 0)->children(1)->children(1)->children(2)->id ;
        // }
    }

All attempts end in abject failure.

Quote:
    Fatal error: Uncaught Error: Call to a member function first_child() on array

What am I doing wrong?
requinix Posted February 24, 2019

I don't see anything "first_child" on that doc page I linked, but that "How to traverse the DOM tree?" example looks highly relevant.
ludo1960 Posted February 24, 2019

first_child is on the "traverse the DOM tree" page, but I think my usage of it is wrong. A small hint or pointer would be of great help!
gw1500se Posted February 24, 2019

Hint: Get the array of children first.
ludo1960 Posted February 24, 2019

The way I read the docs, I need to find the child nodes of an element, and that is what I tried:

    foreach ($html->find('li') as $li) {
        $str1[] = $li->find('a')->first_child();
    }

I need a bigger hint! Come on guys, throw the dog a bone.
maxxd Posted February 24, 2019

Take a look at how you're making the call and how the documentation makes the call. I've updated the examples a bit to make the comparison more direct, I hope.

Yours:

    $li->find('a')->first_child();

Theirs:

    $li->find('a', 0)->first_child();

And the error message:

    Call to a member function first_child() on array

And finally, the documentation itself states (modified for emphasis):

Quote:
    // Find all anchors, returns a array of element objects
    $ret = $html->find('a');
    // Find (N)th anchor, returns element object or null if not found (zero based)
    $ret = $html->find('a', n);

Hope that helps.
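To make the difference between the two return types concrete, here is a minimal sketch. It assumes nothing about the real library beyond the documentation quoted above: `StubNode` is a hypothetical stand-in that only imitates those return types (an array without an index, a single element or null with one) and ignores the selector entirely.

```php
<?php
// StubNode is a hypothetical stand-in that only imitates the documented
// return types of find(); it is NOT the Simple HTML DOM class.
final class StubNode
{
    /** @var string */
    public $href;
    /** @var StubNode[] */
    private $anchors;

    public function __construct(string $href = '', array $anchors = [])
    {
        $this->href = $href;
        $this->anchors = $anchors;
    }

    /** No index: array of matches. With index: one match, or null. */
    public function find(string $selector, ?int $idx = null)
    {
        if ($idx === null) {
            return $this->anchors;          // always an array, possibly empty
        }
        return $this->anchors[$idx] ?? null; // one node, or null if missing
    }
}

$withLink = new StubNode('', [new StubNode('file1.html')]);
$empty    = new StubNode();                 // an <li> without any anchor

$hrefs = [];
foreach ([$withLink, $empty] as $li) {
    $a = $li->find('a', 0);                 // single element or null
    if ($a === null) {
        continue;                           // guard before calling anything on it
    }
    $hrefs[] = $a->href;
}
print_r($hrefs);
```

Calling a method on the result of the no-index form fails because an array has no methods, and the indexed form still needs the null check for items that contain no anchor.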
ludo1960 Posted February 24, 2019

Tried both your suggestions:

    foreach ($html->find('li') as $li) {
        $str1 = $li->find('a', 0)->first_child();
    }

Results in:

    Fatal error: Uncaught Error: Call to a member function first_child() on null

    $str2 = $html->find('a', 0);

Results in:

    Fatal error: Allowed memory size of 2147483648 bytes exhausted

I am missing something fundamental here, I'm all out of ideas. If I promise never to laugh again at your beloved President, would you help me out?
gw1500se Posted February 24, 2019 (edited)

You need to learn some debugging techniques so you can help yourself. First use:

    echo "<pre>";
    var_dump($html->find('li'));
    echo "</pre>";

to see exactly what you are retrieving. From there you should be able to figure out what to use to parse that result, or whether you are not getting what you expect.

Edited February 24, 2019 by gw1500se
ludo1960 Posted February 24, 2019

Eek!!

    object(simple_html_dom_node)#27 (9) {
      ["nodetype"]=> int(1)
      ["tag"]=> string(2) "li"
      ["attr"]=> array(1) { ["style"]=> string(12) "float: left;" }
      ["children"]=> array(1) {
        [0]=> object(simple_html_dom_node)#28 (9) {
          ["nodetype"]=> int(1)
          ["tag"]=> string(1) "a"
          ["attr"]=> array(1) { ["href"]=> string(25) "function.odbc-tables.html"
          ......

Yes, you are 100% correct, I need to learn some debugging techniques. Never seen an array this big! So, to access the children, I don't understand why

    $str1 = $li->children() ;

does not work. I thought that is how you access elements in an object. What am I missing?
gw1500se Posted February 25, 2019

$str1 is not a string so you can't treat it that way. It is an array. You can see that $str1[0] is the element you are looking for.
maxxd Posted February 25, 2019

20 hours ago, ludo1960 said:
    Tried both your suggestions:

They weren't intended to be copy and paste solutions. They were a mashup of the documentation and your code with the goal of showing you how the function call needs to be made to get you the results you want. Read the code, read the documentation, then apply logic. As @gw1500se says, $str1 isn't a string, it's an object with some properties that are arrays and it needs to be treated as such.
ludo1960 Posted February 25, 2019

I know you guys are trying to point me in the right direction, but after hours going around in circles I just can't figure out how to access an object that is in an array. Must be obvious for you guys, but I can't see it. Remember when you were first learning the dark art of PHP and you had a WTF moment? Well, that's me right now.
Barand Posted February 25, 2019

The structure may be clearer to you if you use print_r() instead of var_dump():

    echo "<pre>";
    print_r($html->find('li'));
    echo "</pre>";
ludo1960 Posted February 25, 2019

Trying your code:

    echo "<pre>";
    print_r($html->find('li'));
    echo "</pre>";

results in:

    Fatal error: Allowed memory size of 2147483648 bytes exhausted (tried to allocate 1071648768 bytes)

whereas:

    echo "<pre>";
    var_dump($html->find('li'));
    echo "</pre>";

spits out the largest array known to mankind:

    array(9) {
      [0]=> object(simple_html_dom_node)#27 (9) {
        ["nodetype"]=> int(1)
        ["tag"]=> string(2) "li"
        ["attr"]=> array(1) { ["style"]=> string(12) "float: left;" }
        ["children"]=> array(1) {
          [0]=> object(simple_html_dom_node)#28 (9) {
            ["nodetype"]=> int(1)
            ["tag"]=> string(1) "a"
            ["attr"]=> array(1) { ["href"]=> string(14) "intro.pdo.html" }
            ["children"]=> array(0) { }
            ["nodes"]=> array(1) {
              [0]=> object(simple_html_dom_node)#29 (9) {
                ["nodetype"]=> int(3)
                ["tag"]=> string(4) "text"
                ["attr"]=> array(0) { }
                ["children"]=> array(0) { }
                ["nodes"]=> array(0) { }
                ["parent"]=> *RECURSION*
                ["_"]=> array(1) { [4]=> string(15) "« Introduction" }
                ["tag_start"]=> int(0)
                ["dom":"simple_html_dom_node":private]=> object(simple_html_dom)#2 (23) {
                  ["root"]=> object(simple_html_dom_node)#3 (9) {
    ............ad infinitum!!
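The dump is enormous because each node carries a "parent" back-reference and a reference to the whole document, so a generic dumper keeps walking them. One way around that, sketched here under loud assumptions, is to print only the fields of interest and follow only the "children" arrays. The stdClass nodes below are hand-built stand-ins that mirror the property names visible in the dump, and `summarise()` is a hypothetical helper, not a library function.

```php
<?php
// Sketch: summarise a node tree by tag (and href), never following "parent".
// The stdClass nodes are hand-built stand-ins; summarise() is hypothetical.
function summarise(object $node, int $depth = 0): string
{
    $line = str_repeat('  ', $depth) . $node->tag;
    if (isset($node->attr['href'])) {
        $line .= ' href=' . $node->attr['href'];
    }
    $out = $line . "\n";
    // Only descend through "children"; back-references are ignored.
    foreach ($node->children ?? [] as $child) {
        $out .= summarise($child, $depth + 1);
    }
    return $out;
}

$a = new stdClass();
$a->tag = 'a';
$a->attr = ['href' => 'intro.pdo.html'];
$a->children = [];

$li = new stdClass();
$li->tag = 'li';
$li->attr = ['style' => 'float: left;'];
$li->children = [$a];
$a->parent = $li; // back-reference, like the *RECURSION* entry in the dump

$summary = summarise($li);
echo $summary;
```

The output is one short line per element, which makes the li → a nesting much easier to follow than the full dump.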
maxxd Posted February 25, 2019

Give this a try:

    $li = $html->find('li');
    print("<p>{$li[0]->children[0]->attr['href']}</p>");

and see if you can follow the track through the output of the var_dump() function. Then try this:

    $li = $html->find('li', 0);
    print("<p>{$li->children[0]->attr['href']}</p>");

and follow that as well. Coupled with the documentation and the comments above, things will hopefully start to look a little clearer...
ludo1960 Posted February 26, 2019

Halle friggen lujah!!! Am I using the right approach?

    for ( $i = 0 ; $i < count($li) ; $i++ ) {
        echo $li[$i]->children[0]->attr['href'] . '<br>' ;
        //echo $li[$i]->children[0]->children[0] . '<br>' ;
    }

This gets me the child nodes on the page visited. All well and good, but I also need the text of the link, and it's buried deeper in the array/object:

    object(simple_html_dom_node)#66 (9) {
      ["nodetype"]=> int(1)
      ["tag"]=> string(2) "li"
      ["attr"]=> array(0) { }
      ["children"]=> array(1) {
        [0]=> object(simple_html_dom_node)#67 (9) {
          ["nodetype"]=> int(1)
          ["tag"]=> string(1) "a"
          ["attr"]=> array(1) { ["href"]=> string(21) "pdo.requirements.html" }
          ["children"]=> array(0) { }
          ["nodes"]=> array(1) {
            [0]=> object(simple_html_dom_node)#68 (9) {
              ["nodetype"]=> int(3)
              ["tag"]=> string(4) "text"
              ["attr"]=> array(0) { }
              ["children"]=> array(0) { }
              ["nodes"]=> array(0) { }
              ["parent"]=> *RECURSION*
              ["_"]=> array(1) { [4]=> string(12) "Requirements" }

It's the last bit, "Requirements", just after the suspicious-looking *RECURSION*. I can see now how the objects and arrays work at the top level, but how do I address the ["_"][4]?
gw1500se Posted February 26, 2019 (edited)

["parent"]["_"][1][4] I think. You may need to parse each element to drill down to what you want.

Edited February 26, 2019 by gw1500se
ludo1960 Posted February 26, 2019 (edited)

Trust me, I'm trying.

    $li[$i]->children[0]->attr['href']

is obvious now.

    $li[$i]->children[0]->children[0]->_[4]

ain't so obvious! I need a Prolific Member.....but I suppose we all do.

Edited February 26, 2019 by ludo1960
ludo1960 Posted February 26, 2019

Already tried that:

    echo $li[$i]->children[0]->["parent"]->["_"]->[1]->[4] . '<br>' ;

and

    echo $li[$i]->children[0]->["parent"]->['_']->[1]->[4] . '<br>' ;

and lots of other guesses. Thanks for chipping in though!
gw1500se Posted February 26, 2019

You need to understand the difference between accessing an object property with the arrow operator (->) and accessing an array element with [...] (the '=>' in the dump only shows which key maps to which value).
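The rule can be read straight off the dump: wherever it says object(...), step in with ->property, and wherever it says array(...), step in with [key]. The sketch below rebuilds the dumped li → a → text structure by hand with stdClass objects (an assumption made purely for illustration; these are not objects produced by the library) and follows that path to the "Requirements" string.

```php
<?php
// Hand-built stand-in for the dumped structure; NOT produced by the library.
$text = new stdClass();                 // object(...) "text" node
$text->tag = 'text';
$text->_ = [4 => 'Requirements'];       // ["_"]=> array(1) { [4]=> ... }

$a = new stdClass();                    // object(...) "a" node
$a->tag = 'a';
$a->attr = ['href' => 'pdo.requirements.html'];
$a->children = [];                      // the text node is NOT in children
$a->nodes = [$text];                    // it lives in nodes

$li = new stdClass();                   // object(...) "li" node
$li->tag = 'li';
$li->children = [$a];

// object -> property, array [key], object -> property, array [key] ...
$href  = $li->children[0]->attr['href'];
$label = $li->children[0]->nodes[0]->_[4];

echo $href . ' / ' . $label . "\n";
```

Note that the anchor's "children" array is empty in the dump and the text node sits under "nodes", which is why going through children[0]->children[0] cannot reach it. With the real library, the plaintext property used in the first post ($a->plaintext) is the more direct way to read the link text without touching these internals.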