dilbertone Posted December 24, 2010

Hello dear friends, first of all: merry Xmas!

I want to parse a page with the Simple HTML DOM Parser. I am pretty new to PHP and to the Simple HTML DOM Parser.

My example: http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1[uid]=60119

I want to collect the data in the block on that page. I have looked at the source code and found that the attribute of interest should be class="content", as in: <div class="content"><!-- TYPO3SEARCH_begin -->

Here is the code from my attempts:

// include the Simple HTML DOM Parser
include_once('simple_html_dom.php');

// fetch the page we want to parse and create a DOM
$html = file_get_html('');

// simple_html_dom::find() returns simple_html_dom objects
// for the matching child elements
foreach ($html->find('class: content ') as $h3) {
    // get the text inside the tag
    if ($h3->innertext == 'Text of a H3 Tag') {
        break;
    }
}

// simple_html_dom::next_sibling() returns the next element
$table = $h3->next_sibling();

But believe me, it does not give back what I am aiming for. What have I done wrong?

dbone
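One likely problem in the snippet above is the selector: 'class: content ' is not a CSS-style selector, and in Simple HTML DOM a class match would normally be written as find('div.content'). A second issue is that $h3 is used after the loop even when nothing matched. Because the thread does not show the page source, the following is only a minimal sketch of the same "find the heading, then take its next sibling" logic, written with PHP's built-in DOMDocument and DOMXPath so it runs without the external library. The sample markup and the variable names are assumptions made for illustration.

```php
<?php
// Hypothetical sample markup; the real page's structure is not shown in the thread.
$sample = '<html><body><div class="content"><h3>Text of a H3 Tag</h3>'
        . '<table><tr><td>Schulart:</td><td>BBS</td></tr></table></div></body></html>';

$dom = new DOMDocument();
// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match h3 elements inside any div whose class list contains the token "content".
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " content ")]//h3';

$table = null; // stays null when no matching heading is found
foreach ($xpath->query($query) as $h3) {
    if (trim($h3->textContent) === 'Text of a H3 Tag') {
        // Step to the next element sibling, skipping text nodes.
        $next = $h3->nextSibling;
        while ($next !== null && $next->nodeType !== XML_ELEMENT_NODE) {
            $next = $next->nextSibling;
        }
        $table = $next;
        break;
    }
}

$tableName = $table !== null ? $table->nodeName : null;
echo $tableName === null ? "no table found\n" : "found <{$tableName}>\n";
```

Keeping $table as null until a match is found makes the "nothing matched" case explicit instead of silently reusing the last loop element.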
RichardRotterdam Posted December 24, 2010

Is simple_html_dom.php part of TYPO3? Also, what is it you want to accomplish? I don't see a question anywhere.
dilbertone Posted December 24, 2010 Author

Hello, thanks for answering!

The Simple HTML DOM Parser is not part of TYPO3, at least I do not think so.

My example: I want to parse the page and get the information in the block, consisting of the following 11 labels and their corresponding values. See the page: http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1[uid]=60119

BTW: Sorry for the funny-looking URL, but it is the real URL!

Schulart: BBS
Schulnummer: 60119
Anschrift: Berufsbildende Schule Boppard, Antoniusstr. 21, 56154 Boppard
Telefon: (0 67 42) 80 61-0
Telefax: (0 67 42) 80 61-29
E-Mail: sekretariat@bbs-boppard.de
Internet: http://www.bbs-boppard.de
Träger: Kreisverwaltung Rhein-Hunsrück-Kreis
letzte Änderung: 08 Feb 2010 14:33:12 von 60119

I am trying to get this information with the Simple HTML DOM Parser. I am not very familiar with it; I thought that I have to pass some attributes. Is this right?

Greetings, dbone
RichardRotterdam Posted December 24, 2010

So you want to scrape the following URL: http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1[uid]=60119

And filter out the following data:

Schulart: BBS
Schulnummer: 60119
Anschrift: Berufsbildende Schule Boppard, Antoniusstr. 21, 56154 Boppard
Telefon: (0 67 42) 80 61-0
Telefax: (0 67 42) 80 61-29
E-Mail: sekretariat@bbs-boppard.de
Internet: http://www.bbs-boppard.de
Träger: Kreisverwaltung Rhein-Hunsrück-Kreis
letzte Änderung: 08 Feb 2010 14:33:12 von 60119

Why not use DOMDocument instead?

<?php
$dom = new DOMDocument();
// @ suppresses the warnings libxml raises for invalid HTML
@$dom->loadHTMLFile('http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1[uid]=60119');

$divElement = $dom->getElementById('wfqbeResults');
if ($divElement === null) {
    exit('Element #wfqbeResults not found');
}

// rebuild the inner HTML of the div from its child nodes
$innerHTML = '';
foreach ($divElement->childNodes as $child) {
    $innerHTML .= $child->ownerDocument->saveXML($child);
}
echo $innerHTML;
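The DOMDocument approach above prints the raw inner HTML of the wfqbeResults element. To get from there to label/value pairs, one option is an XPath query over the rows inside that element. The following is a minimal sketch under stated assumptions: the thread never shows the page source, so the two-cell table layout in $sample and the helper name extractFields are hypothetical, not the real page structure or any library API.

```php
<?php
// Hypothetical sample markup: the thread does not show the real page source,
// so the table layout inside #wfqbeResults is an assumption.
$sample = '<html><body><div id="wfqbeResults"><table>'
        . '<tr><td>Schulart:</td><td>BBS</td></tr>'
        . '<tr><td>Schulnummer:</td><td>60119</td></tr>'
        . '<tr><td>Telefon:</td><td>(0 67 42) 80 61-0</td></tr>'
        . '</table></div></body></html>';

/**
 * Collect label => value pairs from two-cell table rows inside the element
 * with the given id. Returns an empty array when the element is missing.
 * (Illustrative helper; the name and signature are not from the thread.)
 */
function extractFields(string $html, string $containerId): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $fields = [];
    // Select every table row below the element carrying the requested id.
    foreach ($xpath->query("//*[@id='{$containerId}']//tr") as $row) {
        $cells = $xpath->query('./td', $row);
        if ($cells->length < 2) {
            continue; // skip rows without a label cell and a value cell
        }
        // Drop the trailing colon from the label and trim both cells.
        $label = rtrim(trim($cells->item(0)->textContent), ':');
        $fields[$label] = trim($cells->item(1)->textContent);
    }
    return $fields;
}

$fields = extractFields($sample, 'wfqbeResults');
print_r($fields);
```

If the real page uses a different layout (for example definition lists or plain paragraphs), the row and cell queries would need to be adapted after inspecting the actual markup.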
dilbertone Posted December 24, 2010 Author

Hello dear Dj Kat, good evening!

Many thanks for the answer and the hints! Yes, I want to scrape the mentioned URL. I will try this out and run the DOMDocument code you posted.

Again, thanks. I will run the code, do some tests, and come back to report all my findings.

Have a great day!

Greetings, dilbertone
jr_developer Posted February 19, 2013

Hi, I'm pretty new to Simple HTML DOM. Is there anyone who can help me code this example? I want to try to collect some data from this website: 4D88.com - Latest 4D Results

Thanks