dil_bert
Posted April 25, 2019 (edited)

Good day, dear phpfreaks.

I am new to PHP's SimpleXML and want to use it to work with OSM files. The original version of this question was derived from here: OSM Data parsing to get the nodes with child https://stackoverflow.com/questions/16129184/osm-data-parsing-to-get-the-nodes-with-child

I am thankful that hakre offered a great example there, which makes an excellent starting point for my project. Below I have added my own attempt at refining the code to add more tags.

I can work on the methods using SimpleXML and XPath. The job is most easily done with XPath: the PHP XML library used here is based on libxml, which supports XPath 1.0, and that covers the various querying needs very well.

Goal: how to get more out of it. I want to filter the OSM data to get the nodes of a particular category, in this case all the schools within an area. The first script runs well, but now I want to refine the search and add more tags. Finally I want to store everything in MySQL. So we need to do some XML parsing with PHP.

Since I am learning, I break the code down into pieces. For my question, the second part of the script is the more interesting one: querying the XML data we already have.
The following example lists all schools and tries to obtain their names as well.

```php
# get all school nodes with xpath
$xpath = '//node[tag[@k = "amenity" and @v = "school"]]';
$schools = $result->xpath($xpath);
printf("%d School(s) found:\n", count($schools));

foreach ($schools as $index => $school) {
    # Get the name of the school (if any), again with xpath
    list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
    printf("#%02d: ID:%' -10s [%s,%s] %s\n", $index, $school['id'], $school['lat'], $school['lon'], $name);
}
```

The key point here is the XPath queries. Two are used. The first one gets the nodes that have certain tags:

```
//node[tag[@k = "amenity" and @v = "school"]]
```

This line says: give me all node elements that have a tag element inside which has the k attribute value "amenity" and the v attribute value "school". This is the condition that filters out those nodes that are tagged with amenity school.

XPath is then used a second time, now relative to those school nodes, to see if there is a name and, if so, to fetch it. That happens inside the foreach loop, in the call $school->xpath('tag[@k = "name"]/@v'), and this query is the important part:

```
tag[@k = "name"]/@v
```

This line says: relative to the current node, give me the v attribute from a tag element that has the k attribute value "name". As you can see, some parts are again similar to the line before. I think both queries can be adapted to other needs.
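To try the two queries without calling the Overpass service, here is a minimal sketch that runs them against a small hand-written XML string. The sample nodes, ids and coordinates are made up for illustration and only assume the node/tag shape described above.

```php
<?php
// Hypothetical sample in the shape of an Overpass response (not real data).
$xml = <<<XML
<osm version="0.6">
  <node id="1" lat="39.50" lon="16.27">
    <tag k="amenity" v="school"/>
    <tag k="name" v="Scuola Primaria"/>
  </node>
  <node id="2" lat="39.33" lon="16.18">
    <tag k="amenity" v="school"/>
  </node>
  <node id="3" lat="39.10" lon="16.50">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Bar Centrale"/>
  </node>
</osm>
XML;

$result = simplexml_load_string($xml);

// Query 1: absolute path, selects every <node> that has a matching <tag> child.
$schools = $result->xpath('//node[tag[@k = "amenity" and @v = "school"]]');

$names = [];
foreach ($schools as $school) {
    // Query 2: relative to the current node, selects the v attribute of its name tag.
    $hits = $school->xpath('tag[@k = "name"]/@v');
    $names[] = $hits ? (string) $hits[0] : '(unnamed)';
}

printf("%d school(s) found\n", count($schools));
print_r($names);
```

The cafe node has a name tag too, but it never reaches the second query because the first query already filtered it out.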
Because not all school nodes have a name, a default string is provided for display purposes by adding it to the (then empty) result array:

```php
list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
#                                                   ^^^^^^^^^^^^^^^
#                                                Provide Default Value
```

Here are some of the results for that code example:

```
Query returned 907 node(s) and took 1.10735 seconds.

more than 2000 School(s) found:
#00: ID:332534486 [39.5017565,16.2721899] Scuola Primaria
#01: ID:1428094278 [39.3320912,16.1862820] (unnamed)
#02: ID:1822746784 [38.9075566,16.5776597] (unnamed)
#03: ID:1822755951 [38.9120272,16.5713431] (unnamed)
#04: ID:1903859699 [38.6830409,16.5522243] Liceo Scientifico Statale A. Guarasci
#05: ID:2002566438 [39.1347698,16.0736924] (unnamed)
#06: ID:2056891127 [39.4106679,16.8254844] (unnamed)
#07: ID:2056892999 [39.4124687,16.8286119] (unnamed)
#08: ID:2272010226 [39.4481717,16.2894353] SCUOLA DELL'INFANZIA SAN FRANCESCO
#09: ID:2272017152 [39.4502366,16.2807664] SCUOLA MEDIA
```

Now I am trying to figure out how to add more XPath queries to the code above. Goal: to get out even more important data, see Key:contact on the OpenStreetMap Wiki.

We are already extracting the name. If we want more data, we just have to run a few more XPath queries inside the loop, for all the address keys and the website. We also must not forget to look for the website key in addition to contact:website, cf. https://wiki.openstreetmap.org/wiki/Key:website

Conclusion: I think I need to extend the XPath requests within the loop, where XPath is used relative to the school nodes to see if a tag exists and, if so, to fetch it:

```
tag[@k = "name"]/@v
tag[@k = "contact:website"]/@v
tag[@k = "contact:email"]/@v
```

What do you say?
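One way to run several of these relative queries without repeating the default-value trick each time is a small lookup function. This is only a sketch: tagValue() is a hypothetical helper name, the sample node is hand-written, and the fallback order (contact:website first, then website) is an assumption.

```php
<?php
/**
 * Hypothetical helper (not part of the original script): return the value of
 * the first tag whose key is listed in $keys, or $default if none is present.
 */
function tagValue(SimpleXMLElement $node, array $keys, $default = null)
{
    foreach ($keys as $key) {
        // Keys such as "contact:website" contain no quotes, so embedding
        // them in a double-quoted XPath string literal is safe here.
        $hits = $node->xpath(sprintf('tag[@k = "%s"]/@v', $key));
        if ($hits) {
            return (string) $hits[0];
        }
    }
    return $default;
}

// Hand-written sample node, assumed to look like one node of an Overpass result.
$school = simplexml_load_string(
    '<node id="1" lat="39.50" lon="16.27">
       <tag k="amenity" v="school"/>
       <tag k="name" v="Scuola Media"/>
       <tag k="website" v="https://example.org"/>
     </node>'
);

$name    = tagValue($school, ['name'], '(unnamed)');
$website = tagValue($school, ['contact:website', 'website'], '(no website)');
$email   = tagValue($school, ['contact:email'], '(no email)');

printf("%s | %s | %s\n", $name, $website, $email);
```

Passing a list of keys keeps the website / contact:website fallback in one place, so the loop body stays one line per field.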
I did some further tests and found out some very interesting things. First, the code that runs very well:

```php
<?php
/**
 * OSM Overpass API with PHP SimpleXML / XPath
 *
 * PHP Version: 5.4 - Can be back-ported to 5.3 by using 5.3 Array-Syntax (not PHP 5.4's square brackets)
 */

//
// 1.) Query an OSM Overpass API Endpoint
//
$query = 'node ["amenity"~".*"] (38.415938460513274,16.06338500976562,39.52205163048525,17.51220703125); out;';

$context = stream_context_create(['http' => [
    'method'  => 'POST',
    'header'  => ['Content-Type: application/x-www-form-urlencoded'],
    'content' => 'data=' . urlencode($query),
]]);

# please do not stress this service, this example is for demonstration purposes only.
$endpoint = 'http://overpass-api.de/api/interpreter';
libxml_set_streams_context($context);
$start = microtime(true);
$result = simplexml_load_file($endpoint);
printf("Query returned %2\$d node(s) and took %1\$.5f seconds.\n\n", microtime(true) - $start, count($result->node));

//
// 2.) Work with the XML Result
//
# get all school nodes with xpath
$xpath = '//node[tag[@k = "amenity" and @v = "school"]]';
$schools = $result->xpath($xpath);
printf("%d School(s) found:\n", count($schools));

foreach ($schools as $index => $school) {
    # Get the name of the school (if any), again with xpath
    list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
    printf("#%02d: ID:%' -10s [%s,%s] %s\n", $index, $school['id'], $school['lat'], $school['lon'], $name);
}

//node[tag[@k = "amenity" and @v = "school"]]
//tag[@k = "name"]/@v'

$query = 'node ["addr:postcode"~"RM12"] (51.5557914,0.2118915,51.5673083,0.2369398); node (around:1000) ["amenity"~"fast_food"]; out;';

$context = stream_context_create(['http' => [
    'method'  => 'POST',
    'header'  => ['Content-Type: application/x-www-form-urlencoded'],
    'content' => 'data=' . urlencode($query),
]]);

$endpoint = 'http://overpass-api.de/api/interpreter';
libxml_set_streams_context($context);
# note: $start is not reset here, so the time printed below also includes the first query
$result = simplexml_load_file($endpoint);
printf("Query returned %2\$d node(s) and took %1\$.5f seconds.\n\n", microtime(true) - $start, count($result->node));
```

See the results:

```
me/martin/dev/php/o1.php on line 68
linux-3645:/home/martin/dev/php # php o1.php
Query returned 2799 node(s) and took 17.02055 seconds.

33 School(s) found:
#00: ID:332534486 [39.5018840,16.2722854] Scuola Elementare
#01: ID:1428094278 [39.3320912,16.1862820] (unnamed)
#02: ID:1822746784 [38.9075566,16.5776597] (unnamed)
#03: ID:1822755951 [38.9120272,16.5713431] (unnamed)
#04: ID:2002566438 [39.1349460,16.0736446] (unnamed)
#05: ID:2056891127 [39.4106679,16.8254844] (unnamed)
#06: ID:2056892999 [39.4124687,16.8286119] (unnamed)
#07: ID:2272010226 [39.4481717,16.2894353] Scuola dell'infanzia San Francesco
#08: ID:2272017152 [39.4502366,16.2807664] Scuola Media
#09: ID:2358307794 [39.5015031,16.3905965] I.I.S.S. Liceo Statale V. Iulia
#10: ID:2358307796 [39.4926280,16.3853662] Liceo Classico
#11: ID:2358307797 [39.4973761,16.3858275] Scuola Media
#12: ID:2358307800 [39.5015527,16.3941156] I.T.C. e per Geometri
#13: ID:2358307801 [39.4983862,16.3807796] Istituto Professionale
#14: ID:2448031004 [38.6438417,16.3873106] (unnamed)
#15: ID:2458139204 [39.0803263,17.1291649] Sacro Cuore
#16: ID:2552412313 [39.0765212,17.1224610] (unnamed)
#17: ID:2582443083 [39.0815417,17.1178983] Liceo Socio Biologico Gravina
#18: ID:2585754364 [38.8878393,16.4076323] Scuola Elementare
#19: ID:2585754366 [38.8877600,16.4076216] Scuola Media
#20: ID:3071126720 [38.6022703,16.5554408] Scuola Media
#21: ID:3071127683 [38.6027273,16.5563125] Scuola Elementare
#22: ID:3081362915 [39.2865638,16.2601963] Convitto Nazionale Bernardino Telesio
#23: ID:3081362921 [39.2856714,16.2613594] Liceo Classico B. Telesio
#24: ID:3081362926 [39.2888949,16.2577446] Scuola
#25: ID:3732551794 [39.5132435,16.2863285] (unnamed)
#26: ID:3740289655 [39.5167318,16.2838146] scuola media
#27: ID:3740289656 [39.5164344,16.2821103] scuola elementare
#28: ID:4004532684 [38.7804787,16.5122952] Liceo Artistico
#29: ID:4589289756 [38.6794209,16.1063084] Scuola Comprensiva Trentacapilli
#30: ID:4843966477 [39.0709866,17.1288384] Pegaso
#31: ID:5297629775 [38.5768845,16.3263536] Scuola Media Statale "Ignazio La Russa"
#32: ID:5316865306 [39.0807997,17.1264225] Enrico Fermi
Query returned 3 node(s) and took 17.44780 seconds.
```

So far so good. But if I add some lines in part 2, I run into errors, see below.

Background: I want to get more information out of the dataset, so I changed part 2 (the part that works with the XML result) like so:

```php
//
// 2.) Work with the XML Result
//
# get all school nodes with xpath
$xpath = '//node[tag[@k = "amenity" and @v = "school"]]';
$schools = $result->xpath($xpath);
printf("%d School(s) found:\n", count($schools));

foreach ($schools as $index => $school) {
    # Get the name of the school (if any), again with xpath
    list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
    list($name) = $school->xpath('tag[@k = "contact:website"]/@v');
    list($name) = $school->xpath('tag[@k = "contact:email"]/@v');
    printf("#%02d: ID:%' -10s [%s,%s] %s\n", $index, $school['id'], $school['lat'], $school['lon'], $name);
}
```

The question is: how to get more out of it, at least the address and the website? I am trying to figure out how to add more XPath queries to the code above and get out even more important data. See Key:contact on the OpenStreetMap Wiki: contact:phone, contact:fax, contact:website, contact:email.

I will dig into all the documents and come back later this weekend to report my findings. I think I need to extend the XPath requests within the loop, where XPath is used relative to the school nodes to see if a tag exists and, if so, to fetch it:

```
tag[@k = "name"]/@v
tag[@k = "contact:website"]/@v
tag[@k = "contact:email"]/@v
```

Edited April 25, 2019 by dil_bert: more infos
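A likely reason the extended loop misbehaves: all three list($name) = ... lines assign to the same variable, so the later results overwrite the name, and the two new lines have no + ['...'] default. For a school without that tag, list() then reads index 0 of an empty array, which PHP reports as an undefined offset (an undefined array key warning in PHP 8). The following is a minimal sketch of one way around this, not the only one. It assumes PHP 7.0+ for the ?? operator, tagsOf() is a hypothetical helper name, and the sample XML is hand-written.

```php
<?php
/** Hypothetical helper: collect all <tag k="..." v="..."/> children as key => value. */
function tagsOf(SimpleXMLElement $node)
{
    $tags = [];
    foreach ($node->tag as $tag) {
        $tags[(string) $tag['k']] = (string) $tag['v'];
    }
    return $tags;
}

// Hand-written sample, assumed to have the same shape as the Overpass result.
$result = simplexml_load_string(
    '<osm>
       <node id="10" lat="39.08" lon="17.12">
         <tag k="amenity" v="school"/>
         <tag k="name" v="Enrico Fermi"/>
         <tag k="contact:email" v="info@example.org"/>
       </node>
       <node id="11" lat="39.51" lon="16.28">
         <tag k="amenity" v="school"/>
       </node>
     </osm>'
);

$rows = [];
foreach ($result->xpath('//node[tag[@k = "amenity" and @v = "school"]]') as $school) {
    $tags = tagsOf($school);
    // One variable (array key) per field, each with its own default.
    $rows[] = [
        'id'      => (string) $school['id'],
        'name'    => $tags['name'] ?? '(unnamed)',
        'website' => $tags['contact:website'] ?? $tags['website'] ?? '',
        'email'   => $tags['contact:email'] ?? '',
        'phone'   => $tags['contact:phone'] ?? '',
    ];
}

foreach ($rows as $i => $row) {
    printf("#%02d: ID:%s %s <%s> %s\n", $i, $row['id'], $row['name'], $row['email'], $row['website']);
}
```

Reading all tags of a node once avoids one XPath call per field, and plain arrays like $rows should be convenient to hand to a prepared INSERT statement later when the data goes into MySQL.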