Showing results for tags 'xmlreader'.
-
Editing PHP Script to Enable Parsing of Large XML File
maghnom posted a topic in Third Party Scripts
I'm trying to use a PHP script to parse a large XML file (around 450 MB) into a MySQL database, in a specific structure based on the XML elements it contains. The problem is that the original script uses file_get_contents and SimpleXMLElement to do this, and the cron job run by the server halts because of the size of the XML file.

I'm no PHP expert, so I bought the XMLSplit software, divided the XML into 17 separate XML files of 30 MB each, and parsed them one by one with the same script. However, the resulting database was missing a lot of data, and I seriously doubt that it matches what the original file would have produced if it had been parsed whole.

So I've decided to use XMLReader with this exact PHP script to parse the big XML file, but so far I haven't managed to replace just the parsing code while keeping the other functionality intact. I'm including the script below; I'd really appreciate it if someone could help me do so.

    <?php
    set_time_limit(0);
    ini_set('memory_limit', '1024M');
    include_once('../db.php');
    include_once(DOC_ROOT.'/include/func.php');

    mysql_query("TRUNCATE screenshots_list");
    mysql_query("TRUNCATE pages");
    mysql_query("TRUNCATE page_screenshots");

    // This is the part I need help changing to XMLReader instead of the functions
    // used here, so that the large XML file is parsed correctly (while keeping the
    // rest of the script as is, if possible):
    $xmlstr = file_get_contents('t_info.xml');
    $xml = new SimpleXMLElement($xmlstr);

    foreach ($xml->template as $item) {
        //print_r($item);
        $sql = sprintf(
            "REPLACE INTO templates SET id = %d, state = %d, price = %d, exc_price = %d, "
            . "inserted_date = '%s', update_date = '%s', downloads = %d, type_id = %d, "
            . "type_name = '%s', is_flash = %d, is_adult = %d, width = '%s', author_id = %d, "
            . "author_nick = '%s', package_id = %d, is_full_site = %d, is_real_size = %d, "
            . "keywords = '%s', sources = '%s', description = '%s', software_required = '%s'",
            $item->id, $item->state, $item->price, $item->exc_price,
            $item->inserted_date, $item->update_date, $item->downloads,
            $item->template_type->type_id, $item->template_type->type_name,
            $item->is_flash, $item->is_adult, $item->width,
            $item->author->author_id, $item->author->author_nick,
            $item->package->package_id, $item->is_full_site, $item->is_real_size,
            $item->keywords, $item->sources, $item->description, $item->software_required
        );
        //echo '<br>'.$sql;
        mysql_query($sql);

        //print_r($item->screenshots_list->screenshot);
        foreach ($item->screenshots_list->screenshot as $scr) {
            $main = (!empty($scr->main_preview)) ? 1 : 0;
            $small = (!empty($scr->small_preview)) ? 1 : 0;
            insert_data($item->id, 'screenshots_list', 0, $scr->uri, $scr->filemtime, $main, $small);
        }
        foreach ($item->styles->style as $st) {
            insert_data($item->id, 'styles', $st->style_id, $st->style_name);
        }
        foreach ($item->categories->category as $cat) {
            insert_data($item->id, 'categories', $cat->category_id, $cat->category_name);
        }
        foreach ($item->sources_available_list->source as $so) {
            insert_data($item->id, 'sources_available_list', $so->source_id, '');
        }
        foreach ($item->software_required_list->software as $soft) {
            insert_data($item->id, 'software_required_list', $soft->software_id, '');
        }

        //print_r($item->pages->page);
        if (!empty($item->pages->page)) {
            foreach ($item->pages->page as $p) {
                mysql_query(sprintf(
                    "REPLACE INTO pages SET tpl_id = %d, name = '%s', id = NULL ",
                    $item->id, $p->name
                ));
                $page_id = mysql_insert_id();
                if (!empty($p->screenshots->scr)) {
                    foreach ($p->screenshots->scr as $psc) {
                        $href = (!empty($psc->href)) ? (string)$psc->href : '';
                        mysql_query(sprintf(
                            "REPLACE INTO page_screenshots SET page_id = %d, description = '%s', "
                            . "uri = '%s', scr_type_id = %d, width = %d, height = %d, href = '%s'",
                            $page_id, $psc->description, $psc->uri, $psc->scr_type_id,
                            $psc->width, $psc->height, $href
                        ));
                    }
                }
            }
        }
    }
    ?>

I'd appreciate your help with that...
-
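For the t_info.xml import script in the post above, one possible approach (a sketch, not a tested drop-in replacement) is to let XMLReader walk the file and hand each <template> element to SimpleXMLElement on its own, so the body of the existing foreach loop can stay as it is. The helper name each_template(), the root element name and the sample data are made up for illustration, and the sketch assumes <template> elements are siblings that are never nested inside each other.

```php
<?php
// Sketch only: streams <template> elements one at a time.
// Assumption: <template> elements are siblings and are never nested.
// each_template() is a hypothetical helper, not part of the original script.
function each_template(XMLReader $reader)
{
    // Skip forward until the reader sits on the first <template> element.
    while ($reader->read() && $reader->name !== 'template') {
        // nothing to do here
    }
    while ($reader->name === 'template') {
        // Only the current <template> subtree is turned into a SimpleXMLElement.
        yield new SimpleXMLElement($reader->readOuterXml());
        // Move to the next sibling <template>, skipping the subtree just handled.
        $reader->next('template');
    }
}

// Tiny made-up sample; the real script would call $reader->open('t_info.xml') instead.
$sample = '<templates>'
    . '<template><id>1</id><price>10</price></template>'
    . '<template><id>2</id><price>20</price></template>'
    . '</templates>';

$reader = new XMLReader();
$reader->XML($sample);

$ids = [];
foreach (each_template($reader) as $item) {
    // $item can be used like $item in the original foreach ($xml->template as $item) loop.
    $ids[] = (int) $item->id;
}
$reader->close();
```

With this shape, the two lines that call file_get_contents and new SimpleXMLElement would go away, and the original loop header would become foreach (each_template($reader) as $item). Generators need PHP 5.5 or later.
-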
I need to import a 2 GB XML file into a MySQL database. The XML is structured like the following sample:

    <?xml version="1.0" encoding="UTF-8"?>
    <Properties>
        <Property>
            <ID></ID>
            <price></price>
            <price></price>
            <price></price>
            <price></price>
            <image id='1'> </image>
            <image id='2'> </image>
            <image id='3'> </image>
        </Property>
        <Property>
            .......
        </Property>
    </Properties>

I am now trying to get the ID, the value of the 4th price node, and the values of the first two image nodes, but I am stuck. Please help. My script so far:

    <?php
    $reader = new XMLReader();
    $reader->open($url);
    while ($reader->read()) {
        if ($reader->nodeType == XMLReader::ELEMENT)
            $nodeName = $reader->name;
        if ($reader->nodeType == XMLReader::TEXT || $reader->nodeType == XMLReader::CDATA) {
            if ($nodeName == 'ID') {
                $id = $reader->value;
                echo $id;
            }
            if ($nodeName == 'Price') {
                $price = $reader->value;
                echo $price['4'];
            }
            if ($nodeName == 'Image') {
                if ($reader->getAttribute("id") == '1') {
                    $image1 = $reader->value;
                } else if ($reader->getAttribute("id") == '2') {
                    $image2 = $reader->value;
                }
            }
        }
    }
    ?>
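A possible way to restructure the loop, offered as a sketch under assumptions: do the work while the reader is on the element itself (where getAttribute() has something to return), count the <price> elements per <Property>, and match names with the same case as in the XML (price and image, not Price and Image). The helper name each_property(), the array keys and the sample values are invented for illustration.

```php
<?php
// Sketch only. each_property() is a hypothetical helper name.
// Assumptions: element names are exactly ID, price and image (XML names are
// case-sensitive), and the wanted images are those with id="1" and id="2".
function each_property(XMLReader $reader)
{
    $row = null;
    $priceCount = 0;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            switch ($reader->name) {
                case 'Property':
                    // Start a fresh record for every <Property>.
                    $row = ['id' => null, 'price4' => null, 'image1' => null, 'image2' => null];
                    $priceCount = 0;
                    break;
                case 'ID':
                    $row['id'] = trim($reader->readString());
                    break;
                case 'price':
                    // Count <price> elements and keep only the fourth one.
                    if (++$priceCount === 4) {
                        $row['price4'] = trim($reader->readString());
                    }
                    break;
                case 'image':
                    // Attributes are read while the reader is on the element, not on its text node.
                    $imageId = $reader->getAttribute('id');
                    if ($imageId === '1' || $imageId === '2') {
                        $row['image' . $imageId] = trim($reader->readString());
                    }
                    break;
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT
            && $reader->name === 'Property' && $row !== null) {
            // Hand back one finished record at a time instead of building one big array.
            yield $row;
            $row = null;
        }
    }
}

// Made-up sample data; the real script would call $reader->open($url) instead.
$sample = '<Properties>'
    . '<Property><ID>7</ID>'
    . '<price>100</price><price>200</price><price>300</price><price>400</price>'
    . '<image id="1">a.jpg</image><image id="2">b.jpg</image><image id="3">c.jpg</image>'
    . '</Property>'
    . '<Property><ID>8</ID><price>50</price><image id="1">d.jpg</image></Property>'
    . '</Properties>';

$reader = new XMLReader();
$reader->XML($sample);

$rows = [];
foreach (each_property($reader) as $row) {
    $rows[] = $row;   // in the real import, this is where the INSERT would go
}
$reader->close();
```

Fields that are missing from a <Property> (for example a fourth price) stay null in this sketch, so the import code can decide how to handle them.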
-
I need to import a large XML file (2 GB) into my DB. I tried DOMDocument, which failed because of the file size, and from what I have read I understand that I should use XMLReader instead. Google searches have not given me much indication of what I should do differently to read the document with XMLReader. Can anybody please point me in the right direction to fix the script I created, or even help me rewrite the parser? I just cannot get my head around this problem. My DOMDocument script is as follows:

    #!/usr/bin/php -q
    <?php
    error_reporting(E_ALL);
    ini_set('display_errors', '1');

    $l = mysql_connect ( XXXXX);
    $total = '0';
    $xml = "http://xxxxxxxxxxxxxxxxxxxx";

    $doc = new DOMDocument();
    $doc->load($xml);

    // Read the code block
    $records = $doc->getElementsByTagName("Property");
    foreach ($records as $result) {
        $adnr = $result->getElementsByTagName("ID"); // adnr
        $adnr = $adnr->item(0)->nodeValue;

        $town = $result->getElementsByTagName("town"); // town
        $town = $town->item(0)->nodeValue;

        // Add pics
        $pic1 = $result->getElementsByTagName("image");
        $pic1 = $pic1->item('1')->nodeValue;
        $pic2 = $result->getElementsByTagName("image");
        $pic2 = $pic2->item('2')->nodeValue;
        $pic3 = $result->getElementsByTagName("image");
        $pic3 = $pic3->item('3')->nodeValue;
    }
    ?>
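If the goal is to keep the DOM-style calls, one option (sketched here under assumptions) is to let XMLReader locate each <Property> and expand only that subtree into a DOM node, so getElementsByTagName() still works but only one record is in memory at a time. The helper name each_property_element() and the sample data are made up, and the sketch assumes <Property> elements are siblings containing <ID>, <town> and <image> children.

```php
<?php
// Sketch only. each_property_element() is a hypothetical helper name.
// Assumption: <Property> elements are siblings with <ID>, <town> and <image> children.
function each_property_element(XMLReader $reader)
{
    $doc = new DOMDocument();
    while ($reader->read() && $reader->name !== 'Property') {
        // skip ahead to the first <Property>
    }
    while ($reader->name === 'Property') {
        // expand() copies only the current subtree; importNode() gives the copy
        // an owner document so the usual DOM calls can be used on it.
        yield $doc->importNode($reader->expand(), true);
        // Jump to the next sibling <Property>.
        $reader->next('Property');
    }
}

// Made-up sample data; the real script would call $reader->open($xml) instead.
$sample = '<Properties>'
    . '<Property><ID>11</ID><town>Springfield</town>'
    . '<image id="1">a.jpg</image><image id="2">b.jpg</image><image id="3">c.jpg</image>'
    . '</Property>'
    . '<Property><ID>12</ID><town>Shelbyville</town>'
    . '<image id="1">d.jpg</image><image id="2">e.jpg</image><image id="3">f.jpg</image>'
    . '</Property>'
    . '</Properties>';

$reader = new XMLReader();
$reader->XML($sample);

$rows = [];
foreach (each_property_element($reader) as $result) {
    $images = $result->getElementsByTagName('image');
    $rows[] = [
        'adnr' => $result->getElementsByTagName('ID')->item(0)->nodeValue,
        'town' => $result->getElementsByTagName('town')->item(0)->nodeValue,
        'pic1' => $images->item(0)->nodeValue,   // item() is zero-based
        'pic2' => $images->item(1)->nodeValue,
        'pic3' => $images->item(2)->nodeValue,
    ];
}
$reader->close();
```

Note that item() counts from zero, so in this sketch the first <image> is item(0), which differs from the indexes used in the script above.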
-
Hello, I am trying to read an XML file and store it in an SQL table. I get two errors:

    PHP Warning: XMLReader::open(): Unable to open source data
    Warning: XMLReader::read(): Load Data before trying to read

How can I solve this? Thanks in advance.

    $reader = new XMLReader;
    $reader->open('filename.xml');
    while ($reader->read()) {
        ///
    }
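The first warning usually means the file could not be found or read at the given path, and the second follows from it because nothing was loaded. A relative name such as 'filename.xml' is resolved against the current working directory, which is not necessarily the script's directory. A minimal sketch that checks the path and the return value of open() before reading; open_xml() is a hypothetical helper and the sketch assumes a local file rather than a URL:

```php
<?php
// Sketch only. open_xml() is a hypothetical helper; it assumes a local file path.
function open_xml($path)
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException(
            "XML file not found or not readable: $path (working directory: " . getcwd() . ")"
        );
    }
    $reader = new XMLReader();
    // open() returns false on failure, so check it before calling read().
    if (!$reader->open($path)) {
        throw new RuntimeException("XMLReader could not open: $path");
    }
    return $reader;
}

// Demo with a temporary file so the example is self-contained.
$tmp = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($tmp, '<rows><row/><row/></rows>');

$reader = open_xml($tmp);
$count = 0;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'row') {
        $count++;
    }
}
$reader->close();
unlink($tmp);

// A missing file now fails with a clear message instead of two warnings.
$missingMessage = null;
try {
    open_xml(__DIR__ . '/does-not-exist.xml');
} catch (RuntimeException $e) {
    $missingMessage = $e->getMessage();
}
```

Building the path from __DIR__ (for example __DIR__ . '/filename.xml') is one way to make it independent of the working directory.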
-
Hi, I have been searching all over the web but have not found a useful answer. I hope you can help me.

This is the situation: I have a rather big XML file (13 MB, 4100 lines) in which I need to search the text elements using regex patterns. To do so, I parse the file with XMLReader (http://www.php.net/XMLReader). I tried DOMDocument (http://www.php.net/m...domdocument.php), which I prefer, but it really is too slow. The XMLReader script works well: it returns all the matches without any issue and very quickly.

However, I need to know the exact XPath of every matched node and, surprisingly, XMLReader does not come with an XPath property or method. So what I am looking for is an effective way (speed is important) to find the XPath of any node parsed with XMLReader. Any suggestions? Thanks for your time and feedback.
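Since XMLReader only moves forward, one workaround (a sketch, not a built-in feature) is to maintain the path yourself while reading: keep the element name for each depth together with a counter of same-named siblings, and build a positional path such as /doc[1]/sec[2]/p[1] whenever a text node matches. The function name find_matches(), the result keys and the sample document are made up for illustration. The sketch uses PHP 7 syntax and ignores namespaces, so prefixed names would need extra handling before the paths could be used in an XPath query.

```php
<?php
// Sketch only. find_matches() is a hypothetical helper; namespaces are not handled.
function find_matches(XMLReader $reader, $pattern)
{
    $stack = [];    // $stack[$depth] = "name[position]" of the open element at that depth
    $counts = [];   // $counts[$depth][$name] = same-named siblings seen so far
    $matches = [];

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $depth = $reader->depth;
            $name = $reader->name;
            // Position among same-named siblings, as in the XPath step name[n].
            $counts[$depth][$name] = ($counts[$depth][$name] ?? 0) + 1;
            $stack[$depth] = $name . '[' . $counts[$depth][$name] . ']';
            // A new element starts with no children counted yet.
            unset($counts[$depth + 1]);
        } elseif ($reader->nodeType === XMLReader::TEXT
            || $reader->nodeType === XMLReader::CDATA) {
            if (preg_match($pattern, $reader->value)) {
                // A text node at depth d belongs to the element at depth d - 1.
                $path = '/' . implode('/', array_slice($stack, 0, $reader->depth));
                $matches[] = ['path' => $path, 'text' => $reader->value];
            }
        }
    }
    return $matches;
}

// Made-up sample document.
$sample = '<doc>'
    . '<sec><p>alpha 42</p><p>beta</p><p>gamma 7</p></sec>'
    . '<sec><p>delta 99</p></sec>'
    . '</doc>';

$reader = new XMLReader();
$reader->XML($sample);
$hits = find_matches($reader, '/\d+/');
$reader->close();
```

For this sample the three hits are reported at /doc[1]/sec[1]/p[1], /doc[1]/sec[1]/p[3] and /doc[1]/sec[2]/p[1]. The bookkeeping is a couple of array operations per node, so it should add little to the parse time, though that is worth measuring on the real file.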