abazoskib Posted November 3, 2009

Here is the code I am using to parse the table:

<?php
$tidy = new tidy();
$tidy->parseString($page);
$tidy->cleanRepair();
//echo $tidy;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadHTML($tidy);

$xpath = new DOMXPath( $doc );
$elements = $xpath->query( "//html/body//table/*" );

foreach ( $elements as $item ) {
    $newDom = new DOMDocument;
    $newDom->appendChild($newDom->importNode($item,true));
    $xpath = new DOMXPath( $newDom );
    foreach ($item->attributes as $attribute) {
        for ($node = $item->firstChild; $node !== NULL; $node = $node->nextSibling) {
            print($node->nodeValue);
        }
    }
    print("\n");
}
?>

Here is a sample of the data I am parsing:

<table width="97%" align="center" border="0" cellpadding="4" class="borderTable" cellspacing="0">
<tr bgcolor="#2a5a7c" class="trheader">
  <th align="left"><a href="domain_a.cfm?int_order=1" class="trheader">Domain</a></th>
  <th align="left"><a href="domain_a.cfm?int_order=5" class="trheader">Registrar</a></th>
  <th align="center">Used</th>
  <th align="center">Live</th>
  <th align="center">Assigned</th>
  <th align="center">Purchased</th>
  <th align="center"><a href="domain_a.cfm?int_order=3" class="trheader">Enabled</a></th>
  <th align="center">Info</th>
  <th align="center">Edit</th>
  <th align="center">Delete</th>
</tr>
<form action="domain_b.cfm" method="post">
<tr id="t149"><td colspan="9" style="padding:0px;" bgcolor="#c6dbde" height="1"><img src="/images/space.gif" width="1" height="1" /></td></tr>
<tr onmouseover="hilite(149);" onmouseout="hilite2(149);" bgcolor="#EEEEEE" style="border-top:thin" id="149">
  <td><span id="cf_tooltip_1257270905778"> <a href="http://xyxyxyxyxyxyx.com" target="_blank" class="standard">xyxyxyxyxyxyxyx.com</a> </span></td>
  <td align="left">DomainSite</td>
  <td align="center"><a href="#" onclick="launchwin('iused149', '<span style=color:#2a5a7c;>Domain ID: 149 Usage History</span>', '/domains/noapp/u_info.cfm?p_domain_id=149',{width:800,height:300,center:true,modal:true});"><img src="/images/red.gif" border="0"/></a></td>
  <td align="center"><div id="s149" style="color:#FF6600"><span style='color:red'>-</span></div></td>
  <td align="center"><a href="#" onclick="launchwin('idomain149', '<span style=color:#2a5a7c;>Domain 149 Info</span>', '/domains/domain_info.cfm?p_domain_id=149',{width:300,height:220,center:true,modal:true});"><img src='/images/green.gif' alt='Yes' border='0'></a></td>
  <td align="center" width="100"><div id="e149"><img src='/images/green.gif' alt='Yes'></div></td>
  <td align="center"><img src='/images/green.gif' alt='Enabled'></td>
  <td align="center"><a href="#" onclick="launchwin('infod149', '<span style=color:#2a5a7c;>Information for Domain ID: 149</span>', '/domains/noapp/d_info.cfm?p_domain_id=149',{width:600});"><img src="/images/idetailpage.gif" border="0" alt="Domain Creation Log" /></a></td>
  <td align="center"><a href="#" onclick="launchwin('edomain149', '<span style=color:#2a5a7c;>Edit Domain</span>', '/domains/new_domain.cfm?edit=149',{width:350,height:250,center:true,modal:true});"><img src="/images/edit.gif" alt="Edit" border="0"></a></td>
  <td align="center"><a href="#" onclick="confirmation('Warning. If you delete a Domain it will delete all Domain Links. Continue?','miniwin','/domains/domain_b.cfm?del=149');"><img src="/images/drop.gif" alt="Delete" border="0"></a></td>
</tr>

I can get the link text 'xyxyxyxyxyxyxyx.com', but I also need to get the images that appear next to it. They signify the status of the URL: green = live, red = down. Here is what I am getting as output:

xyxyxyxyxyxyxyx.com DomainSite

and nothing else.

Link to comment https://forums.phpfreaks.com/topic/180153-parsing-a-table-with-domxpath/
abazoskib Posted November 3, 2009 Author

Forgot to mention: I know what is happening, I just can't get the information I need. Since the remaining TD elements contain no text, my program returns only the text that is available from each tr. I need to pull the additional img src attributes from the same tr.
kronisk Posted November 11, 2009

Not entirely sure what you want to do, but this will get the image src under the 'Used' column:

$xpath = new DOMXPath( $doc );
$elements = $xpath->query( "//tr/td[3]/a/img/@src" );
foreach($elements as $result){
    echo $result->textContent;
}
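The same idea can be extended to collect the domain name together with every image in the same row, which is what the original question asks for. The following is only a minimal sketch under stated assumptions: `$html` is a hypothetical, trimmed-down stand-in for the table in the question, `extractRows()` is an illustrative helper name that does not appear in the thread, and the row filter `//tr[td[1]//a]` assumes that only data rows have a link in their first cell. It relies on the optional context-node argument of `DOMXPath::query()` to run relative queries inside each `tr`.

```php
<?php
// Hypothetical sample markup, loosely modeled on the table in the question.
$html = <<<HTML
<table>
  <tr><th>Domain</th><th>Registrar</th><th>Used</th><th>Live</th></tr>
  <tr id="149">
    <td><span><a href="http://example.com">example.com</a></span></td>
    <td>DomainSite</td>
    <td><a href="#"><img src="/images/red.gif" border="0"/></a></td>
    <td><div><img src="/images/green.gif" alt="Yes"></div></td>
  </tr>
</table>
HTML;

// Illustrative helper (not from the thread): returns one entry per data row
// with the domain text and the src of every <img> found in that row.
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world markup is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Assumption: data rows are the ones with a link in their first cell.
    foreach ($xpath->query('//tr[td[1]//a]') as $tr) {
        // Relative query, evaluated with the current row as context node.
        $link = $xpath->query('td[1]//a', $tr)->item(0);

        $images = [];
        // Every img/@src below this row, regardless of wrapping <a> or <div>.
        foreach ($xpath->query('.//img/@src', $tr) as $src) {
            $images[] = $src->nodeValue;
        }

        $rows[] = [
            'domain' => trim($link->textContent),
            'images' => $images,
        ];
    }

    return $rows;
}

foreach (extractRows($html) as $row) {
    echo $row['domain'], ': ', implode(', ', $row['images']), "\n";
}
```

Using `.//img/@src` rather than a fixed path such as `td[3]/a/img/@src` means images wrapped in a `div` or sitting directly in a `td` are picked up as well; the position of each entry in `images` follows document order, so it can be matched back to the column it came from if the columns always contain an image.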
Archived
This topic is now archived and is closed to further replies.