weep

Members
  • Posts

    32
About weep

  • Birthday 02/03/1985

Profile Information

  • Gender
    Male
  • Location
    Sweden

weep's Achievements

Newbie

Newbie (1/5)

0

Reputation

  1. Fixed the HTML with the Tidy extension! Works perfectly now.
  2. Hi, I have stumbled upon a weird issue where I am trying to reuse "broken" code. Here is part of the source:

     <?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
     <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
     <meta http-equiv="content-type" content="text/html;charset=utf-8">
     <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial</style>
     </meta>
     </head>
     <table class="responsedata">
     <thead>
     <tr>
     <th>Ärendenr</th> <th>Status</th> <th>Ärende skapat datum</th>
     And so on...

     I then save the source with file_put_contents in order to do other creepy stuff with it, but since the code is broken (no body tag, and a stray meta end tag) it gives me a huge headache. Now, here is the interesting part: if I save that same source using a browser, it fixes the code for me!

     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
     <!-- saved from url=(0075)https://xxxxxxxxxxx?period=3d&format=html -->
     <html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
     <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial</style>
     <style type="text/css"> </style>
     </head>
     <body>
     <table class="responsedata">
     <thead>
     <tr>
     <th>Ärendenr</th> <th>Status</th> <th>Ärende skapat datum</th> <th>Skapad av</th>
     And so on...

     Bam! Suddenly I have a body tag and all kinds of cool stuff, and it works perfectly. Is there a way to do this same thing via PHP?
  3. Bump: What if we treat that file as an XML file (save it as .xml and reload it)? Would that help?
  4. It's not a problem, the data is ours to use and we pay good money for it. It's just that we want to do some parts ourselves
  5. Hehe, I could try talking to them but I doubt it will help (it's in their interest to prevent me from succeeding). Removing body from path did not help...
  6. Unfortunately that is what I have to work with, that is the way the file is delivered. Is there no way to work around this without manual editing?
  7. Hi guys,

     Some time ago I asked for help with xpath and swiftly received it, thank you! Now I have almost the same problem. Here is where it all started: http://forums.phpfre...h/#entry1397191

     It was working perfectly for some time, until today. It seems that the provider made a change to his source and I cannot for the life of me find what the problem is. First off, Maq said in the previous thread: It is now closed and gives me a warning instead, forcing me to go with @$husdjur->loadHTMLFile. So far so good, but that's where my luck ends... I assume that my old xpath is wrong, but I can't figure out why...

     My code, where I grab the value from every cell and poke them inside a database:

     $husdjur = new DOMDocument();
     @$husdjur->loadHTMLFile("mellanlagring.html");
     $xpath = new DOMXPath($husdjur);
     $xpath->registerNamespace("xmlns", "http://www.w3.org/1999/xhtml");
     *snip*
     $tableRows = $xpath->query('/html/body/table/tbody/tr/');
     *snip*
     foreach ($tableRows as $row) {
         $cells = $xpath->query('td', $row);
         foreach ($cells as $cell) {
             $cellvalue[$i] = $cell->nodeValue;
             $cellvalue[$i] = utf8_decode($cellvalue[$i]);
             $i++;
         }
         $sql = "INSERT INTO remotexdump *snip*)";
         mysql_query($sql,$con);
         $i = 0 ;
     }

     The new .html code:

     <?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
     <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
     <meta http-equiv="content-type" content="text/html;charset=utf-8">
     <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial, monaco, monospace; font-size: 11pt } TABLE.responsedata,TABLE.responsedata TD { border: 1px solid #ccc; border-collapse: collapse; vertical-align: top; } TABLE.responsedata TD { padding-right: 0.2em; } TABLE.responsedata TH { border-bottom: 1px solid #000; }</style>
     </meta>
     </head>
     <table class="responsedata">
     <thead>
     <tr>
     <th>Ärendenr</th> <th>Status</th> <th>Ärende skapat datum</th> <th>Skapad av</th> <th>Ändrad</th> <th>Ändrad av</th> <th>Titel (*)</th> <th>Affärssystem Id</th>
     and so on...
     </tr>
     </thead>
     <tr style="color: #f00">
     <td>7968231231241</td> <td>Påbörjad</td> <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2001-02-18 12:09</td> <td>Rapid2222222</td> <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2003-02-18 12:24</td> <td>shs</td> <td>Strömlöst i korridorerna </td> <td>Strömlöst i korridorerna </td> <td>12xxx4</td><td>XXXX AB - Fast</td><td>xxxx</td> <td>Hus 02</td><td>Röntgen</td><td>Objekt</td><td>120xxxx</td>
     and so on...

     Any help is much appreciated!
  8. Philip, how far out are we with that? Weeks, months, years? Josh, I may be generous but I'm not made of money, haha. So I will do as Christian and wait (forcing you guys to work faster on a solution).
  9. Oh well, I think I will just have to keep my money
  10. Hi, I remember that there was an option to subscribe/donate. Has this been removed, or am I simply so retarded that I can't find it? Cheers
  11. Thank you for all your help guys! Solution for this thread:

      // Report all PHP errors
      error_reporting(E_ALL);
      error_reporting(-1);
      $husdjur = new DOMDocument();
      $husdjur->loadHTMLFile("test.html");
      $xpath = new DOMXPath($husdjur);
      $xpath->registerNamespace("xmlns", "http://www.w3.org/1999/xhtml");
      $tableRows = $xpath->query('/html/body/table/tbody/tr');
      foreach ($tableRows as $row) {
          $cells = $xpath->query('td', $row);
          foreach ($cells as $cell) {
              echo $cell->getNodePath();
              echo ' has value ';
              var_export($cell->nodeValue);
              echo "<br>\n";
          }
      }
  12. Sorry for the delay. Sweet, plenty of awesome tips to try. I will poke around for a bit and return with a solution/result/more questions. I want to grab every cell within every <tr>, see picture:
  13. Hey guys,

      Can't seem to wrap my head around this. This is what I have:

      $husdjur = new DOMDocument();
      @$husdjur->loadHTML("test.html");
      $xpath = new DOMXPath($husdjur);
      $tableRows = $xpath->query('/html/body/table/tbody/tr[1]/td[1]');
      print_r($tableRows);

      And this is what I get:

      DOMNodeList Object ( )

      Here is a sample of test.html (in this case, I am going after the "5166" entry, this file is massive):

      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      <!-- saved from url=(0077)https://xxxxxxxxxxx.net/api/excel/usagequantities?period=300d&format=html -->
      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial, monaco, monospace; font-size: 11pt } TABLE.responsedata,TABLE.responsedata TD { border: *snip*</style>
      </head>
      <body>
      <table class="responsedata">
      <thead>
      <tr>
      <th>Ärendenr</th> <th>Status</th> <th>Ärende skapat datum</th> <th>Skapad av</th> <th>Ändrad</th> <th>Ändrad av</th>
      And so on, 50 something more...
      </tr>
      </thead>
      <tbody>
      <tr>
      <td>5166</td> <td>Avslutad</td> <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2012-10-08 10:27</td> <td>Name1</td> <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2012-10-08 10:27</td> <td>Name2</td> <td>K8 norr städ</td>
      And so on, 50 something more...

      Any help much appreciated, cheers!
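
Two things seem to run through the DOMDocument posts above. `loadHTML()` expects a string of markup while `loadHTMLFile()` expects a path, so `loadHTML("test.html")` parses the literal text "test.html". And as far as I know, libxml does not add a `<tbody>` the way a browser does when saving a page, so an absolute path through `/tbody/` only matches the browser-saved file. Below is a minimal sketch of a more tolerant row extractor. `extractTableRows()` is a hypothetical helper name, not something from the threads, and the sample markup is a cut-down stand-in for the provider's file.

```php
<?php
// Sketch only: extractTableRows() is a hypothetical helper, not code from the threads.

/**
 * Returns every data row (a <tr> that contains <td> cells) as an array of
 * trimmed cell values, whether or not the markup has <body> or <tbody>.
 */
function extractTableRows(string $html): array
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html); // use loadHTMLFile($path) when reading from disk

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    $rows = [];
    // '//table//tr[td]' does not depend on the exact chain of ancestors,
    // and it skips the header row because that row only holds <th> cells.
    foreach ($xpath->query('//table//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->nodeValue);
        }
        $rows[] = $cells;
    }

    return $rows;
}

// Usage: a table placed directly after </head>, with no <body> and no <tbody>.
$sample = '<html><head><title>t</title></head><table><thead><tr><th>Nr</th>'
        . '<th>Status</th></tr></thead><tr><td>5166</td><td>Avslutad</td></tr>'
        . '</table></html>';
print_r(extractTableRows($sample));
```

The relative query is a trade-off: it would also pick up rows from nested tables, which the sample files here do not appear to have. If the document has to be repaired on disk rather than just read, the Tidy extension mentioned in the first post is the other route.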