
weep


Posts posted by weep

  1. Hi,

     

    I have stumbled upon a weird issue while trying to reuse some "broken" markup. Here is part of the source:

     

    <?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial</style>
    [color=#ff0000]</meta>[/color]
    </head>
    <table class="responsedata">
    <thead>
    <tr>
    <th>Ärendenr</th>
    <th>Status</th>
    <th>Ärende skapat datum</th>
    And so on...
    

     

    I then save the source with file_put_contents so I can process it further, but since the markup is broken (no body tag, and a stray meta end tag) it gives me a huge headache. Now, here is the interesting part: if I save that same source from a browser, it fixes the markup for me!

     

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <!-- saved from url=(0075)https://xxxxxxxxxxx?period=3d&format=html -->
    <html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial</style>
    <style type="text/css">
    </style>
    </head>
    [color=#ff0000]<body>[/color]
    <table class="responsedata">
    <thead>
    <tr>
    <th>Ärendenr</th>
    <th>Status</th>
    <th>Ärende skapat datum</th>
    <th>Skapad av</th>
    And so on...
    

     

    Bam! Suddenly I have all kinds of cool stuff and it works perfectly. Is there a way to do this same thing via PHP?
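
One possible approach, sketched under assumptions: PHP's DOMDocument uses a lenient HTML parser that, much like a browser, adds an implied body element and ignores stray end tags. Loading the markup with loadHTML() and writing it back with saveHTML() should therefore give a repaired copy. The `$broken` string below is a shortened, hypothetical stand-in for the real source, and the exact output formatting can differ between libxml versions.

```php
<?php
// Hypothetical, shortened stand-in for the provider's markup:
// no <body> tag and a stray </meta> end tag, as described in the post.
$broken = '<html><head>'
    . '<meta http-equiv="content-type" content="text/html;charset=utf-8">'
    . '<style type="text/css">TABLE.responsedata { font-size: 11pt }</style>'
    . '</meta></head>'
    . '<table class="responsedata"><tr><td>5166</td></tr></table>'
    . '</html>';

// Collect parser complaints internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($broken);
libxml_clear_errors();

// Serialising the parsed tree writes out the repaired structure.
$fixed = $doc->saveHTML();
echo $fixed;
```

The repaired string in `$fixed` can then be passed to file_put_contents() in place of the raw source.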

  2. Hi guys,

     

    Some time ago I asked for help with xpath and swiftly received it, thank you! Now, I have almost the same problem. Here is where it all started:

     

    http://forums.phpfre...h/#entry1397191

     

    It was working perfectly for some time until today. It seems that the provider made a change to their source, and I cannot for the life of me find what the problem is. First off, Maq said in the previous thread:

    "First, try closing the <meta> element so it's valid XHTML."

     

    It is now closed and gives me a warning instead, forcing me to go with @$husdjur->loadHTMLFile.

    :tease-01:

     

    So far so good, but that's where my luck ends... I assume that my old xpath is wrong, but I can't figure out why...

     

    Warning: DOMXPath::query() [domxpath.query]: Invalid expression

     

    My code, where I grab the value from every cell and poke them inside a database:

     

    $husdjur = new DOMDocument();
    @$husdjur->loadHTMLFile("mellanlagring.html");
    $xpath = new DOMXPath($husdjur);
    $xpath->registerNamespace("xmlns", "http://www.w3.org/1999/xhtml");
    *snip*
    $tableRows = $xpath->query('/html/body/table/tbody/tr/');
    *snip*
    foreach ($tableRows as $row) {
    $cells = $xpath->query('td', $row);
    
    
    foreach ($cells as $cell) {
    
    $cellvalue[$i] = $cell->nodeValue;
    $cellvalue[$i] = utf8_decode($cellvalue[$i]);
    $i++;
    }
    
    
    $sql = "INSERT INTO remotexdump *snip*)";
    mysql_query($sql,$con);
    
    
    $i = 0 ;
    }
    

     

    The new .html code:

     

    <?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial, monaco, monospace; font-size: 11pt } TABLE.responsedata,TABLE.responsedata TD { border: 1px solid #ccc; border-collapse: collapse; vertical-align: top; } TABLE.responsedata TD { padding-right: 0.2em; } TABLE.responsedata TH { border-bottom: 1px solid #000; }</style>
    </meta>
    </head>
    <table class="responsedata">
    <thead>
    <tr>
    <th>Ärendenr</th>
    <th>Status</th>
    <th>Ärende skapat datum</th>
    <th>Skapad av</th>
    <th>Ändrad</th>
    <th>Ändrad av</th>
    <th>Titel (*)</th>
    <th>Affärssystem Id</th>
    and so on...
    </tr>
    </thead>
    <tr style="color: #f00">
    <td>7968231231241</td>
    <td>Påbörjad</td>
    <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2001-02-18 12:09</td>
    <td>Rapid2222222</td>
    <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2003-02-18 12:24</td>
    <td>shs</td>
    <td>Strömlöst i korridorerna </td>
    <td>Strömlöst i korridorerna </td>
    <td>12xxx4</td><td>XXXX AB - Fast</td><td>xxxx</td>
    <td>Hus 02</td><td>Röntgen</td><td>Objekt</td><td>120xxxx</td>
    and so on...
    

     

    Any help is much appreciated!
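
A hedged sketch of what may be going on here: an XPath expression may not end with a `/`, so `'/html/body/table/tbody/tr/'` is rejected as invalid before the document is even looked at. In addition, the new markup has its data rows directly under the table with no tbody element, so a path that insists on tbody would likely match nothing even once the slash is removed. The example below selects every row that contains td cells, wherever it sits in the table. The markup in `$html` is a shortened, hypothetical stand-in for mellanlagring.html.

```php
<?php
// Hypothetical, shortened stand-in for mellanlagring.html (no <body>, no <tbody>).
$html = '<html><head><title>t</title></head>'
    . '<table class="responsedata">'
    . '<thead><tr><th>Nr</th><th>Status</th></tr></thead>'
    . '<tr><td>7968</td><td>Started</td></tr>'
    . '<tr><td>7969</td><td>Closed</td></tr>'
    . '</table></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// No trailing slash, and //tr[td] matches data rows with or without a <tbody>.
$rows = [];
foreach ($xpath->query('//table[@class="responsedata"]//tr[td]') as $row) {
    $cells = [];
    foreach ($xpath->query('td', $row) as $cell) {
        $cells[] = trim($cell->nodeValue);
    }
    $rows[] = $cells;
}
print_r($rows);
```

Because the header row only contains th cells, the `[td]` predicate skips it. Each entry in `$rows` is one table row, ready to be inserted into the database.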

  3. Thank you for all your help guys! :happy-04: Solution for this thread:

     

    // Report all PHP errors
    error_reporting(-1);
    $husdjur = new DOMDocument();
    $husdjur->loadHTMLFile("test.html");
    $xpath = new DOMXPath($husdjur);
    $xpath->registerNamespace("xmlns", "http://www.w3.org/1999/xhtml");

    $tableRows = $xpath->query('/html/body/table/tbody/tr');
    foreach ($tableRows as $row) {
        $cells = $xpath->query('td', $row);
        foreach ($cells as $cell) {
            echo $cell->getNodePath();
            echo ' has value ';
            var_export($cell->nodeValue);
            echo "<br>\n";
        }
    }
    

  4. Sorry for the delay :sweat:

     

    Sweet, plenty of awesome tips to try. I will poke around for a bit and return with a solution/result/more questions.

     

    No, XPath indexing starts at 1. Also, your expression matches on <td>Avslutad</td>.

     

    Weep, if you tell us what exactly you're trying to match on, we can give you the best XPath solution.

     

    I want to grab every cell within every <tr>, see picture:

     

    11553699.jpg

  5. Hey guys,

     

    Can't seem to wrap my head around this. This is what I have:

     

    $husdjur = new DOMDocument();
    @$husdjur->loadHTML("test.html");
    $xpath = new DOMXPath($husdjur);
    $tableRows = $xpath->query('/html/body/table/tbody/tr[1]/td[1]');
    print_r($tableRows);
    

     

    And this is what I get:

     

    DOMNodeList Object ( )
    

     

    Here is a sample of test.html (in this case, I am going after the "5166" entry, this file is massive):

     

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <!-- saved from url=(0077)https://xxxxxxxxxxx.net/api/excel/usagequantities?period=300d&format=html -->
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <style type="text/css">TABLE.responsedata { font-family: Calibri, Arial, monaco, monospace; font-size: 11pt } TABLE.responsedata,TABLE.responsedata TD { border: *snip*</style>
    </head>
    <body>
    <table class="responsedata">
    <thead>
    <tr>
    <th>Ärendenr</th>
    <th>Status</th>
    <th>Ärende skapat datum</th>
    <th>Skapad av</th>
    <th>Ändrad</th>
    <th>Ändrad av</th>
    And so on, 50 something more...
    </tr>
    </thead>
    <tbody>
    <tr>
    <td>5166</td>
    <td>Avslutad</td>
    <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2012-10-08 10:27</td>
    <td>Name1</td>
    <td style="mso-number-format:'yyyy-mm-dd hh:mm';">2012-10-08 10:27</td>
    <td>Name2</td>
    <td>K8 norr städ</td>
    And so on, 50 something more...
    

     

    Any help is much appreciated, cheers!
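
A likely cause, sketched under assumptions: loadHTML() parses its argument as markup, so `loadHTML("test.html")` builds a document out of the literal text "test.html" and the query finds nothing. loadHTMLFile() is the call that reads from a path. The example below writes a tiny, hypothetical sample to a temporary file so that both calls can be compared side by side.

```php
<?php
// Hypothetical miniature of test.html, written to a temporary file.
$html = '<html><body><table><tbody>'
    . '<tr><td>5166</td><td>Avslutad</td></tr>'
    . '</tbody></table></body></html>';
$path = tempnam(sys_get_temp_dir(), 'xp');
file_put_contents($path, $html);

libxml_use_internal_errors(true);
$query = '/html/body/table/tbody/tr[1]/td[1]';

// loadHTML() treats the string itself as the document, not as a file name.
$fromString = new DOMDocument();
$fromString->loadHTML('test.html');
$emptyCount = (new DOMXPath($fromString))->query($query)->length;

// loadHTMLFile() reads the markup from disk.
$fromFile = new DOMDocument();
$fromFile->loadHTMLFile($path);
$first = (new DOMXPath($fromFile))->query($query)->item(0)->nodeValue;

libxml_clear_errors();
unlink($path);

echo $emptyCount, "\n", $first, "\n";
```

Note also that XPath positions start at 1, so `tr[1]/td[1]` is the first cell of the first row.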

  6. Great. Please post your findings such that others can learn from your mistakes.

     

    Sweet jesus, I completely forgot about this thread! Here it goes, if anyone cares:

     

    $username="XXXXX";
    $password="XXXX";
    $base_url="https://xxxxxx.net/api/";
    $href_url="resources/120529-0039/workorders";
    $ch = curl_init($base_url . $href_url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
    curl_setopt($ch, CURLOPT_SSLVERSION,3);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept"=>"application/json"));
    curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
    $result = curl_exec($ch);
    
    $husdjur = new DOMDocument();
    $husdjur->loadXML($result);
    
    $i = 0;
    $href2 = array();

    // Walk the <Link> elements until item() returns null (no more matches).
    while (($href = $husdjur->getElementsByTagName('Link')->item($i)) !== null) {
        $href2[$i] = $href->getAttribute('p3:href');
        echo "<br>", $href2[$i];
        $i++;
    }
    

     

    I guarantee that there are better ways of doing this, but this is how I solved it.

  7. Hi guys,

     

    I am trying to grab some data from an API by using cURL and DOMDocuments. This is what I have:

     

    $username="XXXXX";
    $password="XXXX";
    $base_url="https://xxxxxx.net/api/";
    $href_url="resources/120529-0039/workorders";
    $ch = curl_init($base_url . $href_url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
    curl_setopt($ch, CURLOPT_SSLVERSION,3);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept"=>"application/json"));
    curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
    $result = curl_exec($ch);
    

     

    So far so good, at this point if I echo the $result, and look at the source, I would get:

     

    <?xml version="1.0" encoding="utf-8"?>
    <References xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" p3:href="resources/120529-0039/workorders" xmlns:p3="http://www.w3.org/1999/xlink" xmlns="http://xxxxxx.net/Apps/20090225/Entities">
    <Link p3:href="workorders/3012-23" p3:title="Brandlarmsprov(mar-dec)" />
    <Link p3:href="workorders/3021-27" p3:title="Reservkraft(2:a torsdagen varje månad)" />
    <Link p3:href="workorders/4375-1" p3:title="Uppdatering av IBK ritningar efter heltäckande brandlarm." />
    </References>
    

     

     

    I am after the "workorders/3012-23" part. Here it all falls apart for me. I have been playing around with:

     

    
    $husdjur = new DOMDocument();
    $husdjur->loadXML($result);
    $href = $husdjur->getElementsByTagName('Link')->item(0)->getAttribute('p3:href');
    

     

    And was able to populate $href with one value, but I want them all. I need the following:

     

    $href[0] = "workorders/3012-23"

    $href[1] = "workorders/3021-27"

    $href[2] = "workorders/4375-1"

    and so on...

     

    Any thoughts? Thanks!
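
A minimal sketch of one way to collect every value: getElementsByTagName() returns a list that can be iterated with foreach, so each Link element's attribute can be appended to an array. The XML below is the sample from the post with shortened titles, and the default namespace URL is a placeholder. Looking the attribute up by its namespace URI with getAttributeNS() is an optional choice here, so the code does not depend on the `p3` prefix.

```php
<?php
// Sample response modelled on the post; titles shortened, default namespace is a placeholder.
$result = '<References xmlns:p3="http://www.w3.org/1999/xlink"'
    . ' xmlns="http://example.invalid/Entities"'
    . ' p3:href="resources/120529-0039/workorders">'
    . '<Link p3:href="workorders/3012-23" p3:title="A" />'
    . '<Link p3:href="workorders/3021-27" p3:title="B" />'
    . '<Link p3:href="workorders/4375-1" p3:title="C" />'
    . '</References>';

$husdjur = new DOMDocument();
$husdjur->loadXML($result);

$href = [];
foreach ($husdjur->getElementsByTagName('Link') as $link) {
    // Resolve the attribute by namespace URI so the prefix (p3) does not matter.
    $href[] = $link->getAttributeNS('http://www.w3.org/1999/xlink', 'href');
}
print_r($href);
```

After the loop, `$href[0]`, `$href[1]` and `$href[2]` hold the three workorder paths in document order.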

  8. Hi, very sorry for the delay (I'm out on holidays) and thank you for this example! Additional question: Is there a way to limit this to entries where the $array value is "something"?

     

    I.e.:

     

    $array[0] = "asd";

    $array[1] = "ter";

    $array[2] = "asd";

    $array[3] = "asd";

    $array[4] = "xfh";

    $array[5] = "xfh";

     

    Your solution would print all dupes, but what if I am interested in only "asd"? How can I extract those, so I get:

     

    $newarray[0] = "0"

    $newarray[1] = "2"

    $newarray[2] = "3"

     

    and nothing more. Once again, thanks!
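
For this case the built-in array_keys() may be enough: given a search value as its second argument, it returns only the keys whose value matches. A small sketch using the array from the post (the third argument turns on strict comparison, which is an optional choice here):

```php
<?php
$array = ['asd', 'ter', 'asd', 'asd', 'xfh', 'xfh'];

// Keys of every element equal to "asd"; true requests strict (===) comparison.
$newarray = array_keys($array, 'asd', true);
print_r($newarray);
```

The keys come back as integers (0, 2, 3). If strings are needed, as in the desired output above, `array_map('strval', $newarray)` converts them.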

     

  9. Good day,

     

    Is there a simple way to find and list duplicates in an array and insert them into a new one? I have:

     

    $array[0] = "asd"

    $array[1] = "ter"

    $array[2] = "asd"

    $array[3] = "asd"

    $array[4] = "xfh"

     

    From that, I need this:

     

    $newarray[0] = "0"

    $newarray[1] = "2"

    $newarray[2] = "3"

     

    I have poked around "array_not_unique" and "in_array" a bit, but can't wrap my head around it enough for it to work... Any and all help is much appreciated. Thanks in advance!
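
One way this could be done, as a sketch: count how often each value occurs with array_count_values(), then keep the index of every element whose value occurs more than once. The variable names follow the post.

```php
<?php
$array = ['asd', 'ter', 'asd', 'asd', 'xfh'];

// Map each value to the number of times it appears.
$counts = array_count_values($array);

// Keep the index of every element whose value appears more than once.
$newarray = [];
foreach ($array as $index => $value) {
    if ($counts[$value] > 1) {
        $newarray[] = $index;
    }
}
print_r($newarray);
```

Note that array_count_values() only accepts string and integer values, which fits the data shown here.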

  10. As long as I know which line they are on (I should be able to count them as I work my way through the file, right?) then I should be fine. The idea is to reconstruct an Excel file, going through CSV.

     

    Excel 1 -> CSV -> Excel 2

    Excel 1 is a mess...

     

    Should be enclosed in quotes you say, but they are not... Here is a sample of how it really looks:

     

    2011-03-12,,,,,John Doe,,,,,,,,,,Släpt ut dam ur hiss, stängt av hiss, felanmäld till XXXX. Jour.,,,,,,,,,,,,,,KÖ,,,1,50,,,1290,00 kr,,,,,,,,,
    Totalt arbetstid,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,50,,,1290,00 kr,,,,,,,,,
    ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
    Material,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

     

    Makes my eyes bleed...
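
A rough sketch of the line-counting idea, under stated assumptions: since nothing is quoted, free text and decimal commas ("1290,00 kr") both add extra fields, so lines whose field count differs from the expected number can be flagged by line number for special handling. The sample lines and the expected count of 4 are hypothetical; the real export has far more columns.

```php
<?php
// Hypothetical sample data; the real export has many more columns.
$lines = [
    '2011-03-12,John Doe,Reset lift,1290 kr',
    '2011-03-13,Jane Doe,Released lady from lift, switched off lift,1290,00 kr',
    'Material,,,',
];
$expectedFields = 4; // assumption for this sample only

$suspicious = [];
$lineNumber = 0;
foreach ($lines as $line) {
    $lineNumber++;
    // Nothing is quoted, so a plain split on commas mirrors what a CSV reader would see.
    $fieldCount = count(explode(',', $line));
    if ($fieldCount !== $expectedFields) {
        $suspicious[$lineNumber] = $fieldCount;
    }
}
print_r($suspicious);
```

Here line 2 is flagged with 6 fields: one extra from the comma in the description and one from the decimal comma in the amount.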

     

     

     

     

     
