
ian2k01

Members
  • Posts

    15
  • Joined

  • Last visited

    Never

Posts posted by ian2k01

  1. The spaces could be non-breaking spaces (&nbsp;), so try to convert those to regular spaces and then trim() the string:

     

    $str = trim(str_replace('&nbsp;', ' ', $str));

     

    This did not seem to do anything.

     

    I am quite sure that they are spaces, because each of the methods above works, but only when I copy and paste the result into a new file.
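
    One thing that may be worth trying, assuming the page is UTF-8 and the trailing characters are real non-breaking space characters (bytes C2 A0) rather than the literal text "&nbsp;": replace those bytes before trimming. The sample string below is made up for illustration. If the page were ISO-8859-1 instead, the byte to replace would be a single "\xA0".

```php
<?php
// Hypothetical sample: a name followed by four UTF-8 non-breaking spaces (bytes C2 A0).
$raw = "Annita Greuncuard" . str_repeat("\xC2\xA0", 4);

// trim() alone leaves the bytes in place, because they are not in its default character list.
$stillPadded = trim($raw);

// Convert the non-breaking spaces to plain spaces first, then trim.
$clean = trim(str_replace("\xC2\xA0", ' ', $raw));

var_dump(strlen($stillPadded)); // name bytes plus the 8 untouched NBSP bytes
var_dump($clean);
```

    Copying and pasting the output into a new file can silently turn these characters into ordinary spaces, which would explain why the same code appears to work there.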

  2. You're making this harder than it should be. First of all, use trim before you insert data into the DB so there won't be whitespaces afterwards. But trim would take care of the spaces for you.

     

    As you suggested, I have tried both trim and rtrim but it still gives me the same output, with 4 spaces at the end. Is there another way to do so?

  3. Hello, I have a simple yet frustrating problem here. I'm working on a project that parses customer information from PayPal's transaction detail pages into variables and then inserts it into a database.

     

    When parsing the customer's name, there are 4 spaces between the end of the name and the closing </td> tag. This is what the source code looks like:

    <tr valign="top">
    <td align="right" class="label">Sent by:</td>
    <td><br class="textSpacer"></td>
    <td class="small">Annita Greuncuard    </td>
    </tr>
    

     

    I have tried using str_replace to replace the "    " with "", and preg_match_all as shown here:

    preg_match_all('~(.*)    ~', $trMatches[1][0], $name);
    

     

    However, neither attempt worked, and the database stores the "    " as "Â Â Â Â ". Very frustrating.

     

    Any insight?
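
    A quick way to check what those "spaces" really are is to dump their bytes. This is only a diagnostic sketch with a made-up sample string, assuming the scraped page is UTF-8. The byte C2 displays as "Â" when UTF-8 data is read as ISO-8859-1, which would match the "Â Â Â Â " seen in the database.

```php
<?php
// Hypothetical scraped cell: the name followed by four UTF-8 non-breaking spaces.
$cell = "Annita Greuncuard" . str_repeat("\xC2\xA0", 4);

// Show the last eight bytes as hex to see what the trailing "spaces" really are.
$tailHex = bin2hex(substr($cell, -8));
echo $tailHex, "\n";

// A plain ASCII space would be byte 20; C2 A0 is U+00A0 encoded as UTF-8.
$hasNbsp = strpos($cell, "\xC2\xA0") !== false;
var_dump($hasNbsp);
```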

  4. It still doesn't work. I think it has to do with the first "("

    so i skipped the ( and this works fine

    preg_match_all('~Unique Transaction ID #([^)]+)\)</td>~is', $result, $transactionIDs);
    

     

    thank you so much. problem solved

     

    Well, I am seeing .* instead of .*?... but you shouldn't need either.. my solution of ([^)]+) should do just as well, as this matches anything that is not a ), one or more times.. Additionally, I am seeing that you misspelled Unique (you have: Unieque). If this misspelling is due to you retyping this stuff in the post, cut and paste those things instead.. less room for error that way.

     

     

    If you are not getting anything returned, this is a sign that the code you are checking doesn't conform to the pattern.. As I mentioned in my previous post, it is assumed that the code is structured exactly as the sample you provided.. if there are any differences among the other samples, the pattern will not work.

     

    What is the site you are scraping? I can view the source and see what is going on with those kinds of lines..
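
    For literal symbols such as "(" and "#", one option is to let preg_quote() do the escaping and then capture with the negated class described above. This is a minimal sketch using the sample line from the thread; the variable names are made up for illustration.

```php
<?php
// Sample line copied from the thread.
$html = '<span class="emphasis">Payment Received</span> (Unique Transaction ID #A1A1A1A1A1A1A1A1A)</td></tr>';

// preg_quote() escapes regex metacharacters such as "(" in literal text;
// the second argument also escapes the chosen delimiter.
$prefix = preg_quote('(Unique Transaction ID #', '~');

// [^)]+ captures everything up to, but not including, the closing parenthesis.
$pattern = '~' . $prefix . '([^)]+)\)~';

if (preg_match($pattern, $html, $m)) {
    echo $m[1], "\n";
}
```

    Building the pattern from a quoted literal avoids having to remember which characters need a backslash.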

  5. The A1A1A1A1A1A1A1A1A is actually a placeholder for a mix of upper and lower case letters and numbers, so I'm using "(.*?)" for that. But I think the problem occurs around "\(Unique ...", and the pattern returns nothing.

     

    preg_match_all('~<span class="emphasis">Payment Received</span> \(Unieque Transaction ID #(.*)\)</td></tr>~is',$result,$transactionIDs);
    

     

     

    Assuming a) you are going the regex route, b) the span tag is structured exactly like what you have it, and c) you only want A1A1A1A1A1A1A1A1A, you can do something like this:

     

    Example:

    $result = <<<HTML
    <span class="emphasis">Payment Received</span> (Unique Transaction ID #A1A1A1A1A1A1A1A1A)</td></tr>
    HTML;
    
    preg_match_all('~<span class="emphasis">Payment Received</span> \(Unique Transaction ID #([^)]+)\)</td>~si', $result, $transactionIDs);
    echo 'Unique Transaction ID # ' . $transactionIDs[1][0]; // Unique Transaction ID # A1A1A1A1A1A1A1A1A
    

  6. Got it, thank you. I have another question: how can I parse out symbols like "(" or "#"?

     

    I am trying to extract the id from this code:

     

    <span class="emphasis">Payment Received</span> (Unique Transaction ID #A1A1A1A1A1A1A1A1A)</td></tr>
    

    and this is what i have so far

    preg_match_all('~<span class="emphasis">Payment Received</span>(.*?)</td></tr>~is',$result,$transactionIDs);
    

     

    Using DOM / XPath, you can fetch all those specific <tr> tags' content in one fell swoop:

     

    Example:

    // You won't use this $code heredoc..I'm just using this to test on that snippet of code...
    $code = <<<HTML
    <tr bgcolor="#FFFFFF">
    <td class="small" nowrap> Jun. 8, 2009</td>
    <td class="small" nowrap> Transfer</td>
    <td class="small" nowrap> To</td>
    <td class="small" nowrap> Me</td>
    <td class="small" nowrap> Pending</td>
    <td class="small" nowrap><a href="https://history.paypal.com/us/cgi-bin/webscr?cmd=_history-1">Details</a></td>
    <td class="small" nowrap> <img align="top" alt="" border="0" height="17" src="https://www.paypalobjects.com/WEBSCR-580-20090604-1/en_US/i/scr/pixel.gif" width="1">
    </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    <td align="right" class="small" nowrap>$0.00 USD </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    </tr>
    <tr bgcolor="#EEEEEE">
    <td class="small" nowrap> Jun. 8, 2009</td>
    <td class="small" nowrap> Payment</td>
    <td class="small" nowrap> From</td>
    <td class="small" nowrap> Tom</td>
    <td class="small" nowrap> Completed</td>
    <td class="small" nowrap><a href="https://history.paypal.com/us/cgi-bin/webscr?cmd=_history-2">Details</a></td>
    <td class="small" nowrap> <img align="top" alt="" border="0" height="17" src="https://www.paypalobjects.com/WEBSCR-580-20090604-1/en_US/i/scr/pixel.gif" width="1">
    </td>
    <td align="right" class="small" nowrap>$10 USD </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    <td align="right" class="small" nowrap>$9 USD </td>
    </tr>
    HTML;
    
    $dom = new DOMDocument;
    @$dom->loadHTML($code); // change this to: @$dom->loadHTMLFile('http://www.somesite.com/someFolder/somefile.php');
    $xpath = new DOMXPath($dom);
    $tableData = $xpath->query('//tr[contains(@bgcolor, "FFFFFF") or contains(@bgcolor, "EEEEEE")]');
    
    foreach ($tableData as $val) {
    echo $val->nodeValue . "<br />\n";
    }
    

     

    Output:

    Jun. 8, 2009 Transfer To Me Pending Details -$1 USD $0.00 USD -$1 USD 
    Jun. 8, 2009 Payment From Tom Completed Details $10 USD -$1 USD $9 USD 
    

     

    Or if you want all those as separate entries, you can change:

    $tableData = $xpath->query('//tr[contains(@bgcolor, "FFFFFF") or contains(@bgcolor, "EEEEEE")]');

    To:

    $tableData = $xpath->query('//tr[contains(@bgcolor, "FFFFFF") or contains(@bgcolor, "EEEEEE")]/td');

    (I've added /td at the end).

     

    In either case, if you go this route, don't forget to change the @$dom->loadHTML() line to the suggested one that is commented out (using the actual URL you want to use, obviously).
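
    If the cells need to stay grouped by row (for example, one database record per row), a relative XPath query inside the loop is one way to do it. This is a sketch on a trimmed-down, hypothetical version of the rows above, not the only way to structure it.

```php
<?php
// Trimmed-down, hypothetical version of the two history rows shown above.
$code = <<<'HTML'
<table>
<tr bgcolor="#FFFFFF"><td> Jun. 8, 2009</td><td> Transfer</td><td align="right">-$1 USD </td></tr>
<tr bgcolor="#EEEEEE"><td> Jun. 8, 2009</td><td> Payment</td><td align="right">$10 USD </td></tr>
</table>
HTML;

$dom = new DOMDocument;
@$dom->loadHTML($code); // @ hides warnings about legacy markup
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query('//tr[@bgcolor="#FFFFFF" or @bgcolor="#EEEEEE"]') as $tr) {
    $cells = [];
    // "./td" is evaluated relative to the current row.
    foreach ($xpath->query('./td', $tr) as $td) {
        $cells[] = trim($td->nodeValue);
    }
    $rows[] = $cells;
}
print_r($rows);
```

    Each entry of $rows is then an array of that row's cell texts, in column order.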

  7. Guys, I need help with parsing out fields in the following code. Thank you in advance :)

     

    <tr bgcolor="#FFFFFF">
    <td class="small" nowrap> Jun. 8, 2009</td>
    <td class="small" nowrap> Transfer</td>
    <td class="small" nowrap> To</td>
    <td class="small" nowrap> Me</td>
    <td class="small" nowrap> Pending</td>
    <td class="small" nowrap><a href="https://history.paypal.com/us/cgi-bin/webscr?cmd=_history-1">Details</a></td>
    <td class="small" nowrap> <img align="top" alt="" border="0" height="17" src="https://www.paypalobjects.com/WEBSCR-580-20090604-1/en_US/i/scr/pixel.gif" width="1">
    </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    <td align="right" class="small" nowrap>$0.00 USD </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    </tr>
    <tr bgcolor="#EEEEEE">
    <td class="small" nowrap> Jun. 8, 2009</td>
    <td class="small" nowrap> Payment</td>
    <td class="small" nowrap> From</td>
    <td class="small" nowrap> Tom</td>
    <td class="small" nowrap> Completed</td>
    <td class="small" nowrap><a href="https://history.paypal.com/us/cgi-bin/webscr?cmd=_history-2">Details</a></td>
    <td class="small" nowrap> <img align="top" alt="" border="0" height="17" src="https://www.paypalobjects.com/WEBSCR-580-20090604-1/en_US/i/scr/pixel.gif" width="1">
    </td>
    <td align="right" class="small" nowrap>$10 USD </td>
    <td align="right" class="small" nowrap>-$1 USD </td>
    <td align="right" class="small" nowrap>$9 USD </td>
    </tr>
    
    

     

    This is what i have got so far, which doesn't seem to be working :/

     

    
    preg_match_all('~<tr[^>]*bgcolor\s?=\s?"#f(?FFFFFF|EEEEEE)"[^>]*>(.*?)</tr>~is',$result,$trMatches);
            foreach ($trMatches[1] as $tr) {
            //get individual fields
              preg_match_all('~<td[^>]*>(.*?)</td>~is',$tr,$tdMatches);
              echo "<pre>";
              print_r($tdMatches[1]);
    
    

     

    Thanks!
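
    For reference, two things in the pattern above look likely to stop it from matching: "(?F" is not valid group syntax (a non-capturing group is written "(?:"), and the literal "f" after "#" does not fit these two colors. The loop is also missing its closing brace. A possible corrected sketch, run on a shortened, hypothetical sample of the rows:

```php
<?php
// Shortened, hypothetical sample of the rows from the post.
$result = <<<'HTML'
<tr bgcolor="#FFFFFF">
<td class="small" nowrap> Jun. 8, 2009</td>
<td class="small" nowrap> Transfer</td>
</tr>
<tr bgcolor="#EEEEEE">
<td class="small" nowrap> Jun. 8, 2009</td>
<td class="small" nowrap> Payment</td>
</tr>
HTML;

// (?:...) groups the two alternatives without capturing them.
preg_match_all('~<tr[^>]*bgcolor\s?=\s?"#(?:FFFFFF|EEEEEE)"[^>]*>(.*?)</tr>~is', $result, $trMatches);

$rows = [];
foreach ($trMatches[1] as $tr) {
    // Get the individual fields of this row.
    preg_match_all('~<td[^>]*>(.*?)</td>~is', $tr, $tdMatches);
    $rows[] = array_map('trim', $tdMatches[1]);
}
print_r($rows);
```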

     

  8. Are you sure that '$dd[4]' explodes into 7 values?

     

    After the explode put:

     

    print_r($expHd);

     

    Now it gives me this, and I'm not really sure what it means:

    Array
    (
        [0] => %2F
    )
    PHP Notice:  Undefined offset:  6 in /var/www/html/report.php on line 56
    PHP Warning:  Cannot modify header information - headers already sent by (output started at /var/www/html/report.php:55) in /var/www/html/report.php on line 60
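
    One possible reading of this output: the listing in the original question assigns $dd = $cookie_file_path;, so $dd is a string, and $dd[4] is its fifth character rather than the fifth line of the file. The sketch below assumes an absolute path (the real value is not shown in the thread) and reproduces the single "%2F" element.

```php
<?php
// Hypothetical absolute path; the real value of $cookie_file_path is not shown in the thread.
$cookie_file_path = '/var/www/html/cookies/wn.txt';

// Indexing a string returns a single character, not a line of the file.
$dd = $cookie_file_path;
$expHd = explode('%09', urlencode($dd[4]));
print_r($expHd); // one element only, so index 6 does not exist

// Guard the lookup instead of assuming seven fields.
$val = isset($expHd[6]) ? $expHd[6] : null;
var_dump($val);
```

    The "headers already sent" warning follows from the print_r() output on line 55, which is emitted before the header() call.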
    

     

     

  9. I'm using crontab to run a page every hour. But the crontab-generated mail gives me this message about the ereg_replace(). "PHP Notice:  Undefined offset:  6 in /var/www/html/report.php on line 56". Any ideas?

     

     

    54        $dd = $cookie_file_path;
    55        $expHd=explode("%09",urlencode($dd[4]));
    56        $expHd[6]=urldecode(ereg_replace("%0A","\\0",$expHd[6]));
    57        $val=$expHd[6];
    58        header("Set-Cookie: JSESSIONID=$val; path=/");
    59         echo $val;
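
    Rather than relying on a fixed line and field position, the cookie jar could be read line by line and searched by cookie name. This sketch assumes the jar is in the tab-separated Netscape format that libcurl writes; the domain and the temp file are made up for the demo, and readCookie() is a hypothetical helper.

```php
<?php
// Write a hypothetical Netscape-format cookie jar to a temp file for the demo.
$jar = tempnam(sys_get_temp_dir(), 'jar');
file_put_contents($jar, implode("\n", [
    '# Netscape HTTP Cookie File',
    '# This file was generated by libcurl! Edit at your own risk.',
    '',
    "example.com\tFALSE\t/\tFALSE\t0\tJSESSIONID\t33555EF380C964DBCDF8504A994DCBC1.78n4",
]) . "\n");

/** Return the value of the named cookie, or null if the jar does not contain it. */
function readCookie(string $file, string $name): ?string
{
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        // Skip comments, but keep "#HttpOnly_" lines, which are real cookies.
        if ($line[0] === '#' && strpos($line, '#HttpOnly_') !== 0) {
            continue;
        }
        // Fields: domain, subdomain flag, path, secure, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) === 7 && $fields[5] === $name) {
            return $fields[6];
        }
    }
    return null;
}

$val = readCookie($jar, 'JSESSIONID');
unlink($jar);
var_dump($val);
```

    A null result then signals a missing cookie explicitly, instead of surfacing as an undefined offset notice.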
    

     

  10. Thanks so much Crayon Violent, it works just the way I wanted. :)

     

    This is only going to really be accurate if the only tr's on the page that have those 2 colors are the rows...

    preg_match_all('~<tr[^>]*bgcolor\s?=\s?"#f(?:ceed8|fffff)"[^>]*>(.*?)</tr>~is',$string,$trMatches);
    foreach ($trMatches[1] as $tr) {
      preg_match_all('~<td[^>]*>(.*?)</td>~is',$tr,$tdMatches);
      echo "<pre>"; print_r($tdMatches[1]); // example of where the data is at
      // put your db insert code here based off $tdMatches[1] array
    }
    

  11. Hello,

     

    I am trying to extract data from a cURL result into array and then insert into database. But need some hints on the regex for information extraction here:

    ...
    <TD colspan="12"><img border="0" width="100%" height="1" src="./images/mdot.jpg"></TD>
    </TR>
    <TR height="25" bgcolor="#ffffff">
    <TD></TD><TD align="left" class="bodytext">CLIENT_ID</TD><TD align="left" class="bodytext">TRANSACTION_ID_A</TD><TD align="left"><a class="hyperlink" href="javascript:DetailedTxn('CLIENT_ID', 'TRANSACTION_ID_B', 'ID')">TRANSACTION_ID_B</a></TD><TD align="right" class="bodytext">ID</TD><TD align="left" class="bodytext"><img border="0" width="10" height="1" src="./images/spacer.gif">CUSTOMER_NAME</TD><TD align="right" class="bodytext">AMOUNT</TD><TD align="right" class="bodytext">000</TD><TD align="right" class="bodytext">DATE</TD><TD align="right" class="bodytext"></TD><TD align="right" class="bodytext"><img border="0" width="2" height="1" src="./images/spacer.gif"></TD><TD width="1%"></TD>
    </TR>
    <TR height="25" bgcolor="#fceed8"> ...
    
    

     

    The bgcolor="#ffffff" are odd rows and #fceed8 are even rows.

     

    There is more than one row/record, and I'm not sure how to set each row's data into variables :(

     

    Please help. Thank you!
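
    One way to get each row into named variables is to collect the cells per row and pair them with a list of column names. This is a sketch on simplified, hypothetical rows with placeholder values; the column names are illustrative and would need to follow the real column order.

```php
<?php
// Simplified, hypothetical rows in the shape described above (placeholder values).
$result = <<<'HTML'
<TR height="25" bgcolor="#ffffff">
<TD></TD><TD class="bodytext">CLIENT_1</TD><TD class="bodytext"><img src="./images/spacer.gif">NAME_1</TD><TD class="bodytext">10.00</TD>
</TR>
<TR height="25" bgcolor="#fceed8">
<TD></TD><TD class="bodytext">CLIENT_2</TD><TD class="bodytext"><img src="./images/spacer.gif">NAME_2</TD><TD class="bodytext">20.00</TD>
</TR>
HTML;

// Column names are illustrative; adjust them to the real column order.
$columns = ['blank', 'client_id', 'customer_name', 'amount'];

preg_match_all('~<tr[^>]*bgcolor\s?=\s?"#(?:ffffff|fceed8)"[^>]*>(.*?)</tr>~is', $result, $trMatches);

$records = [];
foreach ($trMatches[1] as $tr) {
    preg_match_all('~<td[^>]*>(.*?)</td>~is', $tr, $tdMatches);
    // strip_tags() removes nested markup such as the spacer <img>.
    $cells = array_map(function ($cell) {
        return trim(strip_tags($cell));
    }, $tdMatches[1]);
    // Only keep rows whose cell count matches the expected columns.
    if (count($cells) === count($columns)) {
        $records[] = array_combine($columns, $cells);
    }
}
print_r($records);
```

    Each element of $records is then an associative array that can be passed to the database insert.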

  12. Sorry not sure what jacking means.

     

    The original code I wanted to preg was

     

    <input type="hidden" name="SwtAcctID" value="_____"><input type="hidden" name="SessionID" value="12434392951101243439295110U[[bR??=96<692"><input type="hidden" name="IsSuperUser" value="FALSE"><input type="hidden" name="IsGroupUser" value="FALSE"><input type="hidden" name="IsAdminUser" value="FALSE"><input type="hidden" name="PageAppList" value="<APPLIST><APP id="qfind" caption="Quick Find" url="qfPaymentInquiry" pageType="static"></APP></APPLIST>
    

     

    But then it grabbed everything until the end. So a friend of mine modified the preg with a "?":

     

    preg_match_all('/name="SessionID" value="(.*?)"/i', $str, $match);
    

     

    The code works now, thank you guys! :)
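
    To illustrate why the "?" made the difference, here is a small comparison on two adjacent hidden inputs shortened from the markup above. The negated class [^"]* is shown as a third option that cannot run past the closing quote.

```php
<?php
// Two adjacent hidden inputs, shortened from the markup above.
$str = '<input type="hidden" name="SessionID" value="12434392951101243439295110U[[bR??=96<692">'
     . '<input type="hidden" name="IsSuperUser" value="FALSE">';

// Greedy: .* runs to the last double quote in the string.
preg_match('/name="SessionID" value="(.*)"/i', $str, $greedy);

// Lazy: .*? stops at the first double quote that lets the pattern match.
preg_match('/name="SessionID" value="(.*?)"/i', $str, $lazy);

// Negated class: [^"]* can never cross a double quote.
preg_match('/name="SessionID" value="([^"]*)"/i', $str, $negated);

echo $greedy[1], "\n";
echo $lazy[1], "\n";
echo $negated[1], "\n";
```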

  13. Actually I think I need the Regex. But there are more than just one <input type="hidden" ...> on the page. if I only want the SessionID, do i use:

     

    $str = '<input type="hidden" name="SessionID" value="12433690503231243369050323U[[bR??=96<692">';
    
    preg_match_all('/name="SessionID" value="(.*)"/i', $str, $match);
    
    echo '<pre>';
    
    var_dump($match);
    
    echo '</pre>';
    
    
    
    echo $match[1][0];
    

  14. I'm not too familiar with parsing functions or preg_match, and I need help with parsing simple HTML.

     

    Here is the code:

    <input type="hidden" name="SessionID" value="12433690503231243369050323U[[bR??=96<692">

     

    I would like to parse out the value, and set it as a variable. Thank you so much!

  15. Hello everyone, I am working on a PHP cURL project for my company to automate downloading bank statements, including the login process. I get through the login page with no problem and am able to store and reuse the cookie information;

     

    but the site is a bit odd: the session cookie I stored is JSESSIONID, but in order to proceed to the next page after login, I need to put an automatically generated SessionID, which is different from JSESSIONID, in the URL request.

     

    I have used a network monitoring program and was able to see the SessionID as one of the parameters. But how can I do the same in PHP cURL: get a page parameter, set it as a variable, and reuse it in the POST fields? Thank you.

     

    Here is my script

    P.S. the SessionID would be like this: 12426624329771242662432977U[[bR??=96<692

    and JSESSION would be like this: 33555EF380C964DBCDF8504A994DCBC1.78n4

    <?php

     

    // This page will set some cookies and we will use them for Posting in Form data.

     

            $username = "aaa"; // Please set your login ID

            $password = "bbb"; // Please set your password

    $clientid = "ccc";

            $cookie_file_path = "cookies/wn.txt"; // Please set your Cookie File path

     

            $LOGINURL = "https://trackpayments.westernunion.com/";

            $agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL,$LOGINURL);

            curl_setopt($ch, CURLOPT_USERAGENT, $agent);

        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

            curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);

            curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

        $result = curl_exec ($ch);

        curl_close ($ch);

     

    // 2- Post Login Data to Page https://trackpayments.westernunion.com/tally/Logon.do

     

            $LOGINURL = "https://trackpayments.westernunion.com/tally/Logon.do";

        $POSTFIELDS = "Errors=&LoginStatus=N&LogonID=" . $username . "&LogonAcct=" . $clientid . "&SwtAcctID=&SessionID=&IsSuperUser=&IsGroupUser=&IsAdminUser=&PageAppList=&PageFunList=&sys=&JspID=&UserID=" . $username . "&Password=" . $password . "&ClientID=" . $clientid . "&ErrorFlag=";

    $reffer = "https://trackpayments.westernunion.com/";

     

            $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL,$LOGINURL);

            curl_setopt($ch, CURLOPT_USERAGENT, $agent);

        curl_setopt($ch, CURLOPT_POST, 1);

        curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS);

        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

            curl_setopt($ch, CURLOPT_REFERER, $reffer);

            curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);

            curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

        $result = curl_exec ($ch);

     

    //extract session ID from URL

     

     

        curl_close ($ch);

           

    // parse the session id into $val

     

            $dd=file('cookies/wn.txt');

     

            $expHd=explode("%09",urlencode($dd[4]));

     

            $expHd[6]=urldecode(ereg_replace("%0A","",$expHd[6]));

     

            $val=$expHd[6];

     

            header("Set-Cookie: JSESSIONID=$val; path=/");

    echo $val;

     

    // 3- Proceed to Quick Search

     

    $LOGINURL = "https://trackpayments.westernunion.com/tally/DispatchStaticPage.do";

            $POSTFIELDS = "Errors=&LoginStatus=Y&LogonID=" . $username . "&LogonAcct=" . $clientid . "&SwtAcctID=" . $clientid . "&SessionID=" . $val . "&IsSuperUser=FALSE&IsGroupUser=FALSE&IsAdminUser=FALSE&PageAppList=%3CAPPLIST%3E%3CAPP+id%3D%22qfind%22+caption%3D%22Quick+Find%22+url%3D%22qfPaymentInquiry%22+pageType%3D%22static%22%3E%3C%2FAPP%3E%3C%2FAPPLIST%3E%0D%0A&PageFunList=%3CFUNLIST%3E%3CFN%3EF08%3C%2FFN%3E%3CFN%3EW01%3C%2FFN%3E%3CFN%3EF07%3C%2FFN%3E%3C%2FFUNLIST%3E%0D%0A&sys=&JspID=qfPaymentInquiry";

    $reffer = "https://trackpayments.westernunion.com/tally/Logon.do";

     

            $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL,$LOGINURL);

            curl_setopt($ch, CURLOPT_USERAGENT, $agent);

        curl_setopt($ch, CURLOPT_POST, 1);

        curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS);

        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

            curl_setopt($ch, CURLOPT_REFERER, $reffer);

            curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);

            curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

        $result = curl_exec ($ch);

        curl_close ($ch);

     

    // 4- Proceed to Payment by Day

     

    $LOGINURL = "https://trackpayments.westernunion.com/tally/DispatchStaticPage.do";

            $POSTFIELDS = "Errors=&LoginStatus=Y&LogonID=" . $username . "&LogonAcct=" . $clientid . "&SwtAcctID=" . $clientid . "&SessionID=" . $val . "&IsSuperUser=FALSE&IsGroupUser=FALSE&IsAdminUser=FALSE&PageAppList=%3CAPPLIST%3E%3CAPP+id%3D%22qfind%22+caption%3D%22Quick+Find%22+url%3D%22qfPaymentInquiry%22+pageType%3D%22static%22%3E%3C%2FAPP%3E%3C%2FAPPLIST%3E%0D%0A&PageFunList=%3CFUNLIST%3E%3CFN%3EF08%3C%2FFN%3E%3CFN%3EW01%3C%2FFN%3E%3CFN%3EF07%3C%2FFN%3E%3C%2FFUNLIST%3E%0D%0A&sys=&JspID=qfDailyReport&GroupID=&ConsNumber=&MTCN=&Amount=&FromDate=&ToDate=&FromTime=&ToTime=&LastName=&FirstName=&Status=ALL&hdn_currentDate=5%2F21%2F2009&hdn_inputDateFormat=mm%2Fdd%2Fyyyy&Flag=A&QueryFlag=R&PageNo=1&CCID=" . $clientid . "&ReportFlag=P&Today=Thu+May+21+15%3A21%3A41+EDT+2009&ErrorFlag=";

    $reffer = "https://trackpayments.westernunion.com/tally/DispatchStaticPage.do";

     

            $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL,$LOGINURL);

            curl_setopt($ch, CURLOPT_USERAGENT, $agent);

        curl_setopt($ch, CURLOPT_POST, 1);

        curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS);

        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

            curl_setopt($ch, CURLOPT_REFERER, $reffer);

            curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);

            curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

        $result = curl_exec ($ch);

        curl_close ($ch);

     

    // 5- Scrape Daily Payment now

     

    $LOGINURL = "https://trackpayments.westernunion.com/tally/DailyReport.do";

            $POSTFIELDS = "Errors=&LoginStatus=Y&LogonID=" . $username . "&LogonAcct=" . $clientid . "&SwtAcctID=" . $clientid . "&SessionID=" . $val . "&IsSuperUser=FALSE&IsGroupUser=FALSE&IsAdminUser=FALSE&PageAppList=%3CAPPLIST%3E%3CAPP+id%3D%22qfind%22+caption%3D%22Quick+Find%22+url%3D%22qfPaymentInquiry%22+pageType%3D%22static%22%3E%3C%2FAPP%3E%3C%2FAPPLIST%3E%0D%0A&PageFunList=%3CFUNLIST%3E%3CFN%3EF08%3C%2FFN%3E%3CFN%3EW01%3C%2FFN%3E%3CFN%3EF07%3C%2FFN%3E%3C%2FFUNLIST%3E%0D%0A&sys=&JspID=&GroupID=&DayOption=1&FromDate=&RepType=D&hdn_currentDate=&hdn_inputDateFormat=mm%2Fdd%2Fyyyy&ToDate=&ReportFlag=D&Period=6&PageNo=1&CCID=" . $clientid . "&ErrorFlag=";

    $reffer = "https://trackpayments.westernunion.com/tally/DispatchStaticPage.do";

     

            $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL,$LOGINURL);

            curl_setopt($ch, CURLOPT_USERAGENT, $agent);

        curl_setopt($ch, CURLOPT_POST, 1);

        curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS);

        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

            curl_setopt($ch, CURLOPT_REFERER, $reffer);

            curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);

            curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);

        $result = curl_exec ($ch);

        curl_close ($ch);

    echo($result);

     

    ?>
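
    One way to approach the question, sketched under assumptions: pull the SessionID out of the HTML that curl_exec() returns after login, then build the POST body with http_build_query() so its special characters get URL-encoded. The $result string below is a stand-in for the real response, hiddenInputValue() is a hypothetical helper, and only three of the script's fields are shown.

```php
<?php
// Stand-in for the HTML returned by curl_exec() after login (hypothetical snippet).
$result = '<input type="hidden" name="SessionID" value="12426624329771242662432977U[[bR??=96<692">';

/** Return the value of a hidden input by name, or null when it is absent. */
function hiddenInputValue(string $html, string $name): ?string
{
    $pattern = '/name="' . preg_quote($name, '/') . '"\s+value="([^"]*)"/i';
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

$sessionId = hiddenInputValue($result, 'SessionID');

// http_build_query() URL-encodes each value, so "[", "?", "=" and "<" survive the POST.
$postFields = http_build_query([
    'LoginStatus' => 'Y',
    'SessionID'   => $sessionId,
    'JspID'       => 'qfPaymentInquiry',
]);
echo $postFields, "\n";
```

    The resulting string could then be passed to CURLOPT_POSTFIELDS in place of the hand-concatenated $POSTFIELDS.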

     
