
Help with cURL and breaking down the data


seth001


First of all... this is a GREAT forum. Been lurking around here for a long time and searched all over but still can't find a solution to a problem I'm facing.

 

Here's my cURL code - I am basically retrieving data from a web site:

<?php

$curl_handle=curl_init();
curl_setopt($curl_handle,CURLOPT_URL,'http://myprimemobile.com/update.html');
curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,2);
curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,1);
$buffer = curl_exec($curl_handle);
curl_close($curl_handle);

if (empty($buffer))
{
    print "Connection Timeout<p>";
}
else
{
    print $buffer;
}
?>

 

If you take a look at the page http://myprimemobile.com/update.html, it keeps updating with new information. I basically want to retrieve the data from that page, but present it in my own way.

 

I want to loop through the page and search for all the names, addresses, and phone numbers only (not store hours or anything else), and want to put it in a variable so I can create a table with all that information on my page.

 

Based on the code, all the information is stored in the $buffer variable. I want to break this variable down to extract certain data in a 'div' or a 'table'. Please help me out. Thanks!

First, let me say I'm no expert at this, but it's a matter of parsing the data. cURL isn't your real question here. You'll be working with the DOM (Document Object Model).

 

I haven't tested it, but how about something like this:

 

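// Assumes the Simple HTML DOM library (simple_html_dom.php) is included; untested.
$html = str_get_html($buffer);   // turn the cURL string into a searchable object
$stores = array();
foreach ($html->find('body') as $article) {
    $item['storename'] = $article->find('div.storename', 0);
    $item['detail']    = $article->find('div.storedetail', 0);
    $stores[] = $item;
}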

 

Their markup at myprimemobile isn't great, because they're not putting data into appropriately named divs; they're using a bunch of tables inside divs. You'll have to hack the results: maybe use the snippet above to separate the data for each store, then use a similar snippet to break down the HTML of each store's table in the storedetail div. Please be sure to share the final code when you get it nailed; I could apply it to a thing or two.
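If you'd rather use PHP's built-in DOM classes, here is a minimal sketch with DOMDocument and DOMXPath. It has not been run against the real page. The sample markup, the Address/Phone row labels, and the extractStores() helper are all assumptions for illustration; the storename and storedetail class names are taken from the snippet above and may not match the live page.

```php
<?php
// Hypothetical markup standing in for $buffer; the real page will differ.
$buffer = '
<div class="storename">Store One</div>
<div class="storedetail"><table>
  <tr><td>Address</td><td>123 Example St</td></tr>
  <tr><td>Phone</td><td>555-010-0123</td></tr>
  <tr><td>Hours</td><td>9-5</td></tr>
</table></div>
<div class="storename">Store Two</div>
<div class="storedetail"><table>
  <tr><td>Address</td><td>456 Sample Ave</td></tr>
  <tr><td>Phone</td><td>555-010-0456</td></tr>
</table></div>';

// Hypothetical helper: returns one array per store with name, address and phone.
function extractStores($html)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath  = new DOMXPath($doc);
    $stores = array();
    foreach ($xpath->query('//div[@class="storename"]') as $nameNode) {
        $store = array('name' => trim($nameNode->textContent), 'address' => '', 'phone' => '');
        // Assumption: the detail div is the next storedetail sibling of the name div.
        $detail = $xpath->query('following-sibling::div[@class="storedetail"][1]', $nameNode)->item(0);
        if ($detail !== null) {
            foreach ($xpath->query('.//tr', $detail) as $row) {
                $cells = $xpath->query('./td', $row);
                if ($cells->length < 2) {
                    continue;
                }
                $label = strtolower(trim($cells->item(0)->textContent));
                // Keep only the wanted rows; store hours and anything else are skipped.
                if ($label === 'address' || $label === 'phone') {
                    $store[$label] = trim($cells->item(1)->textContent);
                }
            }
        }
        $stores[] = $store;
    }
    return $stores;
}

$stores = extractStores($buffer);

// Build a table for your own page from the extracted values.
$table = "<table>\n";
foreach ($stores as $s) {
    $table .= '<tr><td>' . htmlspecialchars($s['name']) . '</td><td>'
            . htmlspecialchars($s['address']) . '</td><td>'
            . htmlspecialchars($s['phone']) . "</td></tr>\n";
}
$table .= "</table>\n";
```

Once the XPath expressions are adjusted to whatever the page really uses, print $table wherever you want the store list to appear.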

What you're doing is scraping for data. You'll probably either want to cache the page you're parsing, or learn the DOM classes in PHP.
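A rough sketch of the caching idea, assuming a simple file-based cache: fetchCached(), its $cacheFile path, and the one-hour default lifetime are hypothetical choices, not anything the site requires. The cURL options are the same ones used in the original post.

```php
<?php
// Hypothetical helper: returns the page body, downloading it again only
// when the cached copy is missing or older than $maxAge seconds.
function fetchCached($url, $cacheFile, $maxAge = 3600)
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile);
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $body = curl_exec($ch);
    curl_close($ch);

    if ($body === false || $body === '') {
        // Download failed: fall back to a stale copy if one exists.
        return is_file($cacheFile) ? file_get_contents($cacheFile) : false;
    }

    file_put_contents($cacheFile, $body);
    return $body;
}
```

You would then call something like $buffer = fetchCached('http://myprimemobile.com/update.html', sys_get_temp_dir() . '/update.html'); and parse $buffer as before, so repeated page loads on your side don't hit their server every time.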

When I first got into cURL, I used it from the command line, called it through exec(), and scraped what I needed using PHP's split() and other string functions (split() has since been removed from PHP; explode() and preg_split() are the replacements). It takes patience and will. Remember that the site's tags may change in the future, and there may be an extra <div> or something in the page's source that you're not expecting. Since not everything is named or assigned an id or class, you may want to consider using a combination of plain-text matching and regular expressions to find all the data you're looking for.
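As an illustration of the text-plus-regex approach, here is a small sketch that pulls phone numbers out of the fetched HTML. The sample string, the findPhoneNumbers() helper, and the assumed North American number format are all hypothetical; the pattern would need adjusting to however the page actually writes its numbers.

```php
<?php
// Hypothetical sample standing in for $buffer.
$buffer = '<td>Phone</td><td>(555) 010-0123</td><td>Alt: 555-010-0456</td>';

// Hypothetical helper: returns every phone-like string found in the HTML.
function findPhoneNumbers($html)
{
    // Replace tags with spaces so text in adjacent cells does not run together.
    $text = preg_replace('/<[^>]*>/', ' ', $html);

    // Assumed format: 3-3-4 digits, optional parentheses around the area code,
    // and an optional space, dot or dash between the groups.
    preg_match_all('/\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b/', $text, $matches);

    return $matches[0];
}

$phones = findPhoneNumbers($buffer);
```

A pattern like this keeps working when an unexpected <div> appears, but it knows nothing about which store a number belongs to, so it works best alongside some structural parsing.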
