
Collecting data from an external website


kcannon


I am trying to collect data from an external website. I can currently fetch the whole page, but I don't know how to extract certain sections of it.

I'm trying to get data from a wiki. Example page: http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29

The code I currently have is:

 

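<?php
// Default to an empty string so the form renders before anything is posted.
$url = isset($_POST['url']) ? trim($_POST['url']) : '';
?>
Enter your URL:<br>
<form method="post">
<input type="text" name="url" value="<?php echo htmlspecialchars($url); ?>">
<input type="submit" value="Go">
</form>
<hr>
<?php
// Only fetch once a URL has actually been submitted.
if ($url !== '') {
    $content = file_get_contents($url);
    echo $content;
}
?>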

 

 

What I'm trying to get out of the page is the name (without "(Region)"), the average security status, the number of constellations and so on. The only part I don't see how to do is cutting the page up so I can get specific pieces. If anyone is able to help, I would be thankful.
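Since the form passes whatever is typed straight to file_get_contents(), it may be worth checking the input first. Here is a minimal sketch, assuming only http and https URLs should be fetched; the function name isFetchableUrl is made up for illustration.

```php
<?php
// Hypothetical helper: accept only absolute http/https URLs.
function isFetchableUrl($url)
{
    if (!is_string($url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Reject schemes such as file:// or ftp:// even when they are well-formed.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return $scheme === 'http' || $scheme === 'https';
}
```

In the form script you would call it before fetching, for example `if (isFetchableUrl($url)) { echo file_get_contents($url); }`.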


Yeah, I got that. I was just wondering if anyone could give me a bit of help with it, as I've had a try but it doesn't seem to work.

<?php
$url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
    $content = file_get_contents($url);
//    echo $content;

$titlesearch = 'var wgTitle';
$titlesearch1 = '(Region)';
$titlepos = strpos ($content, $titlesearch);
$titlepos1 = strpos ($content, $titlesearch1);

echo $content [strlen($content) -$titlepos];
?> 

 

This is what I've tried, but it just doesn't work.


I decided to actually test it :P

 

Here's working code:

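<?php
$url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
$content = file_get_contents($url);
// echo $content;

$titlesearch = 'var wgTitle';
$titlesearch1 = '(Region)';
$titlepos = strpos($content, $titlesearch);
// Skip 15 characters to get past 'var wgTitle = "' (assumes exactly that spacing in the page).
$start = $titlepos + 15;
// Look for '(Region)' only after the start of the title.
$titlepos1 = strpos($content, $titlesearch1, $start);

// Length of the text between the two markers.
$mid = $titlepos1 - $start;

echo substr($content, $start, $mid);
?>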
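The same strpos()/substr() idea can be wrapped in a small helper so the offset is computed from the marker instead of being hard-coded. This is only a sketch: the name textBetween and the sample line are assumptions, not something taken from the wiki page.

```php
<?php
// Hypothetical helper: return the text between two markers, or null if either is missing.
function textBetween($haystack, $startMarker, $endMarker)
{
    $start = strpos($haystack, $startMarker);
    if ($start === false) {
        return null;
    }
    // Move past the start marker itself.
    $start += strlen($startMarker);
    $end = strpos($haystack, $endMarker, $start);
    if ($end === false) {
        return null;
    }
    return trim(substr($haystack, $start, $end - $start));
}

// Sample line, assumed to resemble the script variable in the page source.
$sample = 'var wgTitle = "Aridia (Region)";';
echo textBetween($sample, 'var wgTitle = "', '(Region)'); // Aridia
```

Returning null when a marker is missing avoids silently printing the wrong slice if the page layout changes.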

 

Link to comment
Share on other sites

There are better methods that will work with this and be less error prone. I think using regular expressions and/or the DOM XML parser would make more sense.

 

For example, to get the Name you could do this:


$url = "http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29";
$content = file_get_contents($url);

preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
$name = $nameMatch[1];

echo $name;
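For comparison, here is roughly what the DOM route could look like with PHP's DOMDocument and DOMXPath. The markup is a made-up sample that I'm assuming resembles the table on the wiki page, so the XPath expression would likely need adjusting for the real thing.

```php
<?php
// Sample markup, assumed to resemble the attribute table on the wiki page.
$html = '<table class="itemdb-atribs">'
      . '<tr><td>Name</td><td> Aridia </td></tr>'
      . '<tr><td>Number of constellations</td><td> 11 </td></tr>'
      . '</table>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Treat every two-cell row as a label/value pair.
$xpath = new DOMXPath($doc);
$data = array();
foreach ($xpath->query('//tr[count(td) = 2]') as $row) {
    $cells = $row->getElementsByTagName('td');
    $label = trim($cells->item(0)->textContent);
    $data[$label] = trim($cells->item(1)->textContent);
}

print_r($data);
```

The upside over a regex is that whitespace, attribute order and nested tags inside the cells stop mattering.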


This will get the first three values from the table. I'd do the other two, but I have a meeting.

 

preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
$name = $nameMatch[1];
preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)\s*/", $content, $avgMatch);
$avg = $avgMatch[1];
preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)\s*/", $content, $constMatch);
$constellations = $constMatch[1];


Wow, thanks mjdamato. I had to make a little change, but I got it working with:

<?php

$url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
$content = file_get_contents($url);

preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
$name = $nameMatch[1];
echo $name;
preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)\s*/", $content, $avgMatch);
$avg = $avgMatch[1];
echo $avg;
preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)\s*/", $content, $constMatch);
$constellations = $constMatch[1];
echo $constellations;
?> 
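One thing to watch with this pattern: if a preg_match() call finds nothing, reading index 1 of the match array triggers a notice. A small wrapper can return a default instead. The name firstMatch and the sample markup below are assumptions for illustration.

```php
<?php
// Hypothetical helper: return the first capture group, or $default when nothing matches.
function firstMatch($pattern, $subject, $default = null)
{
    return preg_match($pattern, $subject, $m) === 1 ? trim($m[1]) : $default;
}

// Sample markup (an assumption); it contains a Name row but no Average row.
$content = '<tr><td>Name</td><td> Aridia </td></tr>';

$name = firstMatch('/<td>Name<\/td><td>\s*([^<]*)/', $content);
$avg  = firstMatch('/<td>Average.*?<\/td><td>\s*([^<]*)/', $content, 'n/a');

echo "$name / $avg"; // Aridia / n/a
```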


OK, here you go. This script will pull all the values from the table on that page and dump them into an associative array:


$url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
$content = file_get_contents($url);
$data = array();

//Process the data
preg_match( "/<td>Name<\/td><td>\s*([^<]*)/s", $content, $nameMatch);
preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)/s", $content, $avgMatch);
preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)/s", $content, $constMatch);
preg_match( "/<td><a[^>]*>Sovereignty<\/a><\/td><td>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $content, $sovMatch);
preg_match( "/<td>Adjacent Regions<\/td><td>\s*<ul>(.*?)<\/ul>\s*<\/td>/s", $content, $regionsList);
preg_match_all( "/<li>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $regionsList[1], $regionsMatch);

//Add data to array
$data['name']            = trim($nameMatch[1]);
$data['average']         = trim($avgMatch[1]);
$data['constellations']  = trim($constMatch[1]);
$data['sovereignty']     = trim($sovMatch[2]);
$data['sovereignty_url'] = trim($sovMatch[1]);
$data['regions'] = array_combine($regionsMatch[2], $regionsMatch[1]);

echo "<pre>\n";
print_r($data);
echo "\n</pre>";

 

Output:

Array
(
    [name] => Aridia
    [average] => 0.244786045
    [constellations] => 11
    [sovereignty] => Amarr Empire
    [sovereignty_url] => /en/wiki/Category:Amarr_Empire_%28Faction%29
    [regions] => Array
        (
            [solitude] => /en/wiki/Category:Solitude_%28Region%29
            [Khanid] => /en/wiki/Category:Khanid_%28Region%29
            [Fountain] => /en/wiki/Category:Fountain_%28Region%29
            [Delve] => /en/wiki/Category:Delve_%28Region%29
            [Kor-Azor] => /en/wiki/Category:Kor-Azor_%28Region%29
            [Genesis] => /en/wiki/Category:Genesis_%28Region%29
        )
)
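The plain text fields all follow the same "label cell, value cell" shape, so they could also be driven from a list of labels instead of one hand-written preg_match() per field. A rough sketch, run against sample markup that I'm assuming resembles the wiki table:

```php
<?php
// Sample markup, assumed to resemble the wiki table.
$content = '<tr><td>Name</td><td> Aridia </td></tr>'
         . '<tr><td>Average security status</td><td> 0.244786045 </td></tr>'
         . '<tr><td>Number of constellations</td><td> 11 </td></tr>';

// Array key => label text as it appears in the first cell.
$fields = array(
    'name'           => 'Name',
    'average'        => 'Average security status',
    'constellations' => 'Number of constellations',
);

$data = array();
foreach ($fields as $key => $label) {
    // preg_quote() escapes any regex metacharacters in the label.
    $pattern = '/<td>' . preg_quote($label, '/') . '<\/td><td>\s*([^<]*)/s';
    $data[$key] = preg_match($pattern, $content, $m) ? trim($m[1]) : null;
}

print_r($data);
```

Fields whose cells contain links (sovereignty, adjacent regions) would still need their own patterns.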


New and improved version. I've made it into a function, so you only have to call it with the region name:

 

<?php
function getRegionData($region)
{
$domain = "http://wiki.eveonline.com/";
$url = "{$domain}en/wiki/Category:{$region}_%28Region%29";
$content = file_get_contents($url);

//Process the data
preg_match( "/<td>Name<\/td><td>\s*([^<]*)/s", $content, $nameMatch);
preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)/s", $content, $avgMatch);
preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)/s", $content, $constMatch);
preg_match( "/<td><a[^>]*>Sovereignty<\/a><\/td><td>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $content, $sovMatch);
preg_match( "/<td>Adjacent Regions<\/td><td>\s*<ul>(.*?)<\/ul>\s*<\/td>/s", $content, $regionsList);
preg_match_all( "/<li>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $regionsList[1], $regionsMatch);
// Prefix each relative link with the domain (create_function() was removed in PHP 8, so use a closure).
$regionsMatch[1] = array_map(function ($path) use ($domain) { return $domain . $path; }, $regionsMatch[1]);

//Add data to array
$regionData = array();
$regionData['name']            = trim($nameMatch[1]);
$regionData['average']         = trim($avgMatch[1]);
$regionData['constellations']  = trim($constMatch[1]);
$regionData['sovereignty']     = trim($sovMatch[2]);
$regionData['sovereignty_url'] = $domain . trim($sovMatch[1]);
$regionData['regions'] = array_combine($regionsMatch[2], $regionsMatch[1]);
return $regionData;
}

//Call function with region name
$regionData = getRegionData('Everyshore');

echo "<pre>\n";
print_r($regionData);
echo "\n</pre>";

?>
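file_get_contents() returns false when the page cannot be fetched, and the function above would then run its regexes against false. Here is a hedged sketch of a fetch wrapper with a timeout; the name fetchPage and the 10 second default are my own choices, not part of the script above.

```php
<?php
// Hypothetical helper: fetch a URL (or local path) and return null on failure.
function fetchPage($url, $timeoutSeconds = 10)
{
    // The 'timeout' option applies to the http:// wrapper.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeoutSeconds),
    ));
    // @ suppresses the warning; the false return value is checked instead.
    $content = @file_get_contents($url, false, $context);
    return $content === false ? null : $content;
}
```

Inside getRegionData() you could then bail out early, for example `$content = fetchPage($url); if ($content === null) { return null; }`.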


I got bored watching TV. It's Friday night and it's raining. :(

 

<?php
function stellarium($region)
{
$domain = "http://wiki.eveonline.com/";
$url = "{$domain}en/wiki/Category:{$region}_%28Region%29";

$content=file_get_contents("$url");

$start = '<table .* class="itemdb-atribs">';
$end = '<\/td><\/tr><\/table>';
preg_match( "/$start(.*)$end/s", $content, $tables );
$table = $tables[1];

$btable=strip_tags($table);
$atable=explode("\n",$btable);
$atable=array_filter($atable);

$start = '<ul><li>';
$end = '<\/li><\/ul>';
preg_match( "/$start(.*)$end/s", $table, $links );
$rl=$links[0];
$rll=str_replace("<a href=\"","<a href=\"http://wiki.eveonline.com",$rl);
$region_links_list=$rll;

$rll=strip_tags($region_links_list,'<a>');
$region_links_array=explode("\n",$rll);
$region_links_array=array_filter($region_links_array);

$i=0;
$data_string=array();

   foreach($atable as $line)
   {
   $line=trim($line);
   $line=str_replace("Name","Name - ",$line, $acount);
   if($acount){array_shift($atable);array_push($data_string,$line);}
   $line=str_replace("status","status - ",$line, $bcount);
   if($bcount){array_shift($atable);array_push($data_string,$line);}
   $line=str_replace("constellations","constellations - ",$line, $ccount);
   if($ccount){array_shift($atable);array_push($data_string,$line);}
   $line=str_replace("Sovereignty","Sovereignty - ",$line, $dcount);
   if($dcount){array_shift($atable);array_push($data_string,$line);}
   $i++;
   }

$regions=array();
array_shift($atable);

  $i=0;
   foreach($atable as $line)
   {
   $regions[$i]=$line;
   $i++;
   }

$data_assoc=array();
    foreach($data_string as $string)
    {
    list($key,$value)=explode("-",$string);
    $key=trim($key);
    $value=trim($value);
    $data_assoc[$key]=$value;
    }
$return_array=array();
$return_array[$region]['data_asoc']= $data_assoc;
$return_array[$region]['data_string']= $data_string;
$return_array[$region]['regions']= $regions;
$return_array[$region]['region_links_list']= $region_links_list;
$return_array[$region]['region_links_array']= $region_links_array;

global $get_this;
foreach($return_array[$get_this]['regions'] as $reg)
{     $reg=trim($reg);
     $return_array[$reg]=stellarium($reg);
}

return $return_array;

      }

         $get_this='Aridia';
$return_array = @stellarium($get_this);


echo "<pre>";
print_r($return_array);

      ?>

 

Array

(

    [Aridia] => Array

        (

            [data_asoc] => Array

                (

                    [Name] => Aridia

                    [Average security status] => 0.244786045

                    [Number of constellations] => 11

                    [sovereignty] => Amarr Empire

                )

 

            [data_string] => Array

                (

                    [0] => Name - Aridia

                    [1] => Average security status - 0.244786045

                    [2] => Number of constellations - 11

                    [3] => Sovereignty - Amarr Empire

                )

 

            [regions] => Array

                (

                    [0] =>  Solitude

                    [1] =>  Khanid

                    [2] =>  Fountain

                    [3] =>  Delve

                    [4] =>  Kor-Azor

                    [5] =>  Genesis

                )

 

            [region_links_list] =>

 

    * Solitude

    * Khanid

    * Fountain

    * Delve

    * Kor-Azor

    * Genesis

 

 

            [region_links_array] => Array

                (

                    [0] =>  Solitude

                    [1] =>  Khanid

                    [2] =>  Fountain

                    [3] =>  Delve

                    [4] =>  Kor-Azor

                    [5] =>  Genesis

                )

 

        )

 

    [solitude] => Array

        (

            [solitude] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Solitude

                            [Average security status] => 0.43695860232558

                            [Number of constellations] => 6

                            [sovereignty] => Gallente Federation

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Solitude

                            [1] => Average security status - 0.43695860232558

                            [2] => Number of constellations - 6

                            [3] => Sovereignty - Gallente Federation

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Syndicate

                            [1] =>  Aridia

                        )

 

                    [region_links_list] =>

 

    * Syndicate

    * Aridia

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Syndicate

                            [1] =>  Aridia

                        )

 

                )

 

        )

 

    [Khanid] => Array

        (

            [Khanid] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Khanid

                            [Average security status] => 0.47070184880952

                            [Number of constellations] => 12

                            [sovereignty] => Khanid Kingdom

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Khanid

                            [1] => Average security status - 0.47070184880952

                            [2] => Number of constellations - 12

                            [3] => Sovereignty - Khanid Kingdom

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Catch

                            [1] =>  Tash-Murkon

                            [2] =>  Querious

                            [3] =>  Aridia

                            [4] =>  Kor-Azor

                        )

 

                    [region_links_list] =>

 

    * Catch

    * Tash-Murkon

    * Querious

    * Aridia

    * Kor-Azor

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Catch

                            [1] =>  Tash-Murkon

                            [2] =>  Querious

                            [3] =>  Aridia

                            [4] =>  Kor-Azor

                        )

 

                )

 

        )

 

    [Fountain] => Array

        (

            [Fountain] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Fountain

                            [Average security status] =>

                            [Number of constellations] => 17

                            [sovereignty] => None

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Fountain

                            [1] => Average security status - -0.29222406643478

                            [2] => Number of constellations - 17

                            [3] => Sovereignty - None

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Cloud Ring

                            [1] =>  Aridia

                            [2] =>  Outer Ring

                            [3] =>  Delve

                        )

 

                    [region_links_list] =>

 

    * Cloud Ring

    * Aridia

    * Outer Ring

    * Delve

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Cloud Ring

                            [1] =>  Aridia

                            [2] =>  Outer Ring

                            [3] =>  Delve

                        )

 

                )

 

        )

 

    [Delve] => Array

        (

            [Delve] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Delve

                            [Average security status] =>

                            [Number of constellations] => 15

                            [sovereignty] => None

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Delve

                            [1] => Average security status - -0.39469514752577

                            [2] => Number of constellations - 15

                            [3] => Sovereignty - None

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Querious

                            [1] =>  Aridia

                            [2] =>  Fountain

                            [3] =>  Period Basis

                        )

 

                    [region_links_list] =>

 

    * Querious

    * Aridia

    * Fountain

    * Period Basis

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Querious

                            [1] =>  Aridia

                            [2] =>  Fountain

                            [3] =>  Period Basis

                        )

 

                )

 

        )

 

    [Kor-Azor] => Array

        (

            [Kor-Azor] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Kor

                            [Average security status] => 0.51623962295082

                            [Number of constellations] => 9

                            [sovereignty] => Amarr Empire

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Kor-Azor

                            [1] => Average security status - 0.51623962295082

                            [2] => Number of constellations - 9

                            [3] => Sovereignty - Amarr Empire

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Domain

                            [1] =>  Khanid

                            [2] =>  Kador

                            [3] =>  Aridia

                            [4] =>  Genesis

                        )

 

                    [region_links_list] =>

 

    * Domain

    * Khanid

    * Kador

    * Aridia

    * Genesis

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Domain

                            [1] =>  Khanid

                            [2] =>  Kador

                            [3] =>  Aridia

                            [4] =>  Genesis

                        )

 

                )

 

        )

 

    [Genesis] => Array

        (

            [Genesis] => Array

                (

                    [data_asoc] => Array

                        (

                            [Name] => Genesis

                            [Average security status] => 0.4499963592233

                            [Number of constellations] => 15

                            [sovereignty] => Amarr Empire

                        )

 

                    [data_string] => Array

                        (

                            [0] => Name - Genesis

                            [1] => Average security status - 0.4499963592233

                            [2] => Number of constellations - 15

                            [3] => Sovereignty - Amarr Empire

                        )

 

                    [regions] => Array

                        (

                            [0] =>  Sinq Laison

                            [1] =>  Everyshore

                            [2] =>  Kador

                            [3] =>  Aridia

                            [4] =>  Essence

                            [5] =>  Kor-Azor

                            [6] =>  Verge Vendor

                        )

 

                    [region_links_list] =>

 

    * Sinq Laison

    * Everyshore

    * Kador

    * Aridia

    * Essence

    * Kor-Azor

    * Verge Vendor

 

 

                    [region_links_array] => Array

                        (

                            [0] =>  Sinq Laison

                            [1] =>  Everyshore

                            [2] =>  Kador

                            [3] =>  Aridia

                            [4] =>  Essence

                            [5] =>  Kor-Azor

                            [6] =>  Verge Vendor

                        )

 

                )

 

        )

 

)
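Note that splitting the "label - value" strings with explode("-", ...) also splits at hyphens inside the value, which is why Kor-Azor ends up as "Kor" and the negative security averages come out empty in data_asoc. A possible fix, sketched with a made-up helper name, is to split on " - " (with the spaces) and limit the result to two parts:

```php
<?php
// Hypothetical helper: split "label - value" on the first " - " only.
function splitLabelValue($string)
{
    $parts = explode(' - ', $string, 2);
    $key = trim($parts[0]);
    $value = isset($parts[1]) ? trim($parts[1]) : '';
    return array($key, $value);
}

print_r(splitLabelValue('Name - Kor-Azor'));
print_r(splitLabelValue('Average security status - -0.29222406643478'));
```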

 

;)

Teamatomic
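If the goal is to walk from one region to its neighbours and beyond, a visited list plus a depth limit avoids both the global variable and the risk of bouncing between regions that link to each other. This is only a sketch: adjacentRegions() is a hypothetical stand-in with a trimmed, hard-coded map, where a real version would fetch and parse each wiki page.

```php
<?php
// Hypothetical stand-in for the scraper: region name => adjacent region names.
// The map is a trimmed sample, not the full adjacency data.
function adjacentRegions($region)
{
    $map = array(
        'Aridia'    => array('Solitude', 'Khanid'),
        'Solitude'  => array('Syndicate', 'Aridia'),
        'Khanid'    => array('Aridia'),
        'Syndicate' => array('Solitude'),
    );
    return isset($map[$region]) ? $map[$region] : array();
}

// Breadth-first walk that visits each region once, up to $maxDepth hops away.
function crawlRegions($start, $maxDepth)
{
    $visited = array($start => 0); // region => distance from $start
    $queue = array($start);
    while ($queue) {
        $region = array_shift($queue);
        if ($visited[$region] >= $maxDepth) {
            continue; // do not expand regions at the depth limit
        }
        foreach (adjacentRegions($region) as $next) {
            if (!isset($visited[$next])) {
                $visited[$next] = $visited[$region] + 1;
                $queue[] = $next;
            }
        }
    }
    return $visited;
}

print_r(crawlRegions('Aridia', 1));
```

With a depth of 1 this returns the start region and its direct neighbours only; raising the limit pulls in regions further out without revisiting any of them.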
