kcannon Posted March 12, 2010

I am trying to collect data from an external website. I can currently fetch the whole page, but I don't know how to get certain sections of it. I'm trying to get data from a wiki; example page: http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29

The code I currently have is:

    <?php
    // Use the submitted URL if there is one; otherwise start with an empty field
    $url = isset($_POST['url']) ? $_POST['url'] : '';
    ?>
    Enter your URL:<br>
    <form method="post">
    <input type="text" name="url" value="<?php echo htmlspecialchars($url); ?>">
    <input type="submit" value="Go">
    </form>
    <hr>
    <?php
    // Only fetch once a URL has actually been submitted
    if ($url !== '') {
        $content = file_get_contents($url);
        echo $content;
    }
    ?>

What I'm trying to get out of the page is the name (without "(Region)"), the average sec, the number of constellations and so on. The only bit I don't see how to do is cut the page up so I can get certain bits. If anyone is able to help, I would be thankful.

Link to comment: https://forums.phpfreaks.com/topic/195044-collecting-data-from-a-external-website/

db323c Posted March 12, 2010

You can use strpos and substr to slice your way through the page.

kcannon Posted March 12, 2010 (Author)

Yeah, I got that. I was just wondering if anyone would be able to give me a bit of help with it, as I've had a try but it doesn't seem to work.

    <?php
    $url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
    $content = file_get_contents($url);
    // echo $content;
    $titlesearch = 'var wgTitle';
    $titlesearch1 = '(Region)';
    $titlepos = strpos ($content, $titlesearch);
    $titlepos1 = strpos ($content, $titlesearch1);
    echo $content [strlen($content) -$titlepos];
    ?>

This is what I've tried, but it just doesn't work.

db323c Posted March 12, 2010

Try:

    echo substr($content, $titlepos+11, $titlepos1-$titlepos+11);

db323c Posted March 12, 2010

I decided to actually test it. Here's working code:

    <?php
    $url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
    $content = file_get_contents($url);
    // echo $content;
    $titlesearch = 'var wgTitle';
    $titlesearch1 = '(Region)';
    $titlepos = strpos ($content, $titlesearch);
    // Skip past the marker and the characters between it and the title text
    $start=$titlepos+15;
    // Look for the end marker only after the start position
    $titlepos1 = strpos ($content, $titlesearch1, $start);
    $mid=$titlepos1 - $start;
    echo substr($content, $start, $mid);
    ?>

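The strpos/substr approach above can be wrapped in a small helper that also copes with a missing marker, since strpos returns false when nothing is found. This is only a sketch: extractBetween() is a hypothetical name, and the sample string is an assumption about what the page's inline JavaScript looked like, not a copy of the real page.

```php
<?php
/**
 * Return the text between two markers, or null if either marker is missing.
 * extractBetween() is a hypothetical helper, not a PHP built-in.
 */
function extractBetween($haystack, $startMarker, $endMarker)
{
    $start = strpos($haystack, $startMarker);
    if ($start === false) {
        return null;
    }
    // Move past the start marker itself
    $start += strlen($startMarker);

    // Search for the end marker only after the start position
    $end = strpos($haystack, $endMarker, $start);
    if ($end === false) {
        return null;
    }
    return substr($haystack, $start, $end - $start);
}

// Assumed sample: a fragment shaped like the page's inline JavaScript
$content = 'var wgPageName = "Category:Aridia_(Region)"; var wgTitle = "Aridia (Region)";';

$name = extractBetween($content, 'var wgTitle = "', '(Region)');
echo $name === null ? 'not found' : trim($name);
```

Passing the start position as the third argument to strpos matters here: the sample contains an earlier "(Region)" that would otherwise be matched first.
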
kcannon Posted March 12, 2010 (Author)

That worked great, thanks.

Psycho Posted March 12, 2010

There are better methods that will work for this and be less error-prone. I think using regular expressions and/or the DOM XML parser would make more sense. For example, to get the name you could do this:

    $url = "http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29";
    $content = file_get_contents($url);
    // Capture the text of the cell that follows the "Name" label cell
    preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
    $name = $nameMatch[1];
    echo $name;

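The DOM route mentioned above is not shown in the thread, so here is a rough sketch of what it could look like with PHP's DOMDocument and DOMXPath classes. The markup is an assumed sample modelled on the table cells the regex targets, and tableValue() is a hypothetical helper name.

```php
<?php
// Assumed sample markup, modelled on the table cells the regex above targets
$html = '<table class="itemdb-atribs">'
      . '<tr><td>Name</td><td> Aridia</td></tr>'
      . '<tr><td>Average security status</td><td> 0.244786045</td></tr>'
      . '<tr><td>Number of constellations</td><td> 11</td></tr>'
      . '</table>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// tableValue() is a hypothetical helper: find the cell whose text equals $label
// and return the trimmed text of the cell that follows it, or null if absent.
// Labels containing a double quote would need escaping before use.
function tableValue(DOMXPath $xpath, $label)
{
    $query = sprintf('//td[normalize-space(.) = "%s"]/following-sibling::td[1]', $label);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

echo tableValue($xpath, 'Name'), "\n";
echo tableValue($xpath, 'Number of constellations'), "\n";
```

Compared with the regex, this does not care about whitespace or attributes inside the tags, but it does need the DOM extension to be enabled.
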
Psycho Posted March 12, 2010

This will get the first three values from the table. I'd do the other two, but I have a meeting.

    preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
    $name = $nameMatch[1];
    preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)\s*/", $content, $avgMatch);
    $avg = $avgMatch[1];
    preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)\s*/", $content, $constMatch);
    $constellations = $constMatch[1];

kcannon Posted March 12, 2010 (Author)

Wow, thanks mjdamato. I had to make a little change, but I got it working with:

    <?php
    $url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
    $content = file_get_contents($url);
    preg_match( "/<td>Name<\/td><td>\s*([^<]*)\s*/", $content, $nameMatch);
    $name = $nameMatch[1];
    echo $name;
    preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)\s*/", $content, $avgMatch);
    $avg = $avgMatch[1];
    echo $avg;
    preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)\s*/", $content, $constMatch);
    $constellations = $constMatch[1];
    echo $constellations;
    ?>

Psycho Posted March 12, 2010

OK, here you go. This script will pull all the values from the table on that page and dump them into an associative array:

    $url = 'http://wiki.eveonline.com/en/wiki/Category:Aridia_%28Region%29';
    $content = file_get_contents($url);
    $data = array();

    //Process the data
    preg_match( "/<td>Name<\/td><td>\s*([^<]*)/s", $content, $nameMatch);
    preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)/s", $content, $avgMatch);
    preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)/s", $content, $constMatch);
    preg_match( "/<td><a[^>]*>Sovereignty<\/a><\/td><td>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $content, $sovMatch);
    preg_match( "/<td>Adjacent Regions<\/td><td>\s*<ul>(.*?)<\/ul>\s*<\/td>/s", $content, $regionsList);
    preg_match_all( "/<li>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $regionsList[1], $regionsMatch);

    //Add data to array
    $data['name'] = trim($nameMatch[1]);
    $data['average'] = trim($avgMatch[1]);
    $data['constellations'] = trim($constMatch[1]);
    // Group 1 is the link href, group 2 is the link text
    $data['sovereignty'] = trim($sovMatch[2]);
    $data['sovereignty_url'] = trim($sovMatch[1]);
    // Region name => region link
    $data['regions'] = array_combine($regionsMatch[2], $regionsMatch[1]);

    echo "<pre>\n";
    print_r($data);
    echo "\n</pre>";

Output:

    Array
    (
        [name] => Aridia
        [average] => 0.244786045
        [constellations] => 11
        [sovereignty] => Amarr Empire
        [sovereignty_url] => /en/wiki/Category:Amarr_Empire_%28Faction%29
        [regions] => Array
            (
                [Solitude] => /en/wiki/Category:Solitude_%28Region%29
                [Khanid] => /en/wiki/Category:Khanid_%28Region%29
                [Fountain] => /en/wiki/Category:Fountain_%28Region%29
                [Delve] => /en/wiki/Category:Delve_%28Region%29
                [Kor-Azor] => /en/wiki/Category:Kor-Azor_%28Region%29
                [Genesis] => /en/wiki/Category:Genesis_%28Region%29
            )

    )

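The script above reads $nameMatch[1] and the other match arrays without checking whether preg_match() found anything, so a layout change on the wiki would produce notices and empty values. One possible way to guard against that is to loop over a label-to-pattern map and test each return value. The sample $content below is an assumption standing in for the downloaded page, and the 'sovereignty' pattern is simplified so that it deliberately fails to match.

```php
<?php
// Hypothetical sample standing in for the downloaded page
$content = '<tr><td>Name</td><td> Aridia </td></tr>'
         . '<tr><td>Average security status</td><td> 0.244786045 </td></tr>'
         . '<tr><td>Number of constellations</td><td> 11 </td></tr>';

// Field name => pattern; the first three mirror the patterns used in the thread
$patterns = array(
    'name'           => '/<td>Name<\/td><td>\s*([^<]*)/s',
    'average'        => '/<td>Average.*?<\/td><td>\s*([^<]*)/s',
    'constellations' => '/<td>Number of constellations<\/td><td>\s*([^<]*)/s',
    // Not present in the sample, to show the fallback branch
    'sovereignty'    => '/<td>Sovereignty<\/td><td>\s*([^<]*)/s',
);

$data = array();
foreach ($patterns as $key => $pattern) {
    // preg_match() returns 1 on a match, 0 on no match and false on error
    if (preg_match($pattern, $content, $match) === 1) {
        $data[$key] = trim($match[1]);
    } else {
        $data[$key] = null; // field missing or page layout changed
    }
}

print_r($data);
```

Keeping the patterns in one array also makes it easier to add or adjust a field without touching the extraction loop.
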
Psycho Posted March 13, 2010

New and improved version. I've made it into a function, so you only have to call the function with the region name:

    <?php

    function getRegionData($region)
    {
        $domain = "http://wiki.eveonline.com/";
        $url = "{$domain}en/wiki/Category:{$region}_%28Region%29";
        $content = file_get_contents($url);

        //Process the data
        preg_match( "/<td>Name<\/td><td>\s*([^<]*)/s", $content, $nameMatch);
        preg_match( "/<td>Average.*?<\/td><td>\s*([^<]*)/s", $content, $avgMatch);
        preg_match( "/<td>Number of constellations<\/td><td>\s*([^<]*)/s", $content, $constMatch);
        preg_match( "/<td><a[^>]*>Sovereignty<\/a><\/td><td>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $content, $sovMatch);
        preg_match( "/<td>Adjacent Regions<\/td><td>\s*<ul>(.*?)<\/ul>\s*<\/td>/s", $content, $regionsList);
        preg_match_all( "/<li>\s*<a href=\"([^\"]*)[^>]*>([^<]*)/s", $regionsList[1], $regionsMatch);

        //Prefix each adjacent-region link with the domain
        $appendDomain = create_function('&$item, $key, $domain', '$item = $domain.$item;');
        array_walk( $regionsMatch[1], $appendDomain, $domain );

        //Add data to array
        $regionData = array();
        $regionData['name'] = trim($nameMatch[1]);
        $regionData['average'] = trim($avgMatch[1]);
        $regionData['constellations'] = trim($constMatch[1]);
        $regionData['sovereignty'] = trim($sovMatch[2]);
        $regionData['sovereignty_url'] = $domain . trim($sovMatch[1]);
        $regionData['regions'] = array_combine($regionsMatch[2], $regionsMatch[1]);

        return $regionData;
    }

    //Call function with region name
    $regionData = getRegionData('Everyshore');

    echo "<pre>\n";
    print_r($regionData);
    echo "\n</pre>";

    ?>

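A note for anyone running this on a current PHP version: create_function() was deprecated in PHP 7.2 and removed in PHP 8.0. The domain-prefixing step can instead be written with an anonymous function (available since PHP 5.3). The sketch below uses sample link values in place of the preg_match_all() result, and trims the trailing slash from the domain so the joined URL does not end up with a double slash.

```php
<?php
$domain = 'http://wiki.eveonline.com/';
// The captured hrefs already start with "/", so drop the trailing slash first
$base = rtrim($domain, '/');

// Sample values standing in for $regionsMatch[1]
$links = array(
    '/en/wiki/Category:Solitude_%28Region%29',
    '/en/wiki/Category:Khanid_%28Region%29',
);

// Prefix every link with the domain using an anonymous function
$absolute = array_map(function ($link) use ($base) {
    return $base . $link;
}, $links);

print_r($absolute);
```

array_map() returns a new array rather than modifying the input in place, so the result would be assigned back before the array_combine() call.
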
teamatomic Posted March 13, 2010

I got bored watching TV. It's Friday night and it's raining.

    <?php
    function stellarium($region)
    {
        $domain = "http://wiki.eveonline.com/";
        $url = "{$domain}en/wiki/Category:{$region}_%28Region%29";
        $content=file_get_contents("$url");

        // Grab the attributes table and reduce it to non-empty text lines
        $start = '<table .* class="itemdb-atribs">';
        $end = '<\/td><\/tr><\/table>';
        preg_match( "/$start(.*)$end/s", $content, $tables );
        $table = $tables[1];
        $btable=strip_tags($table);
        $atable=explode("\n",$btable);
        $atable=array_filter($atable);

        // Grab the adjacent-regions list and make its links absolute
        $start = '<ul><li>';
        $end = '<\/li><\/ul>';
        preg_match( "/$start(.*)$end/s", $table, $links );
        $rl=$links[0];
        $rll=str_replace("<a href=\"","<a href=\"http://wiki.eveonline.com",$rl);
        $region_links_list=$rll;
        $rll=strip_tags($region_links_list,'<a>');
        $region_links_array=explode("\n",$rll);
        $region_links_array=array_filter($region_links_array);

        // Turn each "label value" line into "label - value"
        $i=0;
        $data_string=array();
        foreach($atable as $line)
        {
            $line=trim($line);
            $line=str_replace("Name","Name - ",$line, $acount);
            if($acount){array_shift($atable);array_push($data_string,$line);}
            $line=str_replace("status","status - ",$line, $bcount);
            if($bcount){array_shift($atable);array_push($data_string,$line);}
            $line=str_replace("constellations","constellations - ",$line, $ccount);
            if($ccount){array_shift($atable);array_push($data_string,$line);}
            $line=str_replace("Sovereignty","Sovereignty - ",$line, $dcount);
            if($dcount){array_shift($atable);array_push($data_string,$line);}
            $i++;
        }

        // What is left in $atable are the adjacent region names
        $regions=array();
        array_shift($atable);
        $i=0;
        foreach($atable as $line)
        {
            $regions[$i]=$line;
            $i++;
        }

        // NOTE: splitting on every "-" truncates hyphenated names and negative
        // values (see Kor-Azor, Fountain and Delve in the output below)
        $data_assoc=array();
        foreach($data_string as $string)
        {
            list($key,$value)=explode("-",$string);
            $key=trim($key);
            $value=trim($value);
            $data_assoc[$key]=$value;
        }

        $return_array=array();
        $return_array[$region]['data_asoc']= $data_assoc;
        $return_array[$region]['data_string']= $data_string;
        $return_array[$region]['regions']= $regions;
        $return_array[$region]['region_links_list']= $region_links_list;
        $return_array[$region]['region_links_array']= $region_links_array;

        // Fetch each region adjacent to the starting region as well
        global $get_this;
        foreach($return_array[$get_this]['regions'] as $reg)
        {
            $reg=trim($reg);
            $return_array[$reg]=stellarium($reg);
        }
        return $return_array;
    }

    $get_this='Aridia';
    $return_array = @stellarium($get_this);
    echo "<pre>";
    print_r($return_array);
    ?>

Output:

    Array (
      [Aridia] => Array (
        [data_asoc] => Array ( [Name] => Aridia [Average security status] => 0.244786045 [Number of constellations] => 11 [Sovereignty] => Amarr Empire )
        [data_string] => Array ( [0] => Name - Aridia [1] => Average security status - 0.244786045 [2] => Number of constellations - 11 [3] => Sovereignty - Amarr Empire )
        [regions] => Array ( [0] => Solitude [1] => Khanid [2] => Fountain [3] => Delve [4] => Kor-Azor [5] => Genesis )
        [region_links_list] => * Solitude * Khanid * Fountain * Delve * Kor-Azor * Genesis
        [region_links_array] => Array ( [0] => Solitude [1] => Khanid [2] => Fountain [3] => Delve [4] => Kor-Azor [5] => Genesis )
      )
      [Solitude] => Array (
        [Solitude] => Array (
          [data_asoc] => Array ( [Name] => Solitude [Average security status] => 0.43695860232558 [Number of constellations] => 6 [Sovereignty] => Gallente Federation )
          [data_string] => Array ( [0] => Name - Solitude [1] => Average security status - 0.43695860232558 [2] => Number of constellations - 6 [3] => Sovereignty - Gallente Federation )
          [regions] => Array ( [0] => Syndicate [1] => Aridia )
          [region_links_list] => * Syndicate * Aridia
          [region_links_array] => Array ( [0] => Syndicate [1] => Aridia )
        )
      )
      [Khanid] => Array (
        [Khanid] => Array (
          [data_asoc] => Array ( [Name] => Khanid [Average security status] => 0.47070184880952 [Number of constellations] => 12 [Sovereignty] => Khanid Kingdom )
          [data_string] => Array ( [0] => Name - Khanid [1] => Average security status - 0.47070184880952 [2] => Number of constellations - 12 [3] => Sovereignty - Khanid Kingdom )
          [regions] => Array ( [0] => Catch [1] => Tash-Murkon [2] => Querious [3] => Aridia [4] => Kor-Azor )
          [region_links_list] => * Catch * Tash-Murkon * Querious * Aridia * Kor-Azor
          [region_links_array] => Array ( [0] => Catch [1] => Tash-Murkon [2] => Querious [3] => Aridia [4] => Kor-Azor )
        )
      )
      [Fountain] => Array (
        [Fountain] => Array (
          [data_asoc] => Array ( [Name] => Fountain [Average security status] => [Number of constellations] => 17 [Sovereignty] => None )
          [data_string] => Array ( [0] => Name - Fountain [1] => Average security status - -0.29222406643478 [2] => Number of constellations - 17 [3] => Sovereignty - None )
          [regions] => Array ( [0] => Cloud Ring [1] => Aridia [2] => Outer Ring [3] => Delve )
          [region_links_list] => * Cloud Ring * Aridia * Outer Ring * Delve
          [region_links_array] => Array ( [0] => Cloud Ring [1] => Aridia [2] => Outer Ring [3] => Delve )
        )
      )
      [Delve] => Array (
        [Delve] => Array (
          [data_asoc] => Array ( [Name] => Delve [Average security status] => [Number of constellations] => 15 [Sovereignty] => None )
          [data_string] => Array ( [0] => Name - Delve [1] => Average security status - -0.39469514752577 [2] => Number of constellations - 15 [3] => Sovereignty - None )
          [regions] => Array ( [0] => Querious [1] => Aridia [2] => Fountain [3] => Period Basis )
          [region_links_list] => * Querious * Aridia * Fountain * Period Basis
          [region_links_array] => Array ( [0] => Querious [1] => Aridia [2] => Fountain [3] => Period Basis )
        )
      )
      [Kor-Azor] => Array (
        [Kor-Azor] => Array (
          [data_asoc] => Array ( [Name] => Kor [Average security status] => 0.51623962295082 [Number of constellations] => 9 [Sovereignty] => Amarr Empire )
          [data_string] => Array ( [0] => Name - Kor-Azor [1] => Average security status - 0.51623962295082 [2] => Number of constellations - 9 [3] => Sovereignty - Amarr Empire )
          [regions] => Array ( [0] => Domain [1] => Khanid [2] => Kador [3] => Aridia [4] => Genesis )
          [region_links_list] => * Domain * Khanid * Kador * Aridia * Genesis
          [region_links_array] => Array ( [0] => Domain [1] => Khanid [2] => Kador [3] => Aridia [4] => Genesis )
        )
      )
      [Genesis] => Array (
        [Genesis] => Array (
          [data_asoc] => Array ( [Name] => Genesis [Average security status] => 0.4499963592233 [Number of constellations] => 15 [Sovereignty] => Amarr Empire )
          [data_string] => Array ( [0] => Name - Genesis [1] => Average security status - 0.4499963592233 [2] => Number of constellations - 15 [3] => Sovereignty - Amarr Empire )
          [regions] => Array ( [0] => Sinq Laison [1] => Everyshore [2] => Kador [3] => Aridia [4] => Essence [5] => Kor-Azor [6] => Verge Vendor )
          [region_links_list] => * Sinq Laison * Everyshore * Kador * Aridia * Essence * Kor-Azor * Verge Vendor
          [region_links_array] => Array ( [0] => Sinq Laison [1] => Everyshore [2] => Kador [3] => Aridia [4] => Essence [5] => Kor-Azor [6] => Verge Vendor )
        )
      )
    )

Teamatomic

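In the output above, data_asoc shows the name "Kor" for Kor-Azor and an empty average for Fountain and Delve, while data_string still holds the full values. That pattern is consistent with the key/value split breaking on every hyphen. One possible fix, sketched here with sample lines copied from the data_string output, is to split on the " - " separator and cap explode() at two parts.

```php
<?php
// Lines in the "label - value" shape shown in data_string above
$data_string = array(
    'Name - Kor-Azor',
    'Average security status - -0.29222406643478',
);

$data_assoc = array();
foreach ($data_string as $string) {
    // Split on the first " - " only, so hyphens inside the value survive
    list($key, $value) = explode(' - ', $string, 2);
    $data_assoc[trim($key)] = trim($value);
}

print_r($data_assoc);
```

The third argument to explode() limits the result to two elements, so everything after the first separator stays in the value.
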