
Parse with file_get_contents - review of a 10-liner


dilbertone


Hello dear community, good evening!

 

I want to scrape this dataset of more than 2,700 records on foundations in Switzerland.

You can see it here: http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de

 

<?php
// Original PHP code by Chirp Internet: www.chirp.com.au
// Please acknowledge use of this code by including this header.

$url = "http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de";

// Fetch the overview page; stop if it cannot be read.
$input = @file_get_contents($url) or die("Could not access file: $url");

// Match every <a ... href="...">...</a> element.
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";

if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[2] = link address (href), i.e. the detail page I want to collect
        // $match[3] = link text that I need to collect - see a detail page
    }
}
?>
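For comparison, here is a minimal sketch of the same link extraction done with PHP's bundled DOM extension instead of a regular expression. The function name `extract_links()` and the sample markup are made up for illustration and are not taken from the real page. Note that `loadHTML()` assumes ISO-8859-1 unless the markup declares another encoding, so the real page may need extra care.

```php
<?php
// Hypothetical helper (not part of the original script): returns every
// <a> element that has an href attribute as ['href' => ..., 'text' => ...].
function extract_links(string $html): array
{
    $links = [];
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect HTML does not flood the console.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        if (!$anchor->hasAttribute('href')) {
            continue; // skip named anchors without a target
        }
        $links[] = [
            'href' => $anchor->getAttribute('href'),
            'text' => trim($anchor->textContent),
        ];
    }
    return $links;
}

// Sample markup invented for this example.
$sample = '<p><a href="entry.asp?Id=3221">"baiji.org" Foundation</a> '
        . '<a name="top">no href here</a></p>';

$links = extract_links($sample);
foreach ($links as $link) {
    echo $link['href'], ' => ', $link['text'], PHP_EOL;
}
```

Each entry's `href` plays the role of `$match[2]` and `text` the role of `$match[3]` in the regex version.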

 

To be frank, I am not sure this is right: my console reports some bad errors.
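One thing that may make the errors easier to read: the `@` in front of `file_get_contents()` hides the warning that would say why the fetch failed, and `... or die(...)` also fires when the page body is empty. Below is a rough sketch of a fetch wrapper that keeps the reason. The name `fetch_page()` is hypothetical, and the path in the usage part is deliberately one that should not exist.

```php
<?php
// Hypothetical helper: returns the body of a URL or local file,
// or throws with the underlying PHP warning text attached.
function fetch_page(string $source): string
{
    $body = @file_get_contents($source);
    if ($body === false) {
        // error_get_last() still holds the warning that @ suppressed.
        $error = error_get_last();
        throw new RuntimeException(
            "Could not access $source: " . ($error['message'] ?? 'unknown error')
        );
    }
    return $body;
}

// Usage with a path that is assumed not to exist, to show the failure branch.
try {
    $html = fetch_page(sys_get_temp_dir() . '/no-such-page-for-this-example.html');
    echo strlen($html), " bytes fetched", PHP_EOL;
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

Fetching a real `http://` URL this way additionally requires `allow_url_fopen` to be enabled in the PHP configuration.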

 

Can you help me with this issue, please?

 

I'd love to hear from you.

 

db1

 

 

 

By the way, here is a detail page: http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de&webgrab_path=http://esv2000.edi.admin.ch/d/entry.asp?Id=3221

 

 

 

It contains the following information:

Name: "baiji.org" Foundation

Schlüsselwort: BAIJI

Adresse: Seefeldstr. 94

8008 Zürich

Mail: august@baiji.com

Zweck:

 

 

By the way, here is a translation of the field labels:

 

Name: - name

Schlüsselwort: - keyword

Adresse: - address

Mail: - mail

Zweck: - purpose
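To show how the fields above could end up in one record, here is a small sketch. It assumes the detail page has already been reduced to plain "Label: value" lines like the ones quoted above; the real page markup may look different. The function name `parse_detail_text()` and the English keys are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: turns "Label: value" lines into an array with English keys.
function parse_detail_text(string $text): array
{
    // German label => English key, following the translation in the post.
    $labels = [
        'Name'          => 'name',
        'Schlüsselwort' => 'keyword',
        'Adresse'       => 'address',
        'Mail'          => 'mail',
        'Zweck'         => 'purpose',
    ];
    $record  = array_fill_keys(array_values($labels), '');
    $current = null;

    foreach (preg_split('/\R/u', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (preg_match('/^([^:]+):\s*(.*)$/u', $line, $m) && isset($labels[$m[1]])) {
            // A known label starts a new field.
            $current = $labels[$m[1]];
            $record[$current] = $m[2];
        } elseif ($current !== null) {
            // Any other line continues the previous field (e.g. postcode and city).
            $record[$current] = ltrim($record[$current] . ', ' . $line, ', ');
        }
    }
    return $record;
}

// Sample text copied from the detail page quoted in the post.
$sample = <<<'TXT'
Name: "baiji.org" Foundation
Schlüsselwort: BAIJI
Adresse: Seefeldstr. 94
8008 Zürich
Mail: august@baiji.com
Zweck:
TXT;

$record = parse_detail_text($sample);
print_r($record);
```

One such array per detail page could then be written to a CSV file or a database table.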

 

 

