Parse bus plan

cordoprod · December 22, 2009

Hi,

I am developing an iPhone app, and I need PHP for fetching and parsing the data.

I have managed to parse some things, but I haven't managed to parse the times when the bus arrives.

This is the site I need to parse:

http://www.rutebok.no/NRIIISStaticTables/Tables/ruter/t/01-100.htm

There are two tabs on this page: #1 is Tur and #2 is Retur. These tabs show when the bus drives out and when it comes back.

The tabs are powered by JavaScript, and there is a <div> that manages those two. How can I parse the source so that I start at <div id="tab1"> and stop where that div ends?

That way I can separate the two and know which one is Tur and which is Retur.

ChemicalBliss · December 22, 2009

The easiest and safest method by far is to use an HTML DOM parser class (some are free, some commercial).

This page could help you a little:

http://www.onderstekop.nl/articles/114/

If the link gets removed, just google PHP HTML PARSER.

Good Luck

-CB-

cordoprod · December 22, 2009

I'm kind of used to preg_match_all. But I'm kind of stuck on this one.

salathe · December 22, 2009

What have you got so far?

cordoprod · December 22, 2009

Not anything on this, just parsed all the routes and stuff, not the times.

Adam · December 22, 2009

You could use a regular expression to grab the content of the two divs:

preg_match_all('/<div id="Tab[12]"[^>]*>(.*?)<\/div>/s', $str, $matches);

Edit: When I started writing that the last 2 posts weren't there and it made more sense to your OP.
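
A caveat on that pattern, as a rough sketch rather than a verified fix for this page: the lazy `(.*?)` stops at the first `</div>`, so any div nested inside the tab cuts the capture short, and the pattern is case-sensitive, so `id="tab1"` (as written in the first post) would not match `Tab[12]` without the `i` modifier. The sample strings below are made up for illustration; the real page's markup may differ.

```php
<?php
$pattern = '/<div id="Tab[12]"[^>]*>(.*?)<\/div>/s';

// Flat markup (hypothetical): the whole tab body is captured.
$flat = '<div id="Tab1">06:15 06:45</div>';
preg_match_all($pattern, $flat, $flatMatches);

// Nested markup (hypothetical): the lazy .*? ends at the FIRST </div>,
// so only the start of the tab body is captured.
$nested = '<div id="Tab1"><div class="row">06:15</div><div class="row">06:45</div></div>';
preg_match_all($pattern, $nested, $nestedMatches);

// Lowercase id: no match, because the pattern has no "i" modifier.
$lower = '<div id="tab1">06:15</div>';
$lowerCount = preg_match_all($pattern, $lower, $lowerMatches);

var_dump($flatMatches[1][0]);   // string(11) "06:15 06:45"
var_dump($nestedMatches[1][0]); // string(22) "<div class="row">06:15"
var_dump($lowerCount);          // int(0)
```

If the tabs on the real page contain nested divs, this is one reason a DOM parser tends to be more dependable than a regex here.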

salathe · December 22, 2009

Do you need to use regular expressions for this? It would be much easier using a proper parser.
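
To give an idea of what that could look like, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The sample markup, the ids "Tab1"/"Tab2" (taken from the regex posted above) and the helper name tabCells are assumptions for illustration; the real page may be structured differently.

```php
<?php
// Hypothetical stand-in for the fetched page. In practice this string
// would come from file_get_contents() on the timetable URL.
$html = '<html><body>'
      . '<div id="Tab1"><div class="inner"><table><tr><td>06:15</td><td>06:45</td></tr></table></div></div>'
      . '<div id="Tab2"><table><tr><td>07:10</td></tr></table></div>'
      . '</body></html>';

// Return the trimmed text of every <td> inside the div with the given id.
function tabCells(string $html, string $tabId): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $cells = [];
    foreach ($xpath->query('//div[@id="' . $tabId . '"]//td') as $td) {
        $cells[] = trim($td->textContent);
    }
    return $cells;
}

$tur   = tabCells($html, 'Tab1');
$retur = tabCells($html, 'Tab2');

print_r($tur);
print_r($retur);
```

Because the parser tracks where each div really ends, nested divs inside a tab do not cut the result short, and Tur and Retur stay cleanly separated.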

cordoprod · December 22, 2009

OK, I tested your suggestion, MrAdam; the output is Array.

<?PHP
$content = file_get_contents("http://www.rutebok.no/NRIIISStaticTables/Tables/ruter/t/01-100.htm");
preg_match_all('/<div id="Tab[12]"[^>]*>(.*?)<\/div>/s', $content, $matches);

echo $matches[1];
?>

And salathe: No, but I figured it might be easier for my needs. Further on I will need to parse all the times you see on the page.

cags · December 22, 2009

salathe wrote: "Do you need to use regular expressions for this? It would be much easier using a proper parser."

I already explained this to the OP when they asked about getting the bus numbers and names yesterday. But since nobody posted code for that method and people did post regular expressions, that is what they have used.

cordoprod wrote: "OK, I tested your suggestion, MrAdam; the output is Array.

<?PHP
$content = file_get_contents("http://www.rutebok.no/NRIIISStaticTables/Tables/ruter/t/01-100.htm");
preg_match_all('/<div id="Tab[12]"[^>]*>(.*?)<\/div>/s', $content, $matches);

echo $matches[1];
?>

And salathe: No, but I figured it might be easier for my needs. Further on I will need to parse all the times you see on the page."

I also explained in the thread yesterday why code such as that will output the word Array and how to get around it.
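
For anyone who missed that thread, here is a short sketch of the general idea (not necessarily the exact code posted there): preg_match_all() fills $matches with nested arrays, so $matches[1] is itself an array of captured strings, and echo on an array just prints the word Array. The sample $content string is made up for illustration.

```php
<?php
// Hypothetical sample input standing in for the downloaded page.
$content = '<div id="Tab1">Tur times</div><div id="Tab2">Retur times</div>';
preg_match_all('/<div id="Tab[12]"[^>]*>(.*?)<\/div>/s', $content, $matches);

// $matches[0] holds the full matches, $matches[1] holds capture group 1.
// Both are arrays with one entry per match, so loop over them (or use
// print_r($matches) to inspect the whole structure) instead of echoing.
$tabs = $matches[1];
foreach ($tabs as $i => $inner) {
    echo "Tab ", $i + 1, ": ", $inner, "\n";
}
// Prints:
// Tab 1: Tur times
// Tab 2: Retur times
```

If the loop prints nothing at all, $matches[1] is empty, which means the pattern did not match the page source in the first place.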

cordoprod · December 22, 2009

cags: I remember you did. I tried outputting it the way you explained to me, but still no luck. Do you think the regex is correct?
