
Parsing source


jiat


Hi all,

 

I'm new to PHP (and the forum) and I'm trying to do something that I feel should be relatively easy, but I can't figure it out.

I want to parse the source of multiple web pages to get a list of classes. Here is what I've tried so far, with the help of the internet:

 

$buffer = file_get_contents("http://catalog.utk.edu/content.php?catoid=5&navoid=386&cpage=1");
$regex = '/ something something /'; // placeholder pattern, not a working regex
preg_match($regex, $buffer, $match);
var_dump($match);
echo $match[1]; // needs a capture group in $regex to be set

 

I'm trying to extract the course number and name from part of the source that looks like this:

 

<td width="15">&#160;&#160;</td>
		<td width="100%">&#8226;&#160;				<a href="preview_course_nopop.php?catoid=1&coid=32666" onClick="showCourse('1', '32666',this, 'a:2:{s:8:~location~;s:8:~template~;s:28:~course_program_display_field~;N;}'); return false;" target="_blank">ACCT 200 - Foundations of Accounting</a>

		</td>
	</tr>

 

Now, here's the thing: besides the fact that I can't use regex properly, I want to be able to put this into a loop to handle multiple courses per page of source. I am pretty fluent in C++, but this is throwing me for a serious loop (pun intended) :(
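
For reference, here is a minimal sketch of how the extraction described above might look, assuming every course link on the page has the same shape as the sample row (an anchor pointing at preview_course_nopop.php whose text is "NUMBER - Name"). The function name extractCourses and the exact pattern are assumptions for illustration, not something from the catalog site. preg_match_all with PREG_SET_ORDER returns one entry per match, so it covers the "multiple courses per page" loop in a single call.

```php
<?php
// Hypothetical helper: returns one ['number' => ..., 'name' => ...] entry
// per course link found in $html.
function extractCourses(string $html): array
{
    // Assumed pattern: an <a> whose href starts with preview_course_nopop.php
    // and whose text looks like "ACCT 200 - Foundations of Accounting".
    // Group 1 is the course number, group 2 is the course name.
    $regex = '~<a\s[^>]*href="preview_course_nopop\.php[^"]*"[^>]*>'
           . '\s*([A-Z]+\s+\d+\w*)\s+-\s+([^<]+?)\s*</a>~';

    $courses = [];
    if (preg_match_all($regex, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $courses[] = [
                'number' => $m[1],
                // Decode entities such as &amp; that may appear in names.
                'name'   => html_entity_decode($m[2]),
            ];
        }
    }
    return $courses;
}

// Sample row copied from the post (nowdoc, so the quotes need no escaping).
$sample = <<<'HTML'
<td width="100%">&#8226;&#160; <a href="preview_course_nopop.php?catoid=1&coid=32666" onClick="showCourse('1', '32666',this, 'a:2:{s:8:~location~;s:8:~template~;s:28:~course_program_display_field~;N;}'); return false;" target="_blank">ACCT 200 - Foundations of Accounting</a>
</td>
HTML;

foreach (extractCourses($sample) as $course) {
    echo $course['number'], ' | ', $course['name'], PHP_EOL;
}
// prints: ACCT 200 | Foundations of Accounting
```

A regex like this is tied to the exact markup and will break if the page layout changes; an HTML parser such as DOMDocument is generally the more robust choice for anything long-lived.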

 

As a side note, I want to be able to do this for multiple pages as well. The only thing that changes in the page URL is the cpage=# part, so would it be possible to automate it for all 33 pages?
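
As a rough sketch of the multi-page idea, assuming the pages really are numbered cpage=1 through cpage=33 with nothing else changing in the URL, a plain for loop that builds each URL and fetches it could look like the following. The helper names catalogPageUrl and fetchCatalogPages are made up for illustration, and the fetcher is passed in as a callable so the loop can be tried without hitting the network.

```php
<?php
// Hypothetical helper: builds the URL for one catalog page.
// The base URL is the one from the post; only cpage changes.
function catalogPageUrl(int $page): string
{
    return 'http://catalog.utk.edu/content.php?catoid=5&navoid=386&cpage=' . $page;
}

// Hypothetical helper: fetches pages 1..$lastPage and returns their source
// keyed by page number. $fetch is any callable that takes a URL and returns
// the page source, or false on failure (for example 'file_get_contents').
function fetchCatalogPages(int $lastPage, callable $fetch): array
{
    $pages = [];
    for ($page = 1; $page <= $lastPage; $page++) {
        $source = $fetch(catalogPageUrl($page));
        if ($source === false) {
            continue; // skip pages that could not be downloaded
        }
        $pages[$page] = $source;
    }
    return $pages;
}

// Usage (performs 33 HTTP requests, so it is left commented out here):
// $pages = fetchCatalogPages(33, 'file_get_contents');
// foreach ($pages as $page => $buffer) {
//     // run the course regex on $buffer here
// }
```

Collecting the sources first keeps the download loop separate from the parsing step; for 33 pages the memory cost is small, but the regex could just as well be run inside the loop.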

 

Thanks for any help. 

 

jiat


