
Which is the best HTML parser: SimpleHtmlDom, PHPQuery, or Ganon?



What language are you more comfortable with or going to be using?


It depends on what you really want to parse: whether you're after specific data or cleaned-up HTML, and how much control you want over the process. A ready-made library may work better for you, or it may not.

That list is all third-party, premade classes or applications, and each one parses HTML the way its authors saw fit. I suppose you could extend those classes further if you're willing to study them for a while.


If you want to do it directly and control both what gets parsed and the output, use DOM or SimpleXML. For anything malformed or not inside tags, you can use preg_match / preg_match_all with some regex.
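
To make that concrete, here is a rough sketch of the DOM-plus-regex approach. The sample HTML, the `ref#` pattern, and the variable names are made up for illustration; adjust the XPath query and the regex to whatever you are actually trying to pull out.

```php
<?php
// Hypothetical sample input: note the unclosed <p> and the text that has no tag of its own.
$html = '<html><body><h1>Title</h1><p>First <a href="/a">link A</a>'
      . '<p>Second <a href="/b">link B</a> stray: ref#1234</body></html>';

// Collect libxml warnings about malformed markup instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);   // libxml repairs the tree as well as it can
libxml_clear_errors();

// Query the repaired tree with XPath: every <a> that has an href attribute.
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//a[@href]') as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

// Regex fallback for data that is not wrapped in its own tag.
preg_match_all('/ref#(\d+)/', $html, $m);
$refs = $m[1];

print_r($links);
print_r($refs);
```

The DOM part gives you structure and the regex part picks up whatever the structure misses, which is about as much control as you can get without a third-party class.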


As far as I know, there is no one complete solution that handles every document type and everything within the document, let alone handles malformed data well. Most of the time you have to write your own or learn to embrace errors. I know this because I had to build a universal website, page, document, and media parser using the methods above.



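Another suggestion is to use curl and follow any redirects, including JavaScript ones. curl follows HTTP redirects on its own, but it does not run JavaScript, so script-based redirects have to be detected in the page and followed yourself.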
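
Something along these lines is one way to do it. This is only a sketch: the function names, timeouts, and redirect limit are my own assumptions, and the two regex patterns only catch the most common meta-refresh and `location = "..."` forms, not every possible script redirect.

```php
<?php
// Fetch a URL with cURL, letting it follow HTTP 3xx redirects.
// (Function name and default limits are illustrative assumptions.)
function fetchFollowingRedirects(string $url, int $maxRedirects = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,           // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,           // follow Location: headers
        CURLOPT_MAXREDIRS      => $maxRedirects,  // avoid redirect loops
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    return [
        'body'      => $body === false ? null : $body,
        'error'     => $body === false ? curl_error($ch) : null,
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
    ];
}

// cURL does not execute JavaScript, so look for common client-side redirects by pattern.
function findClientSideRedirect(string $html): ?string
{
    $patterns = [
        // <meta http-equiv="refresh" content="0; url=...">
        '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]+content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)/i',
        // window.location = "..." or location.href = '...'
        '/(?:window\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']/i',
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $html, $m)) {
            return $m[1];
        }
    }
    return null;
}
```

In practice you would call `fetchFollowingRedirects()`, run `findClientSideRedirect()` on the returned body, and fetch again if it finds a target, with your own cap on how many hops you allow.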

If you use anything else, make sure the URL includes a protocol and create a stream context, or the connection can fail easily.
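
For example, with file_get_contents it could look roughly like this. The `ensureScheme()` and `fetchWithContext()` helpers, the user agent string, and the timeout values are assumptions for the sake of the example, not anything standard.

```php
<?php
// Prepend a scheme when the input has none, so PHP knows which stream wrapper to use.
// (Hypothetical helper; defaulting to http is an assumption.)
function ensureScheme(string $url, string $default = 'http'): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        return $url;
    }
    return $default . '://' . ltrim($url, '/');
}

// Options under the 'http' key also apply to https:// URLs.
$context = stream_context_create([
    'http' => [
        'method'          => 'GET',
        'timeout'         => 15,                  // seconds before giving up
        'follow_location' => 1,                   // follow Location: redirects
        'max_redirects'   => 5,
        'user_agent'      => 'ExampleParser/1.0', // illustrative value
        'ignore_errors'   => true,                // still return the body on 4xx/5xx
    ],
]);

// Returns the page body, or false if the connection fails.
function fetchWithContext(string $url, $context)
{
    return @file_get_contents(ensureScheme($url), false, $context);
}

echo ensureScheme('example.com/page.html'), "\n";
```

Without the scheme, file_get_contents treats the string as a local file path, and without a context you get the defaults, which is where a lot of the easy failures come from.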



This topic is now archived and is closed to further replies.
