
Strip table from HTML scrape


johnsmith153


I have scraped some data from a website and have the HTML. I now need to remove a table from that HTML, so I'm guessing I need to search the HTML for the opening <table> and closing </table> tags and cut out everything between them.

 

If there is more than one table, then I want to remove all of them.

 

Can someone point me in the right direction?

 

Also, removing images would be great (anything with <img).

 

 


You can use a regular expression:

 

$html = preg_replace('/<table[^>]*>.*?<\/table>/s', '', $html);

 

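That will match any opening table tag, with or without attributes, up to the first closing table tag it finds, and replace the whole lot with nothing. The s modifier lets the dot match newlines, so tables that span several lines are caught too. Also, preg_replace() will replace all occurrences unless you tell it not to, using the 4th parameter ($limit). Two things to watch for: the pattern is case-sensitive, so add the i modifier if the markup might contain <TABLE>, and with nested tables the match stops at the inner table's closing tag, so the rest of the outer table is left behind.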
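
For the images part of the question, the same idea can be applied to <img tags. The snippet below is only a rough sketch: the function name strip_tables_and_images() and the sample markup are made up for illustration, and it assumes no attribute value contains a literal > character.

```php
<?php
// Hypothetical helper (not from the original reply): strips every
// <table>...</table> block and every <img> tag from an HTML string.
function strip_tables_and_images(string $html): string
{
    // "i" = case-insensitive, "s" = let "." match newlines.
    // ".*?" is non-greedy, so each match ends at the first </table>.
    $html = preg_replace('/<table\b[^>]*>.*?<\/table>/is', '', $html);

    // <img> has no closing tag, so matching the single tag is enough.
    $html = preg_replace('/<img\b[^>]*>/i', '', $html);

    return $html;
}

// Made-up sample input for demonstration.
$sample = '<p>Intro</p><TABLE class="x"><tr><td>1</td></tr></TABLE>'
        . '<img src="a.png"><p>End</p>';

echo strip_tables_and_images($sample); // <p>Intro</p><p>End</p>
```

If the scraped pages contain nested tables or messy markup, a parser such as DOMDocument is likely to be more reliable than a regular expression.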

Archived

This topic is now archived and is closed to further replies.
