CodeMama Posted April 10, 2009

I'm trying to strip out all tags except <br> from some data so I can put it in a database. How can I write this?

<?php
$TESTING = TRUE;
$target_url = "http://www.awebsite.com";
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 100);
$html = curl_exec($ch);
if (!$html) {
    echo "<br />cURL error number:" .curl_errno($ch);
    echo "<br />cURL error:" . curl_error($ch);
    exit;
}

// parse the html into a DOMDocument
$dom = new DOMDocument();
@$dom->loadHTML($html);
echo $html;
$graphs = split("<p", $html);

// Start at 6 to clear out junk at top. Use $i+1 since last paragraph
// is footnote that is not needed.
for ($i = 6; $i+1 < count($graphs); $i++) {
    if($TESTING) echo "$i: $graphs[$i]<br />";
    //split the paragraphs into lines
    $graphs->getAttribute('graphs');
    $clean = $graphs(\<)(?!br(\s|\/|\>))(.*?\>);
    $lines = split("<br", $graphs);
    //for ($i = 1; $i+1 < count($lines); $i++) {
    // Grab restaurant name
    if($TESTING) echo "$i: $lines[$i]<br />";
}
// Grab address
// Grab city
// Grab date and visit type
// Grab rest of text and store it. Grab numbers of violations?
jackpf Posted April 11, 2009

$string = strip_tags($string, '<br>');
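To make the one-liner above concrete, here is a minimal sketch of how it might be applied to a single paragraph of the scraped page. The helper name `keep_only_br`, the sample markup, and the follow-up `preg_split` step are assumptions for illustration and are not from the thread; only `strip_tags($string, '<br>')` comes from the reply.

```php
<?php
// Hypothetical helper (name is not from the thread): remove every tag
// except <br>, using the strip_tags() call suggested in the reply.
function keep_only_br(string $html): string
{
    // The second argument lists the tags to keep. Listing '<br>' also
    // keeps the self-closing forms <br/> and <br />.
    return strip_tags($html, '<br>');
}

// Made-up sample standing in for one paragraph of the fetched page.
$sample = '<p class="entry"><b>Joe\'s Diner</b><br />123 Main St<br>Springfield</p>';

$clean = keep_only_br($sample);

// Assumed next step: break the cleaned paragraph into lines on any <br>
// variant. preg_split() is used here because split() was deprecated in
// PHP 5.3 and removed in PHP 7.
$lines = preg_split('/<br\s*\/?>/i', $clean);

echo $clean, "\n";
print_r($lines);
```

With the sample above, `$clean` should contain only the text and the two `<br>` tags, and `$lines` should hold the name, street, and city as separate elements, which is roughly the shape the "Grab restaurant name / address / city" comments in the original script are after.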