jarvis Posted October 10, 2016

Hi,

I'm going mad and in desperation am reaching out for some help. I have a site that outputs a snippet of text (30 words) from a full description (string). The description includes HTML for formatting, which I need to keep in the snippet. My issue is that the snippet never returns 30 words! Below is my code:

    function limit_words($string, $word_limit) {
        #$words = explode(" ",$string);
        $words = preg_split('/\s+/', $string);
        return implode(" ",array_splice($words,0,$word_limit));
    }

    function limit_words2($string, $word_limit) {
        $words = explode(" ",$string);
        #$words = preg_split('/\s+/', $string);
        return implode(" ",array_splice($words,0,$word_limit));
    }

    $content = "
    <p><p><strong>LOCATION</strong><br />Centrally located in Kent just of the High Street.</p>
    <p><strong>ACCOMMODATION</strong><br />231 sq ft.</p>
    <p><strong>AMENITIES</strong><br />Entry Phone System<br /> Central Heating<br /> Car Parking</p>
    <p><strong>TERMS</strong><br />A new lease for a term to be agreed.</p>
    <p><strong>OUTGOINGS</strong><br />To be assessed.</p>
    <p><strong>VAT</strong><br />All prices and rents are quoted exclusive of VAT. Any intending purchaser or lessee must satisfy themselves as to the incidence of VAT in respect of any transaction.</p>
    <p><strong>LEGAL COSTS</strong><br />Each party to be responsible for their own legal costs.</p>
    <p><strong>SERVICE CHARGE</strong><br />Tenant to be responsible for a proportion of costs towards insurance, maintenance and repairs.</p>
    </p>
    ";

    $content2 = "
    <p><p><strong>LOCATION</strong><br />A shop/office premises to let in Kent, close to NatWest, Holland & Barrett and Fat Face.</p>
    <p><strong>DESCRIPTION</strong><br />A shop/office premises to let in Kent, close to NatWest, Holland & Barrett and Fat Face.</p>
    <p><strong>ACCOMMODATION</strong><br />Approximately 159 sq ft.
    </p>
    <p><strong>AMENITIES</strong><br />Attractive display window<br /> Laminate floor<br /> Display lighting<br /> Alarm</p>
    <p><strong>TERMS</strong><br />Easy in easy out terms.</p>
    <p><strong>OUTGOINGS</strong><br />We understand that the current rateable value is £2550.<br /> Current UBR – 48.2p in £<br /> Small business relief may be available.</p>
    <p><strong>VAT</strong><br />All prices and rents are quoted exclusive of VAT. Any intending purchaser or lessee must satisfy themselves as to the incidence of VAT in respect of any transaction. The rent is also subject to VAT.</p>
    <p><strong>LEGAL COSTS</strong><br />Each party responsible for their own legal costs.</p>
    <p><strong>SERVICE CHARGE</strong><br />Insurance currently £223.32 per annum plus VAT.</p>
    </p>
    ";

    echo limit_words($content,30);
    echo '<hr>';
    echo limit_words($content2,30);
    echo '<hr>';
    echo limit_words2($content,30);
    echo '<hr>';
    echo limit_words2($content2,30);

What on earth am I doing wrong? Any help is much appreciated!
cyberRobot Posted October 10, 2016

If you output the array of words ($words), you'll see that some of the slots are taken up by white space.

    function limit_words($string, $word_limit) {
        #$words = explode(" ",$string);
        $words = preg_split('/\s+/', $string);
        echo '<pre>' . print_r($words, true) . '</pre>';
        return implode(" ",array_splice($words,0,$word_limit));
    }
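To make the point above concrete, here is a minimal sketch (the `$sample` string is made up for illustration and is not from the original post). It shows how leading and trailing whitespace produce empty array slots, and how the `PREG_SPLIT_NO_EMPTY` flag of `preg_split()` drops them.

```php
<?php
// Hypothetical sample: a string that starts and ends with a newline,
// like the $content strings in the question.
$sample = "\n<p>One two</p>\n";

// Plain split: the leading and trailing newlines yield empty strings.
$raw = preg_split('/\s+/', $sample);
echo count($raw), "\n"; // 4 entries: '', '<p>One', 'two</p>', ''

// PREG_SPLIT_NO_EMPTY discards the empty pieces.
$clean = preg_split('/\s+/', $sample, -1, PREG_SPLIT_NO_EMPTY);
echo count($clean), "\n"; // 2 entries: '<p>One', 'two</p>'
```

Note that this only removes the empty slots. Tokens such as `<p>One` still mix markup with text, so the count is still not a count of visible words.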
jarvis Posted October 10, 2016

Thank you cyberRobot! Is there a way around this without compromising the HTML output?
cyberRobot Posted October 10, 2016

You could loop through the words and remove entries that are empty. Are you looking to preserve the HTML code? If not, you could look into using strip_tags() to remove it. Otherwise, you'll need to figure out how to repair any broken tags caused by limiting the text to X words.
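If the HTML did not need to be preserved, the strip_tags() route mentioned above could look roughly like the sketch below. The function name `plain_snippet` is hypothetical, and the trick of inserting a space before every `<` is my own assumption to stop words on either side of a tag from fusing together; it would also split a word that has an inline tag in the middle of it.

```php
<?php
// Hypothetical helper: returns the first $wordLimit visible words as plain text.
function plain_snippet(string $html, int $wordLimit): string
{
    // Put a space before every tag so "LOCATION</strong><br />Centrally"
    // does not become "LOCATIONCentrally" once the tags are removed.
    $spaced = str_replace('<', ' <', $html);
    $text   = strip_tags($spaced);

    // Split on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    return implode(' ', array_slice($words, 0, $wordLimit));
}

echo plain_snippet('<p><strong>LOCATION</strong><br />Centrally located in Kent.</p>', 3);
// LOCATION Centrally located
```

array_slice() is used here instead of array_splice() because the word array does not need to be modified, only read.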
jarvis Posted October 10, 2016

Thanks again for the reply. Yes, I'd like to keep the HTML formatting.
jarvis Posted October 10, 2016

So something like this:

    function limit_words($string, $word_limit) {
        #$words = explode(" ",$string);
        $words = preg_split('/\s+/', $string);
        #echo '<pre>' . print_r($words, true) . '</pre>';
        #echo '<pre>' .print_r(array_filter($words)) . '</pre>';
        $filter = array_filter($words);
        #return implode(" ",array_splice($words,0,$word_limit));
        return implode(" ",array_splice($filter,0,$word_limit));
    }

Although that doesn't seem to work either?
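One possible reason the filtered version still miscounts (this is my reading, not something confirmed in the thread) is that the remaining tokens are not all visible words: a `<br />` tag splits into pieces such as `System<br` and `/>`, and the `/>` piece uses up a slot. The sketch below compares the token count with the visible word count for a fragment of the sample content. The variable names are made up for illustration.

```php
<?php
// Fragment taken from the AMENITIES paragraph of the sample content.
$html = "Entry Phone System<br /> Central Heating<br /> Car Parking";

// Tokens as the whitespace split sees them (empty entries already removed).
$tokens = preg_split('/\s+/', $html, -1, PREG_SPLIT_NO_EMPTY);

// Visible words: strip the tags first (with a space before each tag so
// neighbouring words do not fuse), then split.
$visible = preg_split('/\s+/', strip_tags(str_replace('<', ' <', $html)), -1, PREG_SPLIT_NO_EMPTY);

echo count($tokens), " tokens, ", count($visible), " visible words\n";
// 9 tokens, 7 visible words
```

So a limit of 30 tokens generally yields fewer than 30 words on screen whenever the markup contains whitespace.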
cyberRobot Posted October 11, 2016

"...I'd like to keep the HTML formatting"

Hmm...that increases the difficulty. You could try something like this:

    function limit_words($string, $word_limit) {
        //PREPARE STRING
        //replace <br> tags with spaces so that "LOCATION</strong><br />Centrally" is not considered one word after HTML tags are removed
        $words = preg_replace('|<br[ /]*>|', ' ', $string);
        $words = strip_tags($words);            //remove HTML tags
        $words = preg_split('/\s+/', $words);   //split string into words

        //LOCATE 30TH WORD
        $currOffset = 0;    //position of the most recently found word in the original string
        $searchFrom = 0;    //position where the search for the next word starts
        $wordsFound = 0;
        $lastWord   = '';
        foreach($words as $currWord) {
            //IF NOT BLANK, FIND CURRENT WORD IN ORIGINAL STRING
            if($currWord != '') {
                //IF WORD IS FOUND
                $newOffset = strpos($string, $currWord, $searchFrom);
                if($newOffset !== false) {
                    //echo "<div>$currWord || $currOffset || $newOffset</div>";

                    //UPDATE OFFSETS AND WORD COUNTER
                    $currOffset = $newOffset;
                    //start the next search after the current word, so a repeated word is not matched at the same position twice
                    $searchFrom = $newOffset + strlen($currWord);
                    $wordsFound++;

                    //IF WORD LIMIT WAS REACHED, STORE 30TH WORD AND BREAK OUT OF THE LOOP
                    if($wordsFound == $word_limit) {
                        $lastWord = $currWord;
                        break;
                    }
                }
            }
        }

        //RETURN RESULT
        return substr($string, 0, $currOffset) . $lastWord;
    }

    echo limit_words($content,30);
cyberRobot Posted October 11, 2016

Of course, I should mention that the above code doesn't deal with closing tags that get cut off.
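As a rough illustration of what dealing with cut-off closing tags could involve, here is a minimal sketch that scans a truncated snippet and appends whatever closing tags are missing. The function name `close_open_tags` and the default list of void tags are assumptions for this example. It also assumes the snippet never ends in the middle of a tag and that the markup is otherwise properly nested.

```php
<?php
// Hypothetical helper: appends closing tags for elements left open in $html.
function close_open_tags(string $html, array $voidTags = ['br', 'hr', 'img']): string
{
    $open = [];

    // Find every complete tag. Group 1 is "/" for a closing tag, group 2 is the tag name.
    preg_match_all('#<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>#', $html, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $name = strtolower($m[2]);
        if (in_array($name, $voidTags, true)) {
            continue; // void tags such as <br /> never need closing
        }
        if ($m[1] === '/') {
            // Closing tag: pop the stack only if it matches the innermost open tag.
            if (end($open) === $name) {
                array_pop($open);
            }
        } else {
            $open[] = $name; // opening tag: remember it
        }
    }

    // Close the remaining tags, innermost first.
    while ($name = array_pop($open)) {
        $html .= "</{$name}>";
    }
    return $html;
}

echo close_open_tags('<p><strong>LOCATION</strong><br />Centrally located in <em>Kent');
// <p><strong>LOCATION</strong><br />Centrally located in <em>Kent</em></p>
```

A regex scan like this is only suitable for simple, trusted markup. For arbitrary HTML a real parser would be a safer choice.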
jarvis Posted October 12, 2016

Apologies, thank you cyberRobot, that's much appreciated.
Barand Posted October 12, 2016 (Solution)

This closes off any tags that are still open:

    $text = "<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam elementum ornare scelerisque.<br>
    <a href='xyz.com' target='_blank'>Vestibulum</a> iaculis mattis dui.</p>
    <p>Aliquam <i>scelerisque</i> sapien at tellus accumsan varius. <img src='a.jpg'> Fusce facilisis ullamcorper dapibus. Aliquam dignissim</p>
    <ul>
    <li>gravida</li>
    <li>dui eget</li>
    <li>aliquam</li>
    </ul>
    <p>Duis odio, semper eu sodales vel, sollicitudin eu enim. Cras tortor libero, pellentesque accumsan tempus in, ullamcorper nec augue. Mauris eu ipsum mauris, non imperdiet ipsum. In hac habitasse platea dictumst. Morbi ipsum mauris, tincidunt vitae pretium tempor, pretium a turpis. Nulla quis eros eu lorem aliquam congue non a nisl.</p>";

    $voidtags = ['br','hr','img'];
    $keeptags = '<a><b><i><br><p><ul><ol><li><u><strong><em>';
    $limit = 30;

    $summary = limitText($text, $limit, $voidtags, $keeptags);
    echo $summary;

    function limitText($text, $limit, $voidtags, $keeptags)
    {
        $result = '';
        $p = 0;
        $tags = [];                 // stack of currently open tags
        $currtag = '';              // name of the tag being read
        $words = 0;
        $intag = $inword = 0;
        $text = strip_tags($text, $keeptags);
        $len = strlen($text);
        while ($p < $len) {
            $c = $text[$p];
            switch ($c) {
                case '<':
                    if ($inword) {
                        $inword = 0;
                        $words++;
                        if ($words > $limit) break 2;
                    }
                    $intag = 1;
                    break;
                case '>':
                    if ($intag && $currtag != '') {
                        if (!in_array($currtag, $voidtags)) $tags[] = $currtag;
                        $currtag = '';
                    }
                    $intag = 0;
                    break;
                case '/':
                    if ($intag) {
                        // closing tag: pop the stack and copy the rest of the tag
                        array_pop($tags);
                        do {
                            $result .= $c;
                        } while (($c = $text[++$p]) != '>');
                        $intag = 0;
                    }
                    break;
                case "\n":
                case "\t":
                case ' ':
                    if ($inword) {
                        $inword = 0;
                        $words++;
                        if ($words >= $limit) break 2;
                    }
                    elseif ($intag) {
                        // tag with attributes: record it (unless void), reset the name, copy the rest of the tag
                        if (!in_array($currtag, $voidtags)) $tags[] = $currtag;
                        $currtag = '';
                        do {
                            $result .= $c;
                        } while (($c = $text[++$p]) != '>');
                        $intag = 0;
                    }
                    break;
                default:
                    if ($intag) {
                        $currtag .= $c;
                    }
                    else $inword = 1;
                    break;
            }
            $result .= $c;
            ++$p;
        }
        while ($t = array_pop($tags)) {
            $result .= "</{$t}>";   // close any open tags
        }
        return $result;
    }

results

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam elementum ornare scelerisque.<br>
    <a href="xyz.com" target="_blank">Vestibulum</a> iaculis mattis dui.</p>
    <p>Aliquam <i>scelerisque</i> sapien at tellus accumsan varius. Fusce facilisis ullamcorper dapibus. Aliquam dignissim</p>
    <ul>
    <li>gravida</li>
    <li>dui</li></ul>