
explode a "." but ignore "Dr."


kevin_newbie


Hello,

 

I am working on a project where I want to split a description after its second period, but if a sentence contains "Dr." I don't want that period to count as a split point. This is what I have now:

 


$content = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc  Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu ante mattis eros hendrerit imperdiet. Fusce at ante mauris, vel dapibus quam. Duis vel vestibulum neque. Aenean viverra condimentum ante, ut vulputate diam volutpat non. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Fusce nibh diam, sollicitudin at vulputate a, commodo nec ligula.";

$explode = explode('.', $content);

echo $explode[0] . '.' . $explode[1] . '.';

 

The code above displays in the browser as:

 

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc  Dr.

 

I would like it to skip the "Dr." if possible. Is it?

 

Thanks for your help :)

 

 

 

Link to comment
Share on other sites

Using preg_split this should be possible.

 

$string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc  Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu ante mattis eros hendrerit imperdiet. Fusce at ante mauris, vel dapibus quam. Duis vel vestibulum neque. Aenean viverra condimentum ante, ut vulputate diam volutpat non. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Fusce nibh diam, sollicitudin at vulputate a, commodo nec ligula.";

// Split on every period that is not directly preceded by "Dr" (negative lookbehind).
$array = preg_split('~(?<!Dr)\.~', $string);

print_r($array);

 

Maybe not done the best it could be (or properly), but should work.
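
To get back to the original goal (show only the first two sentences), here is a minimal sketch. It assumes sentences are separated by whitespace after a period and that "Dr." is the only abbreviation to protect. The helper name firstSentences() is made up for this example and is not from the thread.

```php
<?php
// Hypothetical helper: return the first $count sentences of $text,
// treating the period in "Dr." as part of the sentence.
function firstSentences(string $text, int $count): string
{
    // Split on whitespace that follows a period, unless that period ends "Dr."
    // Both lookbehinds are zero-width, so the periods stay in the pieces.
    $parts = preg_split('/(?<=\.)(?<!Dr\.)\s+/', $text);

    return implode(' ', array_slice($parts, 0, $count));
}

$content = "Lorem ipsum dolor sit amet. Nunc Dr. Tom id lectus ante. Suspendisse potenti.";
echo firstSentences($content, 2);
// prints: Lorem ipsum dolor sit amet. Nunc Dr. Tom id lectus ante.
```

Because the split consumes the whitespace between sentences, the pieces are rejoined with a single space, so any unusual spacing between sentences is normalized.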

Link to comment
Share on other sites

Does this work?

 

$string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc  Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu ante mattis eros hendrerit imperdiet. Fusce at ante mauris, vel dapibus quam. Duis vel vestibulum neque. Aenean viverra condimentum ante, ut vulputate diam volutpat non. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Fusce nibh diam, sollicitudin at vulputate a, commodo nec ligula.";

$array = preg_split("~\b\.\b~", $string);

print_r($array);

 

 

I'm not too good with regular expressions, but look up word boundaries.
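
For anyone looking up word boundaries: \b matches between a word character and a non-word character, so a period wrapped in \b on both sides only matches when letters or digits touch it on each side. A small sketch with a made-up test string shows the effect:

```php
<?php
// \b\.\b needs a word character immediately before AND after the period,
// so "example.com" is split but "today. Then" and the final "rest." are not.
$parts = preg_split('~\b\.\b~', 'Visit example.com today. Then rest.');

print_r($parts);
// $parts is array('Visit example', 'com today. Then rest.')
```

In normal prose a sentence-ending period is followed by a space or the end of the string, so this pattern would not split ordinary sentences.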

Link to comment
Share on other sites

I'm not too sure whether it'd work, but what about an explode? It depends on your system, but from what you gave it's limited.

 

$take = 'Dr.';
$string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc  Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu ante mattis eros hendrerit imperdiet. Fusce at ante mauris, vel dapibus quam. Duis vel vestibulum neque. Aenean viverra condimentum ante, ut vulputate diam volutpat non. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Fusce nibh diam, sollicitudin at vulputate a, commodo nec ligula.";
$string = explode($take, $string);
echo $string[0] . $string[1];

Link to comment
Share on other sites

Out of curiosity, is "Dr." the only thing you need to worry about? What about other things like:

Mr.

Mrs.

The main navigation; which contains the About Us, Contact Us, etc. links; is on the left

This is an interesting idea, but...

John M. Smith

Have you ever played .Hack?

 

That's a good question...

But what are you trying to achieve?

We might come up with alternative solutions...

Link to comment
Share on other sites

Hello,

 

I have an update: I found a function that grabs the first sentence, or however many leading sentences you want, from a string. It treats anything that ends in ".", "!" or "?" as a sentence. Is there a way to make this function skip over an abbreviation like "something." so that it is not counted as the end of a sentence?

 


    function getLeadingSentences($data, $max)
    {
        // A sentence: optional leading whitespace, text without . ? or !,
        // one or more terminators, then any trailing whitespace.
        $re = '/^\s*[^.?!]+[.?!]+\s*/';
        $out = "";
        for ($i = 0; $i < $max; $i++) {
            if (preg_match($re, $data, $match)) {
                // If a sentence is found, take it out of $data and add it to $out
                $out .= $match[0];
                $data = preg_replace($re, "", $data);
            }
            else {
                break; // no more sentences to take
            }
        }
        return $out;
    }
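
One possible way to make a function like this skip abbreviations is to let the sentence body match either a listed abbreviation (period included) or any single non-terminator character. The sketch below is only an illustration: the name leadingSentencesSkipping() and the $abbreviations parameter are assumptions made for this example, and it expects a non-empty list.

```php
<?php
// Hypothetical variant of getLeadingSentences() that takes a non-empty
// list of abbreviations whose periods must not end a sentence.
function leadingSentencesSkipping(string $data, int $max, array $abbreviations): string
{
    // Escape each abbreviation for use in a regex, e.g. "Dr." becomes "Dr\."
    $alternatives = implode('|', array_map(function ($abbr) {
        return preg_quote($abbr, '/');
    }, $abbreviations));

    // Body = repeated (whole abbreviation | one character that is not . ? !)
    $re = '/^\s*(?:\b(?:' . $alternatives . ')|[^.?!])+[.?!]+\s*/';

    $out = '';
    for ($i = 0; $i < $max; $i++) {
        if (!preg_match($re, $data, $match)) {
            break; // no more sentences
        }
        $out .= $match[0];
        // Drop the matched sentence from the front of the input
        $data = substr($data, strlen($match[0]));
    }
    return $out;
}

$text = "Nunc Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu.";
echo rtrim(leadingSentencesSkipping($text, 2, array("Dr.", "Mr.", "Mrs.")));
// prints: Nunc Dr. Tom id lectus ante. Suspendisse potenti.
```

The abbreviation alternative is tried before the single-character alternative, so "Dr." is consumed as one unit and its period never reaches the terminator part of the pattern.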


 

 

Thanks :)

 

Link to comment
Share on other sites

Old man musing...

 

$string = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc Dr. Tom id lectus ante. Suspendisse potenti. Fusce eu ante mattis eros hendrerit imperdiet. Fusce at ante mauris, vel dapibus quam. Duis Mr. vel vestibulum neque. Aenean viverra condimentum ante, ut vulputate diam volutpat non. Cum Mrs. sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Fusce nibh diam, sollicitudin at vulputate a, commodo nec ligula.";

$remove_this = array(" Dr. ", " Ms. ", " Mr. ", " Mrs. ");

$string = str_replace($remove_this, " ", $string);


echo $string;

 

 

Link to comment
Share on other sites

I don't want to remove those; I just want the function to skip over them so that they are not treated as the end of a sentence.

 

Right now I am using a function that replaces the abbreviation with the full word.


function replaceWord($description)
{
	$abbrev = array("dr.", "dr", "DR.", "DR", "Dr.", "Dr");
	$words = array("doctor", "doctor", "Doctor", "Doctor", "Doctor", "Doctor");
	$change = str_replace($abbrev, $words, $description);

	return $change;
}

 

It is alright for now, but I want something more dynamic than listing out every possible variation.

 

Thanks.

Link to comment
Share on other sites

Yeah, the function is doing that, but it breaks if there is something like "Mr." or "Dr.", anything that is two letters followed by a ".".

 

 

So then the function you're using doesn't break if there are more than 2 letters...such as "Mrs."?  What if there is an ellipsis "..." in the middle of the sentence?

Link to comment
Share on other sites

hi Kevin,

 

I think the smartest way to do this is to check whether the "." is followed by a space AND an uppercase letter...

Although... it gets harder when you have "...Dr. Newton..." or something like that.

 

In that case, I would try to get my hands on a list of common abbreviations.
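
A rough sketch of combining both ideas: split where a terminator is followed by whitespace and an uppercase letter, then glue a piece back onto the next one when it ends in a known abbreviation. The function name splitSentences() and the abbreviation list are made up for this example, and it assumes ASCII uppercase letters and single spaces between words.

```php
<?php
// Hypothetical helper: split $text into sentences, treating the entries
// in $abbreviations (e.g. "Dr.") as non-terminating.
function splitSentences(string $text, array $abbreviations): array
{
    // Candidate breaks: ., ? or ! followed by whitespace and an uppercase letter
    $pieces = preg_split('/(?<=[.?!])\s+(?=[A-Z])/', $text);

    $sentences = array();
    $buffer = '';
    foreach ($pieces as $piece) {
        $buffer = ($buffer === '') ? $piece : $buffer . ' ' . $piece;

        // Look at the last word collected so far
        $lastSpace = strrpos($buffer, ' ');
        $lastWord = ($lastSpace === false) ? $buffer : substr($buffer, $lastSpace + 1);

        // Only close the sentence if it does not end in an abbreviation
        if (!in_array($lastWord, $abbreviations, true)) {
            $sentences[] = $buffer;
            $buffer = '';
        }
    }
    if ($buffer !== '') {
        $sentences[] = $buffer; // text ended on an abbreviation
    }
    return $sentences;
}

$result = splitSentences(
    "I met Dr. Newton today. He was kind. Then Mrs. Hannock left.",
    array("Dr.", "Mr.", "Mrs.")
);
print_r($result);
// $result holds three sentences, the first being "I met Dr. Newton today."
```

The uppercase check alone would still break at "Dr. Newton", which is why the abbreviation list is needed as a second pass.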

Link to comment
Share on other sites

<?php

$skipped = array(
"Mrs.",
"Mr.",
"Ms.",
"Dr.",
"PhD.",
"Miss.",
"St."
);
$Sentence = "This is a sentence by Mr. Smith, and Mrs. Hannock. This is another sentence by Dr. Fred, Mrs. Wills, PhD. John and St. Peter. etc. Oh And Miss. Han.";

// Encode Trick
function encode_trick($skips,$string){
$Sentence = $string;
$encoded = array();
foreach($skips as $item){
	$encoded[] = str_replace(".","&#46",$item);
}
$Sentence = str_replace($skips,$encoded,$Sentence); 

$Sentence_Array = explode(". ",$Sentence);
for($i=0;$i<count($Sentence_Array);$i++){
	$Sentence_Array[$i] = str_replace("&#46",".",$Sentence_Array[$i]);
}
return $Sentence_Array;
}


// Placement Index
function advanced_index($skips,$string){
$Sentence = $string;
// Create the index
$indexes = array();
foreach($skips as $item){
	while(($index = strpos($Sentence, $item)) !== FALSE){
		$indexes[] = array($index,$item);
		$Sentence = substr_replace($Sentence,"",$index,strlen($item));
	}
}

// Get separate sentences
$Sentence_Array = explode(". ",$Sentence);

// Prepare variables for insertion
$SA_Count = count($Sentence_Array);
$indexes = array_reverse($indexes);
$current_strlen = 0;

// Loop each index that was stored
foreach($indexes as $idxarray){
	// Then, Loop for each sentence
	for($i=0;$i<$SA_Count;$i++){
		// Couple variables used in substr calculations
		$previous_strlen = $current_strlen;
		$current_strlen += strlen($Sentence_Array[$i]) + 2; // 2 is here for the dot + space we exploded earlier.

		// If the current str length is bigger than the index then insert in this sentence.
		if($current_strlen > $idxarray[0]){
			// Get position
			$position = ($i > 0)? ($idxarray[0] - $previous_strlen) : $idxarray[0];
			// Insert
			$Sentence_Array[$i] = substr_replace($Sentence_Array[$i],$idxarray[1],$position,0);
			break;
		}
	}
	// Reset for each index.
	$current_strlen = 0;
}
return $Sentence_Array;
}

// Tests
print_r(encode_trick($skipped,$Sentence)); // The result.
print_r(advanced_index($skipped,$Sentence)); // The result.
?>

 

This should get you on your way. The first function relies on the text not already containing the placeholder pattern (&#46).

The second does not have that limitation and should work with text containing any pattern.

 

-cb-

 

Edit: You can replace strpos() with stripos() if you want to match any case. Same goes for str_replace(), you can use str_ireplace();

Link to comment
Share on other sites
