
Can regex help in converting all caps records to sentence case


swatisonee


Hi,

I have users who enter data in all caps all the time! I would like to do 2 things:

a.) Find a way to convert all data in a field of a table from ALL CAPS to sentence case.

b.) Enforce sentence case for all future data entries.

The fields are either of text or varchar type.

c.) I would also like the phone number field to be recorded as, say, +1 000 111 222 instead of 000-111222 or 000111222 or (000)111-222, etc.

How would I do these with regex, please?

Thanks. Swati

Swati, for the phone number record, you can really do this quite easily. First, simply remove everything that is not a digit, then use substr() to grab the sections you need. For instance, a function like this would help you:
[code]
<?php
// Format phone number to +1 000 111 222
function formatPhone($num) {
  $ph = preg_replace('|[^0-9]|', '', $num);
  if (strlen($ph) != 9) {
    // invalid number of digits, return false
    return false;
  } else {
    $ph = "+1 " . substr($ph, 0, 3) . ' ' . substr($ph, 3, 3) . ' ' . substr($ph, 6);
    return $ph;
  }
}

echo formatPhone("000-111222") . '<br />';
echo formatPhone("000111222") . '<br />';
echo formatPhone("(000)111-222") . '<br />';
?>
[/code]
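
One gap worth noting: a number that is already stored in the target format (+1 000 111 222) contains ten digits, so the strict nine-digit check above would reject it. A minimal sketch of a more tolerant variant follows. The name normalizePhone() is hypothetical, and it assumes, like the examples in this thread, that a local number always has exactly 9 digits and that a tenth leading digit of 1 is the country code.

```php
<?php
// Hypothetical variant of formatPhone(): also accepts numbers that already
// carry the "1" country code (10 digits starting with 1).
// Assumption: a local number has exactly 9 digits, as in the thread's examples.
function normalizePhone($num) {
  $digits = preg_replace('/\D/', '', $num);   // keep digits only
  if (strlen($digits) == 10 && $digits[0] === '1') {
    $digits = substr($digits, 1);             // drop the country code
  }
  if (strlen($digits) != 9) {
    return false;                             // wrong number of digits
  }
  // Split into groups of three and join them with spaces
  return '+1 ' . implode(' ', str_split($digits, 3));
}

var_dump(normalizePhone('+1 000 111 222'));
var_dump(normalizePhone('(000)111-222'));
var_dump(normalizePhone('12345'));
```

Running every incoming value through a function like this before saving it would keep the stored format consistent, whichever way the user typed the number.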

Now, for your other questions, you'll have to define your rules for what you consider [i]sentence case[/i], but yes, it should be doable. For instance, converting the first character following a period, exclamation point, question mark, quotation mark, etc. to uppercase should be easy enough; however, you need to consider what you'll do if someone references the title of a work of art that needs multiple words capitalized. Or what if someone uses "etc." like I just have above, or even quotes a word like I have in this sentence? Those are all going to be exceptions, and without a set of rules that your users are trained to follow, some things will not display properly.

If you come up with a concrete set of rules for capitalization, I'm sure we can help you build something workable.

Without exceptions (like etc.):
[code=php:0]$text = 'your uncapitalized text. it also contains question marks? oh yeah, exclamation marks too!';

function capitalize_first ($match){
  return ucfirst($match[1]);
}

function capitalize ($match){
  return $match[1] . $match[2] . ucfirst($match[3]);
}

// Capitalize the first word of a text
// (\w+) captures the whole word; (\w)* would keep only its last letter
$text = preg_replace_callback ('/^(\w+)/', 'capitalize_first', $text);
// Capitalize every word following a dot, a question mark or an exclamation mark
// (\s*) keeps all of the whitespace between the punctuation and the word
$text = preg_replace_callback ('/([?!.])(\s*)(\w+)/', 'capitalize', $text);[/code]

I've tested it, and it works.
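
The sample text above is already lowercase, while the original question is about ALL CAPS records. A minimal sketch of the extra step: lowercase the value first, then capitalize sentence starts. The function name sentenceCase() is hypothetical, and the sketch assumes plain ASCII text.

```php
<?php
// Hypothetical helper: lowercases everything, then capitalizes the first
// letter of the text and the first letter after '.', '?' or '!'.
// Assumption: single-byte (ASCII) text. Proper nouns and acronyms lose
// their capitals, because everything is lowercased first.
function sentenceCase($text) {
  $text = strtolower($text);
  return preg_replace_callback(
    // Group 1: start of text (plus leading whitespace), or ending
    // punctuation followed by whitespace. Group 2: the letter to raise.
    '/(^\s*|[.?!]\s+)([a-z])/',
    function ($m) { return $m[1] . strtoupper($m[2]); },
    $text
  );
}

echo sentenceCase('PLEASE CALL BACK. IS MONDAY OK? THANKS!');
```

Note that "MONDAY" comes out as "monday": once the text is lowercased, there is no way to tell which words deserved a capital. The same function could be applied to each existing row and to new entries before they are saved, but the results are worth checking by hand.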

Vinze, that's a great start, but you really need to account for all sorts of things besides simply ending punctuation. Typically, you'll want to capitalize the first word within double quotes. You also want to allow for other characters between the ending punctuation and the first letter of the next sentence (such as parentheses).

[quote author=obsidian link=topic=120986.msg498117#msg498117 date=1168089378]
Vinze, that's a great start, but you really need to account for all sorts of things besides simply ending punctuation. Typically, you'll want to capitalize the first word within double quotes. You also want to allow for other characters between the ending punctuation and the first letter of the next sentence (such as parentheses).
[/quote]

Of course, but I felt like doing a bit of regular expressions, so I thought "a start won't hurt".

But OK, put all your desired exceptions in the array $exceptions like I did, and the letters after them won't be capitalized (first word within " not implemented):
[code=php:0]$text = 'your uncapitalized text. it also contains question marks? oh yeah, etc. exclamation marks too! ';

$exceptions = array('etc.', 'other exception?', 'more exceptions!');
$quoted = array();
foreach($exceptions as $exception){
  // preg_quote() escapes '.', '?' and '!' so each exception matches literally
  $quoted[] = preg_quote($exception, '/');
}
// Join the exceptions into one alternation pattern
$pattern = '/' . implode('|', $quoted) . '/';
// Back up all the exceptions found in $text to $found, then replace them by "sometextyoudonotuse"
preg_match_all($pattern, $text, $found);
$text = preg_replace($pattern, 'sometextyoudonotuse', $text);

function capitalize_first ($match){
  return strtoupper($match[1]);
}

function capitalize ($match){
  return $match[1] . $match[2] . strtoupper($match[3]);
}

// Capitalize the first letter of a text
$text = preg_replace_callback ('/^(\w)/', 'capitalize_first', $text);
// Capitalize the first letter following a dot, a question mark or an exclamation mark
$text = preg_replace_callback ('/([?!.])(\s*)(\w)/', 'capitalize', $text);
// Put back the exceptions from $found, in the order they were found
$i = 0;
while(($pos = strpos($text, 'sometextyoudonotuse')) !== false){
  $text = substr_replace($text, $found[0][$i], $pos, strlen('sometextyoudonotuse'));
  $i++;
}[/code]
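
The two cases raised earlier in the thread, the first word within double quotes and parentheses between the ending punctuation and the next sentence, could be handled along these lines. This is only a sketch: the function name capitalizeSentences() is hypothetical, and it assumes ASCII text, straight (") quotes, and that an opening quote is one found at the start of the text or preceded by whitespace or "(".

```php
<?php
// Hypothetical sketch: capitalize the first letter after sentence-ending
// punctuation even when quotes or parentheses sit in between, and
// capitalize the first letter after an opening double quote.
// Assumptions: ASCII text, straight (") quotes, already lowercase input.
function capitalizeSentences($text) {
  $up = function ($m) { return $m[1] . strtoupper($m[2]); };
  // Start of text, or . ? ! followed by optional ')' or '"',
  // then whitespace, then optional '(' or '"', then the letter.
  $text = preg_replace_callback('/(^[\s("]*|[.?!][)"]*\s+[("]*)([a-z])/', $up, $text);
  // First letter after an opening double quote
  return preg_replace_callback('/((?:^|[\s(])")([a-z])/', $up, $text);
}

echo capitalizeSentences('he said "hello there." (then he left.) nobody noticed.');
```

A rule this simple still gets some sentences wrong. For example, in "why?" she asked, the word "she" would be capitalized because it follows a question mark, so a list of exceptions like the one above remains necessary.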
