
preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}/i', $surname)

Hi I need a regex that matches first 2 and remaining characters to the end of the string.

 

This will Match O'Brian but also O'Brian£$%^&*()

 

As far as it is concerned it only has to match 7 characters. I want all remaining to be matched. I can not guarantee the length of the surname but I don't want some idiot messing up my program and trying to trash it with garbage. I can limit the length within reason for the database.

 

I could at a push get the string length but could I put the variable into the string?

Edited by otuatail
https://forums.phpfreaks.com/topic/288865-need-a-za-z-to-the-end-of-the-string/

That's a bad way to handle names. You should never modify a name that has been submitted (with the exception of trim()). If you need to restrict certain input, then add validation and error handling.

 

Also, for names, no matter what logic you come up with, there will be some people's names that would be considered invalid. People hate that. Why would you assume that the apostrophe would only ever be in the second position? What about accented letters: ö, ò, û, etc.?

The quality of the responses received is directly proportional to the quality of the question asked.

No one will be entering Ö ò û. So my question still remains unanswered. I am not asking about the rights and wrongs of someone's name.

 

This above is obviously meaningless :})

Edited by otuatail

No reason to get bent out of shape. If you are asking for free help you have no room to complain on how that help is provided.

 

 

$surname = "O'Brian";
preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}.*/i', $surname, $match);
echo $match[0]; //Output: O'Brian
 
$surname = "O'Brian£$%^&*()";
preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}.*/i', $surname, $match);
echo $match[0]; //Output: O'Brian£$%^&*()
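This works without preg_match:

$upper = strtoupper($surname);
$count = strlen($upper);
$flag = 1;
$expression = "ABCDEFGHIJKLMNOPQRSTUVWXYZ-'";
for ($x = 0; $x < $count; $x++)
{
	$val = substr($upper, $x, 1);
	// strpos() returns 0 for 'A', so compare with === false rather than == false
	if (strpos($expression, $val) === false)
		$flag = 0;
}

// If flag = 0 then this is a stupid name and to be ignored.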
$surname = "O'Brian£$%^&*()";

if(preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}.*/i', $surname)) 
{
	echo "This is a match";
}

else
{
	echo "This does not match";
}

Sorry, this does not work. This should not match "O'Brian£$%^&*()". Whoever heard of someone with a name like that?

 

This works

O'Brian-Stevenson 
This does not match 

O'Brian£$%^&*() 
This does not match 

// Code
$surname = "O'Brian£$%^&*()";
//$surname = "O'Brian-Stevenson";

if(preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}*/i', $surname)) 
{
	echo "$surname <br>";
	echo "This is a match";
}

else
{
	echo "$surname <br>";
	echo "This does not match";
}



Sorry .Josh but just having a * at the end does not work either as a good example also fails. You can have hyphenated names so O'Brian-Stevenson should be matched. What I need is that last square bracket to carry on to the end of the string matching any letter (I have uppercased for the test) and the hyphen as I don't know where it would be but I can do a separate count on that.

 

 

Ps. Why does this forum /forums/34-php-regex not appear on the home page of forums. There is no regex section there.

Edited by otuatail

 

 

Sorry .Josh but just having a * at the end does not work either as a good example also fails.

Wrong character. You need to read .josh's post again. This is the output of your code when the regex ends with a $:

O'Brian£$%^&*() 
This does not match

 

 

Ps. Why does this forum /forums/34-php-regex not appear on the home page of forums. There is no regex section there.

Because it is a sub forum to the PHP Help forum.

Edited by Ch0cu3r

As Ch0cu3r pointed out, I said $ as in dollar sign not * as in star. Also, you still aren't being very clear about what you want to match, but in general, I am guessing what you really want here is to match for something like this:

 

// Code
$surname = "O'Brian£$%^&*()";
//$surname = "O'Brian-Stevenson";

if(preg_match("/^([a-z]')?[a-z]+(-[a-z]+)?$/i", $surname)) 
{
	echo "$surname <br>";
	echo "This is a match";
}

else
{
	echo "$surname <br>";
	echo "This does not match";
}
^([a-z]')?[a-z]+(-[a-z]+)?$

 

^ matches for beginning of string

([a-z]') matches for one letter followed by an apostrophe

? makes that previous match optional

[a-z]+ matches for one or more letters

(-[a-z]+) matches for a hyphen followed by one or more letters

? makes that previous match optional

$ matches for end of string

 

Overall, this will allow for surnames with an O' or M' prefix (or whatever other letter, though I think O and M are the only ones out there), and will also allow for single-hyphenated surnames (multiple hyphens not allowed). This will NOT match for any special letters (e.g. letters with accents). This will NOT match for prefixes that have a space between them and the main surname (e.g. Mac Cartaine)

 

This will match:

 

Brian

O'Brian

Brian-Stevenson

O'Brian-Stevenson

 

This will not match:

 

O'Brian£$%^&*()

Brian-Stevenson-foobar

Mac Cartaine

Mac Cárthaigh
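The two lists above can be checked by running the pattern over each name. This is only a rough sketch; the `$pattern`, `$names` and `$results` variables are names introduced here for illustration and are not from the posts above.

```php
<?php
// Pattern from the post above: optional "X'" prefix, one or more letters,
// then an optional single hyphenated part, anchored at both ends.
$pattern = '/^([a-z]\')?[a-z]+(-[a-z]+)?$/i';

$names = [
    "Brian",
    "O'Brian",
    "Brian-Stevenson",
    "O'Brian-Stevenson",
    'O\'Brian£$%^&*()',
    "Brian-Stevenson-foobar",
    "Mac Cartaine",
    "Mac Cárthaigh",
];

$results = [];
foreach ($names as $name) {
    // preg_match() returns 1 on a match and 0 when there is no match
    $results[$name] = preg_match($pattern, $name) === 1;
    echo $name, ' : ', $results[$name] ? 'match' : 'no match', PHP_EOL;
}
```

The first four names should print "match" and the last four "no match".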

On a sidenote, I wanted to point out a few things about your original regex:

 

preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}/i', $surname)
So firstly, you have this: [\'|A-Z]

 

It looks like the intention here is to match for a quote or a letter. You used a pipe in there to signify this. That is not how character classes work. Character classes do not use a pipe for alternation like you do elsewhere in a regex pattern, because everything in a character class is essentially an alternation. So a pipe has no special meaning in a character class context. Which means your pattern would allow for a literal pipe in the name to be matched. Same thing with [-|A-Z]. On that note..

 

2nd, you had [-|A-Z]{5}. {5} is a quantifier. It specifies how many of the previous to match. So you specified to match exactly 5 hyphens, pipes or letters. In your later posts, you show that surnames with more than 5 characters should match, so this is also wrong.
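Both points can be seen in a small test. This is just a sketch: a $ has been added to the original pattern so the whole string has to match, and the variable names are made up for the example.

```php
<?php
// Original character classes, with "$" added so the whole string must match.
$withPipe    = '/^[A-Z][\'|A-Z][-|A-Z]{5}$/i';
// Same classes with the stray pipe removed.
$withoutPipe = '/^[A-Z][\'A-Z][-A-Z]{5}$/i';

// Inside [...] the pipe is a literal character, so "O|Brian" is accepted.
$pipeAccepted = preg_match($withPipe, 'O|Brian') === 1;
// With the pipe removed from the classes, "O|Brian" is rejected.
$pipeRejected = preg_match($withoutPipe, 'O|Brian') === 0;

// {5} means exactly five characters after the first two.
$sevenChars = preg_match($withoutPipe, "O'Brian") === 1;
$longerName = preg_match($withoutPipe, "O'Brian-Stevenson") === 1;

var_dump($pipeAccepted, $pipeRejected, $sevenChars, $longerName);
```

The first three values should be true and the last one false, since "O'Brian-Stevenson" has more than five characters after the first two.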

 

The regex I posted above will allow for basically any length (minimum 1 char) to be matched. So technically my regex will match this:

 

O'Brian-somereallylooooooooooooooooooooooooooongname

 

There is not an easy way to limit how long it can actually be, given the rest of the regex. It is possible to both limit the total length and ensure there is only one hyphen but this will make the regex significantly more complex. The regex could alternatively be made to limit it and still be somewhat simple if we were to remove the restriction on how many hyphens can appear, but then that is not ideal either. So, overall, if you really want to limit the length, it would be a lot easier if you were to just follow up with a strlen check.
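A minimal sketch of that follow-up strlen check might look like the following. The function name `isAcceptableSurname` and the 50-byte limit are assumptions made up for this example; pick whatever limit suits your database column.

```php
<?php
// Hypothetical helper: the name and the default 50-byte limit are
// assumptions for illustration, not something from the original posts.
function isAcceptableSurname(string $surname, int $maxLength = 50): bool
{
    // Shape check: the pattern suggested earlier in the thread.
    if (preg_match('/^([a-z]\')?[a-z]+(-[a-z]+)?$/i', $surname) !== 1) {
        return false;
    }
    // Length check done separately, so the regex stays simple.
    return strlen($surname) <= $maxLength;
}

var_dump(isAcceptableSurname("O'Brian-Stevenson"));              // within the limit
var_dump(isAcceptableSurname("O'Brian-" . str_repeat('o', 60))); // valid shape, too long
```

Keeping the two checks apart also means you can give the user a different error message for "too long" and "invalid characters".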

otuatail,

 

Yeah, I don't think the requests have been very clear. The original post stated

 

 

preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}/i', $surname)

Hi I need a regex that matches first 2 and remaining characters to the end of the string.

 

This will Match O'Brian but also O'Brian£$%^&*()

 

The second part of that statement (shown in red in the original post) is FALSE and is what started the confusion for me. That regex will not match all of "O'Brian£$%^&*()"; it will only match the first part, "O'Brian", and discard the rest.

$surname = "O'Brian£$%^&*()";
preg_match('/^[A-Z][\'|A-Z][-|A-Z]{5}/i', $surname, $match);
echo $match[0]; //Output: O'Brian

So, I interpreted your request as you were needing the rest of that string. That was my poor interpretation based upon poorly stated requirements. So, I stand behind the statement

 

 

The quality of the responses received is directly proportional to the quality of the question asked.

 

I think what you meant to ask was that you want the regex to ONLY match entire values where the last part of the expression matches ALL of the remaining characters and there has to be at least 5. That is easily solved a couple of ways.

 

First, you need to add the dollar sign to the end of the expression  (before the /) to tell the expression that whatever it matches must continue to the end of the string

 

Second, you need to add logic to match "approved" values past the 5 that it is currently set for. This can be accomplished two ways:

 

1. The {5} tells the expression to match 5 and only 5 characters. You may know that you can put two values in the curly braces separated by a comma such as {5,10}. That tells the expression that it can match that expression from 5 to 10 times. Well, you can just put the comma in and that tells the expression it must match a minimum of 5 characters to an infinite maximum.

 

2. You could repeat the last condition and use a * to say it should match zero or more additional characters.

 

Lastly, the [] are being used to define characters to be matched. You are including the pipe character | and I believe you are doing so to be an OR condition. For example, in this [-|A-Z] I think you mean for that to match a hyphen OR a letter. However, it doesn't work that way within the brackets. That is telling the regex to match a hyphen, a pipe, or a letter. In other words, you don't need the pipe in there. Otherwise you would match "O|Brian" and, using your words, "Whoever heard of someone with a name like that."?

 

Either of these will do what you are asking.

/^[A-Z][\'A-Z][-A-Z]{5,}$/i
/^[A-Z][\'A-Z][-A-Z]{5}[-A-Z]*$/i
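As a quick sanity check, here is a sketch that runs both expressions against the test values used in this thread. The array keys and variable names are made up for the example.

```php
<?php
$patterns = [
    'open-ended quantifier' => '/^[A-Z][\'A-Z][-A-Z]{5,}$/i',
    'repeated class'        => '/^[A-Z][\'A-Z][-A-Z]{5}[-A-Z]*$/i',
];
$tests = ["O'Brian", "O'Brian-Stevenson", 'O\'Brian£$%^&*()'];

$results = [];
foreach ($patterns as $label => $pattern) {
    foreach ($tests as $surname) {
        // Record whether the whole string matched (1) or not (0).
        $results[$label][$surname] = preg_match($pattern, $surname) === 1;
        echo $label, ' | ', $surname, ' : ',
            $results[$label][$surname] ? 'match' : 'no match', PHP_EOL;
    }
}
```

Both expressions should accept the first two values and reject the third.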

However, those will exclude real names. As I said before, trying to verify names is only going to end up invalidating names of real people. But, it's your application, so do what you want.

 

Here are few real names that would not create a match based upon the rules you created:

 

O'Day : Too short since the rules are requiring it to be at least 7 characters

De la Hoya : The rules aren't allowing for spaces

Day-O'Connor : The rules only allow the apostrophe as the 2nd character

Venegás : The rules don't allow characters with diacritics

Edited by Psycho

Yes, overall I agree with Psycho. Sure, it's exceedingly rare, but there is no legal restriction against a person's legal name being "O'Brian£$%^&*()". Famous example: the musician Prince, who at one point changed his name to a symbol and consequently was referred to as "The artist formerly known as Prince".

 

Far more common are the examples that Psycho listed, as well as names with accented letters and other marks above them. And there are no legal limits on name lengths. Many people (especially Asians and Latin Americans) have extremely short names like Xo or Xu, and names can even be made up of multiple parts separated by spaces, e.g. "Jo Ra Xu" or "De La Hoya". So, the "best practice" is to only enforce an upper string length, even though there are technically no legal limitations on this either (but at some point you have to be reasonable for the sake of storage limits, e.g. it's not reasonable to have to allot a varchar(1000) field for surnames because of the 1 in a million person with a name that long).
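For what it's worth, a length-only check along those lines could be sketched like this. The function name `normalizeSurname` and the 100-byte cap are assumptions picked for the example; note that strlen() counts bytes, so accented letters in UTF-8 count as more than one.

```php
<?php
// Hypothetical helper; the 100-byte cap is an assumed limit for illustration.
function normalizeSurname(string $input, int $maxBytes = 100): ?string
{
    // trim() is the only modification applied to the submitted name.
    $surname = trim($input);

    // Reject empty input and anything over the storage limit.
    if ($surname === '' || strlen($surname) > $maxBytes) {
        return null;
    }
    return $surname;
}

var_dump(normalizeSurname('  De La Hoya '));      // trimmed and accepted
var_dump(normalizeSurname('Xu'));                 // short names are fine
var_dump(normalizeSurname(str_repeat('x', 101))); // too long, returns NULL
```

A NULL result would then go through your normal validation error handling rather than the name being silently altered.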
