
[SOLVED] regex help


doug007


Hi Guys

 

I have a file containing entries like <PostCode>WB5 7RT</PostCode>, and I want to identify all invalid postcodes and output them. The regex I need should grab the value between the <PostCode> and </PostCode> tags, then check whether the first character is alphabetic, whether the value contains hyphens, and whether it is longer than 8 characters.

 

I've spent hours on this, yet no joy.

 


function parse1()
{
    $file_path = file('xml.txt');
    $file_count = count($file_path);
    if ($file_count == 0)
    {
        print('file empty');
    }

    $pattern = '~\b<PostCode>[a-zA-Z]{1}[a-zA-Z0-9-]{8,}</PostCode>\b~';

    $search = preg_quote($pattern);

    $result = preg_grep($pattern, $file_path);

    if ($result)
    {
        foreach ($result as $value)
        {
            echo $value;
        }
    }
}

parse1();

Link to comment
https://forums.phpfreaks.com/topic/89918-solved-regex-help/
Share on other sites

data:

<InYearMovements>
	<Employer>
		<Name>Badenoch & Clark</Name>
		<Address>
			<Line>Project House</Line>
			<Line>110-113 Tottenham Court Road</Line>

			<Line>London</Line>
			<PostCode>W1T 5AE</PostCode>

		</Address>
	</Employer>

I want to identify all invalid PostCodes, meaning those that are more than 7 characters long, or have spaces, or have hyphens (they must still start with an alphabetic character), and output them.

 

 

 


You can adapt this:

 

<pre>
<?php

$data = <<<DATA
	<Employer>
		<Name>Badenoch & Clark</Name>
		<Address>
			<Line>Project House</Line>
			<Line>110-113 Tottenham Court Road</Line>

			<Line>London</Line>
			<PostCode>W1T 5AE</PostCode>

		</Address>
	</Employer>
DATA;
preg_match_all('%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%', $data, $matches);
print_r($matches);
?>
</pre>
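
As a rough sketch of how the snippet above could be adapted to the original task, the pattern can be applied to each line of the file in turn and the flagged values collected. The helper name `find_flagged_postcodes` and the inline sample lines are assumptions for illustration, not something from the thread. Matching line by line is a deliberate choice here: on a multi-line string, `\s` can also match a newline, so a match could run from one element into the next.

```php
<?php
// Hypothetical helper (not from the thread): applies the pattern above to
// each line separately and collects the flagged postcode values.
function find_flagged_postcodes(array $lines): array
{
    // The lookbehind and lookahead keep the tags out of the match; the group
    // flags values of 8+ characters or values holding a hyphen or whitespace.
    $pattern = '%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%';
    $flagged = [];
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m)) {
            $flagged[] = $m[1];
        }
    }
    return $flagged;
}

// Inline sample lines (assumed data). For a real file, something like
// file('xml.txt', FILE_IGNORE_NEW_LINES) would supply the array instead.
$lines = [
    "\t\t\t<Line>London</Line>",
    "\t\t\t<PostCode>W1T 5AE</PostCode>",
    "\t\t\t<PostCode>W1T5AE</PostCode>",
    "\t\t\t<PostCode>W1T-5AE-XX</PostCode>",
];

foreach (find_flagged_postcodes($lines) as $value) {
    echo $value, PHP_EOL;
}
```

With these sample lines, the values containing a space or a hyphen are flagged and `W1T5AE` is not. Note that this pattern treats a single space as invalid, so an ordinary value such as `W1T 5AE` is reported too.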


One minor detail: the pattern will only catch 2 or more consecutive whitespace characters, and that means any whitespace, not just spaces. If the spaces can appear apart, use:

%
(?<=<PostCode>)
(
	### 8 characters or more.
	[^<]{8,}
	|
	### A hyphen.
	[^<]*?-[^<]*?
	|
	### More than 1 space.
	[^<]*?\s[^<]*?\s[^<]*?
)
(?=</PostCode>)
%x
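
To try the extended pattern above, one option is to keep it in a nowdoc so the backslashes and `#` comments survive untouched, then run it with `preg_match_all`. This is only a sketch: the sample postcodes are made up for illustration, and the variable names are assumptions.

```php
<?php
// The x-flag pattern from the post, stored in a nowdoc so no escaping is
// needed. With the x flag, whitespace and "#..." comments are ignored.
$pattern = <<<'REGEX'
%
(?<=<PostCode>)
(
    ### 8 characters or more.
    [^<]{8,}
    |
    ### A hyphen.
    [^<]*?-[^<]*?
    |
    ### More than 1 space.
    [^<]*?\s[^<]*?\s[^<]*?
)
(?=</PostCode>)
%x
REGEX;

// Made-up sample values: one acceptable, three that should be flagged,
// and one more acceptable value without any space.
$data = <<<'DATA'
<PostCode>W1T 5AE</PostCode>
<PostCode>W1 5 AE</PostCode>
<PostCode>W1T-5AE</PostCode>
<PostCode>W1T 5AEXX</PostCode>
<PostCode>W1T5AE</PostCode>
DATA;

preg_match_all($pattern, $data, $matches);
print_r($matches[1]);
```

Because each alternative uses `[^<]` rather than `.`, a match cannot run past the closing tag, so the whole document can be searched in one call. With this data, `W1 5 AE`, `W1T-5AE` and `W1T 5AEXX` are flagged, while `W1T 5AE` and `W1T5AE` pass.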

