
[SOLVED] regex help


doug007


Hi Guys

 

I have a file containing entries like <PostCode>WB5 7RT</PostCode>, and I want to identify all invalid postcodes and output them. The regex I need should grab the value between the <PostCode> and </PostCode> tags and check that the first character is alphabetic, that it contains hyphens, and that it is longer than 8 characters.

 

i've spent hours, yet no joy. 

 


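function parse1()
{
    // file() returns the lines of the file as an array, or false on failure.
    $lines = file('xml.txt');
    if ($lines === false || count($lines) == 0)
    {
        print('file empty');
        return;
    }

    // No \b anchors: there is no word boundary between a tab and '<', or
    // between '>' and a line break, so on this data they prevent any match.
    // The preg_quote() call is dropped: its result was never used, and
    // quoting a whole pattern would escape the regex syntax itself.
    $pattern = '~<PostCode>[a-zA-Z][a-zA-Z0-9-]{8,}</PostCode>~';

    // preg_grep() returns the lines that match the pattern.
    $result = preg_grep($pattern, $lines);

    foreach ($result as $value)
    {
        echo $value;
    }
}

parse1();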


data:

<InYearMovements>
	<Employer>
		<Name>Badenoch & Clark</Name>
		<Address>
			<Line>Project House</Line>
			<Line>110-113 Tottenham Court Road</Line>

			<Line>London</Line>
			<PostCode>W1T 5AE</PostCode>

		</Address>
	</Employer>

I want to identify and output all invalid PostCodes: those that are more than 7 characters long, or that have spaces or hyphens. They must still start with an alpha character.

 

 

 


You can adapt this:

 

<pre>
<?php

$data = <<<DATA
	<Employer>
		<Name>Badenoch & Clark</Name>
		<Address>
			<Line>Project House</Line>
			<Line>110-113 Tottenham Court Road</Line>

			<Line>London</Line>
			<PostCode>W1T 5AE</PostCode>

		</Address>
	</Employer>
DATA;
preg_match_all('%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%', $data, $matches);
print_r($matches);
?>
</pre>
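As a rough sketch of one way to adapt this to the file from the question, the helper below runs the same pattern over each line of a file and collects the flagged values. The function name `flaggedPostcodes` is made up for illustration, and reading line by line is an assumption about the input (one PostCode element per line). Note that a value with a single space, such as W1T 5AE, is flagged too, which follows the "or have spaces" rule as stated in the question.

```php
<?php
// Hypothetical helper (not from the thread): returns every PostCode value
// in the file that the pattern from the snippet above flags.
function flaggedPostcodes(string $path): array
{
    // Same pattern as in the snippet above, unchanged.
    $pattern = '%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%';

    // Read line by line without trailing newlines, so \s cannot match
    // a line break and let a match run on into the next line.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return [];
    }

    $flagged = [];
    foreach ($lines as $line) {
        // $m[1] holds the text captured between the two tags.
        if (preg_match($pattern, $line, $m)) {
            $flagged[] = $m[1];
        }
    }
    return $flagged;
}

// Usage, assuming the file name from the question:
// foreach (flaggedPostcodes('xml.txt') as $postcode) {
//     echo $postcode, PHP_EOL;
// }
```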


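One minor detail: \s matches any whitespace character, not just spaces, and the pattern above flags a value that contains even one of them, so W1T 5AE is caught too. To catch only values with more than one whitespace character, even when they appear apart, use: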

%
(?<=<PostCode>)
(
	### 8 characters or more.
	[^<]{8,}
	|
	### A hyphen.
	[^<]*?-[^<]*?
	|
	### More than 1 space.
	[^<]*?\s[^<]*?\s[^<]*?
)
(?=</PostCode>)
%x
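To try the pattern out, something like the following should work. It is only a sketch: the function name `findInvalidPostcodes` and the sample values are invented for illustration and are not from the thread. The pattern is kept in a nowdoc so it can stay in its free-spacing (x modifier) layout without any extra escaping.

```php
<?php
// Hypothetical helper (not from the thread): returns the PostCode values
// that the free-spacing pattern above flags as invalid.
function findInvalidPostcodes(string $data): array
{
    // Nowdoc keeps the pattern verbatim, so no backslash needs doubling.
    $pattern = <<<'REGEX'
%
(?<=<PostCode>)
(
    ### 8 characters or more.
    [^<]{8,}
    |
    ### A hyphen.
    [^<]*?-[^<]*?
    |
    ### More than 1 whitespace character.
    [^<]*?\s[^<]*?\s[^<]*?
)
(?=</PostCode>)
%x
REGEX;

    // $matches[1] holds the text captured by the single group.
    preg_match_all($pattern, $data, $matches);
    return $matches[1];
}

// Made-up sample values, for illustration only.
$sample = <<<DATA
<PostCode>W1T 5AE</PostCode>
<PostCode>WB5-7RT</PostCode>
<PostCode>AB1 2 C</PostCode>
<PostCode>WB5 7RTXX</PostCode>
DATA;

print_r(findInvalidPostcodes($sample));
```

With these samples, W1T 5AE is left alone (7 characters, one space, no hyphen), while the other three values are returned. Using [^<] instead of . keeps a match from running past the closing tag.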

