doug007 Posted February 7, 2008
Hi guys,
I have a file containing entries like <PostCode>WB5 7RT</PostCode>, and I want to identify all invalid postcodes and output them. The regex I need should pick out the value between the <PostCode> tags and check that the first character is alphabetic, that it contains hyphens, and that it is longer than 8 characters. I've spent hours on this, yet no joy.

function parse1()
{
    $file_path = file('xml.txt');
    $file_count = count($file_path);
    if ($file_count == 0) {
        print('file empty');
    }
    $pattern = '~\b<PostCode>[a-zA-Z]{1}[a-zA-Z0-9-]{8,}</PostCode>\b~';
    $search = preg_quote($pattern);
    $result = preg_grep($pattern, $file_path);
    if ($result) {
        foreach ($result as $value) {
            echo $value;
        }
    }
}
parse1();

Link to comment: https://forums.phpfreaks.com/topic/89918-solved-regex-help/

effigy Posted February 7, 2008
Only variables being used in an expression should be passed through preg_quote.

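A minimal sketch of what that advice looks like in practice. preg_quote() escapes regex metacharacters in a literal string so it can be embedded in a larger pattern; running a whole pattern through it would escape the very metacharacters the pattern relies on. The `$literal` and `$line` values below are made up for illustration and are not from the original post.

```php
<?php
// Hypothetical literal text that should be matched exactly as written.
$literal = 'W1T 5AE (old)';

// Quote only the variable part; the second argument also escapes the delimiter.
$pattern = '~<PostCode>' . preg_quote($literal, '~') . '</PostCode>~';

// Hypothetical line of input.
$line = '<PostCode>W1T 5AE (old)</PostCode>';

// The parentheses in $literal are matched literally, not treated as a group.
var_dump(preg_match($pattern, $line) === 1);
```

Without preg_quote(), the parentheses in `$literal` would form a capture group and the line above would not match.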
doug007 (Author) Posted February 7, 2008
Thanks for the feedback. I've been hammering my head trying to figure out how to do this, yet I still can't. Can anyone help fix my code? Thanks.

effigy Posted February 7, 2008
What happens when you remove $search = preg_quote($pattern);?

doug007 (Author) Posted February 7, 2008
Nothing, I get an empty page.

effigy Posted February 7, 2008
What troubleshooting have you done? Is the file being loaded as you expect? Is there any data that will actually match your pattern? Can you show some sample data?

doug007 (Author) Posted February 7, 2008
Data:

<InYearMovements>
  <Employer>
    <Name>Badenoch & Clark</Name>
    <Address>
      <Line>Project House</Line>
      <Line>110-113 Tottenham Court Road</Line>
      <Line>London</Line>
      <PostCode>W1T 5AE</PostCode>
    </Address>
  </Employer>

I want to identify and output all invalid PostCodes: those that are more than 7 characters, or have spaces, or have hyphens. A valid one must start with an alphabetic character.

effigy Posted February 7, 2008
You can adapt this:

<pre>
<?php
$data = <<<DATA
<Employer>
  <Name>Badenoch & Clark</Name>
  <Address>
    <Line>Project House</Line>
    <Line>110-113 Tottenham Court Road</Line>
    <Line>London</Line>
    <PostCode>W1T 5AE</PostCode>
  </Address>
</Employer>
DATA;
preg_match_all('%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%', $data, $matches);
print_r($matches);
?>
</pre>

doug007 (Author) Posted February 7, 2008
This works, but it outputs all PostCode values. I only want to output PostCode values that are longer than 8 characters, have hyphens, and a space. I looked at your regex, but I don't really understand it well enough to modify it myself. Many thanks for your help.

effigy Posted February 7, 2008
Try this:

%(?<=<PostCode>)([^<]{8,}|[^<]*?[-\s][^<]*?)(?=</PostCode>)%

doug007 (Author) Posted February 7, 2008
No difference. It should output like so:

<PostCode>W1T 5AEY</PostCode> -> invalid, output (more than 8)
<PostCode>W1T 6AE</PostCode> -> valid, don't output
<PostCode>W1T 6A-E</PostCode> -> invalid, output (has hyphen)

effigy Posted February 7, 2008
You had said that the code needs to be output if it contains a space; that's why W1T 6AE is being output.

doug007 (Author) Posted February 7, 2008
Damn, sorry. I meant to say: if an additional space makes the postcode longer than 8 characters, then print it; otherwise, if it has a space but the total is less than 8 characters, then it's valid.

effigy Posted February 7, 2008
How about this? The code is shown if it is 8 characters or more, contains a hyphen, or contains two or more spaces.

%(?<=<PostCode>)([^<]{8,}|[^<]*?(?:-|\s{2,})[^<]*?)(?=</PostCode>)%

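A short sketch of that pattern applied with preg_match_all, for anyone who wants to try it. The first three postcodes are the examples doug007 gave earlier in the thread; the fourth (with two consecutive spaces) is an assumed extra case added to exercise the \s{2,} branch.

```php
<?php
// Three cases from the thread, plus an assumed double-space case.
$data = <<<DATA
<PostCode>W1T 5AEY</PostCode>
<PostCode>W1T 6AE</PostCode>
<PostCode>W1T 6A-E</PostCode>
<PostCode>W1T  6A</PostCode>
DATA;

// Flag a value if it is 8+ characters, has a hyphen, or has 2+ consecutive whitespace characters.
$pattern = '%(?<=<PostCode>)([^<]{8,}|[^<]*?(?:-|\s{2,})[^<]*?)(?=</PostCode>)%';

preg_match_all($pattern, $data, $matches);

// $matches[1] holds the flagged values: 'W1T 5AEY', 'W1T 6A-E' and 'W1T  6A'.
print_r($matches[1]);
```

The valid entry W1T 6AE is skipped because it is 7 characters long, has no hyphen, and contains only a single space.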
doug007 (Author) Posted February 7, 2008
effigy, you dog, you've done it! A million thanks... I can sleep fine now.
Cheers,
dug

effigy Posted February 7, 2008
One minor detail: the pattern only catches two or more consecutive whitespace characters, and \s matches any whitespace, not just spaces. If they can appear apart, use:

%
  (?<=<PostCode>)
  (
    ### 8 characters or more.
    [^<]{8,}
    |
    ### A hyphen.
    [^<]*?-[^<]*?
    |
    ### More than 1 space.
    [^<]*?\s[^<]*?\s[^<]*?
  )
  (?=</PostCode>)
%x

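A sketch of the extended (x flag) pattern in use. Under the x flag, whitespace in the pattern is ignored and # starts a comment that runs to the end of the line, so the pattern has to stay on multiple lines as laid out above. The sample postcodes here are assumed test values, not data from the thread.

```php
<?php
// Same three alternatives as the pattern above, written with the x flag.
$pattern = '%
    (?<=<PostCode>)
    (
        [^<]{8,}                  # 8 characters or more
      |
        [^<]*?-[^<]*?             # a hyphen
      |
        [^<]*?\s[^<]*?\s[^<]*?    # two whitespace characters, adjacent or apart
    )
    (?=</PostCode>)
%x';

// Assumed sample values: one valid, one with two separated spaces, one with a hyphen.
$data = <<<DATA
<PostCode>W1T 6AE</PostCode>
<PostCode>W1 T 6A</PostCode>
<PostCode>W1T-6AE</PostCode>
DATA;

preg_match_all($pattern, $data, $matches);

// $matches[1] holds the flagged values: 'W1 T 6A' and 'W1T-6AE'.
print_r($matches[1]);
```

Here W1 T 6A is only 7 characters long, so it is caught by the third alternative alone, which the earlier \s{2,} version would have missed.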