doug007 Posted February 7, 2008
Hi guys,
I have a file containing entries like <PostCode>WB5 7RT</PostCode>, and I want to identify all invalid postcodes and output them. The regex I need should extract the value between the <PostCode> tags and check that the first character is alphabetic, that it contains hyphens, and that it is longer than 8 characters. I've spent hours on this, yet no joy.
function parse1()
{
    $file_path = file('xml.txt');
    $file_count = count($file_path);
    if ($file_count == 0) {
        print('file empty');
    }
    $pattern = '~\b<PostCode>[a-zA-Z]{1}[a-zA-Z0-9-]{8,}</PostCode>\b~';
    $search = preg_quote($pattern);
    $result = preg_grep($pattern, $file_path);
    if ($result) {
        foreach ($result as $value) {
            echo $value;
        }
    }
}
parse1();
effigy Posted February 7, 2008
Only variables being used in an expression should be passed through preg_quote.
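To illustrate the point about preg_quote, here is a minimal sketch of where it does belong: on a literal value that is spliced into a pattern, not on the pattern itself. The `$needle` and `$line` values are made up for the example. Note also that in the code from the first post `$search` is never used afterwards, so that call has no effect on the result there.

```php
<?php
// Hypothetical literal text that happens to contain regex metacharacters.
$needle = 'W1T 5AE (old)';
$line   = '<PostCode>W1T 5AE (old)</PostCode>';

// Quote only the variable part; '~' is passed so the delimiter is escaped too.
$quoted = '~<PostCode>' . preg_quote($needle, '~') . '</PostCode>~';
$found  = preg_match($quoted, $line);

// Without quoting, "(old)" is read as a capture group, so the literal
// parentheses in $line are not matched.
$unquoted = preg_match('~<PostCode>' . $needle . '</PostCode>~', $line);

var_dump($found, $unquoted);
```

The fixed parts of the pattern (the tags, the delimiters) are written by hand and left unquoted, which is why the whole pattern string should not go through preg_quote.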
doug007 (Author) Posted February 7, 2008
Thanks for the feedback. I've been hammering my head trying to figure out how to do this, but I still can't. Can anyone help fix my code? Thanks.
effigy Posted February 7, 2008
What happens when you remove $search = preg_quote($pattern);?
doug007 (Author) Posted February 7, 2008
Nothing, I get an empty page.
effigy Posted February 7, 2008
What troubleshooting have you done? Is the file being loaded as you expect? Is there any data that will actually match your pattern? Can you show some sample data?
doug007 (Author) Posted February 7, 2008
Data:
<InYearMovements>
  <Employer>
    <Name>Badenoch & Clark</Name>
    <Address>
      <Line>Project House</Line>
      <Line>110-113 Tottenham Court Road</Line>
      <Line>London</Line>
      <PostCode>W1T 5AE</PostCode>
    </Address>
  </Employer>
I want to identify and output all invalid PostCodes: those that are more than 7 characters long, have spaces, or have hyphens. They must start with an alphabetic character.
effigy Posted February 7, 2008
You can adapt this:
<pre>
<?php
$data = <<<DATA
<Employer>
  <Name>Badenoch & Clark</Name>
  <Address>
    <Line>Project House</Line>
    <Line>110-113 Tottenham Court Road</Line>
    <Line>London</Line>
    <PostCode>W1T 5AE</PostCode>
  </Address>
</Employer>
DATA;
preg_match_all('%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%', $data, $matches);
print_r($matches);
?>
</pre>
doug007 (Author) Posted February 7, 2008
This works, but it outputs all PostCode values. I only want to output PostCode values that are longer than 8 characters, have hyphens, and a space. I looked at your regex, but I don't understand it well enough to modify it myself. Many thanks for your help.
effigy Posted February 7, 2008
Try this:
%(?<=<PostCode>)([^<]{8,}|[^<]*?[-\s][^<]*?)(?=</PostCode>)%
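The thread does not spell out why `.` was swapped for `[^<]`. One effect that can be checked is shown in the sketch below: when two elements sit on the same line, the greedy `.{8,}` can run across the closing tag into the next element, while `[^<]` cannot leave the element it started in. The single-line `$data` is a made-up case, not the poster's file.

```php
<?php
// Hypothetical input: two short, valid-looking codes on one line.
$data = '<PostCode>W1T5AE</PostCode><PostCode>W1T6AE</PostCode>';

// First version from the thread, using "." for the content.
$dotPattern = '%(?<=<PostCode>)(.{8,}|.*?[-\s].*?)(?=</PostCode>)%';
// Revised version, using "[^<]" so a match cannot cross a tag.
$negPattern = '%(?<=<PostCode>)([^<]{8,}|[^<]*?[-\s][^<]*?)(?=</PostCode>)%';

preg_match_all($dotPattern, $data, $dot);
preg_match_all($negPattern, $data, $neg);

print_r($dot[1]); // one match that swallows both elements
print_r($neg[1]); // no matches
```

With one element per line, as in the sample data, both versions behave the same, because `.` does not match a newline by default.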
doug007 (Author) Posted February 7, 2008
No difference. It should output like so:
<PostCode>W1T 5AEY</PostCode> -> invalid, output: more than 8
<PostCode>W1T 6AE</PostCode> -> valid, don't output
<PostCode>W1T 6A-E</PostCode> -> invalid, output: has hyphen
effigy Posted February 7, 2008
You had said that the code needs to be output if it contains a space; that's why W1T 6AE is being output.
doug007 (Author) Posted February 7, 2008
Damn, sorry. I meant to say that if an additional space makes the postcode longer than 8 characters, then print it; if it has a space but the total is less than 8 characters, then it's valid.
effigy Posted February 7, 2008
How about this? The code is shown if it is 8 characters or more, contains a hyphen, or contains two or more spaces.
%(?<=<PostCode>)([^<]{8,}|[^<]*?(?:-|\s{2,})[^<]*?)(?=</PostCode>)%
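As a quick check of the three conditions, the sketch below runs the pattern over the sample values quoted earlier in the thread. The last two values (`W1-6AE` and the one with a double space) are made up here to exercise the hyphen and double-space branches on codes shorter than 8 characters.

```php
<?php
// Pattern from the post above, unchanged.
$pattern = '%(?<=<PostCode>)([^<]{8,}|[^<]*?(?:-|\s{2,})[^<]*?)(?=</PostCode>)%';

$data = <<<DATA
<PostCode>W1T 5AEY</PostCode>
<PostCode>W1T 6AE</PostCode>
<PostCode>W1T 6A-E</PostCode>
<PostCode>W1-6AE</PostCode>
<PostCode>W1T  6A</PostCode>
DATA;

// $matches[1] holds the captured contents of each flagged element.
preg_match_all($pattern, $data, $matches);
$invalid = $matches[1];

print_r($invalid);
```

Only `W1T 6AE` is left out: it is 7 characters long, has no hyphen, and contains a single space.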
doug007 (Author) Posted February 7, 2008
effigy, you dog, you've done it! A million thanks... I can sleep fine now.
Cheers,
dug
effigy Posted February 7, 2008
One minor detail: the pattern will only catch 2 or more consecutive instances of whitespace (not just spaces). If they can appear apart, use:
%
  (?<=<PostCode>)
  (
    ### 8 characters or more.
    [^<]{8,}
    |
    ### A hyphen.
    [^<]*?-[^<]*?
    |
    ### More than 1 space.
    [^<]*?\s[^<]*?\s[^<]*?
  )
  (?=</PostCode>)
%x
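A small sketch of the difference, assuming a made-up value `W 1T 6A` whose two spaces are not adjacent. Because of the `x` modifier, the pattern has to keep its line breaks: each `#` comment runs to the end of its line, so collapsing it onto one line would comment out everything after the first `#`.

```php
<?php
// Earlier pattern: flags only two or more consecutive whitespace characters.
$consecutive = '%(?<=<PostCode>)([^<]{8,}|[^<]*?(?:-|\s{2,})[^<]*?)(?=</PostCode>)%';

// Extended (x) pattern from the post above: whitespace may appear apart.
$separated = '%
    (?<=<PostCode>)
    (
        ### 8 characters or more.
        [^<]{8,}
        |
        ### A hyphen.
        [^<]*?-[^<]*?
        |
        ### More than 1 space.
        [^<]*?\s[^<]*?\s[^<]*?
    )
    (?=</PostCode>)
%x';

// Hypothetical data: the first code has two separated spaces, the second is valid.
$data = <<<DATA
<PostCode>W 1T 6A</PostCode>
<PostCode>W1T 6AE</PostCode>
DATA;

preg_match_all($consecutive, $data, $a);
preg_match_all($separated, $data, $b);

print_r($a[1]); // nothing flagged
print_r($b[1]); // the code with two separated spaces is flagged
```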