jmace Posted October 28, 2010

This seems like it would be a pretty simple problem, but I can't figure it out. I have an error file that I read in line by line, and my PHP script is supposed to capture the important details from each line and store them. Here is an example of a line from the error file (for reference, I put brackets around the parts it's supposed to capture):

Please provide more data and resubmit. '[merchant_sku]' Merchant value: '[03/31096]' catalog value: '[]' '[manufacturer]' Merchant value: '[]' catalog value: '[Oriental Trading]' '[brand]' Merchant value: '[]' catalog value: '[]' '[item_name]' Merchant value: '[bamboo Limbo Set]' catalog value: '[bamboo 6' Foot Limbo Kit]' '[part_number]' Merchant value: '[]' catalog value: '[iN-34/64]'. For details, see http://website.com

My regexp looks like this:

"/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'/i"

It works all right until it reaches "bamboo 6' Foot Limbo Kit". At that point it stops at the 6'. I can't figure out how to make it go past that point and capture the whole value. I also need to apply the same fix to the Merchant value: '(.*?)' group, just in case that value has a ' in it. Any ideas would be great. For reference, here is the rest of my code:

$location = './errors.txt';
$file = fopen($location, 'r');
$contents = fread($file, filesize($location));
fclose($file);
// explode() replaces the deprecated split()
$contents = explode("\n", $contents);
$att = array();
$x = 0;
foreach ($contents as $content) {
    // columns are tab-separated; the error text is in the fifth column
    $content = explode("\t", $content);
    if (isset($content[4])) {
        if (preg_match("/SKU '(\d+?\/\d+?)' appears to correspond to ASIN/i", $content[4], $match)) {
            $att[$x]['sku'] = $match[1];
            preg_match_all("/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'/i", $content[4], $match);
            print_r($match);
            $x++;
        }
    } else {
        continue;
    }
}

Thank you very much,
Jmace

akitchin Posted October 28, 2010

it looks as though every value will also be encased in brackets. could you not add those to the pattern?

"/ '\[(.*?)\]' Merchant value: '\[(.*?)\]' catalog value: '\[(.*?)\]'/i"

jmace (Author) Posted October 28, 2010

Thanks for the quick response. But no, the brackets are just there for reference. In the actual error file they are not there. Sorry for the confusion.

akitchin Posted October 28, 2010

in that case, you might be able to use a look-ahead assertion, which allows for two cases:

1. the closing single-quote in the catalog value subpattern must be followed by a " '" (space then single-quote) to capture all of the middle data sets, and
2. the closing single-quote in the catalog value subpattern must be followed by a "." (period) to capture the last data set.

give this a shot. i'll admit i haven't tested it, i simply constructed it from the PHP manual entry on Assertions:

"/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'(?= \'|\.)/i"

the manual mentions that a look-behind assertion needs to match strings of the same fixed length, but i don't know if the same holds true for look-ahead assertions. hope this helps.

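On the open question in the post above: in PCRE the fixed-length restriction applies to look-behind only, so the alternatives inside a look-ahead may differ in length, as " '" and "." do here. The two cases can be sketched as follows. This is a minimal, untested-against-the-real-file sketch; it assumes the data sets are separated by a single space, as they appear in the post, and the sample line is a shortened version of the one in the question without the reference brackets.

```php
<?php
// Shortened sample line; assumes ONE space between data sets, as displayed in the post.
$line = "resubmit. 'item_name' Merchant value: 'bamboo Limbo Set' catalog value: 'bamboo 6' Foot Limbo Kit'"
      . " 'part_number' Merchant value: '' catalog value: 'iN-34/64'. For details, see http://website.com";

// The closing quote of the catalog value only counts if it is followed by
// " '" (another data set starts) or "." (end of the last data set).
// The look-ahead checks this without consuming those characters.
$pattern = "/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'(?= '|\\.)/i";

preg_match_all($pattern, $line, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] = attribute name, $m[2] = merchant value, $m[3] = catalog value
    printf("%s | %s | %s\n", $m[1], $m[2], $m[3]);
}
```

Because the look-ahead does not consume the space and quote that follow, the next match can still begin at that same space. The quote in 6' is skipped because it is followed by " F", which satisfies neither case.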
jmace (Author) Posted October 28, 2010

You had a slight typo in your regexp, but yes, you were right. Solution:

"/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'(?=  |\.)/i"

That really helped me out a ton. Thank you so much for all of your help. I will have to study up more on this stuff.

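For later readers, here is a sketch of the accepted pattern wrapped in a small function. The function name parse_attributes and the returned array layout are hypothetical, not from the thread. It assumes the real error file separates data sets with exactly two spaces (the forum display collapses them to one) and that the last data set ends with a period. The look-ahead is written as " {2}", which is equivalent to two literal spaces but survives whitespace collapsing.

```php
<?php
/**
 * Hypothetical helper (not from the thread): extract each attribute's
 * merchant and catalog value from one error line.
 */
function parse_attributes(string $line): array
{
    // A closing quote ends the catalog value only when followed by two
    // spaces (next data set) or a period (end of the last data set).
    $pattern = "/ '(.*?)' Merchant value: '(.*?)' catalog value: '(.*?)'(?= {2}|\\.)/i";

    $attributes = [];
    if (preg_match_all($pattern, $line, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $attributes[$m[1]] = ['merchant' => $m[2], 'catalog' => $m[3]];
        }
    }
    return $attributes;
}

// Sample line without the reference brackets, using two spaces between data sets (assumed).
$line = "Please provide more data and resubmit.  'merchant_sku' Merchant value: '03/31096' catalog value: ''"
      . "  'item_name' Merchant value: 'bamboo Limbo Set' catalog value: 'bamboo 6' Foot Limbo Kit'"
      . "  'part_number' Merchant value: '' catalog value: 'iN-34/64'. For details, see http://website.com";

print_r(parse_attributes($line));
```

The same reasoning covers a quote inside a Merchant value: that group can only end where "' catalog value: '" follows, so a stray quote in the middle of the value is skipped.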
akitchin Posted October 28, 2010

glad it helped - i didn't realize there were two spaces between each data set, i suppose the BB code might have stripped that out.

i should mention that if you yourself wrote the script that writes to the log file, it might be handier to use serialize() to encode the data that forms part of the error as a single string, pull that whole chunk out when reading (perhaps using a set delimiter), and use unserialize() to turn it back into variables. then again, if this method works just fine now, no need to mess with it. just a thought for future applications.
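A rough sketch of that serialize() idea, assuming you control both the writer and the reader of the log. The delimiter, the function names, and the base64 step are all assumptions added for illustration: base64 keeps the payload on one line and ensures it can never contain the delimiter.

```php
<?php
// Hypothetical delimiter separating the readable message from the payload.
const PAYLOAD_DELIMITER = ' ||| ';

// Build one log line: message, delimiter, then the details as an encoded payload.
function format_log_line(string $message, array $details): string
{
    return $message . PAYLOAD_DELIMITER . base64_encode(serialize($details));
}

// Split a log line back into its message and details; null if it has no valid payload.
function parse_log_line(string $line): ?array
{
    $line = rtrim($line, "\r\n");
    // The payload is base64, so the LAST delimiter is always the real one.
    $pos = strrpos($line, PAYLOAD_DELIMITER);
    if ($pos === false) {
        return null;
    }
    $payload = base64_decode(substr($line, $pos + strlen(PAYLOAD_DELIMITER)), true);
    if ($payload === false) {
        return null;
    }
    // allowed_classes => false refuses to instantiate objects from log data.
    $details = unserialize($payload, ['allowed_classes' => false]);
    if (!is_array($details)) {
        return null;
    }
    return ['message' => substr($line, 0, $pos), 'details' => $details];
}

$line = format_log_line('Please provide more data and resubmit.', [
    'item_name' => ['merchant' => 'bamboo Limbo Set', 'catalog' => "bamboo 6' Foot Limbo Kit"],
]);

$parsed = parse_log_line($line);
echo $parsed['details']['item_name']['catalog'], "\n";
```

With this layout no regular expression is needed on the reading side, and quotes inside a value stop being a problem. json_encode()/json_decode() would be a reasonable alternative if the log also needs to be read by other tools.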