Rottingham Posted December 27, 2007

OK, this is my first attempt at using regex. I've been coding for years and can't believe I survived this long without it, but the day has come that I must accept it. So I'm having a problem...

I am trying to interpret 6000 lines of the following style:

    "792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },

My goal is to extract the first 6 digits, everything within the quotes after d:, and everything within the quotes after p:. For this line, I'm hoping to get an array like this:

    array[0] = 792036
    array[1] = Holmes Heavy Duty Slide Bolt
    array[2] = 9.11

My regex statement

    ereg("\"[1-9]{6}\"", $line, $regs);

gives me some encouraging results, but frustrating ones too:

    Warning: Invalid argument supplied for foreach() in /home/macactio/public_html/topsoftweb/test.php on line 48
    "101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" },
    Warning: Invalid argument supplied for foreach() in /home/macactio/public_html/topsoftweb/test.php on line 48
    "270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },
    "271766" "271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" },

You will notice that the first line gets a warning, the second line gets a warning, but the third line actually gets the number, albeit with the " signs too. I need to remove the \"...\" from my regex. I can't figure out a) why it works on some of the lines and not others, and b) how I will get the rest of the parts I need. I think I have an idea, but I'm going to show my ever-simple function to see if someone can tell me why it works sometimes and not on every line:

    foreach($file_lines as $line) {
        unset($regs);
        // Interpret Line
        // "792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },
        ereg("\"[1-9]{6}\"", $line, $regs);
        foreach($regs as $reg)
            echo $reg." ";
        echo $line;
        echo "<br>";
    }

Topic: https://forums.phpfreaks.com/topic/83315-solved-ereg-help/
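A likely explanation for question a): the character class [1-9] excludes the digit 0, so a part number such as 101303 or 270625 can never match six consecutive [1-9] characters, while 271766 can. When nothing matches, $regs is not an array, which would explain the foreach() warning. The sketch below shows this with preg_match() instead of ereg() (ereg was removed in PHP 7); the sample lines come from the post, and the variable names are only illustrative.

```php
<?php
// Sample lines copied from the post above.
$lines = [
    '"101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" },',
    '"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },',
    '"271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" },',
];

$results = [];
foreach ($lines as $line) {
    // [1-9] excludes the digit 0, so IDs containing a zero cannot match.
    $narrow = preg_match('/"[1-9]{6}"/', $line);

    // [0-9] accepts every digit; the parentheses capture the number
    // without the surrounding quotes.
    $wide = preg_match('/"([0-9]{6})"/', $line, $m);

    $results[] = [$narrow, $wide ? $m[1] : null];
    echo ($narrow ? 'match   ' : 'no match'), ' | ', ($wide ? $m[1] : '-'), "\n";
}
```

Putting the capture group inside the quotes is also one way to drop the " characters from the result, since $m[1] holds only the digits.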
dsaba Posted December 27, 2007

Here's a preg (PCRE) solution; I hear it's faster than ereg (POSIX). You can use preg_match_all() to grab all the matches in one pass:

    ~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~

Tested: http://nancywalshee03.freehostia.com/regextester/regex_tester.php?seeSaved=yyvefddg

I also noticed in your error report above that one of your lines of data does not follow the format you specified:

    "271766" "271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" }
Rottingham Posted December 27, 2007 (Author)

Thanks man, looks like a little more success... I changed my code to the following:

    foreach($file_lines as $line) {
        // Interpret Line
        // "792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },
        // Places three space separated words into $regs[1], $regs[2] and $regs[3].
        //ereg("[0-9]{6}", $line, $regs);
        preg_match_all('~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~', $line, $regs, PREG_SET_ORDER);
        echo $regs[0][1].' ';
        echo $line;
        echo "<br>";
    }

You can see the results here: http://macaction.org/topsoftweb/test.php

Unfortunately, it still works on some lines but not on others.
dsaba Posted December 27, 2007

You need to show the input data you're working with. The regex I supplied worked fine with what you showed; I cannot see what your problem is if I don't see the input data.

Instead of going through each line of the input data with foreach($file_lines as $line), read it all at once:

    <?php
    $data = file_get_contents('whatever.txt');
    $pat = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';
    preg_match_all($pat, $data, $out);
    foreach ($out[0] as $k => $fullMatch) {
        $num = $out[1][$k];
        $d = $out[2][$k];
        $p = $out[3][$k];
        $q = $out[4][$k];
        echo "$num<br>$d<br>$p<br>$q<br><br>";
    }
    ?>

The matches array on my website is exactly the array that preg_match_all() puts in $out when no special flags are set.
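Since the two posts above index the matches differently ($regs[0][1] with PREG_SET_ORDER, $out[1][$k] with no flag), a small comparison of the two layouts may help. This is only a sketch: the second data line is made up for illustration, and $byGroup / $byMatch are hypothetical variable names.

```php
<?php
// First line is from the thread; the second is invented for this example.
$data = '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },' . "\n"
      . '"123456": { d:"Example Part", p:"1.50", q:"2" },';

$pat = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';

// Default layout (PREG_PATTERN_ORDER): one sub-array per capture group.
// $byGroup[1] holds every part number, $byGroup[2] every description, ...
$count = preg_match_all($pat, $data, $byGroup);

// PREG_SET_ORDER: one sub-array per match.
// $byMatch[0] holds all groups of the first match, $byMatch[1] of the second, ...
preg_match_all($pat, $data, $byMatch, PREG_SET_ORDER);

echo $byGroup[2][0], "\n"; // description of the first match
echo $byMatch[0][2], "\n"; // the same value, addressed match-first
```

Both calls find the same matches; only the order of the two array indexes is swapped.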
Rottingham Posted December 27, 2007 (Author)

Hmm... You can view my results again at http://macaction.org/topsoftweb/test.php

You can view the source file at http://macaction.org/topsoftweb/parts_prices.txt

There are 6997 lines of parts in that file, and I'm only getting 1400 results. I'm not sure what the deal is.
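One way to narrow down a mismatch like this is to list the lines the pattern rejects rather than the ones it accepts. The sketch below does that with the pattern from the earlier reply; it runs on four sample lines quoted in this thread (not on the real parts_prices.txt), and $unmatched is a hypothetical name.

```php
<?php
$pat = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';

// Sample lines quoted earlier in the thread, standing in for the real file.
$lines = [
    '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },',
    '"101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" },',
    '"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },',
    '"271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" },',
];

// Collect every non-empty line the pattern fails on, keyed by line number.
$unmatched = [];
foreach ($lines as $i => $line) {
    if (trim($line) !== '' && !preg_match($pat, $line)) {
        $unmatched[$i + 1] = $line;
    }
}

foreach ($unmatched as $n => $line) {
    echo "line $n: $line\n";
}
```

On these samples the rejected lines are the ones whose price has two digits before the decimal point, which the single [0-9] in the p: group does not allow.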
dsaba Posted December 27, 2007

This works:

    ~"([0-9]*)": { d:"([^"]*)", p:"([0-9]{1,}\.[0-9]{2})", q:"([0-9]{1,})" }~

It failed before because your data has varying formats:

- changed + to * in the d: group, because the description can be blank
- changed p: to accept one or more digits before the decimal point ({1,})
- changed the other groups accordingly

Here is a reference to simple regex symbols/terms: http://www.regular-expressions.info/reference.html
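As a rough check of the relaxed pattern, the sketch below runs it over three lines and builds the number / description / price arrays asked for in the first post. The third line (blank description, three-digit price, two-digit quantity) is made up to exercise the changes, and $parts is a hypothetical name. Note that {1,} means the same as +.

```php
<?php
$pat = '~"([0-9]*)": { d:"([^"]*)", p:"([0-9]{1,}\.[0-9]{2})", q:"([0-9]{1,})" }~';

$data = implode("\n", [
    '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },',
    '"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },',
    '"000001": { d:"", p:"120.50", q:"12" },', // hypothetical line, not from the real file
]);

// PREG_SET_ORDER gives one array per matched line.
preg_match_all($pat, $data, $rows, PREG_SET_ORDER);

$parts = [];
foreach ($rows as $row) {
    // $row[1] = part number, $row[2] = description, $row[3] = price
    $parts[] = [$row[1], $row[2], $row[3]];
}

print_r($parts);
```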
Rottingham Posted December 27, 2007 (Author)

Thanks a ton! That did the trick. I really appreciate your help.
dsaba Posted December 27, 2007

*Edited last post.

Glad to help. Return the favor on the forums.
This topic is now archived and is closed to further replies.