
[SOLVED] EREG HELP!


Rottingham


OK, this is my first attempt at using regex. I've been coding for years and can't believe I survived this long without it, but the day has come that I must accept it. So I'm having a problem: I am trying to parse 6000 lines in the following style:

 

"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },

 

My goal is to extract the first 6 digits, everything within the quotes after d:, and everything within the quotes after p:.

 

From this line, I'm hoping to get an array like this:

 

array[0] = 792036

array[1] = Holmes Heavy Duty Slide Bolt

array[2] = 9.11

 

My regex statement of:

ereg("\"[1-9]{6}\"", $line, $regs);

 

gives me some encouraging results, but frustrating ones too...

 

Warning: Invalid argument supplied for foreach() in /home/macactio/public_html/topsoftweb/test.php on line 48
"101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" }, 

Warning: Invalid argument supplied for foreach() in /home/macactio/public_html/topsoftweb/test.php on line 48
"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" }, 
"271766" "271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" }, 

 

You will notice that my first line gets a warning, the second line gets a warning, but the third line actually gets the number, albeit with the " signs too. I need to remove the \"...\" from my regex. I can't figure out a) why it works on some of the lines and not others, and b) how to get the rest of the parts I need. I think I have an idea, but I'm going to show my simple function to see if someone can tell me why it works sometimes and not on every line:

 

foreach($file_lines as $line)
{
unset($regs);
// Interpret Line
// "792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },

ereg("\"[1-9]{6}\"", $line, $regs); 

foreach($regs as $reg)
	echo $reg." ";

echo $line; echo "<br>"; 
}       


Here's a preg (PCRE) solution; I hear it's faster than ereg (POSIX).

You can use preg_match_all() to grab all the matches in one pass:

 

~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~

 

tested:

http://nancywalshee03.freehostia.com/regextester/regex_tester.php?seeSaved=yyvefddg

 

 

I also noticed in your error report above that one of your lines of data does not follow the format you specified:

"271766" "271766": { d:"BRIG 691259 Key Blank, High Security", p:"45.00", q:"0" }
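
As for why the original ereg() call matched only some lines: the class [1-9] excludes the digit 0, so any ID containing a zero (such as "101303" or "270625") never matches, $regs stays unset, and foreach() raises the warning. The leading "271766" in the output above may also simply be the echoed match followed by the line itself, rather than a malformed data line. A minimal sketch of the difference follows; extractId() is an illustrative helper, not code from the thread, and it uses preg_match() rather than ereg().

```php
<?php
// Returns the six-digit ID without its quotes, or null when nothing matches.
// extractId() is a hypothetical helper used only for this illustration.
function extractId(string $line, string $digitClass): ?string
{
    // The parentheses capture only the digits, so the quotes are left out.
    $pattern = '~"(' . $digitClass . '{6})"~';
    if (preg_match($pattern, $line, $m) === 1) {
        return $m[1];
    }
    return null;
}

$line = '"101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" },';

var_dump(extractId($line, '[1-9]')); // no match: the ID contains zeros
var_dump(extractId($line, '[0-9]')); // captures the ID without quotes
```

Checking the return value before touching the matches array also avoids the foreach() warning on lines that do not match.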


Thanks man, looks like a little more success... I changed my code to the following:

 

foreach($file_lines as $line)
{
// Interpret Line
// "792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },

// With PREG_SET_ORDER, $regs[0][1] is the ID, $regs[0][2] the description and $regs[0][3] the price.
//ereg("[0-9]{6}", $line, $regs); 
preg_match_all('~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~', $line, $regs, PREG_SET_ORDER);

echo $regs[0][1].' ';

echo $line; echo "<br>"; 
}    

 

You can see the results here:

http://macaction.org/topsoftweb/test.php

 

Unfortunately, it still works on some lines but not the others.
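
One way to see which lines are being skipped is to collect every line the pattern fails on. The sketch below uses the pattern from the post above on two sample lines from this thread; findUnmatched() is a hypothetical helper name. Under these assumptions the second line is reported, because its price has two digits before the decimal point while the pattern allows only one.

```php
<?php
// Returns the lines that the pattern does not match.
// findUnmatched() is a hypothetical helper used only for this illustration.
function findUnmatched(string $pattern, array $lines): array
{
    $unmatched = [];
    foreach ($lines as $line) {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        if (preg_match($pattern, $line) !== 1) {
            $unmatched[] = $line;
        }
    }
    return $unmatched;
}

$pattern = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';
$lines = [
    '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },',
    '"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },',
];

foreach (findUnmatched($pattern, $lines) as $line) {
    echo "No match: $line\n";
}
```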


You need to show the input data you're working with. The regex I supplied worked fine with what you showed; I can't see what the problem is without seeing the input data.

 

Instead of going through each line of the input data:

foreach($file_lines as $line)

 

read the whole file at once:

<?php
// Read the whole file into one string instead of looping line by line.
$data = file_get_contents('whatever.txt');
$pat = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';
// Default flag (PREG_PATTERN_ORDER): $out[N] holds every match of group N.
preg_match_all($pat, $data, $out);
foreach ($out[0] as $k => $fullMatch) {
    $num = $out[1][$k];
    $d = $out[2][$k];
    $p = $out[3][$k];
    $q = $out[4][$k];
    echo "$num<br>$d<br>$p<br>$q<br><br>";
}
?>

 

The matches array shown on my tester site is exactly the array that preg_match_all() puts in $out when no special flags are set.
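
To make the difference between the two array layouts concrete, here is a small sketch that runs the same pattern with the default flag and with PREG_SET_ORDER. The two data lines are samples from this thread; the variable names $byGroup and $byRecord are only illustrative.

```php
<?php
$data = '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },' . "\n"
      . '"101303": { d:"4015 NS 13 Key Blank", p:"0.00", q:"0" },';
$pat = '~"([0-9]{6})": { d:"([^"]+)", p:"([0-9]\.[0-9]{2})", q:"([0-9])" }~';

// Default (PREG_PATTERN_ORDER): first index is the group, second the record.
preg_match_all($pat, $data, $byGroup);

// PREG_SET_ORDER: first index is the record, second the group.
preg_match_all($pat, $data, $byRecord, PREG_SET_ORDER);

echo $byGroup[1][1], "\n";  // ID of the second record
echo $byRecord[1][1], "\n"; // the same ID, addressed record-first
```

With the default layout, looping over $out[0] and indexing the other groups by key (as in the code above) works well; with PREG_SET_ORDER each element is already one complete record.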


This works:

~"([0-9]*)": { d:"([^"]*)", p:"([0-9]{1,}\.[0-9]{2})", q:"([0-9]{1,})" }~

 

It was failing because you had varying formats in your data.

 

Changed + to * because the d:"..." value could be blank.

 

Changed p to accept one or more digits before the decimal point ({1,}).

 

Changed the others accordingly: the ID is now [0-9]* and q is [0-9]{1,}.
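
As a rough sketch of how the relaxed pattern could produce the array asked for in the first post (ID, description, price), the example below wraps it in a hypothetical parseCatalog() function. The first two data lines come from this thread; the third, with a blank description, is invented purely to exercise the * change.

```php
<?php
// Returns one [id, description, price] triple per matching record.
// parseCatalog() is a hypothetical helper used only for this illustration.
function parseCatalog(string $data): array
{
    $pat = '~"([0-9]*)": { d:"([^"]*)", p:"([0-9]{1,}\.[0-9]{2})", q:"([0-9]{1,})" }~';
    preg_match_all($pat, $data, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        // $m[1] = ID, $m[2] = description, $m[3] = price ($m[4] = quantity, unused).
        $rows[] = [$m[1], $m[2], $m[3]];
    }
    return $rows;
}

$data = implode("\n", [
    '"792036": { d:"Holmes Heavy Duty Slide Bolt", p:"9.11", q:"1" },',
    '"270625": { d:"Dor-O-Matic SCREW.1022 Dog Screws (25 Pack)", p:"64.00", q:"0" },',
    '"999999": { d:"", p:"1.00", q:"12" },', // invented line: blank description
]);

print_r(parseCatalog($data));
```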

 

Here is a reference to simple regex symbols/terms: http://www.regular-expressions.info/reference.html
