
Capture parenthesis containing specific word


ChenXiu


Desired:
Use preg_match to capture only the parentheses containing the word "required."

Example:
$string = 'Large toolbox (metal) for sale (hammer is required) serious inquiries only.';
Desired: (hammer is required)
$string = 'Meeting scheduled for Tuesday (Formal attire required) otherwise call (or email) us.';
Desired: (Formal attire required)

I would be happy to share the three days' worth of experiments I tried, but posting them would not be pretty and, most importantly, would show my IQ.

Thank you in advance!

 


Can you write something that captures the word "required" through to the closing parenthesis?

What do you know about the input $string? Does it always contain the word? Is the word always going to be inside parentheses? What if it shows up multiple times?


The input $string will only have one parenthesis with the word "required" in it (it will never show up more than once).
The very best I could come up with is the following disgustingly obtuse code:
\(((?![\(]).)*required\)
I will not use this code because:
1.) it makes 2 capture groups, which is stupid.
2.) it is disgustingly obtuse.
There must be an elegant one liner a pro, such as yourself, would use 😀


12 minutes ago, ChenXiu said:

1.) it makes 2 capture groups, which is stupid.

It's not stupid.

12 minutes ago, ChenXiu said:

2.) it is disgustingly obtuse.

Not by much.

My version looks almost the same: opening parenthesis, capturing group, some ungreedy amount of stuff that isn't an opening or closing parenthesis (you used an assertion instead), the word "required", and the closing parenthesis.
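The reply describes its pattern without posting it, so the following is only a sketch of what that description could look like in PHP. The exact pattern and the variable names are assumptions, not the poster's verbatim regex.

```php
<?php
// Assumed pattern, reconstructed from the description above:
//   \(                  literal opening parenthesis
//   ([^()]*?required)   group 1: lazily match non-parenthesis characters, then "required"
//   \)                  literal closing parenthesis
$pattern = '/\(([^()]*?required)\)/';

$string = 'Large toolbox (metal) for sale (hammer is required) serious inquiries only.';

if (preg_match($pattern, $string, $m)) {
    echo $m[0], PHP_EOL; // whole match, parentheses included
    echo $m[1], PHP_EOL; // group 1, the text inside the parentheses
}
```

Because the character class rejects both "(" and ")", the attempt that starts at "(metal" fails at its closing parenthesis and the engine moves on to the next "(".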

If you absolutely can't stand having a $1 group, swap the opening and closing parentheses for assertions. But this will make it look more "obtuse".
If you care about squeezing every bit of performance out of this regex then we can keep going, but with short input strings I'd need a pretty darn good reason to put in the effort.
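The assertion-based variant mentioned above might look like the sketch below. The pattern is an assumption based on the description; note that lookarounds do not consume characters, so the match itself excludes the parentheses.

```php
<?php
// Assumed pattern: a lookbehind and a lookahead stand in for the literal
// parentheses, so there is no capture group at all.
//   (?<=\()   the previous character must be "("
//   (?=\))    the next character must be ")"
$pattern = '/(?<=\()[^()]*?required(?=\))/';

$string = 'Meeting scheduled for Tuesday (Formal attire required) otherwise call (or email) us.';

if (preg_match($pattern, $string, $m)) {
    echo $m[0], PHP_EOL; // the text between the parentheses
}
```

If the parentheses should be part of the result, dropping the group from the plain pattern and reading the whole match (`$m[0]`) from `/\([^()]*?required\)/` achieves that without any assertions.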


Alternative sans-regex solution...

Input (test.txt)

Large toolbox (metal) for sale (hammer is required) serious inquiries only.
All employees are required to attend.
Meeting scheduled for Tuesday (Formal attire required) otherwise call (or email) us.

Code

$data = file('test.txt', FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES);

foreach ($data as $line) {
    if ($p = parens_and_req($line)) {
        $line = str_replace($p, "<span style='color:red;'>$p</span>", $line );
    }
    echo "<p>$line</p>";
}

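function parens_and_req($str)
{
    $p2 = 0;
    // Walk through each "(" and check the text up to the next ")".
    while (( $p1 = strpos($str, '(', $p2)) !== false ) {
        $p2 = strpos($str, ')', $p1);
        if ($p2 === false) {
            // Unclosed parenthesis: test the rest of the line, then stop.
            $parens = substr($str, $p1);
            return strpos($parens, 'required') !== false ? $parens : false;
        }
        $parens = substr($str, $p1, $p2-$p1+1);
        if (strpos($parens, 'required') !== false) {
            return $parens;
        }
    }
    return false;
}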

Output

[Screenshot: the three input lines rendered as paragraphs, with "(hammer is required)" and "(Formal attire required)" shown in red.]


1 hour ago, Barand said:

Alternative sans-regex solution...

Thank you! That was my next step -- a string function! Especially since string-style functions are much faster than preg_match.
It's funny, I had been thinking "just this one time I'll use preg_match in my coding to save time," and now it's days later, 3 days wasted (I thought preg_match "capture this group as long as it doesn't contain" would be easy, but all the answers I have found are as convoluted as a furniture assembly manual haha.)

Thank you again!!
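Whether string functions really beat preg_match depends on the PHP version, the input, and the hardware, so it is worth measuring. The sketch below is one rough way to time both approaches; the function names, the run count, and the regex are assumptions made for illustration, and no particular result is implied.

```php
<?php
// Hypothetical stand-in for a strpos-based search.
function find_with_strpos(string $str)
{
    $offset = 0;
    while (($open = strpos($str, '(', $offset)) !== false) {
        $close = strpos($str, ')', $open);
        if ($close === false) {
            return false; // unclosed parenthesis: give up
        }
        $group = substr($str, $open, $close - $open + 1);
        if (strpos($group, 'required') !== false) {
            return $group;
        }
        $offset = $close + 1; // continue after this group
    }
    return false;
}

// Hypothetical regex counterpart returning the whole match.
function find_with_regex(string $str)
{
    return preg_match('/\([^()]*required[^()]*\)/', $str, $m) ? $m[0] : false;
}

$string = 'Large toolbox (metal) for sale (hammer is required) serious inquiries only.';
$runs = 100000; // arbitrary; raise or lower to taste

foreach (['find_with_strpos', 'find_with_regex'] as $fn) {
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $result = $fn($string);
    }
    printf("%-18s %.4f s  %s\n", $fn, microtime(true) - $start, $result);
}
```

On short strings like these, either approach is likely to be fast enough that readability matters more than the difference.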


On 3/8/2022 at 4:43 AM, ChenXiu said:

Thank you! That was my next step -- a string function! Especially since string-style functions are much faster than preg_match.
It's funny, I had been thinking "just this one time I'll use preg_match in my coding to save time," and now it's days later, 3 days wasted (I thought preg_match "capture this group as long as it doesn't contain" would be easy, but all the answers I have found are as convoluted as a furniture assembly manual haha.)

Regex can do amazing things, but it will never be simple and elegant.

Here's a relatively simple regex solution to your question, with a single capture group. I can't speak to its performance relative to Barand's solution, or to whether there are issues with the input that weren't clear to me.

(\([^()]*required[^()]*\))
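A minimal sketch of running a single-group pattern through preg_match. The helper name `capture_required` is hypothetical, and this version excludes both "(" and ")" inside the group so that a match cannot run from one parenthesised group into the next.

```php
<?php
// Hypothetical helper: returns the parenthesised text containing "required",
// or null when the line has no such group.
function capture_required(string $line): ?string
{
    if (preg_match('/(\([^()]*required[^()]*\))/', $line, $m)) {
        return $m[1]; // the single capture group, parentheses included
    }
    return null;
}

$lines = [
    'Large toolbox (metal) for sale (hammer is required) serious inquiries only.',
    'All employees are required to attend.',
    'Meeting scheduled for Tuesday (Formal attire required) otherwise call (or email) us.',
];

foreach ($lines as $line) {
    var_dump(capture_required($line));
}
```

The second line contains "required" outside any parentheses, so the helper returns null for it.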

 

