
My preg_match isn't working..


physaux


Prefix:

<p class="g"><font size="-2"><b></b></font> <a href="

Pattern:

^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/ix

Suffix:

">

 

How can I make that into a preg_match() expression? I can't get it to work!

I have tried:

preg_match($prefix."(".$pattern.")".$suffix, $result);

 

Should that work? Or any suggestions?


Use preg_quote() on the prefix and suffix strings so that any characters in them with a special meaning in a regex get escaped. Also, your pattern (the entire first argument to preg_match()) needs delimiters at the start and end; the most common choice is a forward slash (/). The string containing the "pattern" probably does not need ^ or $, and almost certainly doesn't need the /ix in it (modifiers belong at the very end of the whole expression, after the closing delimiter).

 

Assuming your prefix, expression and suffix are correct, you'd probably want something like this:

 

Pattern (just removed bits from the start/end):

(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?

 

Code:

$regex = '/' . preg_quote($prefix) . '(' . $pattern . ')' . preg_quote($suffix) . '/ix';
echo $regex; // Just to see what it looks like
preg_match($regex, $subject, $match);

 

Hopefully that'll help a bit, if not just ask.

 

 


Yeah, it's not working. I tried echoing and printing the prefix and suffix, both preg_quoted and not, as well as the regex.

 

They all give me random combinations of "/\\\/\/\/\"

 

What is going on? Maybe the browser is interpreting the output as HTML. Is there any way for me to print it so that it shows only the string instead of rendering it?
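One way to check this is to escape the string before echoing it, so the browser shows the markup as text instead of rendering it. A minimal sketch, assuming the output is viewed in a browser; $prefix here is just the prefix from the first post, and $visible is a name made up for the example:

```php
<?php
// The prefix from the first post, used here as sample input.
$prefix = '<p class="g"><font size="-2"><b></b></font> <a href="';

// htmlspecialchars() turns <, >, & and quotes into HTML entities,
// so the browser displays the characters rather than interpreting them.
$visible = htmlspecialchars($prefix, ENT_QUOTES);

// <pre> keeps the spacing exactly as it is in the string.
echo '<pre>' . $visible . '</pre>';
```

Viewing the page source in the browser, or running the script from the command line, should also show the raw string.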


I can't edit anymore, but I wanted to add this:

Could it be relevant that the "content" is (I think) the source code (or is it the rendered output?) of a URL? Here is the code that generates it:

 

<?php
$ch = curl_init();
$timeout = 5;
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);	
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout);
$data = curl_exec($ch);
curl_close($ch);
?>

 

So my prefix and suffix are based on what surrounded my target when I manually viewed the source at the target URL.


That would be the source code you're retrieving. If you 'trust' the URL you're grabbing, you could simply do

 

<?php
//$data holds the source code of the remote page
preg_match('~<p class="g"><font size="-2"><b></b></font> <a href="([^"]*)">~i', $data, $match);
echo $match[1];
?>

 

This assumes the prefix and suffix actually match the source code.

 

Otherwise, if you want to keep your URL pattern, try this modified version of the pattern you provided (the original had some errors and room for improvement):

 

<?php
preg_match('~<p class="g"><font size="-2"><b></b></font> <a href="(https?://[a-z0-9]+(?:[-.][a-z0-9]+)*\.[a-z]{2,6}(?::[0-9]{1,5})?(?:/.*?)?)">~is', $data, $match);
echo $match[1];
?>
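Both snippets read $match[1] without checking whether preg_match() found anything, which gives a notice or warning when the prefix or suffix differs from the fetched source. A rough sketch of a guarded version of the simpler pattern; the $data string is a made-up sample standing in for the cURL result, and $found and $url are names invented for the example:

```php
<?php
// Hypothetical sample standing in for the page source fetched with cURL.
$data = '<p class="g"><font size="-2"><b></b></font> '
      . '<a href="http://example.com/">Example</a>';

$regex = '~<p class="g"><font size="-2"><b></b></font> <a href="([^"]*)">~i';

// preg_match() returns 1 on a match, 0 on no match and false on error.
$found = preg_match($regex, $data, $match);
$url = ($found === 1) ? $match[1] : null;

if ($url !== null) {
    echo $url;
} else {
    echo 'No match: compare the prefix and suffix with the fetched source.';
}
```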

 

@salathe

You forgot to add the delimiter as the second parameter to preg_quote().
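To spell that out: preg_quote() only escapes regex special characters, and / is not one of them, so the / in </b> and </font> would end the pattern early unless the delimiter is passed as the second argument. A sketch of the corrected version, assuming a made-up $subject string. I have also dropped the x modifier here, since in extended mode unescaped whitespace in the pattern is ignored and preg_quote() does not escape the spaces in the prefix:

```php
<?php
// Prefix, pattern and suffix as given earlier in the thread.
$prefix  = '<p class="g"><font size="-2"><b></b></font> <a href="';
$pattern = '(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?';
$suffix  = '">';

// Hypothetical subject standing in for the fetched page source.
$subject = '<p class="g"><font size="-2"><b></b></font> '
         . '<a href="http://example.com/page">Example</a>';

// The second argument tells preg_quote() to escape the delimiter as well.
$regex = '/' . preg_quote($prefix, '/')
       . '(' . $pattern . ')'
       . preg_quote($suffix, '/') . '/i';

if (preg_match($regex, $subject, $match) === 1) {
    echo $match[1]; // group 1 is the outer group wrapped around $pattern
}
```

Note that the .* in the pattern is greedy, so on a page with several links on one line it can run past the first closing quote.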

