
[SOLVED] Re: Another regex problem. [SPLIT]


ShadeSlayer


Alright, thanks for the help there. I don't think it'd be very useful in starting a new thread, so I'm just gonna put my next problem here.

 

Here's my updated code now:

<?php
include("common.php");
$info['tid'] = 29382;
echo "<a href=\"".$_SERVER['PHP_SELF']."?action=go\">Run Query</a>";

if($_POST['action'] == "go"){
$sql = "SELECT threadid, body FROM ".TABLE_POSTS." WHERE threadid = ".$info['tid']." ORDER BY threadid ASC";
if ($exe = runQuery($sql)){
  while ($row = fetchResultArray($exe)){
  $f[0] = "[b]"; $r[0] = "<b>"; 
  $f[1] = "[/b]"; $r[1] = "</b>";
  $f[2] = "[adminbr]"; $r[2] = "<br />";
  $f[3] = "[i]"; $r[3] = "<i>";
  $f[4] = "[/i]"; $r[4] = "</i>";
  $f[5] = "[bq]"; $r[5] = "<blockquote>";
  $f[6] = "[/bq]"; $r[6] = "</blockquote>";
  $f[7] = "\n"; $r[7] = "<br />";
  $f[8] = "\\\""; $r[8] = "\"";
  $f[9] = "\'"; $r[9] = "'";
  
  preg_match('#^<b>(.+?)</b>#', $row['body'], $match);
  $i['title'] = $match[1];
  $i['body'] = str_replace($f, $r, $row['body']);
  $sql = "INSERT INTO ".TABLE_ACH." (date, title, content) VALUES (".date("U").", ".$i['title'].", ".$i['content'].")";
  runQuery($sql) or die("Query won't run");
  echo "Query complete.";
  }
}
}
?>

 

When I click on the "run query" link, it just displays a blank page. Doesn't do anything. I'm guessing it's just a silly error on my part, but I can't decipher it right now.


Alright, thanks for the help there. I don't think it'd be very useful in starting a new thread, so I'm just gonna put my next problem here.

 

Actually, it's probably better to create a new thread for a new problem and flag your current one as resolved if it is. Posting problems in new threads completely separates issues (and places them in different / appropriate forums if need be - as in this case, looks to be a mysql problem?) while resolved threads let people know which problems have been resolved. (I'll ask an admin about this particular case.. see what can be done here).

 

As for the new issue, try troubleshooting by examining the contents of your variables, and check your tables to see whether they now contain what you expect. Learning to troubleshoot is a good habit to get into, as opposed to relying strictly on forum members to do it for you.
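
One concrete way to examine the variables: a plain link such as ?action=go sends a GET request, so PHP puts the value in $_GET and leaves $_POST empty. Below is a minimal sketch of that check. The request arrays are filled in by hand so it runs outside a web server, and the helper name actionFrom is made up for illustration; it is not part of the original script.

```php
<?php
// Simulated request data for a click on "?action=go" (a GET request).
// In a real page these would be the $_GET and $_POST superglobals.
$get  = ['action' => 'go'];
$post = [];

// Hypothetical helper: returns the requested action, or null if absent.
function actionFrom(array $request): ?string
{
    return isset($request['action']) ? (string) $request['action'] : null;
}

// Dump both arrays to see where the value actually arrived.
var_dump($get, $post);

var_dump(actionFrom($post) === 'go'); // bool(false): the check the script makes
var_dump(actionFrom($get) === 'go');  // bool(true): the check that fits a plain link
```

If the dump shows the value only in the GET array, the if block guarding the query never runs, which would be consistent with a page that shows nothing beyond the link.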


I just found another problem with the regex.

 

It seems that it only extracts the title from bodies that already contain line breaks.

 

eg. This works:

<b>Call of Duty 4</b>
<br />
<br /><b>General Achievements</b>
<br /><blockquote><b>Make the Jump</b> - <i>20 points</i>
<br />Infiltrate a cargo ship. 
....

 

but not this:

<b>Lego Star Wars: The Complete Saga</b><br /><br /><b>General Achievements</b><br /><blockquote><b>The Phantom Menace</b> - <i>20 points</i><br />Finish Episode I in Story mode. <br /><br /><b>Attack of the Clones</b> - <i>20 points</i><br />Finish Episode II in Story mode.
....

 

Is it relying on a line break to successfully extract a title?


In that second example, is the <b>Lego Star Wars: The Complete Saga</b> at the very beginning of the string (in other words, is there nothing before the <b>)? I ask because it works for me:

 

$str = '<b>Lego Star Wars: The Complete Saga</b><br /><br /><b>General Achievements</b><br /><blockquote><b>The Phantom Menace</b> - <i>20 points</i><br />Finish Episode I in Story mode. <br /><br /><b>Attack of the Clones</b> - <i>20 points</i><br />Finish Episode II in Story mode.
....';
preg_match('#^<b>(.+?)</b>#', $str, $match);
echo $match[1]; // outputs: Lego Star Wars: The Complete Saga

 

EDIT - In the event that there is something (a space, for example) before the <b>, you could try this regex pattern:

 

#^.*?<b>(.+?)</b>#

 

Would that help?
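
To make the difference between the two patterns concrete, here is a small sketch. The variable names and the leading space are assumptions for illustration; the markup is shortened from the example above.

```php
<?php
// Same markup twice; the second copy has a stray leading space.
$clean  = '<b>Lego Star Wars: The Complete Saga</b><br /><br /><b>General Achievements</b>';
$padded = ' ' . $clean;

$anchored = '#^<b>(.+?)</b>#';     // <b> must be the very first characters
$relaxed  = '#^.*?<b>(.+?)</b>#';  // allows anything (except newlines) before <b>

var_dump(preg_match($anchored, $clean, $m1));  // int(1)
var_dump(preg_match($anchored, $padded, $m2)); // int(0): the space blocks ^<b>
var_dump(preg_match($relaxed, $padded, $m3));  // int(1)
echo $m3[1], "\n";                             // Lego Star Wars: The Complete Saga
```

Running var_dump on the real body value is a quick way to spot this kind of thing, since it prints the string length along with the contents.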


Both of those examples are the very start of each line, the only difference is the Call of Duty one (the one that works) has each piece of info on a new line while the Star Wars one (doesn't work) is everything on one line (I think the code tag is just wrapping the text).


Whether it is all on a single line or not shouldn't matter (unless the title itself is split across lines, which goes back to the s modifier I mentioned, but odds are you won't need that). So this example will also work:

 

$str = '<b>Lego Star Wars: The Complete Saga</b>
<br /><br /><b>General Achievements</b><br />
<blockquote><b>The Phantom Menace</b> - <i>20 points</i><br />Finish Episode I in Story mode. <br /><br /><b>Attack of the Clones</b> - <i>20 points</i><br />Finish Episode II in Story mode.
....';
preg_match('#^<b>(.+?)</b>#', $str, $match);
echo $match[1]; // outputs: Lego Star Wars: The Complete Saga

 

Notice how this one is now on different lines too. The key to the pattern is that it looks at the beginning of the string, finds <b>Lego Star Wars: The Complete Saga</b>, and uses that.
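
A short sketch of the one case where line breaks do matter, namely a break inside the title itself. The second test string is made up for illustration; by default the dot does not match a newline, and the s modifier changes that.

```php
<?php
// Line breaks after the title do not matter...
$multiLine = "<b>Lego Star Wars: The Complete Saga</b>\n<br /><br /><b>General Achievements</b><br />";
// ...but a line break inside the title does, because . skips newlines by default.
$splitTitle = "<b>Lego Star Wars:\nThe Complete Saga</b><br />";

var_dump(preg_match('#^<b>(.+?)</b>#', $multiLine, $a));   // int(1)
var_dump(preg_match('#^<b>(.+?)</b>#', $splitTitle, $b));  // int(0)
var_dump(preg_match('#^<b>(.+?)</b>#s', $splitTitle, $c)); // int(1) with the s modifier
```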

 

If, however, there is something (anything at all) at the start of the string in front of the <b>, then perhaps try the snippet I included as an EDIT in my previous post:

 

preg_match('#^.*?<b>(.+?)</b>#', $str, $match);

 

If this doesn't work, do you get any kind of errors / warnings?

 

 

 

 


Come to think of it, since preg_match stops matching after the first instance of what it finds with the given pattern, I suppose the pattern could be simplified to:

 

preg_match('#<b>(.+?)</b>#', $str, $match);

(There is no ^ to signify the beginning of the string: since the first set of <b>...</b> within the string contains the title, we don't really need it. My bad.)
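
As a quick illustration of the "stops after the first instance" behaviour, here is a sketch that sets preg_match next to preg_match_all on a shortened copy of the example string. It assumes, as above, that the title is always the first bold run in the body.

```php
<?php
$str = '<b>Lego Star Wars: The Complete Saga</b><br /><br /><b>General Achievements</b><br />';

// preg_match stops at the first match, so no ^ anchor is needed here.
preg_match('#<b>(.+?)</b>#', $str, $match);
echo $match[1], "\n"; // Lego Star Wars: The Complete Saga

// preg_match_all, by contrast, collects every bold run in the string.
preg_match_all('#<b>(.+?)</b>#', $str, $all);
print_r($all[1]); // two entries: the title, then "General Achievements"
```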


I did a bit of troubleshooting, and I remembered that originally, some of the body values are inputted as bbcode (which is fixed in my str_replace).

 

But, even with doing this:

 preg_match('#([b]|<b>)(.+?)([/b]|</b>)#', $row['body'], $m);
$i['title'] = $m[1];

 

It still doesn't work. Not sure why the ([b]|<b>) doesn't work... All it outputs is "b" as the first title, and nothing anywhere else.


The problem with ([b]|<b>) is that the first part is being treated as a character class. You want to literally find [ then b then ], so the [ and ] characters (the brackets that form a character class) need to be escaped to represent their literal versions instead:

 

So perhaps

preg_match('#(\[b\]|<b>)(.+?)(\[/b\]|</b>)#', $row['body'], $m);

?

 

With regards to variables, I would recommend giving your variables more meaning.. $m, as temporary as it is, isn't terribly descriptive. It's not the end of the world by any stretch, but never hurts to give variables meaningful names.

 

Good to see you troubleshooting ;) Keep it up!

 

EDIT - Since you use alternations (...|...), your desired results should be stored under $m[2] (as brackets capture and store what they find into variables, in this case $m[1], $m[2], etc.. ) So the second set of brackets is what you are after.
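
Here is a small side-by-side sketch of the unescaped and escaped patterns, which also shows which index holds what. The sample body is made up from the bbcode tags mentioned earlier in the thread.

```php
<?php
$body = '[b]Call of Duty 4[/b][adminbr]';

// Unescaped: [b] is a character class matching the single letter "b",
// and [/b] is a class matching either "/" or "b".
preg_match('#([b]|<b>)(.+?)([/b]|</b>)#', $body, $broken);
print_r($broken); // $broken[1] is just "b"

// Escaped: \[b\] matches the literal three characters "[b]".
preg_match('#(\[b\]|<b>)(.+?)(\[/b\]|</b>)#', $body, $fixed);
print_r($fixed);  // $fixed[1] is "[b]", $fixed[2] is "Call of Duty 4"
```

That lone "b" in the first result is the same "b" that was being echoed as the title.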


Mentally slow day.. when the alternations don't need to capture anything, we can simply use the non-capturing (?:...|...) format instead:

preg_match('#(?:\[b\]|<b>)(.+?)(?:\[/b\]|</b>)#', $row['body'], $m);
echo $m[1];

 

So we only use one set of capturing brackets (the middle set) and as a result, you can once again resort to using $m[1] instead...

As a side note, the current pattern will technically also match stuff like [b]title</b>. For all intents and purposes it should be fine, as I wouldn't expect you to run into such craziness (at least, hopefully not!)
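
If the mixed-tag case ever did become a concern, one possible tightening (only a sketch, not something the thread needs) is PCRE's branch reset group (?|...), which pairs each opening tag with its own closing tag while still leaving the title in $m[1]. The variable names here are made up for illustration.

```php
<?php
$loose  = '#(?:\[b\]|<b>)(.+?)(?:\[/b\]|</b>)#';
// Branch reset (?|...) lets both alternatives share capture group 1,
// while each opening tag is paired with its own closing tag.
$strict = '#(?|\[b\](.+?)\[/b\]|<b>(.+?)</b>)#';

$mixed = '[b]title</b>';
var_dump(preg_match($loose, $mixed));  // int(1): mixed tags slip through
var_dump(preg_match($strict, $mixed)); // int(0)

foreach (['[b]Call of Duty 4[/b]', '<b>Lego Star Wars: The Complete Saga</b>'] as $body) {
    preg_match($strict, $body, $m);
    echo $m[1], "\n"; // the title, whichever tag style was used
}
```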


Alright, rock on. It works perfectly!

 

I just decided I'd change things up a bit and have each entry output an "INSERT INTO (...." statement instead of automatically uploading to a DB, because I'm lazy.

 

So I suppose this is solved. Thanks a bunch!
