
Another regex problem.


ShadeSlayer


Basically what I'm trying to do is create a script that will extract a "title" and insert it into a database (amongst other things).

 

Here is the general idea of what my inputs will look like (the values will vary):

 

<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>

 

As you can see there, the start (<b>2006 FIFA World Cup</b>) is the title, and what I want to do is extract it out of there, dropping the bold tags as well.

 

So it would look like this in the end:

 

$i['title']:

2006 FIFA World Cup

 

$i['body']:

<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>

 

I don't know exactly what I'd do to extract the title. The title will always be the first thing there, though.

 

Here's my code so far:

<?php
include("common.php");
$info['tid'] = 29382;

// BBCode-style markers ($f) and their HTML replacements ($r), built once outside the loop.
$f = array("[b]", "[/b]", "[adminbr]", "[i]", "[/i]", "[bq]", "[/bq]", "\n", "\\\"", "\'");
$r = array("<b>", "</b>", "<br />", "<i>", "</i>", "<blockquote>", "</blockquote>", "<br />", "\"", "'");

$sql = "SELECT threadid, body FROM ".TABLE_POSTS." WHERE threadid = ".$info['tid']." ORDER BY threadid ASC";
if ($exe = runQuery($sql)){
    while ($row = fetchResultArray($exe)){
        $i['title'] = ''; // ...?! This is the part I need help with.
        $i['body'] = str_replace($f, $r, $row['body']);

        // String values must be quoted and escaped. addslashes() is only a stand-in here;
        // use the escaping function that matches the database layer behind runQuery().
        $sql = "INSERT INTO ".TABLE_ACH." (date, title, content) VALUES (".date("U").", '".addslashes($i['title'])."', '".addslashes($i['body'])."')";
        runQuery($sql);
    }
}

?>

 

So as you can see, most of it's already done; I just need to extract the title.


Assuming that a) the title is always the first thing in the string, and b) it is always wrapped in bold tags:

 

$i['body'] = '<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>
';
preg_match('#^<b>(.+?)</b>#', $i['body'], $match);
$i['title'] = $match[1]; // as what Mchl did...

 

I didn't bother adding the s or i modifiers (s = DOTALL, so the dot also matches newlines; i = case-insensitive), but if need be, you can add either or both after the closing # delimiter.
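For illustration, here is a rough sketch of the same pattern with both modifiers added and a check for the case where nothing matches. The helper name extractTitle() and the trim() call are my own additions rather than anything from the posts above, so treat this as one possible way to wrap it, not the definitive one.

```php
<?php
// Hypothetical helper (name is an assumption): returns the leading bold
// title without its tags, or null when the string does not start with <b>...</b>.
function extractTitle($body)
{
    // ^    anchors the match at the very start of the string
    // .+?  is lazy, so it stops at the first closing </b>
    // s    lets "." match newlines too; i makes <B> match as well as <b>
    if (preg_match('#^<b>(.+?)</b>#si', $body, $match)) {
        return trim($match[1]);
    }
    return null; // no leading bold title found
}

$title = extractTitle('<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b>');
echo $title === null ? 'no title found' : $title, "\n"; // prints: 2006 FIFA World Cup
```

Checking the return value of preg_match() before reading $match[1] avoids an undefined-index notice on rows that don't follow the expected layout.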
