
Another regex problem.


ShadeSlayer


Basically what I'm trying to do is create a script that will extract a "title" and insert it into a database (amongst other things).

 

So here is the general idea of what my inputs will look like, values will vary:

 

<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>

 

As you can see there, the start (<b>2006 FIFA World Cup</b>) is the title, and what I want to do is extract it out of there, dropping the bold tags as well.

 

So it would look like this in the end:

 

$i['title']:

2006 FIFA World Cup

 

$i['body']:

<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>

 

I don't know exactly what I'd do to extract the title. The title will always be the first thing there, though.

 

Here's my code so far:

<?php
include("common.php");
$info['tid'] = 29382;

$sql = "SELECT threadid, body FROM ".TABLE_POSTS." WHERE threadid = ".$info['tid']." ORDER BY threadid ASC";
if ($exe = runQuery($sql)){
while ($row = fetchResultArray($exe)){
$f[0] = "[b]";
$r[0] = "<b>"; 
$f[1] = "[/b]";
$r[1] = "</b>";
$f[2] = "[adminbr]";
$r[2] = "<br />";
$f[3] = "[i]";
$r[3] = "<i>";
$f[4] = "[/i]";
$r[4] = "</i>";
$f[5] = "[bq]";
$r[5] = "<blockquote>";
$f[6] = "[/bq]";
$r[6] = "</blockquote>";
$f[7] = "\n";
$r[7] = "<br />";
$f[8] = "\\\"";
$r[8] = "\"";
$f[9] = "\'";
$r[9] = "'";

$i['title'] = ...?!
$i['body'] = str_replace($f, $r, $row['body']);
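// String values need quotes (and should be escaped for the database); the converted text is in $i['body'], not $i['content'].
$sql = "INSERT INTO ".TABLE_ACH." (date, title, content) VALUES (".date("U").", '".$i['title']."', '".$i['body']."')";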
runQuery($sql);
}
}

?>

 

So as you see, most of it's already done, I just need to extract a title.


Assuming that a) the title is always the first thing in the string, and b) it is always enclosed in bold tags:

 

$i['body'] = '<b>2006 FIFA World Cup</b><br /><br /><b>General Achievements</b><br /><blockquote><br /><br /><b>Qualify for the World Cup</b> - <i>50 points</i><br />Qualify for the World Cup in 2006 FIFA World Cup mode. <br /><br /><b>Beat the Host Nation</b> - <i>50 points</i><br />Defeat Germany in a full match. <br /><br /><b>Complete a Scenario</b> - <i>150 points</i><br />Complete any challenge in Global Challenge mode. <br /><br /><b>Complete all Scenarios</b> - <i>500 points</i><br />Complete all the challenges in Global Challenge mode. <br /><br /><b>Win the World Cup</b> - <i>250 points</i><br />Win the World Cup in 2006 FIFA World Cup mode. </blockquote><br /><br /><b>Secret Achievements</b><br /><blockquote>This game has no Secret Achievements.</blockquote>
';
// Only set the title when the string really starts with <b>...</b>.
if (preg_match('#^<b>(.+?)</b>#', $i['body'], $match)) {
    $i['title'] = $match[1]; // as what Mchl did...
}

 

I didn't bother adding the s or i modifiers (s = DOTALL, so the dot also matches newlines; i = case-insensitive), but if need be, you can simply add either or both after the closing # delimiter.
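
For illustration, here is a rough sketch of the same pattern with both modifiers added, wrapped in a small function. The name extractTitle is made up for this example (it is not from the thread), and the nullable return type assumes PHP 7.1 or later. It returns null when the string does not start with a bold tag, so the caller can decide what to do in that case.

```php
<?php
// Hypothetical helper, not part of the original script: returns the text of
// the leading <b>...</b> pair, or null if the string does not start with one.
function extractTitle(string $body): ?string
{
    // s: the dot also matches newlines; i: <B> is accepted as well as <b>.
    // The lazy (.+?) stops at the first closing tag, so only the title is captured.
    if (preg_match('#^<b>(.+?)</b>#si', $body, $match)) {
        return $match[1];
    }
    return null;
}

// Example input modelled on the post above; uppercase tags to exercise the i modifier.
$body  = '<B>2006 FIFA World Cup</B><br /><br /><b>General Achievements</b><br />';
$title = extractTitle($body);
```

If the title might be preceded by whitespace, something like ltrim($body) before matching would be one way to cope, but that goes beyond the two assumptions stated above.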
