
extract <div> block


Epimetheus1980

I pull content from a database. Some of the texts contain a footnote, which is wrapped in a div with a class. Here is the HTML:

<div class=""footnotes"">
<br>
<hr align=""left"" noshade=""noshade"" size=""1"" width=""150"" />
      <blockquote> <a href=""#f1"" name=""fn1""> <span class=""superscript""> * </span>

        </a>Some footnotetext</blockquote></div>

Since my bibliography also uses data from the database, the footnote now appears before the references, but I would like it at the very bottom of the page. I was thinking of using preg_replace to split the text field into two variables, one for the text itself and one for the footnote (there is always just one), and then output the footnote after the bibliography is compiled.

 

Unfortunately, preg_replace does not seem to work: the whole content of the text field is always displayed. Here's the PHP:

$text = preg_replace('/(.*)(\div class=\"footnotes\"\>.*?\<\/div\>)/s', '$1', $result['text']);
$footnote = preg_replace('/(.*)(\div class=\"footnotes\"\>.+?\<\/div\>)/s', '$2', $result['text']);

echo '<div align="justify"><span style="font-family:Georgia;font-size:16px;">' . $text;

***BIBLIOGRAPHY***

...

echo $footnote; 

Maybe someone has an idea how to deal with this. I tried it on phpliveregex, and there the search string works fine. Many thanks for any help.


Thank you! It seems to work, although some special characters are displayed incorrectly. However, I would have preferred to extract the footnote div by regex, because I would like to display the actual content of the article first, then the bibliography, and finally the footnote, which appears only occasionally. That was the reason why I wanted to split the content field.

 

 


Are you storing the content and the associated footnotes as HTML? You should be storing the article content and the footnotes as separate data in the same records. Then handle the output format in the code. Then there is no need to "extract" data from formatted content.

 

Rough example

 

<?php
 
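// Assumes $db is an open database connection (e.g. PDO or mysqli)
$query = "SELECT id, article, footnote FROM articles";
$result = $db->query($query);
 
// Initialize the output buffers before appending to them
$articles = '';
$footnotes = '';
 
foreach($result as $row)
{
    //Append article output
    $articles .= "<li>{$row['article']}</li>\n";
    //Append footnote output
    $footnotes .= "<blockquote>
                       <a href=\"#f1\" name=\"fn1\"> <span class=\"superscript\"> * </span></a>
                       {$row['footnote']}
                   </blockquote>\n";
}
 
//Run query to get references and put into $references variable
$references = ''; // placeholder until the references query is added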
 
 
 
//Output the generated content
 
?>
 
<div class="articles">
  <?php echo $articles; ?>
</div>
 
<div class="references">
  <?php echo $references; ?>
</div>
 
<div class="footnotes">
  <?php echo $footnotes; ?>
</div>
Edited by Psycho

Thanks. Yes, I think I need to do that. I'm just wondering why regex does not work. 

 

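Because your regex doesn't match the content format. You show doubled double quotes in the content, <div class=""footnotes"">, but in the regex you are using backslash double quotes: '/(.*)(\div class=\"footnotes\"\>.*?\<\/div\>)/s'. Also, as posted, the pattern starts with \div rather than <div; \d matches a digit, so that part cannot match the opening tag either.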
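
To make the mismatch concrete, here is a minimal sketch that runs the posted pattern against content with doubled quotes. The sample string and the variable names are made up for illustration, and the adjusted pattern is only one possible way to mirror the content.

```php
<?php
// Sample content with the doubled quotes shown in the question (hypothetical data).
$content = 'Article text <div class=""footnotes"">Some footnotetext</div>';

// Pattern as posted: \d is the digit class, and it expects single quotes around the class.
$posted = '/(.*)(\div class=\"footnotes\"\>.*?\<\/div\>)/s';

// Adjusted pattern: a literal <div, and "+ to accept one or more quote characters.
$adjusted = '/(.*)(<div class="+footnotes"+>.*?<\/div>)/s';

$postedHits   = preg_match($posted, $content);
$adjustedHits = preg_match($adjusted, $content);

// When nothing matches, preg_replace returns the subject unchanged,
// which is why the whole text field was displayed.
$unchanged = preg_replace($posted, '$1', $content);

var_dump($postedHits, $adjustedHits, $unchanged === $content);
```

The posted pattern finds no match while the adjusted one does, and the preg_replace call hands back the original string untouched.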

 

There's no reason to backslash the double quote in the regex since you are delimiting the string with single quotes. You are backslashing a lot of things that don't need it. Plus, preg_match would be a better solution than preg_replace in this scenario.

<?php
 
$input = 'Some beginning content <div class=""footnotes"">
<br>
<hr align=""left"" noshade=""noshade"" size=""1"" width=""150"" />
      <blockquote> <a href=""#f1"" name=""fn1""> <span class=""superscript""> * </span>
 
        </a>Some footnotetext</blockquote></div> some ending content';
 
preg_match('/(.*)<div class=""footnotes"">/s', $input, $match);
$text = $match[1];
preg_match('/(<div class=""footnotes"">.*?<\/div>)/s', $input, $match);
$footnote = $match[1];
 
echo " Text: " . $text;
echo "Footnote: " . $footnote;
 
?>

Output:

Text: Some beginning content 
 
 
Footnote: <div class=""footnotes"">
<br>
<hr align=""left"" noshade=""noshade"" size=""1"" width=""150"" />
      <blockquote> <a href=""#f1"" name=""fn1""> <span class=""superscript""> * </span>
 
        </a>Some footnotetext</blockquote></div>
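
Since the footnote only appears in some records, the two preg_match calls above would leave $match empty for the others. A rough sketch of one way to handle both cases follows. The function name splitFootnote is hypothetical, and it assumes at most one footnotes div with no nested divs inside it.

```php
<?php
/**
 * Split a content field into body text and an optional footnote block.
 * Hypothetical helper; assumes one footnotes div at most, without nested divs.
 */
function splitFootnote(string $content): array
{
    // "+ accepts both class="footnotes" and the doubled form class=""footnotes"".
    $pattern = '/<div class="+footnotes"+>.*?<\/div>/s';

    if (preg_match($pattern, $content, $match, PREG_OFFSET_CAPTURE)) {
        [$footnote, $offset] = $match[0];
        // Keep everything except the footnote block, including any trailing text.
        $text = substr($content, 0, $offset)
              . substr($content, $offset + strlen($footnote));
        return ['text' => $text, 'footnote' => $footnote];
    }

    // No footnote in this record: return the content untouched.
    return ['text' => $content, 'footnote' => ''];
}

$parts = splitFootnote('Intro <div class=""footnotes""><blockquote>Note</blockquote></div> outro');
echo $parts['text'], "\n";      // body, to print before the bibliography
echo $parts['footnote'], "\n";  // footnote, to print after the bibliography
```

Returning an empty footnote string means the calling code can echo it after the bibliography without checking first.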
Edited by Psycho

Hello guys! A similar question: how do I split an HTML page at a tag, dropping the tag together with its content?

 

<p>Some text with tags #1</p>

<div style="position: absolute;">

Some content (with tags)

</div>

<p>Some text with tags #2</p>

 

Desired result:

 

Array ( [0] => <p>Some text with tags #1</p>, [1] => <p>Some text with tags #2</p>)

 

Would preg_split() be the right function for this?
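
For what it's worth, a minimal sketch of how preg_split() could do this, assuming the div blocks are never nested (nested markup would be better served by a DOM parser such as DOMDocument). The variable names are illustrative only.

```php
<?php
$html = "<p>Some text with tags #1</p>\n"
      . "<div style=\"position: absolute;\">\nSome content (with tags)\n</div>\n"
      . "<p>Some text with tags #2</p>";

// Split on each <div ...>...</div> block, swallowing the whitespace around it.
// Assumption: divs are not nested, so the lazy .*? stops at the right </div>.
$parts = preg_split('/\s*<div\b[^>]*>.*?<\/div>\s*/s', $html, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

PREG_SPLIT_NO_EMPTY drops empty pieces, which would otherwise appear if the page started or ended with a div.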

