LordLanky Posted August 25, 2009

Hi all,

I am trying to write a bit of PHP that will split a document written in HTML into chapters. An example doc is:

    <h1>The Work of an Idiot</h2>
    <p>Edited by A Total Moron</p>
    <h2>Chapter 1</h2>
    <p>Here is some random text</p>
    <h2>Chapter 2 - The Wrath of Khan's Mum</h2>
    <p>Here is some more random text</p>
    <h2>Chapter 3</h2>
    <p>Again.. i can ramble for ages</p>

What I need is to split it into an array (or a number of variables), with each chunk being a chapter. For example, an array called strChapters containing:

    strChapters[0][text] => "<h1>The Work of an Idiot</h2><p>Edited by A Total Moron</p>"
    strChapters[0][title] => ""
    strChapters[1][text] => "<h2>Chapter 1</h2><p>Here is some random text</p>"
    strChapters[1][title] => "Chapter 1"
    strChapters[2][text] => "<h2>Chapter 2 - The Wrath of Khan's Mum</h2><p>Here is some more random text</p>"
    strChapters[2][title] => "Chapter 2 - The Wrath of Khan's Mum"
    strChapters[3][text] => "<h2>Chapter 3</h2><p>Again.. i can ramble for ages</p>"
    strChapters[3][title] => "Chapter 3"

My guess is that I need a robust regular expression that takes into account the fact that a chapter heading can contain a number and a title. I also need the title on its own. I'm fairly good at PHP now, but this is beyond my experience. I was thinking of exploding on the word "chapter", but I don't want it to split if it's just a word in a sentence, e.g. "as mentioned in chapter 2, Khan's not going to get any pocket money this month".

Any help is really appreciated!
JonnoTheDev Posted August 25, 2009

You have not closed your H1 tag correctly! Try this helper function.

    <?php
    // Return every substring of $string that runs from $openTag to the
    // nearest $closeTag. With $excluding = true, the tags are left out.
    function parseArray($string, $openTag, $closeTag, $excluding = false) {
        // Modifiers: s lets "." match newlines, i ignores case, U makes ".*" lazy.
        // The outer parentheses are the pattern delimiters, not a capture group.
        // The tags are inserted as raw regex fragments, so they are not escaped.
        preg_match_all("($openTag(.*)$closeTag)siU", $string, $matches);
        if ($excluding) {
            return $matches[1];
        }
        return $matches[0];
    }

    $string = "<h1>The Work of an Idiot</h1>
    <p>Edited by A Total Moron</p>
    <h2>Chapter 1</h2>
    <p>Here is some random text</p>
    <h2>Chapter 2 - The Wrath of Khan's Mum</h2>
    <p>Here is some more random text</p>
    <h2>Chapter 3</h2>
    <p>Again.. i can ramble for ages</p>";

    // Front matter first (<h1> up to the first </p>), then each chapter
    // (<h2> up to the first </p> that follows it).
    $array = parseArray($string, "<h1>", "</p>");
    $array = array_merge($array, parseArray($string, "<h2>", "</p>"));

    print "<xmp>";
    print_r($array);
    print "</xmp>";
    ?>
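The function above returns each chunk only up to its first closing </p> and does not return the title separately. As a rough sketch of the text/title array the original post asks for, one option is to split just before every <h2> tag and then read the heading out of each chunk. This is not from the thread: the function name splitChapters is made up for illustration, it assumes PHP 7 or later, and it assumes chapters always start with an <h2> heading that contains no nested <h2>.

```php
<?php
// Hypothetical helper (not from the thread): split an HTML string into
// chapters, treating every <h2> heading as the start of a new chapter.
function splitChapters(string $html): array
{
    // Split at the position just before each <h2 ...> tag. The lookahead
    // consumes nothing, so the heading stays inside the chunk it introduces.
    $chunks = preg_split('~(?=<h2\b)~i', $html, -1, PREG_SPLIT_NO_EMPTY);

    $chapters = [];
    foreach ($chunks as $chunk) {
        $title = '';
        // If the chunk opens with an <h2>, capture the heading text.
        if (preg_match('~^<h2\b[^>]*>(.*?)</h2>~is', $chunk, $m)) {
            $title = trim(strip_tags($m[1]));
        }
        $chapters[] = ['text' => trim($chunk), 'title' => $title];
    }
    return $chapters;
}

// Usage with the example document (with the <h1> closed properly).
$doc = "<h1>The Work of an Idiot</h1> <p>Edited by A Total Moron</p> "
     . "<h2>Chapter 1</h2> <p>Here is some random text</p> "
     . "<h2>Chapter 2 - The Wrath of Khan's Mum</h2> <p>Here is some more random text</p> "
     . "<h2>Chapter 3</h2> <p>Again.. i can ramble for ages</p>";

$strChapters = splitChapters($doc);
print_r($strChapters);
```

Because the split is keyed on the <h2> tag rather than on the word "chapter", a sentence such as "as mentioned in chapter 2" inside a paragraph does not start a new chunk. Anything before the first <h2> (the <h1> and the editor line) ends up in element 0 with an empty title.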