LordLanky
Posted August 25, 2009

Hi all,

I am trying to write a bit of PHP that will split a document written in HTML into chapters. An example document is:

<h1>The Work of an Idiot</h2>
<p>Edited by A Total Moron</p>
<h2>Chapter 1</h2>
<p>Here is some random text</p>
<h2>Chapter 2 - The Wrath of Khan's Mum</h2>
<p>Here is some more random text</p>
<h2>Chapter 3</h2>
<p>Again.. i can ramble for ages</p>

What I need is to split it into an array, or a number of variables, with each chunk being a chapter. For example, an array called strChapters containing:

strChapters[0][text] => "<h1>The Work of an Idiot</h2><p>Edited by A Total Moron</p>"
strChapters[0][title] => ""
strChapters[1][text] => "<h2>Chapter 1</h2><p>Here is some random text</p>"
strChapters[1][title] => "Chapter 1"
strChapters[2][text] => "<h2>Chapter 2 - The Wrath of Khan's Mum</h2><p>Here is some more random text</p>"
strChapters[2][title] => "Chapter 2 - The Wrath of Khan's Mum"
strChapters[3][text] => "<h2>Chapter 3</h2><p>Again.. i can ramble for ages</p>"
strChapters[3][title] => "Chapter 3"

My guess is that I need a robust regular expression that accounts for the fact that a chapter heading can contain both a number and a title. I also need the title on its own.

I'm fairly good at PHP now, but this is beyond my experience. I was thinking of exploding on the word "chapter", but I don't want it to split when it's just a word in a sentence, e.g. "as mentioned in chapter 2, Khan's not going to get any pocket money this month".

Any help is really appreciated!
JonnoTheDev
Posted August 25, 2009

You have not closed your H1 tag correctly! Try this helper function:

<?php
// Return every span of $string that runs from $openTag to the nearest $closeTag.
// With $excluding = true, only the text between the two tags is returned.
function parseArray($string, $openTag, $closeTag, $excluding = false) {
    // preg_quote() escapes regex metacharacters in the tags, plus the "~" delimiter.
    // Modifiers: s = dot matches newlines, i = case-insensitive, U = ungreedy.
    $pattern = '~' . preg_quote($openTag, '~') . '(.*)' . preg_quote($closeTag, '~') . '~siU';
    preg_match_all($pattern, $string, $matches);
    if ($excluding) {
        return $matches[1];
    }
    return $matches[0];
}

$string = "<h1>The Work of an Idiot</h1>
<p>Edited by A Total Moron</p>
<h2>Chapter 1</h2>
<p>Here is some random text</p>
<h2>Chapter 2 - The Wrath of Khan's Mum</h2>
<p>Here is some more random text</p>
<h2>Chapter 3</h2>
<p>Again.. i can ramble for ages</p>";

// Each match stops at the first </p> after its heading.
$array = parseArray($string, "<h1>", "</p>");
$array = array_merge($array, parseArray($string, "<h2>", "</p>"));

// Escape the markup so the browser shows the tags instead of rendering them.
print "<pre>" . htmlspecialchars(print_r($array, true)) . "</pre>";
?>
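The function above returns each heading together with the first paragraph that follows it, but not the separate title the question asks for. Below is a minimal sketch of one way to get both, assuming every chapter starts at an <h2> heading. The helper name splitChapters is hypothetical and does not come from the thread. It splits just before each <h2> with a lookahead, then reads the title out of the chunk's leading heading.

```php
<?php
// Hypothetical helper (not from the thread): split an HTML string into
// chunks that each begin at an <h2> heading, and pull out each title.
function splitChapters($html)
{
    // The lookahead (?=<h2\b) matches the position just before each <h2>
    // without consuming it, so the heading stays in its own chunk. The word
    // "chapter" inside a paragraph never triggers a split.
    $chunks = preg_split('~(?=<h2\b)~i', $html, -1, PREG_SPLIT_NO_EMPTY);

    $chapters = array();
    foreach ($chunks as $chunk) {
        $title = '';
        // Only chunks that start with an <h2> get a title.
        if (preg_match('~^<h2\b[^>]*>(.*?)</h2>~is', $chunk, $m)) {
            $title = trim(strip_tags($m[1]));
        }
        $chapters[] = array('text' => trim($chunk), 'title' => $title);
    }
    return $chapters;
}

// Sample input based on the question, with the <h1> closed correctly.
$html = "<h1>The Work of an Idiot</h1>
<p>Edited by A Total Moron</p>
<h2>Chapter 1</h2>
<p>Here is some random text</p>
<h2>Chapter 2 - The Wrath of Khan's Mum</h2>
<p>Here is some more random text</p>
<h2>Chapter 3</h2>
<p>As mentioned in chapter 2, Khan's not going to get any pocket money this month</p>";

$chapters = splitChapters($html);
print_r($chapters);
```

Anything before the first <h2> lands in index 0 with an empty title, which matches the layout in the question. If the document starts directly with an <h2>, index 0 is the first chapter instead. Regular expressions only cope with fairly regular markup; for messier HTML, PHP's DOMDocument class is likely the sturdier choice.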