rdrews Posted December 9, 2009 Share Posted December 9, 2009 Ok, here is a sample of the string I need broken up... ******Begin Sample********* <div class="PubSectionHeader"><font size="+0">Bill Smith 48415126</font></div> <br> <a name="NDR4pDQoQv5rq1MQk"></a> <div class="PubNote"> <div class="PubNoteContentArea">Called Customer and blah blah blah blah. abc. 10/12/09<blockquote class="gn_c"> She doesn't want to be contacted on this number but said okay. 10/12/09</blockquote></div> </div> <a name="NDRLWDAoQ2qq687wk"></a> <div class="PubNote"> <div class="PubNoteContentArea">Spoke to customer on alternate number. She said she blah blah blah blah blah. I told her as long as we receive it within a week, no problems. abc 9/18/09</div> </div> <a name="NDRykDAoQv__VvaUk"></a> <div class="PubNote"> <div class="PubNoteContentArea">Left message on premise about issues. abc 7/7/09</div> </div> <a name="NDQrlDAoQxbL1_54k"></a> <div class="PubNote"> <div class="PubNoteContentArea">this is another part of the comment that I need. I think I am going about things the wrong way. abc 6/17/09<blockquote class="gn_c"> She said she'll call tech support. abc 6/17/09</blockquote></div> </div> <a name="NDQopDQoQ18mwt5wk"></a> <div class="PubNote"> <div class="PubNoteContentArea">Called customer about issue. Left message on alternate number. abc 6/9/09 </div> </div> <a name="NDQduDQoQoNiBhZMk"></a> <div class="PubNote"> <div class="PubNoteContentArea">call was returned '. abc 5/11/09<blockquote class="gn_c"> Customer said she doesn't want to be transferred to tech support (even though she said her system doesn't work). She asked why we're so hard to get a hold of. I let her know she can call the number to contact us. abc 5/11/09<br>Removed from list. abc 5/11/09</blockquote></div> </div> <a name="NDQ7QIgoQ9aWRsJgj"></a> <div class="PubNote"> <div class="PubNoteContentArea">Big problems. 
abc </div> </div> <a name="SDUThIgoQrLX5r5gj"></a> <div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div> <br> <a name="NDSOkDAoQ5-TduIgk"></a> <div class="PubNote"> <div class="PubNoteContentArea"><span>update per user, told the customer if he gets two remotes we can wave fee . bill 04/08/09 2:00 pm </span><div> </div></div> </div> <a name="NDQmeDQoQjdS-uIgk"></a> <div class="PubNote"> <div class="PubNoteContentArea">yada yada yada yada . <div>bill 04/08/09</div></div> </div> <a name="NDSGpIgoQ56b7r5gj"></a> <div class="PubNote"> <div class="PubNoteContentArea">another note here. abc </div> </div> <a name="SDQqRIwoQyIDmr5gj"></a> *********End Sample************** Ok...all of this is HTML source from Google Docs that I saved into a large .txt file. Basically, I need all of this broken into two parts. Account numbers and comments. All the account numbers are found right after the name which is AFTER the <div class="PubSectionHeader"><font size="+0"> and BEFORE the </font> And all the comments look to be between <div class="PubNoteContentArea"> and </div> but there are usually several separate notes for each account number. So a single account number may have several <div class="PubNoteContentArea"> note areas. Ideally I would like to run through this whole text file and end up with two arrays. One array would be all account numbers (accountNum[0] = "123", accountNum[1] = "456", etc...) and the other array would be all the notes (notes[0] = "notes for account 123", notes[1] = "notes for account 456", etc...) but if it's easier/makes more sense to do one array where the first element would be the account number, the second the notes for the account in the first element, the third, another account number, etc.... then I can work with that too. 
I realize there is some additional formatting inside some of the <div class="PubNoteContentArea"> note areas, like <blockquote class="gn_c"> and maybe some other stuff, but for now I'm not really worried about all that. I can maybe put the whole file into Word or Excel and do a few find/replaces to get rid of some of it.

Up to this point I have loaded the whole file contents into a string and then split the string into a character array where every character (including whitespace) is an element in the array. I then tried to start messing with the regex part of it and, after playing around with it for a while, decided I wasn't getting anywhere. Any help is greatly appreciated. If I didn't explain things very well, feel free to ask me to clarify. Thanks!

Link to comment https://forums.phpfreaks.com/topic/184564-need-help-splitting-string-up/
rdrews Posted December 9, 2009
...I forgot to mention... The above example would be two accounts, if it isn't immediately clear. The first is account number 48415126 and the second is account number 700002.
cags Posted December 9, 2009
You may well be better off using some kind of DOMDocument, but since you posted under PHP Regex...

$pattern = '#<div class="PubSectionHeader"><font size="\+0">([a-z ]+?) ([0-9]+)</font></div>#is';
preg_match_all($pattern, $input, $matches);
echo '<pre>';
print_r($matches);
echo '</pre>';

Only thrown together quickly, but it should more or less work. There are probably other characters you will need to consider for the first character class, for example a dash (-) for double-barrelled names and an apostrophe (') for O'Reilly, etc.
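The suggestion about widening the first character class could look something like the sketch below. The extra characters (hyphen, apostrophe, period) are only an assumed set, and the sample names in $input are made up for illustration; the real file may need more.

```php
<?php
// Hypothetical sample input; $input stands in for the contents of the saved file.
$input = '<div class="PubSectionHeader"><font size="+0">Mary-Ann O\'Reilly 48415126</font></div>'
       . '<div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div>';

// Name part: letters, spaces, apostrophes, periods and hyphens (assumed set).
// The hyphen goes last in the class so it is read literally, and the "i"
// modifier makes [a-z] match upper case as well.
$pattern = '#<div class="PubSectionHeader"><font size="\+0">([a-z \'.-]+?) ([0-9]+)</font></div>#is';

preg_match_all($pattern, $input, $matches);

$names       = $matches[1];   // text before the trailing number
$accountNums = $matches[2];   // the account numbers

print_r($accountNums);
```

If the names turn out to contain anything at all, a looser alternative is to replace the class with [^<]+? so that the name is simply "everything up to the last space before the digits".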
rdrews Posted December 9, 2009
Awesome, thanks! That looks very close to what I am looking for. I will play with it for a bit and come back if I have any more issues.
rdrews Posted December 9, 2009
OK, so with your help I get $matches[2], which holds all the account numbers, so that's step one. I've been working on step two and can't quite get there... apparently I need to be spoon-fed. I'm having trouble figuring out how to get ALL the notes under a particular account number into one element of an array. If I use something like

$pattern = '#<div class="PubNoteContentArea">.</div>#is';
preg_match_all($pattern, $contents, $matches);

(which I haven't gotten working quite yet), won't that just put each note between the div tags into a separate element? How do I tell it to combine all the notes after one account number and before the next account number into one element? I know that cags mentioned possibly using DOMDocument. Should I post this somewhere else, or can this be done using regex? Thanks again for the help!
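One way the grouping could be done with regex alone is to split the file on the section headers first and only then look for notes inside each piece. The sketch below is not from the thread: it assumes every account starts with a PubSectionHeader div as in the sample, uses a shortened made-up $contents, and strips tags with a crude pattern that would misfire if a note contained a literal "<".

```php
<?php
// Hypothetical, shortened stand-in for the file contents ($contents in the post above).
$contents = <<<'HTML'
<div class="PubSectionHeader"><font size="+0">Bill Smith 48415126</font></div> <br>
<a name="a1"></a> <div class="PubNote"> <div class="PubNoteContentArea">Called customer. abc 10/12/09<blockquote class="gn_c"> She said okay. 10/12/09</blockquote></div> </div>
<a name="a2"></a> <div class="PubNote"> <div class="PubNoteContentArea">Left message. abc 7/7/09</div> </div>
<div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div> <br>
<a name="a3"></a> <div class="PubNote"> <div class="PubNoteContentArea">yada yada . <div>bill 04/08/09</div></div> </div>
HTML;

// Split on each section header, keeping the captured account number in the result.
$header = '#<div class="PubSectionHeader"><font size="\+0">[^<]*? ([0-9]+)</font></div>#i';
$parts  = preg_split($header, $contents, -1, PREG_SPLIT_DELIM_CAPTURE);

$accountNum = [];
$notes      = [];

// $parts[0] is whatever precedes the first header; after that the array holds
// pairs of (account number, HTML up to the next header).
for ($i = 1; $i + 1 < count($parts); $i += 2) {
    $cleaned = [];
    foreach (explode('<div class="PubNote">', $parts[$i + 1]) as $chunk) {
        $text = preg_replace('#<[^>]+>#', ' ', $chunk);    // replace tags with spaces
        $text = trim(preg_replace('/\s+/', ' ', $text));   // collapse whitespace
        if ($text !== '') {
            $cleaned[] = $text;
        }
    }
    $accountNum[] = $parts[$i];
    $notes[]      = implode("\n", $cleaned);   // all notes for this account, one per line
}

print_r($accountNum);
print_r($notes);
```

Splitting on the outer PubNote div rather than matching up to a closing </div> sidesteps the nested <div> and <blockquote> tags that appear inside some notes. HTML entities are left undecoded here; html_entity_decode() could be applied to each note if that matters.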
rdrews Posted December 10, 2009
No one? Is there another route I should be taking here?
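The DOMDocument route mentioned earlier in the thread might look roughly like this. It is only a sketch under assumptions: the $html sample is invented and shortened, the class names are taken from the posted sample, and the account number is assumed to be the trailing digits of each header's text.

```php
<?php
// Hypothetical, shortened stand-in for the saved Google Docs HTML.
$html = '<div class="PubSectionHeader"><font size="+0">Bill Smith 48415126</font></div><br>'
      . '<div class="PubNote"><div class="PubNoteContentArea">Left message. abc 7/7/09</div></div>'
      . '<div class="PubNote"><div class="PubNoteContentArea">Called back. abc 6/9/09 </div></div>'
      . '<div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div><br>'
      . '<div class="PubNote"><div class="PubNoteContentArea">yada yada . <div>bill 04/08/09</div></div></div>';

libxml_use_internal_errors(true);   // keep warnings about sloppy markup out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select headers and note bodies in one pass, in document order.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="PubSectionHeader" or @class="PubNoteContentArea"]');

$accountNum = [];   // one account number per section
$notes      = [];   // $notes[$n] is the list of notes for $accountNum[$n]

foreach ($nodes as $node) {
    // textContent includes the text of nested tags such as <blockquote> or <div>.
    $text = trim(preg_replace('/\s+/', ' ', $node->textContent));

    if ($node->getAttribute('class') === 'PubSectionHeader') {
        if (preg_match('/([0-9]+)$/', $text, $m)) {
            $accountNum[] = $m[1];
            $notes[]      = [];
        }
    } elseif ($accountNum && $text !== '') {
        $notes[count($notes) - 1][] = $text;   // attach to the most recent account
    }
}

print_r($accountNum);
print_r($notes);
```

Because the parser handles nesting itself, extra formatting inside a note needs no special treatment. Each $notes entry can be joined with implode() if a single string per account is wanted. Note that loadHTML() assumes ISO-8859-1 unless the markup declares a charset, so non-ASCII text may need an explicit encoding hint.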