rdrews Posted December 10, 2009

I posted this at http://www.phpfreaks.com/forums/index.php/topic,280070.0.html but haven't been getting much assistance (cags actually provided great help, but I'm still lost), so I figure maybe regex isn't the right way to go (cags suggested as much). Here is my problem.

Here is a sample of the string I need broken up:

******Begin Sample*********
<div class="PubSectionHeader"><font size="+0">Bill Smith 48415126</font></div>
<br>
<a name="NDR4pDQoQv5rq1MQk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">Called Customer and blah blah blah blah. abc. 10/12/09<blockquote class="gn_c"> She doesn't want to be contacted on this number but said okay. 10/12/09</blockquote></div>
</div>
<a name="NDRLWDAoQ2qq687wk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">Spoke to customer on alternate number. She said she blah blah blah blah blah. I told her as long as we receive it within a week, no problems. abc 9/18/09</div>
</div>
<a name="NDRykDAoQv__VvaUk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">Left message on premise about issues. abc 7/7/09</div>
</div>
<a name="NDQrlDAoQxbL1_54k"></a>
<div class="PubNote">
<div class="PubNoteContentArea">this is another part of the comment that I need. I think I am going about things the wrong way. abc 6/17/09<blockquote class="gn_c"> She said she'll call tech support. abc 6/17/09</blockquote></div>
</div>
<a name="NDQopDQoQ18mwt5wk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">Called customer about issue. Left message on alternate number. abc 6/9/09 </div>
</div>
<a name="NDQduDQoQoNiBhZMk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">call was returned '. abc 5/11/09<blockquote class="gn_c"> Customer said she doesn't want to be transferred to tech support (even though she said her system doesn't work). She asked why we're so hard to get a hold of. I let her know she can call the number to contact us. abc 5/11/09<br>Removed from list. abc 5/11/09</blockquote></div>
</div>
<a name="NDQ7QIgoQ9aWRsJgj"></a>
<div class="PubNote">
<div class="PubNoteContentArea">Big problems. abc </div>
</div>
<a name="SDUThIgoQrLX5r5gj"></a>
<div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div>
<br>
<a name="NDSOkDAoQ5-TduIgk"></a>
<div class="PubNote">
<div class="PubNoteContentArea"><span>update per user, told the customer if he gets two remotes we can wave fee . bill 04/08/09 2:00 pm </span><div> </div></div>
</div>
<a name="NDQmeDQoQjdS-uIgk"></a>
<div class="PubNote">
<div class="PubNoteContentArea">yada yada yada yada . <div>bill 04/08/09</div></div>
</div>
<a name="NDSGpIgoQ56b7r5gj"></a>
<div class="PubNote">
<div class="PubNoteContentArea">another note here. abc </div>
</div>
<a name="SDQqRIwoQyIDmr5gj"></a>
*********End Sample**************

All of this is HTML source from Google Docs that I saved into a large .txt file. Basically, I need it broken into two parts: account numbers and comments.

Each account number comes right after the name, which sits AFTER <div class="PubSectionHeader"><font size="+0"> and BEFORE </font>.

The comments all appear to sit between <div class="PubNoteContentArea"> and </div>, but there are usually several separate notes for each account number, so a single account number may have several <div class="PubNoteContentArea"> note areas.

Ideally I would like to run through the whole text file and end up with two arrays. One array would hold all the account numbers (accountNum[0] = "123", accountNum[1] = "456", etc.) and the other would hold all the notes (notes[0] = "notes for account 123", notes[1] = "notes for account 456", etc.). If it's easier or makes more sense to build a single array where the first element is an account number, the second is the notes for that account, the third is another account number, and so on, then I can work with that too.

I realize there is some additional formatting inside some of the <div class="PubNoteContentArea"> note areas, like <blockquote class="gn_c"> and maybe some other stuff, but for now I'm not really worried about that. I can probably put the whole file into Word or Excel and do a few find/replaces to get rid of some of it.

Up to this point I have loaded the whole file contents into a string and then split the string into a character array where every character (including whitespace) is an element in the array. I then started messing with the regex part of it and, after playing around for a while, decided I wasn't getting anywhere.

Any help is greatly appreciated. If I didn't explain things very well, feel free to ask me to clarify.

UPDATE: With the code cags provided I get $matches[2], which holds all the account numbers, but I can't figure out how to get all the notes associated with those account numbers. It won't be as straightforward as getting the account numbers (I think) because there may be several notes per account number. If I am going about this the wrong way (with regex), suggestions are more than welcome.

Thanks!
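For what it's worth, here is a rough sketch of one non-regex way to get the two arrays described above. It is written in Python rather than PHP purely for illustration, and it is not the code cags provided. The names NotesParser and parse_notes are made up for this example. It assumes the account number is always the last whitespace-separated token in the PubSectionHeader text, and it counts nested <div> tags so that a note containing its own <div> is not cut short at the first </div>.

```python
from html.parser import HTMLParser


class NotesParser(HTMLParser):
    """Hypothetical helper: collects account numbers and their notes."""

    def __init__(self):
        super().__init__()
        self.account_nums = []  # one entry per PubSectionHeader
        self.notes = []         # notes[i] holds the notes for account_nums[i]
        self._mode = None       # "header", "note", or None when outside both
        self._depth = 0         # <div> nesting depth inside the current block
        self._buf = []          # text pieces collected for the current block

    def handle_starttag(self, tag, attrs):
        if self._mode:
            # Inside a header or note: keep text on either side of a tag
            # (<br>, <blockquote>, ...) apart, and track nested <div>s.
            self._buf.append(" ")
            if tag == "div":
                self._depth += 1
            return
        if tag != "div":
            return
        cls = dict(attrs).get("class")
        if cls == "PubSectionHeader":
            self._mode = "header"
        elif cls == "PubNoteContentArea":
            self._mode = "note"
        if self._mode:
            self._depth, self._buf = 1, []

    def handle_endtag(self, tag):
        if tag != "div" or not self._mode:
            return
        self._depth -= 1
        if self._depth:
            return  # that </div> closed a nested <div>, not the block itself
        text = " ".join("".join(self._buf).split())  # collapse whitespace
        if self._mode == "header":
            if text:
                # Assumption: the account number is the last token.
                self.account_nums.append(text.split()[-1])
                self.notes.append([])
        elif self.notes:
            self.notes[-1].append(text)
        self._mode = None

    def handle_data(self, data):
        if self._mode:
            self._buf.append(data)


def parse_notes(html):
    """Return (account_nums, notes) as two parallel lists."""
    parser = NotesParser()
    parser.feed(html)
    parser.close()
    return parser.account_nums, parser.notes


sample = (
    '<div class="PubSectionHeader"><font size="+0">Mark and Larry 700002</font></div>'
    '<div class="PubNote">'
    '<div class="PubNoteContentArea">yada yada yada yada . <div>bill 04/08/09</div></div>'
    '</div>'
)
account_nums, notes = parse_notes(sample)
print(account_nums)
print(notes)
```

Each notes[i] is a list of the separate notes for account_nums[i]; joining that list into one string would give the "all notes for account 123" form mentioned above.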