adrek Posted November 4, 2010

Hi, I have a page that has a name on it and all the email addresses associated with that name. I am writing a JavaScript script that pulls the name and email addresses from the page to build a contact list, but these pages are not formatted in any particular way. The email addresses are easy to pull, but the names are not. A typical page looks something like this:

Hi, my name is [name]

The only problem is that the name line can also look like

Hi, im [name]

or

Name: [name]

The name is not necessarily on the first line of the page either. Is there a way I can use regular expressions to pull the name from these pages?
gizmola Posted November 4, 2010

Yes, a regex would work great. Do you know about the "or" operator? For example: (Hi, im |Name: )(.*) Try preg_match().
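Since the question is about JavaScript, here is a minimal sketch of the same "or" pattern using String.prototype.match() in place of preg_match(). The sample text and the variable names are made up for illustration; the real page content will differ.

```javascript
// Hypothetical sample text standing in for the scraped page.
const pageText = "Welcome!\nHi, im Jane Doe\nEmail: jane@example.com";

// Alternation: match either prefix, then capture the rest of the line.
// "." does not match newlines in JavaScript, so (.*) stops at the line end.
const namePattern = /(Hi, im |Name: )(.*)/;

const result = pageText.match(namePattern);
// result[1] is the prefix that matched, result[2] is the captured name.
const name = result ? result[2].trim() : null;

console.log(name); // prints "Jane Doe"
```

match() returns null when neither prefix is found, so the result is checked before the capture group is read.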
.josh Posted November 4, 2010

@gizmola: preg_match() is a PHP function; he is asking about JavaScript. But that does bring up the point of... why are you scraping this page with JavaScript? There are 100 better ways to get that data.
Psycho Posted November 4, 2010

If there is no rhyme or reason to how or where the names are displayed, I don't see any solution for you. What you are asking for is something that humans can do easily, but it is very difficult to program and almost impossible to get 100% correct. Possible solutions would include creating an expansive list of possible prefixes to the name (e.g. "Hi, im") or creating an even larger list of common names to search against.
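The prefix-list idea could be sketched roughly like this in JavaScript. The prefix list, the helper names (escapeRegExp, extractName) and the sample strings are all assumptions for illustration, and the list would need to grow as new page layouts turn up.

```javascript
// Hypothetical prefix list; extend it as new page layouts turn up.
const prefixes = ["Hi, my name is", "Hi, im", "Hi, I'm", "Name:"];

// Escape regex metacharacters so each prefix is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Join the prefixes with "|" and capture the rest of the line as the name.
// The "i" flag ignores case; "m" lets ^ and $ match at every line boundary.
const prefixPattern = new RegExp(
  "^[ \\t]*(?:" + prefixes.map(escapeRegExp).join("|") + ")[ \\t]*(.+)$",
  "im"
);

// Returns the first captured name, or null when no prefix matches.
function extractName(pageText) {
  const match = pageText.match(prefixPattern);
  return match ? match[1].trim() : null;
}

console.log(extractName("Welcome\n  name: Bob Smith\nbob@example.com"));
```

This only finds names that follow a known prefix at the start of a line, so it will miss some pages, which fits the "almost impossible to get 100% correct" caveat above.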
gizmola Posted November 4, 2010

Oops, yeah, I glossed over that, although JavaScript has regex as well. You can set up a RegExp object and then call string.match(). There's a bit more to it, but it still comes down to this: a simple regex should work fine.
.josh Posted November 4, 2010

gizmola wrote: "Oops, yeah I glossed over that, although javascript has regex as well. You can setup a regexp object and then you call string.match(). There's a bit more to it, but it still comes down to a simple regex should work fine."

Yeah, I was just nitpicking. You don't even need to set up a JS RegExp object if it's a straight pattern, only if you want to throw a variable into the mix as part of the pattern.
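To illustrate the difference between a straight pattern and one built from a variable, here is a small sketch. The sample text and the "label" variable are hypothetical.

```javascript
const text = "Name: Jane Doe";

// Fixed pattern: a regex literal is enough, no RegExp object needed.
const viaLiteral = text.match(/Name: (.*)/);

// Pattern built from a variable: use the RegExp constructor.
// "label" is a hypothetical variable; it holds no regex metacharacters here,
// otherwise it would need escaping before being spliced into the pattern.
const label = "Name: ";
const viaConstructor = text.match(new RegExp(label + "(.*)"));

console.log(viaLiteral[1], viaConstructor[1]); // prints "Jane Doe Jane Doe"
```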
adrek Posted November 5, 2010

.josh wrote: "But that does bring up the point of...why are you scraping this page with javascript? There are 100 better ways to get that data."

Good question. I don't own the webpage that I will be scraping this data from. It is a Greasemonkey script that will pull the data for me so I can make a contact list from it.

gizmola wrote: "Yes, a regex would work great. Do you know about the "or" operator? For example (Hi, im |Name: )(.*)"

I did not know about that. I will definitely give it a try and let you guys know.

Psycho wrote: "What you are asking for is something that humans can do easily, but is very difficult to program and almost impossible to get 100% correct. Possible solutiosn would include creating an expansive list of possible pre-fixes to the name (e.g. "Hi, im") or creating an even larger list of common names to search against."

This is a possibility. It doesn't need to be 100% accurate, just accurate enough to make it worthwhile. Thanks for all of the replies!
adrek Posted November 6, 2010

The "or" operator worked like a charm. Thanks for all of your help.