clowes Posted May 29, 2010

I would really appreciate some advice here. I have spent countless hours trying to come up with a suitable solution.

Essentially, I have a script which grabs data from various sources. All the sources provide the same data, but in different formats. The main aim of the script is to get the data and display the important parts in a clean, easy-to-read manner.

All the data contains some common attributes, the important stuff: names, addresses, emails, and websites. In addition, some data sources contain completely irrelevant data. The perfect solution would be to grab the important data from each source and simply display it.

I can get emails and websites using regular expressions; however, as far as I am aware, getting names/addresses is impossible.

Some data sources say:

Name: Jack Johnson

Others say:

First Name..... Jack
Last Name..... Johnson

My first question is whether I am correct in believing it is impossible to extract the names/addresses when they are displayed in a variety of different forms.

--------

The approach I am currently taking, although not ideal, is to simply remove the data that I don't want and display the rest. For example, one data source separates things with a series of 10 - characters, so I have just used str_replace to remove these.

The problem with this method is that each source is laid out in a different way. Once I have removed the rubbish, I am left with data which has, for example, varying numbers of line breaks between lines. I cannot simply remove all the line breaks, as that would leave a clump of ugly text, so I have hit another fence.

Does anyone have any suggestions on a suitable way to approach the task outlined above?

Thank you.

Link to comment: https://forums.phpfreaks.com/topic/203286-cleaning-up-data-for-display/
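On the line-break problem described above, one possible approach is to collapse runs of blank lines instead of removing line breaks altogether. The following is only a minimal sketch: the function name tidy_record is hypothetical, and it assumes the dash separators sit on lines of their own and are at least ten dashes long.

```php
<?php
// Hypothetical helper: tidy a raw record for display without flattening it.
function tidy_record(string $raw): string
{
    // Normalise Windows and old Mac line endings to "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $raw);
    // Empty out lines that consist only of ten or more dashes (assumed separators).
    $text = preg_replace('/^[ \t]*-{10,}[ \t]*$/m', '', $text);
    // Strip trailing spaces and tabs from every line.
    $text = preg_replace('/[ \t]+$/m', '', $text);
    // Collapse three or more consecutive newlines into a single blank line.
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    // Remove leading and trailing whitespace from the whole record.
    return trim($text);
}
```

The result still contains line breaks, so it could be passed through htmlspecialchars and then nl2br before printing.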
premiso Posted May 29, 2010

This is most commonly done with preg_match. If you provide examples of the data in its raw format, I am sure someone can help you with a regular expression to extract the data.

Alternatively, if it is RSS/XML, you can try using SimpleXML.
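As a rough illustration of the preg_match suggestion, the sketch below pulls out the first value that follows a given label. The function name first_field is hypothetical, and it assumes the label starts its line and is followed by a colon, a run of dots, or a run of ellipsis characters. It also assumes the input is valid UTF-8, because the pattern uses the u modifier.

```php
<?php
// Hypothetical helper: return the first value that follows $label,
// accepting "label: value" as well as "Label..... value" layouts.
function first_field(string $raw, string $label): ?string
{
    // preg_quote() escapes the label so it is matched literally.
    // [.:\x{2026}]+ accepts a colon, dots or ellipsis characters as the separator.
    $pattern = '/^[ \t]*' . preg_quote($label, '/')
             . '[ \t]*[.:\x{2026}]+[ \t]*(.+)$/mui';

    if (preg_match($pattern, $raw, $m) === 1) {
        return trim($m[1]);
    }

    // No such label (or the input was not valid UTF-8).
    return null;
}
```

A call such as first_field($raw, 'email') returns null when the label is missing, so the caller can fall back to another label.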
clowes (Author) Posted May 29, 2010

Sadly the data is not returned as XML. This is data received from domain whois servers. I have attached examples of 3 raw returned data variables.

In its most simple form I can simply utilize nl2br and then print the output.

Any advice would be greatly appreciated.

Thanks

domain: one.com
owner: n/a
organization: B-one
email: jnj@b-one.net
address: Dubai Internet City
address: Building 9
city: Dubai
postal-code: 500401
country: AE
phone: +45.46907100
admin-c: CCOM-387512 jnj@b-one.net
tech-c: CCOM-387512 jnj@b-one.net
billing-c: CCOM-387512 jnj@b-one.net
nserver: a.b-one-dns.net
nserver: b.b-one-dns.net
status: lock
created: 1992-02-12 00:00:00 UTC
modified: 2008-03-18 09:07:18 UTC
expires: 2015-02-13 05:00:00 UTC
contact-hdl: CCOM-387512
person: n/a
organization: B-one
email: jnj@b-one.net
address: Dubai Internet City
address: Building 9
city: Dubai
postal-code: 500401
country: AE
phone: +45.46907100
source: joker.com live whois service
query-time: 0.013596
db-updated: 2010-05-29 17:30:47
NOTE: By submitting a WHOIS query, you agree to abide by the following
NOTE: terms of use: You agree that you may use this data only for lawful
NOTE: purposes and that under no circumstances will you use this data to:
NOTE: (1) allow, enable, or otherwise support the transmission of mass
NOTE: unsolicited, commercial advertising or solicitations via direct mail,
NOTE: e-mail, telephone, or facsimile; or (2) enable high volume, automated,
NOTE: electronic processes that apply to Joker.com (or its computer systems).

Domain Name………. abcd.com
Creation Date…….. 1995-04-06
Registration Date…. 2009-06-02
Expiry Date………. 2011-04-08
Organisation Name…. Disney Enterprises, Inc.
Organisation Address. 500 S. Buena Vista Street
Organisation Address. 506 Second Ave. Suite 2100
Organisation Address. Burbank
Organisation Address. 91521
Organisation Address. CA
Organisation Address. UNITED STATES
Admin Name……….. Domain Registrar
Admin Address…….. Attn Phil Wahl 500 S Buena Vista Street
Admin Address……..
Admin Address…….. Burbank
Admin Address…….. 91521
Admin Address…….. CA
Admin Address…….. UNITED STATES
Admin Email………. domain.registrar@ONLINE.DISNEY.COM
Admin Phone………. +1.8186233325
Admin Fax………… +1.8186233555
Tech Name………… Domain Registrar
Tech Address……… Attn Phil Wahl 500 S Buena Vista Street
Tech Address………
Tech Address……… Burbank
Tech Address……… 91521
Tech Address……… CA
Tech Address……… UNITED STATES
Tech Email……….. domain.registrar@ONLINE.DISNEY.COM
Tech Phone……….. +1.8186233325
Tech Fax…………. +1.8186233555
Name Server………. sens01.dig.com
Name Server………. sens02.dig.com
Name Server………. orns01.dig.com
Name Server………. orns02.dig.com

Domain Name: FMA.COM
Registrar: MONIKER
Registrant [1690]:
com fma dns-admin@fma.net
Future Media Architects, Inc.
P.O. Box 71 Road Town Tortola 99999 VG
Administrative Contact [1690]:
com fma dns-admin@fma.net
Future Media Architects, Inc.
P.O. Box 71 Road Town Tortola 99999 VG
Phone: +1.2844945870
Fax: +1.2844948586
Billing Contact [1690]:
com fma dns-admin@fma.net
Future Media Architects, Inc.
P.O. Box 71 Road Town Tortola 99999 VG
Phone: +1.2844945870
Fax: +1.2844948586
Technical Contact [1690]:
com fma dns-admin@fma.net
Future Media Architects, Inc.
P.O. Box 71 Road Town Tortola 99999 VG
Phone: +1.2844945870
Fax: +1.2844948586
Domain servers in listed order:
NS1.US.FMA.NET 72.32.55.82
NS2.US.FMA.NET 72.3.153.73
Record created on: 2002-01-18 02:37:00.0
Database last updated on: 2010-04-15 06:24:37.183
Domain Expires on: 2012-01-18 02:37:00.0
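The first two samples put one labelled value on each line, so a single pass with preg_match_all can turn a whole record into an array. This is only a sketch under that assumption: the function name parse_labelled_lines is hypothetical, and the third sample (where contact details follow a heading without labels) would need separate handling, since lines such as "P.O. Box 71" would be misread as a label and a value.

```php
<?php
// Hypothetical parser: collect every "label: value" or "Label..... value" line
// into an array keyed by lower-cased label, keeping repeated labels in order.
function parse_labelled_lines(string $raw): array
{
    $fields = [];

    // Label = text up to the first separator run (colon, dots or U+2026 ellipsis).
    $pattern = '/^[ \t]*([^\r\n:.\x{2026}]+?)[ \t]*[:.\x{2026}]+[ \t]*(.*)$/mu';

    // preg_match_all() returns 0 for no matches and false on error (e.g. invalid UTF-8).
    if (!preg_match_all($pattern, $raw, $matches, PREG_SET_ORDER)) {
        return $fields;
    }

    foreach ($matches as $m) {
        $label = strtolower(trim($m[1]));
        $value = trim($m[2]);
        if ($value === '') {
            continue; // skip rows with no value, such as an empty address line
        }
        $fields[$label][] = $value;
    }

    return $fields;
}
```

Each label maps to a list because labels such as address and nserver repeat within one record.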
premiso Posted May 29, 2010

What fields specifically are you trying to pull out from those three examples?
kenrbnsn Posted May 29, 2010

How are you obtaining the information?
ignace Posted May 29, 2010

"My first question is as to whether I am correct in believing it impossible to extract the names/addresses when they are always displayed in a variety of different forms?"

No, it's not impossible. Check for a first name and a last name first and, as a last resort, check for a plain name field.
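The fallback order suggested above might look something like this. It is a minimal sketch that assumes the record has already been split into an array of lower-cased labels and values; the function name extract_name and the keys 'first name', 'last name' and 'name' are hypothetical.

```php
<?php
// Hypothetical helper: $fields maps lower-cased labels to values,
// e.g. ['first name' => 'Jack', 'last name' => 'Johnson'].
function extract_name(array $fields): ?string
{
    $first = trim($fields['first name'] ?? '');
    $last  = trim($fields['last name'] ?? '');

    // Prefer the split form when at least one part is present.
    if ($first !== '' || $last !== '') {
        return trim($first . ' ' . $last);
    }

    // Last resort: a single combined "name" field.
    $name = trim($fields['name'] ?? '');
    return $name !== '' ? $name : null;
}
```

Both sample layouts from the first post would then give the same result, "Jack Johnson".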
clowes (Author) Posted May 30, 2010

Utilizing the examples above, I would be looking to return the following:

Address: B-one Dubai Internet City Building 9 Dubai 500401 AE +45.46907100
Email: jnj@b-one.net
Nameservers: a.b-one-dns.net b.b-one-dns.net
Creation: 1992-02-12
Updated: 2008-03-18
Expiration: 2015-02-13

Address: Disney Enterprises, Inc. 500 S. Buena Vista Street 506 Second Ave. Suite 2100 Burbank 91521 CA UNITED STATES +1.8186233325
Email: domain.registrar@ONLINE.DISNEY.COM
Name Servers: sens01.dig.com sens02.dig.com orns01.dig.com orns02.dig.com
Creation: 1995-04-06
Updated: 2009-06-02
Expiration: 2011-04-08

Address: com fma Future Media Architects, Inc. P.O. Box 71 Road Town Tortola 99999 VG +1.2844945870
Email: dns-admin@fma.net
Nameservers: NS1.US.FMA.NET NS2.US.FMA.NET
Creation: 2002-01-18
Updated: 2010-04-15
Expiration: 2012-01-18

I can extract the nameservers/email/dates. It is the address/phone details I am having trouble with, as every source provides the data in a different way.

Any advice would be great.

Thanks
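One way to get the same output sections from differently labelled sources is a lookup table that lists, for each section, the source labels that feed it. The sketch below assumes each record has already been parsed into an array of lower-cased labels, each holding a list of values. The function name build_summary and the alias table are hypothetical and would need one entry per label that each whois server actually uses.

```php
<?php
// Hypothetical builder: $fields maps lower-cased labels to lists of values,
// $aliases maps each output section to the source labels that feed it.
function build_summary(array $fields, array $aliases): array
{
    $summary = [];
    foreach ($aliases as $section => $labels) {
        $values = [];
        // Walk the labels in the order given so the output order is predictable.
        foreach ($labels as $label) {
            foreach ($fields[$label] ?? [] as $value) {
                $values[] = $value;
            }
        }
        // Drop repeats (some records list the same contact twice) and reindex.
        $summary[$section] = array_values(array_unique($values));
    }
    return $summary;
}

// Illustrative alias table based on the labels seen in the first two samples.
$aliases = [
    'Address' => ['organization', 'organisation name', 'address',
                  'organisation address', 'city', 'postal-code', 'country',
                  'phone', 'admin phone'],
    'Email'   => ['email', 'admin email'],
];
```

Note that array_unique also removes genuine repeats, for example a city and a region with the same name, so it may be too aggressive for some records.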
clowes (Author) Posted June 2, 2010

Any further suggestions on this one?

Thanks