salihk. Posted June 4, 2011

Hello guys, I am not good at PHP, so I am asking for advice. There is a website that publishes information useful for my project, but it is spread across lots of pages. I want to grab that information and put it in my database. Since the data sits on different pages, I think I need a loop that goes through those pages and pulls out what I need. It is easier to explain with an example: I have to go through about 1000 pages, and on each page I search for a name and store that name and its corresponding data in my database. I am right at the beginning. First of all, I don't know how to search within the pages. How can I grab a page and search it for terms? Thanks
Sanjib Sinha Posted June 4, 2011

You can fetch any web page into your application directly with PHP, but it depends on some conditions (accessibility matters). Next, you can search the fetched page for the information you want, provided the page actually exposes it. Once you have your required information, you can store it in your database and later loop through it to get your desired result.
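A minimal sketch of that fetch, search, and store flow might look like the following. The URL pattern, the regular expression, and the scraped_data table are made-up placeholders for illustration, not details from the actual site:

```php
<?php
// Sketch: loop over pages, fetch each one, look for a name, store the match.
// Assumes allow_url_fopen is enabled and the pages are publicly accessible.
$pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO scraped_data (page_url, name) VALUES (?, ?)');

for ($i = 1; $i <= 1000; $i++) {
    $url  = 'http://example.com/page/' . $i;   // hypothetical URL pattern
    $html = @file_get_contents($url);
    if ($html === false) {
        continue; // skip pages that fail to load
    }

    // Hypothetical pattern: grab the text inside <h2 class="name">...</h2>
    if (preg_match('/<h2 class="name">(.*?)<\/h2>/s', $html, $match)) {
        $stmt->execute(array($url, trim($match[1])));
    }

    sleep(1); // be polite and don't hammer the remote server
}
```

The exact regular expression and database columns will depend entirely on how the target pages are structured.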
Fadion Posted June 4, 2011

The process you're referring to is called scraping (web scraping, screen scraping, whatever). Basically, you read the page's HTML content and parse it to filter out what you need. Depending on the website's structure, it may be difficult to parse and slow as hell (read, parse, insert into db, repeat). From what I know, the most direct way to go is using regular expressions, which are quite confusing even for experienced coders, let alone new ones. A lot easier is using an already-built parser that filters data with only a few lines of code. Take a look at PHP Simple HTML DOM Parser; I have tried it once and it worked nicely. Anyway, tell us what the site is and I'm sure someone will help.

PS: Check if the website has an API. It will make your life a lot easier. Also, if you need to post data to forms (search forms, for example), you may have to take a look at cURL.
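For reference, a rough sketch of the Simple HTML DOM approach mentioned above could look like this. The selectors (div.listing, span.value) are hypothetical placeholders; you would inspect the real page to find the actual element names:

```php
<?php
// Sketch using PHP Simple HTML DOM Parser (simple_html_dom.php must be downloaded separately).
include 'simple_html_dom.php';

$html = file_get_html('http://example.com/page/1'); // hypothetical page
if ($html) {
    // find() takes CSS-like selectors and returns the matching elements
    foreach ($html->find('div.listing') as $row) {
        $name  = $row->find('h2', 0)->plaintext;         // first <h2> inside the row
        $value = $row->find('span.value', 0)->plaintext; // hypothetical data field
        echo trim($name) . ' => ' . trim($value) . "\n";
        // ...insert into the database here instead of echoing
    }
    $html->clear(); // free memory; the parser can be memory-hungry on large pages
}
```

If the data sits behind a search form, cURL with CURLOPT_POST and CURLOPT_POSTFIELDS is the usual way to submit the form and capture the resulting HTML before parsing it the same way.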
salihk. Posted June 4, 2011 (Author)

Thank you, it helps a lot.