Hello guys,

 

I am not good at PHP, so I am asking for advice. There is a website that publishes useful information for my project, but it is spread over lots of pages. I want to grab that information and put it in my database. I think I have to write a loop that searches each of those pages. An example explains it better: say I have to go through 1000 pages, search every page for a name, and store the name and the corresponding data in my database. I am just starting out, and first of all I don't know how to search within the pages. How can I grab a page and search it for terms?

 

Thanks


You can fetch a web page directly from your PHP application, provided the page is accessible to you. You can then search its content, if the site allows that.

Once you have the information you need, you can store it in your database and later loop through it to get the result you want.

The process you're referring to is called scraping (web scraping, screen scraping, whatever). Basically, you read the page's HTML content and parse it to filter out what you need. Depending on the website's structure, it may be difficult to parse and slow as hell (read, parse, insert into DB, repeat).
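To make the read, parse, insert, repeat cycle concrete, here is a minimal sketch of the loop. Everything named here is an assumption for illustration: `scrape_pages()` is a made-up helper, and the fetch, parse and store steps are passed in as callbacks so you can swap in whatever suits the real site and database.

```php
<?php
/**
 * Run the read -> parse -> store cycle over a list of URLs.
 *
 * $fetch: function (string $url): string|false   returns the page HTML
 * $parse: function (string $html): array         returns a list of records
 * $store: function ($record): void               saves one record
 *
 * Returns the number of records stored.
 */
function scrape_pages(array $urls, callable $fetch, callable $parse, callable $store, int $delaySeconds = 0): int
{
    $stored = 0;
    foreach ($urls as $url) {
        $html = $fetch($url);
        if ($html === false || $html === '') {
            continue; // skip pages that could not be read
        }
        foreach ($parse($html) as $record) {
            $store($record);
            $stored++;
        }
        if ($delaySeconds > 0) {
            sleep($delaySeconds); // pause between requests so the site is not hammered
        }
    }
    return $stored;
}

// Example wiring (hypothetical URL pattern, table and parse function):
// $urls  = array_map(function ($n) { return "https://example.com/list?page=$n"; }, range(1, 1000));
// $pdo   = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
// $stmt  = $pdo->prepare('INSERT INTO people (name, details) VALUES (?, ?)');
// $count = scrape_pages($urls, 'file_get_contents', 'my_parse_function',
//     function ($record) use ($stmt) { $stmt->execute($record); }, 1);
```

Using `file_get_contents` as the fetch step only works for URLs when `allow_url_fopen` is enabled on your server. With 1000 pages, a delay of a second or so between requests is a reasonable courtesy to the site.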

 

From what I know, the usual way to go is regular expressions, which are quite confusing even for experienced coders, let alone new ones. A lot easier is an already built parser that filters the data in only a few lines of code. Take a look at PHP Simple HTML DOM Parser. I have tried it once and it worked nicely.
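As a rough idea of what parser-based filtering looks like, here is a sketch that uses PHP's built-in `DOMDocument` and `DOMXPath` classes rather than Simple HTML DOM, only so that it runs without installing anything. The page layout is an assumption: it pretends each record is a table row with the name in the first cell, and `find_rows_by_name()` is a hypothetical helper, so the XPath will need adjusting to the real site.

```php
<?php
/**
 * Return every table row whose first cell matches $name (case-insensitive).
 * Assumed layout: <tr><td>Name</td><td>other data</td>...</tr>
 */
function find_rows_by_name(string $html, string $name): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath   = new DOMXPath($doc);
    $matches = [];
    // Only rows that contain <td> cells; header rows built from <th> are skipped.
    foreach ($xpath->query('//tr[td]') as $row) {
        $cells = [];
        foreach ($xpath->query('td', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== [] && strcasecmp($cells[0], $name) === 0) {
            $matches[] = $cells;
        }
    }
    return $matches;
}
```

Each match comes back as a plain array of cell texts, which is convenient to hand straight to a prepared INSERT statement.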

 

Anyway, tell us what the site is and I'm sure someone will help.

 

PS: Check whether the website has an API. It will make your life a lot easier. Also, if you need to post data to forms (search forms, for example), you may have to take a look at cURL.
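For the form-posting case, a minimal cURL sketch could look like the following. The helper name `post_form()`, the example URL and the field name `q` are all assumptions; check the real form's `action` and input names in the page source before using it.

```php
<?php
/**
 * POST form fields to a URL and return the response body.
 * Throws RuntimeException when the request itself fails.
 */
function post_form(string $url, array $fields, int $timeoutSeconds = 15): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException("Could not initialise cURL for $url");
    }
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encoded form body
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects after the POST
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical usage:
// $html = post_form('https://example.com/search', ['q' => 'John Smith']);
```

The returned HTML can then go through the same parsing step as any other page.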


Thank you, it helps a lot
