Showing results for tags 'crawl'.

cURL Crawling Data

Sucrose posted a topic in PHP Coding Help

I'm using cURL to crawl and scrape data from a website whose pages contain tables with rows of data. When I send a cURL POST for the underlying data of a specific row (A), it returns the expected data. But when I move to the second row (B), the response comes back blank or, more specifically, filled with a ton of spaces (or &nbsp; entities). When I open the same POST location in a browser, I can see (B)'s data. The only difference between the two POSTs is the location ID for the data. I don't think JavaScript is the problem, since I can successfully retrieve the data for row (A), as mentioned.

Website I'm trying to crawl: https://mycpa.cpa.state.tx.us/up/Search.jsp

Working POST URL (A): https://mycpa.cpa.state.tx.us/up/searchresults.do?d-49216-p=&d-49216-s=&how=&last=bales&other=&d-49216-o=&zip=&_chk=74170700611986R2ZZZZ26&which=View+Details

Non-working POST URL (B): https://mycpa.cpa.state.tx.us/up/searchresults.do?d-49216-p=&d-49216-s=&how=&last=bales&other=&d-49216-o=&zip=&_chk=74600015611995R1AC081084&which=View+Details

Interestingly, you can combine the location IDs to show more than one set of data per page. When I try this, the first set of data (A) is displayed and the second (B) is shown as spaces (or &nbsp; entities).

Combined POST URL: https://mycpa.cpa.state.tx.us/up/searchresults.do?d-49216-p=&d-49216-s=&how=&last=bales&other=&d-49216-o=&zip=&_chk=74170700611986R2ZZZZ26&_chk=74600015611995R1AC081084&which=View+Details

September 1, 2014
1 reply
Tagged with:
- php
- curl
- crawl
- data
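
For readers following the cURL question, here is a minimal sketch of how the combined request could be assembled in PHP. The field names and values are taken from the URLs in the post. Everything else is an assumption made for illustration: the helper names build_form_body and post_form are hypothetical, the fields are sent as a POST body rather than a query string, and a cookie file is shared between requests in case the site ties results to a session (the post does not confirm that this is the cause of the blank rows).

```php
<?php
// Build an application/x-www-form-urlencoded body from [name, value] pairs.
// http_build_query() cannot emit the same plain key twice (e.g. two "_chk"
// values), so each pair is encoded by hand.
function build_form_body(array $pairs): string
{
    $parts = [];
    foreach ($pairs as [$name, $value]) {
        $parts[] = urlencode($name) . '=' . urlencode($value);
    }
    return implode('&', $parts);
}

// Hypothetical helper: POST $body to $url, reading and writing cookies in
// $cookieFile. Returns the response body as a string, or false on failure.
function post_form(string $url, string $body, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies to send
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies saved when the handle is released
    ]);
    return curl_exec($ch);
}

// Fields copied from the "Combined POST URL" in the post.
$body = build_form_body([
    ['d-49216-p', ''],
    ['d-49216-s', ''],
    ['how', ''],
    ['last', 'bales'],
    ['other', ''],
    ['d-49216-o', ''],
    ['zip', ''],
    ['_chk', '74170700611986R2ZZZZ26'],
    ['_chk', '74600015611995R1AC081084'],
    ['which', 'View Details'],
]);

// Usage (not executed here, since it needs network access):
// $html = post_form('https://mycpa.cpa.state.tx.us/up/searchresults.do', $body, '/tmp/cookies.txt');
```

Comparing the headers and cookies the browser sends for row (B) with what this request sends is one way to narrow down why the two responses differ.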

Sessions And Crawl

vignesh_phpfreak posted a topic in PHP Coding Help

This may be a very basic question. I would like to know whether data that is displayed only to logged-in (PHP session authenticated) users can be crawled by search engines. For example, there is a page www.domain-name.com/content-listings/ that lists some information for users. Non-registered users see basic information such as name and postal address, and this should be SEO friendly and crawlable. Registered (logged-in) users see sensitive information such as email_id and phone number, which should not be crawlable by search engines. Can this be achieved with sessions alone, or do I need to use JavaScript and AJAX to protect the email ID and phone number from crawlers and spammers?
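
A minimal sketch of the session-only approach described in the question, under stated assumptions: a search engine crawler normally requests the page without logging in, so it receives whatever the anonymous branch outputs. If the sensitive fields are never written into the response for anonymous requests, they are not in the HTML the crawler fetches. The function names, the array keys other than email_id, and the sample data are hypothetical; in a real page the $isLoggedIn flag would come from the session, for example isset($_SESSION['user_id']) after session_start().

```php
<?php
// Return only the fields this visitor is allowed to see.
function visible_fields(array $listing, bool $isLoggedIn): array
{
    $public = [
        'name'           => $listing['name'],
        'postal_address' => $listing['postal_address'],
    ];
    if (!$isLoggedIn) {
        return $public; // anonymous visitors and crawlers stop here
    }
    return $public + [
        'email_id' => $listing['email_id'],
        'phone'    => $listing['phone'],
    ];
}

// Render the permitted fields as escaped HTML paragraphs.
function render_listing(array $listing, bool $isLoggedIn): string
{
    $html = '';
    foreach (visible_fields($listing, $isLoggedIn) as $label => $value) {
        $html .= '<p>' . htmlspecialchars($label) . ': ' . htmlspecialchars($value) . "</p>\n";
    }
    return $html;
}

// Sample data, invented for the example.
$listing = [
    'name'           => 'Jane Doe',
    'postal_address' => '1 Example Street',
    'email_id'       => 'jane@example.com',
    'phone'          => '555-0100',
];

$anonymousHtml = render_listing($listing, false); // what a crawler would receive
$memberHtml    = render_listing($listing, true);  // what a logged-in user would receive
```

The design choice here is to filter on the server before any output is produced, rather than hiding the fields with JavaScript after they have already been sent.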

Need critiques and "indexing-crawling" help

eaglehopes posted a topic in Website Critique

Hello everybody, I need critiques and website crawling help for my website, http://enginery.freecluster.eu. My crawling question is this: I used the Google Search Console tools to add my website's sitemap, http://enginery.freecluster.eu/sitemap.xml. It says my sitemap is OK and that it found 312 pages, but they have not all been crawled correctly. Three weeks have passed and nothing has changed. I manually requested indexing for some pages (about 4) in Google Search Console, and as of today only some of them (not all 4) show up when I search using "site:http://enginery.freecluster.eu". All of my website's files have .php extensions. Does this prevent Googlebot from reaching the content of my website's pages? My robots.txt file's content is:

User-agent: Googlebot
Allow: /

User-agent: *
Allow: /

Sitemap: http://enginery.freecluster.eu/sitemapv1.xml

Any critiques and help are appreciated. Thanks.

June 30, 2022
5 replies
Tagged with:
- crawl
- indexing
- php
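
One detail in the post above can be checked mechanically: the sitemap submitted to Search Console is sitemap.xml, while the quoted robots.txt names sitemapv1.xml. Whether that difference has anything to do with the indexing delay is not established in the post. The sketch below only shows one way to detect such a mismatch; the function name robots_sitemaps is hypothetical, and the robots.txt text is copied from the post rather than fetched.

```php
<?php
// Collect the URLs named on "Sitemap:" lines of a robots.txt body.
// The directive name is matched case-insensitively.
function robots_sitemaps(string $robotsTxt): array
{
    $urls = [];
    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        if (preg_match('/^\s*sitemap\s*:\s*(\S+)/i', $line, $m)) {
            $urls[] = $m[1];
        }
    }
    return $urls;
}

// robots.txt content as quoted in the post.
$robotsTxt = <<<TXT
User-agent: Googlebot
Allow: /

User-agent: *
Allow: /

Sitemap: http://enginery.freecluster.eu/sitemapv1.xml
TXT;

$submitted = 'http://enginery.freecluster.eu/sitemap.xml'; // URL given to Search Console
$declared  = robots_sitemaps($robotsTxt);
$matches   = in_array($submitted, $declared, true);

echo $matches
    ? "robots.txt declares the submitted sitemap\n"
    : "robots.txt declares a different sitemap than the one submitted\n";
```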
