lilmer Posted June 21, 2013

I have a robots.txt in the site's root directory with this content:

    User-agent: *
    Disallow: /

From what I have read, this means no crawler, Google included, will read my site's content, and that it keeps bad crawlers from seeing the site's directories, which is supposedly good for security. But when I search for the site on Google, the result shows: "A description for this result is not available because of this site's robots.txt – learn more."

So my question is: how can a search engine know the description of the site if you don't allow its crawler into the protected directory where all your files are?

Link to comment: https://forums.phpfreaks.com/topic/279412-safe-way-to-add-robotstxt-in-codeigniter/
kicken Posted June 21, 2013

Google's documentation says: "While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web."

Basically, if Google finds links to your site, it will still index the URLs, just not any of the pages' contents. So, based on the URL or the text used on the pages that link to you, Google still matched your URL and included it in the results, but because of your robots.txt it doesn't have any of the page's content from which to build a good description of the site.

According to Google, to keep the site out of the search results completely you need to use the noindex meta tag or HTTP header: "To entirely prevent a page's contents from being listed in the Google web index even if other sites link to it, use a noindex meta tag or x-robots-tag. As long as Googlebot fetches the page, it will see the noindex meta tag and prevent that page from showing up in the web index. The x-robots-tag HTTP header is particularly useful if you wish to limit indexing of non-HTML files like graphics or other kinds of documents."

Note that the above is specific to Google. Other search engines may handle things differently.

Is there any particular reason why you wish to block your entire site from being crawled by search engines?
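To make the noindex approach quoted above concrete, here is a minimal sketch in Python rather than the thread's CodeIgniter/PHP. It is an illustration under stated assumptions, not a recommended implementation: the `app` and `call_app` names and the page body are hypothetical. It sends the `X-Robots-Tag: noindex` header and also embeds the equivalent meta tag in the HTML. As the quoted documentation points out, the crawler has to be able to fetch the page to see either signal, so the page must not also be disallowed in robots.txt.

```python
from wsgiref.util import setup_testing_defaults

# The meta tag form of the signal, placed in the page's <head>.
NOINDEX_META = '<meta name="robots" content="noindex">'


def app(environ, start_response):
    """Hypothetical WSGI app: every response asks crawlers not to index it."""
    body = (
        "<!doctype html><html><head>"
        + NOINDEX_META
        + "<title>Private</title></head><body>Hello</body></html>"
    ).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # The HTTP header form; this also works for non-HTML responses.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app():
    """Invoke the app once in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, body = call_app()
    print(status)
    print(headers["X-Robots-Tag"])
```

In a PHP application the same idea would amount to sending that header before any output, or adding the meta tag to the page template.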
lilmer Posted June 21, 2013

Thank you for the explanation; it's clearer to me now.

"Is there any particular reason why you wish to block your entire site from being crawled by search engines?"

No, not really to block search engine crawlers, but to block bad crawlers. I forgot the link, but I read that robots can also set up sessions on your website. So right now I'm still unsure whether I should just specify which robots can access which directories of the site. What I'm developing is a payment system website, so I'm nervous about committing to something I don't really understand. Thanks again for your explanation.
kicken Posted June 21, 2013

"No, not really to block search engine crawlers, but to block bad crawlers."

A bad crawler is going to flat out ignore your robots.txt file and crawl your site anyway. The only thing a robots.txt file is good for is telling a good crawler which paths you would prefer it not crawl.
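The point above, that robots.txt is a request which only well-behaved crawlers honor, can be sketched with Python's standard `urllib.robotparser`, the kind of check a polite crawler performs before fetching a URL. This is a rough illustration only: `build_parser` is a hypothetical helper, and the `/admin/` and `/payments/` paths are made-up examples, not taken from the original poster's site. Nothing on the server enforces these rules, so a bad crawler simply skips the check.

```python
from urllib.robotparser import RobotFileParser

# The file from the first post: asks every crawler to stay out of everything.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# A narrower variant with hypothetical paths: only some areas are off limits.
BLOCK_SOME = """User-agent: *
Disallow: /admin/
Disallow: /payments/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    strict = build_parser(BLOCK_ALL)
    partial = build_parser(BLOCK_SOME)

    # A polite crawler asks before each request and honors the answer.
    print(strict.can_fetch("Googlebot", "/index.html"))
    print(partial.can_fetch("Googlebot", "/about"))
    print(partial.can_fetch("Googlebot", "/payments/checkout"))
```

Because the check lives entirely in the crawler, protecting a payment system still depends on authentication and server-side access control rather than on robots.txt.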
lilmer Posted June 21, 2013

Ah, that makes sense. So in that case it really comes down to how you implement everything, the code, the server security, and every other part of development, so that your site isn't vulnerable to hackers or other bad things.