FanRail - Railroad And Railfan Search Engine

FanRail Information For Webmasters

The Railroaders Search Engine

FanRailBot: FanRail's Web Crawler

FanRailBot is FanRail's spider, or web-crawling robot. It collects documents from around the web to build a searchable index for the FanRail search engine. On this page, you'll find answers to the most commonly asked questions about how our web crawler works.

FAQ - Frequently Asked Questions

1. How frequently will FanRailBot crawl my website?
2. Can I stop FanRailBot from crawling some or all of my web pages?
3. I do not have a robots.txt file on my server. Why is FanRailBot requesting it?
4. What types of links does FanRailBot follow?
5. Can I prevent FanRailBot from following links on my pages?
6. How can I stop FanRailBot from following only certain links?

Answers

1. How frequently will FanRailBot crawl my website?

Because we're a new search engine and are still building our database of railroad-related websites, FanRailBot may crawl your site frequently. How often it visits depends largely on how often it finds links to your website on other websites. After your site has been indexed, you should see FanRailBot about once per month, when it checks whether any changes to your website need to be updated in our database.

2. Can I stop FanRailBot from crawling some or all of my web pages?

The short answer is yes. FanRailBot follows the rules of the robots.txt standard when crawling the web and indexing content. For detailed information on how to set up and use a robots.txt file on your site, please read the Robot Exclusion Standard.

To completely disallow FanRailBot from accessing any web pages on your site, use the following in your robots.txt file:

User-agent: FanRailBot
Disallow: /

To disallow only certain pages or folders from FanRailBot's crawl, you may add something like the following to your robots.txt file:

User-agent: FanRailBot
Disallow: /cgi-bin/
Disallow: /page2.html
Disallow: /page3.html
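
If you would like to check how rules like these are interpreted before publishing them, the sketch below may help. It uses Python's standard urllib.robotparser module, not FanRailBot's own code, so treat it as an approximation of how a crawler that follows the standard would read the file. The helper name is_allowed and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The partial-disallow rules from the example above.
ROBOTS_TXT = """\
User-agent: FanRailBot
Disallow: /cgi-bin/
Disallow: /page2.html
Disallow: /page3.html
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules.

    Hypothetical helper: it only shows how a standards-following parser
    reads the file, not how FanRailBot is implemented.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are made up for this demonstration.
    for path in ("/index.html", "/cgi-bin/search.cgi", "/page2.html"):
        print(path, is_allowed(ROBOTS_TXT, "FanRailBot", path))
```

Because the rules name FanRailBot specifically, this parser leaves other user agents unrestricted by them.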

3. I do not have a robots.txt file on my server. Why is FanRailBot requesting it?

Because FanRailBot follows the robots.txt protocol, it will always request that file from every server it visits. If it does not find a robots.txt file, it will assume that you allow spidering of all of your website's content that is reachable through web links. If you want to prevent the "file not found" error messages from showing up in your web server logs, you can create an empty file named robots.txt.
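
An empty robots.txt file contains no rules, so it restricts nothing. The short sketch below illustrates this with Python's standard urllib.robotparser module. It is an assumption that FanRailBot reads an empty file the same way, and the helper name allowed_with_empty_robots and the sample path are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_with_empty_robots(user_agent: str, path: str) -> bool:
    """Return whether path may be fetched when robots.txt is empty.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse([])  # an empty file has no lines and therefore no rules
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(allowed_with_empty_robots("FanRailBot", "/any/page.html"))
```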

4. What types of links does FanRailBot follow?

FanRailBot follows HREF links:

<a href="somepage.html">FanRailBot Will Follow This Link</a>

5. Can I prevent FanRailBot from following links on my pages?

Yes. FanRailBot 2.0 respects the robots meta tag in your web pages. To prevent FanRailBot from following any links in your web pages, use the following meta tag in the head of your web pages:

<META NAME="FanRailBot" CONTENT="nofollow">
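
To confirm that a page actually carries this tag, you could run a check like the sketch below. It uses Python's standard html.parser module and looks only for a meta tag whose name is FanRailBot, as in the example above. The class and function names are assumptions made for illustration, and this is not a description of how FanRailBot itself parses pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page has a nofollow meta directive for one bot."""

    def __init__(self, bot_name: str) -> None:
        super().__init__()
        self.bot_name = bot_name.lower()
        self.nofollow = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...>
        # arrives here as tag "meta" with attribute "name".
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != self.bot_name:
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "nofollow" in directives:
            self.nofollow = True


def page_is_nofollow(html: str, bot_name: str = "FanRailBot") -> bool:
    """Hypothetical helper: True if the page tells bot_name not to follow links."""
    parser = RobotsMetaParser(bot_name)
    parser.feed(html)
    return parser.nofollow


if __name__ == "__main__":
    sample = '<html><head><META NAME="FanRailBot" CONTENT="nofollow"></head></html>'
    print(page_is_nofollow(sample))
```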

6. How can I stop FanRailBot from following only certain links?

FanRailBot also respects the rel="nofollow" link attribute, which was originally created for the blogging community. So, if you have a couple of links that you do not want FanRailBot to follow, yet still have others you DO want it to follow, add the attribute to the links that should not be followed:

<a href="somepage.html">FanRailBot Will Follow This Link</a>

<a href="somepage.html" rel="nofollow">FanRailBot Will NOT Follow This Link</a>
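
If you want to see which links on a page a crawler honoring rel="nofollow" would follow, a sketch along these lines may help. It uses Python's standard html.parser module. The names LinkCollector and split_links and the sample file names are assumptions made for illustration, and the code does not claim to reproduce FanRailBot's behavior.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sorts <a href> links into followable and nofollow lists."""

    def __init__(self) -> None:
        super().__init__()
        self.followable = []
        self.skipped = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = {key: (value or "") for key, value in attrs}
        href = attr.get("href")
        if not href:
            return  # anchors without an href are not links
        # rel may hold several space-separated values, e.g. "external nofollow".
        rel_values = attr.get("rel", "").lower().split()
        if "nofollow" in rel_values:
            self.skipped.append(href)
        else:
            self.followable.append(href)


def split_links(html: str):
    """Hypothetical helper: return (followable, skipped) href lists."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.followable, collector.skipped


if __name__ == "__main__":
    # page-a.html and page-b.html are made-up file names.
    sample = (
        '<a href="page-a.html">Followed</a>'
        '<a href="page-b.html" rel="nofollow">Not followed</a>'
    )
    print(split_links(sample))
```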

If you have further questions that are not answered on this page, please feel free to contact us.




  




Copyright © 2008 FanRail Services L.L.C.