Canadian Search Engine

=A Brief Description of FindCanBot=

==How to identify FindCanBot==

Presumably, you arrived at this site because you noticed traffic from a User-Agent that identified itself with the string:

<pre>
Mozilla/5.0 (compatible; FindCanBot +https://findcan.ca/bot.php)
</pre>

If the IP address was also in the range 173.11.90.73 to 173.11.90.78, then you have come to the right place to find out who was probably crawling your site.

If it was a different IP address, then someone else is hijacking my crawler's name.
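The two checks described above can be sketched in Python for use against your own access logs. This is only an illustration: the function name `looks_like_findcanbot` and the constants are hypothetical, and the range is read as 173.11.90.73 through 173.11.90.78 inclusive.

```python
import ipaddress

# Assumption: "173.11.90.73 to 78" means this inclusive IPv4 range.
FIRST_IP = ipaddress.IPv4Address("173.11.90.73")
LAST_IP = ipaddress.IPv4Address("173.11.90.78")

# Token that appears in the User-Agent string quoted above.
BOT_TOKEN = "FindCanBot"


def looks_like_findcanbot(user_agent: str, remote_ip: str) -> bool:
    """Return True when both the User-Agent and the IP address match."""
    if BOT_TOKEN not in user_agent:
        return False
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Not a parseable IP address at all.
        return False
    if not isinstance(address, ipaddress.IPv4Address):
        # The published range is IPv4 only.
        return False
    return FIRST_IP <= address <= LAST_IP
```

A request for which this returns False but whose User-Agent still contains FindCanBot is the "someone else" case described above.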

==Who runs FindCanBot==

FindCanBot is run by Allan Pollett and Chris Pollett using technology developed at seekquarry.com.

==How FindCanBot crawls a site==

FindCanBot is currently run sporadically rather than continuously. Each machine in a crawl runs about four fetcher processes, and each fetcher keeps at most 100 to 300 connections open at any given time. Typically, these connections are not all to the same host.

==How to change how FindCanBot crawls your site==

FindCanBot understands robots.txt files (the file must be named robots.txt, not robot.txt). It also obeys X-Robots-Tag HTTP headers, the HTML meta tag values noindex and nofollow, and rel="nofollow" attributes on anchors. FindCanBot further understands the Crawl-delay directive and the * and $ pattern syntax that Google and Bing support within Allow and Disallow lines, both of which are extensions to the robots.txt standard. If you want to restrict FindCanBot's access to your site, the easiest way is to add a directive for it to follow in your robots.txt file. For example, in your document root you could put a robots.txt file with lines like:

<pre>
User-agent: '''FindCanBot'''
Disallow: /some_folder2/
...
Allow: /some_other_folder/
</pre>
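Before publishing rules like these, you can sanity-check them locally. The sketch below uses Python's standard `urllib.robotparser`, which is not FindCanBot's own parser and does not implement the * and $ pattern syntax, so it is only suitable for plain path prefixes. The `Crawl-delay: 10` value, the `example.com` URLs, and the helper name `build_parser` are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body modelled on the example above.
ROBOTS_TXT = """\
User-agent: FindCanBot
Crawl-delay: 10
Disallow: /some_folder2/
Allow: /some_other_folder/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Paths under /some_folder2/ are disallowed for FindCanBot.
blocked = parser.can_fetch(
    "FindCanBot", "https://example.com/some_folder2/page.html")

# Paths under /some_other_folder/ are explicitly allowed.
allowed = parser.can_fetch(
    "FindCanBot", "https://example.com/some_other_folder/page.html")

# Requested delay in seconds between fetches, as parsed from the file.
delay = parser.crawl_delay("FindCanBot")

print(blocked, allowed, delay)
```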

Of course, if you have general robot directives using expressions like "User-Agent: *", these will be understood by FindCanBot as well. FindCanBot caches the robots.txt file for one day: it uses the cached directives for 24 hours before requesting the robots.txt file again. So if you change your robots.txt file, it might take up to a day before FindCanBot notices the changes.
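To make the timing concrete, here is a minimal sketch of a 24-hour robots.txt cache. It is not FindCanBot's actual implementation. The class name `RobotsCache`, the `fetch` callback, and the injectable `clock` are all hypothetical and exist only to show why a change can go unnoticed for up to a day.

```python
import time
from typing import Callable, Dict, Tuple

ONE_DAY_SECONDS = 24 * 60 * 60  # cache lifetime stated in the text above


class RobotsCache:
    """Hypothetical per-host cache of robots.txt bodies with a 24-hour TTL."""

    def __init__(self, fetch: Callable[[str], str],
                 clock: Callable[[], float] = time.time,
                 ttl: float = ONE_DAY_SECONDS) -> None:
        self._fetch = fetch    # downloads robots.txt for a host
        self._clock = clock    # returns the current time in seconds
        self._ttl = ttl
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, host: str) -> str:
        """Return the cached body, re-fetching once the entry is a day old."""
        now = self._clock()
        cached = self._entries.get(host)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        body = self._fetch(host)
        self._entries[host] = (now, body)
        return body
```

With a cache like this, an edit made one hour after the crawler's last request would not be seen for another 23 hours.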

==Contact Info==

If you have any questions about FindCanBot, please feel free to send email to allanp73@gmail.com.