=A Brief Description of FindCanBot=

==How to identify FindCanBot==

Presumably, you arrived at this site because you noticed traffic from a User-Agent that identified itself with the string:

<pre>
Mozilla/5.0 (compatible; FindCanBot +https://findcan.ca/bot.php)
</pre>

If the IP address was also 188.8.131.52 to 78, then you have come to the right place to find out who was probably crawling your site. If it was a different IP address, then someone else is hijacking my crawler's name.

==Who runs FindCanBot==

FindCanBot is run by Allan Pollett and Chris Pollett using technology developed at seekquarry.com.

==How FindCanBot crawls a site==

FindCanBot is currently run sporadically (not continuously). Each machine in a crawl has about four fetcher processes. Each fetcher keeps at most 100-300 connections open at any given time. In a typical situation, these connections would not all be to the same host.

==How to change how FindCanBot crawls your site==

FindCanBot understands robots.txt files (the file must be named robots.txt, not robot.txt). It also obeys X-Robots-Tag HTTP headers, the HTML meta tag values noindex and nofollow, and anchor rel="nofollow" directives. FindCanBot further understands two extensions to the robots.txt standard: the Crawl-delay directive, and the Google and Bing * and $ syntax within Allow and Disallow lines.

If you want to restrict FindCanBot's access to your site, the easiest way is to add a directive for it to follow in your robots.txt file. For example, in your document root you could put a robots.txt file with lines like:

<pre>
User-agent: '''FindCanBot'''
Disallow: /some_folder2/
...
Allow: /some_other_folder/
</pre>

Of course, if you have general robot directives using expressions like "User-Agent: *", these will be understood by FindCanBot as well.

FindCanBot caches the robots.txt file for 1 day. It uses the cached directives for 24 hours before requesting the robots.txt file again. So if you change your robots.txt file, it can take up to a day before FindCanBot notices the changes.

==Contact Info==

If you have any questions about FindCanBot, please feel free to contact us (email@example.com).
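
==Checking robots.txt rules locally==

Before relying on a robots.txt change, it can help to check how a set of rules applies to a crawler name. The following is a minimal sketch using Python's standard urllib.robotparser module, not FindCanBot's own parser. The rule text and the helper name allowed are hypothetical, modelled on the robots.txt example on this page. Python's parser matches plain path prefixes only, so it will not reproduce FindCanBot's handling of the * and $ extensions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modelled on the example on this page.
ROBOTS_TXT = """\
User-agent: FindCanBot
Disallow: /some_folder2/
Allow: /some_other_folder/
"""


def allowed(path, agent="FindCanBot", robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `path` under the given rules.

    Uses the standard-library parser, which applies the first rule whose
    path is a prefix of `path` and does not support * or $ wildcards.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/some_folder2/page.html", "/some_other_folder/index.html"):
        print(path, allowed(path))
```

With these rules, paths under /some_folder2/ are refused for FindCanBot, while other crawler names are unaffected because no "User-agent: *" block is present.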
Developed at SeekQuarry
(c) 2022 Findcan -
Canadian Search Engine