=A Brief Description of FindCanBot=

==How to identify FindCanBot==
Presumably, you arrived at this site because you noticed traffic from a User-Agent that identified itself with the string:
<pre>
Mozilla/5.0 (compatible; FindCanBot +https://findcan.ca/bot.php)
</pre>
If the IP address was also in the range 173.11.90.73 to 78, then you have come to the right place to find out who was probably crawling your site. If it was a different IP address, then someone else is hijacking our crawler's name.

==Who runs FindCanBot==
FindCanBot is run by Allan Pollett and Chris Pollett using technology developed at seekquarry.com.

==How FindCanBot crawls a site==
FindCanBot is currently run sporadically (not continuously). Each machine in a crawl runs about four fetcher processes, and each fetcher has at most 100-300 connections open at any given time. In a typical situation, these connections would not all be to the same host.

==How to change how FindCanBot crawls your site==
FindCanBot understands robots.txt files (the file must be named robots.txt, not robot.txt), and it also obeys X-Robots-Tag HTTP headers, the noindex and nofollow HTML meta tags, and anchor rel="nofollow" directives. FindCanBot further understands the Crawl-delay directive as well as the Google and Bing * and $ syntax within Allow and Disallow lines, both of which are extensions to the robots.txt standard (examples of these directives are sketched at the end of this page).

If you want to restrict FindCanBot's access to your site, the easiest way is to add a directive for it to follow in your robots.txt file. For example, in your document root you could put a robots.txt file with lines like:
<pre>
User-agent: FindCanBot
Disallow: /some_folder2/
...
Allow: /some_other_folder/
</pre>
Of course, if you have general robot directives using expressions like "User-Agent: *", these will be understood by FindCanBot as well.

FindCanBot caches your robots.txt file for one day; it uses the cached directives for 24 hours before requesting the file again. So if you change your robots.txt file, it might take a little while before the changes are noticed by FindCanBot.

==Contact Info==
If you have any questions about FindCanBot, please feel free to contact allanp73@gmail.com.
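==Examples of directives FindCanBot obeys==

The sketch below illustrates the Crawl-delay directive and the * and $ wildcard syntax described above. The paths and the 10-second delay are made-up values for illustration, and the sketch assumes that, like Google and Bing, FindCanBot resolves conflicting Allow and Disallow rules in favor of the most specific (longest) matching path.
<pre>
User-agent: FindCanBot
# Ask the crawler to wait at least 10 seconds between successive requests
Crawl-delay: 10
# Block everything under /private/ (the * matches any sequence of characters)
Disallow: /private/*
# Block any URL ending in .pdf (the $ anchors the pattern to the end of the URL)
Disallow: /*.pdf$
# Re-open one subtree of /private/ despite the rule above
Allow: /private/reports/
</pre>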
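The page-level and link-level controls mentioned above (X-Robots-Tag headers, robots meta tags, and rel="nofollow" anchors) look roughly like the following. The first line uses Apache's mod_headers syntax purely as one possible way to set the header; how you emit response headers depends on your server or application.
<pre>
# In an Apache configuration, one way to send an X-Robots-Tag response header:
Header set X-Robots-Tag "noindex, nofollow"

<!-- The equivalent inside an HTML page's head element: -->
<meta name="robots" content="noindex, nofollow">

<!-- Or, to keep a single link from being followed: -->
<a href="https://example.com/some-page" rel="nofollow">some page</a>
</pre>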