Websites sometimes implement a feature known as "rate limiting" to prevent them from being overloaded with requests. When a server receives more requests than it can handle, or too many requests from the same IP address, it manages the load by limiting the number of requests it responds to. Instead of returning the requested page content, it returns a 429 Too Many Requests HTTP status code.
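To make the idea concrete, here is a minimal sketch of server-side rate limiting using a fixed-window counter per IP address. This is an illustration only, not how any particular server (or SiteSentry) implements it; the class name and limits are made up for the example.

```python
import time

class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client IP per time window (sketch)."""

    def __init__(self, limit=100, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # ip -> (window_start, request_count)

    def check(self, ip, now=None):
        """Return 200 if the request is allowed, or 429 if the IP is over the limit."""
        now = time.monotonic() if now is None else now
        start, count = self.counts.get(ip, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window has elapsed: reset the counter
        count += 1
        self.counts[ip] = (start, count)
        return 200 if count <= self.limit else 429
```

With `limit=2`, a third request from the same IP inside the window gets a 429, and once the window rolls over the counter resets and requests succeed again.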
The SiteSentry web crawler can be affected by this, since checking for broken links requires it to crawl many pages on a website on a regular basis. When it happens, we'll report it as an error: you'll see it on your site dashboard, and we'll also send you a notification.
SiteSentry's latest update lets you tweak how the broken links check works, reducing the load our web crawler places on your website and so reducing the risk of it being blocked. There are two ways to do this:
- You can limit how many pages are crawled; and/or
- You can slow the rate at which our crawler crawls the website.
By default, our crawler will make two simultaneous requests to your website every 0.25 seconds, but you can now speed this up, or slow it down, through the settings for the Broken Links check.
There are five options so, whilst the default will be fine in most cases, you can now pick whichever works best for your website:
- Slowest: 1 concurrent request, with a 1 second delay between requests
- Slow: 1 concurrent request, with a 0.5 second delay between requests
- Default: 2 concurrent requests, with a 0.25 second delay between requests
- Fast: 2 concurrent requests, with a 0.1 second delay between requests
- Fastest: 4 concurrent requests, with a 0.1 second delay between requests
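The settings above boil down to two numbers: how many requests go out at once, and how long the crawler pauses between batches. A rough sketch of how a crawler might apply them (the setting names and values mirror the list above; the function and its parameters are hypothetical, not SiteSentry's actual code):

```python
import time

# Speed setting -> (concurrent requests, delay in seconds between batches),
# matching the five options listed above.
SPEED_SETTINGS = {
    "slowest": (1, 1.0),
    "slow":    (1, 0.5),
    "default": (2, 0.25),
    "fast":    (2, 0.1),
    "fastest": (4, 0.1),
}

def crawl(urls, fetch, setting="default", sleep=time.sleep):
    """Fetch URLs in batches of `concurrency`, pausing `delay` seconds between batches."""
    concurrency, delay = SPEED_SETTINGS[setting]
    results = []
    for i in range(0, len(urls), concurrency):
        batch = urls[i:i + concurrency]
        # A real crawler would issue these in parallel; sequential here for brevity.
        results.extend(fetch(url) for url in batch)
        if i + concurrency < len(urls):
            sleep(delay)  # pace requests so the site's rate limiter isn't triggered
    return results
```

At the "default" setting, crawling five pages means three batches (2 + 2 + 1) with a 0.25-second pause after each of the first two; at "slowest", each page is fetched on its own with a full second in between.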