Crawl Budget
Crawl budget is the number of URLs Googlebot will crawl on your site within a given period, determined by crawl rate limit (how much crawling your server can handle) and crawl demand (how popular and how fresh your pages are).
For most sites under 10,000 pages, crawl budget is not a concern — Google will crawl everything. It becomes critical for large sites, sites with many URL parameters, or sites with slow server response times.
Wasted crawl budget comes from crawling duplicate content, faceted navigation pages, infinite scroll pagination, or error pages. Use robots.txt to block low-value URLs, canonicalize duplicates, and return proper 404/410 status codes for removed pages.
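For example, a minimal robots.txt rule set could block internal search and faceted navigation URLs, while a canonical link element consolidates duplicate variants. The paths, parameter names, and domain below are placeholders; adapt them to your own URL structure.

```
# robots.txt (illustrative; adjust paths and parameters to your site)
User-agent: *
Disallow: /search/
Disallow: /*?sort=
Disallow: /*?color=
Disallow: /*?size=
```

```html
<!-- On a duplicate variant, point crawlers to the preferred URL -->
<link rel="canonical" href="https://www.example.com/shoes/" />
```

Note that robots.txt prevents crawling but not indexing of URLs discovered through links, so canonicalization and noindex still matter for duplicates that must remain crawlable.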
Monitor crawl stats in Google Search Console under Settings > Crawl stats. Look for trends in total crawl requests per day and average response time; rising response times alongside falling crawl requests usually mean Googlebot is throttling itself because your server is slow.
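The Crawl Stats report is only available in the Search Console UI, so a common complement is to derive the same trend from your own server logs. A minimal sketch in Python, assuming an access log in the combined log format at access.log (the path, format, and user-agent check are assumptions; user-agent strings can be spoofed, so verify the requesting IP via reverse DNS if you need certainty):

```python
import re
from collections import Counter

# Matches the standard combined log format:
# IP ident user [day/Mon/year:time zone] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits_per_day(path: str) -> Counter:
    """Count requests per day whose user-agent claims to be Googlebot."""
    hits = Counter()
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LOG_LINE.match(line)
            if m and "Googlebot" in m.group("agent"):
                hits[m.group("day")] += 1  # key looks like "12/Mar/2025"
    return hits

if __name__ == "__main__":
    for day, count in sorted(googlebot_hits_per_day("access.log").items()):
        print(day, count)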
Related terms
Robots.txt
Robots.txt is a plain text file at the root of a website that instructs search engine crawlers which URLs they are allowed or disallowed from accessing.
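A minimal example, assuming the file is served at https://www.example.com/robots.txt (the domain and paths are placeholders). Rules are grouped by user-agent, and a Sitemap line can point crawlers to your sitemap:

```
User-agent: *
Disallow: /admin/

User-agent: Googlebot-Image
Disallow: /thumbnails/

Sitemap: https://www.example.com/sitemap.xml
```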
Sitemap XML
An XML sitemap is a file that lists URLs on your website along with optional metadata (last modified date, change frequency, priority) to help search engines discover and crawl your pages.
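A minimal sitemap with a single URL entry might look like the sketch below (the URL and date are placeholders; changefreq and priority are optional and largely ignored by Google):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/shoes/</loc>
    <lastmod>2025-01-15</lastmod>
  </url>
</urlset>
```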
Indexation
Indexation is the process by which a search engine stores crawled web pages in its database (the index) so they can be returned in search results. A page must first be discovered and crawled before it can be indexed, and not every crawled page makes it into the index.