Crawlability
What this datapoint measures
The structural ability of crawlers to traverse the site — whether the site’s information architecture, internal linking, sitemap structure, and rendering model permit crawlers to discover and access all pages of value.
Crawlability differs from ai-crawler-access. Access asks “is the crawler permitted?” Crawlability asks “can the crawler actually navigate the site?”
What high looks like
- Internal linking creates navigable paths to all important pages
- Sitemap.xml exists, validates, and includes the brand’s important pages
- Pages requiring JavaScript to render are also accessible via static HTML or pre-rendered fallbacks
- No infinite-loop crawler traps (e.g., faceted navigation generating unbounded URL combinations)
- No JavaScript-only navigation that crawlers cannot follow
- Pagination implemented in crawler-followable form (plain anchor links to subsequent pages rather than script-driven controls, plus sitemap inclusion)
- No login walls blocking crawler access to indexable content
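The sitemap checks above can be partially automated. The sketch below (a minimal illustration, not a full validator) parses a sitemap payload against the standard sitemaps.org `<urlset>` schema and flags entries that are not absolute http(s) URLs:

```python
# Minimal sketch: extract and sanity-check URLs from a sitemap.xml payload.
# Assumes the sitemap follows the standard sitemaps.org <urlset> schema.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text: str) -> list[str]:
    """Parse a sitemap and return all <loc> URLs."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]

def malformed_urls(urls: list[str]) -> list[str]:
    """Flag entries that are not absolute http(s) URLs."""
    bad = []
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad
```

A full audit would also fetch each URL and confirm it returns 200 without redirect chains, which ties into the failure modes listed below.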
What low looks like
- Internal linking is sparse; many pages are reachable only through site search or external links
- Sitemap.xml missing, broken, or omits significant content sections
- Single-page application with client-side rendering and no server-side fallback
- Faceted navigation generating crawler traps
- Login walls blocking content that should be indexable
What at floor looks like
A brand at floor on crawlability is a brand whose content is technically present on the web but structurally unreachable by automated systems. The crawler arrives at the homepage and cannot find or follow paths to other content; the sitemap is missing or broken; and content appears only after JavaScript execution the crawler never performs.
This pattern is most common in single-page applications without server-side rendering, in sites with login-walled content, and in sites with broken or never-implemented sitemaps. The remedy depends on the underlying cause: engineering work to add server-side rendering, to generate a valid sitemap, or to rebuild internal linking.
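The “structurally unreachable” condition can be made concrete as a graph problem: pages the sitemap claims exist but that no internal-link path reaches from the homepage are orphans. A hypothetical sketch, where the link graph would come from a real crawl but is represented here as a plain dict:

```python
# Hypothetical sketch: find pages listed in the sitemap that internal
# links cannot reach from the homepage (orphan pages).
from collections import deque

def reachable(links: dict[str, list[str]], start: str) -> set[str]:
    """BFS over the internal-link graph from the starting page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def orphan_pages(links: dict[str, list[str]], sitemap: set[str], homepage: str) -> set[str]:
    """Sitemap URLs with no navigable path from the homepage."""
    return sitemap - reachable(links, homepage)
```

A non-empty orphan set is direct evidence of the floor pattern: the content exists, but the crawler cannot navigate to it.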
What affects this datapoint
- Quality and completeness of internal linking
- Presence and validity of sitemap.xml
- Rendering model (static, server-rendered, client-rendered with fallback, client-rendered without fallback)
- Pagination implementation
- Login walls and content gating
- URL structure stability (URLs that change frequently degrade crawlability)
- Crawler trap patterns (infinite faceting, calendar archives without limits)
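Crawler-trap patterns like infinite faceting are often detectable with simple URL heuristics. The sketch below flags URLs whose query strings combine many facet parameters or repeat a parameter; the facet names and threshold are illustrative assumptions, not a standard:

```python
# Heuristic sketch for spotting faceted-navigation crawler traps:
# unbounded facet combinations generate unbounded URL spaces.
from urllib.parse import urlparse, parse_qsl

FACET_PARAMS = {"color", "size", "brand", "sort", "page", "view"}  # assumed facet names
MAX_FACETS = 2  # assumed crawl-budget threshold

def looks_like_trap(url: str) -> bool:
    """Flag URLs whose query strings suggest combinatorial facet explosion."""
    params = [k for k, _ in parse_qsl(urlparse(url).query)]
    facet_count = sum(1 for k in params if k in FACET_PARAMS)
    # Repeated parameters are another trap signal (e.g. ?page=1&page=2).
    has_repeats = len(params) != len(set(params))
    return facet_count > MAX_FACETS or has_repeats
```

In practice a rule like this would feed robots.txt disallow patterns or canonicalization work rather than run standalone.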
OMG actions that influence this datapoint
| Action | Influence |
|---|---|
| O-4 Technical Infrastructure, Performance & International Foundation | Direct, primary. Crawlability is a core component of O-4 work. |
| M-10 Content Hub Architecture & Internal Authority Flow | Substantial. M-10 work explicitly improves internal linking structure, which lifts crawlability. |
| O-6 Content Audit & Baseline Optimization | Indirect. Content audit surfaces orphan pages and crawl gaps, which drive remediation work. |
Multilingual considerations
Multilingual crawlability requires:
- Per-language sitemaps (or a single multilingual sitemap with language-tagged URLs)
- hreflang implementation correctly cross-linking language variants (separate datapoint, but related)
- Internal linking patterns that work both within and across language variants
- Language-specific content not gated behind locale-detection redirects that obscure content from crawlers
A common multilingual failure mode is automatic locale redirects that send AI crawlers to a default language regardless of the URL they requested, producing crawler indexes that contain only one language’s content.
Common failure modes
- React or Vue single-page application without SSR/SSG, served as a near-empty HTML shell to crawlers
- Sitemap pointing to URLs that redirect (chained redirects degrade crawl efficiency)
- Sitemap including URLs that return 404
- Important content behind login walls
- Faceted product navigation generating millions of crawlable URL variants
- AJAX-loaded content that crawlers cannot trigger
- Pagination requiring JavaScript click events to advance
- Admin URL parameters generating duplicate content (sort orders, view modes)
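The near-empty-shell failure mode in the first bullet is detectable from raw HTML alone: a server-rendered page carries real body text, while a client-rendered shell is mostly a mount point and scripts. A rough sketch, where the 200-character threshold is an illustrative assumption:

```python
# Rough sketch: decide whether raw HTML is a near-empty SPA shell by
# measuring how much visible text it carries outside script/style tags.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, skipping script and style bodies."""
    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data.strip())

def looks_like_empty_shell(html: str, min_chars: int = 200) -> bool:
    """True when the HTML carries less visible text than the threshold."""
    parser = TextExtractor()
    parser.feed(html)
    return len("".join(parser.chunks)) < min_chars
```

Comparing this check against the same URL rendered in a headless browser separates “content missing” from “content present but JavaScript-gated.”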
Diagnostic interpretation
Crawlability at floor with ai-crawler-access at high indicates a structural site-architecture problem distinct from access blocking. The crawler can reach the site but cannot navigate it. Engineering remediation is required.
Crawlability at low with sitemap-validity also low indicates that the brand has not done sitemap work and internal linking is deficient. O-4 work targeting both produces compounding lift.
Crawlability at high with other V1.2 datapoints at low indicates a site that is structurally well-organized but has other infrastructure issues. The remedy is to address the specific other low datapoints rather than re-doing crawlability work.