Crawlability
What this datapoint measures
The structural ability of crawlers to traverse the site — whether the site’s information architecture, internal linking, sitemap structure, and rendering model permit crawlers to discover and access all pages of value.
Crawlability differs from ai-crawler-access. Access asks “is the crawler permitted?” Crawlability asks “can the crawler actually navigate the site?”
What high looks like
- Internal linking creates navigable paths to all important pages
- Sitemap.xml exists, validates, and includes the brand’s important pages
- Pages requiring JavaScript to render are also accessible via static HTML or pre-rendered fallbacks
- No infinite-loop crawler traps (e.g., faceted navigation generating unbounded URL combinations)
- No JavaScript-only navigation that crawlers cannot follow
- Pagination implemented in crawler-followable form (plain anchor links to subsequent pages rather than script-driven controls, plus sitemap inclusion)
- No login walls blocking crawler access to indexable content
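The sitemap checks above can be partially automated. The sketch below (a minimal illustration, not a full validator) parses a sitemap payload against the standard sitemaps.org `<urlset>` schema and flags entries that are not absolute http(s) URLs:

```python
# Minimal sketch: extract and sanity-check URLs from a sitemap.xml payload.
# Assumes the sitemap follows the standard sitemaps.org <urlset> schema.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text: str) -> list[str]:
    """Parse a sitemap and return all <loc> URLs."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]

def malformed_urls(urls: list[str]) -> list[str]:
    """Flag entries that are not absolute http(s) URLs."""
    bad = []
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad
```

A full audit would also fetch each URL and confirm it returns 200 without redirect chains, which ties into the failure modes listed below.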
What low looks like
- Internal linking is sparse; many pages are reachable only through site search or external links
- Sitemap.xml missing, broken, or omits significant content sections
- Single-page application with client-side rendering and no server-side fallback
- Faceted navigation generating crawler traps
- Login walls blocking content that should be indexable
What at floor looks like
A brand at floor on crawlability is a brand whose content is technically present on the web but structurally unreachable by automated systems. The crawler arrives at the homepage and cannot find or follow paths to other content; the sitemap is missing or broken; and content appears only after JavaScript execution the crawler never performs.
This pattern is most common in single-page applications without server-side rendering, in sites with login-walled content, and in sites with broken or never-implemented sitemaps. The remedy depends on the underlying cause: engineering work to add server-side rendering, to generate a valid sitemap, or to rebuild internal linking.
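The “structurally unreachable” condition can be made concrete as a graph problem: pages the sitemap claims exist but that no internal-link path reaches from the homepage are orphans. A hypothetical sketch, where the link graph would come from a real crawl but is represented here as a plain dict:

```python
# Hypothetical sketch: find pages listed in the sitemap that internal
# links cannot reach from the homepage (orphan pages).
from collections import deque

def reachable(links: dict[str, list[str]], start: str) -> set[str]:
    """BFS over the internal-link graph from the starting page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def orphan_pages(links: dict[str, list[str]], sitemap: set[str], homepage: str) -> set[str]:
    """Sitemap URLs with no navigable path from the homepage."""
    return sitemap - reachable(links, homepage)
```

A non-empty orphan set is direct evidence of the floor pattern: the content exists, but the crawler cannot navigate to it.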
What affects this datapoint
- Quality and completeness of internal linking
- Presence and validity of sitemap.xml
- Rendering model (static, server-rendered, client-rendered with fallback, client-rendered without fallback)
- Pagination implementation
- Login walls and content gating
- URL structure stability (URLs that change frequently degrade crawlability)
- Crawler trap patterns (infinite faceting, calendar archives without limits)
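Crawler-trap patterns like infinite faceting are often detectable with simple URL heuristics. The sketch below flags URLs whose query strings combine many facet parameters or repeat a parameter; the facet names and threshold are illustrative assumptions, not a standard:

```python
# Heuristic sketch for spotting faceted-navigation crawler traps:
# unbounded facet combinations generate unbounded URL spaces.
from urllib.parse import urlparse, parse_qsl

FACET_PARAMS = {"color", "size", "brand", "sort", "page", "view"}  # assumed facet names
MAX_FACETS = 2  # assumed crawl-budget threshold

def looks_like_trap(url: str) -> bool:
    """Flag URLs whose query strings suggest combinatorial facet explosion."""
    params = [k for k, _ in parse_qsl(urlparse(url).query)]
    facet_count = sum(1 for k in params if k in FACET_PARAMS)
    # Repeated parameters are another trap signal (e.g. ?page=1&page=2).
    has_repeats = len(params) != len(set(params))
    return facet_count > MAX_FACETS or has_repeats
```

In practice a rule like this would feed robots.txt disallow patterns or canonicalization work rather than run standalone.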
OMG actions that influence this datapoint
| Action | Influence |
|---|---|
| O-4 Technical Infrastructure, Performance & International Foundation | Direct, primary. Crawlability is a core component of O-4 work. |
| M-10 Content Hub Architecture & Internal Authority Flow | Substantial. M-10 work explicitly improves internal linking structure, which lifts crawlability. |
| O-6 Content Audit & Baseline Optimization | Indirect. Content audit surfaces orphan pages and crawl gaps, which drive remediation work. |
Multilingual considerations
Multilingual crawlability requires:
- Per-language sitemaps (or a single multilingual sitemap with language-tagged URLs)
- hreflang implementation correctly cross-linking language variants (separate datapoint, but related)
- Internal linking patterns that work both within and across language variants
- Language-specific content not gated behind locale-detection redirects that obscure content from crawlers
A common multilingual failure mode is automatic locale redirects that send AI crawlers to a default language regardless of the URL they requested, producing crawler indexes that contain only one language’s content.
Common failure modes
- React or Vue single-page application without SSR/SSG, served as a near-empty HTML shell to crawlers
- Sitemap pointing to URLs that redirect (chained redirects degrade crawl efficiency)
- Sitemap including URLs that return 404
- Important content behind login walls
- Faceted product navigation generating millions of crawlable URL variants
- AJAX-loaded content that crawlers cannot trigger
- Pagination requiring JavaScript click events to advance
- Admin URL parameters generating duplicate content (sort orders, view modes)
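The near-empty-shell failure mode in the first bullet is detectable from raw HTML alone: a server-rendered page carries real body text, while a client-rendered shell is mostly a mount point and scripts. A rough sketch, where the 200-character threshold is an illustrative assumption:

```python
# Rough sketch: decide whether raw HTML is a near-empty SPA shell by
# measuring how much visible text it carries outside script/style tags.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, skipping script and style bodies."""
    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data.strip())

def looks_like_empty_shell(html: str, min_chars: int = 200) -> bool:
    """True when the HTML carries less visible text than the threshold."""
    parser = TextExtractor()
    parser.feed(html)
    return len("".join(parser.chunks)) < min_chars
```

Comparing this check against the same URL rendered in a headless browser separates “content missing” from “content present but JavaScript-gated.”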
Diagnostic interpretation
Crawlability at floor with ai-crawler-access at high indicates a structural site-architecture problem distinct from access blocking. The crawler can reach the site but cannot navigate it. Engineering remediation is required.
Crawlability at low with sitemap-validity also low indicates that the brand has not done sitemap work and internal linking is deficient. O-4 work targeting both produces compounding lift.
Crawlability at high with other V1.2 datapoints at low indicates a site that is structurally well-organized but has other infrastructure issues. The remedy is to address the specific other low datapoints rather than re-doing crawlability work.