Crawlability

crawlability

What this datapoint measures

The structural ability of crawlers to traverse the site — whether the site’s information architecture, internal linking, sitemap structure, and rendering model permit crawlers to discover and access all pages of value.

Crawlability differs from ai-crawler-access. Access asks “is the crawler permitted?” Crawlability asks “can the crawler actually navigate the site?”
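
The distinction can be illustrated with a minimal sketch. It assumes a hypothetical robots.txt that permits everything, a hypothetical internal link graph (`LINKS`), and a made-up user agent name (`ExampleBot`); none of these come from a real site. Every page passes the access check, yet one page is never discovered because nothing links to it.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that permits every crawler everywhere.
robots = RobotFileParser()
robots.parse(["User-agent: *", "Allow: /"])

# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/about"],
    "/about": ["/"],
    "/products/widget": [],  # exists, but no page links to it
}


def reachable_from(start, links):
    """Breadth-first walk over internal links starting from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def access_and_crawlability(links, robots_parser, agent="ExampleBot"):
    """For each known page, report the access answer and the crawlability answer."""
    found = reachable_from("/", links)
    return {
        page: {
            "permitted": robots_parser.can_fetch(agent, page),  # access question
            "discoverable": page in found,                      # crawlability question
        }
        for page in links
    }


access_report = access_and_crawlability(LINKS, robots)
```

In this sketch `/products/widget` is permitted but not discoverable, which is the crawlability problem in miniature.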

What high looks like

  • Internal linking creates navigable paths to all important pages
  • Sitemap.xml exists, validates, and includes the brand’s important pages
  • Pages requiring JavaScript to render are also accessible via static HTML or pre-rendered fallbacks
  • No infinite-loop crawler traps (e.g., faceted navigation generating unbounded URL combinations)
  • No JavaScript-only navigation that crawlers cannot follow
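  • Pagination implemented in crawler-friendly form (plain <a href> links between pages, sitemap inclusion; rel=next/prev is optional, since Google no longer uses it as an indexing signal)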
  • No login walls blocking crawler access to indexable content
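
As a rough illustration of the JavaScript-only navigation point in the list above, the sketch below separates links a non-rendering crawler can follow (`<a>` tags with a real `href`) from elements that navigate only through a click handler. The sample markup and the `goTo` function in it are hypothetical, and the heuristic is deliberately simple.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href targets a crawler can follow without executing JavaScript."""

    def __init__(self):
        super().__init__()
        self.followable = []  # real URLs found in <a href="...">
        self.js_only = 0      # elements that navigate only via onclick

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag == "a" and href and not href.startswith(("javascript:", "#")):
            self.followable.append(href)
        elif "onclick" in attrs:
            self.js_only += 1


def audit_links(html):
    """Return (followable hrefs, count of JavaScript-only navigation elements)."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.followable, parser.js_only


# Hypothetical navigation markup.
SAMPLE_NAV = """
<nav>
  <a href="/pricing">Pricing</a>
  <span onclick="goTo('/docs')">Docs</span>
  <a href="javascript:void(0)" onclick="goTo('/blog')">Blog</a>
</nav>
"""

followable, js_only = audit_links(SAMPLE_NAV)
```

Here only `/pricing` is followable; the other two entries depend on a click handler and would be invisible to a crawler that does not run scripts.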

What low looks like

  • Internal linking sparse; many pages reachable only through site-search or external links
  • Sitemap.xml missing, broken, or omits significant content sections
  • Single-page application with client-side rendering and no server-side fallback
  • Faceted navigation generating crawler traps
  • Login walls blocking content that should be indexable

What at floor looks like

A brand at floor on crawlability is a brand whose content is technically present on the web but structurally unreachable by automated systems. The crawler arrives at the homepage and cannot find or follow paths to other content; the sitemap is missing or broken; the site exposes its content only after JavaScript execution, so crawlers that do not render JavaScript never detect it.

This pattern is most common in single-page applications without server-side rendering, in sites with login-walled content, and in sites with broken or never-implemented sitemaps. The remedy depends on the underlying cause and is usually engineering work: adding server-side rendering, generating a sitemap, or repairing internal linking.
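
One way to spot the single-page-application case is to look at the raw HTML a non-rendering crawler receives and measure how much text it contains outside scripts and styles. The sketch below is a heuristic only; the 100-character threshold and both sample pages are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Gathers text that appears outside <script>, <style>, and <template>."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def looks_like_empty_shell(html, min_chars=100):
    """True when the unrendered HTML carries less text than min_chars (assumed threshold)."""
    parser = VisibleText()
    parser.feed(html)
    return len(" ".join(parser.chunks)) < min_chars


# Hypothetical responses: a client-rendered shell and a server-rendered page.
SPA_SHELL = (
    '<html><head><title>Shop</title></head><body>'
    '<div id="root"></div><script src="/app.js"></script></body></html>'
)
SSR_PAGE = (
    "<html><body><h1>Blue widget</h1><p>"
    + "Product details. " * 10
    + "</p></body></html>"
)

shell_flag = looks_like_empty_shell(SPA_SHELL)
ssr_flag = looks_like_empty_shell(SSR_PAGE)
```

A page flagged this way is a candidate for server-side rendering or a pre-rendered fallback; the check says nothing about login walls or sitemap problems, which need separate inspection.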

What affects this datapoint

  • Quality and completeness of internal linking
  • Presence and validity of sitemap.xml
  • Rendering model (static, server-rendered, client-rendered with fallback, client-rendered without fallback)
  • Pagination implementation
  • Login walls and content gating
  • URL structure stability (URLs that change frequently degrade crawlability)
  • Crawler trap patterns (infinite faceting, calendar archives without limits)
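
The crawler trap factor can be approximated from a crawl log. The sketch below groups URLs by path and counts distinct query-string combinations; a path with many variants is a candidate trap. The URL list and the `max_variants` threshold are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def facet_explosion(urls, max_variants=3):
    """Return {path: variant_count} for paths with more than max_variants query variants."""
    variants = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Sort parameters so ?a=1&b=2 and ?b=2&a=1 count as one variant.
        key = tuple(sorted(parse_qsl(parts.query)))
        variants[parts.path].add(key)
    return {
        path: len(keys)
        for path, keys in variants.items()
        if len(keys) > max_variants
    }


# Hypothetical crawl log excerpt.
CRAWLED = [
    "https://example.com/shoes",
    "https://example.com/shoes?color=red",
    "https://example.com/shoes?color=red&size=9",
    "https://example.com/shoes?size=9&color=red",
    "https://example.com/shoes?color=blue&sort=price",
    "https://example.com/shoes?color=blue&sort=price&view=grid",
    "https://example.com/about",
]

flagged = facet_explosion(CRAWLED)
```

With this input only `/shoes` is flagged, with five distinct parameter combinations; on a real faceted catalogue the count grows multiplicatively with each added filter.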

OMG actions that influence this datapoint

Action | Influence
O-4 Technical Infrastructure, Performance & International Foundation | Direct, primary. Crawlability is a core component of O-4 work.
M-10 Content Hub Architecture & Internal Authority Flow | Substantial. M-10 work explicitly improves internal linking structure, which lifts crawlability.
O-6 Content Audit & Baseline Optimization | Indirect. Content audit surfaces orphan pages and crawl gaps, which drive remediation work.

Multilingual considerations

Multilingual crawlability requires:

  • Per-language sitemaps (or a single multilingual sitemap with language-tagged URLs)
  • hreflang implementation correctly cross-linking language variants (separate datapoint, but related)
  • Internal linking patterns that work within and across language variants appropriately
  • Language-specific content not gated behind locale-detection redirects that obscure content from crawlers

A common multilingual failure mode is automatic locale redirects that send AI crawlers to a default language regardless of the URL they requested, producing crawler indexes that contain only one language’s content.
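
A check for this failure mode can be sketched as follows. It assumes the language code is the first path segment (for example `/de/...`), which is only one of several possible URL schemes, and it takes already-observed (requested URL, final URL) pairs as input; collecting those pairs by fetching each URL with a crawler user agent is left out. The sample pairs are hypothetical.

```python
from urllib.parse import urlsplit


def lang_prefix(url):
    """First path segment, assumed here to be the language code."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if segments else None


def locale_redirect_mismatches(observed):
    """Return the (requested, final) pairs whose language prefix changed."""
    return [
        (requested, final)
        for requested, final in observed
        if lang_prefix(requested) != lang_prefix(final)
    ]


# Hypothetical observations: where each requested URL ended up after redirects.
OBSERVED = [
    ("https://example.com/de/preise", "https://example.com/en/pricing"),
    ("https://example.com/fr/tarifs", "https://example.com/fr/tarifs"),
]

mismatches = locale_redirect_mismatches(OBSERVED)
```

Any pair in `mismatches` is a language variant that a crawler requested but never received.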

Common failure modes

  • React or Vue single-page application without SSR/SSG, served as a near-empty HTML shell to crawlers
  • Sitemap pointing to URLs that redirect (chained redirects degrade crawl efficiency)
  • Sitemap including URLs that return 404
  • Important content behind login walls
  • Faceted product navigation generating millions of crawlable URL variants
  • AJAX-loaded content that crawlers cannot trigger
  • Pagination requiring JavaScript click events to advance
  • Admin URL parameters generating duplicate content (sort orders, view modes)
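
The two sitemap failure modes in the list above (entries that redirect, entries that return 404) can be checked with a short audit. The sketch below parses a sitemap with the standard library and classifies each URL through a caller-supplied `status_of` lookup; the sample sitemap and the `STATUSES` table are hypothetical stand-ins for real HTTP requests.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a <urlset> sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def audit_sitemap(xml_text, status_of):
    """Sort sitemap URLs into ok / redirect / error using status_of(url) -> int."""
    result = {"ok": [], "redirect": [], "error": []}
    for url in sitemap_urls(xml_text):
        status = status_of(url)
        if status == 200:
            result["ok"].append(url)
        elif 300 <= status < 400:
            result["redirect"].append(url)
        else:
            result["error"].append(url)
    return result


# Hypothetical sitemap and status table standing in for live responses.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
  <url><loc>https://example.com/gone</loc></url>
</urlset>"""

STATUSES = {
    "https://example.com/": 200,
    "https://example.com/old-page": 301,
    "https://example.com/gone": 404,
}

sitemap_report = audit_sitemap(SAMPLE_SITEMAP, STATUSES.get)
```

In practice `status_of` would issue a request without following redirects, so that a 301 is reported as such rather than as the status of its destination.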

Diagnostic interpretation

Crawlability at floor with ai-crawler-access at high indicates a structural site-architecture problem distinct from access blocking. The crawler can reach the site but cannot navigate it. Engineering remediation is required.

Crawlability at low with sitemap-validity also low indicates that sitemap work has not been done and that internal linking is deficient. O-4 work targeting both produces compounding lift.

Crawlability at high with other V1.2 datapoints at low indicates a site that is structurally well-organized but has other infrastructure issues. The remedy is to address the specific other low datapoints rather than re-doing crawlability work.
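
The three readings above can be written down as simple rules. The sketch below is an illustration, not a scoring implementation: it assumes levels are the strings "floor", "low", and "high", uses the datapoint names mentioned in this section, and introduces `page-speed` purely as a hypothetical example of another datapoint.

```python
def interpret(levels):
    """Map datapoint levels to the diagnostic readings described in this section."""
    crawl = levels.get("crawlability")
    others_low = sorted(
        name
        for name, level in levels.items()
        if name != "crawlability" and level in ("low", "floor")
    )
    if crawl == "floor" and levels.get("ai-crawler-access") == "high":
        return "Structural architecture problem, not access blocking; engineering remediation required."
    if crawl == "low" and levels.get("sitemap-validity") == "low":
        return "Sitemap and internal linking both deficient; target both in O-4 work."
    if crawl == "high" and others_low:
        return "Structure is sound; address: " + ", ".join(others_low)
    return "No rule matched."


# Hypothetical readings.
case_structural = interpret({"crawlability": "floor", "ai-crawler-access": "high"})
case_sitemap = interpret({"crawlability": "low", "sitemap-validity": "low"})
case_other = interpret({"crawlability": "high", "page-speed": "low"})
```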