
Sitemap Validity

technical-health floor concept multilingual

sitemap-validity

What this datapoint measures

Presence, validity, and freshness of XML sitemaps. Whether the brand provides a sitemap that conforms to the Sitemap protocol, includes the brand’s important pages, and reflects current content (not stale URLs from past site versions).

What high looks like

  • sitemap.xml present at standard location (root or referenced in robots.txt)
  • Sitemap validates against the Sitemap protocol XSD
  • Sitemap includes the brand’s important pages
  • Sitemap excludes utility pages (login, account management, parameter variants)
  • lastmod dates reflect actual modification times
  • Multiple sitemaps with sitemap index for sites with substantial content
  • Per-language sitemaps (or single sitemap with hreflang-tagged URLs) for multilingual sites
  • Sitemap referenced from robots.txt
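Several of the checks above can be automated. The following is a minimal sketch of a structural check using only the Python standard library. It is not full XSD validation (the standard library has no schema validator); it only confirms that the file is well-formed, uses the Sitemap protocol namespace, and gives every entry an absolute <loc>. The function name check_urlset is illustrative, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_urlset(xml_text):
    """Return a list of structural problems found in a <urlset> sitemap.

    An empty list means no problems were found by these basic checks;
    it does not prove the file is schema-valid.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if not urls:
        problems.append("urlset contains no <url> entries")

    for position, url in enumerate(urls, start=1):
        loc = (url.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        # The protocol requires fully qualified URLs in <loc>.
        if not loc.startswith(("http://", "https://")):
            problems.append(f"entry {position}: missing or non-absolute <loc>")
    return problems
```

A sitemap index file has a different root element (<sitemapindex>) and would need its own check; this sketch reports it as an unexpected root.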

What low looks like

  • Sitemap present but incomplete (significant content sections missing)
  • Sitemap including 404 URLs, redirected URLs, or non-canonical variants
  • Sitemap with stale lastmod dates that don’t reflect actual modifications
  • Sitemap not referenced from robots.txt
  • Sitemap exceeding the per-file limits of 50,000 URLs or 50 MB (uncompressed) without sitemap-index splitting
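To illustrate sitemap-index splitting, here is a rough sketch that chunks a URL list by the 50,000-URL limit and builds an index file pointing at the resulting sitemaps. It checks only the URL count, not the byte-size limit, and the function names and the sitemap file naming are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the Sitemap protocol


def split_urls(urls, max_urls=MAX_URLS_PER_FILE):
    """Split a URL list into chunks that each fit in one sitemap file."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_sitemap_index(sitemap_locations):
    """Build a <sitemapindex> document listing the given sitemap URLs."""
    # Serialize the Sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for location in sitemap_locations:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = location
    return ET.tostring(index, encoding="unicode")
```

Each chunk would then be written out as its own sitemap file, with the index published at the location that robots.txt references.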

What at floor looks like

A brand at floor on sitemap-validity has no sitemap, or a sitemap so broken or stale it provides no useful information to crawlers. AI systems and search engines must discover content entirely through internal linking, which works for well-linked pages and fails for orphaned or deeply nested ones.

The remedy is straightforward: implement sitemap generation as part of the CMS or build process. Modern CMSes typically have plugins or built-in functionality for this; custom-built sites require explicit implementation.
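For a custom-built site, the explicit implementation can be small. The sketch below assumes the build process can supply a list of (URL, modification date) pairs; that input shape and the function name build_sitemap are assumptions for illustration, not a prescribed interface.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a <urlset> document from (url, modified_date) pairs.

    Each page carries its own modification date, so lastmod reflects
    per-page changes rather than the time of the build.
    """
    # Serialize the Sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, modified in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # date.isoformat() yields YYYY-MM-DD, one of the W3C Datetime forms.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")
```

Running this on every build or publish event, rather than once, is what keeps the file from going stale.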

What affects this datapoint

  • Whether sitemap.xml is present at all
  • Validity against Sitemap protocol
  • Completeness (does it include all important content)
  • Freshness (do lastmod dates match actual modification times)
  • Robots.txt reference
  • Sitemap-index usage for large sites
  • Per-language handling for multilingual sites
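The robots.txt reference is easy to check mechanically. As a minimal sketch, the function below extracts Sitemap directives from the text of a robots.txt file; fetching the file is left out, and the function name is hypothetical.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs declared in 'Sitemap:' lines of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons survive.
        key, separator, value = line.partition(":")
        if separator and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

An empty result means crawlers get no sitemap pointer from robots.txt and must rely on the default /sitemap.xml location or on direct submission.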

OMG actions that influence this datapoint

Action: O-4 Technical Infrastructure, Performance & International Foundation
Influence: Direct, primary. Sitemap implementation is a core component of O-4.

Multilingual considerations

For multilingual sites, the sitemap should either:

  • Include all language variants in a single sitemap, annotating each URL with xhtml:link rel="alternate" hreflang entries that point to its other-language equivalents
  • Provide separate per-language sitemaps with a sitemap index referencing all of them

Either pattern works; mixing them or implementing them inconsistently gives crawlers conflicting signals about which URLs are language alternates of one another. The choice typically follows the brand’s existing site architecture: subdomain-based language variants tend to use separate sitemaps; path-based language variants tend to use a unified sitemap.
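The unified-sitemap pattern can be sketched as follows, using the xhtml:link convention that Google documents for sitemaps: every language variant gets its own <url> entry, and each entry lists all variants of that page, itself included. The input shape (one dict of language code to URL per page) and the function name are assumptions for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_multilingual_sitemap(pages):
    """Build a single sitemap with hreflang alternates.

    pages: list of dicts, one per logical page, mapping an hreflang
    code (for example "en" or "de") to that variant's URL.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for variants in pages:
        for loc in variants.values():
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
            # List every variant, including this one, as an alternate.
            for lang, href in variants.items():
                ET.SubElement(
                    url,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="unicode")
```

Generating all entries from one mapping keeps the annotations reciprocal, which is the consistency the paragraph above calls for.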

Common failure modes

  • Sitemap generated by a plugin that no longer runs, leaving a stale file
  • Sitemap including admin URLs that should be excluded
  • Sitemap including URL parameters that produce duplicate-content variants
  • Sitemap with all lastmod dates set to the same value (the build date) rather than per-page modification dates
  • Sitemap files too large (>50 MB uncompressed or >50,000 URLs) without index splitting
  • Sitemap not gzipped on large sites, which slows crawler downloads
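Some of these failure modes can be detected from the sitemap entries alone. The sketch below flags uniform lastmod values, parameterized URLs, and utility paths in a list of (loc, lastmod) pairs. The path prefixes are hypothetical placeholders, and a simple prefix match can over-match (for example, "/admin" also matches "/administrator-guide"), so treat the findings as candidates for review.

```python
from urllib.parse import urlsplit

# Hypothetical path prefixes for utility pages; adjust to the site in question.
UTILITY_PREFIXES = ("/admin", "/login", "/account")


def audit_entries(entries):
    """Flag common sitemap failure modes in a list of (loc, lastmod) pairs."""
    findings = []
    lastmods = [lastmod for _, lastmod in entries]
    # Every entry sharing one lastmod usually means a build date was
    # stamped on all pages instead of per-page modification dates.
    if len(entries) > 1 and all(lastmods) and len(set(lastmods)) == 1:
        findings.append(
            f"all {len(entries)} lastmod values are identical ({lastmods[0]})"
        )
    for loc, _ in entries:
        parts = urlsplit(loc)
        if parts.query:
            findings.append(f"parameterized URL: {loc}")
        if parts.path.startswith(UTILITY_PREFIXES):
            findings.append(f"utility URL: {loc}")
    return findings
```

Detecting a plugin that no longer runs needs outside information, such as comparing the newest lastmod against the site's most recent known publish date.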

Diagnostic interpretation

Sitemap-validity at floor with crawlability also low indicates that O-4 work is needed broadly, not only on the sitemap.

Sitemap-validity at low with crawlability high indicates that internal linking is good enough to enable discovery without a sitemap, but the formal sitemap-based discovery channel is broken. Some AI systems rely heavily on sitemaps; others rely more on link discovery. The brand may experience differential indexing across systems.

Sitemap-validity at high with multilingual-readiness at low indicates a brand with sitemaps for languages that have minimal content. The sitemap structure is in place ahead of the content; the remedy is M-pillar work to fill the structure.