Entity Recognition

semantic-density floor concept multilingual

`entity-recognition`

What this datapoint measures

The clarity with which named entities (people, organizations, products, places, events) appear in content: whether entities are referenced consistently, disambiguated where needed, and presented with enough context for AI systems to ground them in their internal entity representations.

What high looks like

Named entities appear with consistent canonical naming throughout
First mention of an entity provides disambiguating context (full name, role, organization)
Subsequent mentions use stable references (consistent abbreviations or shortened names)
Ambiguous entities are disambiguated explicitly (e.g., “Apple Inc.” rather than just “Apple” when context is unclear)
Named entities link to authoritative source pages where appropriate (Wikipedia, official sites)
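
The first three criteria above can be spot-checked mechanically. The following is a minimal sketch, not part of the datapoint's actual scoring: the function name `audit_entity_mentions` and its return shape are hypothetical, and it assumes the editor already knows the canonical name and the accepted short forms for the entity being audited.

```python
import re

def audit_entity_mentions(text, canonical, short_forms):
    """Hypothetical helper: list how one entity is named across a text.

    canonical   -- the full canonical name, e.g. "Apple Inc."
    short_forms -- accepted shortened references, e.g. ["Apple"]
    """
    forms = [canonical] + list(short_forms)
    # Try the longest form first so "Apple Inc." wins over "Apple".
    alternatives = "|".join(
        re.escape(f) for f in sorted(forms, key=len, reverse=True)
    )
    # Lookarounds keep "Apple" from matching inside "Applesauce".
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    mentions = [m.group(0) for m in pattern.finditer(text)]
    return {
        "mentions": mentions,
        "first_mention_is_canonical": bool(mentions) and mentions[0] == canonical,
        "distinct_forms": sorted(set(mentions)),
    }
```

A page that opens with the short form and only later gives the full name would come back with `first_mention_is_canonical` set to `False`, which is the kind of signal an editorial review might look at. Whether the first mention also carries enough role or organization context still needs human judgment.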

What low looks like

Named entities referenced inconsistently (different name forms across the same page)
First mention of an entity without disambiguating context
Ambiguous entities left ambiguous
Named entities not linked even where appropriate

What at floor looks like

A brand at floor on entity-recognition has content where named entities are referenced loosely, inconsistently, and without disambiguation. AI systems trying to ground entity claims face ambiguity that they may resolve incorrectly or decline to resolve at all.

This pattern is common in informal editorial cultures, in content written by marketers who are not subject-matter experts, and in content translated literally without the disambiguation that translation requires.

What affects this datapoint

Consistent canonical naming for entities
Disambiguating context on first mention
Stable abbreviated references
Linking to authoritative entity sources where appropriate
Distinction between similarly-named entities

OMG actions that influence this datapoint

Action	Influence
O-6 Content Audit & Baseline Optimization	Substantial. Audit surfaces inconsistent entity naming.
G-1 External Entity Verification, Knowledge Graph & Local Authority	Substantial. G-1 work establishes canonical entity references that flow into editorial practice.
M-6 Evidence-Based Content & Citation Architecture	Substantial. Citation discipline often involves disambiguating cited entities.

Multilingual considerations

Entity recognition is significantly more language-sensitive than most V2.1 datapoints:

Person names in non-Latin scripts require careful canonical handling. Japanese and Korean honorifics can change the surface form of a name from one mention to the next, which affects entity-recognition, so the canonical form should appear at first mention.
Organization names that include language-specific suffixes (Inc., 株式会社, etc.) should be presented in the form appropriate to the content language.
Cross-language entity references should use canonical names per language (e.g., a Japanese article referencing Apple Inc. uses the Japanese-script form アップル if the content language calls for it).

Auto-translated content often produces entity-recognition failures because translation engines do not handle canonical entity forms consistently.
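
One way to keep per-language canonical names stable is a small lookup table that editors and translation workflows share. The sketch below is illustrative only: the registry structure, the `apple-inc` key, the `canonical_name` function, and the English fallback are all assumptions, not something this datapoint prescribes. The two name forms are the ones used in the example above.

```python
# Hypothetical registry: entity id -> canonical name per content language.
CANONICAL_NAMES = {
    "apple-inc": {
        "en": "Apple Inc.",
        "ja": "アップル",
    },
}

def canonical_name(entity_id, lang, fallback_lang="en"):
    """Return the canonical name for a content language.

    Falls back to fallback_lang when no localized form is registered
    (an assumption of this sketch, not a documented rule).
    Raises KeyError if the entity id is unknown.
    """
    names = CANONICAL_NAMES[entity_id]
    return names.get(lang, names[fallback_lang])
```

For example, `canonical_name("apple-inc", "ja")` yields the Japanese-script form, while a language with no registered form falls back to the English name. A team might prefer to raise an error instead of falling back, so that a missing localized form is caught during review.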

Common failure modes

Entity referenced by full name in introduction, then by ambiguous shortened name throughout
Multiple entities with similar names not disambiguated
Imported or translated content using entity names from the source culture rather than the localized form
Person references without role or organization context
Product references without brand context
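
The second failure mode, similarly named entities sharing one short name, can be flagged before publication if the entities on a page are listed with their short forms. This is a rough sketch under that assumption; `ambiguous_short_forms` and the registry layout are hypothetical names introduced for illustration.

```python
from collections import defaultdict

def ambiguous_short_forms(registry):
    """Find short names that point at more than one canonical entity.

    registry maps each canonical name to the short forms used for it.
    Returns {short_form: [canonical names sharing it]}.
    """
    owners = defaultdict(set)
    for canonical, short_forms in registry.items():
        for short in short_forms:
            owners[short].add(canonical)
    return {
        short: sorted(names)
        for short, names in owners.items()
        if len(names) > 1
    }
```

Given a page that mentions both Apple Inc. and Apple Records and shortens each to "Apple", the function returns that short form with both canonical names, which tells the editor that every later mention needs explicit disambiguation.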

Diagnostic interpretation

Entity-recognition at floor with entity-schema (V1.1) also at floor indicates a brand with weak entity discipline at both the structured and unstructured levels. G-1 work and M-pillar editorial work address both.

Entity-recognition at low with entity-schema at high indicates a brand with strong structured entity declarations but weak entity discipline in editorial content. The structured signals are correct; the prose around them is not. Editorial standard-setting is the remedy.
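
The two readings above can be written down as a lookup, which may help when the interpretation is applied across many brands. This is only a sketch: the `DIAGNOSES` table and `diagnose` function are hypothetical, and the table deliberately covers just the two combinations described here. Any other combination returns `None` instead of a guessed interpretation.

```python
# Keys are (entity-recognition level, entity-schema level).
# Only the two combinations described in this section are encoded.
DIAGNOSES = {
    ("floor", "floor"): {
        "finding": "weak entity discipline at both the structured "
                   "and unstructured levels",
        "remedy": "G-1 work and M-pillar editorial work",
    },
    ("low", "high"): {
        "finding": "strong structured entity declarations but weak "
                   "entity discipline in editorial content",
        "remedy": "editorial standard-setting",
    },
}

def diagnose(entity_recognition, entity_schema):
    """Return the documented interpretation, or None if none is documented."""
    return DIAGNOSES.get((entity_recognition, entity_schema))
```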