Structured Data
Schema.org
The shared vocabulary for structured-data markup used by Google, Microsoft, and other major search engines. As of June 2026, Schema.org publishes monthly aggregate adoption statistics by type.
What it is
Schema.org is a collaboratively maintained structured-data vocabulary adopted by Google, Microsoft, Yandex, and other major search engines. It defines a hierarchy of types and properties that describe real-world things, content, and relationships.
Why it matters
It gives AI search systems a common, machine-readable language for the entities on a page, which improves how those systems classify content and resolve it to known entities rather than guessing from raw text.
How it works
Authors select the most specific applicable type from the Schema.org hierarchy, then describe it using that type's properties, usually serialised as JSON-LD in the page head.
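The selection and serialisation step can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper name to_jsonld_script and all field values are hypothetical, and NewsArticle is chosen only to show a narrower subtype being preferred over the generic Article parent.

```python
import json


def to_jsonld_script(data: dict) -> str:
    """Wrap a Schema.org description in a JSON-LD script element.

    Hypothetical helper for illustration; not part of any library.
    """
    payload = {"@context": "https://schema.org", **data}
    body = json.dumps(payload, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    # "<\/" is still valid JSON and decodes back to "</".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values; the type is the narrowest one that fits the page.
article = {
    "@type": "NewsArticle",
    "headline": "Example headline",
    "datePublished": "2024-01-15",
    "author": {"@type": "Person", "name": "Jane Doe"},
}

snippet = to_jsonld_script(article)
print(snippet)
```

The resulting element would typically be placed in the page head, as described above.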
When it applies
It applies whenever a page describes a recognisable entity or content type that search engines and answer engines may need to interpret precisely.
Examples
- Marking an article with the Article type and an author property pointing to a Person
- Describing a business with LocalBusiness, including address and openingHoursSpecification
- Using Product with an aggregateRating to express review data
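As a rough sketch of the LocalBusiness and Product examples above, the following Python builds each description as a dictionary and prints it as JSON-LD. Every name, address, and rating value is a made-up placeholder; only the type and property names come from the Schema.org vocabulary.

```python
import json

# Placeholder business; all values are illustrative.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Bakery",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Exampletown",
        "addressCountry": "GB",
    },
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "08:00",
            "closes": "17:00",
        }
    ],
}

# Placeholder product with aggregated review data.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": 4.4,
        "reviewCount": 89,
    },
}

for item in (local_business, product):
    print(json.dumps(item, indent=2))
```

Nested objects such as PostalAddress and AggregateRating carry their own @type, which is how a property points at another entity rather than at a plain string.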
How it is measured
- Proportion of pages emitting valid Schema.org types versus total pages
- Type specificity (use of the narrowest applicable type rather than a generic parent)
- Count of validation errors and warnings per page
- Coverage of recommended properties for each chosen type
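Three of the measures above could be computed along these lines. This is a sketch under stated assumptions: the PageAudit record, the RECOMMENDED table, and the sample pages are all hypothetical, error and warning counts are assumed to come from an external validator, and a page counts as valid here only if it has at least one typed item and zero errors. Type specificity is left out because it needs the full type hierarchy.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical recommended-property lists; real lists depend on the
# documentation of whichever engine consumes the markup.
RECOMMENDED = {
    "Article": {"headline", "author", "datePublished", "image"},
    "Product": {"name", "image", "offers", "aggregateRating"},
}


@dataclass
class PageAudit:
    url: str
    items: list = field(default_factory=list)  # parsed JSON-LD objects
    errors: int = 0    # validator errors (from an external tool)
    warnings: int = 0  # validator warnings (from an external tool)


def valid_proportion(pages: list) -> float:
    """Share of pages with at least one typed item and no validator errors."""
    if not pages:
        return 0.0
    valid = sum(
        1 for p in pages
        if p.errors == 0 and any("@type" in item for item in p.items)
    )
    return valid / len(pages)


def errors_per_page(pages: list) -> float:
    """Mean validator error count across all audited pages."""
    return sum(p.errors for p in pages) / len(pages) if pages else 0.0


def property_coverage(item: dict) -> Optional[float]:
    """Fraction of recommended properties present, or None if the type is unknown."""
    item_type = item.get("@type")
    if not isinstance(item_type, str) or item_type not in RECOMMENDED:
        return None  # multi-typed or unlisted items are skipped in this sketch
    recommended = RECOMMENDED[item_type]
    return len(recommended & item.keys()) / len(recommended)


pages = [
    PageAudit("/a", [{"@type": "Article", "headline": "x", "author": {}}], warnings=1),
    PageAudit("/b"),
    PageAudit("/c", [{"@type": "Product", "name": "w"}], errors=2),
]
print(valid_proportion(pages), errors_per_page(pages))
print([property_coverage(i) for p in pages for i in p.items])
```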
Related terms in Structured Data
- JSON-LD: The recommended syntax for embedding Schema.org structured data on a page. Lightweight, decoupled from page HTML, and increasingly the format LLMs prefer when retrieving structured facts.
- Knowledge graph entity: A node in Google's Knowledge Graph representing a real-world thing (person, place, organisation, work). Strong entity signals are a prerequisite for Search Profile eligibility and consistent AI Overview attribution.
- Retrieval-augmented generation: A generation approach where an LLM pulls relevant documents at query time and uses them as the source for its answer. The pattern behind most enterprise AI search products and Perplexity-style answer engines.
- Structured data: Markup (typically JSON-LD using Schema.org vocabulary) that tells search engines and LLMs what the entities and relationships on a page are. Increasingly important as both Google and generative systems converge on entity-level understanding.