Structured Data
Retrieval-augmented generation
Also known as: RAG
A generation approach where an LLM pulls relevant documents at query time and uses them as the source for its answer. The pattern behind most enterprise AI search products and Perplexity-style answer engines.
What it is
In retrieval-augmented generation, a retrieval step fetches documents relevant to the query and supplies them to the LLM, which grounds its answer in that retrieved material rather than relying solely on what it learned during training.
Why it matters
Because the retrieval step decides which passages the model sees, RAG determines whether and how your content is selected, quoted, and attributed in AI answers. That makes retrievability and clear, chunk-level structure as important as traditional ranking.
How it works
A system embeds and indexes source content, retrieves the passages closest to the query, and passes them to the model as context, so the model composes a response that is grounded in those passages and often cites them.
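The embed, index, retrieve, and pass-as-context steps can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the word-count `embed` function is a toy stand-in for a learned embedding model, the function names and `PASSAGES` are hypothetical, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(passages):
    """Index step: store each passage alongside its embedding."""
    return [(p, embed(p)) for p in passages]

def retrieve(index, query, k=2):
    """Retrieval step: return the k passages closest to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [p for p, _ in ranked[:k]]

def build_prompt(query, passages):
    """Context step: number the passages so the model can cite them."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

# Hypothetical passages for illustration only.
PASSAGES = [
    "JSON-LD is the recommended syntax for embedding Schema.org structured data.",
    "Retrieval-augmented generation grounds answers in documents fetched at query time.",
    "A knowledge graph entity represents a real-world person, place, or organisation.",
]
QUERY = "What syntax is recommended for structured data?"

index = build_index(PASSAGES)
top = retrieve(index, QUERY, k=2)
prompt = build_prompt(QUERY, top)  # this string would be sent to the LLM
```

Numbering the sources in the prompt is one simple way to make citations possible; production systems typically replace the word-count vectors with dense embeddings and a vector index.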
When it applies
It applies wherever answers must reflect current, source-specific, or proprietary information rather than the model's static training data.
Examples
- An answer engine retrieving three articles and synthesising a cited summary
- An internal support bot grounding replies in a company's documentation index
- A site optimising self-contained sections so passages retrieve cleanly out of context
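The last example, sections that retrieve cleanly out of context, can be illustrated with a small chunking sketch. It assumes a page is available as a list of text lines with known heading lines; the function name `chunk_by_heading` and the title-prefix format are hypothetical choices, and real pipelines chunk in many different ways.

```python
def chunk_by_heading(title, lines, headings):
    """Build one chunk per heading, prefixed with the page title and heading
    so each chunk still makes sense when retrieved on its own."""
    chunks = []
    current_heading, body = None, []

    def flush():
        # Emit the section collected so far, if it has any body text.
        if current_heading and body:
            chunks.append(f"{title} / {current_heading}: {' '.join(body)}")

    for line in lines:
        if line in headings:
            flush()
            current_heading, body = line, []
        elif current_heading:
            body.append(line.strip())
    flush()
    return chunks

# Hypothetical page content for illustration only.
lines = [
    "What it is",
    "An LLM grounds its answer in retrieved documents.",
    "Why it matters",
    "Retrieval decides which passages get quoted.",
]
headings = {"What it is", "Why it matters"}
chunks = chunk_by_heading("Retrieval-augmented generation", lines, headings)
```

Each resulting chunk names its own topic, so a passage such as "Retrieval decides which passages get quoted." is not stranded without the subject it refers to.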
How it is measured
- Retrieval relevance of passages returned for a target query
- Faithfulness of generated answers to the retrieved sources
- Citation accuracy and frequency for your content in answers
- Coverage of expected queries by retrievable, well-structured passages
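Three of these measures reduce to simple ratios once passages have identifiers and someone has labelled which passages are relevant or supporting. The sketch below assumes such labels exist; the function names and the sample IDs are hypothetical, and faithfulness is omitted because it usually needs human or model-based judgement rather than a set comparison.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Retrieval relevance: share of relevant passages found in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)

def citation_accuracy(cited_ids, supporting_ids):
    """Share of an answer's citations that point to a passage that supports it."""
    if not cited_ids:
        return 0.0
    supporting = set(supporting_ids)
    return sum(1 for c in cited_ids if c in supporting) / len(cited_ids)

def query_coverage(results_by_query, own_ids):
    """Share of target queries for which at least one of your passages is retrieved."""
    if not results_by_query:
        return 0.0
    own = set(own_ids)
    covered = sum(1 for ids in results_by_query.values() if own & set(ids))
    return covered / len(results_by_query)

# Hypothetical labelled data for illustration only.
recall = recall_at_k(["p3", "p1", "p7"], ["p1", "p2"], k=3)
accuracy = citation_accuracy(["p1", "p7"], ["p1", "p2"])
coverage = query_coverage({"q1": ["p1", "p9"], "q2": ["p8"]}, ["p1", "p2"])
```

With the sample data, each measure comes out at 0.5: one of two relevant passages is retrieved, one of two citations is supported, and one of two queries surfaces your content.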
Related terms in Structured Data
- JSON-LD: The recommended syntax for embedding Schema.org structured data on a page. Lightweight, decoupled from page HTML, and increasingly the format LLMs prefer when retrieving structured facts.
- Knowledge graph entity: A node in Google's Knowledge Graph representing a real-world thing (person, place, organisation, work). Strong entity signals are prerequisite for Search Profile eligibility and consistent AI Overview attribution.
- Schema.org: The shared vocabulary for structured-data markup used by Google, Microsoft, and major search engines. As of June 2026, Schema.org publishes monthly aggregate adoption statistics by type.
- Structured data: Markup (typically JSON-LD using Schema.org vocabulary) that tells search engines and LLMs what the entities and relationships on a page are. Increasingly important as both Google and generative systems converge on entity-level understanding.