How do you extract JSON-LD from a web page for an SEO audit?
To extract JSON-LD from a web page, get the HTML and look for <script type="application/ld+json"> blocks. Copy their content, check that it is valid, identify the Schema.org types used, then compare the structured data with the visible content of the page.
Explanation
JSON-LD is a common format for embedding structured data in a web page. In SEO, it shows how a page describes itself to search engines: content type, product, article, organization, FAQ, breadcrumb, author, price, reviews or main entities.
For an audit, extraction alone is not enough. You need to check three things: JSON validity, consistency with the visible content and alignment with the rules for the Schema.org type used. A JSON-LD block can be technically valid but useless, incomplete or inconsistent with the page. Ideally, analyze it together with the page Markdown to identify missing fields, contradictory data and improvement opportunities.
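The consistency check described above can be partly automated. The sketch below is a rough illustration, not a definitive method: the function names, the default field list (name, headline, price) and the plain substring comparison are all assumptions made for this example. It flags JSON-LD values that do not appear anywhere in the visible text (or page Markdown) so that a human can review them.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return re.sub(r"\s+", " ", str(text)).strip().lower()


def find_unmatched_fields(item, visible_text, fields=("name", "headline", "price")):
    """Return {path: value} for selected fields whose value is absent from the visible text.

    `fields` is a hypothetical starting list; adjust it to the Schema.org type audited.
    Nested objects (for example an Offer inside a Product) are walked as well.
    """
    haystack = normalize(visible_text)
    unmatched = {}

    def walk(node, path=""):
        if isinstance(node, dict):
            for key, value in node.items():
                child_path = f"{path}.{key}" if path else key
                if key in fields and isinstance(value, (str, int, float)):
                    # Crude check: the exact value must occur somewhere in the text.
                    if normalize(value) not in haystack:
                        unmatched[child_path] = value
                else:
                    walk(value, child_path)
        elif isinstance(node, list):
            for index, child in enumerate(node):
                walk(child, f"{path}[{index}]")

    walk(item)
    return unmatched
```

A substring match is deliberately simple: it will report false positives when the page formats a value differently (for example "29,90" versus "29.90"), so treat the result as a list of items to check by hand rather than a verdict.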
Formula / method
Simple steps:
- Retrieve the page HTML or DOM.
- Search for <script type="application/ld+json">.
- Copy each JSON-LD block.
- Check JSON validity.
- Identify Schema.org types.
- Compare with visible content.
- Test with a validation tool, such as Google's Rich Results Test, if the goal is SEO.
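The first steps above (find the blocks, check JSON validity, identify the types) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the class and function names are invented for this example, the HTML is assumed to be already downloaded as a string, and blocks injected by JavaScript at runtime will only be found if you pass the rendered DOM rather than the raw HTML.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # None while outside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            type_attr = (dict(attrs).get("type") or "").strip().lower()
            if type_attr == "application/ld+json":
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def extract_json_ld(html):
    """Return (parsed_items, errors); errors holds (block_index, message) pairs."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    parsed, errors = [], []
    for index, raw in enumerate(parser.blocks):
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append((index, str(exc)))
    return parsed, errors


def collect_types(node):
    """Recursively gather every @type value, including nested and @graph items."""
    found = []
    if isinstance(node, dict):
        value = node.get("@type")
        if isinstance(value, str):
            found.append(value)
        elif isinstance(value, list):
            found.extend(v for v in value if isinstance(v, str))
        for child in node.values():
            found.extend(collect_types(child))
    elif isinstance(node, list):
        for child in node:
            found.extend(collect_types(child))
    return found
```

Calling extract_json_ld(html) and then collect_types(parsed) gives the list of Schema.org types declared on the page, plus the index of any block that is not valid JSON. It does not check Schema.org or search engine rules; those still need a dedicated validation tool and a manual comparison with the visible content.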
Concrete example
Example prompt:
Analyze this JSON-LD together with the page Markdown. Identify the Schema.org types, missing fields, inconsistencies with visible content and possible SEO improvements.
Common mistake
Do not treat JSON-LD as a guarantee of rich results or better rankings. It must be valid, consistent with the visible content and compliant with the rules of the target search engine. Also avoid auditing JSON-LD alone without reading the page.
Sources & methodology
- Google Search Central — Introduction to structured data — Explanation of structured data, supported formats and the JSON-LD recommendation.
- Google Search Central — General structured data guidelines — General rules for structured data eligible for rich results.
- Schema.org — Getting started — Introduction to the Schema.org vocabulary and the types/properties used in structured data.
- Google — Rich Results Test — Testing tool to check structured data eligible for rich results.
This content follows Outilo's editorial guidelines.