I want you to do something before you read the rest of this.
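Open a new browser tab and type your site’s URL, followed by /sitemap.xml or /sitemap_index.xml. Look at what comes back. If you’re on WordPress, the sitemap is almost certainly live: since version 5.5, WordPress core generates one at /wp-sitemap.xml even if you never installed an SEO plugin.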
Now scroll through it. Not the XML structure — just the URLs. What you’re looking at is a list of every page on your site that you’re asking Google to care about.
Is it what you intended to ask?
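If scrolling raw XML in a browser is awkward, the same list can be pulled out with a few lines of Python. This is a minimal sketch rather than a finished tool: the `extract_locs` name and the `SAMPLE` document are made up for illustration, and it assumes the file follows the standard sitemaps.org format. Paste in your own sitemap's XML in place of the sample.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol; every standard tag lives in it.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical two-page sitemap, standing in for whatever your site returns.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/content-strategy/pillar-guide/</loc></url>
  <url><loc>https://example.com/tag/misc/</loc></url>
</urlset>"""


def extract_locs(xml_text):
    """Return every <loc> value in a sitemap or sitemap index, in document order."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter(SITEMAP_NS + "loc")
        if el.text and el.text.strip()
    ]


if __name__ == "__main__":
    for url in extract_locs(SAMPLE):
        print(url)
```

One caveat: if what comes back is a sitemap index, the values are the addresses of child sitemaps rather than pages, so you would run the same function on each of those in turn.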
What the sitemap reveals
The XML sitemap is the most honest document on your site because nobody writes it deliberately. It’s generated automatically from your publishing decisions, which means it reflects what you’ve actually been doing rather than what you planned to do or what you tell clients you do.
A site with a clear content strategy has a sitemap that reflects that strategy. The URLs cluster into recognizable topical areas. The pillar content is there. The spoke content connects to it. The site sections make sense together.
A site without a clear content strategy has a sitemap that looks like a five-year archive of whatever seemed worth publishing at the time. Every tag archive that was automatically generated. Every thin placeholder page that was meant to be temporary and never was. Every post from the period when the editorial focus was different.
The categories of sitemap problems
There are four things a sitemap audit reliably finds:
Pages that shouldn’t be indexed.
Paginated archives, tag pages, author archives, search result pages, admin pages that somehow got listed: these take up space in your sitemap and eat into Google’s crawl budget without providing any value. They signal to Google that you don’t have a clear sense of what your content is for.
Thin or outdated content you forgot about.
The post from 2018 that was 300 words and exists only because someone thought “we should post something this week.” The seasonal content from three years ago that’s now wrong. The service page for a service you no longer offer. The sitemap shows you all of it.
The audit trail of your technical debt.
URL patterns from previous site structures, redirected URLs that were left in the sitemap, pages that moved and were never cleaned up. The sitemap often preserves the archaeological record of everything that’s ever been on the site.
The topical spread.
If you think you’re a content strategy site, does the URL pattern reflect that? Is a significant portion of your indexed pages about content strategy, or are they scattered across subjects that don’t reinforce each other?
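The first and last of these checks are mechanical enough to rough out in a script before the editorial judgment starts. What follows is a sketch under stated assumptions, not a definitive audit: the patterns in `ARCHIVE_PATTERNS` are guesses based on common WordPress URL defaults (/tag/, /author/, /page/2/, ?s=), the function names and the `URLS` list are hypothetical, and treating the first path segment as the "section" only works if your URL structure carries topical meaning.

```python
import re
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlparse

# Assumed patterns, modeled on common WordPress defaults. Adjust for your site.
ARCHIVE_PATTERNS = {
    "tag archive": re.compile(r"^/tag/"),
    "author archive": re.compile(r"^/author/"),
    "paginated archive": re.compile(r"/page/\d+/?$"),
    "admin page": re.compile(r"^/wp-admin/"),
}


def flag_url(url: str) -> Optional[str]:
    """Return a label if the URL looks like an auto-generated page, else None."""
    parts = urlparse(url)
    if "s" in parse_qs(parts.query):  # WordPress search results use ?s=term
        return "search results"
    for label, pattern in ARCHIVE_PATTERNS.items():
        if pattern.search(parts.path):
            return label
    return None


def section_counts(urls: Iterable[str]) -> Counter:
    """Count URLs by first path segment as a rough view of topical spread."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        counts[segments[0] if segments else "(home)"] += 1
    return counts


# Hypothetical URLs; replace with the list taken from your own sitemap.
URLS = [
    "https://example.com/",
    "https://example.com/content-strategy/pillar-guide/",
    "https://example.com/content-strategy/editorial-calendars/",
    "https://example.com/tag/misc/",
    "https://example.com/blog/page/3/",
    "https://example.com/?s=sitemap",
]

if __name__ == "__main__":
    for url in URLS:
        label = flag_url(url)
        if label:
            print(f"{label}: {url}")
    for section, count in section_counts(URLS).most_common():
        print(f"{count:>4}  {section}")
```

The script only sorts the pile. It can tell you that forty URLs sit under /tag/, but whether any of them has earned its place is still your call, and the thin and outdated pages in the second category need a human reader to spot.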
Why this is a content strategy document
The difference between a content plan and a content strategy is, in part, the difference between knowing what you published and knowing what you built. The sitemap is the record of what you built — or didn’t build.
A coherent sitemap is the output of deliberate content strategy: a decision about what this site is for, what topics it covers, what pages deserve to be indexed, and what pages aren’t worth Google’s attention. The content audit often starts with the sitemap because it’s the fastest way to see the scope of the problem.
The fix isn’t technical, or not mostly technical. The technical fix — removing pages from the sitemap, adding noindex tags, redirecting orphaned URLs — is straightforward. The work is the editorial decisions that precede it: what stays, what goes, what gets updated before it can stay.
What a healthy sitemap looks like
A healthy sitemap lists mostly pages that are valuable and that you actually want indexed. No paginated archives unless there’s a specific reason. No tag pages unless you’ve built them out into genuinely useful topic hubs. No thin content that you’re embarrassed to look at.
It has a pattern. The URLs cluster into a recognizable structure that reflects your expertise. Someone looking only at the sitemap should be able to say: this is a site about X, and it has a coherent body of content around Y and Z sub-topics.
That pattern is the architectural expression of your content strategy. If the pattern isn’t there, the strategy isn’t really there — it exists as intent, maybe as a document, but not as a built thing that Google can read.
Just stopping publication to do this audit — to actually look at what you’ve built rather than continuing to add to it — is sometimes the most strategic move available.
The sitemap is waiting. Go look.
Jacob Clifton is the principal of Clifton Creative, an editorial strategy consultancy based in Austin, Texas. He spent fourteen years as a flagship staff writer at Television Without Pity and has written for Tor.com, Vulture, BuzzFeed News, and the Austin Chronicle.
For inquiries:
jacob@cliftoncreative.agency · cal.com/cliftoncreative

