Type a search query right now.
Okay, now notice what you typed and how.
Abbreviated, clipped, something that reads more like a telegram than a sentence. “best content strategy podcast.” “schema markup wordpress how to.” “fractional cmo vs cmo difference.”
Now, think about the last time you asked a voice assistant something. You said something more like: “Hey [Robot], what’s the difference between a fractional CMO and a full-time CMO?” Or: “Hey [Assistant with, but not yet of, AI], how do I add schema markup to my WordPress site?” Or: “Hey [Machines], what are the best podcasts about content strategy?”
These are not the same query; they are not even close to the same query.
Yet most content teams are optimizing for the first kind and getting found — or not — by the second kind at an accelerating rate.
Voice search isn’t a niche channel anymore. It’s how a significant and growing portion of people interact with information. Particularly on mobile, particularly in the car, particularly in the moments when hands and eyes are occupied. The smart speaker in your kitchen. The phone in your pocket on the walk to the office.
A query asked while driving does not allow for scrolling through results. Think about how massively that changes everything you and I planned for our content not three years ago.
Zero-click. It’s a paradigm shift and I’m not sure A) anybody is there yet and B) anybody uses “paradigm shift” correctly anymore. But it literally is one:
The SERP doesn’t matter. The SERP, your entire chessboard for the last twenty or thirty years: gone. Zip.
Your good old 10 blue links? Gone, replaced by, in essence, one single link which we have no choice but to click.
CTR doesn’t matter anymore, although the intent matrix matters in a way the funnel used to matter. The 1,001 other metrics that involve clicks certainly don’t matter anymore, except as ingredients of ratios that define success in arcane ways.
Oh yeah, and speaking of the funnel? Basically doesn’t matter anymore. “Where somebody is in the journey” is “An AIO.” By the time a user searches in a way we can still monitor, they’re already in the process of being dropped off at a destination by a robot who already made their choice for them, drove them there, and will keep the car humming while they convert.
Local business is kind of lucky, because it never had the option of sitting back and letting content do the work:
Local always had to deal with what everyone is dealing with now, which is the messiness of humanity as translated through the messiness of iffy, robot-mediated search.
They don’t have to jump through as many semantic entity-identity hoops because they have always had to focus on real-world, concrete identity: NAP (name, address, phone number) and citations, the nitty-gritty.
My point is this: Your content was not written for any of these contexts. Does that seem like a problem to you at all? Because it looks like one to me.
Here is what we are going to do about it.
The Structural Difference
Typed queries are compressed. They omit articles, prepositions, connective tissue. They assume the search engine will figure out what is meant. They’re efficient in the way a text message is efficient, only more so.
Spoken queries are expanded. They include the full grammatical structure of a question, because that is how speech works.
They tend to include context — “near me,” “for my small business,” “that works with WordPress” — because speaking allows it. They are questions, often explicit: “what is,” “how do I,” “why does,” “when should.”
The implications for content are significant.
A piece optimized for the typed query “content audit checklist” is written for a different audience, approach, and milieu than a piece optimized for the spoken query “what should I include in a content audit checklist?”
The first might rank for abbreviated search. The second is what will be selected as the spoken answer.
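To make the typed-versus-spoken distinction concrete, here is a rough Python sketch that sorts queries into the two camps. Everything in it is an assumption for illustration: the function name `looks_spoken`, the two word lists, and the 25 percent threshold are mine, not rules any search engine publishes. It only encodes the two traits described above: spoken queries open like questions and keep their connective words.

```python
import re

# Hypothetical heuristic word lists; tune them against your own query data.
QUESTION_OPENERS = {
    "what", "how", "why", "when", "where", "who", "which",
    "can", "should", "does", "do", "is", "are",
}
FUNCTION_WORDS = {
    "a", "an", "the", "is", "are", "do", "does", "i", "my", "to",
    "of", "in", "for", "with", "between", "about", "and", "on",
}


def looks_spoken(query: str, min_function_share: float = 0.25) -> bool:
    """Guess whether a query reads like speech rather than typed shorthand."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return False
    # Spoken queries tend to open with a question word...
    starts_like_question = words[0] in QUESTION_OPENERS
    # ...and keep the articles and prepositions that typed queries drop.
    function_share = sum(w in FUNCTION_WORDS for w in words) / len(words)
    # The threshold is an assumed cutoff, not a measured one.
    return starts_like_question and function_share >= min_function_share


for q in [
    "fractional cmo vs cmo difference",
    "How do I add schema markup to my WordPress site?",
]:
    print(q, "->", "spoken" if looks_spoken(q) else "typed")
```

Run something like this over an export of your own search queries and you get a crude first read on how much of your traffic already arrives in full sentences.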
What Voice Search Needs From Your Content
Full questions as headers.
Not “Content Audit Checklist” as a header. “What Should a Content Audit Checklist Include?”
Headers that are complete questions match directly against voice queries. They also tell the answer system exactly what the section answers, which is the extraction signal it craves.
Direct answers in the first sentence of each section.
Voice answers are short. A voice assistant reading your content is going to extract one to three sentences and say them aloud. If the answer to the header question does not appear in the first sentence, the voice assistant either extracts the wrong thing or extracts nothing at all.
Answer immediately, completely, in plain language that sounds natural when spoken.
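The two checks so far (question as header, short direct answer up top) are simple enough to lint for. Below is a minimal Python sketch, assuming you can already pull each section's header and body text out of your CMS. The function name `audit_section` and the 60-word ceiling are my own placeholders, not an official limit, and the sample answer text is invented for the demo.

```python
import re


def audit_section(header: str, body: str, max_words: int = 60) -> list[str]:
    """Return a list of voice-readiness problems for one section."""
    problems = []
    # Check 1: the header should be a complete question.
    if not header.strip().endswith("?"):
        problems.append("header is not a full question")
    if not body.strip():
        problems.append("section has no answer text")
        return problems
    # Check 2: the first one to three sentences should be short enough
    # to read aloud. max_words is an assumed ceiling, not a platform rule.
    sentences = re.split(r"(?<=[.!?])\s+", body.strip())
    opening = " ".join(sentences[:3])
    if len(opening.split()) > max_words:
        problems.append("opening sentences are too long to read aloud")
    return problems


print(audit_section(
    "Content Audit Checklist",
    "A content audit checklist should include a URL inventory, "
    "traffic data, and a keep-or-cut decision for each page.",
))
```

It cannot tell you whether the opening actually answers the question. That part still takes a human reading it aloud.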
Conversational language throughout.
This is the counterintuitive one. SEO tells us to write for the reader. But voice search wants us writing for the listener — which means reading your content aloud to see whether it sounds like something a person would say.
Passive constructions, dense technical vocabulary, long sentences with multiple subordinate clauses, like the ones I constantly throw your way — these are hard to speak for the robot, and hard to hear for a mom who is driving.
Write shorter and more directly. Write the way you would explain something to a smart friend who is playing tennis.
Local and contextual signals where relevant.
Voice queries are disproportionately local. “Near me,” “in Austin,” “for small businesses” — these contextual modifiers appear in spoken queries far more than in typed ones. If your content is relevant to a specific geography or context, say so explicitly in the content itself, not just in the metadata.
The Speakable Schema Thing, Yet Again
I mentioned Speakable schema in the AEO post. It deserves another mention here, because it is specifically designed for exactly this problem.
Speakable schema tells Google’s voice interfaces which portions of your content are appropriate to read aloud. It is the explicit signal that says: this passage is well-formed for audio delivery. Use it.
Fun fact: it is implemented on fewer than one percent of eligible pages on the web. Here is your competitive advantage.
Mark your best and most voice-optimized passages with Speakable schema, validate them in the Rich Results Test, and blam! You have done something your competitors almost certainly have not done.
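For the curious, here is roughly what that markup looks like, generated with a few lines of Python. The `speakable` property, the `SpeakableSpecification` type, and `cssSelector` come from schema.org. The function name `speakable_jsonld`, the example URL, and the `.voice-answer` class are hypothetical stand-ins for whatever your own pages use.

```python
import json


def speakable_jsonld(page_name: str, page_url: str, css_selectors: list[str]) -> str:
    """Build a JSON-LD string that marks page sections as speakable."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": page_name,
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # CSS selectors pointing at the passages meant to be read aloud.
            "cssSelector": css_selectors,
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values; swap in your real page title, URL, and selectors.
markup = speakable_jsonld(
    "What Should a Content Audit Checklist Include?",
    "https://example.com/content-audit-checklist",
    [".voice-answer"],
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The printed script tag goes in the page's HTML, and the selector has to match the element that wraps your short, direct answer.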
The One Rewrite That Does the Most Work
If you want a single high-leverage change to make to your existing content for voice, it is this:
- Go to your ten highest-traffic pages.
- Find the question the page most directly answers — the query that drives most of its traffic.
- Write a short opening section that answers that question, directly and completely, in two to three sentences. Use plain, conversational language.
- Add a header over that section that is just the question, also in natural language.
And now that section is extractable as a voice answer. The rest of the page is unchanged, it didn’t hurt, and the optimization took ten or twenty minutes per page.
Voice search is not a content revolution, despite my heated language and Schema obsession. It is a content adjustment — a set of structural changes for taking existing, good content and making it accessible to a surface that is increasingly where the queries are coming from.
The adjustment is not hard. Most brands have not made it. But we are not most brands.
We will strike while the enemy is otherwise engaged.
I write about content strategy, editorial leadership, and sometimes the future of search.
For inquiries: jacob@cliftoncreative.agency · cal.com/cliftoncreative

