The importance of disambiguation for increasing visibility on Google

Eliminating ambiguity in page content can increase Google's confidence in its topic relevance and result in better visibility on Google Search

Rabbit-Duck illusion from the Oct 23, 1892 issue of Fliegende Blätter, a German humor magazine

The goal of Google’s search algorithm is to return results with the least amount of ambiguity as it relates to the query. Every tweak or significant update to its algorithm is meant to refine that process.

It’s an ever-changing and complex challenge to find the best results for a query because every person has a different writing style, and every site uses different code, layouts, links, and content. However, Google, at least thus far, has proven to be the best at parsing and comprehending page content and presenting the results searchers want to see most.

Thinking like Google

A core goal of search engine optimization (SEO) is convincing Google that your content is the best result for it to return. SEOs do this in many ways, and the methods they use change as Google’s algorithm changes. In some cases, part of an algorithm change might even be a direct response to a tactic that SEOs have overused.

As an SEO, I’ve had the most success with Google by attempting to think like Google. An excellent example of that is with structured data.

Long before the Schema.org vocabulary was created, I was a proponent of Microformats. It confounded me for years why Google wouldn’t use or advocate for them. The idea behind Microformats (and RDFa) was to disambiguate content, entities, and relationships through structured data. It was what I expected a search engine bot to want.

Ultimately, it was what Google wanted, but they wanted their own version, so they created Schema.org together with other major search engines. When Schema.org was released, I embraced it immediately. Structured data is a perfect way to disambiguate free-form content and to confirm to Googlebot what the different elements on a page are.

If you consider what Google’s algorithm is attempting to do, which is to prioritize the most relevant results from billions of pages, it becomes apparent that a significant part of the process is disambiguation. Google is continually trying to pick the top results to return, and the only way it can do that is to determine the least ambiguous content for the search query.

Disambiguation via structure

Disambiguation begins before any content is written. The way a site is structured and coded can communicate topic relevance to Google.

Folder structure

Sites that cover a multitude of topics but place all of their content at the root of the domain, or in a single nondescript folder like /news/, miss out on an opportunity for disambiguation.

Consider a content categorization strategy that uses only one category per article or chooses a primary category. Then use that category name in the URL for the content page. For example, when I write articles on SEO for Coywolf News, they always reside in the /seo/ folder (https://www.coywolf.news/seo/google-lighthouse-6/) and not off the root or /news/.
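As a rough illustration of the one-primary-category rule, the sketch below builds a URL path from a category name and an article title. The helper names `slugify` and `build_article_path` are made up for this example and are not part of any particular CMS.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and collapse every run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_article_path(primary_category: str, title: str) -> str:
    """Return a path of the form /<primary-category>/<article-slug>/."""
    return f"/{slugify(primary_category)}/{slugify(title)}/"


# The category and title here are illustrative inputs.
print(build_article_path("SEO", "Google Lighthouse 6"))
```

Because the function accepts exactly one category, an article cannot end up under several folders at once, which is the point of choosing a primary category.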

Schema.org structured data

As mentioned earlier, Schema.org structured data disambiguates content. For example, you can build off the aforementioned folder structure by including BreadcrumbList schema.

That will reinforce the topic relevance created by the folder structure and also enable a rich result feature.
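A minimal sketch of how BreadcrumbList markup could be derived from the folder structure follows. It assumes every path segment maps to one breadcrumb and that each level's URL ends in a trailing slash; the function name `breadcrumb_jsonld` and the display names passed to it are hypothetical.

```python
import json
from urllib.parse import urlsplit


def breadcrumb_jsonld(url: str, names: list[str]) -> dict:
    """Build Schema.org BreadcrumbList data with one ListItem per path segment."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) != len(names):
        raise ValueError("need exactly one display name per path segment")

    items = []
    path = ""
    for position, (segment, name) in enumerate(zip(segments, names), start=1):
        path += f"/{segment}"
        items.append({
            "@type": "ListItem",
            "position": position,
            "name": name,
            # Assumption: every level is reachable at its path plus a trailing slash.
            "item": f"{parts.scheme}://{parts.netloc}{path}/",
        })

    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }


markup = breadcrumb_jsonld(
    "https://www.coywolf.news/seo/google-lighthouse-6/",
    ["SEO", "Google Lighthouse 6"],  # display names are illustrative
)
print(f'<script type="application/ld+json">{json.dumps(markup, indent=2)}</script>')
```

The printed script element is what would be placed in the page's HTML.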

Example of BreadcrumbList being used in a rich result

Additionally, the use of Article schema or related types disambiguates the free-form elements that make up the primary page content by splitting them up into highly specific subtypes, like headline, description, articleBody, author, and more.
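To make the property split concrete, here is a small sketch that assembles Article markup from the properties named above. The function name `article_jsonld` and all of the sample values are placeholders, not data from a real page.

```python
import json


def article_jsonld(headline: str, description: str, body: str, author_name: str) -> dict:
    """Label each free-form part of an article with its Schema.org property."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "description": description,
        "articleBody": body,
        "author": {"@type": "Person", "name": author_name},
    }


# Placeholder values for illustration only.
markup = article_jsonld(
    headline="Example headline",
    description="One-sentence summary of the article.",
    body="Full text of the article goes here.",
    author_name="Example Author",
)
print(json.dumps(markup, indent=2))
```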

Semantic HTML

Semantic HTML is any element that communicates context and meaning to both bots and humans. The most common semantic elements used for text are <strong>, <em>, and <mark>. They communicate importance, emphasis, and highlights.

Several semantic HTML elements communicate the context of the content enclosed within them. Elements like <nav>, <main>, and <article> can help disambiguate content blocks for bots. They definitively communicate to a bot that the content within them is navigation, the main content of a page, or article content. They also help make pages more accessible for humans.
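The sketch below shows, very roughly, how a parser can use those elements to label blocks of text. It is only an illustration built on Python's standard `html.parser`, not a description of how Googlebot works, and the class name `LandmarkText` and the sample markup are invented for the example.

```python
from html.parser import HTMLParser

# The three elements named in the text; other semantic elements could be added.
LANDMARKS = {"nav", "main", "article"}


class LandmarkText(HTMLParser):
    """Group visible text by the innermost semantic element that contains it."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open landmark elements
        self.blocks = {}  # landmark name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in LANDMARKS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            label = self.stack[-1] if self.stack else "(unlabeled)"
            self.blocks.setdefault(label, []).append(text)


def text_by_landmark(html: str) -> dict:
    parser = LandmarkText()
    parser.feed(html)
    parser.close()
    return parser.blocks


sample = """
<nav><a href="/seo/">SEO</a></nav>
<main><article><h1>Lighthouse 6</h1><p>Article text.</p></article></main>
<div>Popular posts</div>
"""
print(text_by_landmark(sample))
```

Text inside a plain <div> lands in the "(unlabeled)" group, which is the ambiguity that semantic elements remove.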

Disambiguation via synonyms

In writing, we’re encouraged not to repeat the same words or phrases too much, so that the text stays interesting. Just as SEO has inadvertently made the web more accessible, avoiding word and phrase repetition for readers has had the side effect of disambiguating content for Google.

Google uses natural language processing (NLP) to understand how different words and phrases relate. When you repeat the same word or phrase without any variation, Google can use NLP to relate it to similar words or phrases on other sites. That page can then be returned for a query that uses a variation that doesn’t appear in your content.

However, including variations of a word or phrase removes ambiguity and increases the algorithm’s confidence in understanding the content. If Google’s goal is to return the best and most accurate results for a query, and if your content uses word and phrase variations that increase its confidence that your page matches the query intent, it will likely rank higher.

Disambiguation via entities

Before I discuss entities, let’s read how Google defines an entity:

An entity is a thing or concept that is singular, unique, well-defined and distinguishable. For example, an entity may be a person, place, item, idea, abstract concept, concrete element, other suitable thing, or any combination thereof. Generally, entities include things or concepts represented linguistically by nouns. For example, the color “Blue,” the city “San Francisco,” and the imaginary animal “Unicorn” may each be entities.

When I use entities for disambiguation, I primarily focus on “things or concepts represented linguistically by nouns.” When I write articles, I continuously think about entities that might relate to the main topic. Sometimes I include mentions, while other times, an entity relationship can reshape the entire article.

In the article I wrote on Coywolf News about Shopify’s new Shop app, the entity relationship between Shopify and Amazon reshaped how I wrote the article. I intentionally mentioned and wrote about a set of related entities throughout it.

The article performed exceptionally well in organic search results and Google News.

First slot in Top Stories after focusing on entities

I recommend always writing with related entities in mind. If it’s difficult to incorporate that way of thinking when you’re writing, consider adding it to your editing checklist.
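One way to turn that checklist item into something repeatable is a small script that reports which planned entities a draft never mentions. This is only a sketch: the function names, the entity list, and the draft text below are all invented for illustration, and simple string matching will miss pronouns and alternate names.

```python
import re


def entity_mentions(text: str, entities: list[str]) -> dict:
    """Count whole-word, case-insensitive mentions of each entity name."""
    counts = {}
    for entity in entities:
        pattern = r"\b" + re.escape(entity) + r"\b"
        counts[entity] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def missing_entities(text: str, entities: list[str]) -> list[str]:
    """Return the entities that the draft never mentions."""
    return [name for name, count in entity_mentions(text, entities).items() if count == 0]


# Hypothetical draft and entity list.
draft = "Shopify launched the Shop app. This draft compares Shopify with Amazon."
planned = ["Shopify", "Amazon", "Google News"]

print(entity_mentions(draft, planned))
print(missing_entities(draft, planned))
```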

Disambiguation via content types

Repeating the same information through different content types can reinforce a page’s topical relevance. The most common example is an image that has a descriptive alt attribute and is accompanied by text in a <figcaption> element or by text near the image.
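As a sketch of the image case, the helper below renders an image with both an alt attribute and a <figcaption>. The function name `figure_html` and the image path are assumptions made for the example; the caption text is borrowed from the illustration at the top of this article.

```python
from html import escape


def figure_html(src: str, alt: str, caption: str) -> str:
    """Render an image whose topic is stated twice: in alt text and in a visible caption."""
    return (
        "<figure>"
        f'<img src="{escape(src)}" alt="{escape(alt)}">'
        f"<figcaption>{escape(caption)}</figcaption>"
        "</figure>"
    )


print(figure_html(
    "/images/rabbit-duck.png",  # hypothetical path
    "Rabbit-Duck illusion",
    "Rabbit-Duck illusion from the Oct 23, 1892 issue of Fliegende Blätter",
))
```

Escaping the values keeps quotes or angle brackets in a caption from breaking the markup.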

Videos are another example of reinforcing topic relevance. That’s especially true if the page includes details about the video through Schema.org’s VideoObject type.

Additionally, tables are an excellent way to repeat content on a page if portions can be reasonably repurposed into that format. Tables are a form of structured data, and they can increase the chances of a page being returned as a featured snippet in Google.

Disambiguation via links

Links can help disambiguate content based on the pages it links to and the pages that link back to it.

Google depends on links to more fully understand the context of the content. When Google crawls a page, it also follows the internal and external links within the main content. It then associates the topic relevance of the linked pages back to the page that’s linking to them.

Including relevant links in the main content is a way to explicitly tell Google that your content is related to the topics of the pages you’re linking to.
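To review which topics a page is associating itself with, it can help to list the links that sit inside the main content. The sketch below does that with Python's standard library, assuming the page wraps its main content in a <main> element; the class name `MainLinks` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class MainLinks(HTMLParser):
    """Collect links found inside <main>, split into internal and external."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.depth = 0  # greater than 0 while inside <main>
        self.internal = []
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:
                absolute = urljoin(self.page_url, href)
                same_host = urlsplit(absolute).netloc == urlsplit(self.page_url).netloc
                (self.internal if same_host else self.external).append(absolute)

    def handle_endtag(self, tag):
        if tag == "main" and self.depth > 0:
            self.depth -= 1


sample = (
    '<nav><a href="/">Home</a></nav>'
    '<main><a href="/seo/">SEO</a> <a href="https://example.com/tool">tool</a></main>'
)
parser = MainLinks("https://www.coywolf.news/seo/google-lighthouse-6/")
parser.feed(sample)
print(parser.internal)
print(parser.external)
```

The navigation link is ignored because it sits outside <main>.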

Disambiguation via page content

Pages with a lot of additional content that is not related to the main content are ambiguous pages. Fortunately, as discussed earlier, you can use Schema.org structured data and semantic HTML to help disambiguate the main page content. However, even with structured data and semantic HTML, the unrelated content on the page is still crawled and parsed, and it may dilute the page’s topical relevance.

The ideal page puts the main content front and center and reduces the amount of unrelated content to a bare minimum. Where other content has to stay, try to make it relate to the main topic of the page.

Disambiguation via quantity

How much and how often you publish on a topic can reinforce topic relevance to Google. I’ve tested this approach on Coywolf News by choosing to write articles about topics and entities that I’ve written about before.

The more I’ve written about Chrome, ICANN, and Shopify, the better those articles have performed. I have intentionally tried to thread older stories into the newer ones by linking to them and occasionally quoting them. The site and its articles have been further disambiguated by other media sites and blogs writing about the same topics and linking to my articles.

If I continue to target those entities by publishing more articles about them, I expect that Google will increase its confidence in Coywolf News’ topic relevance for them.
