Getting embed with BERT

How does BERT affect SEO, and is there a way to optimize for it? AJ Kohn provided Coywolf with insights on how SEOs should respond to the latest Google update.

On October 25, 2019, Pandu Nayak, Vice President of Search at Google, announced that Google could understand searches better than ever before. Pandu highlighted that their breakthrough was a result of their research with a neural network-based technique for natural language processing (NLP) pre-training called Bidirectional Encoder Representations from Transformers (BERT). He described it as “models that process words in relation to all the other words in a sentence, rather than one-by-one in order.”

BERT models can therefore consider the full context of a word by looking at the words that come before and after it—particularly useful for understanding the intent behind search queries.
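
To make “full context” concrete, here is a minimal sketch using the open-source bert-base-uncased model from Hugging Face’s transformers library (a public stand-in, not Google’s production system). The same word gets a different embedding depending on its neighbors:

```python
# A minimal sketch, not Google's production system: the open-source
# bert-base-uncased model gives the same word a different vector
# depending on its neighbors. Assumes `pip install torch transformers`.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual hidden-state vector for `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (num_tokens, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

bank_river = word_vector("he sat on the bank of the river", "bank")
bank_money = word_vector("she deposited cash at the bank", "bank")

# Cosine similarity is well below 1.0 because the surrounding words
# changed the representation of "bank".
print(torch.cosine_similarity(bank_river, bank_money, dim=0).item())
```

A pre-BERT, word-by-word model would assign “bank” the same vector in both sentences; a bidirectional model does not.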

If you’re like me, the mention of neural networks, language processing, and bidirectional encoder representations from transformers makes your head spin a little. To gain some clarity about BERT, I reached out to AJ Kohn, who first wrote about BERT in November 2018. He explained that BERT is about embeddings and discussed how SEOs can optimize for it.

SEO Insights from AJ Kohn on BERT

Before you read any further, I recommend reading AJ’s article, Algorithm Analysis In The Age of Embeddings, first.

When I reached out to AJ about Google’s BERT announcement, he had several interesting things to say about it.

The recent algorithm change only highlights how much Google relies on embeddings to power their understanding of language and documents.

BERT’s main breakthrough is the ability to build embeddings that account for the words before and after each word. It is especially useful in better understanding the intent behind queries.

In this context, embedding means including words within a sentence that provide context and relevance. Additionally, a sentence’s grammatical structure can influence how Google interprets its intent.

In AJ’s article, he gave an example of how the modifiers “coming soon” and “approaching” altered a page’s search visibility to match future-based query syntax. For example, visibility and clicks decreased for the search “signs of death” but increased for “signs of death coming soon” as Google applied an updated version of its neural language understanding (NLU).
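
As a rough illustration of that effect, the sketch below uses the open-source sentence-transformers library as a stand-in for Google’s internal NLU (which isn’t public); the model name and page titles are assumptions made up for the example:

```python
# A hedged illustration: adding a modifier like "coming soon" shifts a
# query's embedding, changing which page copy it sits closest to.
# sentence-transformers is a public stand-in, not Google's NLU.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

queries = ["signs of death", "signs of death coming soon"]
# Two hypothetical page titles with different temporal framing.
titles = ["Common signs of death", "Signs that death is approaching"]

# Rows = queries, columns = titles. The future-flavored query should
# score noticeably closer to the future-flavored title.
print(util.cos_sim(model.encode(queries), model.encode(titles)))
```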

The idea of writing for NLP isn’t a new concept, and as AJ referenced in his article, Justin Briggs has an excellent article on how to approach it. AJ told me that “those who are focused on matching query syntax to intent will continue to find success” but added one caveat.

Be forewarned, there will be times when Google will get it wrong. That doesn’t mean you should change strategies. It means we need to wait for engagement data to be fed back into that model.

Writing for search engines and people is both an art and a science. It means thinking more about sentence structure and using word variants to convey the intent you’re targeting. AJ agrees that it’s possible to write for BERT through intentional embedding.

To say that you can’t optimize for these types of changes is not quite true. A high percentage of writing out there doesn’t make it easy for these models.

It’s nothing groundbreaking, but writing in a way that makes it easy for Google, and their embedding models, to understand your content is going to lead to a higher probability of success. And when you do so, you also make that content better for everyday users.
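
In that spirit, one practical habit is to score candidate phrasings of your own copy against the query you’re targeting and favor the closest match. A small sketch, again using sentence-transformers as a stand-in, with made-up phrasings:

```python
# Rank hypothetical phrasings by similarity to a target query; the
# query and candidates here are invented for illustration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

target_query = "how to fix a leaking faucet"
candidates = [
    "Faucet repair guide",
    "How to fix a faucet that keeps leaking",
    "Everything about plumbing fixtures",
]

scores = util.cos_sim(model.encode([target_query]), model.encode(candidates))
for phrase, score in zip(candidates, scores[0]):
    print(f"{float(score):.3f}  {phrase}")
```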

If you have a site that appears to be affected by Google’s improved use of BERT, AJ recommends looking at how the search queries driving traffic to your pages have changed.
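
One lightweight way to do that is to export the page’s query data from Google Search Console’s Performance report before and after the update and diff the clicks per query. A sketch assuming two such CSV exports; the file names and column headers are placeholders, so rename them to match your export:

```python
# Compare the queries driving clicks to a page before and after an
# update. "queries_before.csv" / "queries_after.csv" are placeholder
# Search Console exports with (assumed) "Query" and "Clicks" columns.
import pandas as pd

before = pd.read_csv("queries_before.csv")
after = pd.read_csv("queries_after.csv")

merged = before.merge(
    after, on="Query", how="outer", suffixes=("_before", "_after")
).fillna(0)
merged["click_change"] = merged["Clicks_after"] - merged["Clicks_before"]

# The biggest losses and gains hint at which query syntax Google now
# matches (or no longer matches) to the page.
print(merged.sort_values("click_change").head(10))  # biggest losses
print(merged.sort_values("click_change").tail(10))  # biggest gains
```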

While it’s exciting that Google has leveraged BERT, I hope what is really remembered is that search is, in large part, powered by a number of NLP models. So look for those syntax changes after an algorithm update.

Actionable Insights

  1. Embed words in sentences that communicate context and relevance.
  2. Structure sentences in a way that matches query intent.
  3. Analyze search queries for individual pages to better understand and test syntax changes.
