Semantic markup has been part of the Hypertext Markup Language (HTML) standard since its creation. Semantic markup refers to elements and attributes that provide context for the data they contain or reference. The ones you’re probably most familiar with are body, headings (h1, h2, etc.), and p. They help create the most basic version of a web page.
When HTML 5 was officially released in 2014, it added several new semantic elements. The most popular and commonly used elements include:
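As a rough sketch, a page template built from some of the semantic elements that HTML 5 introduced might look like the following. The selection of elements (header, nav, main, article, section, aside, footer) and all of the placeholder text are illustrative assumptions, not a definitive list.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Site-wide banner and navigation -->
  <header>
    <nav><a href="/">Home</a></nav>
  </header>

  <!-- The unique content of this page -->
  <main>
    <article>
      <h1>Article title</h1>
      <section>
        <h2>Section heading</h2>
        <p>Main content goes here.</p>
      </section>
    </article>
    <!-- Content related to, but separate from, the article -->
    <aside>Related links</aside>
  </main>

  <footer>Site footer</footer>
</body>
</html>
```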
The use of those semantic elements has slowly become a best practice for web developers because they help with building and maintaining page templates. However, there are several more reasons why they should be used.
Why everyone should use semantic HTML
Web developers, webmasters, and editors should be using semantic HTML because it can improve how computers and people experience and understand the page content.
Semantic HTML makes pages machine readable
The use of semantic elements can help define specific areas of a page template for machines. For example, bots can parse and process semantic blocks with near-perfect accuracy if the HTML is coded correctly.
Browsers like Firefox and apps like Pocket have a reader mode that relies on semantic elements to decide which content to display or hide. They look for elements like article to determine the main content, and filter out content enclosed in elements that mark up secondary parts of the page.
Semantic HTML provides a better user experience
Semantic HTML creates a solid foundation for accessibility
Screen readers make use of semantic elements, which makes navigating and reading pages that incorporate them more accessible. JAWS, NVDA, and VoiceOver all announce each element for its semantic purpose.
To be effective, semantic elements need to be implemented correctly in the code. Semantic elements should also be seen as a starting point to accessibility. Web developers are encouraged to take full advantage of ARIA HTML attributes to help make their pages more fully accessible.
Semantic HTML may improve search engine visibility
It’s unknown exactly which semantic HTML elements Google’s algorithm considers when parsing and analyzing pages. Most SEOs agree that headings – in particular, the h1 – are used by Google to understand the context and structure of a web page’s content. However, there isn’t enough research or evidence to prove that Google derives meaning from the newer HTML 5 elements.
What is known is that Google recommends the use of semantic HTML. Their Google Search Development docs clearly state that webmasters should “use semantic HTML markup for [their] content whenever possible.” Based on Google’s history with structured data, I think it’s reasonable to assume that they currently use some of the newer semantic elements for better comprehension, and it is likely they’ll use more of them in the future.
Rarely used semantic HTML elements for optimizing page content
There are several semantic HTML elements that most people don’t use but should. I use most of them in my WordPress themes and posts and am resolved to start incorporating the rest of them in future templates and posts. All of them influence machine readability, accessibility, UX, and SEO to some degree.
The abbr element is used for abbreviations and acronyms (the acronym element was deprecated). I use abbr on every post that has acronyms. The element is ideal for UX, accessibility, and SEO.
When a visitor reads an abbreviation or acronym they are unfamiliar with, they can hover over it to display the full term. The element also presents another example of how accessibility and SEO complement each other. The title attribute text makes the abbreviation or acronym more accessible and also clearly communicates its meaning to search bots.
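A minimal example of the pattern described above. The abbreviation and its expansion are placeholders chosen for illustration.

```html
<p>
  <!-- The title attribute holds the expansion that appears on hover. -->
  Good <abbr title="search engine optimization">SEO</abbr> starts with
  well-structured markup.
</p>
```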
The cite element should be used when quoting a piece of work, not a person. It can be used in conjunction with inline quotes using the q element or block quotes using the blockquote element. Additionally, the cite attribute can be used with blockquote to link to the source.
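Here is a small sketch of how these pieces might fit together. The work title, the quoted text, and the example.com URL are all hypothetical placeholders.

```html
<!-- Inline quote: the cite attribute holds the source URL. -->
<p>
  As <cite>Example Style Guide</cite> puts it,
  <q cite="https://example.com/style-guide">write for people first</q>.
</p>

<!-- Block quote with the same kind of source link. -->
<blockquote cite="https://example.com/style-guide">
  <p>Write for people first, then for machines.</p>
</blockquote>
<p>Source: <cite>Example Style Guide</cite></p>
```

Browsers don’t display the cite attribute, so the visible cite element is what carries the attribution for readers.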
Like abbr, using the cite element and attribute, and the q element, requires editing the HTML source because they’re not supported in most rich text editors. The benefits of using them are that they provide a proper citation, they’re machine-readable, and they may help with search visibility.
Does Google crawl the URL in a cite attribute?
I was curious to know if Google and other search bots crawled the link in the cite attribute. I created a test page and linked to it from the footer of every page on the Pro site. The q element had a cite attribute that linked to a unique page (q.html) and the blockquote element linked to another unique page (blockquote.html).
I initiated several crawls in Google Search Console and then analyzed the log files over two weeks using Screaming Frog’s Log File Analyzer. I was hopeful that Google would crawl the pages linked in the cite attributes. However, the results were disappointing. The only search engine bot that followed the links was Yandex.
It was only one small test, so it’s possible that Googlebot does crawl them, but for now, I’m going to assume it doesn’t. Regardless, I still think they serve a purpose, and I’m going to continue to use cite attribute links whenever it’s relevant to do so.
The content of a details element is hidden by default. Visitors must click or tap on it to reveal its contents.
The details element is the easiest way to create accordion-like functionality. It’s also perfect for FAQs or any content you consider secondary to the user experience.
The element works in all browsers except for IE and earlier versions of Edge (before the use of the Blink rendering engine). Fortunately, the fallback is to display the content instead of hiding it. So the UX impact is minimal.
The details element works in conjunction with the summary element, although it’s not required. The content inside of the summary element is what appears by default. When a visitor clicks or taps on the summary text, the rest of the content is revealed. If the summary element isn’t used, the browser will simply display the word Details.
Requires a computer running an operating system. The computer must have some memory and ideally some kind of long-term storage. An input device, as well as some form of an output device, is recommended.
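Markup along the following lines would produce a disclosure widget containing text like the sample above. The summary label “System Requirements” is an assumption added for illustration.

```html
<details>
  <!-- The summary text is the only part visible while the widget is closed. -->
  <summary>System Requirements</summary>
  <p>
    Requires a computer running an operating system. The computer must have
    some memory and ideally some kind of long-term storage.
  </p>
</details>
```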
The details element can be nested like a list.
Parent Holder Text
Child Holder Text
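A nested structure like the one above could be written roughly as follows. This sketch assumes the two holder texts were summary labels, and the inner paragraph is a placeholder.

```html
<details>
  <summary>Parent Holder Text</summary>
  <!-- The nested details element stays hidden until its parent is opened. -->
  <details>
    <summary>Child Holder Text</summary>
    <p>Content revealed when the child is opened.</p>
  </details>
</details>
```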
The summary element supports the list-style shorthand property and its longhand properties, such as list-style-type, to change the disclosure triangle to whatever you choose.
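One way to swap the default triangle for a custom marker is sketched below. The plus and minus markers and the sample content are arbitrary choices for illustration, and the ::-webkit-details-marker rule is included on the assumption that some WebKit-based browsers still need it.

```html
<style>
  /* Remove the default disclosure triangle. */
  summary {
    list-style: none;
  }
  /* Some WebKit-based browsers draw the marker with this pseudo-element. */
  summary::-webkit-details-marker {
    display: none;
  }
  /* Draw a custom marker that changes when the widget is open. */
  summary::before {
    content: "+ ";
  }
  details[open] > summary::before {
    content: "- ";
  }
</style>

<details>
  <summary>Shipping details</summary>
  <p>Orders ship within two business days.</p>
</details>
```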
The details and summary elements can also be styled in a variety of ways. Here are four good examples of how you can use them.
- Custom arrow
- Timeline tree
- Automatically closing an open details element when opening a new one with minimal JS
- A fancier version of automatically closing that uses significantly more JS
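To give a sense of what the minimal JS approach might involve, here is a sketch that closes the other panels whenever one is opened. It is not taken from any of the examples listed above, and the accordion class name and sample content are assumptions.

```html
<details class="accordion">
  <summary>First question</summary>
  <p>First answer.</p>
</details>
<details class="accordion">
  <summary>Second question</summary>
  <p>Second answer.</p>
</details>

<script>
  // Collect every details element that belongs to the accordion.
  const panels = document.querySelectorAll("details.accordion");

  panels.forEach((panel) => {
    // The toggle event fires on a details element whenever it opens or closes.
    panel.addEventListener("toggle", () => {
      if (!panel.open) return; // Ignore panels that were just closed.
      panels.forEach((other) => {
        if (other !== panel) other.open = false;
      });
    });
  });
</script>
```

Recent browser versions also let several details elements share the same name attribute to get this behavior without a script, so it may be worth checking browser support before reaching for JS.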
One caveat about summary is that it does have issues with accessibility. The summary element is a button and there isn’t a heading that specifies what the button does. It’s possible to overcome that with ARIA roles, states, and properties, but it’s not trivial.
The mark element is the equivalent of using a highlighter on paper. It’s meant to communicate the importance of specific text on a page. The mark element can improve the UX by getting the attention of a reader as they scan an article. It can also be used for a call to action (CTA).
The mark element may also send a signal of importance to search engines. It’s assumed by some SEOs that bolded and highlighted text is considered by Google’s algorithm when processing page content. However, I’ve never seen any research that’s tested this theory.
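A short example of highlighting a call to action with mark. The wording is a placeholder.

```html
<p>
  Registration closes on Friday.
  <!-- Browsers typically render mark with a yellow background. -->
  <mark>Sign up today to reserve your seat.</mark>
</p>
```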
One of the best ways to tell a machine that an address is an address is to use the address element. It’s 🤯, I know!
The ability for software to parse and detect contact information is good, but it’s still not perfect. If you don’t disambiguate data with semantic HTML, some contact details may be excluded.
The address element can contain different types of contact details, not just a physical address.
It also doesn’t require a physical address. It can contain a person’s name or just a phone number.
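For instance, an address element holding only a name, an email address, and a phone number might look like this. The person and contact details are fictional.

```html
<address>
  Jane Doe<br>
  <a href="mailto:jane@example.com">jane@example.com</a><br>
  <a href="tel:+15555550100">+1 (555) 555-0100</a>
</address>
```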
The time element is typically used by search bots to determine the publish date for a page. Google News, in particular, uses it when it crawls news articles.
When the time element uses the datetime attribute, it’s possible to abbreviate the date and time for readers while still communicating the exact date and time to a machine. For example, a visitor could see Jan 1 – 7 while a machine would know with certainty that the date range is January 1, 2020 – January 7, 2020.
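One way to mark up that date range is with a time element for each end of the range, as sketched here. The surrounding sentence is a placeholder.

```html
<p>
  The sale runs
  <!-- Readers see the short form; datetime carries the full date. -->
  <time datetime="2020-01-01">Jan 1</time> –
  <time datetime="2020-01-07">7</time>.
</p>
```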
The time element can be used anytime a date or time is used. For example, if a page has a table that lists events, the element could be used for each cell that has a date and time.
The dl element is a list made especially for definitions. The list has two child elements: dt, which is used for terms, and dd, which is used for descriptions. When you combine those elements with the dfn element, which is used for the term being defined, you have the ingredients for a semantically structured FAQ or glossary page.
The dfn element should always surround the word or phrase that’s being defined. If it’s being used in a definition list, it should be contained within a dt element and the answer should be contained within a dd element.
- Apartment, n.
- An execution context grouping one or more threads with one or more COM objects.
- Flat, n.
- A deflated tire.
- Home, n.
- The user’s login directory.
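The list above could be marked up along these lines, with each term wrapped in dfn inside its dt.

```html
<dl>
  <dt><dfn>Apartment</dfn>, n.</dt>
  <dd>An execution context grouping one or more threads with one or more COM objects.</dd>
  <dt><dfn>Flat</dfn>, n.</dt>
  <dd>A deflated tire.</dd>
  <dt><dfn>Home</dfn>, n.</dt>
  <dd>The user’s login directory.</dd>
</dl>
```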
The dfn element can also appear within a paragraph. When using it in a paragraph, the sentence that defines the term should contain the dfn element.
A microphone stand is a free-standing mount for a microphone. It allows the microphone to be positioned in the studio, on stage or on location without requiring a person to hold it.
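In markup, the paragraph above might be written like this, with dfn wrapping only the term being defined.

```html
<p>
  A <dfn>microphone stand</dfn> is a free-standing mount for a microphone.
  It allows the microphone to be positioned in the studio, on stage or on
  location without requiring a person to hold it.
</p>
```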
If you create a glossary page, adding an id attribute to the dfn element will create the ability to link to it.
When you reference the term on a web page, you can use a bookmark link to jump the user to the definition.
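A sketch of both halves of that pattern follows. The id value, the /glossary.html path, and the sample sentence are assumptions made for illustration.

```html
<!-- On the glossary page (hypothetical URL: /glossary.html) -->
<dl>
  <dt><dfn id="microphone-stand">Microphone stand</dfn></dt>
  <dd>A free-standing mount for a microphone.</dd>
</dl>

<!-- On another page, linking to the definition by its id -->
<p>
  Place the
  <a href="/glossary.html#microphone-stand">microphone stand</a>
  before the sound check.
</p>
```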
If you create an FAQ page using dl and dfn elements, you can use the same text for the dt and dd elements that the FAQPage Schema uses for its Question and Answer types.
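As a rough illustration, the same question and answer text can appear once in the definition list and once in the FAQPage structured data. The question and answer here are placeholders, and the JSON-LD follows the general shape of schema.org’s FAQPage type as I understand it.

```html
<dl>
  <dt><dfn>What is semantic HTML?</dfn></dt>
  <dd>Markup that describes the meaning of the content it contains.</dd>
</dl>

<!-- The name and text values repeat the dt and dd text above. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is semantic HTML?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Markup that describes the meaning of the content it contains."
      }
    }
  ]
}
</script>
```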
The del element is used for content that is no longer relevant and has been removed, although if you’re using it, the content is still technically there. The del element is usually accompanied by the ins element, which is used for text that was added or that replaces the deleted text.
The del and ins elements can be used in different ways. Some of them include:
- Displaying a completed task
- Being transparent about the text that was edited or using it for a diff comparison
- Keeping the details of a past event that you want to keep visible
The SEO part of me likes the idea of using del and ins to communicate to Google that the content is maintained and edited, and that the page is transparent about its changes. It may also provide additional context by specifying what is no longer relevant along with what is.
Additionally, using CSS, the del element could be set to display:none; and the ins element could have the same style as the normal text. That would hide the deleted text for visitors while making both the deleted and inserted text contextually available to search bots.
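A minimal sketch of that styling idea is shown below. The sentence and times are placeholders, and text-decoration: none is used on the assumption that the browser’s default ins style is an underline.

```html
<style>
  /* Hide removed text from visitors while it stays in the markup. */
  del {
    display: none;
  }
  /* Make inserted text look like the surrounding copy. */
  ins {
    text-decoration: none;
  }
</style>

<p>
  The workshop starts at <del>9:00</del><ins>10:00</ins> on Saturday.
</p>
```

Keep in mind that display:none also hides the deleted text from screen readers, so this approach trades away some transparency for visitors.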
It’s important to note that it’s unknown if or how Google processes content that uses these elements.
Schema structured data is an important part of optimizing content for Google, but it shouldn’t be solely relied upon. Google regularly crawls, parses, and uses page content for featured snippets that don’t use any structured data. Semantic HTML markup can play a significant role in disambiguating content and increasing search visibility.
Additionally, semantic HTML opens up content discovery opportunities beyond search engines, improves UX, and improves accessibility. I hope this article inspires you to start using more semantic elements in your page templates and posts.