
An Introduction to Structured Data Markup

This post is part of a series called SEO Fundamentals for Web Designers.

The term structured data refers to information formatted in a universally understandable way. Search engines such as Google, Bing and Yahoo use structured data (implemented within web pages) to refine search results, filter with greater accuracy and enhance the way results are displayed. This all makes it easier for users to find the information they're looking for.

Why Do We Need Structured Data?

Structured data is becoming an increasingly important part of the web ecosystem. - Google

Web pages have an inherent meaning which users understand when they read them. Search engines, on the other hand, have a limited understanding of web page content. For example, let’s say you have a web page about ‘jaguar’. A search engine could crawl the page, but wouldn’t necessarily know what the word ‘jaguar’ means. ‘Jaguar’ could refer to the animal, or it could refer to the car manufacturer.

This makes it difficult for search engines to display relevant search results to a user. Humans can derive the meaning of a word from the context of the web page, but search engines have difficulty doing this.

With structured data you can help search engines understand your content and display it in a useful, relevant way.

How is Structured Data Used?

Structured data has one major advantage: visibility. Information stored within structured data can be used by search engines to generate rich snippets. Rich snippets provide the user detailed information relating to their specific queries.

You’ve probably come across examples of these rich snippets. If not, take a look at Google's SERP for lasagna, where you’ll notice images next to some search results. These images were added thanks to microdata. Structured data can provide search engines with additional information about the web page: in this case an image, but you can also add a rating, cooking time, calorie count, etc.

These rich snippets make certain results stand out from the rest, often resulting in a higher clickthrough rate (CTR). Some websites have reported a 30% increase in CTR after implementing structured data markup. Sounds pretty good, huh?

Types of Structured Data Markup

There are three types of structured data markup:

  • Microdata
  • Microformats
  • RDFa

Before we begin exploring these markup types, you need to keep one thing in mind: you can’t use more than one type of structured data on a single web page, because it potentially confuses search engines. We therefore need to choose between these three options. But which is most suitable for our website? Let’s take a look at all of them individually.

If you want to know everything (and I mean absolutely everything) about the technical differences between RDFa, microdata and microformats, I suggest you read An Uber-comparison of RDFa, Microdata and Microformats by Manu Sporny, Chair of the group at the World Wide Web Consortium which created RDFa.

Microdata

Microdata is probably the most popular type of structured data, largely owing to the website Schema.org. On this website, an initiative of the three biggest search engines (Google, Bing and Yahoo), you’ll find a shared collection of schemas that you can use with microdata.

The markup of microdata consists of three attributes: itemscope, itemtype and itemprop. The itemscope attribute encloses information about the item. By adding itemscope to your HTML, you are specifying that the content within your chosen element is about a particular item.

<div itemscope>
Foo Fighters Concert
</div>

Add the itemtype attribute to identify the type of content. Use this attribute immediately after the itemscope.

<div itemscope itemtype="http://schema.org/Event">
Foo Fighters Concert
</div>

In this example, the itemtype informs search engines that the item contained in the div is in fact an event. Itemtypes are always added as URLs. You can find a complete list of all itemtypes at Schema.org.

Now that the search engines know that our page is about an event, we can provide them with additional information about this specific event. For this we use the itemprop attribute.

If we want to identify the location of the Foo Fighters concert, we simply add itemprop="location" to the element enclosing the location name (again, visit schema.org for a full list of all properties you can associate with an itemtype).

<div itemscope itemtype="http://schema.org/Event">
<span itemprop="name">Foo Fighters Concert</span>.
Concert will take place at <span itemprop="location">Madison Square Garden</span>.
</div>

Sometimes you’ll have to add additional elements in order to add itemprop details. We use <span> tags because, by default, they don’t influence the way inline text is presented by a browser.

Dates and Times

Dates and times can be difficult to interpret. The date 08/10/12 for example, does it mean 8 October 2012? Or 10 August 2012? Or August 12, 2008? Confusing, isn’t it? Search engines have the same problem.

In order to provide them with the correct time and date, we need to add a datetime attribute to a <time> element. This attribute specifies a date using the YYYY-MM-DD format.

<time datetime="2011-04-01">04/01/11</time>

The code above is for the date 1 April 2011.

The datetime attribute can also be used to specify a time. Times are prefixed with the letter T and can be provided along with a date.

<time datetime="2011-05-08T19:30">May 8, 7:30pm</time>

The code above represents the following date and time: 8 May 2011, 7:30pm. If we add the date and time markup to our previous example, together with the startDate property so that the date is tied to the event, we could get something like this:

<div itemscope itemtype="http://schema.org/Event">
<span itemprop="name">Foo Fighters Concert</span>.
Concert will take place at <span itemprop="location">Madison Square Garden</span>
on <time itemprop="startDate" datetime="2011-05-08T19:30">May 8, 2011 at 7:30pm</time>.
</div>

With these simple attributes we can tell search engines that on 8 May 2011 at 7:30pm there will be a Foo Fighters concert at Madison Square Garden. Other item types and their properties let us mark up web pages about books, movies, organizations, recipes, etc.
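
As a rough sketch of how this might look for another item type, here is a recipe marked up with the Schema.org Recipe type, including the image, rating, cooking time and calories mentioned earlier. The recipe name, image URL and all of the values are made up for illustration. Notice that the rating and the nutrition details are items in their own right, so they get their own itemscope and itemtype inside the recipe.

```html
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Classic Lasagna</h1>
  <!-- Hypothetical image URL -->
  <img itemprop="image" src="http://example.com/lasagna.jpg" alt="Classic lasagna">

  <!-- A nested item: the rating is an AggregateRating attached to the recipe -->
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4.5</span> out of 5
    by <span itemprop="reviewCount">120</span> reviewers
  </div>

  <!-- cookTime is an ISO 8601 duration: PT45M means 45 minutes -->
  Cooking time: <time itemprop="cookTime" datetime="PT45M">45 minutes</time>

  <!-- Another nested item for the nutrition details -->
  <div itemprop="nutrition" itemscope itemtype="http://schema.org/NutritionInformation">
    <span itemprop="calories">400 calories</span> per serving
  </div>
</div>
```

Whether a search engine actually shows any of this as a rich snippet is up to the search engine; the markup only makes the information available.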

Implicit Information

Information is not always visible to users and search engines. Some information can be embedded in a media object, or it may not be stated explicitly on a page. In this case you can use meta tags to specify this information.

Let’s say we have a video on our page and we want the duration of the video to appear as a rich snippet. Because the duration of the video is not provided as text on our page, we need to use a meta tag to add this information. For example:

<meta itemprop="duration" content="PT2M40S" />

The code above tells search engines that the video is 2 minutes and 40 seconds long (don’t forget that we use the ISO 8601 format for dates and times; a duration starts with the letter P, followed by T for the time part). This information can then appear as a rich snippet in the search results.
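
A meta tag like this only makes sense inside an item, so here is a minimal sketch of where it could sit, assuming the video is marked up with the Schema.org VideoObject type. The title, description and file URLs are hypothetical, and the duration is written in the strict ISO 8601 form PT2M40S.

```html
<div itemscope itemtype="http://schema.org/VideoObject">
  <h2 itemprop="name">Foo Fighters live at Madison Square Garden</h2>

  <!-- Not visible on the page, so we provide these values with meta tags -->
  <meta itemprop="duration" content="PT2M40S">
  <meta itemprop="thumbnailUrl" content="http://example.com/concert-thumbnail.jpg">

  <!-- Hypothetical video file; its src becomes the value of contentUrl -->
  <video itemprop="contentUrl" src="http://example.com/concert.mp4" controls></video>

  <p itemprop="description">A short clip from the concert.</p>
</div>
```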

Microformats

Microformats extend conventional HTML tags with semantic information. In order to add structured data to a web page with microformats, you’ll mostly use the class attribute. This makes microformats arguably the easiest and cleanest way to add structured data.

The most popular types of microformats are hCard, hCalendar and hReview. hCard is used for people, companies and organizations. hCalendar can be used to add information about events. And with hReview you can review restaurants, books, movies, etc.
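
To give an idea of what this looks like in practice, here is a small hCard sketch for a person. The name, company, URL and phone number are placeholders. The vcard class marks the container, and fn (formatted name), org, url and tel are hCard property names.

```html
<div class="vcard">
  <span class="fn">John Doe</span>,
  <span class="org">Example Company</span>
  <a class="url" href="http://example.com/">example.com</a>
  <span class="tel">+1 555 0100</span>
</div>
```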

Let’s say we have a page for a football match. First of all we need to tell search engines that this web page is about an event by referencing hCalendar in the <head>.

<head profile="http://microformats.org/profile/hcalendar">

Next we need to tell search engines which part of our web page is about the event. For this we use the vevent class.

<div class="vevent">

Everything contained by our <div> gives the search engine more information about the event (but you could also use other tags such as <span> or <p> if necessary). If we want to add the title of our event, we use the summary property. Summary is a required property for an event!

<div class="vevent">
<span class="summary">Real Madrid - FC Barcelona</span>
</div>

By using the location property we can specify where the match will take place.

<div class="vevent">
<span class="summary">Real Madrid - FC Barcelona</span>
at <span class="location">Camp Nou</span>
</div>

With this code we tell search engines that the match between Real Madrid and FC Barcelona will take place at Camp Nou. Another required property for hCalendar is dtstart. It describes the date and time of the event.

<div class="vevent">
<span class="summary">Real Madrid - FC Barcelona</span>
at <span class="location">Camp Nou</span>
on <time class="dtstart" datetime="2012-10-22T20:30">October 22, 2012 at 8:30pm</time>
</div>

These tags give search engines more information about the football match between Real Madrid and FC Barcelona, such as the location and the date/time.

For more information about microformats, check out the microformats wiki.

RDFa

RDFa adds a number of attributes to HTML tags, such as <div> and <span>, to describe entities (such as a person or an event). The more advanced features of RDFa may be difficult for webmasters who don't happen to be experts in structured data.

RDFa has four basic attributes: vocab, typeof, property and resource.

The first attribute, vocab, defines the vocabulary we’re going to use for our structured data. Thanks to this attribute, search engines know where to get the information about this structured data. Keep the trailing slash on the URL, because the type and property names we add later are appended to it.

<p vocab="http://schema.org/">
Hi, my name is John Doe!
</p>

With the code above we specify that the vocabulary for our RDFa tags can be found at schema.org (for example). There are several other vocabularies, such as LOV and Dublin Core. Now we need to specify the type of data. Is it information about a person, an event, a restaurant...? For this we use the typeof attribute.

<p vocab="http://schema.org/" typeof="Person">
Hi, my name is John Doe!
</p>

Search engines know we’re talking about a person, but they don’t know much about him. By adding properties we can give them more information about this person.

<p vocab="http://schema.org/" typeof="Person">
Hi, my name is <span property="name">John Doe</span>!
</p>

The code above tells search engines that this web page is about a person named John Doe. We can add a unique id to this structured data to identify this person by adding the resource attribute.

<p vocab="http://schema.org/" resource="#john" typeof="Person">
Hi, my name is <span property="name">John Doe</span>!
</p>

This unique id is useful if we want to talk about John Doe on another website. By adding the id to the end of the URL of this web page (e.g. http://example.com/employees#john), we have a reference for all the information about John Doe.
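
As a hedged illustration of that last point, another page could point at John Doe like this. The article title is invented, and the Article type and author property are simply one plausible choice from Schema.org. Because the link carries a property attribute and an href, the value of the author property is the URL itself, which is the id we defined above.

```html
<div vocab="http://schema.org/" typeof="Article">
  <span property="name">Meet the team</span>,
  written by
  <!-- The href identifies the same John Doe described on the employees page -->
  <a property="author" href="http://example.com/employees#john">John Doe</a>.
</div>
```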

For more information about the implementation of RDFa, check out the RDFa documentation on w3.org.

Why I Prefer Microdata

I prefer to use microdata to implement structured data. I’m in no way saying that microdata is the best option (there are advantages and disadvantages to every type mentioned) but for me, microdata offers the most advantages.

Compared to microdata, RDFa has a bit of a learning curve when dealing with nested entities. And the implementation of RDFa in non-XHTML pages can be problematic because of certain attributes and values.

The downside of microformats is that, if you want to retrofit them to a website, you’ll probably have to rename a lot of CSS classes and <div> and <span> tags.

For me, microdata is the best of both worlds; it’s straightforward and easy to implement. Microdata is also recommended by Google, so for people who like to follow Google’s guidelines, this can be a reason to choose microdata over microformats and RDFa.

Testing Your Markup

You’ve spent countless hours adding structured data to your website. But how do you know if it is implemented correctly? That’s where Google’s Rich Snippet Testing Tool comes in handy. On this website you can take a URL, or a chunk of HTML code, and test the structured data markup. It can give you an idea of how the page will appear in the search results.

Tools

We end this article with a collection of tools that might come in handy when you’re adding structured data to your website.

  • Schema.org Creator is an easy way to generate microdata. Choose a content type (person, event, review...), fill in the required fields and with the click of a button you have the correct HTML code.
  • If you have a WordPress blog or website you can use this handy WordPress plugin. The Schema Creator Plugin, developed by Raven, makes it really easy to add structured data to your web pages. One of the advantages of this plugin is that it uses shortcodes, so you don’t have to add microdata manually.
  • Microformats.org has several creators for hCard, hCalendar and hReview. Use these tools to quickly generate microformats for your website.
  • RDFa Play is probably the best tool if you want to implement RDFa. It allows you to edit and debug your code. Plus, it even comes with a data visualizer!

Conclusion

In the future we’ll be seeing much more structured data. It allows search engines to interpret content more accurately and generate rich snippets, which in turn can lead to a higher clickthrough rate for pages where structured data is implemented.

Let us know your thoughts in the comments; will you be taking advantage of structured data anytime soon? Do you already have experience in doing so?
