Posts on State of Search about ‘Technical SEO’

Technical SEO as Part of a Multi-Signal Search Strategy – #brightonseo


There’s a common misconception that SEO should sit alongside social media and PPC as part of a blended search strategy. However, SEO is an essential part of site build, content creation and on-going strategy development. Whether it’s on desktop, mobile or Internet TV, technical SEO shouldn’t be pigeon-holed.

To approach this in a better, more future-proof way, a thorough understanding of technical SEO and its place in strategy development is vital. At BrightonSEO, Richard Falconer, Head of Technical Search at LBi, provided some interesting insights and useful tips on how to gain a better understanding of your own technical SEO setup and that of your competitors.

SEO Campixx: Conversion on Steroids


One of the best conferences in Germany is definitely SEO Campixx. Where else do you find a conference that starts with a great show, its own song and a Star Wars theater play starring Marcus “Mediadonis” Tandler as Luke Skywalker, Searchmetrics chief Marcus Tober as Yoda and Link Research Tools founder Christoph Cemper as the evil Darth Vader? It’s a different kind of conference with a focus on spirit and networking, but also with 8 (!) parallel tracks on a wide range of topics. You’ll find sessions about classic SEO themes like link building, Panda and Penguin. But you’ll also find advanced technical sessions about crawling and scraping, conversion optimization, and sessions with titles like “how do I organize my business” or “how to find a good in-house SEO for your company?”.

The conference was held last weekend in the suburbs of Berlin.

How to Perform a Complete SEO Audit for Your Website


This is a guest post by Philip Petrescu, CEO and Co-Founder of Caphyon (full bio below)

Doing a complete SEO Audit is not an easy task. You need to approach this from different angles, have a defined structure of what you wish to accomplish and use the right tools to get the job done efficiently.

In the past, only a few experts were able to tackle such a big task. Nowadays, there is so much information on this topic and there are so many free tools available that it has become very practical for any of us to perform complete SEO audits for the sites we own.

Can you do it too? Of course. Just follow the three steps below:

  • create a list of the most important factors that you wish to check
  • use the appropriate tool to discover the issues you may have
  • come up with a list of changes that need to be made in order to fix these problems

This article will focus on the first two steps: finding the important factors that affect your website’s performance and the tools that can help you discover these problems. Subscribe to this blog or come back later for a follow-up article that will focus on how you can actually fix these problems.

One of the most important goals of an SEO Audit is not just to find what is wrong with a website, but to produce a list of changes that will make the site rank better in the search engines.

So, which are the most important factors that affect your website’s performance?

To make things easier for you, I have structured this article into different sections that you can browse independently. At the end of each section, you will find a list of tools that you can use to identify the issues discussed in that section.

Technical Audit

If your website has a lot of technical problems, it will perform poorly and receive less traffic than it otherwise would. Performing a technical audit means searching for all the errors or technical problems that may negatively affect the performance of your website in the search engines.

Accessibility

The most important thing to begin with is to make sure that your content is accessible to the search engines. One mistake here and the search engines won’t be able to crawl your site, which means you will get no rankings at all.

There is a file stored on your web server, called robots.txt, which tells the search engines which content of your website they are allowed to crawl and which they are not. The problem appears when you accidentally change this file and disallow the search engines from accessing the pages you wish to rank for. At this point you can say goodbye to any organic ranking for these pages.

However, this problem is very easy to identify and fix. To find out if you have blocked any important pages from being crawled you can use the Robots.txt Checker tool.
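
If you prefer to script this check, here is a minimal sketch using the robots.txt parser from Python’s standard library. The rules, the URLs and the `blocked_urls` helper are made up for illustration; in a real audit you would load your own robots.txt and your own list of important pages.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and pages, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

SAMPLE_URLS = [
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/private/report.html",
]

print(blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS))
```

Any page you want to rank that shows up in this list deserves a second look at your robots.txt rules.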

While robots.txt is very good at keeping search engines out of whole sections of your site, you can also tell the search engines to ignore a single page from within the HTML code of the page itself. This is done via the robots meta tag, and you can check it using a meta tag analyzer.
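
As a rough illustration of what such an analyzer does, the sketch below reads the robots meta tag from a page’s HTML with Python’s built-in parser. The class and function names are my own, and the sample markup is invented.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # A content value such as "noindex, follow" holds several directives.
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def is_noindex(html: str) -> bool:
    """True when the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Invented sample markup.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindex(SAMPLE_PAGE))
```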

HTTP Status Codes

When a crawler requests a page on your website, your web server returns an HTTP status code along with the response. It is very important to make sure that your server returns a status code with a value 200, which is the equivalent of saying that everything is OK.

If your web server returns a 4xx or 5xx status code, which usually means an error, the search engines will be prevented from accessing your page content.

The easiest way to check if you have pages that return errors when a search engine is trying to crawl them is to log in to your Google Webmaster Tools account and go to the Health – Crawl Errors report.

A very common scenario is that a page no longer exists in its initial location because it was moved to a different one. A 301 redirect is the preferred way to handle this situation because the search engines will be able to process the move correctly and transfer the authority of the links pointing to the old page to the new one.
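
Once you have exported the status code of each URL from a crawl report, sorting them into groups is simple. The following sketch assumes you already hold the codes in a dictionary; the function names, the sample data and the choice to treat only 200 and 301 as healthy are assumptions of this example.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to the groups discussed above."""
    if code == 200:
        return "ok"
    if code == 301:
        return "permanent redirect"
    if 300 <= code < 400:
        return "other redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


def pages_needing_attention(statuses: dict[str, int]) -> dict[str, str]:
    """Keep only the URLs that return something other than 200 or 301."""
    return {
        url: classify_status(code)
        for url, code in statuses.items()
        if code not in (200, 301)
    }


# Invented crawl results.
SAMPLE_STATUSES = {"/": 200, "/old-page": 301, "/promo": 302, "/missing": 404, "/cart": 503}
print(pages_needing_attention(SAMPLE_STATUSES))
```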

Canonicalization

Canonicalization, although hard to pronounce, is one of the basic principles of SEO and it’s essential to creating an optimized website.

One of the most common mistakes website owners make is splitting the link authority of their website because they are not redirecting the non-www version of their website correctly.

Example:

http://yourdomain.com/

should 301 redirect to:

http://www.yourdomain.com/

If you don’t do this, you are essentially telling the search engines to keep two copies of your site in the index and split the link authority between them.

How can you make sure you don’t have this problem? It’s easy to find out. Just search in Google for:

site:yourdomain.com -www

If your search does not match any documents, then you should be fine. Otherwise use the htaccess redirect tool from the Tools section below.
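
To make the rule concrete, here is a small sketch that computes the URL a non-www address should redirect to. It is only the URL logic, not a server configuration, and the `canonical_www_url` helper is hypothetical. It also assumes the www version is your preferred one and that you only pass it addresses on your bare domain, not on other subdomains.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_www_url(url: str) -> str:
    """Return the www version of a URL; URLs already on www are returned unchanged."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        return url
    netloc = "www." + host
    if parts.port:
        netloc += f":{parts.port}"
    # Keep path, query string and fragment so deep links survive the redirect.
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, parts.fragment))


print(canonical_www_url("http://yourdomain.com/"))
```

Every non-www URL should answer with a 301 redirect whose target matches what this function returns.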

Interlinking

Interlinking is very important to the SEO strategy of your site. It is one of the most powerful weapons that you can use to increase your search engine rankings.

Your website needs to have a good internal link structure and all your pages must be linked together for the search engines to be able to crawl through and reach every single page. If some of your pages remain isolated from the rest of the website, no crawler will ever be able to find out about their existence and index their content.

One of the best ways to find out whether your pages are linked together correctly is to use a crawler such as Xenu’s Link Sleuth. This free tool will scan your website just as a search engine would and find any holes that you may have in your internal linking structure.
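
The idea behind this check can be sketched in a few lines: start from the home page, follow every internal link, and see which known pages were never reached. The link map below is invented, and in practice you would build it from a crawler export plus your sitemap; this is not how any particular tool is implemented.

```python
from collections import deque


def unreachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Return the pages that cannot be reached by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return all_pages - seen


# Invented site: /landing links out, but no page links to it.
SAMPLE_LINKS = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget"],
    "/about": ["/"],
    "/landing": ["/products"],
}
print(unreachable_pages(SAMPLE_LINKS, "/"))
```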

Indexability

Identifying and fixing all the accessibility problems of your website doesn’t mean that all your problems are suddenly gone. It may very well happen that a search engine will only crawl a certain number of pages from your website and ignore the rest.

This is usually related to the importance that the search engine attributes to your website. More important sites have a higher number of pages crawled by a search engine.

The first thing you should check though is if your website is indexed at all. This can be easily accomplished by searching in Google for one of the following:

your brand name
yourdomain.com
site:yourdomain.com

You need to understand that the number of results that Google returns for these queries is not really accurate. In fact, it may change if you refresh the page or if you search Google in a different country.

A more accurate way to find out how many pages of your site are indexed is to visit your Google Webmaster Tools account and go to the Health – Index Status report.

Although this is probably the most accurate number that you will get from Google, there is a different number that I believe you should be focusing on here, which is the exact number of pages that have received at least one visit from a search engine. You can get this number from your Google Analytics reports or any other analytics tools that you may be using.

Site Performance

Nowadays, your site performance is more important than ever. A recent study showed that Amazon.com has achieved a 1% increase in revenue for every 100 milliseconds that were taken off from the load time of their pages.

One reason this may happen is that people increasingly browse the web on mobile phones and other devices that connect to the Internet via slow wireless and 3G connections.

It does not matter if you have the most beautiful page if nobody sees it. It does not matter if your website is ranking in the first spot for a very high traffic keyword. If your site fails to display in less than 5 seconds, the people that clicked on your link will hit the back button and go to the next result. And the worst part of this is that you will never know what happened. Has the visitor actually seen your page or not?

Answering this question is not an easy task. But you could figure this out if you analyze your web server log files and compare the number of times your page was requested with the number of times your page has been visited in tools such as Google Analytics or similar.

That is because most tools like Google Analytics use a tracking code that is added to your page and this code usually loads after most of the elements have already been rendered in the browser.
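
Here is a rough sketch of that comparison. It counts successful page requests in an access log and compares them with the visits reported by your analytics tool. The log format (Apache-style), the sample lines and the function names are assumptions, and a real log also contains bots and repeat requests that you would want to filter out first.

```python
import re
from collections import Counter

# Matches the request and status part of an Apache-style log line (assumed format).
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def count_page_requests(log_lines: list[str]) -> Counter:
    """Count successful (200) GET requests per path."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            counts[match.group(1)] += 1
    return counts


def abandonment_rate(requests: int, tracked_visits: int) -> float:
    """Share of requests that never showed up in the analytics tool."""
    if requests == 0:
        return 0.0
    return max(0.0, 1 - tracked_visits / requests)


# Invented log lines.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2013:10:00:00 +0000] "GET /landing HTTP/1.1" 200 5120',
    '203.0.113.6 - - [10/Mar/2013:10:00:05 +0000] "GET /landing HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Mar/2013:10:00:09 +0000] "GET /landing HTTP/1.1" 200 5120',
    '203.0.113.8 - - [10/Mar/2013:10:00:12 +0000] "GET /landing HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Mar/2013:10:00:15 +0000] "GET /missing HTTP/1.1" 404 310',
]

requests = count_page_requests(SAMPLE_LOG)["/landing"]
print(abandonment_rate(requests, tracked_visits=3))
```

A rate that stays high for a slow page is a hint that visitors are leaving before the tracking code has a chance to run.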

That being said, it’s important to understand that not all the problems of poor performance happen in the back end.

The first thing that happens when you enter a page in your browser is the transfer of the HTML file from the web server to the browser. Once this is done, the browser starts loading the Javascript, the CSS and the images that it needs to render the page and this is probably where more than 90% of the issues of poor performance happen.

Does your site have a performance problem too? You can easily find out by using a tool such as YSlow or Page Speed by Google. In most cases they don’t just tell you what the problem is, but also how you can solve it.

Useful tools for a technical audit

Robots.txt Checker – see if you have blocked any important pages from being crawled.
Meta Tag Analyzer – check if you have restricted search engines access to a certain page.
Google Webmaster Tools – check if you have pages that return an HTTP status code with error.
Htaccess Redirect Tool – use it to create a .htaccess file for Apache web servers.
Xenu’s Link Sleuth – use it to crawl your website and find problems in your internal linking structure.
YSlow – find out if your site has performance issues and how to solve them
Page Speed by Google – use it to discover any performance issues of your site

Semantic Audit

Your site’s content and information flow must ensure the best experience for any visitor of your website. The design, the site’s content and navigation should go hand in hand and serve a unique purpose: that of fascinating rather than alienating your visitors.

Content Analysis

The most important thing to remember here is that you should write your content for humans, not for search engines. That being said, if you want to rank well in the search engines, you do need to make sure that the things that search engines do look for are there.

The title tag and the meta description are the most prominent features displayed on a search engine result page. Make sure you have these two right otherwise people will not click on your link, even if you are ranking first in the results.

Title Tag

  • Keep it short (less than 70 characters) and test how it looks in the search engine result page.
  • Make sure that it is interesting and that it matches the visitor’s search intent.
  • Include your highest-value keywords in the beginning.
  • Add your brand name at the end of it when possible.
  • Make sure you don’t duplicate titles across the pages of your site.
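
Some of the points above are easy to check automatically once you have a list of URLs and their titles from a crawl. The sketch below flags missing, overly long and duplicated titles. The 70-character limit comes from the list above, while the function name and sample data are invented.

```python
from collections import defaultdict

MAX_TITLE_LENGTH = 70  # titles should stay below this many characters


def audit_titles(titles: dict[str, str]) -> dict[str, list[str]]:
    """Return the title problems found for each URL."""
    issues = defaultdict(list)
    urls_by_title = defaultdict(list)
    for url, title in titles.items():
        text = title.strip()
        if not text:
            issues[url].append("missing title")
        elif len(text) >= MAX_TITLE_LENGTH:
            issues[url].append("title too long")
        urls_by_title[text.lower()].append(url)
    for text, urls in urls_by_title.items():
        if text and len(urls) > 1:
            for url in urls:
                issues[url].append("duplicate title")
    return dict(issues)


# Invented crawl export.
SAMPLE_TITLES = {
    "/a": "Blue Widgets | Acme",
    "/b": "blue widgets | acme",
    "/c": "",
    "/d": "x" * 70,
}
print(audit_titles(SAMPLE_TITLES))
```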

Meta Description

The Meta Description tag does not affect keyword rankings so do not try to stuff keywords in it. Instead, use it to describe the page content succinctly and accurately. Make it actionable and encourage users to click on your link and you will see a huge impact on the click-through rate.

If you were lucky enough to be included in the DMOZ or Yahoo directories, search engines will sometimes replace your meta description with the one listed in these directories. You can easily prevent this from happening by adding the following meta tags to your HTML header:

<meta name="robots" content="noodp">
<meta name="robots" content="noydir">

Most crawling tools such as Xenu’s Link Sleuth retrieve the text in the title and meta description tags for the crawled pages. You can also find Title tag and Meta Description information in your Google Webmaster Tools account in the Optimization – HTML Improvements report.

Meta Keywords

Most search engines ignore this tag, so you gain nothing from using it. The only thing you accomplish by adding your keywords to this tag is giving your competitors a sneak peek at your targeted terms.

Headings

Although not as important as page titles from an SEO point of view, the headings (H1, H2, H3, etc.) still carry enough weight that you should make sure they are not missing and are used correctly on each page. More than that, headings have a great impact on how content is perceived by the reader, improving the user experience and conversion on the page.

Images

A picture is worth a thousand words but unfortunately only humans can see it. To make sure the search engines also understand what your pictures are about, you should include the important keywords that describe each of them in two places: in the file name and in the alt attribute.

Page Content

Your pages need to have enough content to rank well in the search engines. Having less than 200 words on a page (not counting the HTML tags) is considered sub-optimal. What’s interesting is that pages with more than 2,000 words usually receive better rankings in the search engines.

One of the major issues that could affect your rankings in the search engines is duplicate content. Regardless of whether you have only one product or thousands of products on your site, it is important to make the content unique and target different keywords on each page.

There are two ways in which you can duplicate content:

  1. Identical content on two different pages. This usually happens when two URLs that have different parameters point to the same page. Detecting identical duplicate pages is easy. Just use a crawler such as Screaming Frog, which computes a hash for each page. Combine this with the power of Excel and you will be able to create actionable reports for your pages.
  2. Part of the content is duplicated on two different pages. This one is harder to spot and a simple crawler won’t help. The main problem here is that since February 2011, when Google released the first Panda update, the impact of duplicate content has become a lot more severe.
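
Both cases can be illustrated with a short sketch. Identical pages produce the same hash once whitespace and letter case are normalized, while partial duplicates can be estimated by comparing overlapping word sequences (often called shingles). The choice of SHA-256, the three-word shingle size and all function names are assumptions of this example, not a description of how any crawler or search engine works.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the page text after normalizing whitespace and letter case."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split the text into overlapping sequences of `size` words."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 1))}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    return len(set_a & set_b) / len(set_a | set_b)


# Case 1: the same text with different spacing gives the same hash.
print(content_hash("Blue  widgets on sale") == content_hash("blue widgets on sale"))

# Case 2: partially overlapping text gives a score between 0 and 1.
print(similarity("blue widgets on sale today", "blue widgets on sale tomorrow"))
```

Pages that share a hash are exact duplicates; pairs with a high similarity score are candidates for a manual review.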

In the past, duplicate content could only harm that content itself, by being filtered out by the search engine or sent to the supplemental index instead. Ever since the Panda update was released though, a duplicate content problem may impact your entire site, not just the pages that are duplicated. You can have good pages on your site (that are not duplicated) lose their rankings or even fall out of the index altogether.

To find out if you have content that exists in a similar form on another page or website you can use the Copyscape tool.

Site Architecture

The pages of your website need a clear content structure, free of cluttering, useless items. Otherwise your visitors will get lost and miss the essential message you wish to send. With a simple eye scan you can quickly evaluate whether too many ads, page sections or calls to action are distracting the visitor from the intended purpose of the page.

There is no universal recipe for a website’s page hierarchy, but it’s recommended to aim for a flatter site architecture with good interlinking between pages of the same level.

While a higher vertical depth of your website would ensure a more logical navigation flow through the website, a flatter site architecture might be more tempting as it would allow you to push forward your most important and powerful pages and make them more accessible to your visitors.

If you want to make sure that your most important pages are just a few clicks away, you can crawl your website with Screaming Frog and look at the depth of each page.
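
Click depth is simply the length of the shortest path of links from the home page to a given page. If you have a map of your internal links, a breadth-first search computes it, as in this sketch. The link map and the `click_depths` helper are invented for illustration.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Return the minimum number of clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Invented site structure.
SAMPLE_SITE = {
    "/": ["/category", "/about"],
    "/category": ["/category/widgets"],
    "/category/widgets": ["/category/widgets/blue"],
}
print(click_depths(SAMPLE_SITE, "/"))
```

If an important page turns up three or more clicks deep, consider linking to it from a page closer to the home page.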

Links Analysis

This part of the audit is crucial as it helps you find the answer to daunting questions such as:

  • Are my pages linked together in a comprehensive way?
  • Is anyone linking to my website? If so, which is the content that got most links?
  • Am I getting any benefits for having these backlinks?

Internal Links

Internal links are links from one page of your site to a different page on your site. Although commonly used in main navigation, when done right, they should improve both rankings and usability.

Each page of your site has the potential, through its content, to link to other pages from your site. To use this potential, you should insert contextual links to other pages from your site that you would like to rank better. Just make sure you use the keywords that you would like the target pages to rank for when you link to them.

And remember that your visitors are more likely to click on a link in the text of a page, because it feels more natural.

To find information about your internal links go to your Google Webmaster Tools account and look at the Traffic – Internal Links report. To find any broken links you may have, see the Health – Crawl Errors report.

External links

When you link from a page of your site to another page on a different site, you send a powerful vote, endorsing the target’s page quality. Therefore it is important to make sure your site links only to high quality sites, otherwise your site’s trustworthiness might be affected.

In case you must link to sites that you don’t trust, make sure you use the nofollow attribute.

Use the IIS SEO Toolkit from Microsoft to crawl your entire website and get more information about all your external links.

Inbound links

The most powerful links that you can get from another website are those that are within the text of a page and that are surrounded by content that is relevant to both your site and the link anchor text.

When it comes to the number of inbound links, the more the better. But it is more important to get these links from different websites (unique root domains). This means that having one link from each of 10 unique websites is a lot better than having 10 links from a single website.

Also, some top-level domains (.edu and .gov) are considered special because only qualified institutions can register them. Therefore links from these domains are supposed to have a higher value than regular links.

But having a lot of inbound links is not enough. If you want your page to rank for a certain keyword, you need to have some of these inbound links contain that keyword in the anchor text.

It used to be that the more links you had with a certain keyword in the anchor text, the better you ranked. Nowadays though, search engines look at the distribution of your anchor text and will actually lower your rankings for that keyword if you have too many links that contain that keyword.

Recent studies have shown that sites with a small amount of exact match and partial match links but with a very high amount of brand links are ranking better than sites that have most of their links made of the targeted keywords.
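
If you export the anchor texts of your inbound links from one of the link tools, you can compute the distribution yourself. This sketch assumes a plain list of anchor texts; the sample data and function name are made up, and the threshold at which a keyword becomes “too much” is not something this code can tell you.

```python
from collections import Counter


def anchor_text_share(anchors: list[str]) -> dict[str, float]:
    """Return each anchor text's share of all inbound links, largest first."""
    counts = Counter(a.strip().lower() for a in anchors if a.strip())
    total = sum(counts.values())
    return {text: n / total for text, n in counts.most_common()}


# Invented export: three brand links and one exact-match keyword link.
SAMPLE_ANCHORS = ["Acme", "acme", "cheap widgets", "Acme"]
print(anchor_text_share(SAMPLE_ANCHORS))
```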

To find out the number of inbound links to your site and the anchor text distribution you can use either Open Site Explorer or Majestic SEO. Both these tools provide link metrics and detailed information that can help you audit your link profile.

Just keep in mind that these tools use their own link graph, i.e. they crawl the Internet independently and create their own index. This means that they can only tell you what’s in their own index, and not what’s in Google’s or Bing’s index.

Useful tools for a semantic audit

Copyscape – find out if you have content that exists in a similar form on another page or website
Screaming Frog – crawl your website and look at how many clicks away your main content is
IIS SEO Toolkit – use it to crawl your website and get more information about your external links
Open Site Explorer – find out the number of inbound links to your site
Majestic SEO – check your link profile and anchor text distribution

Competition Audit

If one of your main goals is to achieve high rankings for your website in the search engines, then you first need to find out which other websites already rank for the keywords you are targeting: your competition.

Analyzing your competitors will help you get a better understanding of their strength and whether you have a real chance of outranking them. Making a good decision when you enter a niche will save you many months or even years of work spent trying to catch up with competition that is too strong. You would be better off finding a local or smaller niche and tackling that first.

Usually the first thing you should look at when you analyze your competitors is their overall strength. You can use the Domain Authority and Page Authority metrics from Open Site Explorer for that. A high DA and PA are good indicators of why their page ranks higher in the search results.

Alternatively look at Citation Flow and Trust Flow from Majestic SEO. They will tell you how influential a site may be (Citation Flow) and how trustworthy it is (Trust Flow) based on their relative distance to other trustworthy sites.

But the question is: how did they get those high numbers? Did they do something special? Maybe there is something that each of your competitors has and you don’t. That’s what competitive analysis really means. It’s a process of elimination to find the one thing that stands out and makes a difference.

One way to find this out is to discover the pages your competitors put their most effort into, the ones that got the most attention on social platforms and got the most links.

To find out the most shared or liked content of your competitors use Social Crawlytics. It can crawl the competitor site and display the number of social shares for every page.

The next question you should ask is where did they get the links from? Are these links significantly more valuable than yours? And do they deserve them?

If your competitor is writing good content that attracts a lot of natural links, then it’s time you start creating your own content that people will naturally link to. Because if you’re still trying to persuade people to link to you at this point, then you will never catch up.

Open Site Explorer or Majestic SEO are great tools for finding out valuable information about the links of a website. And you can add the information that OSE provides into Link Detective and find out what types of links your competitor is attracting.

Another way to approach this is to look at the range of keywords that your competition is targeting and the content they use for driving their traffic. You can guide your own strategy based on the techniques that proved to be effective for them.

With a simple anchor text retrieval from Open Site Explorer you can get a good idea of the keywords they are targeting. In some cases this may not be enough, so you may use SEMRush instead, which is a tool that automatically retrieves all the keywords that a website is ranking for.

Useful tools for a competitive audit

Link Detective – find out where your competitors are getting links from
Open Site Explorer – get useful information about the links of a site
Majestic SEO – check link profile and anchor text distribution
SEMRush – get all keywords that a website is ranking for
Social Crawlytics – crawl a site and get the number of social shares for all its pages

Social Audit

Social media is one of the most effective methods to influence and engage with your customers. Your ability to engage socially and become popular on social platforms will also have a great impact on your website’s ability to achieve high rankings.

The first thing you should do is make sure your profiles are set up and optimized for human interaction, and that your logo and about/bio information are there so that your visitors recognize you.

There is no standard posting frequency that will guarantee success, but a few posts per week are more or less mandatory for your profile to look alive. More important is to have a social media posting policy that your entire staff follows to maintain branding, tone and messaging consistency across all platforms.

Then you need to decide who will be your voice on social platforms. Will it be a person within your brand or will it be the voice of your brand itself? Although most brands post as the brand itself, posting as a person who speaks for the brand can be a more personal approach. Raven Tools does a very good job with this, posting as @RavenJon or @RavenCourtney, although they do have an official brand voice too at @RavenTools.

Your social profiles need to be seen as an extension of your website into the social area. Therefore, just like you have branded your social profiles, you need to add social items on your website as well. Make sure that your social profiles are integrated on your website and on your blog, if you have one.

Just like with any other distribution channel, on social media you need to send the right message to get the best results. Therefore, you need to know if your content got any social reactions and which pages were able to engage better.

To find out how many shares and likes a page has you can use SharedCount. Alternatively, you can use Social Crawlytics to see this information for your entire website instead of just one page.

By looking at your website pages along with their shares and likes, you will be able to understand the overall level of social engagement and see which content was preferred by your audience.

Next, you need to pay close attention to the response you get from your social audience. What is the level of engagement you achieved? Was it just a “Like” or did it go further to sharing or leaving positive comments? The higher the engagement, the more likely it is to have a major impact over the growth of your social circles.

But instead of just looking at the number of likes, a better metric to look at would be your engagement rate with your fans. This is the number of likes, comments, retweets, etc. that you receive on each post. You can calculate an average for all your posts and divide this number by the number of fans you have and you’ll get a more accurate metric to see if your engagement with your fans is increasing or not. Do this for your competition too and see who is winning the social race.
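
The calculation described above fits in a few lines. In this sketch the numbers are invented and the `engagement_rate` helper is my own; each entry in the list is the total of likes, comments, retweets and so on for one post.

```python
def engagement_rate(interactions_per_post: list[int], fans: int) -> float:
    """Average interactions per post divided by the number of fans."""
    if not interactions_per_post or fans <= 0:
        return 0.0
    average = sum(interactions_per_post) / len(interactions_per_post)
    return average / fans


# Invented figures: three posts and 2,000 fans.
rate = engagement_rate([30, 50, 40], fans=2000)
print(f"{rate:.1%}")
```

Run the same calculation on your competitors’ public numbers and track it over time, since the trend says more than any single value.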

Tools such as Facebook Insights and TweetReach give you important information about your market reach and the engagement generated by your content.

Useful tools for a social audit

TweetReach – measure the reach of your brand, marketing campaign or event on Twitter
SharedCount – track URL shares, likes, tweets, and more
Social Crawlytics – see how many shares and likes you have for each of your pages

Dedicated Tools for complete SEO Audits

There are many tools out there that can help you find the important factors that affect your website’s performance in the search engines. Most of the tools that I mentioned so far are very good at finding a particular problem and their main benefit is that they have a free version.

But if you are looking for a tool that can handle most of the problems described above, then you need a dedicated SEO audit tool. The advantage of such a tool is that it is complete and will save you lots of time spent checking different tools. The main disadvantage is that you will need to pay for it.

Web Based SEO Auditors

SEOmoz Pro

One of the oldest and most established tools for SEO Audits comes from SEOmoz and it is included in their paid SEOmoz Pro subscription.

Advantages: It crawls your website in the background and will give you alerts on the most common errors and warnings about your on-page optimization efforts. It also contains a powerful competitive link analysis tool that is based on their Mozscape link database.

Disadvantages: You can’t start the crawl on demand. It runs weekly and it has a limit of 10,000 pages per domain. Although the platform does have separate link and social analysis, they are not tied into the auditor and it’s hard to see which of your pages get the most shares and links.

Site Auditor from Raven Tools

A new and interesting web based Site Auditor was recently released by Raven Tools. Although this tool is still in Beta, so lots of improvements and tweaks are yet to be made, it already handles most of the tasks that you will need to audit your website.

Advantages: You can schedule your crawls weekly and there are several reporting options available for the audits. It analyzes your content, internal links and images and even has an awesome report on page speed from YSlow that not only tells you where the problems are with your site speed but also tells you how to solve them.

Disadvantages: You can crawl up to 1,000 pages per website and up to 10,000 total pages per day. There is currently no option to crawl a specific path within your website. The crawl starts with the Index page and stops when it reaches 1,000 pages. There is no social and no inbound links data integrated with the audit yet.

Desktop based SEO Auditors

Screaming Frog

Although this tool is recommended by many SEOs as a free tool, there are a lot of limitations so you will need the paid version to take advantage of all its features.

Advantages: It’s very fast and you can configure it to get more information about your site than you’ll ever need.

Disadvantages: It has a steeper learning curve and it requires some tweaks to harness its full powers. Also, being a pure crawler, there is no data collected on analytics, social mentions and inbound links to your website.

SEO Auditor from Advanced Web Ranking

The SEO Auditor is one of the many research tools included in Advanced Web Ranking and the best part is that it is integrated with analytics, links and social data so you can also get a complete picture of which pages of your site perform better overall.

Advantages: Being a desktop tool means that you can start a crawl whenever you wish and as many times as you wish. It’s very configurable and it does not have the limitations that other web based tools have.

Disadvantages: Currently, you can only start the crawls manually and you are limited to 10,000 pages per crawl.

Additional Resources

If you ever find yourself hungry for some more info on this topic, here are some very good articles that will teach you how to approach an SEO Audit:

The World’s Greatest SEO Audit – Steve Webb wrote an awesome article in which he shares his insanely thorough SEO audit process based on his research and experience.

Find Your Site’s Biggest Technical Flaws in 60 Minutes – Dave Sottimano gives you an insight into how the pros at Distilled approach a 60-minute technical SEO audit.

How To Perform A SEO Audit – Neil Patel shares the insights of a recent SEO Audit that he purchased for $5,000 for his website QuickSprout.

How to do a Mobile SEO Audit – Aleyda Solis wrote an awesome article recently on what you need to do to improve your mobile search visibility, traffic and conversions.

Large Scale Link Evaluation – Alan Bleiweiss reveals how you can become more efficient with your large scale technical audits without sacrificing quality.

What do you think?

Now it’s your turn. What are your biggest challenges when it comes to auditing a website, and what is the most appropriate solution you have found?

I would love to hear what you are doing differently when you audit a site. So why not comment below with your own story? It will help everyone make their own site audits better.

About Philip Petrescu – Philip is the CEO and Co-Founder of Caphyon. He’s been passionate about writing software since 1996. You can follow Philip on Google+ and Twitter. If you’re into conversion rate optimization, check out his latest project called Lead Converter.

Hreflang and canonical international SEO test

hreflang-webmastertools

This is a guest post by Grosen Fris, SEO at OnlinePartners in Denmark

Google’s hreflang option for international SEO has been available for more than a year now, so we decided it was time to conduct a clinical SEO test to see if it works as promised.

In addition to testing whether Google’s hreflang option has an effect on how your web sites perform in Google’s country-specific indexes, e.g. Google.co.uk and Google.dk, we also tested whether hreflang can be combined with canonical in case you have problems with duplicate content on your web sites.

Why test the combination of hreflang and canonical?

Hreflang is very interesting for web sites that have more or less identical English content spread across different sub domains or country code top-level domains (ccTLDs) – e.g. mydomain.co.uk for the UK and mydomain.ie for Ireland.

Suppose you have several web shops, each targeting a specific country, whose content is almost 100% identical and which therefore have major problems with duplicate content. In that situation hreflang may give you the following advantages:

  1. You can get a better country-specific representation in Google’s search results, which many users no doubt appreciate. E.g. you get mydomain.co.uk to appear in search results in Google.co.uk instead of mydomain.com.
  2. You let Google help you send the user to the most relevant web shop, which increases the likelihood that the user immediately sees the most relevant currency and price and lands on a web shop from which delivery is possible. Imagine you have a web shop on mydomain.com and another on mydomain.co.uk, and let’s assume that it is mydomain.com that appears in the Google.co.uk search results. This would send the user to a web shop that might show the wrong currency and price, and perhaps shipment to the UK is not possible from mydomain.com. In that case you might need special features on each web shop that try to detect where in the world the user is located, based on e.g. his/her IP address, and redirect him/her from e.g. mydomain.com to mydomain.co.uk.

We also wanted to test hreflang in combination with canonical because, on the one hand, Google states that you should combine them if you have problems with duplicate content, while on the other hand we have spoken with many SEOs who were not sure about this.
However, it does make sense to be able to combine hreflang and canonical.

  • If you have domains with unique content targeted at different countries, then you do not need canonical. Here you only need hreflang, which gives you the opportunity to tell Google how all your various domains are linked together across many countries.
  • If, on the other hand, you have identical content in the same language across multiple domains targeted at different countries where the same language is spoken, then it makes perfect sense to combine hreflang and canonical.

Test conducted on .com domain and related sub domains

We have used the following (sub)domains to conduct this test, and we encourage all to take a look at how they are set up.

Country | Language | (Sub)domain
Not selected | English | http://href-lang.com
Australia | English | http://au.href-lang.com
United Kingdom | English | http://uk.href-lang.com
Ireland | English | http://ie.href-lang.com

Structure of a test web site:

When you look at a single test site, none of the pages have duplicate content. This is ensured by the use of gibberish English, i.e. English words automatically and randomly selected for each page. However, if you compare the test web sites with each other, you will see that they are 100% identical across the four test (sub)domains.
Each test web site is set up as follows.

  • 5 levels:
    • Home page
    • Below the home page there are 3 levels and each has 9 sub-pages
    • 5th and lowest level consists of link-out-pages
  • The test web sites reside on 1 main .com domain and 3 related sub domains
  • Hosted on an IP address related to Denmark (77.66.30.208). Test it yourself via ipligence.com/geolocation
  • The only links built to the test web sites come from web sites related to Denmark
  • We deliberately chose to use sub domains instead of ccTLDs, as ccTLDs themselves give Google a strong signal of target country and language. That is not the case for a .com domain and related sub domains
  • Since the site: command seems to be phased out by Google, it does not give you a good overview of the indexing of the test web sites, so we decided to submit all 4 test web sites to the same Google Webmaster Tools (GWT) account. We did not use GWT to “cheat” by setting a target country for each test web site inside GWT :) We only used GWT to monitor the indexing of each test web site.

Structure of and content on a page

Each page contains the following:

  • Title
  • Meta description
  • Hreflang and canonical
  • Breadcrumb
  • Main headline wrapped in <h1> tag
  • Sub headline wrapped in <h2> tag
  • 1-3 paragraphs wrapped in <p> tag
  • Navigation and outgoing links

Configuration of hreflang and canonical on a page

The configuration of hreflang and canonical on a page is as follows:

Country | Language | (Sub)domain | hreflang | canonical
Not selected | English | http://href-lang.com | en | Points to http://href-lang.com
Australia | English | http://au.href-lang.com | en-au | Points to http://href-lang.com
United Kingdom | English | http://uk.href-lang.com | en-gb | Points to http://href-lang.com
Ireland | English | http://ie.href-lang.com | en-ie | Points to http://href-lang.com

Example:

<link rel="alternate" hreflang="en" href="http://href-lang.com/chordospartium-pane.html" />
<link rel="alternate" hreflang="en-ie" href="http://ie.href-lang.com/chordospartium-pane.html" />
<link rel="alternate" hreflang="en-au" href="http://au.href-lang.com/chordospartium-pane.html" />
<link rel="alternate" hreflang="en-gb" href="http://uk.href-lang.com/chordospartium-pane.html" />
<link rel="canonical" href="http://href-lang.com/chordospartium-pane.html" />
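
The same set of tags can be produced for any page path. Below is a minimal Python sketch of that pattern, not the script used to build the test sites; the `HOSTS` mapping, the `CANONICAL_HOST` constant and the `head_links` function are hypothetical names, filled in from the configuration table above.

```python
from html import escape

# hreflang value -> (sub)domain, copied from the configuration table above.
# The names HOSTS, CANONICAL_HOST and head_links are illustrative assumptions.
HOSTS = {
    "en": "http://href-lang.com",
    "en-ie": "http://ie.href-lang.com",
    "en-au": "http://au.href-lang.com",
    "en-gb": "http://uk.href-lang.com",
}

# In this test every (sub)domain points its canonical at the main .com domain.
CANONICAL_HOST = "http://href-lang.com"


def head_links(path):
    """Return the hreflang and canonical <link> tags for one page path."""
    tags = [
        f'<link rel="alternate" hreflang="{lang}" href="{escape(host + path)}" />'
        for lang, host in HOSTS.items()
    ]
    tags.append(f'<link rel="canonical" href="{escape(CANONICAL_HOST + path)}" />')
    return tags


if __name__ == "__main__":
    print("\n".join(head_links("/chordospartium-pane.html")))
```

Every (sub)domain would emit the identical block for a given path, which is what ties the four versions of the page together.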

 

Here you can see the complete setup of a page:

test-page-set-up

Google indexing from start until now

We conducted site: searches in Google and we watched the indexing in GWT.

Initially, both the main domain and the sub domains were indexed in Google, but once each sub domain had approx. 80-110 pages indexed, the indexing stopped and began to roll back. I assume this is because Google’s bot first crawls the pages on the test web sites, and a separate routine later analyses other elements such as hreflang and canonical. Thus Google’s search results do not immediately reflect the use of hreflang and canonical. As I write this blog post, GWT states that it has reviewed approx. 870 of the 901 pages on each sub domain and that only approx. 16-31 pages on each sub domain are still indexed in Google; we expect that to be fully adjusted in the near future. All in all, what we saw in GWT regarding the indexing of the 3 sub domains was as we expected.

Unfortunately the two screenshots below are in Danish, as it was not possible for me to change the GWT interface from Danish to English.

  • Blue: Total pages indexed
  • Red: Total pages reviewed
  • Yellow: Total pages blocked from being indexed (e.g. via robots.txt)
  • Purple: Total pages removed

gwe-uk-href-lang-com-600x376

However, the indexing of the main domain was a bit of a surprise: due to the use of hreflang and canonical, GWT seems to perceive the 4 test web sites as one single web site. The 4 test web sites consist of 4 x 901 pages = 3,604 pages, yet as this blog post is being written GWT states that 4,409 pages have been crawled and reviewed. That is roughly 800 pages more than actually exist on the 4 test web sites, and I have no immediate idea why GWT is so inaccurate on this specific number.
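
The size of that gap is easy to verify. A quick Python sketch using only the figures quoted above (the variable names are my own):

```python
sites = 4
pages_per_site = 901
reviewed_by_gwt = 4409  # pages GWT reports as crawled and reviewed for the main domain

expected_total = sites * pages_per_site       # pages that actually exist
surplus = reviewed_by_gwt - expected_total    # pages GWT counts beyond that

print(expected_total)  # 3604
print(surplus)         # 805
```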

gwe-href-lang-com-600x379

Below is a list of how many pages Google has reviewed so far for each test web site, the maximum number of pages that have been indexed and how many pages are currently indexed in Google.

Country | (Sub)domain | Number of pages reviewed | Number of pages indexed (maximum) | Number of pages indexed (for the moment)
Not selected | http://href-lang.com | 4,409 | 1,129 | 895
Australia | http://au.href-lang.com | 871 | 110 | 34
United Kingdom | http://uk.href-lang.com | 870 | 83 | 16
Ireland | http://ie.href-lang.com | 869 | 74 | 19

Test results

We have conducted tests in Google’s country-specific indexes via both real people and tools:

  • Manual tests carried out by kind people in the SEO industry who are based on relevant geo-IPs (Australia, United Kingdom and USA)
  • Manual tests through a VPN / proxy that is based in a relevant country (Canada)
  • Impersonal.me
  • Software that measures the positions of a (sub)domain on selected keywords in specific Google country-indexes

The following search phrases were tested in Google’s different country-specific indexes. Please try for yourself by copying the search phrases from the fields below and pasting them into Google (consider including the double quotation marks, as this makes a test search in Google more accurate).

Level at test web site | Search phrase
1.
1.5
1.6.5
1.7.1.8

All test search phrases showed the expected (sub)domains in Google’s search results:

Geo-IP | (Sub)domain in SERPs
USA | http://href-lang.com
Canada | http://href-lang.com
Australia | http://au.href-lang.com
United Kingdom | http://uk.href-lang.com
Ireland | http://ie.href-lang.com
Denmark | http://href-lang.com

Conclusion

  • Can you use Google’s hreflang for international SEO? Yes
  • If you have problems with duplicate content, should you then combine hreflang with canonical? Yes
  • If you do NOT have problems with duplicate content, should you then also combine hreflang with canonical? No

Finally, I would like to point out that it previously was not a good idea to let the three sub domains, or equivalent ccTLDs, be indexed in Google because of the problems with duplicate content. At the same time, it would have been almost impossible to get anything other than the main .com domain to appear in search results, even when searching in Google’s country-specific indexes. But thanks to hreflang and canonical, this is now possible.

Please note that we also present the results from this test in this YouTube video.

11 Technical SEO Tools you Should be Using

tool-time-1

As an SEO, I use a wide range of tools to help me with my job. There are loads of tools out there and, to be honest, I often forget about some of the ones I have available to me. Today I want to show you the contents of my bookmarks when it comes to technical SEO. I’ll talk about each tool and highlight some SEO tasks that it can help with.

I will be talking about XML Sitemap Validator, Bulk HTTP header response checker, W3C Internationalization Checker, Web Page Speed Test, SEO Toolkit for IIS, Built With, Schema Creator, Reverse IP lookup, Spy on Web, Check Websites on same IP C Class and Screaming Frog. (more…)

Announcement: New URL Submitter Tool from Majestic SEO

majestic-seo-url-submitter

After a week of smaller announcements, today Majestic SEO announce a new URL Submitter Tool.

This new feature is one of the most requested and is available for both free and paid accounts. The new submitter tool allows users to influence which URLs are crawled and add those that may not have been picked up…yet!

With over 4 trillion URLs crawled in the historic index, the ability to add specific URLs means that you help decide on a worldwide scale which URLs get crawled first.

(more…)

How We Got Our Penalty Revoked Using the Disavow Tool (Case Study)

google-disavow-tool

This is a guest post by Sander Tamaëla, a freelance SEO Consultant

In October a new client approached me with a question: Can you help me with trying to revoke a penalty? It happened to be that Google had just released their link disavow tool and this penalty was link based. In this post I want to share my three months of experience with using the disavow tool to get a penalty removed. (more…)

A Technical SEO Guide to Crawling, Indexing and Ranking

Computer-Centre

Technical SEO can often be brushed aside a bit too easily in favour of things like content creation, social media and link building. However I’ve always believed that there are many opportunities for increasing traffic by looking inwards rather than outwards. One of the biggest areas of this for me is to make sure that your website is as accessible as possible to the search engines.

It’s quite simple really – if the search engines can’t crawl your website efficiently, you’re unlikely to rank. Even links and social shares won’t solve severe accessibility issues, so the knock-on impact is that your link building will look ineffective. This is the last thing you want because link building can be hard anyway; you don’t want to cripple yourself before you’ve even started.

So in this post I’m going to talk through some of the key areas you need to think about when it comes to making your website accessible. An accessible website means that all target pages will be indexed and have the opportunity to rank for your target keywords. (more…)

SEO Strategy for Delivering a Smooth Platform Migration

splash!!!

Last week I was managing a big migration for a client when the development team was very short on time. Needless to say it wasn’t smooth, which made me think that there needs to be more communication between SEOs and developers for both our sakes, but that is probably a discussion for another day.

What I think will be more useful right now is to cover a few of the essential elements and strategies when planning a migration and issues that I’ve picked up along the way. (more…)

Good Rel.ations – A Beginners Guide to Rel Attributes

rel-attributes

In recent times we’ve covered a lot of stories about HTML tags, including canonical, author, publisher and many more, because they are (relatively) new, topical, useful and beneficial for webmasters seeking to get more (and more relevant) traffic to more, and more appropriately indexed, pages.

We bandy about terms like “attributes” and “tags”, assuming we’re talking to an initiated audience, which is often the case. However, not all our readers are search professionals, web developers or HTML proficient. Many are webmasters and small business owners who wear many hats. We thought it might be useful to collectively examine the more useful of these tags in their use as rel attributes, particularly the ones that might directly boost our efficacy or solve a problem from an organic search perspective. (more…)

The importance of SEO in online rebranding

New brand stamp

This month I’ve chosen to delve into the world of online rebranding, specifically in the charity sector. SEO now plays a critical role for any company wishing to rebrand, as the new identity of a company will need to be reflected in the keywords it wants to rank for online. Even small changes in a rebranding exercise can have a big effect on website traffic.

Let’s take Prostate Cancer UK as an example. In June 2012 The Prostate Cancer Charity became Prostate Cancer UK. According to the company website this change in identity was done to get away from the association of a fundraising charity and rebrand as a leading research authority:

“Our research has shown that only three of the top hundred UK charities use the word ‘Charity’ in the name and that the word is associated with fundraising, children, and dependency – not with authority, research, or expertise. Prostate Cancer UK is clear, simple, has authority, and importantly still communicates who we are.” (more…)
