Best practice: Planning a sitemap with keyword clustering

Keyword research is crucial for SEO and a powerful way to understand the intentions and language of your market. By clustering the results you can also plan a website’s structure. This ensures:

  • Optimal search engine visibility
  • A taxonomy aligned to your market’s mental model
  • Selection of terminology understood by your market

In this practical step-by-step guide I’ve used the example of planning a new job board, but the methodology is valid for all industries.

1. Getting started

Assuming we’ve already used Google Trends or market knowledge to identify ‘jobs’ as the correct seed term, here are the downloaded results for a query of ‘jobs’ in the Google Keyword Planner. It’s important to select the correct market (in this case the UK) and turn ‘Only show ideas closely related to my search terms’ on, otherwise you’ll spend much longer sorting through irrelevant keywords.

Raw data export
The CSV download of a query for searches similar to ‘jobs’ in the UK

Delete the additional columns created by default (Competition, Suggested Bids etc.), leaving only Keyword and Average Monthly Search Volume, sorted high to low.
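If you would rather script this step than do it by hand, here is a minimal sketch in Python. The column names and the sample rows are assumptions standing in for a real export (the extra columns are left empty rather than invented), so adjust them to match your own download.

```python
import csv
import io

# Hypothetical sample standing in for the Keyword Planner export.
# Column names are assumptions and may differ in a real download.
SAMPLE_EXPORT = """Keyword,Avg. Monthly Searches,Competition,Suggested bid
glasgow jobs,8100,,
jobs in glasgow,27100,,
jobs glasgow,8100,,
"""


def load_keywords(handle, keyword_col="Keyword", volume_col="Avg. Monthly Searches"):
    """Return (keyword, volume) pairs sorted by volume, high to low."""
    rows = []
    for row in csv.DictReader(handle):
        # Keep only the two columns we need; everything else is dropped.
        rows.append((row[keyword_col].strip().lower(), int(row[volume_col])))
    # Sort by volume descending, then alphabetically so ties are stable.
    return sorted(rows, key=lambda kv: (-kv[1], kv[0]))


keywords = load_keywords(io.StringIO(SAMPLE_EXPORT))
for keyword, volume in keywords:
    print(f"{keyword}\t{volume}")
```

To use it on a real file, pass `open("export.csv", newline="")` (a hypothetical filename) in place of the `io.StringIO` sample.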

2. Cleaning the list

As with any keyword research, it’s important to check each term against these three criteria, ranked in order of importance:

  1. Is it relevant? Do I have content on my site which relates to this?
  2. Is there sufficient search volume? Do enough people search for it to make it worthwhile?
  3. Is it achievable? Will my site, now or in the future, realistically have enough authority to rank for this?

First remove the irrelevant terms. For example, imagine your hypothetical job board doesn’t:

  • Recruit for specific employers e.g. ‘tesco jobs’ or ‘mcdonalds jobs’
  • Wish to compete for big competitor brand terms like ‘guardian jobs’
  • Recruit for jobs overseas
  • Want any vague or irrelevant terms like ‘good jobs’ or ‘boob jobs’

Afterwards you will be left with a reduced list containing only those terms relevant to your business. In this example that is 85% of the terms we started with; the proportion will vary depending on how focussed your business is on a specific niche.
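The relevance check is a judgment call, but once you have decided what to exclude, the filtering itself can be sketched in a few lines of Python. The exclusion lists below are hypothetical (including ‘dubai’ as a stand-in for an overseas location) and would come from your own review of the list.

```python
# Keywords only; volumes are omitted because they do not affect relevance.
keywords = [
    "marketing jobs",
    "tesco jobs",
    "guardian jobs",
    "jobs in dubai",
    "good jobs",
    "jobs in kent",
]

# Hypothetical exclusion lists: employers, competitor brands, overseas
# locations (single words) and vague terms (whole phrases).
EXCLUDED_WORDS = {"tesco", "mcdonalds", "guardian", "dubai"}
EXCLUDED_PHRASES = {"good jobs"}


def is_relevant(keyword):
    """Return False if the keyword matches an excluded phrase or word."""
    if keyword in EXCLUDED_PHRASES:
        return False
    # Drop the keyword if any of its words is on the exclusion list.
    return not (set(keyword.split()) & EXCLUDED_WORDS)


relevant = [kw for kw in keywords if is_relevant(kw)]
kept_share = len(relevant) / len(keywords)
print(relevant, f"{kept_share:.0%} kept")
```

Matching on whole words rather than substrings avoids accidentally removing unrelated terms that merely contain an excluded string.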

3. Identifying user intent

Next you need to understand exactly what solutions people are looking for. This is as much an art as a science. Our example refined list contains every possible segmentation of the jobs market e.g.

  • Salary e.g. ‘100k jobs’
  • Educational level e.g. ‘graduate jobs’
  • Industry e.g. ‘jobs in sport’
  • Location e.g. ‘jobs in kent’

However, this market is most obviously divided between those looking for specific skills and industries and those looking for jobs in specific locations, particularly the latter. Approximately 25% of the terms, with a cumulative search volume of 1.6m, relate to finding a job in a specific location.

Keyword list, locations only
201 of the terms, with a cumulative search volume of 1.6m, relate to a specific location.

The remainder are largely searches for function e.g. ‘marketing jobs’ or industry e.g. ‘music jobs’.
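As a rough sketch of how this split could be measured, the Python below tags each keyword as a location search or not and totals the volume per segment. The location list is a hypothetical stub, and the sample rows are a handful of keywords taken from this article rather than the full export.

```python
# Hypothetical location list; a real one would cover every town, city
# and county you intend to serve.
LOCATIONS = {"kent", "glasgow", "london"}


def segment(keyword):
    """Label a keyword by whether it mentions a known location."""
    words = set(keyword.lower().split())
    return "location" if words & LOCATIONS else "function or industry"


def volume_by_segment(keywords):
    """Sum search volume for each segment."""
    totals = {}
    for keyword, volume in keywords:
        label = segment(keyword)
        totals[label] = totals.get(label, 0) + volume
    return totals


sample = [
    ("jobs in glasgow", 27100),
    ("glasgow jobs", 8100),
    ("driving jobs", 14800),
    ("hgv jobs", 12100),
]
totals = volume_by_segment(sample)
print(totals)
```

Multi-word place names would need phrase matching rather than this single-word lookup, which is kept simple here on purpose.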

4. Clustering the keywords

Since Google’s Hummingbird update there has been much more focus on clustering keywords. However, it has always been the case that the same user intent is represented by multiple keywords, and that these should be grouped during the planning phase of a new site. The only real difference is that we can now rely on Google being somewhat better at identifying user intent, so our groups can be broader.

In the location segment we can clearly see many keywords with the same intent and similar strings e.g.

  • jobs in glasgow (27100)
  • glasgow jobs (8100)
  • jobs glasgow (8100)

These should be clustered together in Excel e.g.

Example of keywords clustered by theme
Grouping keywords by user intent – jobs in London and in Glasgow.
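For keywords that differ only in word order or filler words, the grouping can be approximated automatically. This is a minimal sketch, not the method used for the spreadsheet above: it assumes that ‘in’, ‘jobs’ and ‘job’ carry no distinct intent in this market and treats whatever is left as the cluster name.

```python
from collections import defaultdict

# Assumption: these words carry no distinct intent in this market.
FILLER_WORDS = {"in", "jobs", "job"}


def cluster_key(keyword):
    """Strip filler words and sort the rest so word order is ignored."""
    words = [w for w in keyword.lower().split() if w not in FILLER_WORDS]
    return " ".join(sorted(words))


def cluster(keywords):
    """Group (keyword, volume) pairs that share the same cluster key."""
    groups = defaultdict(list)
    for keyword, volume in keywords:
        groups[cluster_key(keyword)].append((keyword, volume))
    return dict(groups)


sample = [("jobs in glasgow", 27100), ("glasgow jobs", 8100), ("jobs glasgow", 8100)]
clusters = cluster(sample)
for name, members in clusters.items():
    print(name, sum(volume for _, volume in members), members)
```

All three Glasgow variants land in one group with a combined volume of 43,300. Semantically related terms with different wording still need a human eye.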

It gets more interesting when we look at the professions and industries segment. As Google has improved at understanding concepts, we can now legitimately group together keywords that are semantically linked but have dissimilar literal strings, for example:

  • driving jobs (14800)
  • hgv jobs (12100)
  • delivery driver jobs (5400)
  • delivery jobs (5400)
  • chauffeur jobs (5400)
  • bus driver jobs (4400)
  • hgv driving jobs (3600)
  • van driving jobs (2900)

All these terms show a similar user intent. Whether you break out a term into its own page is a judgment call you should make based on its importance to your business. In this example it’s arguable that ‘hgv jobs’ is sufficiently distinct and popular to deserve its own page.
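One way to support that judgment call is to flag any term in a cluster whose volume passes a threshold you choose. The sketch below uses the driving cluster listed above; the threshold of 10,000 is purely an assumption for illustration, and the final decision should still rest on what matters to your business.

```python
DRIVING_CLUSTER = [
    ("driving jobs", 14800),
    ("hgv jobs", 12100),
    ("delivery driver jobs", 5400),
    ("delivery jobs", 5400),
    ("chauffeur jobs", 5400),
    ("bus driver jobs", 4400),
    ("hgv driving jobs", 3600),
    ("van driving jobs", 2900),
]

# Hypothetical cut-off: terms at or above this volume are worth
# considering for a page of their own.
BREAKOUT_THRESHOLD = 10000


def breakout_candidates(cluster, threshold):
    """Return the head term and any other terms at or above the threshold."""
    ranked = sorted(cluster, key=lambda kv: -kv[1])
    head, rest = ranked[0], ranked[1:]
    return head[0], [kw for kw, vol in rest if vol >= threshold]


head, candidates = breakout_candidates(DRIVING_CLUSTER, BREAKOUT_THRESHOLD)
total_volume = sum(vol for _, vol in DRIVING_CLUSTER)
print(head, candidates, total_volume)
```

With this threshold the head term is ‘driving jobs’ and the only breakout candidate is ‘hgv jobs’, which matches the reasoning above.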

This needs to be completed for all the major segments you identified, which will probably take around a day, depending on the size of your niche and your mastery of Excel shortcuts. As you progress you will see patterns emerge and get a sense of the language and requirements of your market.

5. Building the sitemap

As you group the keywords in Excel you will see the sitemap emerge, with each page optimised for its most popular keyword but referencing the other keywords in the group.

A simplified example of a sitemap made by clustering keywords

When producing copy for these pages it’s ideal if you can, while keeping the user first in mind, use all or most of the keywords in the cluster.
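To make the step from clusters to pages concrete, here is a small Python sketch. Each cluster becomes one page whose primary keyword is the most searched term, with the rest kept as secondary keywords for the copy. The URL scheme and the two sample clusters are assumptions for illustration only.

```python
def slugify(keyword):
    """Turn a keyword into a simple URL path (hypothetical URL scheme)."""
    return "/" + keyword.lower().replace(" ", "-") + "/"


def build_sitemap(clusters):
    """Create one page per cluster, ordered by total search volume."""
    pages = []
    for members in clusters.values():
        ranked = sorted(members, key=lambda kv: -kv[1])
        primary = ranked[0][0]  # the most popular keyword leads the page
        pages.append({
            "url": slugify(primary),
            "primary": primary,
            "secondary": [kw for kw, _ in ranked[1:]],
            "volume": sum(vol for _, vol in members),
        })
    return sorted(pages, key=lambda page: -page["volume"])


clusters = {
    "glasgow": [("glasgow jobs", 8100), ("jobs in glasgow", 27100), ("jobs glasgow", 8100)],
    "driving": [("driving jobs", 14800), ("delivery driver jobs", 5400), ("van driving jobs", 2900)],
}
sitemap = build_sitemap(clusters)
for page in sitemap:
    print(page["url"], page["volume"], page["secondary"])
```

Ordering pages by total cluster volume gives a rough priority list for which pages to write first.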

6. Conclusion and more reading

By following this methodology you will produce an intuitive and search optimised sitemap for your site. For more information on clustering keywords, watch this video on modern keyword research from Moz’s Rand Fishkin.

Happy clustering!

The USA’s most searched for employers

Some organisations struggle to hire employees, but others are in the fortunate situation of having people seek them out for employment. It would be interesting to know which companies are actually being sought by potential employees, and in what numbers.

The nuances of this are actually quite complex, as there are a number of difficulties to take into consideration:

  • The different ways a searcher may express interest in employment e.g. careers, applications, jobs, job, job in a location etc.
  • The possible misinterpretations of brand names
  • The sheer numbers of brands to be examined.

The best (though far from perfect) approach is therefore to treat every brand consistently. This table shows the number of exact Google queries for a brand + job originating from the United States in October 2012.

# Brand Searches + jobs
1 Government 198500
2 UPS 60500
3 Google 59400
4 Walmart 49500
5 Kaiser Permanente 49500
6 Disney 37000
7 Fedex 29000
8 TSA 27100
9 FBI 27100
10 Boeing 27100
11 Home Depot 27100
12 Target 27100
13 Apple 22200
14 Costco 22200
15 United Nations 22000
16 Lockheed Martin 22000
17 USPS 21400
18 Safeway 18100
19 Yahoo! 18100
20 Amazon 18100
21 McDonalds 16200
22 AT&T 16200
23 Homeland Security 14800
24 Macy’s 14800
25 Coca Cola 13200
26 Raytheon 12100
27 Microsoft 9990
28 CIA 9900
29 USAA 8100
30 Target 8100
31 Verizon 8100
32 JC Penney 8100
33 Budweiser 8100
34 BNSF 8100
35 Intel 8100
36 Bank of America 6660
37 US Army 6600
38 IRS 6600
39 Northrop Grumman 6600
40 KBR 6600
41 Whole Foods 6600
42 Pepsi 6600
43 YMCA 6600
44 Anheuser Busch 6600
45 Marriott 6600
46 Air Force 5400
47 Navy 5400
48 PG&E 5400
49 Harris Teeter 5400
50 Pizza Hut 5400
51 Royal Caribbean 5400
52 Chase 5400
53 HP 5400
54 Kroger 5400
55 Carnival Cruise 5400
56 Frito Lay 5400
57 Delta Airlines 5400
58 GE 5400
59 Walgreens 5400
60 Blackwater 4400
61 Wendys 4400
62 Ebay 4400
63 Panera Bread 4400
64 Cisco 4400
65 Sprint 4400
66 Aramark 4400
67 Old Navy 3600
68 Peace Corps 3600
69 Chrysler 3600
70 Dyncorp 3600
71 Oracle 3600
72 John Deere 3600
73 Staples 3600
74 USDA 2900
75 National Guard 2900
76 Albertsons 2900
77 T-Mobile 2900
78 Winn Dixie 2900
79 Wegmans 2900
80 Holiday Inn 2900
81 Sams Club 2900
82 American Express 2900
83 Time Warner 2900
84 GAP 2900
85 Kohls 2400
86 Nestle 2400
87 Lowes 2400
88 Weatherford 2400
89 Hollister job application 2400
90 Hershey 2400
91 Ford 2400
92 Dell 2400
93 Labcorp 1900
94 Circle K 1900
95 BP 1900
96 Aldi 1900
97 Rite Aid 1900
98 Sysco 1900
99 Fred Meyer 1900
100 General Dynamics 1900


Where several queries show a very clear shared intent and similar numbers, they are combined into a single entry e.g. ‘government jobs’, ‘federal government jobs’, ‘gov jobs’ and ‘us government jobs’ are aggregated into ‘Government’.
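The aggregation can be sketched in Python as a lookup from each query variant to a single entry. The alias table below only covers the ‘Government’ example, and the counts are made-up numbers purely to show the arithmetic, not real search volumes.

```python
from collections import Counter

# Hypothetical alias table: query variant -> single entry in the ranking.
ALIASES = {
    "government jobs": "Government",
    "federal government jobs": "Government",
    "gov jobs": "Government",
    "us government jobs": "Government",
}


def aggregate(query_counts):
    """Combine variants into one entry each and rank by total count."""
    totals = Counter()
    for query, count in query_counts.items():
        # Queries without an alias are left as their own entry.
        totals[ALIASES.get(query, query)] += count
    return totals.most_common()


# Made-up counts purely to show the arithmetic; not real search volumes.
example = {
    "government jobs": 4,
    "gov jobs": 3,
    "federal government jobs": 2,
    "us government jobs": 1,
    "ups jobs": 5,
}
ranking = aggregate(example)
print(ranking)
```

Keeping the aliases in one table makes the consistency of the approach easy to audit: every merged entry is visible in a single place.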

The top companies are a mixture of aspirational employers e.g. Google (3rd) and the FBI (9th), and major employers such as UPS (2nd) and Walmart (4th). To put this in perspective, Walmart receives roughly 17% less interest than Google despite employing 2.1 million people, roughly 39 times more than the famous search engine.
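For anyone who wants to check the percentage, the calculation is simply the gap between the two search counts divided by Google’s count, using the figures from the table above:

```python
# Search counts taken from the table above.
google_searches = 59400
walmart_searches = 49500

# Walmart's shortfall relative to Google, as a fraction of Google's count.
shortfall = (google_searches - walmart_searches) / google_searches
print(f"Walmart receives {shortfall:.1%} less interest than Google")
```

The same formula works for any pair of employers you want to compare.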

A Human Resources department concerned with measuring the attractiveness of its organization can easily use a similar methodology to gauge its appeal to potential employees versus competitors, and to measure its changing appeal as an employer over time.

For a more general view of brand value via keyword research, try ‘Tracking a brand with keyword research’.

Getting hired with keyword research

Human Resources professionals will tell you that in a competitive labor market, job seekers must compete to be the most visible to recruiters and hiring managers who are searching for their skill set.

Most recruiters aren’t searching Google for their prospects. Instead they use job boards like Monster and Indeed or, increasingly, professional social networks like LinkedIn, XING (German speakers) or Viadeo (French speakers).

But what exactly are recruiters searching for? Just like a company ranking in Google for a phrase that no customer searches for, if recruiters aren’t searching for what you’re ‘visible’ for then it’s not going to help you get a job.

Most of the time a recruiter or hiring manager is looking for candidates with experience similar to the job they are recruiting for so they will be searching for job titles and skills relating to the open job.

Therefore by understanding the language used in open jobs you can indirectly understand what is being searched for by the people you might want to find and hire you.

To do this accurately you’ll need a huge index of jobs. Step forward Indeed, a site whose business model revolves around aggregating jobs from employers and other job boards, with circa 3 million listings in the United States alone.

Therefore, if we want to see the number of times a phrase is used in job ads, it’s simply a matter of running a job search using the ‘With the exact phrase’ field of the Indeed advanced search and checking the number of results returned for that phrase.

As a practical example let’s take a computer security specialist who is deciding whether to optimise primarily as an IT Security or Information Security specialist.

Exact text matches (USA) 16th October 2012

Keyword Jobs returned
it security 5,113
information security 11,582

From this we can see that there are roughly double the number of jobs referencing ‘information security’.

Or let’s say an SEO wanted to appear prominently in recruiters’ searches (and what SEO would turn down the challenge?):

Exact text matches (USA) 16th October 2012

Keyword Jobs returned
seo 7,999
search engine optimization 2,659
natural search 218
organic search 365

Here we can see that SEO is by far the most popular term referenced in job ads and therefore it is the most suitable synonym to optimise for.
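The comparison is easy to repeat for any set of synonyms. This minimal Python sketch ranks the phrases by the job counts above and shows each phrase’s share of the combined total. The share is only a rough guide, since a single ad may contain more than one of the phrases.

```python
# Job counts from the exact-phrase searches above (USA, 16th October 2012).
SEO_COUNTS = {
    "seo": 7999,
    "search engine optimization": 2659,
    "natural search": 218,
    "organic search": 365,
}


def rank_phrases(counts):
    """Return (phrase, jobs, share of combined count), most used first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    return [(phrase, jobs, jobs / total) for phrase, jobs in ranked]


ranking = rank_phrases(SEO_COUNTS)
for phrase, jobs, share in ranking:
    print(f"{phrase}: {jobs} jobs ({share:.0%})")
```

Swap in the counts for your own skills to see which wording deserves the most prominent place in your profile.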

Obviously it is important to use the description most appropriate for your skill-set: if you really do know more about IT Security than Information Security, you should optimise for that. But where there are straightforward synonyms like SEO and Search Engine Optimization, you should use all of them somewhere in your C.V. and profile, and you can make yourself ‘more visible’ just by concentrating on the most popular phrases.

Optimise your CV (resume) and LinkedIn profile with the search terms used by recruiters and hiring managers.