Analysing Search Console Data in Pandas

Every SEO has pushed Excel beyond its limits at some point. Pandas (the ‘Python Data Analysis Library’) is a widely used Python library that can handle far more data than Excel or Google Sheets.

As an example, here is a Jupyter Notebook with Python / Pandas code that (a rough sketch follows below):

  • Loads a .CSV export from Search Console > Performance > Queries
  • Adds some data features
  • Graphs the correlations
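As a minimal sketch of what such a notebook might do, assuming the standard export column names (Query, Clicks, Impressions, CTR, Position) and a placeholder filename:

import pandas as pd

# Load the Search Console > Performance > Queries export
df = pd.read_csv("Queries.csv")

# Add some simple features for each query
df["query_length"] = df["Query"].str.len()        # characters in the query
df["tokens"] = df["Query"].str.split().str.len()  # number of words
df["is_brand"] = df["Query"].str.contains("recruitin", case=False).astype(int)

# Correlate the features with Clicks (this is what gets graphed below)
features = ["Clicks", "Impressions", "Position", "query_length", "tokens", "is_brand"]
print(df[features].corr()["Clicks"].sort_values(ascending=False))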

Example data from recruitin.net:

Graphed, the correlations look like this:

Correlation v Clicks

The notable insight from the recruitin.net data is that, thanks to very strong rankings and a relatively unknown brand, generic terms drive more clicks than brand terms. In this particular niche, query specificity (query length and number of tokens) has a very weak impact on clicks.

Feel free to download the workbook from GitHub and use it on your own data as you wish.


Managing crawl budget for large sites

If you manage a site with millions or even billions of URLs, it’s important to consider that Google and Bing have a crawl budget for every domain: a limit on the number of URLs they are prepared to crawl, determined by the domain’s authority.

If a less-authoritative domain has billions of URLs, Google won’t crawl all of them, potentially leaving important sections of your site uncrawled and losing you traffic.

So one of the biggest SEO challenges for large ecommerce sites is balancing:

Not missing traffic by excluding product and aspect (aka attribute) combinations that have search demand

vs.

Not spamming the search engines with many combinations for which there is no demand

An example

For example, Size and Colour aspects for sites that sell both televisions and shoes:

Televisions
  • Size: high search volume, important filter aspect
  • Colour: low search volume, unimportant filter

Shoes
  • Size: low search volume, important filter
  • Colour: high search volume, important filter aspect

Why not open up every combination?

Take one category, e.g. men’s sports shoes, with 6 aspects in your catalogue:

Aspect      Example             Number of aspect values
Brand       Nike                150
Shoe size   10                  30
Colour      Blue                15
Style       Basketball shoes    5
Material    Leather             4
Line        Air Jordan          200

Every combination of every aspect value multiplies up very quickly:

150 * 30 * 15 * 5 * 4 * 200 = 270,000,000

That is, 270 million possible URLs for this category alone!

The solution

Understand which aspects have values that people actually search for in Google, and only open those aspects for crawling.

You’ll need a large and representative keyword set, potentially millions of keywords depending on the scope of your site, but here’s a rough illustration using a limited keyword set:

Keyword (category: mens trainers)   Matching aspect 1   Matching aspect 2   Avg. monthly UK Google searches
mens white trainers                 Colour                                  2,900
mens running trainers               Style                                   2,900
mens black trainers                 Colour                                  2,400
nike mens trainers                  Brand                                   1,600
white trainers mens                 Colour                                  1,600
mens trainers uk                    None                                    2,400
black trainers mens                 Colour                                  1,900
mens gym trainers white             Style               Colour              1,300
all black trainers mens             Colour                                  1,000
mens red trainers                   Colour                                  880

The full list is here; note the data has been tweaked to better illustrate the concept. Also, for American readers: trainers = sneakers 🙂

The search volume for the keywords can be clustered into the category’s aspects e.g.

Aspect      Searches containing a value for this aspect
Colour      19,920
Style       13,420
None        6,000
Brand       3,750
Material    3,150
Size        2,720

From this we can see that Colour and Style are important to open up to crawl, Material and Size less so.
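If you would rather do this aggregation in pandas than in a spreadsheet, a minimal sketch using a few of the rows above (a keyword matching two aspects simply gets one row per matched aspect):

import pandas as pd

# One row per keyword / matched aspect, with its average monthly search volume
keywords = pd.DataFrame([
    ("mens white trainers",     "Colour", 2900),
    ("mens running trainers",   "Style",  2900),
    ("nike mens trainers",      "Brand",  1600),
    ("mens trainers uk",        "None",   2400),
    ("mens gym trainers white", "Style",  1300),
    ("mens gym trainers white", "Colour", 1300),
], columns=["keyword", "aspect", "searches"])

# Total search demand per aspect, highest first
aspect_demand = keywords.groupby("aspect")["searches"].sum().sort_values(ascending=False)
print(aspect_demand)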

Good:

  • https://site.com/trainers/blue
  • https://site.com/trainers/running
  • https://site.com/trainers/blue-running

Potentially a waste:

  • https://site.com/trainers/size-10
  • https://site.com/trainers/leather/size-10

Just removing Material and Size dramatically reduces the number of aspect combinations:

150 × 15 × 5 × 200 = 2,250,000

That saves 267,750,000 URLs from needing to be crawled. Not bad!

As aspects can be common across categories (e.g. size and colour), to exclude them selectively per category, append a string that you have disallowed in robots.txt to the URLs in those categories you choose not to have crawled, e.g. https://site.com/trainers/size-10?search=nope

And then in your robots.txt:

User-agent: *
Disallow: /*search=nope

Summary

Sites with broad inventories, whether products, jobs, holiday destinations or anything else, should be careful to open for crawling only those aspect combinations for which there is real external demand.

Also important: sign up to my totally unrelated side project, Mustard Threads 🙂

Side Project Marketing On A Shoestring

Why

Practically everyone has considered, is considering, or will consider running a side project at some point. It can be a hobby, a way to learn, a portfolio booster, a source of extra cash or a lottery ticket out of the day job. Whatever the motivation, it’s tough to beat the thrill of seeing other people using your creation.

Most people, though, won’t plough much money into marketing their idea, at least until it starts to show real promise or generate its own income.

This therefore is a guide to promoting your idea with little or no money.

1.0 Set objectives

Before you start, decide what you actually want to achieve with the project e.g.

  • Test whether an idea has commercial potential before quitting your job
  • Show employers what you can do
  • Generate traffic and page-views to earn affiliate or ad revenue

Deciding this gives you a clear direction for your marketing.

2.0 Before you launch

2.1 Do your research

Firstly conduct keyword research on your market, which will:

  • Make sure there actually is some interest
  • Ensure you’re using language users understand
  • Allow you to get the full benefits of SEO
  • Provide structure for your AdWords trial

This is the point where you normally find three other tools already doing your totally original idea. It’s annoying, but don’t let it deter you; it’s all about marketing and execution. Just ask Tom:

MySpace Tom


2.2 Set up a holder page

Well before you launch, set up a holder page with:

  • A line to describe your project and its benefits, e.g. ‘An app that helps accountants file taxes faster’
  • A more detailed (but still concise) explanation
  • A way to register for more interest
  • Social share prompts

If a representative user can’t tell you in a couple of seconds what the project is about, you need to tweak some more.

Having a good holder site will ensure:

  • Your SEO will hit the ground running
  • You can direct anyone interested there for more info
  • You gain a bit more credibility
  • You can capture the details of anyone who shows up in the meantime

Launch Rock popularised and systematised this approach, but a basic HTML template + Mail Chimp is also free and can be a bit less fiddly in practice.

As an example, here’s the pre-launch screenshot of the holder page for Mustard Threads.

Mustard Threads Holder
Click for the full page


2.3 Make friends

If there are any authorities in your niche, now is the time to start making friends with them on Twitter, LinkedIn or wherever they interact mainly. Building relationships with bloggers, popular Twitter users and journalists now will pay dividends later.

Search the keywords that you have identified in your keyword research and see who shows up in Google. Identify Twitter hashtags used by your audience, e.g. for men’s fashion startup Mustard Threads, #GentsChat attracts an audience likely to be interested. Be genuine and gracious and it will work out. You can often also get new insights into your project’s functionality.

LinkedIn can be a great way to reach people. Use RecruitEm to find journalists, industry influencers etc., and connect with a personalised message.

Register all your social accounts (Twitter handle, Facebook Page name etc) at this point.

3.0 Launch!

So your creation is ready and you’ve replaced your holder with the real thing. Many people like to throw everything at a big day one launch. That can sometimes be a good idea, like when you’re gunning for an app store ‘Most Popular’ or ‘Top Rated’, but in my experience both corporate and personal, a gradual escalation of marketing is almost always better.

A ‘soft’ / beta launch allows you to:

  • Identify functional and UX bugs without alienating your most valuable users
  • Test and improve your messaging
  • Spend whatever money you do allocate wisely

It will differ by project but here’s my suggested running order:

3.1 You and the rest of the team

If you’re even remotely in your target market, you should definitely be using (or ‘dogfooding’) what you’ve created. This will prevent user-generated-content projects from being a ghost town for early users and will flag up any UX or functional issues.

Yum!


3.2 Your long-suffering friends and family

Your friends and family (presumably) like you, hopefully enough to give your project a go. Obviously it’s tricky if you’ve created something really niche (your .htaccess generator might be a bit lost on your grandma), but chances are you’ll have some people in your close circle who would enjoy or benefit from it.

At this point you’ll want to stay close to your new users to get any feedback as to how you can improve the UX and functionality.

Tools like Doorbell and Podio offer free feedback functionality you can embed in your site or app.

3.3 Use your wider network

Find a way to promote your project on every social profile you have without spamming everyone to death. A nice Facebook share with a request for feedback seems to work well.

Most people, inevitably, will be uninterested, but you might be surprised at who becomes a user and even an advocate for you.

On Twitter, unless you’re a terrible pop star with millions of followers, make sure you send a few tweets at different times of the day, with hashtags relevant to your target audience.

If you have a personal blog, write an announcement post. Later on you can reach a wider audience on Medium.com, which is the de rigueur place for startups to communicate these days.

Obviously also consider Google+, Pinterest etc. as suits your project and goals.

3.4 Send those emails

Hopefully during the course of building your project, some judicious social sharing, personal networking and random SEO will have generated a few sign-ups on your holder page.

Now is the time to cash this in: email them to let them know you’re good to go, tell them they’re among the first to use it, and ask for as much feedback as they are prepared to give.

3.5 Accelerate SEO

SEO is a whole topic in itself, and generally a slow burn, but at a minimum you should have:

  • Identified the keywords to rank for
  • Included optimised metadata (page titles and meta description tags)
  • Built links from wherever you can e.g.
    • Other personal projects and personal blogs
    • Other blogs and news sites writing about your project

3.6 Test paid search

Google throws vouchers around like confetti to encourage new advertisers onto its AdWords program. Voucher values are usually up to around £120, so if, for example, you’re paying £0.30 per click, that will get you around 400 clicks from possible new users.

To get a voucher you can either join their ‘partner program’, in which case they will start mailing you vouchers periodically, or, if you’re desperate, go and flip through the web development magazines in your newsagent.

You can use the results of the keyword research to build out your campaign. If it turns out all the AdWords traffic bounces, you probably picked the wrong keywords.

3.7 In person networking

Check Meetup for events related to your project. Practice a little description of what you do before you go so it sounds slick when you’re mingling.

If there is nothing specific to what you do, there will usually be a generic ‘startup’ event you can attend. They do tend to be full of people too objectionable to hold down a job, but sometimes you’ll strike gold.

3.8 Press & blogger outreach

If your service is genuinely interesting, new, or a timesaver, or at least there’s an interesting angle on it (it uses a trendy gadget, a cult celebrity uses it, you built it while in prison etc.), you can usually get someone to write about it.

In my experience it’s nearly impossible to get the mainstream media (newspapers etc.) to write about you, but if you fancy a go, try Muckrack. Blogs within a niche, e.g. recruitment or SEO, will often be happy to write about you, sending you quality links (an SEO win) and traffic.

3.9 Staying in touch

Encourage users to follow your Twitter feed, like your Facebook page and/or sign up to email alerts, all of which drive repeat visits.

3.10 Social Sharing

Make sure users can easily share on social. Consider what usually makes people share:

  • Ego – something about the user that flatters their ego
  • Inherent reward – get 10 extra points on your gamification system
  • Humour – Users share something funny so people think they are funny and like them more
  • Controversy – Tricky to pull off, but people do share causes etc

4.0 What not to do

This isn’t a blog post for well-funded startups working full time on their next ‘unicorn’; it’s for those creating projects in their spare time. Don’t lean on work contacts or resources to help; it’s probably your day job that pays the rent, so don’t jeopardise that.

Moreover, though laws differ from country to country, if you’re using work time, computers, contacts etc. to work on your project, then should it actually become commercially valuable, your employer will have a strong case to assert ownership.

5.0 Next steps

After you’ve got a solid base of users for your idea, keep soliciting feedback, checking analytics (Google Analytics, Piwik etc.), running guerrilla UX tests and improving it.

Hopefully your service will see a steady stream of new users, retain its existing users, and hit all the objectives you’ve set for the project.

After that, just maybe, you might make it big.

Automatically check Google Sitelinks

When Google regularly swaps the organic sitelinks under your brand, it can be a pain to check multiple sites and markets to make sure they’re the way you want them.

An example of sitelinks, for eBay on Google UK.

To solve this problem (for me at least), I’ve written a simple PHP script which, when run daily, will check this and email you if there are any changes.

Example email when sitelinks change

Here’s the link to the GitHub repo: https://github.com/ChrisCB/sitelinks-watch, and the direct download.
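The repo itself is PHP, but the daily check boils down to a simple pattern: fetch the current sitelinks, compare them to the last stored set, and email if they differ. A rough Python outline of the same idea, with the SERP-fetching left as a hypothetical helper:

import json
import smtplib
from email.message import EmailMessage

def fetch_sitelinks(query):
    # Hypothetical helper: return the list of sitelink titles currently
    # shown for `query` (the real script does this part in PHP).
    raise NotImplementedError

def check_sitelinks(query, store="sitelinks.json", to_addr="you@example.com"):
    current = fetch_sitelinks(query)
    try:
        with open(store) as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = None

    # Email only when the sitelinks have changed since the last run
    if previous is not None and current != previous:
        msg = EmailMessage()
        msg["Subject"] = f"Sitelinks changed for '{query}'"
        msg["From"] = msg["To"] = to_addr
        msg.set_content(f"Before: {previous}\nAfter: {current}")
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)

    # Store today's sitelinks for tomorrow's comparison
    with open(store, "w") as f:
        json.dump(current, f)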

Feel free to take the code and do what you like with it; obviously there’s no warranty, and it’s completely as-is, imperfections and all.

I’ll be improving this in the future, if you have any requests, or pointers for improvement, let me know!

Best practice: Planning a sitemap with keyword clustering

Keyword research is not only crucial for SEO and a powerful methodology for understanding the intentions and language of your market; by clustering the results you can also plan a website’s structure. This ensures:

  • Optimal search engine visibility
  • A taxonomy aligned to your market’s mental model
  • Selection of terminology understood by your market

In this practical step-by-step guide, I’ve used the example of planning a new job board, however the methodology is valid for all industries.

1. Getting started

Assuming we’ve already used Google Trends or market knowledge to identify it as the correct seed term, here are the downloaded results for a query of ‘jobs’ in the Google Keyword Planner. It’s important to select the correct market (in this case the UK) and turn on ‘Only show ideas closely related to my search terms’, otherwise you’ll spend much longer sorting through irrelevant keywords.

Raw data export
The CSV download of a query for searches similar to ‘jobs’ in the UK

Delete the additional columns created by default (Competition, Suggested Bid etc.), leaving only Keyword and Average Monthly Search Volume, sorted high to low.

2. Cleaning the list

As with any keyword research, it’s important to check each term against these three criteria, ranked in order of importance:

  1. Is it relevant? Do I have content on my site which relates to this?
  2. Is there sufficient search volume? Do enough people search for it to make it worthwhile?
  3. Is it achievable? Will my site, now or in the future realistically have enough authority to rank for this?

First, remove the irrelevant terms, e.g. imagine your hypothetical job board doesn’t:

  • Recruit for specific employers e.g. ‘tesco jobs’ or ‘mcdonalds jobs’
  • Wish to compete for big competitor brand terms like ‘guardian jobs’
  • Recruit for jobs overseas
  • Want any vague or irrelevant terms like ‘good jobs’ or ‘boob jobs’

Afterwards you will be left with a reduced list of only those terms relevant to your business; in this example around 85% of the terms we started with. The proportion will vary for you based on how focused your business is on a specific niche.
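If your keyword list is too long to clean comfortably in Excel, the same filtering can be sketched in pandas; the patterns below simply mirror the hypothetical exclusions above and the filename is a placeholder:

import pandas as pd

# Keyword Planner export, assumed columns: "Keyword" and "Avg. Monthly Searches"
kws = pd.read_csv("jobs_keywords.csv")

# Patterns mirroring the exclusions above: specific employers, competitor
# brands, overseas roles and vague or irrelevant terms
irrelevant = r"tesco|mcdonalds|guardian|abroad|overseas|good jobs|boob jobs"

cleaned = kws[~kws["Keyword"].str.contains(irrelevant, case=False, regex=True)]
print(f"Kept {len(cleaned)} of {len(kws)} keywords")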

3. Identifying user intent

Next you need to understand exactly what solutions people are looking for. This is as much an art as a science. Our refined example list contains every possible segmentation of the jobs market, e.g.

  • Salary e.g. ‘100k jobs’
  • Educational level e.g. ‘graduate jobs’
  • Industry e.g. ‘jobs in sport’
  • Location e.g. ‘jobs in kent’

However, the market is most obviously divided between those looking for specific skills and industries and those looking for jobs in particular locations, with the latter dominating. Approximately 25% of the terms, with a cumulative 1.6m search volume, relate to finding a job in a specific location.

Keyword list, locations only
201 of the terms with a cumulative 1.6m search volume related to a specific location.

The remainder are largely searches by function, e.g. ‘marketing jobs’, or industry, e.g. ‘music jobs’.

4. Clustering the keywords

Since Google’s Hummingbird update there has been much more focus on clustering keywords; however, it has always been the case that the same user intent is represented by multiple keywords, and these should be grouped during the planning phase of a new site. The only real difference is that we can now rely on Google being somewhat better at identifying user intent, so our groups can be broader.

In the location segment, we can clearly see many keywords with the same intent and similar strings, e.g.

  • jobs in glasgow (27100)
  • glasgow jobs (8100)
  • jobs glasgow (8100)

These should be clustered together in Excel, e.g.

Example of keywords clustered by theme
Grouping keywords by user intent – jobs in London and in Glasgow.
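If you prefer to do this grouping programmatically rather than by hand in Excel, one simple approach is to cluster keywords on their set of non-stopword tokens, which puts ‘jobs in glasgow’, ‘glasgow jobs’ and ‘jobs glasgow’ into the same bucket. A minimal sketch (the ‘marketing jobs’ volume is a placeholder for illustration):

from collections import defaultdict

keywords = {
    "jobs in glasgow": 27100,
    "glasgow jobs": 8100,
    "jobs glasgow": 8100,
    "marketing jobs": 10000,  # placeholder volume, for illustration only
}

STOPWORDS = {"in", "the", "for"}

def signature(keyword):
    # Cluster key: the sorted set of non-stopword tokens
    return tuple(sorted(set(keyword.split()) - STOPWORDS))

clusters = defaultdict(dict)
for kw, volume in keywords.items():
    clusters[signature(kw)][kw] = volume

for sig, members in clusters.items():
    print(sig, members, "cluster total:", sum(members.values()))

Note that a token-based signature only catches clusters with similar strings; the semantically linked groups discussed next still need human (or semantic) judgement.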

It gets more interesting when we look at the professions and industries segment. As Google has improved at understanding concepts, we can now legitimately group together keywords that are semantically linked but have dissimilar literal strings, for example:

  • driving jobs (14800)
  • hgv jobs (12100)
  • delivery driver jobs (5400)
  • delivery jobs (5400)
  • chauffeur jobs (5400)
  • bus driver jobs (4400)
  • hgv driving jobs (3600)
  • van driving jobs (2900)

All these terms show a similar user intent; whether you choose to break a term out into its own page is a judgement call based on its importance to your business. In this example it’s arguable that ‘hgv jobs’ is sufficiently distinct and popular to deserve its own page.

This needs to be completed for all the major segments you identified, which will probably take around a day, depending on the size of your niche and your mastery of Excel shortcuts. As you progress you will see patterns emerge and get a sense of the language and requirements of your market.

5. Building the sitemap

As you group the keywords in Excel you will see the sitemap emerge, with each page optimised for its most popular keyword but referencing the other keywords in the group.

A simplified example of a sitemap made by clustering keywords

When producing copy for these pages it’s ideal if you can, while keeping the user first in mind, use all or most of the keywords in the cluster.

6. Conclusion and more reading

By following this methodology you will produce an intuitive and search optimised sitemap for your site. For more information on clustering keywords, watch this video on modern keyword research from Moz’s Rand Fishkin.

Happy clustering!

Parsing the Google referral string in a post (not provided) world

Note: as of March 2016, Google no longer passes this information in the referral string.


As of September 2013 Google prevented site owners from seeing all organic referring keyword data in the referral string.

However there is still plenty of data to be gleaned from the string. For quick testing the HttpFox Firefox plugin is excellent. Systematically capturing the data is easily done in any web analytics tool or server log parser using simple Regex.

It’s important to note that this data appears not to be passed from mobile searches which may somewhat skew any conclusions.

1) The rank of the link that the user clicked

To understand the rank of the result the user clicked to arrive at your site, look at the ‘cd’ key/value pair, e.g.

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA

cd=1 indicates the clicked listing was in first place, cd=3 third place etc.

It does however get more complex when authority links and universal search are included on the Search Engine Result Page (‘SERP’), which will happen in most cases.

The universal search results are counted within the SERP and must be considered; in the example below it’s possible to have a cd value of up to 16 on page 1.

CD variable by SERP result
Orange numbers represent the ‘cd’ value

2) The type of link clicked (search, news, image etc)

The ‘ved’ parameter indicates what type of result has referred a visitor to your site.

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA

This has been well documented by Tim Resnik in this excellent post on moz.com.

Here’s a marginally more verbose version of Tim’s table. Note these are substrings of the total value;

VED value (substring)   This means
QFj                     A normal organic search result
QqQIw                   A news OneBox link (e.g. 11, 12 & 13 above)
QpwI                    A news OneBox image (e.g. 11 above)
Q9QEw                   A video OneBox link
Qtw1w                   A video OneBox image
QjB                     An authority link (e.g. #2–4 on the screenshot)
BEPwd                   A Knowledge Graph image
BEP4d                   A secondary Knowledge Graph image


3) The local version of Google searched by the user

This is straightforward: you can clearly see the Top Level Domain (TLD) of the Google search that referred the visitor. In this example you can see Google UK:

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA

To simulate this quickly, try the Search Latte international search tool.

4) The landing page URL

The ‘url’ variable is another nice easy one to decipher;

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA

Note the address itself is percent-encoded, hence http%3A%2F%2F represents http://.

5) Is the user logged in to Google?

Finally, the ‘sig2’ parameter only appears when a user is logged in to Google, therefore you can determine the proportion of users arriving at your site authenticated with Google.

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA
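Pulling all five of these out of a referral string is straightforward with the Python standard library; a minimal sketch:

from urllib.parse import urlparse, parse_qs

referral = ("http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja"
            "&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.ebay.co.uk%2F&ei=oe9GUtLDMJPT7AbkZw"
            "&usg=AFQjCNGEltn-KekW3pKDE9fUDb2NKpKbWw&sig2=BecHxquQThrLWIGjk9VEMA")

parsed = urlparse(referral)
params = parse_qs(parsed.query)

rank         = params.get("cd", [None])[0]    # 1) position of the clicked result
result_type  = params.get("ved", [None])[0]   # 2) encodes the type of result clicked
google_tld   = parsed.netloc                  # 3) e.g. www.google.co.uk
landing_page = params.get("url", [""])[0]     # 4) parse_qs percent-decodes the landing page
logged_in    = "sig2" in params               # 5) sig2 is only present for logged-in users

print(rank, result_type, google_tld, landing_page, logged_in)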


What does any of this mean?

Obviously the loss of the referring keyword is a blow to the accuracy of any SEO reporting. But the above will at least allow site owners to answer questions like:

  • Does traffic from different ranks convert at different rates?
  • Does traffic from different types of search result behave differently?
  • What proportion of visitors arrive at your site from different local versions of Google?

A trick for secure and unique passwords

Most people need to remember secure (long and complex) passwords for dozens of different services. As this requires the mental horsepower of Rain Man, they tend to use the same one or two passwords everywhere.

Come the inevitable day that one of these services is breached, every one of the user’s other accounts using the same password is vulnerable.

A trick to solve the problem occurred to me the other day, and I don’t see it documented anywhere else, so here it is:

  1. Use a base passphrase*, e.g. AllYourBaseAreBelongToUs!xx
  2. Where xx is a number, e.g. AllYourBaseAreBelongToUs!27
  3. Decide on a memorable base number of 2 or greater, e.g. 6
  4. Choose a letter position to use for every service, e.g. the 2nd letter
  5. Then for each service take the 2nd letter of its name, e.g. Google is ‘o’, Yahoo is ‘a’ etc.
  6. ‘o’ is the 15th letter of the alphabet, so multiply your base number by 15, e.g. 6 * 15 = 90
  7. Use this number as the variable in your passphrase
  8. So you have a unique password for Google of AllYourBaseAreBelongToUs!90

This way you can have a unique password for each service, without the hassle of remembering a wholly unique password every time.
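Since the scheme is entirely mechanical, it can be written as a few lines of Python; a sketch of the steps above, with defaults matching the example:

import string

def service_password(service,
                     base_phrase="AllYourBaseAreBelongToUs!",
                     base_number=6,
                     letter_position=2):
    # Take the chosen letter of the service's name (the 2nd letter by default)
    letter = service.lower()[letter_position - 1]
    # Its position in the alphabet: 'o' -> 15
    alphabet_index = string.ascii_lowercase.index(letter) + 1
    # Multiply by the memorable base number and append it to the passphrase
    return f"{base_phrase}{base_number * alphabet_index}"

print(service_password("Google"))  # AllYourBaseAreBelongToUs!90
print(service_password("Yahoo"))   # 'a' -> 1, so AllYourBaseAreBelongToUs!6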

*Why you should use a passphrase, by the awesome xkcd:
Through 20 years of effort, we've successfully trained everyone to use passwords that are hard for humans to remember, but easy for computers to guess. By XKCD.

The ultimate research methodology

This blog is about the light that the cumulative searches of hundreds of millions of individuals can shine on the world in a way that traditional sources of insight cannot.

So what makes keyword research better than other research methodologies? Its primary strength lies in its lack of bias. This impartiality is born of the intimacy between a searcher and their search box, which simply can’t be replicated at scale any other way.

For example it’s unlikely that if asked in a survey, many of the 165,000 global searchers using Google to find information about ‘flatulence’ in July 2012 would admit that it was their primary concern. Perhaps they might instead choose to align themselves with the more socially concerned (and fragrant) 60,500 people searching for ‘cure for cancer’ in the same month.

When a user enters their search they are speaking to a machine: they have a need and, as best they are able, they clearly and explicitly state that need.

These searches range from the mundane (“where can I buy Nespresso capsules”) to the hilarious (“why does my mom smell”) to the potentially tragic (“test for aids”).

Whatever a searcher’s intention, every time a search is made it is added to the aggregate statistics for the informed researcher to mine.

The strength of this new source of understanding lies not only in its candour; it is also unprecedented in its scale. Google, with around 66% of the search engine market, is queried 400 million times per day. Extrapolated to the whole search market, that’s around 600 million searches, a sample size that few other research methodologies can hope to match.

Search data versus social data (Part 1)

Social networks such as (in Anglo-Saxon countries) Twitter, Facebook and LinkedIn are often portrayed as the modern mirror of the people.

The immense data held, particularly by Facebook, is often quoted as having the key to understanding people on a macro and individual level.

For example, Facebook knows where you live, who your friends, colleagues, family are, where you go for fun, where you go on holiday and your favourite TV shows. Surely this is the ultimate data set for understanding humanity on a grand scale?

Well, no, and here’s why.

Perception versus reality

When an individual creates content on a social network, particularly those where real names are encouraged such as Facebook or Google+, they are typically at least as conscious of the impact this will have on others’ perception of them as they would be when talking in person with people they know.

The reason for this is that the average Facebook user has around 130 friends, but 7 close ‘real life’ friends, therefore any statement on Facebook is likely to reach a much wider and more diverse audience than one made in person.

As such, most social media users will screen themselves, conscious that their content may reach the eyes of family, co-workers, less close acquaintances and quite probably strangers and that each group of people may react in different ways.

For example, a political opinion expressed to 4 or 5 close friends is less likely to be challenged than one made to a diverse group of more than a hundred people from separate parts of one’s life.

Beyond the user’s direct connections, the ability for a particular piece of content to be shared is virtually limitless as a number of individuals writing indiscreet Twitter updates or posting Facebook photos have found.

Instead, individuals conduct themselves on social networks in the way they wish to be perceived by this broad community of people, rather than as they truly are.

In social, users update with socially acceptable facets of their life.
“I’m on the train”
“I’m looking forward to my holiday”
“My cat is adorable”.

It would be an unusual breach of convention for users of social media to ask where they can find some good pornography, and yet that demand clearly exists: there are 277 million porn-related searches made from the comfortable anonymity of the search box every month.

And it’s not only sexual interests and personal hygiene problems that are directed at search and not social. If you need information regarding a specific purchase, let’s say a lamp or a refrigerator, you are much more likely to start with a Google search than to ask your friends, who are relatively unlikely to have specialist knowledge about specific products.

With the average Facebook user having around 130 friends, compared to tens of billions of indexed pages in Google and Bing, many of them written by niche experts and specialist retailers, it’s clear that online search is a more effective way of researching such needs.

To be continued…

The USA’s most searched for employers

Some organisations struggle to hire employees, but others are in the fortunate position of having people seek them out for employment. It would be interesting to know which companies are actually being sought out by potential employees, and in what numbers.

The nuances of this are actually quite complex, as there are a number of difficulties to take into consideration:

  • The different ways a searcher may express interest in employment e.g. careers, applications, jobs, job, job in a location etc etc
  • The possible misinterpretations of brand names
  • The sheer numbers of brands to be examined.

The best (though far from perfect) option is therefore to apply a consistent approach in all cases. This table shows the number of exact Google queries for each brand + ‘jobs’ originating from the United States in October 2012.

#   Brand   Searches for brand + ‘jobs’
1 Government 198500
2 UPS 60500
3 Google 59400
4 Walmart 49500
5 Kaiser Permanente 49500
6 Disney 37000
7 Fedex 29000
8 TSA 27100
9 FBI 27100
10 Boeing 27100
11 Home Depot 27100
12 Target 27100
13 Apple 22200
14 Costco 22200
15 United Nations 22000
16 Lockheed Martin 22000
17 USPS 21400
18 Safeway 18100
19 Yahoo! 18100
20 Amazon 18100
21 McDonalds 16200
22 At&t 16200
23 Homeland Security 14800
24 Macy’s 14800
25 Coca Cola 13200
26 Raytheon 12100
27 Microsoft 9990
28 CIA 9900
29 USAA 8100
30 Target 8100
31 Verizon 8100
32 JC Penney 8100
33 Budweiser 8100
34 BNSF 8100
35 Intel 8100
36 Bank of America 6660
37 US Army 6600
38 IRS 6600
39 Northrop Grumman 6600
40 KBR 6600
41 Whole Foods 6600
42 Pepsi 6600
43 YMCA 6600
44 Anheuser Busch 6600
45 Marriott 6600
46 Air Force 5400
47 Navy 5400
48 PG&E 5400
49 Harris Teeter 5400
50 Pizza Hut 5400
51 Royal Caribbean 5400
52 Chase 5400
53 HP 5400
54 Kroger 5400
55 Carnival Cruise 5400
56 Frito Lay 5400
57 Delta Airlines 5400
58 GE 5400
59 Walgreens 5400
60 Blackwater 4400
61 Wendys 4400
62 Ebay 4400
63 Panera Bread 4400
64 Cisco 4400
65 Sprint 4400
66 Aramark 4400
67 Old Navy 3600
68 Peace Corps 3600
69 Chrysler 3600
70 Dyncorp 3600
71 Oracle 3600
72 John Deere 3600
73 Staples 3600
74 USDA 2900
75 National Guard 2900
76 Albertsons 2900
77 T mobile 2900
78 Winn Dixie 2900
79 Wegmans 2900
80 Holiday Inn 2900
81 Sams Club 2900
82 American Express 2900
83 Time Warner 2900
84 GAP 2900
85 Kohls 2400
86 Nestle 2400
87 Lowes 2400
88 Weatherford 2400
89 Hollister job application 2400
90 Hershey 2400
91 Ford 2400
92 Dell 2400
93 Labcorp 1900
94 Circle K 1900
95 BP 1900
96 Aldi 1900
97 Rite Aid 1900
98 Sysco 1900
99 Fred Meyer 1900
100 General Dynamics 1900


Where there is a very clear shared intent, similar terms are combined into a single entry, e.g. government jobs, federal government jobs, gov jobs and us government jobs are aggregated into ‘Government’.

The top companies are a mixture of aspirational employers, e.g. Google (3rd) and the FBI (9th), and major employers such as UPS (2nd) and Walmart (4th). To put this in perspective, Walmart receives roughly 17% less interest than Google despite employing 2.1 million people – roughly 39 times more than the famous search engine.

A Human Resources department concerned with measuring the attractiveness of its organisation can easily use a similar methodology to gauge its appeal to potential employees versus competitors, and to measure its changing appeal as an employer over time.

For a more general view of brand value via keyword research, try ‘Tracking a brand with keyword research‘.

Getting hired with keyword research

Human Resources professionals will tell you that in a competitive labor market, job seekers must compete to be the most visible to recruiters and hiring managers who are searching for their skill set.

Most recruiters aren’t searching Google for their prospects, instead they use job boards like Monster and Indeed or increasingly, professional social networks like LinkedIn, XING (German speakers) or Viadeo (French speakers).

But what exactly are recruiters searching for? Just like a company ranking in Google for a phrase that no customer searches for, if recruiters aren’t searching for what you’re ‘visible’ for then it’s not going to help you get a job.

Most of the time a recruiter or hiring manager is looking for candidates with experience similar to the job they are recruiting for so they will be searching for job titles and skills relating to the open job.

Therefore, by understanding the language used in open job ads, you can indirectly understand what the people you want to find and hire you are searching for.

To do this accurately you’ll need a huge index of jobs; step forward Indeed.com – a site whose business model revolves around aggregating jobs from employers and other job boards with circa 3 million listings in the United States alone.

Therefore, if we want to see the number of times a phrase is used in a job ad, it’s simply a matter of running a job search using the ‘With the exact phrase’ field of the Indeed advanced search and checking the number of results returned for that phrase.

As a practical example let’s take a computer security specialist who is deciding whether to optimise primarily as an IT Security or Information Security specialist.

Exact text matches Indeed.com (USA) 16th October 2012

Keyword                 Jobs returned
it security             5,113
information security    11,582

From this we can see that there are roughly double the number of jobs referencing ‘information security’.

Or let’s say an SEO type that wanted to appear prominently in recruiters’ searches (and what SEO would turn down the challenge?);

Exact text matches Indeed.com (USA) 16th October 2012

Keyword                       Jobs returned
seo                           7,999
search engine optimization    2,659
natural search                218
organic search                365

Here we can see that SEO is by far the most popular term referenced in job ads and therefore it is the most suitable synonym to optimise for.

Obviously it is important to use the description most appropriate to your skill set; if you really do know more about IT Security than Information Security, you should optimise for that. But where there are straightforward synonyms like SEO and Search Engine Optimization, while you should use all of them somewhere in your C.V. and profile, you can make yourself ‘more visible’ just by concentrating on the most popular phrases.

Optimise your CV (resume) and LinkedIn profile with the search terms used by recruiters and hiring managers.