The notable insight from the recruitin.net data is that, due to very strong rankings and a relatively unknown brand, generic terms drive more clicks than brand terms. In this particular niche, specificity (query length and number of tokens) has a very weak impact on clicks.
If you manage a site with millions or even billions of URLs, it’s important to consider that Google and Bing have a crawl budget: a limit on the number of URLs they are prepared to crawl for each domain, determined by its authority.
If a less-authoritative domain has billions of URLs, Google won’t crawl potentially important sections of the site, thus losing you traffic.
So one of the biggest SEO challenges for large ecommerce sites is balancing:
Not missing traffic by excluding product and aspect (aka attribute) combinations that have search demand
Not spamming the search engines with many combinations for which there is no demand
For example, Size and Colour aspects for sites that sell both televisions and shoes:
High search volume
Important filter aspect
Low search volume
Low search volume
High search volume
Important filter aspect
Why not open up every combination?
Take one category e.g. Men’s sports shoes, with 6 aspects in your catalogue:
Number of aspect values
Every combination of every aspect value multiplies up very quickly:
150 * 30 * 15 * 5 * 4 * 200 = 270,000,000
That is, 270 million possible URLs for this category alone!
Understand which aspects have values which are primarily searched in Google and only open those aspects for crawling.
You’ll need a large and representative keyword set, potentially millions of keywords depending on the scope of your site, but here’s a rough example using a limited keyword set:
Category: Mens Trainers
Matching Aspect 1
Matching Aspect 2
Avg. Monthly UK Google Searches
mens white trainers
mens running trainers
mens black trainers
nike mens trainers
white trainers mens
mens trainers uk
black trainers mens
mens gym trainers white
all black trainers mens
mens red trainers
Full list here, note the data has been tweaked to better illustrate the concept. Also for American readers, trainers = sneakers 🙂
The search volume for the keywords can be clustered into the category’s aspects e.g.
Searches Containing a value for this aspect
From this we can see that Colour and Style are important to open up to crawl, Material and Size less so.
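As a rough sketch of how that clustering could be done, the snippet below sums search volume for every keyword that contains a value of each aspect. The aspect value lists and the search volumes are placeholders assumed for illustration (the real figures are in the full list), and the keyword matching is a simple whole-word lookup rather than anything more sophisticated.

```python
from collections import defaultdict

# Assumed aspect values for the Mens Trainers category (illustrative only).
ASPECT_VALUES = {
    "Colour": {"white", "black", "red"},
    "Style": {"running", "gym"},
    "Brand": {"nike"},
}

# Keywords from the example list; the volumes are made-up placeholders.
KEYWORDS = {
    "mens white trainers": 1000,
    "mens running trainers": 900,
    "mens black trainers": 800,
    "nike mens trainers": 700,
    "mens gym trainers white": 100,
    "mens trainers uk": 300,
}


def volume_by_aspect(keywords, aspect_values):
    """Sum the volume of keywords containing at least one value of each aspect."""
    totals = defaultdict(int)
    for keyword, volume in keywords.items():
        tokens = set(keyword.split())
        for aspect, values in aspect_values.items():
            if tokens & values:
                totals[aspect] += volume
    return dict(totals)


aspect_totals = volume_by_aspect(KEYWORDS, ASPECT_VALUES)
for aspect, total in sorted(aspect_totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{aspect}: {total}")
```

A keyword can count towards more than one aspect (for example a colour and a style), which is what you want when deciding which aspects deserve crawlable pages.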
Potentially a waste:
Just removing Material and Size dramatically reduces the number of aspect combinations:
150 × 15 × 5 × 200 = 2,250,000
Saving us 267,750,000 URLs required for crawling. Not bad!
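The arithmetic is easy to re-run for your own catalogue. In this minimal sketch the value counts are the ones used above, but the aspect names attached to each count (other than Material and Size being the two removed) are assumptions for illustration.

```python
from math import prod

# Values per aspect. Counts are from the example; the names are assumed labels.
aspect_value_counts = {
    "Brand": 150,
    "Material": 30,
    "Colour": 15,
    "Style": 5,
    "Size": 4,
    "Model": 200,
}


def combination_count(counts, excluded=()):
    """Multiply the value counts of every aspect that is left open to crawl."""
    return prod(n for aspect, n in counts.items() if aspect not in excluded)


all_urls = combination_count(aspect_value_counts)
open_urls = combination_count(aspect_value_counts, excluded={"Material", "Size"})
saved_urls = all_urls - open_urls

print(f"All combinations:   {all_urls:,}")
print(f"Opened combinations: {open_urls:,}")
print(f"URLs saved:          {saved_urls:,}")
```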
As aspects can be common across categories (e.g. size and colour), to exclude categories selectively, append a string which you have excluded in robots.txt to those categories you choose to not have indexed e.g.
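One hypothetical way to picture this: pick a marker string (here `_nc`, an invented name), disallow it with a wildcard rule in robots.txt, and append it to the URLs you don't want crawled. The sketch below is an assumption-laden illustration, not a robots.txt parser: it only translates the `*` and trailing `$` pattern syntax into a regular expression and ignores Allow rules and user-agent groups.

```python
import re

# Hypothetical marker appended to URLs that should not be crawled.
NO_CRAWL_MARKER = "_nc"

# The corresponding robots.txt rule would look something like:
#   User-agent: *
#   Disallow: /*_nc
DISALLOW_PATTERNS = ["/*" + NO_CRAWL_MARKER]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns=DISALLOW_PATTERNS):
    """True if the path matches any disallow pattern from its first character."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


def aspect_url(category, aspect_value, crawlable):
    """Build an aspect URL, appending the marker when it should stay uncrawled."""
    path = f"/{category}/{aspect_value}"
    return path if crawlable else path + NO_CRAWL_MARKER


print(aspect_url("mens-trainers", "white", crawlable=True))
print(aspect_url("mens-trainers", "leather", crawlable=False))
```

Note that a marker this short could also match in the middle of a path, so in practice you would choose a string that cannot occur in any URL you do want crawled.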
Practically everyone has considered, is considering, or will consider running a side-project at some point. It can be a hobby, a way to learn, a boost to your portfolio, a source of extra cash or a lottery ticket out of the day job. Either way it’s tough to beat the thrill of seeing other people using your creation.
Most people, though, won’t plough much money into marketing their idea, at least until it starts to show real promise or generate its own income.
This therefore is a guide to promoting your idea with little or no money.
1.0 Set objectives
Before you start, decide what you actually want to achieve with the project e.g.
Test whether an idea has commercial potential before quitting your job
Show employers what you can do
Generate traffic and page-views to earn affiliate or ad revenue
Deciding this gives you a clear direction for your marketing.
This is the point where you normally find the three other tools doing your totally original idea. It’s annoying but don’t let it deter you; it’s all about marketing and execution, just ask Tom:
2.2 Set up a holder page
Well before you launch, set up a holder page with:
A line to describe your project and its benefits e.g. An app that helps accountants file taxes faster
A more detailed (but still concise) explanation
A way to register interest
Social share prompts
If a representative user can’t tell you in a couple of seconds what the project is about, you need to tweak some more.
Having a good holder site will ensure:
Your SEO will hit the ground running
You can direct anyone interested there for more info
You gain a bit more credibility
You can capture the details of anyone who shows up in the mean-time
Launch Rock popularised and systematised this approach, but a basic HTML template + Mail Chimp is also free and can be a bit less fiddly in practice.
As an example, here’s the pre-launch screenshot of the holder page for Mustard Threads.
2.3 Make friends
If there are any authorities in your niche, now is the time to start making friends with them on Twitter, LinkedIn or wherever they interact mainly. Building relationships with bloggers, popular Twitter users and journalists now will pay dividends later.
Search the keywords that you have identified in your keyword research and see who shows up in Google. Identify Twitter hashtags used by your audience e.g. for men’s fashion startup Mustard Threads, #GentsChat attracts an audience likely to be interested. Be genuine and gracious and it will work out. You can often also get new insights into your project’s functionality.
LinkedIn can be a great way to reach people. Use RecruitEm to find journalists, industry influencers etc. and connect with a personalised message.
Register all your social accounts (Twitter handle, Facebook Page name etc) at this point.
So your creation is ready and you’ve replaced your holder with the real thing. Many people like to throw everything at a big day one launch. That can sometimes be a good idea, like when you’re gunning for an app store ‘Most Popular’ or ‘Top Rated’, but in my experience both corporate and personal, a gradual escalation of marketing is almost always better.
A ‘soft’ / beta launch allows you to:
Identify functional and UX bugs without alienating your most valuable users
Test and improve your messaging
Spend whatever money you do allocate wisely
It will differ by project but here’s my suggested running order:
3.1 You and the rest of the team
If you’re even remotely in the target market, you should definitely be using (or ‘dogfooding’) what you’ve created. This will prevent user-generated content projects from being a ghost town for early users and flag up any UX / functional issues.
3.2 Your long-suffering friends and family
Your friends and family (presumably) like you, hopefully enough to give your project a go. Obviously it’s tricky if you’ve created something that’s a real niche (your .htaccess generator might be a bit lost on your grandma), but chances are you’ll have some people in your close circle who would enjoy or benefit from it.
At this point you’ll want to stay close to your new users to get any feedback as to how you can improve the UX and functionality.
Tools like Doorbell and Podio offer free feedback functionality you can embed in your site or app.
3.3 Use your wider network
Find a way to promote your project on every social profile you have without spamming everyone to death. A nice Facebook share with a request for feedback seems to work well.
Most people, inevitably, will be uninterested, but you might be surprised at the people who become users and even advocate for you.
On Twitter, unless you’re a terrible pop star with millions of followers, make sure you do a few tweets at different times of the day, with hashtags relevant to your target audience.
If you have a personal blog, write an announcement post. Later on you can reach a wider audience on Medium.com, which is the de rigueur place for startups to communicate these days.
Obviously also consider Google+, Pinterest etc. as suits your project and goals.
3.4 Send those emails
Hopefully during the course of building your project, some judicious social sharing, personal networking and random SEO will have generated a few sign-ups on your holder page.
Now is the time to cash this in: email them and let them know you’re good to go, tell them they’re among the first to use it and ask for as much feedback as they are prepared to give.
3.5 Accelerate SEO
SEO is a whole topic, and generally a slow burn, but at minimum you should have:
Identified the keyword to rank for
Included optimised metadata (including meta description tags)
Built links from wherever you can e.g.
Other personal projects and personal blogs
Other blogs and news sites writing about your project
3.6 Test paid search
Google throws around vouchers like confetti to encourage new advertisers onto its AdWords program. Voucher values are usually up to around £120, so if for example you’re paying £0.30 per click, a full-value voucher will get you around 400 clicks from possible new users.
To get a voucher you can either sign up to their ‘partner program’, in which case they will start mailing you vouchers periodically, or if you’re desperate go and flip through the web development magazines in your newsagent.
You can use the results of the keyword research to build out your campaign. If it turns out all the AdWords traffic bounces, you probably picked the wrong keywords.
3.7 In person networking
Check Meetup for events related to your project. Practice a little description of what you do before you go so it sounds slick when you’re mingling.
If there is nothing specific to what you do, there will usually be a generic ‘startup’ event you can attend. They do tend to be full of people too objectionable to hold down a job, but sometimes you’ll strike gold.
3.8 Press & blogger outreach
If your service is genuinely interesting, new, or a timesaver; or at least there’s an interesting angle on it (it uses a trendy gadget, a cult celebrity uses it, you built it while in prison etc), you can usually get someone to write about it.
In my experience it’s nearly impossible to get the mainstream media (newspapers etc) to write about you, but if you fancy a go, try Muckrack. However blogs within a niche e.g. recruitment, SEO etc, will often be happy to write about you; sending you quality links (SEO win) and traffic.
3.9 Staying in touch
Encourage users to sign up to your Twitter feed, like your Facebook page and/or sign up to email alerts to encourage repeat visits.
3.10 Social Sharing
Make sure users can easily share on social. Consider what usually makes people share:
Ego – something about the user that flatters their ego
Inherent reward – get 10 extra points on your gamification system
Humour – Users share something funny so people think they are funny and like them more
Controversy – Tricky to pull off, but people do share causes etc
4.0 What not to do
This isn’t a blog post for well-funded startups working full time on their next ‘unicorn’; it’s for those creating projects in their spare time. Don’t lean on work contacts or resources to help; it’s probably your day job that pays the rent so don’t jeopardise that.
Moreover though laws differ country-to-country, if you’re using work time, computers, contacts etc to work on your project, then should it actually become commercially valuable, your employer will have a strong case to assert ownership.
5.0 Next steps
After you’ve got a solid base of users for your idea, keep soliciting feedback, checking analytics (Google Analytics, Piwik etc.), doing guerrilla UX tests and improving it.
Hopefully your service should see a steady stream of new users, retain its existing users, and hit all the objectives you’ve set for the project.
Keyword research is not only crucial for SEO and a powerful methodology for understanding the intentions and language used by your market; by clustering the results you can also plan a website’s structure. This ensures:
Optimal search engine visibility
A taxonomy aligned to your market’s mental model
Selection of terminology understood by your market
In this practical step-by-step guide, I’ve used the example of planning a new job board, however the methodology is valid for all industries.
1. Getting started
Assuming we’ve already used Google Trends or market knowledge to identify it as the correct seed term, here are the downloaded results for a query of ‘jobs’ in the Google Keyword Planner. It’s important to select the correct market (in this case the UK) and turn ‘Only show ideas closely related to my search terms’ on, otherwise you’ll spend much longer sorting through irrelevant keywords.
Delete the additional columns created by default (Competition, Suggested Bids etc.), leaving only Keyword and Average Monthly Search Volume, sorted high to low.
2. Cleaning the list
As with any keyword research, it’s important to check each term against these three criteria, ranked in order of importance:
Is it relevant? Do I have content on my site which relates to this?
Is there sufficient search volume? Do enough people search for it to make it worthwhile?
Is it achievable? Will my site, now or in the future realistically have enough authority to rank for this?
First remove the irrelevant terms e.g. imagine your hypothetical job board doesn’t:
Recruit for specific employers e.g. ‘tesco jobs’ or ‘mcdonalds jobs’
Wish to compete for big competitor brand terms like ‘guardian jobs’
Recruit for jobs overseas
Want any vague or irrelevant terms like ‘good jobs’ or ‘boob jobs’
Afterwards you will be left with a reduced list containing only those terms relevant to your business. In this example that is 85% of the terms we started with; the proportion will vary for you based on how focussed on a specific niche your business is.
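If you would rather script this step than do it by hand in Excel, a minimal sketch might look like the following. The keyword volumes are placeholders rather than real Keyword Planner data, ‘jobs in dubai’ is an invented stand-in for an overseas term, and the exclusion list is an assumption based on the hypothetical job board described above.

```python
# Hypothetical slice of a keyword export: (keyword, average monthly searches).
# The volumes are placeholders, not real Keyword Planner figures.
keywords = [
    ("marketing jobs", 9900),
    ("tesco jobs", 90500),
    ("jobs in kent", 6600),
    ("guardian jobs", 74000),
    ("jobs in dubai", 12100),
    ("graduate jobs", 27100),
    ("boob jobs", 1900),
]

# Words that mark a term as irrelevant for this hypothetical job board:
# specific employers, competitor brands, overseas locations, vague terms.
EXCLUDED_TERMS = {"tesco", "mcdonalds", "guardian", "dubai", "good", "boob"}


def is_relevant(keyword):
    """A keyword is kept only if none of its words is on the exclusion list."""
    return not (set(keyword.lower().split()) & EXCLUDED_TERMS)


# Keep the relevant terms, sorted by search volume from high to low.
cleaned = sorted(
    (row for row in keywords if is_relevant(row[0])),
    key=lambda row: row[1],
    reverse=True,
)

for keyword, volume in cleaned:
    print(f"{keyword}\t{volume}")
```

A word list like this only handles the relevance check; judging sufficient volume and achievability still needs a human eye.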
3. Identifying user intent
Next you need to understand exactly what solutions people are looking for. This is as much an art as a science. Our example refined list has every possible segmentation of the jobs market e.g.
Salary e.g. ‘100k jobs’
Educational level e.g. ‘graduate jobs’
Industry e.g. ‘jobs in sport’
Location e.g. ‘jobs in kent’
This market is most obviously divided between those looking for specific skills and industries and those looking for jobs in locations, particularly the latter. Approximately 25% of the terms, with a cumulative 1.6m search volume, relate to finding a job in a specific location.
The remainder are largely searches for function e.g. ‘marketing jobs’ or industry e.g. ‘music jobs’.
4. Clustering the keywords
Post Google’s Hummingbird update there is much more focus on clustering of keywords, however it has always been the case that the same user intent has been represented by multiple keywords and that these should be grouped during the planning phase of a new site. The only real difference is that now we can rely on Google being somewhat better at identifying user intents so our groups can be broader.
In the location segment, we can clearly see many keywords with the same intent and similar strings e.g.
jobs in glasgow (27100)
glasgow jobs (8100)
jobs glasgow (8100)
Which should be clustered together in Excel e.g.
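The same grouping can be sketched in code. This is only an illustration of the idea, under the assumption that keywords sharing the same words (ignoring order and filler words like ‘in’) share an intent; the stop-word list is a made-up minimal one.

```python
from collections import defaultdict

# The three Glasgow keywords and volumes from the example above.
keywords = {
    "jobs in glasgow": 27100,
    "glasgow jobs": 8100,
    "jobs glasgow": 8100,
}

# Assumed minimal list of filler words to ignore when comparing keywords.
STOP_WORDS = {"in"}


def cluster_key(keyword):
    """Normalise a keyword: drop filler words and ignore word order."""
    tokens = [t for t in keyword.lower().split() if t not in STOP_WORDS]
    return " ".join(sorted(tokens))


def cluster_keywords(keywords):
    """Group keywords whose normalised form is identical."""
    clusters = defaultdict(dict)
    for keyword, volume in keywords.items():
        clusters[cluster_key(keyword)][keyword] = volume
    return dict(clusters)


clusters = cluster_keywords(keywords)
for key, members in clusters.items():
    print(key, sum(members.values()), sorted(members))
```

This only catches keywords with similar literal strings; semantically linked terms still need to be grouped by hand.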
It gets more interesting when we look at the professions and industries segment. As Google has improved at understanding concepts, we can now legitimately group together keywords that are semantically linked but with dissimilar literal strings, for example:
driving jobs (14800)
hgv jobs (12100)
delivery driver jobs (5400)
delivery jobs (5400)
chauffeur jobs (5400)
bus driver jobs (4400)
hgv driving jobs (3600)
van driving jobs (2900)
All these terms show a similar user intent. Whether you choose to break out a term into its own page is a judgment call you should make based on its importance to your business. In this example it’s arguable that ‘hgv jobs’ is sufficiently distinct and popular to deserve its own page.
This needs to be completed for all the major segments you identified, which will probably take around a day, depending on the size of your niche and your mastery of Excel shortcuts. As you progress you will see patterns emerge and get a sense of the language and requirements of your market.
5. Building the sitemap
As you group the keywords in Excel you will see the sitemap emerge, with each page optimised for its most popular keyword but referencing the other keywords in the group.
When producing copy for these pages it’s ideal if you can, while keeping the user first in mind, use all or most of the keywords in the cluster.
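To illustrate how grouped keywords turn into a page plan, here is a small sketch using the Glasgow and driving clusters from earlier. The cluster names and the output structure are assumptions for illustration; the keywords and volumes are the ones quoted above.

```python
# Manually grouped clusters, using the keywords and volumes quoted earlier.
clusters = {
    "driving": {
        "driving jobs": 14800,
        "hgv jobs": 12100,
        "delivery driver jobs": 5400,
        "delivery jobs": 5400,
        "chauffeur jobs": 5400,
        "bus driver jobs": 4400,
        "hgv driving jobs": 3600,
        "van driving jobs": 2900,
    },
    "glasgow": {
        "jobs in glasgow": 27100,
        "glasgow jobs": 8100,
        "jobs glasgow": 8100,
    },
}


def plan_page(cluster):
    """Pick the most-searched keyword as primary; the rest support the copy."""
    primary = max(cluster, key=cluster.get)
    secondary = sorted(
        (k for k in cluster if k != primary), key=cluster.get, reverse=True
    )
    return {
        "primary_keyword": primary,
        "secondary_keywords": secondary,
        "total_volume": sum(cluster.values()),
    }


sitemap = {name: plan_page(cluster) for name, cluster in clusters.items()}
for name, page in sitemap.items():
    print(name, page["primary_keyword"], page["total_volume"])
```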
6. Conclusion and more reading
By following this methodology you will produce an intuitive and search optimised sitemap for your site. For more information on clustering keywords, watch this video on modern keyword research from Moz’s Rand Fishkin.
Note: As of March 2016, Google is no longer passing this information in the referral string.
As of September 2013 Google prevented site owners from seeing all organic referring keyword data in the referral string.
However there is still plenty of data to be gleaned from the string. For quick testing the HttpFox Firefox plugin is excellent. Systematically capturing the data is easily done in any web analytics tool or server log parser using simple Regex.
It’s important to note that this data appears not to be passed from mobile searches which may somewhat skew any conclusions.
1) The rank of the link that the user clicked
To understand the rank of the result the user clicked to arrive at your site, look at the ‘cd’ key / value pair. e.g.
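As a rough sketch, the value can be pulled out in a few lines. This version uses Python’s standard URL parsing rather than a regex, and the referrer below is an invented example of the general shape of such a string (its parameter values are assumptions, not captured data).

```python
from urllib.parse import urlparse, parse_qs


def clicked_rank(referrer):
    """Return the 'cd' value of a Google referrer as an int, or None if absent."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None
    values = parse_qs(parsed.query).get("cd")
    if not values or not values[0].isdigit():
        return None
    return int(values[0])


# Hypothetical referral string for illustration only.
example_referrer = (
    "https://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web"
    "&cd=3&ved=0CC8QFjAC&url=https%3A%2F%2Fexample.com%2F"
)

print(clicked_rank(example_referrer))
```

The same function returns None for referrers from other sites or for Google referrers that carry no ‘cd’ parameter, which makes it easy to filter those rows out of a log.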
Most people need to remember secure (long and complex) passwords for dozens of different services. As this requires the mental horsepower of the Rain Man, they tend to use the same one or two passwords everywhere.
Come the inevitable day that one of these services is breached, every one of the user’s other accounts using the same password is vulnerable.
A trick to solve the problem occurred to me the other day, and I don’t see it documented anywhere else, so here it is:
Use a base passphrase* e.g. AllYourBaseAreBelongToUs!xx
Where xx is a number e.g. AllYourBaseAreBelongToUs!27
Decide on a memorable base number of 2 or greater e.g. 6
Choose a letter position for a given service e.g. the 2nd letter
Then for each service take the 2nd letter of its name e.g. Google is ‘o’, Yahoo is ‘a’ etc.
‘o’ is the 15th letter of the alphabet, so multiply your base number by 15 e.g. 6 * 15 = 90
Use this number as the variable in your passphrase
So you have a unique password for Google of: AllYourBaseAreBelongToUs!90
This way you can have a unique password for each service, without the hassle of remembering a wholly unique password every time.
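The steps are meant to be done in your head, but written out as a short sketch they look like this. The function name and defaults are assumptions for illustration, using the example passphrase, base number and letter position from above.

```python
import string


def service_password(service, base_phrase="AllYourBaseAreBelongToUs!",
                     base_number=6, letter_position=2):
    """Derive a per-service password from the chosen letter of the service name."""
    # Take the chosen letter (positions are counted from 1, as in the steps above).
    letter = service.lower()[letter_position - 1]
    # Position of that letter in the alphabet: a = 1, b = 2, ... z = 26.
    # Raises ValueError if the chosen character is not a letter a-z.
    alphabet_index = string.ascii_lowercase.index(letter) + 1
    return f"{base_phrase}{base_number * alphabet_index}"


print(service_password("Google"))
print(service_password("Yahoo"))
```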
*Why you should use a passphrase by the awesome xkcd;
This blog is about the light that the cumulative searches of hundreds of millions of individuals can shine on the world in a way that traditional sources of insight cannot.
So what makes keyword research better than other research methodologies? Its primary strength lies in its lack of bias. This impartiality is born of the intimacy that exists between a searcher and their search box that simply can’t be replicated at scale any other way.
For example it’s unlikely that if asked in a survey, many of the 165,000 global searchers using Google to find information about ‘flatulence’ in July 2012 would admit that it was their primary concern. Perhaps they might instead choose to align themselves with the more socially concerned (and fragrant) 60,500 people searching for ‘cure for cancer’ in the same month.
When a user enters their search they are speaking to a machine, they have a need and, as best they are able, they clearly and explicitly state that need.
These searches range from the mundane: “where can I buy Nespresso capsules”, to the hilarious: “why does my mom smell”, to the potentially tragic: “test for aids”.
Whatever a searcher’s intention, every time a search is made it is added to aggregate statistics for the informed researcher to mine.
The strength of this new source of understanding is not only in its candour; it is also unprecedented in terms of its scale. Google, with around 66% of the search engine market, is queried 400 million times per day. Extrapolated to the whole search market that’s around 600 million searches per day, a sample size that few other research methodologies can hope to match.
Search data versus social data (Part 1)
Social networks such as (in Anglo-Saxon countries) Twitter, Facebook and LinkedIn are often portrayed as the modern mirror of the people.
The immense data held, particularly by Facebook, is often quoted as having the key to understanding people on a macro and individual level.
For example, Facebook knows where you live, who your friends, colleagues, family are, where you go for fun, where you go on holiday and your favourite TV shows. Surely this is the ultimate data set for understanding humanity on a grand scale?
Well, no, and here’s why.
Perception versus reality
When an individual creates content on a social network, particularly those where real names are encouraged such as Facebook or Google+, they are typically at least as conscious of the impact this will have on others’ perception of them as they would be talking in person with people they know.
The reason for this is that the average Facebook user has around 130 friends, but 7 close ‘real life’ friends, therefore any statement on Facebook is likely to reach a much wider and more diverse audience than one made in person.
As such, most social media users will screen themselves, conscious that their content may reach the eyes of family, co-workers, less close acquaintances and quite probably strangers and that each group of people may react in different ways.
For example political opinion expressed to 4 or 5 close friends is less likely to be challenged than one made to a diverse group of more than a hundred people from separate parts of one’s life.
Beyond the user’s direct connections, the ability for a particular piece of content to be shared is virtually limitless as a number of individuals writing indiscreet Twitter updates or posting Facebook photos have found.
Instead, individuals conduct themselves on social networks in the way they wish to be perceived by this broad community of people, rather than as they truly are.
In social, users update with socially acceptable facets of their life.
“I’m on the train”
“I’m looking forward to my holiday”
“My cat is adorable”.
It would be an unusual breach of convention for users of social media to ask where they can find some good pornography, and yet that demand clearly exists: there are 277 million porn-related searches from the comfortable anonymity of the search box every month.
And it’s not only sexual interests and personal hygiene problems that are directed at search and not social. If you are in need of information regarding a specific purchase, let’s say the purchase of a lamp or refrigerator, you are much more likely to start your search with a Google search rather than ask your friends who are relatively unlikely to have specialist knowledge about specific products.
If the average Facebook user has 160 friends, compared to tens of billions of indexed pages in Google and Bing, many of them written by niche experts and specialist retailers, it’s clear that online search is a more effective way of researching your needs.
Some organisations struggle to hire employees, but others are in the fortunate situation of having people seek them out for employment. It would be interesting to know which companies are actually being sought by potential employees and in what numbers.
The nuances of this are actually quite complex as there are a number of difficulties to take into consideration:
The different ways a searcher may express interest in employment e.g. careers, applications, jobs, job, job in a location etc etc
The possible misinterpretations of brand names
The sheer numbers of brands to be examined.
The best (though far from perfect) approach is therefore to adopt a consistent approach in all cases. This table shows the number of exact Google queries for a brand + job originating from the United States in October 2012.
Searches + jobs
Bank of America
Hollister job application
Where there is a very clear intent with similar numbers they are combined as a single entry e.g. government jobs, federal government jobs, gov jobs, us government jobs are aggregated into ‘Government’.
The top companies are a mixture of aspirational e.g. Google (3rd) and the FBI (9th) and major employers such as UPS (3rd) and Walmart (5th). To put this in perspective, Walmart receives roughly 17% less interest than Google despite employing 2.1 million people – roughly 39 times more than the famous search engine.
A Human Resources department concerned with measuring the attractiveness of its organization can easily use a similar methodology to gauge its appeal to potential employees versus competitors and to measure its changing appeal as an employer over time.
Human Resources professionals will tell you that in a competitive labor market, job seekers must compete to be the most visible to recruiters and hiring managers who are searching for their skill set.
Most recruiters aren’t searching Google for their prospects, instead they use job boards like Monster and Indeed or increasingly, professional social networks like LinkedIn, XING (German speakers) or Viadeo (French speakers).
But what exactly are recruiters searching for? Just like a company ranking in Google for a phrase that no customer searches for, if recruiters aren’t searching for what you’re ‘visible’ for then it’s not going to help you get a job.
Most of the time a recruiter or hiring manager is looking for candidates with experience similar to the job they are recruiting for so they will be searching for job titles and skills relating to the open job.
Therefore by understanding the language used in open jobs you can indirectly understand what is being searched for by the people you might want to find and hire you.
To do this accurately you’ll need a huge index of jobs; step forward Indeed.com – a site whose business model revolves around aggregating jobs from employers and other job boards with circa 3 million listings in the United States alone.
Therefore if we want to see the number of times a phrase is used in a job ad, it’s simply a matter of running a job search using the ‘With the exact phrase’ field of the Indeed advanced search and checking the number of results returned for that phrase.
As a practical example let’s take a computer security specialist who is deciding whether to optimise primarily as an IT Security or Information Security specialist.
Exact text matches Indeed.com (USA) 16th October 2012
From this we can see that there are roughly double the number of jobs referencing ‘information security’.
Or let’s say an SEO type that wanted to appear prominently in recruiters’ searches (and what SEO would turn down the challenge?);
Exact text matches Indeed.com (USA) 16th October 2012
search engine optimization
Here we can see that SEO is by far the most popular term referenced in job ads and therefore it is the most suitable synonym to optimise for.
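If you have your own collection of job ad text rather than Indeed’s search box, the same comparison can be sketched as below. The sample ads are invented for illustration, and the matching is a plain case-insensitive substring check, so a short phrase could also match inside a longer word.

```python
def ads_mentioning(phrase, job_ads):
    """Count the job ads that contain the phrase (case-insensitive substring)."""
    phrase = phrase.lower()
    return sum(1 for ad in job_ads if phrase in ad.lower())


def most_used(synonyms, job_ads):
    """Return the synonym found in the most ads, plus the count for each."""
    counts = {s: ads_mentioning(s, job_ads) for s in synonyms}
    return max(counts, key=counts.get), counts


# Invented sample ads, standing in for a real index of job listings.
sample_ads = [
    "SEO Manager required to lead organic growth.",
    "We need a search engine optimization specialist with SEO agency experience.",
    "Digital marketer with SEO and PPC skills.",
    "Information security analyst wanted.",
]

best, counts = most_used(["SEO", "search engine optimization"], sample_ads)
print(best, counts)
```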
Obviously it is important to use the description most appropriate for your skill-set; if you really do know more about IT Security than Information Security you should optimise for that. But where there are straightforward synonyms like SEO and Search Engine Optimization, while you should use all of them somewhere in your C.V. and profile, you can make yourself ‘more visible’ just by concentrating on the most popular phrases.