SEO for Large Language Models (LLMs)

April 19, 2023

A robot addressing a large crowd of people

As ChatGPT usage has blown past 100m users, marketers have started seeing direct to site or brand leads referred by ChatGPT, meaning buyers are acting on transactional recommendations from LLMs.

These leads show up as direct or brand traffic because, unlike Bing Chat, ChatGPT and Google Bard don’t typically cite their sources, let alone refer traffic. SEO therefore becomes the process of encouraging these LLMs to recommend your brand or product.

For example, these screenshots show ChatGPT and Bard recommending commercial brands for a service:

ChatGPT: ‘how do I hire an SEO in the Bay Area?’ Recruitment brands returned in a GPT answer

Google Bard: ‘what are the best job boards?’ Recruitment brands returned in a Bard answer

If I can beat the dozens of more accomplished ‘Chris Reynolds’, your brand can beat its competitors.

Bing Chat: ‘Who is Chris Reynolds?’ Bing screenshot recommending me as an SEO

This is how to get LLMs to recommend your brand:

  1. Know the LLMs your customers are using
  2. Understand what your customers are asking these LLMs
  3. Know the corpora that these LLMs are trained on
  4. Ensure your brand is mentioned frequently and favourably in those corpora
  5. Measure visibility

1. Know the LLMs your customers are using

  • Ask direct leads where they found you
  • Track LLM market share; Bard vs. Bing Chat vs. ChatGPT
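As a rough illustration of the two points above, the free-text answers leads give to "where did you find us?" can be tallied into a per-LLM share. This is only a sketch: the `KNOWN_LLMS` keyword map, the `llm_share` helper and the sample answers are all hypothetical, and simple substring matching will miss unusual phrasings.

```python
from collections import Counter

# Hypothetical keyword map: lowercase substring -> display label.
KNOWN_LLMS = {"chatgpt": "ChatGPT", "bing chat": "Bing Chat", "bard": "Bard"}

def llm_share(answers):
    """Return each LLM's share of the answers that mention any known LLM."""
    counts = Counter()
    for answer in answers:
        text = answer.casefold()
        for keyword, label in KNOWN_LLMS.items():
            if keyword in text:
                counts[label] += 1
                break  # count each answer at most once
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}

# Made-up survey answers, for illustration only.
answers = [
    "ChatGPT suggested you",
    "Found you on Google",
    "Asked Bing Chat",
    "chatgpt",
    "A friend",
    "Bard recommended you",
]
share = llm_share(answers)
for label, fraction in share.items():
    print(f"{label}: {fraction:.0%}")
```

Answers that mention no LLM are ignored, so the shares describe the split between LLMs rather than their share of all leads.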

2. Understand what your customers are asking these LLMs

There is no way of doing keyword research for LLMs unless OpenAI/Microsoft/Google make that data available. We have to assume user demand is roughly similar to Google searches, though likely phrased more conversationally.

Once you have an idea of the questions, you can define your message e.g.

  • has the most jobs
  • eBay is the best place to buy and sell collectibles

3. Know the corpora that these LLMs are trained on

  • GPT: Trained on Common Crawl (notably including Reddit), WebText2, Books and Wikipedia
  • Bing Chat (GPT-4 based): Likely uses the same corpora as GPT, plus Retrieval Augmented Generation from Bing’s index
  • Bard: Bard is powered by the Pathways Language Model (‘PaLM’). PaLM is likely trained on a subset of Google’s index: filtered webpages (likely from Google’s C4 dataset), Books, Wikipedia, News, GitHub and social media

4. Ensure your brand is mentioned frequently and favourably in those corpora

4.1 Your site

Repeat the message you want LLMs to show many times across your own site. Strategically disallow Common Crawl and Google/Bingbot, allowing them to see your marketing messaging but not the content you would prefer users to consume on your site.
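One way to sanity-check such a setup is to test a draft robots.txt against the crawlers in question before deploying it. The sketch below uses Python's standard `urllib.robotparser`; `CCBot` is Common Crawl's user agent, while the `/guides/` path, the example.com URLs and the `crawl_permissions` helper are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: crawlers may read marketing pages, but not
# the (assumed) /guides/ section reserved for on-site visitors.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /guides/

User-agent: Googlebot
Disallow: /guides/

User-agent: bingbot
Disallow: /guides/

User-agent: *
Allow: /
"""

def crawl_permissions(robots_txt, agents, urls):
    """Map each (agent, url) pair to whether robots_txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        (agent, url): parser.can_fetch(agent, url)
        for agent in agents
        for url in urls
    }

AGENTS = ["CCBot", "Googlebot", "bingbot"]
URLS = [
    "https://example.com/about",                    # marketing messaging
    "https://example.com/guides/pricing-deep-dive"  # on-site-only content
]

permissions = crawl_permissions(ROBOTS_TXT, AGENTS, URLS)
for (agent, url), allowed in permissions.items():
    print(f"{agent:10} {'allow' if allowed else 'block'}  {url}")
```

The parser here only models how a well-behaved crawler would read the file; it does not confirm that any particular crawler actually honours it.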

4.2 Press Releases

Include the message you want shown many times in your press releases.

e.g. if you want Bing Chat to suggest your brand as having the ‘widest selection of collectible sneakers’, then make sure your press releases include something to the effect of ’[your brand] has the widest selection of collectible sneakers’.

4.3 Social campaigns

Both Reddit and Twitter are being used as training data, though there’s a strong chance that will end as the walls go up and lawsuits start flying. For now, however, just repeating your message from your brand handle etc. should be enough.

Just like traditional SEO, networks that don’t allow crawling (Instagram, Facebook etc.) aren’t useful for LLM SEO.

4.4 Blogger outreach

There’s a high chance that an influential blog will be in the training corpus for your target LLM. Once again, asking third-party blogs to repeat your message increases the chances that the LLMs will then recommend your product.

5. Measure visibility

No tools currently track LLM response visibility. Additionally, LLMs are non-deterministic (the same question can get a different answer every time), so even when these tools are available, visibility will be a sample that needs to be tracked over time. I expect SEO tool providers will provide this (at a premium) in 2023.
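Until such tools exist, the sampling idea can be sketched by hand: ask the same question several times, save the answers, and track the fraction that mention your brand from one period to the next. Everything below is hypothetical, including the `mention_rate` helper, the brand names and the saved responses; how the responses are collected is left out entirely.

```python
def mention_rate(responses, brand):
    """Fraction of sampled responses that mention the brand (case-insensitive)."""
    if not responses:
        raise ValueError("need at least one sampled response")
    needle = brand.casefold()
    hits = sum(needle in response.casefold() for response in responses)
    return hits / len(responses)

# Made-up answers to the same question, sampled in two different weeks.
SAMPLES = {
    "2023-04-03": [
        "Try ExampleBoard or JobsCo.",
        "JobsCo is a popular choice.",
        "ExampleBoard has many listings.",
        "Consider a local recruiter.",
    ],
    "2023-04-10": [
        "ExampleBoard is widely used.",
        "Many people start with exampleboard.",
        "JobsCo and ExampleBoard are both options.",
        "Consider a local recruiter.",
    ],
}

trend = {week: mention_rate(rs, "ExampleBoard") for week, rs in SAMPLES.items()}
for week, rate in trend.items():
    print(f"{week}: mentioned in {rate:.0%} of sampled answers")
```

With only a handful of samples per period the rate is noisy, so larger samples give a more trustworthy trend.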

At time of writing, Bing is working on breaking out Bing Chat clicks.

For now, our only choice is to ask direct to site and brand search customers what referred them.

Going forward

If LLM-powered, chat-based search does gain traction with users, it’s going to render a lot of the SEO we have done for 20+ years irrelevant, and this will be the new reality of SEO.

Header image generated via Midjourney.

Chris Reynolds is a Bay Area Product Manager with 15 years of international experience in SEO, digital marketing, UX, analytics and team management.

© 2023 Chris Reynolds