Deep dive · Competitive Intelligence · 13 min read

The content-gap analysis that pays for itself in a coffee's worth of API credit

Find every page a competitor ranks for and you don't — then turn that map into campaigns, assortment and content. The modern, AI-matched, near-free version.


In short: A content-gap analysis finds every page a competitor ranks for that you don't, then splits the result across three teams. Pull competitor keywords and landing pages with DataForSEO for a few dollars, let an LLM match their pages to yours, and the unmatched rows become SEO fixes, assortment-expansion ideas, paid campaigns and content. The match doesn't need to be perfect — the value is the gaps.

  • ~100k: competitor keywords + landing pages per pull
  • $0.60: per 1,000 SERP queries (DataForSEO Regular)
  • $500/mo: Semrush Business, the tier that unlocks API access (units extra)
  • 5 steps: queries → AI page match → gap map

The whole play in one sentence

Find the pages your competitors rank for that you don’t even have in your portfolio — then turn that list into three wins: more SEO visibility, search campaigns aimed at landing pages you were missing, and non-commercial traffic you’d otherwise never touch. One map, three different payoffs, landing on three different teams. That’s the entire analysis; everything below is how you actually get there.

Every SEO deck has the slide that tells you to do this — analyze the competition, find the gaps in your content and categories. Almost nobody follows it, because the old way was a grind: export a competitor’s keywords, eyeball which landing pages they led to, line them up against your sitemap by hand, and argue about matches in a spreadsheet for a week. So the slide stayed a slide.

Two things changed. Pulling the data stopped being expensive: DataForSEO does for cents what Semrush does for a monthly subscription. And the matching stopped being manual: an LLM pairs a competitor’s pages against yours in minutes, and it doesn’t need to be perfect to be useful.

What’s left is the part that was always the real value — and it was never really an SEO task. A content gap is a missing category page, a product line you don’t stock, a blog that feeds a remarketing list. It touches SEO, paid, assortment and content strategy at once. Here’s how I run it.

The analysis in one box

  • What we're after: Every page a competitor ranks for that you don't
  • Tools: DataForSEO API · any LLM · a site crawler
  • Cost: A few dollars of API credit, no subscription
  • What you get: A ranked shortlist of content, assortment & campaign gaps

One shop, four competitors — what this looks like in real life

Before the mechanics, the shape of it. Say ABC is a mid-size outdoor-gear e-shop: tents, trail shoes, rain jackets, backpacks. You pull its real competitor set and four names come back, each beating ABC organically for a different reason. Mapping their pages against ABC’s shows where.

That one map hands ABC three concrete moves:

  1. Content — write the trail-shoe and jacket-care guides, capture the research-stage traffic the blogs are eating, and drop those readers into a remarketing list.
  2. Paid — once the granular landing pages exist, point search campaigns at them (kids' rain jackets → the kids’ rain-jacket page) instead of dumping every click on one blunt category page.
  3. Assortment & structure — build the missing category pages for lines ABC already stocks but never gave a proper home, and flag product lines a competitor carries that ABC doesn’t.

Same analysis, three teams, three budgets. (Illustrative example.) Now here’s how you produce that map.

The flow, end to end

Identify the real competitors — three ways

Not who you think competes — who actually shows up where your money is. Use three signals together. One: run your most important search queries through DataForSEO and note who appears in paid and organic. Two: read Auction Insights in Google Ads — auction overlap tells you how close a rival really is. Three: pull keyword-overlap data, where the number of queries you share with a domain is a clean proxy for relatedness. Three lists collapse into one shortlist of genuine competitors. Why first: get this wrong and every later step inherits the mistake — you’d map your gaps against a rival who was never really competing for your money.

Pull the competitor's keywords and landing pages

For each competitor, pull their top organic keywords (up to ~100k) and, critically, the landing page that ranks for each keyword. From position × search volume you can estimate the traffic flowing into each of their pages. Why pages, not keywords: a keyword is an abstraction; a page is something you can copy, rebuild, or point a campaign at. So you collapse the keyword list into a map: competitor page × estimated traffic × the keywords feeding it. One row might read: rival.com/trail-running-shoes · top keyword trail running shoes · ~8,000 estimated visits/month. (Illustrative example.)
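To make the position × search volume estimate concrete, here is a minimal Python sketch of the collapse from keyword rows to a page map. The click-through rates per position are placeholder assumptions of mine, not figures from DataForSEO or from this article, so swap in a curve you trust. The row layout and the sample keywords are hypothetical too.

```python
from collections import defaultdict

# Assumed click-through rate per organic position (placeholder values only).
ASSUMED_CTR = {1: 0.30, 2: 0.16, 3: 0.10, 4: 0.07, 5: 0.05,
               6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.018}


def estimate_visits(position, volume):
    """Estimated monthly visits for one keyword: volume x assumed CTR."""
    return volume * ASSUMED_CTR.get(position, 0.0)


def page_map(rows):
    """Collapse keyword rows into one record per landing page, biggest first."""
    pages = defaultdict(lambda: {"visits": 0.0, "keywords": []})
    for row in rows:
        visits = estimate_visits(row["position"], row["volume"])
        pages[row["url"]]["visits"] += visits
        pages[row["url"]]["keywords"].append((row["keyword"], visits))
    records = []
    for url, page in pages.items():
        # The top keyword is the one contributing the most estimated visits.
        top_keyword = max(page["keywords"], key=lambda pair: pair[1])[0]
        records.append({"url": url, "top_keyword": top_keyword,
                        "visits": round(page["visits"]),
                        "keyword_count": len(page["keywords"])})
    return sorted(records, key=lambda r: r["visits"], reverse=True)


# Hypothetical rows in the shape: keyword, position, volume, landing page.
rows = [
    {"keyword": "trail running shoes", "position": 2, "volume": 18100, "url": "/trail-shoes"},
    {"keyword": "trail shoes women", "position": 3, "volume": 5400, "url": "/trail-shoes"},
    {"keyword": "running gaiters", "position": 6, "volume": 900, "url": "/gaiters"},
]
for record in page_map(rows):
    print(record)
```

The absolute numbers depend entirely on the CTR curve you pick, so treat them as a way to rank pages against each other rather than as a traffic forecast.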

Map your own site

You need the mirror image of your own pages. Why this step: you can only call something a “gap” if you’re certain it’s missing on your side — so your own map has to be complete, or you’ll chase “gaps” that are really just pages your inventory forgot to list. Crawl the site (Screaming Frog, or a throwaway Python crawler an LLM writes in five minutes), export categories from your e-shop platform, read the product feed, or parse the XML sitemap — usually a combination. One warning: don’t trust the sitemap alone. It routinely misses parametric pages, filtered category views and the blog — exactly the surfaces a gap analysis cares about.
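A minimal sketch of that combination step in Python, assuming you already have the URL lists from each source in memory. The normalization rules (drop `www.`, trailing slashes and `utm_` parameters, keep other query parameters because filtered views matter here) are my assumptions, and the domain and URLs are made up.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def normalize(url):
    """Reduce a URL to a comparable key so one page found twice dedupes."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # Keep real parameters (filtered views), drop tracking ones.
    params = sorted((k, v) for k, v in parse_qsl(parts.query)
                    if not k.startswith("utm_"))
    query = "?" + urlencode(params) if params else ""
    return host + path + query


def build_site_map(sources):
    """sources: {source name: list of URLs} -> {normalized URL: set of sources}."""
    site_map = {}
    for name, urls in sources.items():
        for url in urls:
            site_map.setdefault(normalize(url), set()).add(name)
    return site_map


# Hypothetical inputs; in practice these come from the sitemap, a crawl, etc.
sources = {
    "sitemap": ["https://www.abc-shop.example/tents/"],
    "crawl": [
        "https://abc-shop.example/tents",
        "https://abc-shop.example/jackets?color=red&utm_source=newsletter",
        "https://abc-shop.example/blog/clean-shoes",
    ],
}
site_map = build_site_map(sources)
missed_by_sitemap = sorted(url for url, seen in site_map.items()
                           if "sitemap" not in seen)
print(len(site_map), "unique pages;", len(missed_by_sitemap), "missing from the sitemap")
```

Keeping the set of sources per URL is deliberate: it lets you count how much the sitemap alone would have hidden.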

Let AI match their pages to yours

This is the step that used to take a week. Hand both inventories to an LLM (an open-source model is fine) and have it pair each competitor page with your closest equivalent. You do not need 100% accuracy; you need the unmatched rows. The output is the prize: the pages they have, that earn them traffic and rankings, that you simply don’t.

Decide what each gap means — this is where it stops being SEO

A gap isn’t one thing. Sort each into a bucket: products you already sell but have no category page for → fix your landing-page structure. Products you don’t sell but your supplier carries → an assortment-expansion shortlist with demand attached. A competitor’s strong non-commercial blog pulling your exact audience → a content strategy. Each bucket lands on a different team — and several of them feed straight into your campaigns.

For example: their /cordless-drills page pulls ~12,000 visits/month, you sell cordless drills but only on a generic /power-tools page — that’s a landing-page-structure fix, not a new blog post. The next unmatched row, /drill-bit-buying-guide, is pure content. Same gap map, two different teams. (Illustrative example.)

The match doesn’t have to be perfect. People stall here waiting for 100% precision. You don’t need it. A few mislabeled pairs cost you nothing; the value is in the clearly unmatched competitor pages, and those survive a noisy match just fine. Ship the analysis at 90% and act on it, rather than polishing a model that was only ever a means to a shortlist.

Watch it run: what each step actually spits out

The five steps above are the map; this is the territory. Below is the concrete artifact each step hands you — what you’re literally staring at before you move on. The shapes are exactly what the tools return; the rows are illustrative, not a real client. (Illustrative examples throughout.)

Step 1 → a competitor shortlist, scored. You run the three signals and collapse them into one table. The brands you’d have named by gut aren’t always the ones that survive all three:

Domain        Paid/Org  Overlap  Shared  Verdict
rival-a.com   yes/yes   71%      4,120   core
niche-c.com   yes/yes   44%      2,300   core
rival-b.com   no /yes   12%      3,880   content-only rival
bigbox.com    yes/yes    9%        910   too broad — drop

Three of four survive; the megastore that “obviously” competes gets dropped because the overlap is noise.
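One way to turn the three signals into a verdict is a small rule set like the sketch below. I am reading the Overlap column as the Auction Insights overlap rate and Shared as the shared-keyword count, and the two cut-offs are assumptions picked so the illustrative rows above come out the same way. Tune them to your own market.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    domain: str
    in_paid: bool          # appears in paid results for your key queries
    in_organic: bool       # appears in organic results for your key queries
    overlap: float         # auction overlap rate, 0..1 (assumed meaning)
    shared_keywords: int   # keywords you both rank for


# Assumed thresholds, for illustration only.
MIN_OVERLAP = 0.30
MIN_SHARED = 2000


def verdict(candidate):
    """Classify a candidate domain from the three signals."""
    if candidate.in_paid and candidate.in_organic and candidate.overlap >= MIN_OVERLAP:
        return "core"
    if candidate.in_organic and not candidate.in_paid \
            and candidate.shared_keywords >= MIN_SHARED:
        return "content-only rival"
    return "drop"


candidates = [
    Candidate("rival-a.com", True, True, 0.71, 4120),
    Candidate("niche-c.com", True, True, 0.44, 2300),
    Candidate("rival-b.com", False, True, 0.12, 3880),
    Candidate("bigbox.com", True, True, 0.09, 910),
]
for candidate in candidates:
    print(candidate.domain, "->", verdict(candidate))
```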

Step 2 → a page map with money attached. For each surviving competitor, one API call returns ranked keywords and the landing page each one hits. Aggregate by page and you stop looking at keywords:

Page                 Keyword             Pos  Vol     Visits/mo
/trail-shoes         trail running shoe   2   18,100  ~8,000
/waterproof-jackets  waterproof jacket    4   12,000  ~3,200
/blog/clean-shoes    clean trail shoes    1    2,400  ~1,500
/gaiters             running gaiters      6      900    ~640

Each row is a page earning a competitor real traffic — a target, not a search term.

Step 3 → your own inventory, and how much the sitemap missed. Mirror it for your site from crawl + feed + category export. The point of pulling four sources is visible the moment you count them:

Source                 Pages found
XML sitemap                  412
Screaming Frog crawl         938
Product feed               1,205 SKUs
Category export               64
Deduped own-site map       1,010 URLs

The sitemap saw 412 pages. The real map is 1,010. The analysis lives in the ~600 the sitemap never showed you.

Step 4 → the NO MATCH rows, ranked. Hand both inventories to the LLM with the matching prompt. It returns one verdict per competitor page; you keep only the gaps, sorted by traffic:

Competitor page        Closest OURS     Verdict   Visits/mo
/trail-shoes           /running-shoes   MATCH       —
/waterproof-jackets    —                NO MATCH    3,200
/blog/clean-shoes      —                NO MATCH    1,500
/gaiters               —                NO MATCH      640

One MATCH falls away; three ranked gaps remain. That four-row table is the entire deliverable in embryo.
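Filtering and ranking the verdicts is a few lines of Python. The sketch below also adds one sanity check that is my own suggestion rather than part of the flow above: if the model names a "closest" page that is not in your inventory, the row is kept as a gap and flagged for review. The field names and the `/trail-gaiters` URL are hypothetical.

```python
NO_MATCH = "NO MATCH"


def rank_gaps(verdicts, our_urls):
    """Keep unmatched competitor pages, sorted by estimated visits."""
    ours = set(our_urls)
    gaps = []
    for row in verdicts:
        match = row["closest_ours"]
        if match == NO_MATCH:
            gaps.append({**row, "needs_review": False})
        elif match not in ours:
            # The model pointed at a page we do not have: keep it, flag it.
            gaps.append({**row, "closest_ours": NO_MATCH, "needs_review": True})
    return sorted(gaps, key=lambda gap: gap["visits"], reverse=True)


verdicts = [
    {"url": "/trail-shoes", "closest_ours": "/running-shoes", "visits": 8000},
    {"url": "/waterproof-jackets", "closest_ours": NO_MATCH, "visits": 3200},
    {"url": "/blog/clean-shoes", "closest_ours": NO_MATCH, "visits": 1500},
    {"url": "/gaiters", "closest_ours": "/trail-gaiters", "visits": 640},
]
our_urls = ["/running-shoes", "/tents", "/backpacks"]

gaps = rank_gaps(verdicts, our_urls)
for gap in gaps:
    print(gap["url"], gap["visits"], "review" if gap["needs_review"] else "")
```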

Step 5 → the gap map with an owner. Tag each gap with a bucket and the team it lands on. Now it isn’t an SEO report — it’s a work order:

Gap                   Visits/mo  Bucket              Lands on
/waterproof-jackets   3,200      sell it, no LP      SEO / web
/blog/clean-shoes     1,500      non-commercial      Content + ads
/gaiters                640      don't stock it yet  Assortment

One map, three teams, every row sized by traffic. That’s the moment a “content-gap analysis” stops being an SEO chore and becomes a cross-team plan.
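If you want the tagging to be repeatable, the bucket logic fits in a short function. This is a sketch under the assumption that someone (or your product feed) has already answered two yes/no questions per gap; the flag names `we_sell_it` and `non_commercial` are hypothetical.

```python
from collections import defaultdict


def bucket(gap):
    """Return (bucket, owning team) for one gap."""
    if gap["non_commercial"]:
        return "non-commercial", "Content + ads"
    if gap["we_sell_it"]:
        return "sell it, no LP", "SEO / web"
    return "don't stock it yet", "Assortment"


def work_orders(gaps):
    """Group gaps by owning team, largest estimated traffic first."""
    orders = defaultdict(list)
    for gap in sorted(gaps, key=lambda g: g["visits"], reverse=True):
        name, team = bucket(gap)
        orders[team].append((gap["url"], gap["visits"], name))
    return dict(orders)


gaps = [
    {"url": "/gaiters", "visits": 640, "we_sell_it": False, "non_commercial": False},
    {"url": "/waterproof-jackets", "visits": 3200, "we_sell_it": True, "non_commercial": False},
    {"url": "/blog/clean-shoes", "visits": 1500, "we_sell_it": False, "non_commercial": True},
]
orders = work_orders(gaps)
for team, items in orders.items():
    print(team, items)
```

The rules only sort the obvious cases. Whether a product line is worth stocking is still a judgment call for the assortment team.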

Semrush vs. DataForSEO: why the price gap matters

The reason this analysis went from “we should” to “we did” is cost — and the Semrush number that matters here is higher than the sticker price people quote. The $139.95/mo Pro plan runs a content-gap check in the interface, by hand, with export caps. But the analysis in this article is programmatic: one API call per competitor domain, ~100k ranked keywords and their landing pages at a time. Semrush gates its API behind the Business plan at $499.95/mo — and even then you start with zero API units. You buy those separately (roughly $50 per million units, ~10 units per ranked-keyword row), on top of the subscription. DataForSEO is pay-as-you-go: a $50 top-up lasts months, there’s no seat to rent or tier to unlock, and you only pay for the queries you actually pull.

                                 Semrush                                            DataForSEO
Pricing model                    Flat subscription; API billed on top               Pay-as-you-go credit
UI entry plan                    $139.95/mo (Pro), recurring, export-capped         n/a (no seat; API only)
Programmatic / API access        Business $499.95/mo + API units bought separately  Included; you just pay per call
Organic SERP, per 1,000 queries  Bundled into the seat                              $0.60 (Regular) – $3.50 (Advanced, live)
One-off gap analysis             A Business month + units, recurring                A few dollars of credit

For a one-off, deeply technical job like a content-gap pull, that’s the difference between unlocking a $500/mo API tier and spending a coffee’s worth of credit. The data quality is there for this use case; the economics aren’t close.
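The arithmetic behind that comparison, using only the prices quoted in this article, looks roughly like this. The 2,000-query figure is a hypothetical workload, and the DataForSEO side only models the SERP price quoted above; the Labs ranked-keywords call is billed separately and is not included in this sketch.

```python
# Prices as quoted in this article; check current price lists before relying on them.
SEMRUSH_BUSINESS_MONTHLY = 499.95
SEMRUSH_USD_PER_MILLION_UNITS = 50.0
SEMRUSH_UNITS_PER_ROW = 10
DATAFORSEO_USD_PER_1000_SERP = 0.60


def semrush_cost(rows, months=1):
    """Subscription plus API units for pulling `rows` ranked-keyword rows."""
    units = rows * SEMRUSH_UNITS_PER_ROW
    unit_cost = units / 1_000_000 * SEMRUSH_USD_PER_MILLION_UNITS
    return months * SEMRUSH_BUSINESS_MONTHLY + unit_cost


def dataforseo_serp_cost(queries):
    """Pay-as-you-go cost for `queries` Regular SERP queries."""
    return queries / 1000 * DATAFORSEO_USD_PER_1000_SERP


print(f"Semrush, 100k rows in one month: ${semrush_cost(100_000):.2f}")
print(f"DataForSEO, 2,000 SERP queries:  ${dataforseo_serp_cost(2_000):.2f}")
```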

Two stories from twenty years of doing this

The mechanics are new. The plays they unlock are ones I’ve watched work for two decades — they were just too laborious to set up before.

The children’s blog that became a sales channel

A client in the kids’ segment was getting beaten on a class of queries that had nothing to do with products. The competitor ran a strong blog — coloring pages, bedtime stories — with enormous search volume aimed at exactly the target audience: parents. The gap analysis surfaced the whole cluster. The client adopted the strategy, built the content, pulled the traffic, dropped those visitors into remarketing, and turned a “non-commercial” content gap into purchases. (Anonymized.)

Recipes for a diet that sells meal boxes

A meal-prep and coaching business sat next to a category with two beautiful properties: recipe queries have extreme search volume and cent-level CPCs. The strategic competitors had built structured recipe sections — and harvested a stream of people who, by definition, wanted to eat better. From there it’s a short step to a product or a coaching offer. The gap analysis is what made the opportunity visible and sized it. (Anonymized.)

The twist nobody runs: borrow from a stronger market

Here’s the angle that turns this from a defensive audit into an unfair advantage.

Say you’re the leader in a small market with no serious competition to learn from. The gap analysis at home returns nothing useful — there’s no one ahead of you to copy. So don’t run it at home. Run the exact same analysis against the strongest, most competitive foreign market in your category.

Language isn’t a barrier: the LLM maps their categories and content onto yours regardless of the language they’re written in. You import the strategies the leaders in a mature market have already proven — category structures, content angles, assortment ideas — into a market where literally nobody is doing them yet. You become the first mover at home by copying the future from abroad. It pairs naturally with a full market-expansion analysis when you’re deciding where that stronger market is.

Why this closes the loop

Notice what just happened. We started with a tidy SEO task — “find content gaps” — and it spilled into assortment decisions, paid campaigns, remarketing audiences and content strategy. That’s not scope creep. That’s the actual shape of the work.

The data was always pullable; nobody bothered, because the manual cost outweighed the payoff. Now the pull is cheap and the matching is automated. What’s left as the scarce ingredient is the thing that was always scarce: the idea — the seniority to look at a gap map and know that a competitor’s coloring-page blog is really a remarketing channel, and the range to connect SEO, paid and assortment in one head. The execution got easy. The judgment is the job.

The part you can steal

Page-matching prompt — pairs a competitor’s pages with yours and flags the gaps:

You are a site-structure analyst. You get two lists of pages:
COMPETITOR (url, top keywords, estimated monthly traffic) and OURS (url, title).
For each COMPETITOR page, return the single closest OURS page, or "NO MATCH".
Then output only the NO MATCH rows, sorted by estimated traffic descending.
Match on intent and topic, not exact wording. Cross-language matches are allowed.
100% precision is not required — never invent a match to avoid "NO MATCH".
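With ~100k keywords behind a competitor's pages, the two lists may not fit in one request, so a small helper that assembles the prompt in batches is handy. This is a sketch: the field layout and the batch size are assumptions, and the actual model call is left to whichever client you use.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_requests(prompt, competitor_pages, our_pages, batch_size=200):
    """Yield one full prompt per batch; the whole OURS list goes in each one."""
    ours_block = "\n".join(f"{p['url']} | {p['title']}" for p in our_pages)
    for batch in batches(competitor_pages, batch_size):
        competitor_block = "\n".join(
            f"{p['url']} | {', '.join(p['keywords'])} | {p['visits']}"
            for p in batch
        )
        yield (f"{prompt}\n\nCOMPETITOR:\n{competitor_block}"
               f"\n\nOURS:\n{ours_block}")


# Hypothetical data; `prompt` stands in for the page-matching prompt above.
prompt = "You are a site-structure analyst. ..."
competitor_pages = [
    {"url": f"/page-{i}", "keywords": ["example keyword"], "visits": 100 * i}
    for i in range(1, 6)
]
our_pages = [{"url": "/running-shoes", "title": "Running shoes"}]

requests_to_send = list(build_requests(prompt, competitor_pages, our_pages, batch_size=2))
print(len(requests_to_send), "requests")
```

Because each batch is sorted on its own, merge the NO MATCH rows from all batches and sort them by traffic again at the end.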

Ranked keywords + landing pages — DataForSEO Labs, one call per competitor domain:

curl -s "https://api.dataforseo.com/v3/dataforseo_labs/google/ranked_keywords/live" \
  -u "$LOGIN:$PASSWORD" -H "Content-Type: application/json" \
  -d '[{"target":"competitor.com","location_code":2840,"language_code":"en","limit":1000}]'
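The curl call above returns 1,000 rows, so reaching ~100k means paging. Below is a Python sketch of that loop using only the standard library. The `offset` parameter and the response field paths (`keyword_data`, `ranked_serp_element`, `rank_absolute`) reflect my understanding of the DataForSEO Labs response and should be checked against the current API documentation. `fetch` is a hypothetical hook that takes the payload and returns the parsed JSON.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://api.dataforseo.com/v3/dataforseo_labs/google/ranked_keywords/live"


def post_json(url, payload, login, password):
    """POST a JSON payload with HTTP basic auth and return the parsed reply."""
    token = base64.b64encode(f"{login}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode(), method="POST",
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def flatten(response):
    """Yield one flat row per ranked keyword (field paths are assumptions)."""
    for task in response.get("tasks") or []:
        for result in task.get("result") or []:
            for item in result.get("items") or []:
                data = item.get("keyword_data") or {}
                info = data.get("keyword_info") or {}
                serp = (item.get("ranked_serp_element") or {}).get("serp_item") or {}
                yield {"keyword": data.get("keyword"),
                       "volume": info.get("search_volume"),
                       "position": serp.get("rank_absolute"),
                       "url": serp.get("url")}


def pull_ranked_keywords(target, fetch, max_rows=100_000, page_size=1000):
    """Page through results until a short page or `max_rows` is reached."""
    rows = []
    for offset in range(0, max_rows, page_size):
        payload = [{"target": target, "location_code": 2840,
                    "language_code": "en", "limit": page_size, "offset": offset}]
        page = list(flatten(fetch(payload)))
        rows.extend(page)
        if len(page) < page_size:
            break
    return rows

# Real use (needs credentials and costs API credit):
# rows = pull_ranked_keywords(
#     "competitor.com", lambda p: post_json(ENDPOINT, p, LOGIN, PASSWORD))
```

Passing `fetch` in as a function keeps the paging logic testable without touching the network or spending credit.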

Three things that save you a wasted afternoon:

  1. Never trust the sitemap alone for your own map. It skips parametric pages, filtered views and the blog — the exact surfaces gaps hide in. Combine crawl + feed + category export.
  2. Ship at “good enough” matching. The value is the unmatched competitor pages; a few wrong pairs don’t change the shortlist. Don’t polish the model — act on the list.
  3. Run it on a foreign market when home is too easy. No strong local competitor means no gaps to find. Borrow from the strongest market in your category instead.

FAQ

Do I really not need 100% matching accuracy?

Right. You’re hunting the competitor pages with no equivalent on your side — the unmatched rows. A handful of mislabeled pairs doesn’t move that shortlist. Demanding perfection here just delays acting on a list that was already good enough.

Why DataForSEO instead of Semrush?

Cost structure, and which door the API is behind. Semrush’s content-gap tools live in the UI on the $139.95/mo Pro plan; the programmatic pull this article uses needs the Business plan at $499.95/mo plus API units bought on top (you start at zero). DataForSEO is pay-as-you-go from a $50 credit that lasts months, at $0.60–$3.50 per 1,000 SERP queries. For a one-off technical pull, that’s a few dollars versus unlocking a recurring Business seat.

How do I pick which competitors to analyze?

Three signals together: who shows up in paid and organic for your key queries (via DataForSEO), who overlaps with you in Google Ads Auction Insights, and who shares the most ranking keywords with you in keyword-overlap data. The intersection is your real competitor set, and often not the brands you’d have named.

Isn't this just SEO?

It looks like SEO and it isn’t. The gaps split into landing-page structure (SEO), products you should stock (assortment), audiences worth remarketing to (paid), and topics worth writing (content). The analysis is the same; the actions land on four different teams.

Can I really do this across languages and markets?

Yes — that’s the strongest version of it. The LLM matches pages by intent, not wording, so it pairs a foreign competitor’s categories onto yours fine. If your home market has no competition to learn from, run the analysis on a stronger foreign market and import what works.

My sitemap lists all my pages — isn't that enough for my side?

No. Sitemaps routinely omit parametric URLs, filtered category views and parts of the blog — precisely where gaps live. Build your own-site map from a crawl plus the product feed plus a category export, and treat the sitemap as one input, not the source of truth.

