Indexation troubleshooting checklist: when to fix, when to accept

You open Search Console. The Pages report shows 847 URLs as "Not indexed". You go to one specific URL, paste it into URL Inspection, and read "Crawled - currently not indexed". You hit "Request indexing". Nothing happens. You hit it again the next day. Still nothing. You start composing an angry forum post.

Stop. The honest answer in 2026 for most of those 847 URLs is that Google made a deliberate decision, and that decision is mostly correct. John Mueller has said it directly: "I strongly recommend not relying on trying to force indexing." Request Indexing is quota-limited: the manual button accepts only a small number of submissions per day, and the 2,000-queries-per-property daily quota belongs to the URL Inspection API, which reports index status but does not request indexing. Google has no plan to raise these limits; the rate limit is a feature, not a bug. The premise of "force-indexing as the fix" is wrong; the fix is upstream of indexation.

That said: a meaningful minority of the URLs in your "Not indexed" report probably should be indexed and aren't. The checklist below tells you how to find which ones, what to do about them, and what to delete instead of fight.

One critical distinction up front. Search Console shows two very different "not indexed" statuses that look like one problem:

Discovered — currently not indexed: Google knows the URL exists (probably from a sitemap or an internal link) but hasn't crawled it yet. This is a crawl-budget / server-capacity / site-quality-signal issue. Google is rationing crawl based on whether the site is worth crawling. Different fix path.
Crawled — currently not indexed: Google crawled the URL, looked at it, and decided not to include it. This is a quality verdict, not a technical problem. Different fix path entirely.

Operators who treat these as one problem waste weeks. The first 4 items of this checklist split them.

Vocabulary

If you've read our disavow checklist or anchor portfolio checklist, you already know Penguin and SpamBrain. The remaining terms are new.

SpamBrain: Google's AI-based spam-detection system, applied to link spam at scale with the December 2022 link spam update. Works at donor-cluster granularity.
Helpful Content System / HCU: Google's content-quality classifier (deployed 2022, made part of core algorithm in March 2024). Targets thin, AI-generated-without-edit, or unhelpful pages. Drives ranking suppression, not always deindexation.
URL Inspection: Search Console feature that shows Google's index status for a specific URL plus a "Request Indexing" button. The button has a small daily submission quota; the 2,000 queries/property/day limit applies to the read-only URL Inspection API.
Crawl budget: the number of URLs Googlebot is willing to fetch from your site in a given time window. Illyes (2024): mostly a non-issue for sites under 10,000 pages.
X-Robots-Tag: HTTP response header that controls indexing. Same effect as a meta robots tag but set at the server level — often the cause when the page HTML looks fine but Google still won't index.
Canonical: HTML/HTTP signal that declares which version of a URL is the "main" one. A page that sets canonical to a different URL is asking Google to index that URL instead.
Indexing API: Google's programmatic indexing endpoint. Restricted to JobPosting and BroadcastEvent schema only — using it for general pages is against TOS and will get the property quarantined.
IndexNow: instant-indexation protocol adopted by Bing, Yandex, and Naver. Google has tested it since 2021 and not adopted — sending IndexNow to Google does nothing.
hreflang: HTML/HTTP attribute that tells Google which language and region version of a page to serve to which audience. Misconfigured hreflang causes Google to consolidate language variants and "Crawled - not indexed" the wrong ones.
E-E-A-T markers: signals of Experience, Expertise, Authoritativeness, Trustworthiness on a page. Concretely: named author with bio page, credentials, LinkedIn link, published date, fact-checked claims with sources, real photos. Used by Google's quality classifiers.

What success looks like

You finish the checklist with three buckets:
Bucket A: pages that should be indexed and aren't. You identified the technical or quality blocker and fixed it.
Bucket B: pages where Google's judgment is correct. You accept the verdict or delete the page.
Bucket C: pages that truly don't need to be indexed (admin, thank-you, preview URLs). You mark them noindex deliberately, so they are reported under "Excluded by 'noindex' tag" instead of cluttering the ambiguous reasons, and Google crawls them less often over time.
Sorting 847 URLs into those three buckets is the win.

Related reading: redirect chains are a common reason a page stalls — see redirect chains explained: what they are, why they hurt SEO, and how to fix them.

Pre-flight: confirm what's happening

Four checks before you touch any technical setting. Most indexation troubleshooting work goes the wrong direction because the operator skipped this category and started fixing a technical problem when the issue was either (a) Google's correct quality judgment, or (b) the URL was never supposed to be indexed in the first place.

Critical: Split the URL list by GSC status (Discovered vs Crawled are different problems)
"Discovered - currently not indexed" and "Crawled - currently not indexed" live in the same Search Console report but are different operational problems. Discovered = Google hasn't crawled yet (server/budget/quality-signal issue). Crawled = Google looked and declined (quality verdict). Treating them as one problem wastes weeks.
How to do this
Search Console → Pages report → scroll to the "Why pages aren't indexed" section. Click each reason individually to see the URL list:
Discovered — currently not indexed: tag these URLs as "Discovered" in your tracker. Fix path = site-level (internal linking, sitemap, server response time, content quality of the source pages linking to these URLs).
Crawled — currently not indexed: tag as "Crawled". Fix path = page-level (rewrite, expand, merge, or accept the verdict).
Export both lists separately. The rest of this checklist treats them differently.
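The split above can be kept honest with a small script instead of manual tagging. This is a minimal sketch, assuming you have already pulled the two exported URL lists into plain Python lists; the function names and the `fix_path` column are illustrative, not part of any Search Console export format.

```python
import csv
import io


def build_tracker(discovered_urls, crawled_urls):
    """Merge the two exported URL lists into tracker rows tagged by status."""
    rows = []
    seen = set()
    for status, urls in (("Discovered", discovered_urls), ("Crawled", crawled_urls)):
        for url in urls:
            url = url.strip()
            # Skip blank lines and URLs already tagged under an earlier status.
            if not url or url in seen:
                continue
            seen.add(url)
            fix_path = "site-level" if status == "Discovered" else "page-level"
            rows.append({"url": url, "status": status, "fix_path": fix_path})
    return rows


def to_csv(rows):
    """Serialize tracker rows as CSV text ready to paste into a sheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["url", "status", "fix_path"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A URL that somehow appears in both exports keeps the first status it was seen with, which is a design choice of this sketch rather than anything Search Console guarantees.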
Critical: Confirm "not indexed" means deindexed, not just unranked
Operators conflate "my page isn't ranking" with "my page isn't indexed". The Pages report and the Performance report measure different things. A page can be fully indexed and still get zero clicks because it ranks at position 47. The fix for unranked-but-indexed is content + on-page work; the fix for truly-not-indexed is this checklist.

How to do this

Paste the URL into the Search Console URL Inspection bar. Read the verdict at the top: "URL is on Google" / "URL is not on Google" / "URL is on Google, but has issues". If it says "URL is on Google" the page IS indexed — your problem is ranking, not indexation, and this checklist is the wrong tool. Head to a content audit or our anchor portfolio checklist instead.
Critical: Batch-check the full URL list with the backlink index checker
Going through URL Inspection one URL at a time when you have 847 of them isn't an audit, it's a Friday-night punishment. Batch-check the list with a tool that reports indexation status for many URLs at once. GSC reports lag actual Google state by 3-14 days, so the tool often shows URLs as indexed that GSC still flags; the real "not indexed" set is usually much smaller than the panic suggests.

How to do this

Paste the URL list into our backlink index checker. It runs a Google site:<url> lookup, which works for any URL. The tool's landing-page copy is written for the backlink-checking use case, so the explanatory text mentions 'donor pages', but the verdict is the same: INDEXED / NOT INDEXED. Export results and filter for 'not indexed'; that's your real work-list.
Important: Confirm the URL is supposed to be indexed in the first place
On most sites, a substantial share of the "not indexed" URLs shouldn't be indexed: internal admin, thank-you pages, auto-generated tag archives, filter combinations, preview URLs, draft posts. The share varies wildly: tightly-curated B2B SaaS sites might see ~10%, while Shopify with faceted nav and tag archives can hit 60%+. Operators waste hours trying to force indexation on URLs that should be noindexed deliberately.
How to do this
Walk the URL list and mark each:
SHOULD be indexed: money pages, content pages, comparison pages, landing pages with commercial intent.
SHOULDN'T be indexed: /admin/, /thank-you/, /cart/, ?utm_source= parameter variants, tag-archive pages with <3 posts, filter combinations.
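For the "shouldn't" set: add an explicit noindex meta tag or X-Robots-Tag header. Within 1-2 recrawl cycles they move out of the ambiguous "Discovered"/"Crawled" reasons and show up under "Excluded by 'noindex' tag", which is an intentional state you can ignore. The rest of this checklist applies only to the "should be indexed" set.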
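Walking 847 URLs by hand is slow, so a first pass can be automated. The sketch below is one possible way to do it, assuming the path prefixes listed above and treating any `utm_` query parameter as a tracking variant; both lists are illustrative and need adjusting to your own URL structure. Tag archives and filter combinations are site-specific and are deliberately left to manual review.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical defaults taken from the examples in the checklist; extend per site.
NOINDEX_PATH_PREFIXES = ("/admin/", "/thank-you/", "/cart/")
TRACKING_PARAM_PREFIX = "utm_"


def should_be_indexed(url):
    """Return False for URLs that match an obvious 'shouldn't be indexed' pattern."""
    parts = urlsplit(url)
    # Treat "/cart" and "/cart/" the same way.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    if any(path.startswith(prefix) for prefix in NOINDEX_PATH_PREFIXES):
        return False
    params = parse_qs(parts.query, keep_blank_values=True)
    if any(name.startswith(TRACKING_PARAM_PREFIX) for name in params):
        return False
    return True
```

Anything this returns True for still needs a human decision; the function only removes the obvious cases from the pile.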

Diagnose: technical blockers

Five checks for pages that should be indexed and aren't. These catch the technical reasons Google can't (or won't) index a URL even when the content is fine. Run these in order; later checks assume earlier ones pass.

Critical: Verify robots.txt allows Googlebot on the URL
Disallow rules in robots.txt tell Googlebot not to crawl. Google can't read the content of a blocked URL, so at best it gets indexed as a bare URL with no snippet, regardless of content quality. The most common cause: a staging robots.txt with User-agent: * / Disallow: / got deployed to production. We covered this pattern in our robots.txt checklist; if you haven't audited robots.txt against your indexation set, do that first.
How to do this
Two-step verification:
Open yoursite.com/robots.txt. Read every Disallow rule. Check any that touch the URL path of the not-indexed pages.
Test specifically: paste the URL path into our robots.txt tester against Googlebot. Verdict must be ALLOWED.
If a Disallow rule blocks the URL, remove it or carve out a Googlebot allow. Wait 1-2 recrawl cycles before continuing.
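For a quick local check before reaching for a tester, Python's standard library can evaluate a robots.txt body against a URL. This is a rough sketch, not a substitute for Google's own parser: `urllib.robotparser` applies rules in file order, while Google uses the most specific matching rule, so results can differ on files that mix Allow and Disallow for overlapping paths. The function name is illustrative.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Evaluate a robots.txt body (as text) against one URL for one user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The classic staging file deployed to production blocks everything.
staging = "User-agent: *\nDisallow: /\n"
print(is_allowed(staging, "https://yoursite.com/page"))
```

If this disagrees with a Google-specific tester on a file with both Allow and Disallow rules, trust the tester.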
Critical: Check meta robots and X-Robots-Tag for noindex directives
A page can have a noindex meta tag in the HTML or, less obviously, an X-Robots-Tag: noindex HTTP response header. Either tells Google to drop the page from the index. The HTTP header version is the silent one: you view the HTML in browser DevTools and see no noindex, but the header is blocking indexing at the server level.
How to do this
Two checks:
HTML meta tag: view-source: on the URL, Ctrl+F for noindex. Look for <meta name="robots" content="noindex"> in the <head>.
HTTP header: curl -I https://yoursite.com/page or open DevTools → Network → click the document → Headers tab. Look for X-Robots-Tag.
If either is set to noindex, remove it. If the URL is ALSO blocked in robots.txt, Googlebot can't recrawl the page to see that the directive is gone, so lift the robots.txt block as well and then wait one crawl cycle.
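Both checks can be scripted once you have the page HTML and response headers in hand (from curl, DevTools, or a crawler export). The following is a minimal sketch under that assumption; `noindex_sources` is a hypothetical helper name, and it treats the `none` directive as equivalent to noindex.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def noindex_sources(html, headers):
    """Return which of 'meta' and 'header' carry a noindex directive."""
    found = []
    parser = _RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives or "none" in parser.directives:
        found.append("meta")
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        # Handles both "noindex" and the "googlebot: noindex" form.
        tokens = [t.strip().lower() for t in value.replace(":", ",").split(",")]
        if "noindex" in tokens or "none" in tokens:
            found.append("header")
    return found
```

An empty list means neither location blocks indexing; `["header"]` alone is the silent case described above.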
Critical: Verify canonical points to itself, not to another URL
A page that sets rel="canonical" to a different URL is telling Google: "don't index me, index that other URL instead." Common causes: a CMS template auto-generating canonical to the homepage; a developer using canonical for cleanup of duplicate parameter URLs and accidentally pointing to the wrong target; pages on a subdomain canonicalizing to the main domain.
How to do this
view-source: on the URL, search for <link rel="canonical". Compare the href value to the URL you're inspecting:
If they match (or differ only by protocol/www) — canonical is fine.
If canonical points to a different URL — that's why Google won't index this one. Audit the canonical config in your CMS.
Our canonical tag checker validates this across a list of URLs in one pass.
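If you would rather script the comparison yourself, the logic is small. This sketch assumes you already have the page HTML as a string, and it applies the same tolerance described above (protocol and www differences are ignored); the helper names are hypothetical, and trailing-slash differences are treated as a real mismatch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _CanonicalParser(HTMLParser):
    """Captures the href of the first <link rel="canonical"> in the document."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.href is None:
            self.href = attr.get("href")


def _normalize(url):
    """Drop protocol and a leading 'www.' so only host, path and query are compared."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path or "/", parts.query


def canonical_status(page_url, html):
    """Return 'missing', 'self', or 'points elsewhere' for one page."""
    parser = _CanonicalParser()
    parser.feed(html)
    if not parser.href:
        return "missing"
    target = urljoin(page_url, parser.href)  # resolves relative canonicals
    return "self" if _normalize(target) == _normalize(page_url) else "points elsewhere"
```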
Nice to have: Resolve any 3xx redirect chain the URL is caught in
A URL caught in a long redirect chain (URL A → URL B → URL C) loses signal at each hop: Google follows several hops, but each one weakens the signal. The fix is to compress chains to single-hop redirects. In 2026 a chain under 3 hops is rarely the cause, so check this after the higher-priority items pass.
How to do this
Two equivalent options:
Online tool (works anywhere): paste the URL into httpstatus.io. Count the hops in the chain.
Command line (Mac/Linux/Git Bash): curl -ILs https://yoursite.com/page. Count Location: headers. PowerShell's curl alias does NOT accept -ILs — use the online tool instead, or open WSL.
Compress any chain longer than 1 hop to a single redirect from the original URL to the final destination.
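For a list of URLs, hop counting is easy to script, which also sidesteps the PowerShell problem. The sketch below is one way to do it using only the standard library; `trace_redirects` and `fetch_head` are hypothetical names, and the fetcher is passed in as a parameter so the chain logic can be exercised without network access.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_head(url):
    """Send one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    secure = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_head, max_hops=10):
    """Return the list of URLs visited, starting URL first. Hops = len(chain) - 1."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    return chain
```

A chain that still ends on a redirect after `max_hops` is most likely a loop, which is worth fixing regardless of indexation.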
Important: Verify Google sees the rendered content (JavaScript rendering)
Single-page apps and JS-heavy frameworks (React, Vue, Next.js with client-side rendering, Webflow custom code) can serve an empty shell to Googlebot. The page exists but the content lives in JS that Google's renderer may not execute or may execute incompletely. The page looks fine to a human visitor and empty to the crawler.
How to do this
Precursor: view-source:yoursite.com/page, Ctrl+F for the main body text. If the visible content of the page is missing from the raw HTML source, the site is JavaScript-rendered, so continue. If the body text is in the source, this isn't your problem.
Then verify what Googlebot renders:
In Search Console, paste the URL into the URL Inspection bar → click "Test Live URL" (top-right). Wait for it to finish → click "View tested page" → "More info" tab. Look at the HTML Google rendered. Compare to the visible content of the live page.
Or use Google's Rich Results Test — same Web Rendering Service Googlebot uses, sometimes easier to inspect.
If Google's rendered HTML is missing the main content, the fix is dev-team work: server-side rendering (SSR) or static-site generation (SSG) for the main content. On Next.js with the Pages Router: use getServerSideProps or getStaticProps, not client-only fetching (in the App Router, render the main content in Server Components). On other stacks: ask your engineering team. This is a dev ticket, not a junior fix, so escalate.
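The view-source precursor can be batched: given the raw HTML of a page (not the rendered DOM) and a few phrases you can see in the browser, check which phrases are absent from the source. This is a rough sketch with a hypothetical function name; it uses simple regex tag stripping, which is good enough for a presence check but is not a full HTML parser.

```python
import html
import re


def missing_from_raw_html(raw_html, phrases):
    """Return the phrases that do not appear in the visible text of the raw HTML."""
    # Drop script and style bodies first, since text inside them is not page content.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", raw_html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    text = re.sub(r"\s+", " ", html.unescape(text)).lower()
    missing = []
    for phrase in phrases:
        needle = re.sub(r"\s+", " ", phrase).strip().lower()
        if needle not in text:
            missing.append(phrase)
    return missing
```

If every phrase comes back as missing, the page is almost certainly client-rendered and the Search Console live test is the next step.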

Diagnose: quality blockers

Four checks for pages where Google's index decision is content-driven, not technical. These are the harder fixes because they require rewriting or merging pages, not flipping a flag. The honest framing: if Google declined to index a thin page, Google was probably right.

Critical: Score the page against the thin-content threshold (300-500 words minimum)
Pages under 300 words are increasingly hard to index in 2026 unless they serve exceptionally narrow query intent (product pages, contact pages, very specific FAQ entries). Mueller has been explicit: thin content was the dominant pattern in HCU-driven ranking suppression cases. Pages under 200 words almost never get indexed unless they're the canonical product page for a unique SKU.

How to do this

Word-count the page (strip HTML, count user-visible text). If <300 words and the page is truly informational, expand or merge it with a more comprehensive page on the same topic. If <300 words and the page is structural (category landing, paginated archive), consider noindexing it and consolidating the inventory.
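The word count itself is easy to script. The sketch below strips tags and skips script and style bodies, then applies the 300-word line from this item; the names are hypothetical, and because it counts all visible text (navigation and footer included), it will overstate the word count of the main content on template-heavy pages.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Counts whitespace-separated words outside non-visible elements."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def visible_word_count(html):
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return parser.words


def thin_content_verdict(html, threshold=300):
    """Return ('thin' or 'ok', word_count) using the checklist's 300-word line."""
    count = visible_word_count(html)
    return ("thin" if count < threshold else "ok"), count
```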
Critical: Identify duplicate or near-duplicate content with other URLs on the site
When two pages on your site cover the same topic with similar content, Google picks one to index and drops the other. The dropped one shows as "Crawled - currently not indexed". Common causes: auto-generated tag pages overlapping with category pages; near-duplicate product listings; multilingual hreflang misconfiguration causing language variants to dedupe; paginated archives where each page differs only by which 10 posts are listed.

How to do this

For each not-indexed URL, run a Google search for site:yoursite.com [primary topic of the page]. If 3+ of your own URLs show in the results, you have content competing with itself. Pick one as the canonical, merge the content, redirect the rest with 301s.
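The site: search finds topical overlap; it doesn't tell you how similar two pages are. As a complement, and as a different technique from the one described above, you can compare the extracted text of two pages with word-shingle Jaccard similarity. This is a minimal sketch with hypothetical function names; any cut-off you apply to the score is your own assumption and not a threshold Google publishes.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams ('shingles') in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets: 0.0 = disjoint, 1.0 = identical."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Pairs that score high are the merge-and-redirect candidates; pairs that score low but still compete in the site: search are a targeting problem, not a duplication problem.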
Important: Audit internal links (are not-indexed URLs orphaned?)
Pages with zero internal links are orphans. Google can theoretically still find them (via sitemap) but assigns low priority and quality signal. A page that no other page on your own site links to is implicitly being told "this page isn't important enough to feature", and Google reads that signal.

How to do this

Crawl your site with Screaming Frog or Sitebulb. Filter for pages with 0 inbound internal links from the same domain. For the ones that should be indexed: add internal links from at least 2-3 indexed, topically-relevant pages. Avoid concentrated linking from low-quality category archives only.
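If your crawler can export a list of all URLs plus a list of internal link pairs, the orphan filter is a few lines. This sketch assumes that input shape (source URL, target URL) and a hypothetical function name; the actual column layout of a Screaming Frog or Sitebulb export is not modelled here.

```python
def find_orphans(all_urls, internal_links):
    """Return URLs that receive no internal link from any other page.

    internal_links is an iterable of (source_url, target_url) pairs.
    Self-links are ignored because they do not make a page discoverable.
    """
    linked = {target for source, target in internal_links if source != target}
    return sorted(url for url in set(all_urls) if url not in linked)
```

Intersect the result with your "should be indexed" list before acting; orphaned admin or thank-you URLs are fine as they are.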
Important: Diagnose site-wide Helpful Content suppression before page-level fixes
If your site got hit by a Helpful Content (HCU) update and rankings collapsed site-wide, individual page-level indexation fixes won't restore traffic. The suppression is at the site-classifier level. Lily Ray and Glenn Gabe have documented this pattern repeatedly through 2024-2026: ranking collapse for small publishers, not necessarily mass deindexation, but the indexation symptom follows.

How to do this

Check the Search Console Performance report against Google's announced ranking-update history. If your traffic dropped on an HCU date (March 2024 was the big incorporation; subsequent core updates carry HCU signals) and the drop is site-wide, fix the site-level signal first — original research, named authors with E-E-A-T markers, removing thin pages, demonstrating actual experience. Page-level fixes are downstream of this.

Action: what to do (in order)

Five execution steps. Run them in order. Skipping fix-then-request and going straight to Request Indexing is the most common operator mistake: you burn rate-limit quota requesting a page Google already saw and declined.

Critical: Fix the page first, THEN use URL Inspection's Request Indexing
Request Indexing tells Google "look at this URL again." Google will look. If the page hasn't changed since the last crawl, Google reaches the same conclusion and you have wasted one of your limited daily requests. Always fix the page first, confirm the fix with View Source, THEN submit.
How to do this
Sequence:
Make the fix (rewrite content, fix noindex, repair canonical, etc.).
Verify the change is live: view-source:yoursite.com/page shows the new state.
Search Console → URL Inspection → paste the URL → "Request Indexing". This adds the URL to Google's crawl queue.
Wait 2-14 days. Don't resubmit during this window.
Important: Submit an updated sitemap with refreshed lastmod values
Sitemaps with accurate <lastmod> dates are still useful in 2026. Google uses lastmod as a hint to prioritize recrawl. But sitemap pings (the google.com/ping?sitemap= endpoint) were deprecated by Google at the end of 2023, so just update the sitemap file and let Google find it on the next scheduled fetch.

How to do this

Generate or update your sitemap with current <lastmod> timestamps for the URLs you just fixed. Submit through Search Console → Sitemaps. Don't ping the deprecated endpoint; it does nothing.
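If your CMS doesn't regenerate lastmod for you, a sitemap for the fixed URLs can be produced with the standard library. This is a minimal sketch assuming you track each URL with the date of its last real content change; `build_sitemap` is a hypothetical name, and only the `loc` and `lastmod` elements of the sitemaps.org 0.9 schema are emitted.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from an iterable of (url, lastmod_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = loc
        # W3C date format (YYYY-MM-DD) is what the sitemap protocol expects.
        ET.SubElement(url_el, "lastmod").text = lastmod.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([("https://yoursite.com/page", date(2026, 1, 15))]))
```

Only bump lastmod when the page actually changed; a sitemap where every date is "today" teaches Google to ignore the hint.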
Important: Add 2-3 internal links from indexed, topically-relevant pages
Internal links are the strongest Google-controlled signal that a URL is worth indexing. A newly-fixed URL with no internal links sits in the same low-priority queue as before. Add the links as part of the fix, not as a separate task.

How to do this

For each newly-fixed URL, pick 2-3 existing pages that are (a) already indexed, (b) topically relevant, (c) get real organic traffic. Add a contextual in-content link with a descriptive anchor (see our anchor portfolio checklist for what counts as descriptive). Avoid linking only from category archives or footers — those are weaker signals.
Important: Track the URL with 7 / 30 / 90-day index-check intervals
After requesting indexation, Google decides on its own schedule. The operator-defensible recrawl windows are 2-14 days for the first index decision, 30 days for confirmation that the decision is stable. Without a check schedule, you don't know if the fix worked or if Google indexed temporarily and dropped the page again.
How to do this
Add a row in your fix-tracker (Google Sheet works fine):
fix_date | url | status_at_fix (Discovered/Crawled) | fix_type | bucket (A/B/C) | 7d_indexed (Y/N) | 30d_indexed (Y/N) | 90d_indexed (Y/N) | 90d_impressions | notes
The check cadence: day 7 (did Google revisit?), day 30 (did the fix stick?), day 90 (is the page still indexed AND getting impressions?). Pages indexed at 30 days but dropped at 90 days have a quality problem you didn't fully fix. Indexed at 90 days but zero impressions = indexed but not ranking, which is a different problem (head to a content audit).
Batch the day-30 and day-90 indexation checks with our backlink index checker. For impressions, pull from Search Console.
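The cadence is simple date arithmetic, so it can live next to the tracker. The sketch below assumes the 7 / 30 / 90-day offsets from this item and labels that mirror the tracker columns; the function names are hypothetical.

```python
from datetime import date, timedelta

CHECK_OFFSETS_DAYS = (7, 30, 90)


def check_schedule(fix_date):
    """Map each check label ('7d', '30d', '90d') to its due date."""
    return {f"{days}d": fix_date + timedelta(days=days) for days in CHECK_OFFSETS_DAYS}


def checks_due(fix_date, today, done=()):
    """Return the labels whose due date has arrived and that are not yet recorded."""
    schedule = check_schedule(fix_date)
    return [label for label, due in schedule.items() if due <= today and label not in done]


print(checks_due(date(2026, 1, 1), date(2026, 2, 1), done=("7d",)))
```

Run it over every row once a week and you get the exact list of URLs to batch-check that day.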
Important: Delete or noindex pages where Google's verdict is correct
Some pages in the not-indexed list truly don't deserve indexation: thin tag archives with 2 posts, dead landing pages from old campaigns, auto-generated search-result pages. Leaving them in the not-indexed report is noise; it also drags site-level quality signals because Google sees them as evidence of low-effort content. The honest fix is deletion (404) or explicit noindex.
How to do this
For each URL in this bucket, pick the fix:
Genuinely useless: 404 or 410 (Gone). Google drops the URL from its index over the next 2-4 weeks.
Useful for users but not search (e.g. thank-you pages, logged-in dashboards): noindex meta or X-Robots-Tag header.
Duplicates of a better page: 301-redirect to the better page.

What NOT to do

Four anti-patterns that show up in nearly every indexation post-mortem. If you're about to do any of these, stop.

Important: Don't ping the deprecated sitemap endpoint (Google removed it at the end of 2023)
google.com/ping?sitemap= was deprecated by Google in 2023 and now returns 404. SEO blogs and outdated CMS plugins still recommend it. Hitting the endpoint does nothing, but it also costs nothing: the problem isn't damage, it's the false sense of action. If your indexation workflow includes "ping sitemap" as a step, delete that step and use Search Console → Sitemaps submission instead.

How to do this

Update your CMS / deploy pipeline / outreach SOP. Delete any sitemap-ping command. Replace with: "submit sitemap URL once in Search Console; rely on scheduled recrawl for updates."
Critical: Don't use third-party "indexing API" services for general content
Google's Indexing API is restricted to JobPosting and BroadcastEvent schema types. Using it for general pages is explicitly against TOS and will result in your property being quarantined. Third-party services that promise "instant indexing" for any URL are doing one of: (a) abusing the API and putting your property at risk, (b) charging you for sitemap submission you can do for free, or (c) using a compromised account that's about to get banned.

How to do this

If you're being pitched an indexing API service: ask which exact endpoint they use and which schema type they submit your URLs under. Anything other than JobPosting or BroadcastEvent is TOS-violating use. Walk away.
IndexNow is a separate protocol supported by Bing, Yandex, Naver — but NOT Google. Sending IndexNow requests thinking it'll help Google indexation does nothing.
Critical: Don't force indexation on pages Google's quality judgment correctly declined
If Google has crawled the page multiple times and not indexed it, the page has a quality problem you haven't fully fixed. Re-submitting through URL Inspection burns your daily quota and doesn't change Google's verdict. Mueller has been explicit: "I strongly recommend not relying on trying to force indexing."

How to do this

After ONE Request Indexing submission, wait 14 days. If the page still doesn't index, the problem is the page or the site-level signal, not the request. Re-run the four quality-blocker checks above and find the actual blocker. Don't re-submit through URL Inspection more than once per fix.
Important: Don't confuse Google indexation with AI-search visibility (they are separate surfaces)
ChatGPT Search, Perplexity, and Claude do not index from Google's index directly. They have their own indexes pulling from different sources (Wikipedia, Reddit, YouTube, Common Crawl, partner data). Recent third-party audits show ~11% domain overlap between ChatGPT and Perplexity citation sources. Fixing Google indexation does not improve AI-search citation rate; that's a different fix entirely.

How to do this

If your concern is "we're not getting cited in ChatGPT/Perplexity", the work lives in content + brand mentions + schema, not in Google's indexation pipeline. See our GEO vs SEO guide for that separate work.

About the Author

Andrei

SEO and digital marketing professional with 13+ years of experience. Started as a website administrator in 2011, transitioned to SEO, and achieved top-3 rankings for competitive keywords. Co-founded a consulting firm specializing in marketing audits for companies in Ukraine and internationally. Built LinkGuard to solve the problem he experienced firsthand: most SEO teams purchase links but never monitor their survival. Based in Kyiv, Ukraine.
