InstaYolo
by Tyler Brennan · seo · indexation · site notes · google

1 of 57 pages indexed. The other 56 are technically perfect. So why?

I caught instayolo.com on the drop in March 2025. 13 months later, 1 of 57 indexable pages on the site is in Google. The other 56 are sitting in two GSC buckets despite every technical signal being correct. The real bottleneck isn't crawlability or schema — it's the prior owner's reputation drag, and the only way out is one external high-trust link.

Caught a dropped domain in March 2025

I caught instayolo.com on the drop in March 2025. The previous owner had let registration lapse. Short name, two real words, reads as a downloader. Bought, sat on it for a couple weeks, then started building.

1 of 57 indexed

13 months later, 1 of 57 indexable pages on the site is in Google. Just the homepage.

Of the 56 not indexed, 38 are in GSC's "Discovered – currently not indexed" bucket — Google saw the URL in the sitemap and decided not to bother crawling. 17 are in "Crawled – currently not indexed" — Google crawled the page, then quietly tossed the result. One straggler is "Page with redirect" (`/story-viewer` → `/story-downloader`, a 308 shipped after merging two near-duplicate URLs into one).
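For reference, a 308 like the one GSC reports is what Next.js emits for a permanent redirect. A hypothetical sketch of that merge in `next.config` (not the site's actual config):

```typescript
// Hypothetical next.config sketch of the /story-viewer → /story-downloader merge.
// In Next.js, `permanent: true` inside redirects() produces a 308 Permanent
// Redirect, which GSC then surfaces as "Page with redirect".
const nextConfig = {
  async redirects() {
    return [
      {
        source: "/story-viewer",
        destination: "/story-downloader",
        permanent: true, // 308; `permanent: false` would be a 307 instead
      },
    ];
  },
};

export default nextConfig;
```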

Technical state, by category

- Sitemap: `sitemap.xml` gets served from a real `app/sitemap.ts` route, with `lastmod` derived from `git log` instead of the lazy `now` value everybody ships.
- Robots: `robots.txt` allows `/`, disallows only `/api/`.
- Headings: one H1 per page.
- Structured data: JSON-LD spans Organization, WebSite (with SearchAction, so the sitelinks searchbox is unlocked), SoftwareApplication, BreadcrumbList, Article, and Person. Every block validates in the Rich Results Test.
- International and canonical: hreflang `x-default` everywhere; canonical URLs absolute.
- Robots headers: X-Robots-Tag is never set to `noindex` on an indexable host.
- Cloaking check: the Googlebot UA gets the same 200 and same byte size as the default UA — no cloaking, no soft 404s.
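A minimal sketch of what a git-derived `app/sitemap.ts` might look like; the page list and file paths here are assumptions, not the real repo layout:

```typescript
// app/sitemap.ts (sketch): lastmod comes from the last commit that touched each
// page's source file, via `git log -1 --format=%cI -- <path>`, instead of
// stamping every URL with a blanket `new Date()` at build time.
import { execSync } from "node:child_process";

function lastCommitDate(path: string): Date {
  try {
    const iso = execSync(`git log -1 --format=%cI -- ${path}`, {
      stdio: ["ignore", "pipe", "ignore"],
    })
      .toString()
      .trim();
    // Empty output means git has no commit for that path; fall back to now.
    return iso ? new Date(iso) : new Date();
  } catch {
    // Not a git checkout (e.g. a bare deploy artifact); fall back to build time.
    return new Date();
  }
}

export default function sitemap() {
  // Hypothetical page-to-source mapping; the real site would enumerate all 57.
  const pages = [
    { url: "https://instayolo.com/", file: "app/page.tsx" },
    { url: "https://instayolo.com/story-downloader", file: "app/story-downloader/page.tsx" },
  ];
  return pages.map((p) => ({ url: p.url, lastModified: lastCommitDate(p.file) }));
}
```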

Two different LLM agents audited the stack independently. Both came back clean.

So why.

The domain wasn't fresh

The previous owner had run a Shopify-flavored e-commerce site under instayolo.com — `/products/...`, `/collections/...`, `/cart`, `/account/login` — and every one of those was cached in Google's index when I took ownership. Search "instayolo" today and the brand entity in Google's knowledge graph is partially mine, partially the prior owner's. Google's quality classifier is still working out what this domain *is*, given the gap between its memory of the domain and what my actual content says it does.

This is the part SEO writeups skip. Fresh domains are predictable — sandbox, ride it out, build links, climb. *Inherited* domains with an active prior-owner footprint are different. The crawl scheduler treats you as worse than new: not unproven, but established-and-radically-changed. Authority signals from the prior site decay slowly, but they decay onto *your* pages now. The quality classifier eyes your content sideways for not matching the brand's history. Nobody warned me.

A week of compressed work follows.

Sweep the ghost URLs

GSC URL Removals tool, Directory mode (collapses 12 paths into 4 prefix patterns). Same patterns into Bing Webmaster's Block URLs. Removals buys roughly 6 months of suppression. Without it, plan on Google taking 12+ months to drop the prior owner's URLs, with ghost results haunting your brand SERP the whole time.

Eliminate every false signal

Tedious, but it earns its keep. The Article schema's `publisher.logo` pointed at `/icon-512.png` — that URL 404'd, because Next.js App Router generates `/icon` dynamically, without the `-512.png` suffix. Fixed it to point at the real route, with explicit 256×256 dimensions. Nginx was returning `<h1>301 Moved Permanently</h1>` as the response body on www→apex redirects. Google ignores those bodies; Bing was treating it as page content and reporting the www host as "missing meta description." Added `X-Robots-Tag: noindex, follow` on the www server block — it tells engines never to index www separately and to consolidate everything into the apex. 38 `meta_title` strings rendered over 60 characters in real SERPs; trimmed. 12 `meta_description` strings over 160; trimmed.
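The www fix, as a sketch of the server block described above (listen and certificate details omitted; not the production config):

```nginx
server {
    listen 443 ssl;
    server_name www.instayolo.com;

    # Never index www as a separate host, but keep following the redirect so
    # signals consolidate onto the apex. `always` attaches the header on every
    # response status, not just the ones add_header covers by default.
    add_header X-Robots-Tag "noindex, follow" always;

    return 301 https://instayolo.com$request_uri;
}
```

Note `return 301` is also where the `<h1>301 Moved Permanently</h1>` body comes from: nginx generates that stub HTML itself for redirect responses.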

Performance

PSI mobile was Perf 78, TBT 530ms. The TBT was Google Tag Manager loading via `afterInteractive`. Switched it to `lazyOnload`. Perf 78 → 98, TBT 530 → 56ms. Then a Cloudflare Cache Rule for HTML — Cloudflare's default does not cache HTML, which I had assumed it did. (You have to opt in, via a Page Rule or Cache Rule.) TTFB 1.6s → 0.64s.
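The one-word change behind that TBT drop, sketched as a Next.js fragment (the container ID is a placeholder and this is not the site's real layout file):

```tsx
// app/layout.tsx (excerpt, hypothetical). next/script's `lazyOnload` defers the
// tag until browser idle time instead of loading it right after hydration,
// which takes GTM's work out of the main-thread window TBT measures.
import Script from "next/script";

export function Analytics() {
  return (
    <Script
      id="gtm"
      strategy="lazyOnload" // was "afterInteractive"
      src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXXX"
    />
  );
}
```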

Author + entity hookup

Built `/authors/torrance` with a proper Person schema. Switched every Article schema's `author.url` from `/about` to `/authors/{slug}`. Added an "About the author" block at the foot of every blog post — template change, propagates to all 16 posts at once, no manual loop. Cross-linked old posts to newer ones. The "Crawled – not indexed" bucket was almost entirely older posts with no inbound internal links from fresh content, so the working theory is Google saw orphans and dropped them.
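A sketch of that wiring, with hypothetical field values (the real Person schema presumably carries more properties, e.g. `sameAs`):

```typescript
// Hypothetical JSON-LD builders for /authors/{slug}. The point of the switch:
// the Article's author.url and the Person schema's own URL must agree, so
// both are derived from a single helper rather than hand-typed per post.
const ORIGIN = "https://instayolo.com";

function authorUrl(slug: string): string {
  return `${ORIGIN}/authors/${slug}`;
}

function personJsonLd(slug: string, name: string) {
  const url = authorUrl(slug);
  return {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": `${url}#person`, // stable anchor other schema blocks can reference
    name,
    url,
  };
}

function articleAuthorRef(slug: string, name: string) {
  // What the Article schema's `author` field carries after moving off /about.
  return { "@type": "Person", name, url: authorUrl(slug) };
}
```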

Indexed count after all of that: still 1.

What nobody warned me

What nobody warned me about, and the reason this post exists, is that **none of the technical work moves crawl budget on a reputation-drag domain.** What moves it is one external high-trust link. Technical work matters for *quality* once Google decides to crawl. It doesn't move *whether* Google decides to crawl. Authority does that. Specifically, authority that is observably new and attached to current content. A single backlink from a domain Google trusts probably breaks the purgatory faster than two more weeks of internal optimization.

Hence this post.

If you've inherited a dropped domain and Google is still stuck on the prior owner, feel free to reach out. I'm happy to share receipts on the small fixes (commits, before/after PSI, GSC screenshots). Comparing notes is welcome.

Indexed pages: 1.
