When Should You De-Index Programmatic Content?

Programmatic content is content created (drumroll, please) programmatically — either by replacing certain pieces of that content according to rulesets, or through the use of generative AI. 

We’re going to set aside the AI stuff for now — frankly, we’re tired of writing about it — and just focus on this Reddit thread, where a user lost a bunch of impressions after de-indexing programmatic content. They did that due to advice from…AI. 

Sigh.

But that thread led to an interesting back-and-forth on LinkedIn: Should programmatic content be de-indexed? 

The short answer is yes, if that content holds no unique value for human readers. We’ll explain our reasoning in a moment, but first, here’s a brief explanation of some of the terms (just in case you’re not an SEO). 

One more thing before we start: If you’re tackling this problem on your own, we’ve got a deal for you. Send us a message, provide as much detail as possible, and we’ll tell you whether you should de-index a certain set of pages. We’ll do this for free over email, and we won’t make a sales pitch. Easy, right?

Now, onto the boring stuff.

What is de-indexing, and how is it good for SEO?

De-indexing simply means adding a noindex tag to your content, which tells search engines not to show that page in their results. 

noindex is not a perfect tool. A search engine has to recrawl the page before it sees the tag, and it never sees the tag at all if robots.txt blocks the page, so a page can linger in the results for a while. If you really need to remove something from Google results, you’ll need to take a more dramatic approach and set an htaccess rule that denies access to the content (or if you’re less tech-savvy, you can just password-lock the offending content). 
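If you want to confirm that a page is actually sending the noindex signal, a quick script can help. The sketch below is ours, not an official tool: it checks a page’s HTML for a robots meta tag and, optionally, its response headers for X-Robots-Tag (the header version of the same directive). The names RobotsMetaParser and is_noindexed are made up for this example, and fetching the page is left to you.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> (or "googlebot") tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        # "robots" addresses every crawler; "googlebot" addresses Google only.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindexed(html, headers=None):
    """Return True if the HTML or the headers ask crawlers not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    if {"noindex", "none"} & parser.directives:
        return True
    # Simplification: header values with a crawler prefix
    # (for example "googlebot: noindex") are not handled here.
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            tokens = {token.strip().lower() for token in value.split(",")}
            if {"noindex", "none"} & tokens:
                return True
    return False
```

Pass it the page source and (if you have them) the response headers as a dictionary. A True result only tells you the directive is present; it says nothing about whether the search engine has recrawled the page yet.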

You don’t want to de-index content unless it’s absolutely necessary. Google and the other search engines are pretty good about deciding what should rank, so as a rule of thumb, you should let the search engines decide which pages make the rankings. That’s also why Google treats softer signals, such as canonical tags, as hints rather than commands.

But there are legitimate reasons to de-index. If you’ve got an eCommerce store, you might have product pages whose URLs change as variables (such as size, color, or sort order) change, and you’ll want to noindex everything other than the primary version of each page.

That shows Google that you understand that many of your pages are programmatic, and that you’ve designed your site to work that way — you’re not trying to game the system by bloating your page index (the total number of pages on your site). 
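To make the “primary version” idea concrete, here is a rough sketch of how you might tell a variant URL from its primary page. Everything specific in it is an assumption for illustration: the parameter names in VARIANT_PARAMS, the example domain, and the function names primary_url and is_variant. Your platform will have its own list of variables.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical variant parameters; replace with the ones your store uses.
VARIANT_PARAMS = frozenset({"color", "size", "sort", "sessionid"})


def primary_url(url, variant_params=VARIANT_PARAMS):
    """Return the URL with variant query parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in variant_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def is_variant(url, variant_params=VARIANT_PARAMS):
    """Return True if the URL carries at least one variant parameter."""
    query = urlsplit(url).query
    return any(
        key.lower() in variant_params
        for key, _ in parse_qsl(query, keep_blank_values=True)
    )
```

With these assumptions, https://shop.example/widget?color=red&size=m is a variant of https://shop.example/widget, so it would be a candidate for noindex (or for a canonical tag pointing at the primary URL).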

When does programmatic content hurt SEO?

Here’s the bad news: If you’re using programmatic content only for SEO, with nothing in it for human readers, it will eventually hurt your SEO.

We’ve seen plenty of websites with large amounts of content that has been programmatically adjusted to include different keywords. For example, an auto dealership in St. Louis might have pages that are titled: 

  • Auto Dealership in St. Louis
  • Auto Dealership in Maplewood
  • Auto Dealership in St. Clair County

If all of those pages are identical other than the locations — and the dealership doesn’t have unique facilities in each of those areas — that’s a blatant attempt to bloat the site’s page index, and it won’t work in the long term.
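One rough way to spot this pattern on your own site is to swap each page’s location name for a placeholder and see how much text is left in common. The sketch below uses Python’s standard difflib for the comparison; the function name similarity_without_location is our own, and any cutoff you pick (say, 0.9) is an assumption to tune, not a threshold Google publishes.

```python
from difflib import SequenceMatcher


def similarity_without_location(page_a, location_a, page_b, location_b):
    """Compare two page texts after replacing each location with a placeholder.

    Returns a ratio between 0.0 (nothing in common) and 1.0 (identical
    once the location names are ignored).
    """
    normalized_a = page_a.replace(location_a, "{location}")
    normalized_b = page_b.replace(location_b, "{location}")
    # autojunk=False keeps difflib from discarding frequent characters
    # as "junk" when the texts are long.
    matcher = SequenceMatcher(None, normalized_a, normalized_b, autojunk=False)
    return matcher.ratio()
```

Run it on the visible text of two location pages. A ratio at or near 1.0 means the pages are the same template with the place name swapped, which is exactly the kind of content described above.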

So when should I de-index programmatic content?

If you’ve got an eCommerce site, your platform (such as Magento or WooCommerce) will probably handle all of the necessary de-indexing right out of the box. You can check this by auditing your site (or have us do it; we’re not expensive). 

But if you’ve created a large set of programmatic content, ask yourself:

Does this provide actual value for users? Would a human being want to find this particular page?

If so, keep it. If not, you’ll probably want to delete it entirely — not just de-index it. 

Before you do that, we’d recommend checking the pages in question in Search Console and Analytics to determine whether they’re receiving traffic. Even if you’ve got low-quality pages, you might have tricked Google into sending you traffic (temporarily), and there are ways to retain that traffic. Namely, you can add high-quality content to the page, or you can make it the primary (canonical) version of its page set. 
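If you have a lot of pages to review, the decision above can be sketched as a simple triage. This is only an illustration of the reasoning, not a rule: the field names (url, clicks, has_unique_value) and the min_clicks threshold are assumptions, the click counts would come from your own Search Console export, and has_unique_value is a judgment a human has to make.

```python
def triage(pages, min_clicks=1):
    """Map each page URL to a suggested action.

    pages: iterable of dicts with the hypothetical keys
        "url" (str), "clicks" (int), "has_unique_value" (bool).
    """
    decisions = {}
    for page in pages:
        if page["has_unique_value"]:
            # A human would want to find this page, so leave it alone.
            decisions[page["url"]] = "keep"
        elif page["clicks"] >= min_clicks:
            # Low quality but still earning traffic: worth rescuing.
            decisions[page["url"]] = "improve or canonicalize"
        else:
            # No value and no traffic: delete rather than just de-index.
            decisions[page["url"]] = "delete"
    return decisions
```

The output is a starting list for a human review, not something to act on automatically.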

If you’ve got a question about SEO, we’re here to help. Shoot us a message.

About the author

John Krane