Technical Notes Of
Ehi Kioya

Hide (NoIndex) Files In WordPress From Search Engines

By Ehi Kioya

We often need to hide (noindex) certain pages or groups of pages in WordPress so that they do not show up in search engines. Tag archives are good examples of pages that should be noindexed because they may contain what search engines consider to be “duplicate content”. Other good candidates for noindexing include thank-you pages, author archive pages on single-author blogs, etc. We usually handle these with an SEO plugin.

The Problem

What about files? Sometimes you might host a PDF or Excel file on your website, and for a number of different reasons, you may not want that file to be indexed by search engines.

For example, if you use a PDF whitepaper to generate leads, you may not want to lock the whitepaper behind a paywall. You want it to be readily downloadable by anyone who has a link to the file. At the same time, you wouldn’t want people to find your whitepaper via a Google search either. Instead, you only want the file to be accessible to people who have shown interest by sending you their email address (or, for some other reason, you may only want the file to be reachable via the page on your site that links to it).

Since this is not a regular web page, the regular methods for noindexing pages may not work for your file and the question becomes: How do we hide (noindex) files in WordPress from search engines?

I explain a solution to this problem below and I demonstrate with live example files how you can hide (noindex) other types of media like PDF and Excel files in WordPress.

But before we get too technical, you may want to note that…

You May Be Able To Solve This Problem With The Yoast SEO Plugin

WordPress treats all uploaded files as “media”, and the popular Yoast SEO plugin allows you to define how you want search engines to treat each media file on your WordPress website. So with just a few clicks, you may be able to add noindex, noarchive, and nosnippet meta tags for your media files quickly.

From the edit page of the media file in question, the Yoast SEO settings you need to configure will look like this:

[Screenshot: How To Hide (NoIndex) Files In WordPress From Search Engines Using Yoast]

Easy eh? Well, maybe not.

The Yoast SEO media settings shown in the above screenshot are ONLY available if you have NOT already configured Yoast SEO to “Redirect attachment URLs to the attachment itself”.

[Screenshot: How To Hide (NoIndex) Files In WordPress From Search Engines (Yoast Media Settings)]

Notice that this is the recommended setting. You need this setting for all the other image files contained on your site. And with this setting, you will not see any Yoast SEO settings box on your media edit pages.

So, if you’re using recommended Yoast SEO settings, you will need another way to specify your noindex tag on your PDF or document file. This is also the case if you’re not even using Yoast SEO in the first place. Or if you’re on an Apache web host but not using the WordPress content management system.

My solution below uses the .htaccess file and does not care about Yoast SEO or even WordPress. It only needs an Apache host.

Hide (NoIndex) Files Using .htaccess

Here are two sample PDF files. The first one has been noindexed in my .htaccess file while the second one has not.

  • EhiTestNoIndexed.pdf
  • EhiTestIndexed.pdf

And here is the snippet of code added to my .htaccess file to achieve this:

# FilesMatch takes a regular expression, so the dot is escaped and the name is anchored.
<FilesMatch "^EhiTestNoIndexed\.pdf$">
	Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>
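
Because FilesMatch treats its argument as a regular expression, a single file can also be targeted with the plain Files directive, which matches the file name literally (apart from the ? and * wildcards). The following is only a sketch of that alternative, not what is running on this site. It assumes mod_headers is enabled on the host, and the IfModule guard is my own addition so that the block is skipped rather than causing a server error if the module is missing:

```apache
# Sketch: literal match on one file name, no regex escaping needed.
# Assumption: mod_headers is available; the IfModule guard skips the block otherwise.
<IfModule mod_headers.c>
	<Files "EhiTestNoIndexed.pdf">
		Header set X-Robots-Tag "noindex, noarchive, nosnippet"
	</Files>
</IfModule>
```

The trade-off of the guard is that a missing module fails silently, so it is still worth checking the response headers afterwards.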

If you are new to working with the .htaccess file, then you might want to check out my detailed article on the subject: Working With The .htaccess File.

If you wanted to noindex both files, you could get a little fancy with regular expressions like this:

<FilesMatch "^(EhiTestNoIndexed|EhiTestIndexed)\.pdf$">
	Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>
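
The same idea can be stretched to cover whole file types instead of named files, for instance every PDF and Excel file served from the directory. This is a hedged sketch rather than a tested rule from this site: the extension list (pdf, xls, xlsx) is an illustrative assumption, the match is case-sensitive as written (so a file ending in .PDF would not be covered), and it again assumes mod_headers is enabled:

```apache
# Sketch: noindex every file whose name ends in .pdf, .xls, or .xlsx.
# Assumptions: mod_headers is enabled; the extension list is illustrative only.
<IfModule mod_headers.c>
	<FilesMatch "\.(pdf|xls|xlsx)$">
		Header set X-Robots-Tag "noindex, noarchive, nosnippet"
	</FilesMatch>
</IfModule>
```

Placed in the .htaccess file at the site root, a rule like this applies to matching files in that directory and its subdirectories, so it would also catch documents you do want indexed. Use the named-file form when only a few files should be hidden.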

Why Not Just Disallow The Files Using robots.txt?

Because the instructions defined in your robots.txt file do not prevent search engines from indexing a file or web page.

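True, you can stop search engines from crawling a resource using the robots.txt file. But if someone links to your file or page from a third-party website, search engines can still index its URL, because a crawler that is blocked from fetching the file never sees any noindex instruction attached to it. For the same reason, do not disallow a file in robots.txt if you want its noindex header to be honored: the crawler has to be able to fetch the file in order to read the header.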

The X-Robots-Tag directive defined in the .htaccess file as described above is needed in this case.

How Do We Test This?

To test this, we just need to view the response headers of the files. You can do this using the popular Web Developer Chrome extension (there’s also a version for Firefox).

Install the extension if you don’t already have it. Then for the noindexed file, visit this link: EhiTestNoIndexed.pdf

Now click the “Information” tab of the Web Developer Chrome extension and press “View Response Headers”.

[Screenshot: How To Hide (NoIndex) Files In WordPress From Search Engines (View Response Headers)]

The result will be something like this:

date: Sat, 24 Nov 2018 23:11:59 GMT
last-modified: Sat, 24 Nov 2018 23:08:12 GMT
server: Apache/2.4.18
etag: "b881-57b712c8fa5ac"
content-type: application/pdf
accept-ranges: bytes
x-robots-tag: noindex, noarchive, nosnippet
content-length: 47233

200 OK

Notice the line with “x-robots-tag: noindex, noarchive, nosnippet”. This tells search engines that honor the X-Robots-Tag header not to index this file.

If you repeat the same process for the EhiTestIndexed.pdf file, the response headers will look like:

date: Sat, 24 Nov 2018 23:13:23 GMT
last-modified: Sat, 24 Nov 2018 23:08:59 GMT
server: Apache/2.4.18
accept-ranges: bytes
etag: "b991-57b712f5e943b"
content-length: 47505
content-type: application/pdf

200 OK

With no x-robots-tag header in the response, this one can be indexed and may show up in search engines.


Filed Under: Apache, Backend (Server-Side), Linux & Servers, Web Development, WordPress Tagged With: Apache, Linux, SEO, Web Development, WordPress, X-Robots-Tag

About Ehi Kioya

I am a Toronto-based Software Engineer. I run this website as part hobby and part business.

© 2022   ·   Ehi Kioya   ·   All Rights Reserved