If you work with WordPress often, and especially if you frequently do website migrations, you may at some point encounter the problem of weird characters in WordPress. These funky characters (like â€œ and â€) may suddenly start appearing in your WordPress posts and comments content.
Somehow during your website migration process, you inadvertently converted every ellipsis, quotation mark, en dash, em dash, and other punctuation marks and special characters into unrecognizable text. If you’re not familiar with the problem, it can be quite irritating. And for a high-traffic website with lots of content, this is an issue that needs quick attention, because the weird characters usually show up everywhere.
This article explains why this may happen and offers tested solutions to fix the problem.
What Causes The Appearance Of Weird Characters In WordPress?
Before I answer that question, let’s talk a little about character encoding.
Character encoding tells the computer how to translate raw zeroes and ones into real characters. You will also see it referred to as a character set or character map.
Character sets define how alphabets of different languages are handled and presented by computers. One of the most common character sets is UTF-8.
So… What causes the appearance of weird characters in WordPress?
By default, WordPress stores your data in its database using the UTF-8 character encoding. When you export and import your database as part of a website migration, one of the tools that touches the raw database file may read or save it in a different encoding (something like ISO-8859-1 or Windows-1252 instead of UTF-8) without your knowledge. The bytes of each multi-byte UTF-8 character then get displayed as two or three separate characters.
In my experience, this often happens if for some reason you open up the database file in a text editor. [So, lesson for the future: When doing migrations, try to avoid opening database files in text editors!]
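To make the mechanism concrete, here is a minimal Python sketch. It assumes the damage comes from UTF-8 bytes being read as Windows-1252 (Python codec name `cp1252`), which is consistent with the € sign in the weird sequences; the helper names `garble` and `ungarble` are made up for this illustration.

```python
# Assumption: the damage comes from UTF-8 bytes being read as Windows-1252
# (Python codec name "cp1252"). Other code pages produce different debris.


def garble(text: str) -> str:
    """Reproduce the bug: UTF-8 bytes decoded with the wrong code page."""
    return text.encode("utf-8").decode("cp1252")


def ungarble(text: str) -> str:
    """Undo the bug: recover the original bytes, then decode them as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    original = "\u2018quoted\u2019 text\u2026"  # curly single quotes and an ellipsis
    broken = garble(original)
    print(broken)            # each punctuation mark becomes three characters
    print(ungarble(broken))  # back to the original punctuation
    assert ungarble(broken) == original
```

One caveat with this sketch: Python's `cp1252` codec has no character for byte 0x9D, so `garble("\u201d")` raises `UnicodeDecodeError`. That same gap is a likely reason the right double quote tends to show up as a bare `â€` with nothing visible after it.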
When this happens, more often than not, the weird text you’ll find is what remains of punctuation marks that have been messed up. So, if we can identify the usual characters that get messed up when WordPress character encoding changes, we will be one step closer to fixing the problem.
Here’s a translation table to help with this:
|Weird Character|Friendly Name|Correct Character|
|---|---|---|
|â€œ|Left (opening) double quote|“|
|â€|Right (closing) double quote|”|
|â€˜|Left (opening) single quote|‘|
|â€™|Right (closing) single quote|’|
|â€“|En dash (short dash)|–|
|â€”|Em dash (long dash)|—|
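Before changing anything, it can help to check which of these sequences actually occur in a piece of exported content. The following is a rough sketch, not a complete detector: it derives each garbled form by encoding the correct character as UTF-8 and decoding it as Windows-1252 (an assumption about how the damage happened), and the names `PUNCTUATION`, `garbled_form`, and `count_garbled` are hypothetical.

```python
from typing import Dict

# Friendly name -> correct character, following the translation table.
PUNCTUATION = {
    "left double quote": "\u201c",
    "right double quote": "\u201d",
    "left single quote": "\u2018",
    "right single quote": "\u2019",
    "en dash": "\u2013",
    "em dash": "\u2014",
}


def garbled_form(char: str) -> str:
    """UTF-8 bytes of `char`, read as cp1252; undefined bytes are dropped."""
    decoded = char.encode("utf-8").decode("cp1252", errors="replace")
    return decoded.replace("\ufffd", "")


def count_garbled(text: str) -> Dict[str, int]:
    """Count occurrences of each garbled sequence in `text`."""
    forms = {name: garbled_form(ch) for name, ch in PUNCTUATION.items()}
    counts = {}
    # Longest sequences first: the right double quote collapses to the bare
    # two-character prefix, which would otherwise match inside the others.
    for name, seq in sorted(forms.items(), key=lambda kv: len(kv[1]), reverse=True):
        counts[name] = text.count(seq)
        text = text.replace(seq, " ")
    return counts


if __name__ == "__main__":
    sample = garbled_form("\u201c") + "Hi" + garbled_form("\u201d")
    print(count_garbled(sample))
```

Counting the longer sequences first matters because the garbled right double quote is a prefix of every other entry; the same ordering concern applies to any find-and-replace run on these sequences.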
Cleaning Up The Database: Manual Method
Now that we’ve identified the possible cause of the problem, let’s go ahead and fix it. We will run find-and-replace operations on the database, using the translation table above as a guide. It’s one simple query that you’ll run multiple times, once per weird character sequence.
So, fire up your database management tool (I like HeidiSQL) and get ready to run queries.
The solutions below will permanently update your WordPress database tables. So, before you proceed with any of these solutions, be sure to make a backup of your website’s database.
The tables likely to contain the most gibberish text are the wp_posts and wp_comments tables (assuming you’re using the default WordPress table prefix of wp_).
We will run UPDATE queries on each table, searching for a particular set of weird characters and replacing them with the corresponding valid text.
Here’s the syntax:
UPDATE table_name SET column_name = REPLACE(column_name, 'weird character', 'valid character');
For the wp_posts table, we will be working with the post_content column. Here’s a list of queries you can run on the wp_posts table to fix all the weird character mess in there:
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€œ', '“');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€˜', '‘');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€™', '’');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€“', '–');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€”', '—');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€¢', '-');
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€¦', '…');
-- Run this one last: 'â€' is a prefix of every sequence above and would otherwise clobber them.
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€', '”');
And here are the queries to fix the wp_comments table. This time we will be updating the comment_content column.
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€œ', '“');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€˜', '‘');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€™', '’');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€“', '–');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€”', '—');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€¢', '-');
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€¦', '…');
-- Run this one last: 'â€' is a prefix of every sequence above and would otherwise clobber them.
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'â€', '”');
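You probably won’t need to clean up any other tables. In my experience, just cleaning up the above two tables fixes the problem. However, if you do notice corrupt options showing up as well in your admin area, you may need to clean up the wp_options table (or any other affected tables) in a similar way. Be careful there: wp_options often holds serialized PHP data, and a plain REPLACE() that changes the length of a string can corrupt it, so the plugin method described later in this article is the safer choice for that table.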
Just take a look at the syntax reference above and make changes to correspond with the specifics of your own database, and you should be up and running.
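If you need the same set of queries for several tables or columns, generating them can be less error-prone than editing them by hand. Below is a small sketch of that idea; `build_updates` and `sql_quote` are hypothetical helpers written for this article, and the `post_title` column is only an example target. It prints the statements for you to review and run yourself, and it sorts the longest search strings first so that the bare `â€` replacement always comes last.

```python
import re
from typing import Iterable, List, Tuple

# Only plain table/column names are accepted, since they are pasted into SQL.
IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def sql_quote(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def build_updates(table: str, column: str,
                  pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Return one UPDATE ... REPLACE statement per (garbled, correct) pair."""
    for name in (table, column):
        if not IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Longest search strings first, so a short prefix cannot clobber the
    # longer sequences that start with it.
    ordered = sorted(pairs, key=lambda pair: len(pair[0]), reverse=True)
    return [
        f"UPDATE {table} SET {column} = "
        f"REPLACE({column}, {sql_quote(bad)}, {sql_quote(good)});"
        for bad, good in ordered
    ]


if __name__ == "__main__":
    pairs = [("â€", "”"), ("â€œ", "“"), ("â€™", "’")]
    for statement in build_updates("wp_posts", "post_title", pairs):
        print(statement)
```

The generated text is meant to be pasted into your database tool after a backup; the script itself never connects to a database.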
Cleaning Up The Database: The Plugin Method
If you would rather not mess with your database directly and would prefer to somehow fix the problem directly from your WordPress admin dashboard, then you should use this plugin method.
The Better Search Replace plugin can save you all the stress. Boasting more than 700,000 downloads (as of this writing) and a rating of somewhere around 4.5 stars, it is a pretty solid plugin that claims to have incorporated top features from several search-and-replace plugins.
Among other features, Better Search Replace is able to run a test find-and-replace operation on your entire WordPress database to see how many fields (if any) will need to be updated.
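Conceptually, a test run like this amounts to counting the matches without writing anything. The sketch below illustrates the idea only; it is not the plugin's code. It uses Python's built-in `sqlite3` module with a throwaway in-memory table as a stand-in for a real WordPress database, and `dry_run` and `demo_connection` are hypothetical names.

```python
import sqlite3

# The garbled right single quote, written with escapes to avoid copy/paste
# damage; it displays as an a-circumflex, a euro sign and a trademark sign.
GARBLED_APOSTROPHE = "\u00e2\u20ac\u2122"


def dry_run(conn: sqlite3.Connection, table: str, column: str, search: str) -> int:
    """Count the rows that contain `search`; nothing is modified."""
    # table and column are interpolated directly, so pass trusted names only.
    query = f"SELECT COUNT(*) FROM {table} WHERE instr({column}, ?) > 0"
    return conn.execute(query, (search,)).fetchone()[0]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table that mimics wp_posts (made-up sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
    conn.executemany(
        "INSERT INTO wp_posts (post_content) VALUES (?)",
        [
            ("It" + GARBLED_APOSTROPHE + "s broken",),
            ("All good here",),
            ("Also broken" + GARBLED_APOSTROPHE,),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    affected = dry_run(conn, "wp_posts", "post_content", GARBLED_APOSTROPHE)
    print(f"{affected} row(s) would be updated")
```

Seeing the count first gives you a chance to spot a search string that matches far more, or far fewer, rows than you expected before committing to the real replacement.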
You can also use the plugin to change URLs after migrating a WordPress installation. But let’s just focus on cleaning up the weird characters in WordPress.
Using The Better Search Replace Plugin
After installing and activating the plugin, a new “Better Search Replace” menu option will be added to the sidebar menu under Tools.
Under the Search/Replace tab, you can search for and replace all the weird characters. You can also select the specific tables on which you want the search-and-replace operation to run. Again, don’t forget to back up your database before using this.
Assuming you are under the Search/Replace tab, in the Search for field, enter the character you’d like to find (the weird character). Then, in the Replace with field, enter the text with which to replace the weird text.
In the Select tables field, select the tables where you want to perform the search-and-replace operation. Likely, they would be the same tables on which we ran the queries from the section above.
If you want to perform a case-sensitive search-and-replace operation, leave the Case-Insensitive checkbox unchecked (searches are case-sensitive by default). You may also want to leave the Replace GUIDs checkbox unchecked. And unless you are just testing out the plugin, uncheck the Run as dry run checkbox.
Click the Run Search/Replace button. The weird characters will be replaced as instructed. Easy.