WordPress: Locking Yourself Out of Syntax Highlighter

As much as we all hate shortcodes in WordPress, there are times we simply have to use them, as with the SyntaxHighlighter Evolved plugin, which is definitely one of the best out there. But what happens when you don’t want to highlight your source code any longer? What happens if you want to disable the plugin?

What happens to those hundreds (or maybe thousands) of posts with code inside? That’s right. Syntax Highlighter “locks you in” to using it and if you deactivate it, all your shortcodes are rendered in plain text. Today we’re going to talk about “locking yourself out” of using the syntax highlighter plugin without breaking your content.

First things first: don’t delete or deactivate the plugin until you’re absolutely sure that your content will not suffer. To quickly find the posts that use the syntax highlighting plugin, search for [code or, if you've been using the language tags (which is even worse), for [php and so on.
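If you already have the post content (or a whole database dump) as text, a few lines of Python can tally which shortcode tags actually appear, which helps you decide what to search for. This is only a sketch: the function name `count_shortcode_tags` and the default tag list are made up for illustration and are not part of the plugin.

```python
import re
from collections import Counter

# Hypothetical default list; extend it with the language tags your blog uses.
DEFAULT_TAGS = ("code", "php", "javascript", "python")

def count_shortcode_tags(text, tags=DEFAULT_TAGS):
    """Count opening shortcodes such as [code], [code language="php"] or [php]."""
    names = "|".join(re.escape(tag) for tag in tags)
    # Tag name, then optional whitespace-separated attributes, then "]".
    pattern = re.compile(r"\[(" + names + r")(?:\s[^\]]*)?\]")
    return Counter(match.group(1) for match in pattern.finditer(text))
```

Feed it the contents of a dump file or a single post body; tags with a count of zero can be left out of your search-and-replace.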

You can manually edit each and every post if you have only a few. Basically your ultimate goal is to transform this:

[code language="php"]
$a = array( 1, 2, 3 );
$b = array_pop( $a );
[/code]

into a preformatted block, which most available WordPress themes support. It doesn't do any highlighting but usually uses a fixed-width font, which is perfect for displaying code. Preformatted blocks are rendered with the pre HTML tag. The code above should look like this:

<pre>
$a = array( 1, 2, 3 );
$b = array_pop( $a );
</pre>
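The same transformation can be sketched in Python on a single string, which is handy for trying patterns out before touching real content. The name `shortcode_to_pre` is hypothetical, and the sketch simply drops any attributes such as language="php" rather than carrying them over.

```python
import re

# Matches [code] as well as [code language="php"]; attributes are discarded.
OPEN_CODE = re.compile(r"\[code(?:\s[^\]]*)?\]")
CLOSE_CODE = re.compile(r"\[/code\]")

def shortcode_to_pre(text):
    """Replace [code ...] and [/code] shortcodes with <pre> and </pre>."""
    text = OPEN_CODE.sub("<pre>", text)
    return CLOSE_CODE.sub("</pre>", text)
```

Calling it on the shortcode example above should give the pre block shown right after it.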

So basically all we have to do is some "search and replace" magic (regular expressions) across your posts' content. There are several ways to do that:

  • Download a database dump, search and replace locally with TextMate, Notepad++ or others
  • Create a database dump and search and replace via sed without downloading the file
  • Use a PREG_REPLACE SQL statement (which is available as a user-defined function for MySQL) and do the replacement on a live database

The last one might be pretty complicated since it involves building and installing a UDF for MySQL (but if you would really like to, here you go). The first and second methods do the same thing; while the first could be a little easier for beginners, the second will be much faster once you get used to sed.

sed is a command-line stream editor available on most Linux distributions and, luckily, it can find and replace text using regular expressions. We'll be running our database dump file through sed on the command line, so make sure you're logged in via SSH and have backed up your database in a safe place, just in case things go wrong ;)

mysqldump -uusername -ppassword database_name > temp.sql
sed -i 's/\[code[^]]*\]/<pre>/g' temp.sql
sed -i 's/\[\/code\]/<\/pre>/g' temp.sql
mysql -uusername -ppassword database_name < temp.sql

The first line creates a database dump file called temp.sql with all the data from database_name. Make sure you provide your own username and password. The second and third lines search for the opening and closing code shortcodes and replace them with pre HTML tags. Finally, the last line pushes the result back to MySQL.
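A word of caution when writing patterns for a dump file: by default mysqldump packs many rows into one long INSERT line, so a single line can hold dozens of shortcodes. A greedy `.*` runs to the last closing bracket on that line, while a negated class such as `[^]]*` stops at the first one. The Python sketch below shows the difference on a made-up sample line; Python's regex syntax differs a little from sed's, but the greedy behaviour is the same.

```python
import re

# Made-up sample: two rows on one line, the way a dump usually stores them.
line = "('[code language=\"php\"]one[/code]'),('[code]two[/code]')"

# Greedy: ".*" swallows everything up to the LAST "]" on the line.
greedy = re.sub(r"\[code.*\]", "<pre>", line)

# Bounded: "[^\]]*" cannot cross a "]", so each opening tag matches on its own.
bounded = re.sub(r"\[code[^\]]*\]", "<pre>", line)

print(greedy)   # ('<pre>')
print(bounded)  # ('<pre>one[/code]'),('<pre>two[/code]')
```

The greedy version silently destroys both posts, which is exactly why testing on a copy of the dump first is worth the extra minute.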

If you're using language shortcodes instead of the code shortcode, you'll have to list every language you have code written in on your blog. Here's an example for php, javascript and python:

sed -i 's/\[\(php\|javascript\|python\)\]/<pre>/g' temp.sql
sed -i 's/\[\/\(php\|javascript\|python\)\]/<\/pre>/g' temp.sql
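If the list of languages gets long, it may be easier to build the alternation from a list than to type it by hand. Here is a minimal Python sketch of that idea; `language_shortcodes_to_pre` is a hypothetical name, and unlike the sed lines above it also accepts attributes after the tag name.

```python
import re

def language_shortcodes_to_pre(text, languages):
    """Replace [php], [/php], [python], ... with <pre> and </pre>."""
    # Escape each name so it is matched literally inside the pattern.
    names = "|".join(re.escape(name) for name in languages)
    opening = re.compile(r"\[(?:" + names + r")(?:\s[^\]]*)?\]")
    closing = re.compile(r"\[/(?:" + names + r")\]")
    text = opening.sub("<pre>", text)
    return closing.sub("</pre>", text)
```

Tags that are not in the list are left untouched, so you can run it in several passes as you discover more languages.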

Oh and here's how this looks in TextMate:

[Screenshot: Search and Replace]

The regular expressions themselves differ a little from tool to tool, since some require escaping and others don't. PREG_REPLACE, for example, requires double-escaping, so you'll see a lot of backslashes there:

UPDATE wp_posts SET post_content =
    PREG_REPLACE('/\\[code.*?\\]/', '<pre>', post_content)
    WHERE post_content LIKE '%[code%';

UPDATE wp_posts SET post_content =
    PREG_REPLACE('/\\[\\/code\\]/', '</pre>', post_content)
    WHERE post_content LIKE '%[/code]%';

So make sure you read up on how regular expressions work in the software you're using, and test on dummy files before making any changes to your database.

Hopefully this gives you a good start for locking yourself out, not only of the syntax highlighting plugin but of other plugins that rely heavily on shortcodes.

As for SyntaxHighlighter Evolved, I think they should have used the same technique as WP-Syntax, which works with the pre tag simply by looking at its attributes. So if you decide to stop using WP-Syntax at some point, you won't end up with obsolete shortcodes; your pre tags will work fine:

<pre lang="php">
$a = array( 1, 2, 3 );
$b = array_pop( $a );
</pre>

However, you should also keep in mind that syntax highlighter plugins usually encode entities inside the code shortcodes, so replacing them with pre might break things. I wrote a little Python script to encode entities between pre tags inside a database dump file, but anyway, as Alex Mills (author of SyntaxHighlighter Evolved) points out on Twitter:

@ Probably easier to just write a stub plugin to do shortcode -> pre.
@Viper007Bond

Which is true, but you'll end up with yet another plugin to carry around together with your content. Alex is right about usability too:

@ I try my best not to lock people into my plugins but sometimes I sacrifice that for the sake of usability.
@Viper007Bond
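Coming back to the entity problem mentioned earlier: the idea of encoding entities between pre tags can be sketched in a few lines of Python. To be clear, this is not the script referred to above, just a minimal illustration with a made-up function name, `encode_pre_entities`. It assumes pre blocks are not nested and that their contents have not been encoded already (running it twice would double-encode).

```python
import html
import re

# An opening <pre ...> tag, the block body, and the closing </pre>.
PRE_BLOCK = re.compile(r"(<pre(?:\s[^>]*)?>)(.*?)(</pre>)", re.DOTALL)

def encode_pre_entities(text):
    """Encode &, < and > inside <pre> blocks, leaving everything else alone."""
    def encode(match):
        opening, body, closing = match.groups()
        # quote=False leaves quotes untouched, so the escaped quotes in a
        # SQL dump are not altered.
        return opening + html.escape(body, quote=False) + closing
    return PRE_BLOCK.sub(encode, text)
```

Text outside the pre blocks passes through unchanged, so regular post markup such as paragraphs and links keeps working.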

Thank you for stopping by and hope you enjoyed the post. Let us know if there's something that we've missed, or perhaps we didn't use the best regular expressions? Give us a shout in the comments section and come say "hi" on Twitter!