Localizing WordPress Themes and Plugins

Now we get to the heart of this guide: localizing a theme or plugin. To demonstrate the techniques involved we will look at a file from the default WordPress theme, which is surprisingly not localized. The file is:

wp-content/themes/default/index.php

Exactly the same process applies to plugins.

Text domains

The first step in preparing your theme or plugin is to decide on a text domain. A text domain separates your localized messages from those of other themes and plugins, and from the rest of WordPress itself. It is simply a text string that identifies your theme or plugin, and is typically the name of the theme or plugin (without any .php extension). Using the default theme as an example, we will choose a text domain of kubrick. Once we’ve chosen the domain we need to tell WordPress about it.

For plugins we insert the following function call in an appropriate part of the plugin file, before any text is output:

load_plugin_textdomain( $domain );

Ideally this will be inserted after all other plugins have loaded to allow the best compatibility. This can be achieved by loading the domain in the init action.

add_action( 'init', 'my_plugin_init' );

function my_plugin_init() {
  load_plugin_textdomain( 'mydomain' );
}

For themes we insert the following function call at the top of functions.php in the theme directory:

load_theme_textdomain( $domain );

Our theme or plugin is now set up to receive a localization. It is important to note that we must specify the text domain whenever we use the WordPress localization functions:

  • __($text, $domain) – Looks for a translated version of $text in text domain $domain and returns the result
  • _e($text, $domain) – Looks for a translated version of $text in text domain $domain and echoes the result to the screen (i.e. effectively it is echo __($text, $domain))

If we do not specify the text domain then WordPress assumes the localization is contained within the core locale files.
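The difference between the two functions can be sketched with the "Not Found" text from index.php. The two stand-in definitions below are assumptions that exist only so the snippet runs outside WordPress; inside WordPress the real __() and _e() are already defined and the stand-ins are skipped.

```php
<?php
// Stand-ins so the sketch runs outside WordPress. They return the
// text untranslated; the real functions look it up in the text domain.
if ( ! function_exists( '__' ) ) {
    function __( $text, $domain = 'default' ) {
        return $text;
    }
}
if ( ! function_exists( '_e' ) ) {
    function _e( $text, $domain = 'default' ) {
        echo __( $text, $domain );
    }
}

// __() returns the string, so use it when building a larger string
// or when passing the text to another function.
$heading = '<h2 class="center">' . __( 'Not Found', 'kubrick' ) . '</h2>';

// _e() echoes directly. Output buffering is used here only so the
// result can be captured and inspected.
ob_start();
_e( 'Sorry, but you are looking for something that isn\'t here.', 'kubrick' );
$paragraph = ob_get_clean();

echo $heading, "\n", $paragraph, "\n";
```

In a template you would normally call _e() directly between HTML tags and reserve __() for function arguments.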

Marking text strings

As we’ve already seen, marking a string is simply a case of wrapping it inside an appropriate WordPress function. Using the following highlighted version of index.php, we will look at individual pieces of text and explain how they should be marked.

<?php get_header(); ?>

<div id="content" class="narrowcolumn">

<?php if (have_posts()) : ?>

	<?php while (have_posts()) : the_post(); ?>

		<div class="post" id="post-<?php the_ID(); ?>">
			<h2>
				<a href="<?php the_permalink() ?>" rel="bookmark" 
					 title="Permanent Link to <?php the_title(); ?>">
					<?php the_title(); ?>
				</a>
			</h2>
			<small>
				<?php the_time('F jS, Y') ?> <!-- by <?php the_author() ?> -->
			</small>

			<div class="entry">
				<?php the_content('Read the rest of this entry &raquo;'); ?>
			</div>

			<p class="postmetadata">
				Posted in <?php the_category(', ') ?> | 
				<?php edit_post_link('Edit', '', ' | '); ?>  
				<?php comments_popup_link('No Comments &#187;', '1 Comment &#187;', 
																	'% Comments &#187;'); ?>
			</p>
		</div>

	<?php endwhile; ?>

	<div class="navigation">
		<div class="alignleft">
			<?php next_posts_link('&laquo; Previous Entries') ?>
		</div>
		<div class="alignright">
			<?php previous_posts_link('Next Entries &raquo;') ?>
		</div>
	</div>

<?php else : ?>

	<h2 class="center">Not Found</h2>
	<p class="center">
		Sorry, but you are looking for something that isn't here.
	</p>
	<?php include (TEMPLATEPATH . "/searchform.php"); ?>

<?php endif; ?>

</div>

<?php get_sidebar(); ?>

<?php get_footer(); ?>

First up is the title that appears when you hover over a link (from line 12):

Permanent Link to 

This uses the function the_title(), which displays the post title, resulting in text such as ‘Permanent Link to My First Post’. Obviously we don’t want to localize every post title, so we need to indicate that the post title is added at run-time. A simple first attempt at localizing this might look like:

<?php _e( 'Permanent Link to', 'kubrick' ); ?> <?php the_title(); ?>

This would indeed produce the correct result of ‘Permanent Link to My First Post’. However, it assumes that the localized language uses the same word order as English and that the subject ‘My First Post’ will follow ‘Permanent Link to’. This is not always the case: another language may need to reorder the words to something like ‘My First Post is a permanent link’. The translator may not even be able to translate the English directly (the language or grammar may not support a similar concept) and could be required to change the meaning of the text dramatically, even to the point of removing the run-time title. This apparently simple line demonstrates the need to be careful about making assumptions based upon your own language.

A better attempt is:

title="<?php printf( __( 'Permanent Link to %s', 'kubrick' ), get_the_title() ); ?>">

Here we use the PHP function printf, which allows us to insert run-time data. We pass this function the localized string ‘Permanent Link to %s’, as well as the run-time value returned by the function get_the_title() (remember that the_title displays the title, while get_the_title returns it so that it can be passed to printf). The printf function then inserts the run-time title where the %s symbol is, resulting in a correct display and also allowing a translator to change the position of the run-time data in a localization:

msgid "Permanent Link to %s"
msgstr "%s is a permanent link"
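The effect of that translation can be tried outside WordPress with a minimal sketch. The fake_translate function and its one-entry catalogue are hypothetical stand-ins for __() and a loaded .mo file; only sprintf is doing real work here.

```php
<?php
// Hypothetical stand-in for __(): looks the message up in a small
// in-memory catalogue instead of a real .mo file.
function fake_translate( $text, array $catalogue ) {
    return isset( $catalogue[ $text ] ) ? $catalogue[ $text ] : $text;
}

// One-entry catalogue mirroring the msgid/msgstr pair above.
$catalogue = array(
    'Permanent Link to %s' => '%s is a permanent link',
);

// In a theme this value would come from get_the_title().
$title = 'My First Post';

// Untranslated: the title lands at the end of the sentence.
$english = sprintf( 'Permanent Link to %s', $title );

// Translated: the same call, but the translator moved %s to the front.
$localized = sprintf( fake_translate( 'Permanent Link to %s', $catalogue ), $title );

echo $english, "\n";
echo $localized, "\n";
```

The calling code is identical in both cases; only the format string changes, which is what gives the translator control over word order.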

Next we have the time display (line 17):

<?php the_time('F jS, Y') ?> 

Localizing time can be a complicated process and will be dealt with separately later. The important thing to remember here is that we need to give the translator the ability to change how the date and time are displayed, as the format is unlikely to be the same as in English.

<?php the_time(__ ('F jS, Y', 'kubrick')) ?> 

A simple line to display the post content (from line 21):

<?php the_content('Read the rest of this entry &raquo;') ?>

In general you should always include punctuation in localized strings as this allows a translator to remove it when not appropriate. In this case the » symbol is included.

<?php the_content(__('Read the rest of this entry &raquo;', 'kubrick')) ?>

Next is a localization that requires some thought:

<?php comments_popup_link('No Comments &#187;', '1 Comment &#187;', '% Comments &#187;'); ?>

The problem here is that while in English there are three variations of the phrase (for none, one, and more than one), some languages have more or fewer variations. The WordPress function comments_popup_link() only allows three to be specified, so how do we work around this? The answer for the first two variations is simple: we do nothing special:

__( 'No Comments &#187;', 'kubrick' ),
__( '1 Comment &#187;', 'kubrick' )

The translator is able to localize these according to their language with no ill effects if they require more or fewer variations. The third case is different and we need to allow additional variations. WordPress provides us with a specific function called __ngettext, details of which are given in the next section. Note that comments_popup_link substitutes the comment count for a bare % itself rather than passing the string through printf, so the placeholder here is % and not %d:

__ngettext( '% comment', '% comments', get_comments_number(), 'kubrick' )

This results in a final localization of:

<?php comments_popup_link( __( 'No Comments &#187;', 'kubrick' ), __( '1 Comment &#187;', 'kubrick' ), __ngettext( '% comment', '% comments', get_comments_number(), 'kubrick' ) ); ?>

Plurals

Plurals are tricky enough in English, and you cannot assume that other languages follow the same rules. Chinese, for example, requires no additional suffixes for plurals, while Czech has very complicated rules. Fortunately WordPress has already thought of this and you can localize plurals using the special function __ngettext (deprecated since WordPress 2.8 in favor of _n, which takes the same arguments):

<?php __ngettext ($single, $plural, $number, $domain); ?>

This function takes three parameters (as well as the domain).

  • $single – The text when $number is 1
  • $plural – The text when $number is anything other than 1 (0, 2, 3 and so on, by English rules)
  • $number – The actual number

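In the following .PO file you can see how this would look in Czech:

msgid "%d window"
msgid_plural "%d windows"
msgstr[0] "%d okno"
msgstr[1] "%d okna"
msgstr[2] "%d oken"

Note how Czech has more cases than English: with the usual Czech plural rule, msgstr[0] is used for 1, msgstr[1] for 2 to 4, and msgstr[2] for everything else (0, 5, 6 and so on).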
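How gettext picks one of the msgstr[n] entries can be sketched in plain PHP. The function names and the hard-coded forms below are illustrative assumptions; the rule is the commonly used Czech Plural-Forms expression, which in a real catalogue lives in the .PO header rather than in your code.

```php
<?php
// Assumed rule, as usually written in a Czech .PO header:
// nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2;
function czech_plural_index( $n ) {
    if ( 1 === $n ) {
        return 0;
    }
    return ( $n >= 2 && $n <= 4 ) ? 1 : 2;
}

// Illustrative translations of "%d window" / "%d windows",
// in msgstr[0], msgstr[1], msgstr[2] order.
$forms = array( '%d okno', '%d okna', '%d oken' );

// Pick the form for $n and insert the number into it.
function czech_windows( $n, array $forms ) {
    return sprintf( $forms[ czech_plural_index( $n ) ], $n );
}

foreach ( array( 0, 1, 3, 5 ) as $n ) {
    echo czech_windows( $n, $forms ), "\n";
}
```

The theme or plugin author never writes this selection logic; it is shown only to make clear why the code must pass the actual number and let the catalogue choose the form.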

Dates and times

Dates and times are one of the most tricky aspects of localization due to the variety of words, formatting, and positioning. Let’s look at a few possibilities to show how complicated the situation is.

  • US English – 4/28/2007
  • UK English – 28/4/2007
  • Chinese – 2007年4月28日

We can see that not only do we need to translate the words, but we also need to be able to change the position and change the formatting.

Most dates will be displayed using the PHP function date. This allows the date to be configured according to various format settings (see the PHP date page for full details). As such, any date format string should always be localized to allow a translator to modify it:

date( __( 'l, F jS, Y', 'myplugin' ) );

However, date always outputs English day and month names, whatever locale the host is configured for. To get around this problem WordPress provides an additional function that accepts a date format string and a MySQL-format date, and internally translates any day or month strings for you:

mysql2date( __( 'l, F jS, Y', 'myplugin' ), $post->post_date );
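Why the format string itself needs to be translatable can be seen without WordPress. In this sketch the $formats array is a hypothetical stand-in for what three translators might supply for the same format string, and gmdate is used so that the output does not depend on the server time zone.

```php
<?php
// 28 April 2007, midnight UTC.
$timestamp = gmmktime( 0, 0, 0, 4, 28, 2007 );

// Hypothetical per-locale format strings, as translators might
// return them for the msgid 'n/j/Y'.
$formats = array(
    'en_US' => 'n/j/Y',
    'en_GB' => 'j/n/Y',
    'zh_CN' => 'Y年n月j日',
);

$dates = array();
foreach ( $formats as $locale => $format ) {
    // Characters that are not date format codes are copied through as-is.
    $dates[ $locale ] = gmdate( $format, $timestamp );
    echo $locale, ': ', $dates[ $locale ], "\n";
}
```

Numeric formats like these work with plain PHP; formats that include day or month names still need a WordPress function to translate the names.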

Numbers and currencies

Like dates and times, numbers and currencies are also very tricky. For example:

  • 1,000,000.53
  • 1.000.000,53
  • 1000000.53
  • $4.00
  • £4.00
  • 4 HK$

If you are displaying a lot of numbers and it is important to have the correct separators and currency formatting then you will need to devise your own method of allowing the user to configure the output. Typically this will include:

  • Number separator (1,000.50 or 1.000,50)
  • Currency symbol ($, £, ¢ etc)
  • Currency position (before, after, with or without space)

It should be noted that PHP provides a useful function number_format which will take a number and format it according to the parameters you specify:

number_format (1000.50, 2, '.', ',');

This formats the number 1000.50 to two decimal places using a period as the decimal separator and a comma to separate thousands. The function returns the string rather than displaying it; the returned value is 1,000.50.
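One possible shape for such a configurable method is sketched below. The $settings keys and the format_money function are hypothetical names chosen for illustration; in a plugin the values would typically come from an options page rather than being hard-coded.

```php
<?php
// Hypothetical per-site settings a plugin might let the user configure.
$settings = array(
    'decimal_point' => ',',
    'thousands_sep' => '.',
    'symbol'        => '€',
    'symbol_after'  => true, // "4,00 €" rather than "€4,00"
    'symbol_space'  => true, // put a space between number and symbol
);

// Format an amount according to the configured separators and
// currency symbol position.
function format_money( $amount, array $s ) {
    $number = number_format( $amount, 2, $s['decimal_point'], $s['thousands_sep'] );
    $gap    = $s['symbol_space'] ? ' ' : '';

    return $s['symbol_after']
        ? $number . $gap . $s['symbol']
        : $s['symbol'] . $gap . $number;
}

echo format_money( 1000000.53, $settings ), "\n";
```

Keeping all three choices (separators, symbol, position) in one settings array means the display code never has to know which convention is in use.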

Word order

Word order is not usually an issue until you start inserting multiple values at run-time. Consider the following phrase:

Today is %s and the weather is %s

This seems innocent enough. However, in another language the word order may need to be reversed. Without any additional help this would be impossible: even if we translated this to ‘The weather is %s and today is %s’, the data that is inserted at run-time would still be in the wrong order (we would end up with ‘The weather is Sunday and today is sunny’).

To cater for this you can use numbered placeholders. A numbered placeholder is similar to the existing printf-style strings, but also states which run-time parameter it refers to:

Today is %1$s and the weather is %2$s

Here we have defined that the first runtime parameter will contain the day, and the second runtime parameter will contain the weather. This can then be translated into whatever order is needed:

The weather is %2$s and today is %1$s

This time the correct run-time data will be inserted into the appropriate position.
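The two strings above can be checked directly with sprintf. This is plain PHP with no WordPress involved; the variable names are only for illustration.

```php
<?php
$day     = 'Sunday';
$weather = 'sunny';

// Single quotes matter here: inside double quotes PHP would try to
// interpolate $s as a variable.
$english    = sprintf( 'Today is %1$s and the weather is %2$s', $day, $weather );
$translated = sprintf( 'The weather is %2$s and today is %1$s', $day, $weather );

echo $english, "\n";
echo $translated, "\n";
```

The arguments are passed in the same order both times; the numbers in the format string decide where each one appears.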

JavaScript

Text messages in JavaScript also need to be localized. Without resorting to dynamically-generated JavaScript we can use a very simple technique whereby all text is removed from the JavaScript and replaced with string variables. These variables are then generated at run-time using PHP.

For example, you may have a JavaScript file that includes this function:

function show_result () {
  alert ('Your message was successfully received');
}

First we replace the text with a string variable:

function show_result () {
  alert (plugin_result_message);
}

Then in the HEAD section of the page we can generate the value at run-time, passing the text through the localization function first:

var plugin_result_message = '<?php echo esc_js( __( 'Your message was successfully received', 'myplugin' ) ); ?>';
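When a script needs several messages, one option is to emit them as a single JavaScript object. The sketch below uses PHP's json_encode for the escaping; the function plugin_js_strings, the variable name plugin_l10n and the second message are hypothetical, and the __() stand-in exists only so the snippet runs outside WordPress.

```php
<?php
// Stand-in so the sketch runs outside WordPress; the real __()
// would return the translated text.
if ( ! function_exists( '__' ) ) {
    function __( $text, $domain = 'default' ) {
        return $text;
    }
}

// Build one JavaScript statement holding every localized message.
function plugin_js_strings() {
    $strings = array(
        'result' => __( 'Your message was successfully received', 'myplugin' ),
        'error'  => __( 'Sorry, your message could not be sent', 'myplugin' ),
    );

    // json_encode quotes and escapes each string so that it is a
    // valid JavaScript literal.
    return 'var plugin_l10n = ' . json_encode( $strings ) . ';';
}

echo '<script type="text/javascript">', plugin_js_strings(), '</script>', "\n";
```

The JavaScript file would then call alert( plugin_l10n.result ). WordPress also provides wp_localize_script, which attaches an array of strings like this to an enqueued script.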

Closing thoughts on localization

The most important rule when localizing a file is to never assume that a localization will use the same rules as yours. If you are adding text at run-time then always give the translator the ability to change its location.

Also note how we only mark text that is shown to the user and we don’t bother to mark text that is used internally. This is especially important with plugins, and you do not have to localize HTML field names, database columns, or JavaScript function names. However, you do need to localize form labels.

48 comments

  1. This is a great tutorial! After reading it I’m considering to internationalize my blog theme.

    However, the instructions about load_plugin_textdomain are incompatible with Gengo, a Compatibility Page:

    Just like the code that adds widget-capabilities to plugins, calls to load_plugin_textdomain cannot be made immediately. Plugins must call load_plugin_textdomain inside a function that runs on the ‘init’ hook, or at the earliest, the ‘plugins_loaded’ hook. Plugins that do not do this are coded incorrectly, according to advice from WordPress core developers.

  2. Hi Leonardo, I’ve updated the guide to reflect this. While it may cause incompatibility with Gengo, the method is only a suggestion by the WordPress developers and not a requirement (according to the Codex). Still, it is better to show the ideal method!

  3. I managed to create a pot file (this step is missing in the tutorial), translated it and compiled the po file into a mo file. "F jS, Y" is translated as "jS \\d\\e F \\d\\e Y" and I get times like "5th de April de 2007". PHP or WordPress aren’t translating "5th" to "5º" and "April" to "abril". I did set my browser and wp-config.php to pt_BR. What else should I do to get "5º de abril de 2007"?

  4. Leonardo,

    Producing POT files is covered in Translating WordPress Themes & Plugins, this article is just concerned with how to put the appropriate PHP code into a theme or plugin.

    How are you displaying the date in your code? For WordPress to replace the months you must be showing the date through a WordPress function (and not just a PHP function). I’m not sure that PHP or WordPress would convert ‘5th’ to ‘5º’

    %d comment should be correct!

  5. Hello, John!

    Too bad I found out that article only after commenting here (and googling for the solution). It explains the procedure very well, thanks!

    About the time format, I’m using (get_)the_time. If there’s better fitting solution, please let me know! Ideally, I’d use a function which already knows which time formats are used in any locale.

    On %d and %, I don’t know why, but only % worked for me. From my background as a free software translator, I’m used to %d, %s and even {} but never knew about a %.

  6. If you are using that function then WordPress should convert months. Is the locale actually being loaded? Does the locale translate the month strings?

  7. In fact, WordPress does translate month names correctly in my blog, but not in my local test site (so that’s my fault). But the "th" isn’t properly translated, and neither does WordPress seem know the proper date format for each locale. Is there anything I can do about the two last issues?

  8. Hello, that’s me again. I’m having a hard time with the printf, because many template tags display the result directly, instead of returning to printf. In example:

    '.comment_author_link().''); ?>

    In this line, we get comment_author_link then ' Says:'

  9. I’ll place some extra spaces to preserve the code:
    < ?php printf (__ ('%s Says:', 'aqLite'), ''.comment_author_link().'te>'); ? >
    And:
    < ?php comment_author_link() <ci te></ci te> Says: ? >

  10. Leonard,

    You were correct about the __ngettext example! The reason is that the % is being passed into a special WordPress function, not printf. The WordPress function requires % and not %d.

    I don’t believe WordPress translates the ordinal suffix at all, and even PHP itself says that the ‘S’ date modifier is for English suffixes only. It looks like you’ll have to invent your own method to work around this!

    Most WordPress functions have two versions – one that echos and one that returns. If you using the data inside printf then you need the return version. In the case of comment_author_link you will need to use get_comment_author_link.

  11. About the suffix, I really can live without it. In Brazil we use it only for the first day of the month, even if (as far as I know) it’s not a formal rule.

    Thank you for your better-than-codex guidance! I’ll try it as soon as I get some more time for the WordPress theme, and then I’ll give you some feedback.

  12. Hello again, Jonh!

    I’m making progress on the theme i18n, and it should be released soon. The almost universal get_ tips fixed most of my issues!

    Unfortunately, there’ no get_comments_rss_link; and get_comment_type doesn’t let us specify the strings for comment, pingback and trackback as comment_type does. I just discovered this functions (in WordPress source code) are quite simple, so I created similar ones to match my needs.

    I also made sprintf and __ngettext work together. I’ll send you example code, as I believe the article could show it. I’ll use the contact form, because the comment form restricts code.

  13. Thank you so much for sharing this. I would be grateful if you can cover also the bi-di languages (Arabic, Urdu, Farsi, Hebrew, etc) and how to flip and reverse the whole layout to be right to left. That would be great and I think it will add a lot to your efforts to reach broader audience.

    Thanks again
    Mohamed

  14. John,

    Thanks for the article–it got me started nicely on internationalising my plugins–but I very quickly hit a brick wall. I can’t get xgettext to work on my Windows development machine. It doesn’t seem to recognise PHP files at all. Someone told me the win versions were rather out of date. Do you know of any ways round this road-block? Thanks for your help.
    Rob

  15. A great tutorial indeed, thank you so much!. I’m using it for internationalization of the excellent TMA theme (The Morning After). Sadly, the tutorial doesn’t seem to explain the ngettext thingie right. In TMA a similar function (comments_popup_link) is often called, but when I change the relevant passage the way you suggest, it gets messed up. Precisely, I change


    into

    When there is one comment or two – all gets translated just fine. But poedit doesn’t see the other possibilities (3 comments, 4 comments, and so on) and where there are three comments or more – no text is visible at all. Any ideas?

  16. BTW, the xhtml code function doesn’t seem to work on your site, contrary to what a tooltip above the comment area says.

  17. Bah! I sorted it out finally. For some strange reason PoEdit couldn’t handle my language settings properly and when I __ngettex the %, it returned errors and added some strange referral to #, php-format file, even though there’s no such file in the theme I’m translating (and returned an error when saving).

    However, I finally decided to do it the hard way, opened the .po file in Notepad ++, then added the following “by hand”

    #: home.php:42
    #: single.php:44
    msgid “% comment”
    msgid_plural “%d comments”
    msgstr[0] “0 komentarz”
    msgstr[1] “% komentarze”
    msgstr[2] “% komentarzy”

    Then I reloaded the file in PoEdit (but without reloading the files from the folder) and saved – without any problems this time. Just don’t trust PoEdit and you’ll be fine 🙂

  18. Great article, was very helpful for localizing a wordpress theme! Unfortunately, there are some typos in the code examples (spaces before brackets), that took some extra time to find and correct to get the code working. You might want to correct those. Peace.

  19. Hello there,
    I need your help on this topic.
    I’ve already translated the whole theme (Arras 1.4) and MOST of the content, when i load the page on the second language, is translated.
    I say most because some words like: “Featured Stories”, “Latest Headlines” and the “Home” menu link remain untouched, which is kinda confusing because i translated them as well, just like all other strings.

    Can anyone help me?
    thx in advance,
    Onco

  20. I understand how to mark my theme and what the role of the pot, po and mo files is – in theory. In reality, the article says to run “a localization tool” to generate them. Great. Care to tell us what “a tool” is?

    I also read the WordPress Codex pages and they refer to a GNU gettext tool. Had a look at their site http://www.gnu.org/software/gettext/ and it’s clear they went to great lengths to make it as confusing as possible for normal people to get this to work.

    I’m not a GNU/Linux/SVN/computer science expert. I’m just a blogger with basic programming skills, want to localize my blog and move on with my live. Can someone please explain in “idiot English” what tools I need and how to run them? Thanks, I’d really appreciate it!

  21. Hi John, nice article, just the Czech language example is not quite right, it should be:
    0 oken
    1 okno
    2 okna

    in fact, it is:
    0, 5, 6, 7, 8, 9, 10, 11… oken
    1 okno
    2, 3, 4 okna

    btw this is only one of many types of declension (there are 4 types for neutral, 4 types for feminine and 6 types for masculine nouns, each declension has 7 cases for singular and 7 cases for plural from) – FUN! – no wonder most Czech people are unable to use correct Czech.
