Common translation errors in Moodle language packs


If you have ever had to deal with language packs, you will know how much of a pain they can be. I regularly work with nine Moodle language packs, and I don’t enjoy it much. For some reason (I can’t remember why now), we gave people the raw language string files (PHP code files) to translate, and they translated these files and sent them back. But these people are not PHP coders, and they make mistakes. Here is my list of what goes wrong and how to fix it.


*Incorrect syntax*
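Translators edit the files and send them back to us, but along the way they delete a semicolon or something similar. Check all language pack files with the following shell script, which runs the PHP syntax check (php -l) on each one.

#!/bin/bash
# Run PHP's syntax check on every PHP file in the current directory.
for i in *.php; do
    php -l "$i"
done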

On Windows, try:

@echo off
FOR %%A IN (*.php) DO php -l %%A

*UTF8 Header*
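This is the byte order mark (BOM), three bytes (EF BB BF) that certain editors prepend to the start of a file. As the header comes before the <?php tag, it gets sent to the browser as output, and this will do all sorts of nasty things like corrupting user sessions, because PHP can no longer send HTTP headers (including the session cookie) once output has started. The solution is to strip UTF-8 headers from all files.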

I use a script to add and strip UTF8 headers from entire directories.
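
A minimal sketch of the stripping half of such a script, assuming a POSIX shell with head, tail and od available. The function names has_bom, strip_bom and strip_bom_dir are invented for this example; they are not part of Moodle, and this is not the script mentioned above.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Succeeds if the file starts with the three-byte UTF-8 BOM (EF BB BF).
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Rewrites the file without its first three bytes, but only if they are a BOM.
strip_bom() {
    if has_bom "$1"; then
        tail -c +4 "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    fi
}

# Strips the BOM from every .php file directly inside the given directory.
strip_bom_dir() {
    for f in "$1"/*.php; do
        [ -e "$f" ] || continue   # the glob matched nothing
        strip_bom "$f"
    done
}
```

Calling strip_bom_dir PATH_TO_DIR should leave files without a BOM untouched, so it ought to be safe to run more than once.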

*Translated variable names*
When translators see something like $a->name, they sometimes translate name as well. Again, this isn’t easy to find, but I use the following line to list malformed variable names. The first grep extracts whatever follows $a->, and the second discards every match that starts like an ordinary ASCII PHP identifier.

grep -nroe "\$a->[^ .\<>/';:\!,\?)(]*" PATH_TO_DIR | grep -v "\$a->[a-zA-Z_][a-zA-Z0-9_]*"
PATH_TO_DIR/block_class_list.php:14:$a->
PATH_TO_DIR/block_task_list.php:30:$a->
PATH_TO_DIR/block_task_list.php:34:$a->
PATH_TO_DIR/block_task_list.php:34:$a->제목

This has caught every issue of this kind so far (for me), but it relies on the translated name being an illegal variable name. A better approach would be to keep a whitelist of variables extracted from a known good language pack. If anything that looks like a variable in another language pack doesn’t appear on this list, we can throw an error.
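
The whitelist idea could be sketched roughly as follows. This is only an illustration under a few assumptions: the reference pack and the translated pack live in two separate directories, only placeholders of the form $a->something are considered, and a placeholder name ends at the first whitespace or punctuation character other than an underscore. The function names list_vars and unknown_vars are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).
# Byte-wise matching and sorting, so results do not depend on the locale.
export LC_ALL=C

# Print every distinct "$a->something" token found under a directory.
list_vars() {
    grep -rhoaE '\$a->([^[:space:][:punct:]]|_)*' "$1" | sort -u
}

# Print tokens that occur in the translated pack ($2)
# but not in the known good pack ($1).
unknown_vars() {
    _ref=$(mktemp)
    list_vars "$1" > "$_ref"
    # grep exits non-zero when nothing is left over; that is not an error here.
    list_vars "$2" | grep -vxFf "$_ref" || true
    rm -f "$_ref"
}
```

Running unknown_vars PATH_TO_GOOD_DIR PATH_TO_DIR prints one suspicious token per line; empty output means every placeholder in the translation also exists in the reference pack.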

*Converting files to UTF-8 format*
Sometimes translators (especially those working in languages with accented characters) send us files in a legacy single-byte encoding such as ISO 8859-1 rather than UTF-8. These files have to be converted to UTF-8 so that accented characters are displayed correctly.

Here is a one liner to convert all files in the current directory to UTF-8 format:

for f in ./*; do mv "$f" "$f.bak"; iconv --from-code=ISO-8859-1 --to-code=UTF-8 "$f.bak" > "$f"; done

If you are happy with the changes, remove the backups:

rm *.bak
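
One thing to watch: converting a file that is already UTF-8 a second time double-encodes its accented characters. A small guard can help; the sketch below assumes that any file which is not valid UTF-8 really is ISO 8859-1, which may not hold for every pack. The function names is_utf8 and latin1_to_utf8 are invented for this example.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Succeeds if the file is already valid UTF-8 (plain ASCII counts as valid).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Converts one file from ISO 8859-1 to UTF-8, keeping a .bak copy,
# unless the file is already valid UTF-8.
latin1_to_utf8() {
    if is_utf8 "$1"; then
        return 0
    fi
    mv "$1" "$1.bak" && iconv -f ISO-8859-1 -t UTF-8 "$1.bak" > "$1"
}
```

In the loop above, the mv and iconv pair would then become a single call to latin1_to_utf8 "$f".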

See also: ISO 8859-1 and Converting files to UTF-8.

*Other things which go wrong*
Translators might change $string['mystring'] = 'Hello $a!'; to $string['mystring'] = "Hello $a!";. The effect is subtle: $a simply does not appear in the output, because PHP interpolates the double-quoted string when the file is loaded, before Moodle can swap $a for some dynamic data. The following one-liner catches most of these errors for me, even though it isn’t complete.

grep -nro "= \"" PATH_TO_DIR
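
A slightly narrower check reports only double-quoted values that actually contain $a. This is a sketch, assuming each string definition sits on a single line in the form $string['key'] = "..."; and contains no escaped double quotes. The function name find_interpolated is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch).

# Lists lines where a $string[...] value is double-quoted and contains $a.
find_interpolated() {
    grep -rnE '^[[:space:]]*\$string\[[^]]*\][[:space:]]*=[[:space:]]*"[^"]*\$a' "$1"
}
```

Double-quoted strings without a placeholder are harmless and are left out of the report, which should cut down on false positives.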

I guess you could lump all that together in one nice big shell script. If you do, send me a copy ;-)
