Task Coach: i18n

A Task Coach user reported today that one of the translations (Simplified Chinese) was not working: instead of the translated texts, the user interface showed the original English texts. A little investigation showed that most translations were fine, but a few were not.

To understand what went wrong, it helps to know how Task Coach deals with translations. Task Coach uses Launchpad for translations. Launchpad provides the translations as .po files. These .po files are transformed into Python source files that are in turn bundled with the different Task Coach installers/packages.
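
To give an idea of what such a transformation involves, here is a minimal sketch. It is not the actual Task Coach conversion script, which is not shown here: the function name po_to_dict and the sample entries are made up, and it only handles single-line msgid/msgstr pairs, whereas real .po files also contain multi-line strings, plural forms and comments.

```python
import ast

def po_to_dict(po_text):
    """Collect single-line msgid/msgstr pairs from .po text into a dict."""
    translations = {}
    msgid = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('msgid '):
            # The quoted part uses C-style escapes, which literal_eval
            # understands for the common cases.
            msgid = ast.literal_eval(line[len('msgid '):])
        elif line.startswith('msgstr ') and msgid is not None:
            msgstr = ast.literal_eval(line[len('msgstr '):])
            if msgid and msgstr:  # skip the header entry and untranslated strings
                translations[msgid] = msgstr
            msgid = None
    return translations

# Made-up sample input, not taken from a real Task Coach .po file.
sample = '''
msgid ""
msgstr ""

msgid "Task"
msgstr "Taak"

msgid "%d tasks"
msgstr "%d taken"
'''

print(po_to_dict(sample))
```

Writing the resulting dictionary to a file as Python source then gives a module that can simply be imported at run time.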

I noticed that a few of these generated Python source files were missing from the folder where the translations are stored. Wondering how some of them could be missing while others were not, I suspected that the Makefile was not removing all files. So I checked the "clean" target in the Makefile and indeed, it would only remove the "??_??.py" files and not the "??.py" files. That means zh_CN.py would get removed, but nl.py would not. So that explains why some translations, such as Simplified Chinese (zh_CN) and Brazilian Portuguese (pt_BR), were missing and others, such as Dutch (nl) and French (fr), were not.
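
The effect of the two patterns is easy to reproduce. The sketch below uses Python's fnmatch module as a stand-in for the shell globbing done by the Makefile (an assumption, although "?" means "exactly one character" in both), and the list of file names is made up from the language codes mentioned above.

```python
from fnmatch import fnmatch

# Hypothetical file names, based on the language codes mentioned in the post.
generated = ['zh_CN.py', 'pt_BR.py', 'nl.py', 'fr.py']

for name in generated:
    removed = fnmatch(name, '??_??.py')
    print('%-9s %s' % (name, 'removed' if removed else 'left behind'))

# A clean target that covers both naming schemes needs both patterns.
patterns = ['??_??.py', '??.py']
all_removed = all(any(fnmatch(name, pattern) for pattern in patterns)
                  for name in generated)
print(all_removed)
```

Note that "??.py" also matches any other two-letter module that happens to live in the same folder, so a clean target using it should only be pointed at the folder with generated files.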

The final question is, of course, how to prevent this from happening ever again. We already have a set of release tests. I guess that adding a release test that checks whether all translations are included in the Task Coach installers and packages should do it.
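
Such a check could look roughly like the sketch below. It is only an illustration: the function name missing_translations is made up, and it looks at a plain folder, so checking an actual installer or package would first require unpacking it. The demonstration uses a throwaway folder instead of the real translations folder.

```python
import os
import tempfile

def missing_translations(expected_languages, translation_dir):
    """Return the language codes that have no generated <code>.py module."""
    return [code for code in expected_languages
            if not os.path.isfile(os.path.join(translation_dir, code + '.py'))]

# Demonstration with a temporary folder that only holds the two-letter modules.
demo_dir = tempfile.mkdtemp()
for code in ('nl', 'fr'):
    open(os.path.join(demo_dir, code + '.py'), 'w').close()

print(missing_translations(['nl', 'fr', 'zh_CN', 'pt_BR'], demo_dir))
# ['zh_CN', 'pt_BR']
```

A release test would then simply assert that the returned list is empty.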

A recent bug in Task Coach was caused by one of the translations being incorrect. So, I decided it was time to unittest the translations. For each translated string I wanted to check that certain conditions hold. For example, if the original string has a formatting operator (e.g. '%d' for integers) the translated string should contain the same formatting operator. These tests are relatively simple:


for formatter in '%s', '%d', '%.2f':
  self.assertEqual(self.englishString.count(formatter), 
                   self.translatedString.count(formatter))
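
Outside of a test class, the same check can be tried on a few strings. The helper name mismatched_formatters and the example translations below are made up for illustration.

```python
FORMATTERS = ('%s', '%d', '%.2f')

def mismatched_formatters(english, translated):
    """Return the formatting operators whose counts differ between the strings."""
    return [formatter for formatter in FORMATTERS
            if english.count(formatter) != translated.count(formatter)]

print(mismatched_formatters('%d tasks', '%d taken'))    # []
print(mismatched_formatters('%d tasks', '% d taken'))   # ['%d']
print(mismatched_formatters('%s of %s', '%s van'))      # ['%s']
```

Counting occurrences catches missing or mangled operators, but not operators that are all present in a different order.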

The challenge is how to create one unittest for each (language, string)-pair. This is not a good solution:


def testMatchingFormatting(self):
  for language in getLanguages():
    for english, translated in language.dictionary():
      ...

because this unittest stops as soon as one translation is incorrect.
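
This can be seen with a small experiment using the standard unittest module. The three string pairs are made up; the second and third are both wrong, but only the second one is ever reported.

```python
import unittest

def run_looped_test():
    checked = []  # records which pairs the loop actually reached

    class LoopedTest(unittest.TestCase):
        pairs = [('%d tasks', '%d taken'),  # fine
                 ('%s of %s', '%s van'),    # broken: one '%s' is missing
                 ('%d days', 'dagen')]      # also broken, but never reached

        def testMatchingFormatting(self):
            for english, translated in self.pairs:
                checked.append(english)
                for formatter in '%s', '%d':
                    self.assertEqual(english.count(formatter),
                                     translated.count(formatter))

    result = unittest.TestResult()
    unittest.TestLoader().loadTestsFromTestCase(LoopedTest).run(result)
    return len(result.failures), checked

failures, checked = run_looped_test()
print(failures, checked)
# 1 ['%d tasks', '%s of %s']
```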

My first thought was that I could use decorators to unfold the loop, but after a few feeble attempts I decided I am not smart enough to wrap my head around decorators. After some more experimenting I ended up with the code below. I put the loop outside the test class and explicitly create a new TestCase class for each (language, string)-pair. This generates a lot of unittests (over 7600 for the current version of Task Coach), but they run in less than 0.5 seconds, so that's a small price to pay for increased test coverage.


import test, i18n, meta, string

class TranslationIntegrityTests(object):
  ''' Unittests for translations. This class is 
      subclassed below for each translated string 
      in each language. '''
      
  def testMatchingFormatting(self):
    for formatter in '%s', '%d', '%.2f':
      self.assertEqual(self.englishString.count(formatter), 
                      self.translatedString.count(formatter))
            
  def testMatchingAccelerators(self):
    pass # snipped


def getLanguages():
  return [language for language in
          meta.data.languages.values()
          if language is not None]
        

def createTestCaseClassName(language, englishString, 
  prefix='TranslationIntegrityTest'):
  ''' Generate a class name for the test case class based 
      on the language and the English string. '''

  # Make sure we only use characters allowed in Python 
  # identifiers:
  englishString = englishString.replace(' ', '_')
  allowableCharacters = string.ascii_letters + \
                        string.digits + '_'
  englishString = ''.join([char for char in englishString \
                           if char in allowableCharacters])
  className = '%s_%s_%s'%(prefix, language, englishString)
  count = 0
  while className in globals(): # Make sure className is unique
    count += 1
    className = '%s_%s_%s_%d'%(prefix, language,
                               englishString, count)
  return className


def createTestCaseClass(className, language, englishString, 
                        translatedString):
  class_ = type(className, 
                (TranslationIntegrityTests, test.TestCase), 
                {})
  class_.language = language
  class_.englishString = englishString
  class_.translatedString = translatedString
  return class_


for language in getLanguages():
  translation = __import__('i18n.%s'%language, 
                           fromlist=['dict'])
  for english, translated in translation.dict.items():
    className = createTestCaseClassName(language, english)
    class_ = createTestCaseClass(className, language, 
                                 english, translated)
    globals()[className] = class_
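
Since the code above depends on Task Coach's own test, i18n and meta modules, here is a stripped-down sketch of the same technique that runs with just the standard library. The TRANSLATIONS dictionary is a made-up stand-in for the generated translation modules, the class names are simply numbered instead of being derived from the English string, and the generated classes are run directly instead of being put in globals() for the test runner to find.

```python
import unittest

# Made-up stand-in for the generated translation modules.
TRANSLATIONS = {
    'nl': {'%d tasks': '%d taken', '%s of %s': '%s van'},
    'fr': {'%d days': '%d jours'},
}

class IntegrityChecks(object):
    """Mixin with the actual checks; combined with TestCase per string."""

    def testMatchingFormatting(self):
        for formatter in '%s', '%d', '%.2f':
            self.assertEqual(self.englishString.count(formatter),
                             self.translatedString.count(formatter))

def make_test_cases(translations):
    """Create one TestCase class for each (language, string)-pair."""
    classes = {}
    for language, dictionary in sorted(translations.items()):
        for index, (english, translated) in enumerate(sorted(dictionary.items())):
            name = 'IntegrityTest_%s_%d' % (language, index)
            classes[name] = type(name, (IntegrityChecks, unittest.TestCase),
                                 dict(language=language, englishString=english,
                                      translatedString=translated))
    return classes

generated = make_test_cases(TRANSLATIONS)
loader = unittest.TestLoader()
suite = unittest.TestSuite(loader.loadTestsFromTestCase(class_)
                           for class_ in generated.values())
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, len(result.failures))
# 3 1
```

All three pairs are checked, and the one incorrect translation shows up as a single failure without hiding the others.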


Sunday, November 22, 2009

A bug caused by "make clean"

Sunday, August 26, 2007

Testing translations
