Sunday, August 26, 2007

Testing translations

A recent bug in Task Coach was caused by one of the translations being incorrect. So, I decided it was time to unittest the translations. For each translated string I wanted to check that certain conditions hold. For example, if the original string has a formatting operator (e.g. '%d' for digits) the translated string should contain the same formatting operator. These tests are relatively simple:

for formatter in '%s', '%d', '%.2f':
self.assertEqual(self.englishString.count(formatter),
self.translatedString.count(formatter))

The challenge is how to create one unittest for each (language, string)-pair. This is not a good solution:

def testMatchingFormatting(self):
for language in getLanguages():
for english, translated in language.dictionary():
...

because this unittest stops as soon as one translation is incorrect.

My first thought was that I could use decorators to unfold the loop, but after a few feeble attempts I decided I am not smart enough to wrap my head around decorators. After some more experimenting I ended up with the code below. I put the loop outside the test class and explicitly create a new TestCase class for each (language, string)-pair. This generates a lot of unittests (over 7600 for the current version of Task Coach), but they run in less than 0.5 seconds, so that's a small price to pay for increased test coverage.


import test, i18n, meta, string

class TranslationIntegrityTests(object):
''' Unittests for translations. This class is
subclassed below for each translated string
in each language. '''

def testMatchingFormatting(self):
for formatter in '%s', '%d', '%.2f':
self.assertEqual(self.englishString.count(formatter),
self.translatedString.count(formatter))

def testMatchingAccelerators(self):
# snipped


def getLanguages():
return [language for language in \
meta.data.languages.values() \
if language is not None]


def createTestCaseClassName(language, englishString,
prefix='TranslationIntegrityTest'):
''' Generate a class name for the test case class based
on the language and the English string. '''

# Make sure we only use characters allowed in Python
# identifiers:
englishString = englishString.replace(' ', '_')
allowableCharacters = string.ascii_letters + \
string.digits + '_'
englishString = ''.join([char for char in englishString \
if char in allowableCharacters])
className = '%s_%s_%s'%(prefix, language, englishString)
count = 0
while className in globals(): # Make sure className is unique
count += 1
className = '%s_%s_%s_%d'%(prefix, language,
englishString, count)
return className


def createTestCaseClass(className, language, englishString,
translatedString):
class_ = type(className,
(TranslationIntegrityTests, test.TestCase),
{})
class_.language = language
class_.englishString = englishString
class_.translatedString = translatedString
return class_


for language in getLanguages():
translation = __import__('i18n.%s'%language,
fromlist=['dict'])
for english, translated in translation.dict.iteritems():
className = createTestCaseClassName(language, english)
class_ = createTestCaseClass(className, language,
english, translated)
globals()[className] = class_