Friday, June 27, 2008

Automating .deb building

Some time ago, Stani helped me to create a Task Coach package in Debian package format (.deb). This is the package format that is also used by Linux distributions derived from Debian, such as Ubuntu. Since I want the release process of Task Coach to be as easy as possible, I decided to automate the package build process as much as possible.

I wrote a distutils command that creates a Debian package from a source distribution, as created by python setup.py sdist. This new distutils command, called bdist_deb, copies the source distribution, unpacks it, adds the necessary Debian control files and compiles the package using the regular packaging tools.

The bdist_deb command takes a large number of parameters, since it needs a lot of information to create the .deb: application title, description, version, license information, copyright, author, maintainer, and so on. In the case of Task Coach, most of this information is already available in a metadata source file and is easily passed to the bdist_deb command from the distutils setup script.
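To give an idea of what "adding the Debian control files" involves, here is a minimal sketch that renders a debian/control file from a metadata dictionary. It is not the actual bdist_deb code: the function names, the dictionary keys and all example values are assumptions made for illustration only.

```python
def format_description(synopsis, long_description):
    """Format a Description field: the synopsis on the first line, then
    the long text indented by one space, blank lines written as ' .'."""
    lines = [synopsis]
    for line in long_description.strip().splitlines():
        lines.append(' ' + line.strip() if line.strip() else ' .')
    return '\n'.join(lines)


def render_control(meta):
    """Render a two-paragraph debian/control file (source + binary)."""
    source = [
        ('Source', meta['package']),
        ('Section', meta['section']),
        ('Priority', meta['priority']),
        ('Maintainer', '%s <%s>' % (meta['maintainer'],
                                    meta['maintainer_email'])),
        ('Standards-Version', meta['standards_version']),
    ]
    binary = [
        ('Package', meta['package']),
        ('Architecture', meta['architecture']),
        ('Depends', ', '.join(meta['depends'])),
        ('Description', format_description(meta['synopsis'],
                                           meta['long_description'])),
    ]
    paragraphs = ['\n'.join('%s: %s' % field for field in paragraph)
                  for paragraph in (source, binary)]
    return '\n\n'.join(paragraphs) + '\n'


# Hypothetical metadata; every value below is a placeholder.
EXAMPLE_META = {
    'package': 'taskcoach',
    'section': 'utils',
    'priority': 'optional',
    'maintainer': 'Jane Doe',
    'maintainer_email': 'jane@example.org',
    'standards_version': '3.7.3',
    'architecture': 'all',
    'depends': ['python', 'python-wxgtk2.8'],
    'synopsis': 'Task manager with hierarchical tasks',
    'long_description': 'Task Coach is a task manager.\n\n'
                        'It supports hierarchical tasks.',
}

if __name__ == '__main__':
    print(render_control(EXAMPLE_META))
```

A real package also needs files such as debian/changelog, debian/copyright and debian/rules, which is where the license, copyright and version parameters would end up.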

Since this was specifically developed for Task Coach, it is probably not completely generalized. Nevertheless, I hope it provides a starting point for other developers who want to create proper Debian packages for their Python applications.

Check out the bdist_deb command and the invocation of the command in the distutils setup script (called make.py), starting at line 109.

Sunday, April 06, 2008

Migrating from CVS to Subversion

I recently migrated the Task Coach source code from its Sourceforge CVS repository to a Subversion repository, also on Sourceforge. I followed the migration instructions at the Sourceforge website and all went smoothly.

Because the cvs2svn script needs the CVSROOT module to be present, you can only remove it after cvs2svn has run, i.e. you need to remove it from the svndump file that cvs2svn creates. svndumpfilter can do that for you. Unfortunately, the CVSROOT module is present in the svndump file multiple times: not only in the trunk folder, but also in all branch and version folders of the svndump. A script to help with finding all those CVSROOT occurrences would be nice.

Marcin Zajączkowski helped me out with a shell script. I used it as the basis for a little python script that removes specific folders or files from your svndump file. Use as you like.




#!/usr/bin/env python

import os, optparse

usage = '''Usage: %prog dumpfile path [path...]
Remove path(s) from <dumpfile> and write new dump file to <dumpfile>.out'''


class PathRemoverOptionParser(optparse.OptionParser):
    def __init__(self):
        optparse.OptionParser.__init__(self, usage)

    def parse_args(self):
        options, args = optparse.OptionParser.parse_args(self)
        if len(args) < 2:
            self.error('provide both dumpfile and path to remove')
        dumpFile, paths = args[0], args[1:]
        if not os.path.exists(dumpFile):
            self.error('dumpfile (%s) does not exist'%dumpFile)
        return dumpFile, paths


def branches(dumpFile, paths, pathPrefix='Node-path: '):
    ''' Find branches in the dumpFile that contain the paths. '''
    for line in file(dumpFile):
        if line.startswith(pathPrefix):
            for path in paths:
                if line.endswith(path+'\n'):
                    yield line[len(pathPrefix):-1] # yield branch


parser = PathRemoverOptionParser()
dumpFile, paths = parser.parse_args()
excludes = ' '.join(branches(dumpFile, paths))
if not excludes:
    parser.error('path(s) (%s) not found'%', '.join(paths))

os.system('svndumpfilter exclude %s < %s > %s.out'%(excludes, dumpFile, dumpFile))
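The script above builds a shell command line, so node paths containing spaces would need extra quoting. As a hedged variation (not what I actually ran), the same idea can be sketched with an argument list passed to subprocess. The function names are made up for this sketch, and unlike the script above it matches whole path components and drops duplicate node paths.

```python
import subprocess


def find_node_paths(lines, names, prefix='Node-path: '):
    """Return the Node-path values whose last component is one of
    the given names, in order of first appearance, without duplicates."""
    found = []
    for line in lines:
        if not line.startswith(prefix):
            continue
        node_path = line[len(prefix):].rstrip('\n')
        matches = any(node_path == name or node_path.endswith('/' + name)
                      for name in names)
        if matches and node_path not in found:
            found.append(node_path)
    return found


def build_exclude_command(node_paths):
    """Build the svndumpfilter call as an argument list (no shell)."""
    return ['svndumpfilter', 'exclude'] + list(node_paths)


def filter_dump(dump_file, node_paths):
    """Run svndumpfilter, reading dump_file and writing dump_file.out.
    Requires svndumpfilter to be installed; not exercised here."""
    with open(dump_file, 'rb') as source, \
         open(dump_file + '.out', 'wb') as target:
        subprocess.check_call(build_exclude_command(node_paths),
                              stdin=source, stdout=target)
```

For example, find_node_paths applied to the lines of a dump with ['CVSROOT'] yields paths such as trunk/CVSROOT, which build_exclude_command turns into the arguments for svndumpfilter exclude. Dump files can contain binary data, so how the lines are read and decoded is left to the caller.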

Tuesday, October 09, 2007

Installing with or without administrator privileges

To create the installer for Task Coach on Windows, I use the excellent Innosetup tool. I upgraded to the latest version of Innosetup a while ago but didn't notice immediately that the installer created by this new version of Innosetup required the user to have administrator privileges. Because Task Coach is aimed at ordinary users, that is not acceptable.

It took me some time to find out how to have the installer work for both users with and without administrator privileges. I'm recording the solution here so that other developers may benefit from it.

In the registry section of the Innosetup script I included two versions of the lines that associate the ".tsk" extension with Task Coach. The first four lines (*) are used if the user has administrator privileges (Check: IsAdminLoggedOn), and the last four lines are used if the user has no administrator rights (Check: not IsAdminLoggedOn).

(*) I had to split the lines to prevent them from being clipped. The line continuations are indented.


[Registry]
Root: HKCR; Subkey: ".tsk"; ValueType: string; ValueName: "";
    ValueData: "TaskCoach"; Flags: uninsdeletevalue;
    Check: IsAdminLoggedOn
Root: HKCR; Subkey: "TaskCoach"; ValueType: string; ValueName: "";
    ValueData: "Task Coach File"; Flags: uninsdeletekey;
    Check: IsAdminLoggedOn
Root: HKCR; Subkey: "TaskCoach\DefaultIcon"; ValueType: string;
    ValueName: ""; ValueData: "{app}\TaskCoach.EXE,0";
    Check: IsAdminLoggedOn
Root: HKCR; Subkey: "TaskCoach\shell\open\command";
    ValueType: string; ValueName: "";
    ValueData: """{app}\TaskCoach.EXE"" ""%%1""";
    Check: IsAdminLoggedOn
Root: HKCU; Subkey: "Software\Classes\.tsk"; ValueType: string;
    ValueName: ""; ValueData: "TaskCoachFile";
    Flags: uninsdeletevalue; Check: not IsAdminLoggedOn
Root: HKCU; Subkey: "Software\Classes\TaskCoachFile";
    ValueType: string; ValueName: ""; ValueData: "Task Coach File";
    Flags: uninsdeletekey; Check: not IsAdminLoggedOn
Root: HKCU; Subkey: "Software\Classes\TaskCoachFile\DefaultIcon";
    ValueType: string; ValueName: "";
    ValueData: "{app}\TaskCoach.EXE,0"; Check: not IsAdminLoggedOn
Root: HKCU; Subkey: "Software\Classes\TaskCoachFile\shell\open\command";
    ValueType: string; ValueName: "";
    ValueData: """{app}\TaskCoach.EXE"" ""%%1""";
    Check: not IsAdminLoggedOn

Sunday, August 26, 2007

Testing translations

A recent bug in Task Coach was caused by one of the translations being incorrect. So, I decided it was time to unittest the translations. For each translated string I wanted to check that certain conditions hold. For example, if the original string has a formatting operator (e.g. '%d' for integers), the translated string should contain the same formatting operator. These tests are relatively simple:

for formatter in '%s', '%d', '%.2f':
    self.assertEqual(self.englishString.count(formatter),
                     self.translatedString.count(formatter))
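The check above only knows about three hard-coded operators. A more general variant could extract every printf-style operator with a regular expression and compare the two strings as multisets. This is only a sketch: the pattern and the function names are my own assumptions and are not part of the Task Coach tests.

```python
import re
from collections import Counter

# Rough pattern for Python's printf-style operators: optional mapping
# key, flags, width, precision, and a conversion type.
FORMAT_SPEC = re.compile(
    r'%(?:\(\w+\))?[#0\- +]*(?:\d+|\*)?(?:\.(?:\d+|\*))?[diouxXeEfFgGcrs%]')


def format_specs(text):
    """Count the formatting operators in text, ignoring literal '%%'."""
    return Counter(spec for spec in FORMAT_SPEC.findall(text)
                   if spec != '%%')


def same_formatting(english, translated):
    """True if both strings use the same operators the same number
    of times (the order may differ between languages)."""
    return format_specs(english) == format_specs(translated)
```

With this, same_formatting('%d tasks in %s', '%s bevat %d taken') holds, while a translation that turns '%d' into '%s' is flagged.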

The challenge is how to create one unittest for each (language, string)-pair. This is not a good solution:

def testMatchingFormatting(self):
    for language in getLanguages():
        for english, translated in language.dictionary():
            ...

because this unittest stops as soon as one translation is incorrect.

My first thought was that I could use decorators to unfold the loop, but after a few feeble attempts I decided I am not smart enough to wrap my head around decorators. After some more experimenting I ended up with the code below. I put the loop outside the test class and explicitly create a new TestCase class for each (language, string)-pair. This generates a lot of unittests (over 7600 for the current version of Task Coach), but they run in less than 0.5 seconds, so that's a small price to pay for increased test coverage.


import test, i18n, meta, string

class TranslationIntegrityTests(object):
    ''' Unittests for translations. This class is
        subclassed below for each translated string
        in each language. '''

    def testMatchingFormatting(self):
        for formatter in '%s', '%d', '%.2f':
            self.assertEqual(self.englishString.count(formatter),
                             self.translatedString.count(formatter))

    def testMatchingAccelerators(self):
        pass # snipped


def getLanguages():
    return [language for language in \
            meta.data.languages.values() \
            if language is not None]


def createTestCaseClassName(language, englishString,
                            prefix='TranslationIntegrityTest'):
    ''' Generate a class name for the test case class based
        on the language and the English string. '''

    # Make sure we only use characters allowed in Python
    # identifiers:
    englishString = englishString.replace(' ', '_')
    allowableCharacters = string.ascii_letters + \
                          string.digits + '_'
    englishString = ''.join([char for char in englishString \
                             if char in allowableCharacters])
    className = '%s_%s_%s'%(prefix, language, englishString)
    count = 0
    while className in globals(): # Make sure className is unique
        count += 1
        className = '%s_%s_%s_%d'%(prefix, language,
                                   englishString, count)
    return className


def createTestCaseClass(className, language, englishString,
                        translatedString):
    class_ = type(className,
                  (TranslationIntegrityTests, test.TestCase),
                  {})
    class_.language = language
    class_.englishString = englishString
    class_.translatedString = translatedString
    return class_


for language in getLanguages():
    translation = __import__('i18n.%s'%language,
                             fromlist=['dict'])
    for english, translated in translation.dict.iteritems():
        className = createTestCaseClassName(language, english)
        class_ = createTestCaseClass(className, language,
                                     english, translated)
        globals()[className] = class_
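For readers on Python 3.4 or later, the standard unittest module offers a different technique for the same problem: the subTest context manager keeps a loop running after a failed assertion and reports each failing (language, string)-pair separately. The sketch below uses that technique instead of generated classes; the TRANSLATIONS dictionary is a hypothetical stand-in for the real translation modules.

```python
import unittest

# Hypothetical stand-in for the real translation dictionaries.
TRANSLATIONS = {
    'nl': {'%d tasks': '%d taken', 'Save %s': '%s opslaan'},
    'de': {'%d tasks': '%d Aufgaben'},
}


class TranslationIntegrityTest(unittest.TestCase):
    def testMatchingFormatting(self):
        for language, dictionary in TRANSLATIONS.items():
            for english, translated in dictionary.items():
                # Each pair is its own subtest, so a failure here
                # does not stop the remaining pairs from being checked.
                with self.subTest(language=language, english=english):
                    for formatter in '%s', '%d', '%.2f':
                        self.assertEqual(english.count(formatter),
                                         translated.count(formatter))


def run_tests():
    """Run the test case and return the unittest.TestResult."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TranslationIntegrityTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == '__main__':
    unittest.main()
```

The trade-off is in the reporting: all pairs count as a single test in the totals, whereas the generated classes above show up as thousands of individual tests.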

Saturday, May 26, 2007

False positive

A few days ago AVG Anti-virus started reporting a trojan horse in Task Coach (0.63.2). A little investigation with the help of other py2exe users indicates that AVG detects a trojan in a specific part of py2exe. Py2exe is a program that is used to bundle Python source code and the Python interpreter into an executable that can be easily installed on Windows machines and doesn't require users to install Python. It seems that someone wrote a trojan in Python and bundled it with py2exe. Apparently, AVG is now triggered by py2exe itself instead of by a signature that is specific to that trojan horse. It probably means that all applications bundled with py2exe are affected as well. What a bummer. But it is also kind of interesting to see how other applications that have nothing to do with Task Coach itself can cause bug reports about Task Coach.

Saturday, October 14, 2006

The need for branches

Until now, I have tried to create a new release of Task Coach every two to four weeks. Each release would contain a mixture of bugfixes and new features. The nice thing about this release strategy is that I don't need to create multiple branches in the version control system. And simple is good, right?

However, the feature I'm working on right now (hierarchical categories) turns out to be harder than I thought. At the same time, people have been reporting some bugs on the latest release that are pretty easy to fix. But, since I'm halfway through the development of the hierarchical categories feature, the code is not in a releasable state. So, I cannot release those bugfixes although they are in the version control system. And that is a waste, right?

So, how to resolve this situation? The classical solution is to add a separate branch for the latest release, apply the bugfixes to that branch, and release a bugfix release from that branch. After that, one only needs to merge those bugfixes into the mainline and everything is fine again. Except for the added complexity of having to deal with multiple branches and merging, that is. The alternative solution, the one I have been using so far, is to add functionality in much smaller steps. Steps that are so small that the code is never 'not in a releasable state' for more than, say, two weeks. Apparently, I have not been applying that strategy successfully lately, leading to the need for branching.

Anyway, I'll try the branching, see how it works out, and then decide on how to proceed in the long run.

Thursday, September 21, 2006

Duplication smells

Duplication. A programmer's worst nightmare. Does that sound over the top? Maybe it does, but I guess most programmers would agree that duplication is one of the worst code smells around. So, a programmer should do everything to prevent and reduce duplication, right? Well, I think that in the case of simple duplication we all agree on that. By simple I mean that two pieces of code are textually equal or very similar, e.g.:

patterns.Publisher().registerObserver(self.onMatchAllChanged,
    eventType='view.taskcategoryfiltermatchall')
patterns.Publisher().registerObserver(self.onFilteredCategoriesChanged,
    eventType='view.taskcategoryfilterlist')
patterns.Publisher().registerObserver(self.onAddCategoryToTask,
    eventType='task.category.add')
patterns.Publisher().registerObserver(self.onRemoveCategoryFromTask,
    eventType='task.category.remove')


This is actual code from Task Coach (blush) and represents a component registering itself as observer for different event types. Of course, this type of duplication is rather easy to fix, e.g.:

callbacks = {'view.taskcategoryfiltermatchall': self.onMatchAllChanged, ...}
publisher = patterns.Publisher()
for eventType, callback in callbacks.items():
    publisher.registerObserver(callback, eventType)

But then there's also duplication that's not strictly textual, but more structural. For example, two for-loops looping over the same data structure but with a different body. An example from Task Coach again. These two event handlers update a check-listbox (a list with items that can be checked and unchecked by the user) when categories are added to or removed from a task:

def onAddCategoryToTask(self, event):
    for category in event.values():
        if self._checkListBox.FindString(category) == wx.NOT_FOUND:
            self._checkListBox.Append(category)
    self.Enable(len(self.__taskList.categories()) > 0)

def onRemoveCategoryFromTask(self, event):
    for category in event.values():
        if category not in self.__taskList.categories():
            self._checkListBox.Delete(self._checkListBox.FindString(category))
    self.Enable(len(self.__taskList.categories()) > 0)


Removing this type of duplication is trickier. Refactoring to something like the following should work (untested):

def updateCategoriesListbox(self, categories, updateNeeded, updateListbox):
    for category in categories:
        if updateNeeded(category):
            updateListbox(category)
    self.Enable(len(self.__taskList.categories()) > 0)

def onAddCategoryToTask(self, event):
    def updateNeeded(category):
        return self._checkListBox.FindString(category) == wx.NOT_FOUND

    def updateListbox(category):
        self._checkListBox.Append(category)

    self.updateCategoriesListbox(event.values(), updateNeeded, updateListbox)

def onRemoveCategoryFromTask(self, event):
    def updateNeeded(category):
        return category not in self.__taskList.categories()

    def updateListbox(category):
        self._checkListBox.Delete(self._checkListBox.FindString(category))

    self.updateCategoriesListbox(event.values(), updateNeeded, updateListbox)


But this raises the question: is the increase in lines of code and number of functions worth the reduction of duplication? Or did I miss a simpler solution?
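One way to explore that question without wx is to run the refactored handlers against a small stand-in for the check-listbox. The sketch below does that and replaces the nested functions with lambdas and a bound method, which saves a few lines. FakeCheckListBox, CategoryFilter, the shared category set and the plain lists passed in place of events are all hypothetical names, not Task Coach code.

```python
NOT_FOUND = -1  # stands in for wx.NOT_FOUND


class FakeCheckListBox:
    """Minimal stand-in with the three methods the handlers use."""
    def __init__(self):
        self.items = []

    def FindString(self, string):
        try:
            return self.items.index(string)
        except ValueError:
            return NOT_FOUND

    def Append(self, string):
        self.items.append(string)

    def Delete(self, index):
        del self.items[index]


class CategoryFilter:
    def __init__(self, taskCategories):
        self._checkListBox = FakeCheckListBox()
        self._taskCategories = taskCategories  # set shared with the caller
        self.enabled = False

    def Enable(self, enable):
        self.enabled = enable

    def updateCategoriesListbox(self, categories, updateNeeded,
                                updateListbox):
        for category in categories:
            if updateNeeded(category):
                updateListbox(category)
        self.Enable(len(self._taskCategories) > 0)

    def onAddCategoryToTask(self, categories):
        listBox = self._checkListBox
        self.updateCategoriesListbox(
            categories,
            lambda category: listBox.FindString(category) == NOT_FOUND,
            listBox.Append)

    def onRemoveCategoryFromTask(self, categories):
        listBox = self._checkListBox
        self.updateCategoriesListbox(
            categories,
            lambda category: category not in self._taskCategories,
            lambda category: listBox.Delete(listBox.FindString(category)))
```

Whether the lambdas read better than two plain loops is still a matter of taste; the sketch only shows that the shared loop can be exercised in isolation.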

Wednesday, September 20, 2006

Steering open source development

So far, steering the development of Task Coach has been very easy. Since I'm basically the only developer and it's a hobby project that I work on after hours, I've been adding functionality that I thought would be a good idea to add. Judging from reactions by users and on websites I think I did pick valuable features most of the time so far.

However, the list of feature requests has been growing much faster than I have been able to implement them. So, unless I get more people to work on Task Coach, I need to prioritize between features. I have been thinking about different possible ways to do that:
  • Developing a release plan with a different focus per release. For example, the focus for a 1.0 release could be to add all functionality that would make Task Coach a full-fledged stand-alone task manager. Then the 2.0 release could focus on integrating Task Coach with e-mail and calendar software (e.g. using vCal).
  • Letting people vote on features. Features with the most votes get implemented first. Unfortunately, Sourceforge doesn't support anything like this, as far as I know.
  • Donation-driven development. The feature that gets the most donations is implemented first.
Maybe a combination of the above could work too. I'll have to ponder this issue some more.

About this blog

Software development is fun and hard at the same time. This blog discusses my experience with developing an open source desktop application called Task Coach. Task Coach is a task manager that supports hierarchical tasks, budget and time registration, categories, and more. For me, Task Coach is also a vehicle to experiment with software development, tools, and techniques. For example, I try to use test-driven development to steer the development of the software. So far, I like the way test-driven development allows me to add new functionality in a safe and gradual manner. I also like how having an extensive set of automated unit tests (+/- 1700 at the moment, which run in about 30 seconds) allows me to refactor the source code without (too much) fear of seriously breaking the application. Anyway, the plan is to use this blog to talk about ideas on software development and, as much as possible, to discuss my experience with applying these ideas to Task Coach.