Task Coach: version control

As announced in a previous posting, I am investigating which of the CMMI goals are achieved by the Task Coach project. The second process area I am looking at is Configuration Management (CM). CM is a Support process area at level two of the CMMI for Development. Because CM is part of the core model foundation, it is also part of the other two CMMI constellations, CMMI for Services and CMMI for Acquisition. But here, I'll be looking at CM from a development perspective.

According to the CMMI, "The purpose of Configuration Management (CM) is to establish and maintain the integrity of work products using configuration identification, configuration control, configuration status accounting, and configuration audits." Work products are defined as "In the CMMI Product Suite, a useful result of a process. This can include files, documents, products, parts of a product, services, process descriptions, specifications, and invoices. A key distinction between a work product and a product component is that a work product is not necessarily part of the product." So, work products for Task Coach are obviously product components like source code and tests, but also include things like the Task Coach website and announcement emails. The purpose of CM is to establish and maintain the integrity of these work products.

CM has three specific goals and, like all CMMI process areas, five generic goals. For now, I'll only be looking at specific goals.

The first specific goal (SG1) of CM reads "Baselines of identified work products are established." CMMI defines a baseline as "A set of specifications or work products that has been formally reviewed and agreed on, which thereafter serves as the basis for further development, and which can be changed only through change control procedures." CMMI expects organisations to perform three practices to achieve this goal.

The first specific practice (SP1.1) is to "Identify the configuration items, components, and related work products that will be placed under configuration management." Almost all Task Coach work products are placed under some form of configuration management:

Source code, including automated tests, third-party sources, and release scripts, is kept in the Sourceforge Subversion repository.
The website sources are in the Subversion repository as well.
Bug reports are kept in the Sourceforge bug tracker.
Feature requests are tracked using the UserVoice website.
Support requests are tracked with the Sourceforge support request tracker.
Translations are both in the Subversion repository and Launchpad so that translators can edit them.
Mailing list discussions are archived.
Old releases of Task Coach are archived.

This seems pretty complete to me, so I'd say this practice is fully implemented by the Task Coach project.

The second specific practice (SP1.2) for SG1 is "Establish and maintain a configuration management and change management system for controlling work products." Subversion is the configuration management system for source code and the website. For bug reports and support requests we use the Sourceforge trackers. For feature requests we use UserVoice. For translations we use Launchpad. For mailing lists we use Yahoo Groups and its archives. Old releases of Task Coach are archived at Sourceforge. I think we've got this one covered as well.

The third specific practice (SP1.3) is "Create or release baselines for internal use and for delivery to the customer." As specified in our developer info, under the Subversion usage conventions heading, we create a branch in Subversion for each feature (x.y) release and tag each bug fix (x.y.z) release. The change history details, for each release (that is, each baseline), which features were added and which bugs were fixed. CMMI says that baselines have to be formally reviewed. That's not something we do. Instead, we make sure Task Coach is always ready for release: the head of a release branch is always releasable, and the trunk is most of the time. The reason for this approach is that users can benefit from fixed bugs and new features as soon as possible. I guess that makes this practice largely implemented.
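The branch-and-tag convention can be illustrated with a few lines of Python. This is only a sketch: the function name `release_location` and the `branches/` and `tags/` path layout are assumptions made for illustration, not the actual layout of the Task Coach repository.

```python
import re


def release_location(version):
    """Map a version string to the kind of Subversion copy the convention implies.

    Assumption (hypothetical layout): feature releases x.y live under
    'branches/', bug fix releases x.y.z live under 'tags/'.
    """
    if re.fullmatch(r'\d+\.\d+', version):
        # Feature release: gets its own release branch.
        return 'branch', 'branches/' + version
    if re.fullmatch(r'\d+\.\d+\.\d+', version):
        # Bug fix release: tagged, so the baseline can be identified later.
        return 'tag', 'tags/' + version
    raise ValueError('not an x.y or x.y.z version: %r' % version)
```

Calling `release_location` with a two-part version yields a branch location, and with a three-part version a tag location; anything else is rejected.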

Since SP1.1 and SP1.2 are fully implemented and SP1.3 largely, I would say that the first specific goal of CM is achieved.

The second specific goal of CM is "Changes to the work products under configuration management are tracked and controlled." CMMI expects organisations to implement two practices to achieve this goal.

The first practice of the second goal (SP2.1) of CM says "Track change requests for the configuration items." Change requests include changed or new requirements as well as bug reports. As mentioned above, we track change requests in the UserVoice system and the Sourceforge bug tracker. For each change request we keep track of the status, closing it when a new release of Task Coach is available that includes the new feature or fix. Conclusion: this practice is implemented by the Task Coach project.

The second practice of the second goal (SP2.2) is "Control changes to the configuration items." CMMI explains what control means: "Control is maintained over the configuration of the work product baseline. This control includes tracking the configuration of each of the configuration items, approving a new configuration if necessary, and updating the baseline." We track changes to the Subversion repository by means of a commit-message mailing list. Whenever a developer checks in changes to the repository, an email message is sent to that mailing list, notifying the other developers (and possibly other interested parties) of the changes. Only developers are allowed to make changes to the source code. Users are allowed to submit bug reports and feature requests, but the status is monitored and updated by the developers. The baseline is updated as part of the release process. Practice fully implemented, I'd say.

Since both specific practices of SG2 are implemented, this must mean SG2 is achieved.

The third specific goal (SG3) of CM says that "Integrity of baselines is established and maintained." To achieve this goal, CMMI expects organisations to implement another two specific practices.

The first specific practice (SP3.1) of the third goal is "Establish and maintain records describing configuration items." CMMI suggests, in the subpractices of SP3.1, recording configuration management actions and ensuring that relevant stakeholders have access to and knowledge of the configuration status of the configuration items. We do this via Subversion and the commit-message mailing list mentioned above for the source code, and via the Sourceforge bug tracker, UserVoice feature requests, and the change history. CMMI also suggests specifying the latest version of the baselines. This is done quite prominently on the Task Coach website, by means of announcements via the Task Coach users mailing list, and via Twitter. CMMI also suggests identifying the versions of configuration items that constitute a particular baseline. We do this by tagging each release in Subversion. Differences between baselines are described in the change history mentioned before. And finally, CMMI advises us to revise the status and history of configuration items as necessary. Again, Subversion supports this for source code. For other configuration items, such as bug reports and feature requests, the status is updated by the developers using the administrative user interface provided by the Sourceforge and UserVoice websites. Conclusion: practice fully implemented.

The second specific practice (SP3.2) of SG3 reads "Perform configuration audits to maintain integrity of the configuration baselines." CMMI defines a configuration audit as "An audit conducted to verify that a configuration item, or a collection of configuration items that make up a baseline, conforms to a specified standard or requirement." This is something we do not do on a regular basis. We probably should, because we often run into bug reports that should have been closed because the reported bug has been fixed, or feature requests that should have been closed because the feature has been implemented. Often, this is caused by duplicate bug reports and feature requests. Anyhow, this practice is not implemented by the Task Coach project.
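A lightweight audit of the trackers could look roughly like the Python sketch below. It is not something the project has in place: the function names, the idea of exporting report ids and titles from the trackers, and the ids used in the usage note are all hypothetical.

```python
from collections import defaultdict


def stale_reports(open_ids, fixed_ids):
    """Return ids that are still open in the tracker although the
    change history already lists them as fixed or implemented."""
    return sorted(set(open_ids) & set(fixed_ids))


def likely_duplicates(titles_by_id):
    """Group reports whose titles are equal after normalising case and
    whitespace; such groups are candidates for duplicate reports."""
    groups = defaultdict(list)
    for report_id, title in titles_by_id.items():
        normalised = ' '.join(title.lower().split())
        groups[normalised].append(report_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

For example, with made-up ids, `stale_reports([101, 102, 103], [102, 200])` flags report 102 as one that should be closed. Matching duplicates on normalised titles is deliberately crude; a real audit would still need a developer to confirm each candidate.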

Since one practice of SG3 is fully implemented and one is not, SG3 is not satisfied.

I recently migrated the Task Coach source code from its Sourceforge CVS repository to a Subversion repository, also on Sourceforge. I followed the migration instructions at the Sourceforge website and all went smoothly.

Because the cvs2svn script needs the CVSROOT module to be present, you can only remove it after cvs2svn has run, i.e. you need to remove it from the svndump file that cvs2svn creates. svndumpfilter does that for you. Unfortunately, the CVSROOT module is present in the svndump file multiple times: not only in the trunk folder, but also in all branch and version folders of the svndump. A script to help with finding all those CVSROOT occurrences would be nice.

Marcin Zajączkowski helped me out with a shell script. I used it as the basis for a little Python script that removes specific folders or files from your svndump file. Use as you like.


#!/usr/bin/env python

import os, optparse

usage = '''Usage: %prog dumpfile path [path...]
Remove path(s) from <dumpfile> and write new dump file to <dumpfile>.out'''


class PathRemoverOptionParser(optparse.OptionParser):
    def __init__(self):
        optparse.OptionParser.__init__(self, usage)

    def parse_args(self):
        options, args = optparse.OptionParser.parse_args(self)
        if len(args) < 2:
            self.error('provide both dumpfile and path to remove')
        dumpFile, paths = args[0], args[1:]
        if not os.path.exists(dumpFile):
            self.error('dumpfile (%s) does not exist'%dumpFile)
        return dumpFile, paths


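def branches(dumpFile, paths, pathPrefix=b'Node-path: '):
    ''' Find branches in the dumpFile that contain the paths. '''
    # Read in binary mode: dump files may contain arbitrary file contents.
    # (The file() builtin used originally no longer exists in Python 3.)
    with open(dumpFile, 'rb') as dump:
        for line in dump:
            if line.startswith(pathPrefix):
                nodePath = line[len(pathPrefix):].rstrip(b'\n').decode('utf-8')
                for path in paths:
                    if nodePath.endswith(path):
                        yield nodePath # yield branch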


parser = PathRemoverOptionParser()
dumpFile, paths = parser.parse_args()
excludes = ' '.join(branches(dumpFile, paths))
if not excludes:
    parser.error('path(s) (%s) not found'%', '.join(paths))

os.system('svndumpfilter exclude %s < %s > %s.out'%(excludes, dumpFile, dumpFile))
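The final `os.system` call builds a shell command by string interpolation, which breaks when a path contains spaces or shell metacharacters. A possible alternative, sketched here under the assumption of Python 3, passes the arguments as a list and wires up the input and output files directly. The helper names `build_command` and `filter_dump` are mine, not part of the original script.

```python
import subprocess


def build_command(excludes):
    """Return the svndumpfilter argument list for excluding the given paths."""
    return ['svndumpfilter', 'exclude'] + list(excludes)


def filter_dump(dump_file, excludes, out_file=None):
    """Filter dump_file through svndumpfilter, writing to <dump_file>.out
    unless another output file is given. Returns the output file name."""
    out_file = out_file or dump_file + '.out'
    # svndumpfilter reads the dump on stdin and writes the result to stdout.
    with open(dump_file, 'rb') as source, open(out_file, 'wb') as target:
        subprocess.check_call(build_command(excludes), stdin=source, stdout=target)
    return out_file
```

Because no shell is involved, the paths need no quoting, and `check_call` raises an exception if svndumpfilter exits with an error instead of silently leaving a truncated output file behind.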

Task Coach

Tuesday, July 13, 2010

Configuration Management for Task Coach

Sunday, April 06, 2008

Migrating from CVS to Subversion

Saturday, October 14, 2006

The need for branches
