Creosote

Hardcore Math for the Ruby Community (Creosote tagline in progress...)

Checklist for the Benevolent Open Source Maintainer

It's easy to whip up a quick mass of code that does something useful for yourself or your development team, throw it on GitHub, and call it open source. But when it comes to effectively sharing your work and giving back to the community, you need to do a little more work.

Supporting your open source project can take many different forms, and being the perfect benevolent maintainer doesn't have to be the goal. But you should certainly consider this list of TODOs that will get your code out there, help your users understand your project, and enable your project to grow.

The Basics

I consider these tasks to be the bare minimum whenever you want to share your code with the world, and make it minimally easy for anyone to use. The basic idea behind these tasks is to give people a starting place. People wanting to hack on your project need a starting place just as the users of your code do.

Source

At a minimum, you need to make your source code easily accessible. This can be done in a variety of ways:

The best solution is likely GitHub which, at a glance, provides:

Other solutions include

README

A README is an absolute minimum. Not all projects have full-fledged manuals or generated docs, but a README serves as a great introduction to your project. You get this for free when you go with GitHub, and many projects use their README as their primary documentation. Excellent examples include:

  • Capybara has an ~800 line README hosted at GitHub. They also have a GitHub page that points to 4 different resources.
  • Sinatra has a 2000 line README! They do have other documentation resources, but this README is boss.

Tests

Honestly, tests will vary from programming community to programming community. The Ruby community is crazy awesome at this concept. It is now somewhat expected to see:

In my experience, the Perl community and regions of the C community are also really good about testing their code, and publishing/documenting tests. The JavaScript community is also getting pretty good about this stuff. Awesome.

I think that providing a tiny bullet in your README showing how to run tests is perfectly adequate here. Lots of projects have various gotchas regarding your testing environment, so it is very helpful to spend 10 minutes and write out these issues.
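As a rough sketch of what that README bullet could point at, here is a minimal Rakefile for a Ruby project. It assumes the tests live under test/ and are named *_test.rb; that layout and the task name are placeholders for illustration, not something any particular project prescribes.

```ruby
# Rakefile (hypothetical layout: tests live in test/ and end in _test.rb)
require "rake/testtask"

# Define a `test` task so contributors can run the suite with `rake test`.
Rake::TestTask.new(:test) do |t|
  t.libs << "test"                 # put test/ on the load path
  t.pattern = "test/**/*_test.rb"  # assumed naming convention
end

# Make a bare `rake` run the tests too.
task default: :test
```

With something like this in place, the README bullet can be as short as "Run the tests with `rake test`", followed by whatever environment gotchas apply.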

Packaging / Distribution

Whether your open source project is Perl, C, Java, Ruby, Python, JavaScript, etc., there are likely one or more common distribution platforms for similar packages. Here are a few great examples to follow:

  • Shotwell, by Yorba, is the best photo organizer for Linux. They provide a slick little install page that gives detailed installation steps, whether you are on Ubuntu, Fedora, or something else. They provide their own update repository, so you can stay up to date. You can also build from source.
  • Gherkin is the language behind Cucumber. The fast lexer and parser that they developed is available through the following channels:
    • RubyGems.org, for Ruby and JRuby,
    • npm, for Node.js,
    • Maven Central for Java,
    • NuGet, for .NET.
  • VLC provides about a million download links for every Linux flavor, mainstream OS, and more.

No matter your environment, you likely have some sort of package distribution.

Good Practices

These tasks will require some more work, but will be very helpful to your community (or rather... help to create a community around your project). Some of these are honestly very easy to complete, so you should look at all of these before groaning and clicking away :).

Additionally, some of these ideas will be helpful for you. Documenting your code will help your team to use and extend the code. Writing a manual or a Getting Started document will save you time whenever someone wants to try your code out.

GitHub Page

If your code is already hosted as a GitHub project, a great way to create a one-stop homepage is a GitHub Page. I'm going to let GitHub explain these themselves, and I'll just say that they work great as READMEs, and multiple pages can form a great mini manual. Here are some examples:

Auto-Generated Documentation

This one definitely varies from language to language. Not a lot of people use this kind of thing with C or C++ projects, but there are systems like

  • Doxygen, "a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.", and
  • Docco, "Docco is a quick-and-dirty, hundred-line-long, literate-programming-style documentation generator. It produces HTML that displays your comments alongside your code. Comments are passed through Markdown, and code is passed through Pygments syntax highlighting.", which can document CoffeeScript, JavaScript, Ruby, Python, or TeX.

Other languages, however, like Ruby and Java, have very standard and very popular auto-generating documentation engines. Some projects take crazy advantage of this, such as Ruby's Sequel, and Jira's SOAP API.
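The paragraph above does not name the engines, so the following sketch assumes RDoc, which ships with Ruby. The `gcd` helper is a made-up example; the point is only that a plain comment block above a method is what the generator turns into documentation.

```ruby
# Computes the greatest common divisor of two integers.
#
# +a+ and +b+ are Integers. Returns a non-negative Integer.
#
#   gcd(12, 18)  # => 6
def gcd(a, b)
  a.gcd(b)  # delegate to Ruby's built-in Integer#gcd
end
```

Running `rdoc` in the project directory would then produce HTML pages from comments like this one.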

Manual

To put it simply: a manual is important if your project is large. Projects like Ruby's Headless gem don't need a manual (Headless exposes a grand total of 5 methods, I think: Headless.ly, Headless.new, Headless#start, Headless#destroy, and Headless#video). Apache Commons's StringUtils exposes quite a few static methods, but there is no complicated structure or procedure that needs to be explained. The JavaDoc is quite sufficient.

If, however, you have an open source project that has a vast array of classes, methods, interactions, interfaces, etc., then it would behoove you to write some semblance of a manual. I'm using the word "manual" pretty loosely here. This can be brief or long, on one page or many. Here are a few great examples of manuals, some of which are not very different from a README (these things definitely overlap):

  • Sequel has an HTML manual, with pages linking to each other. There is a page for every substantial topic, such as "Sequel for SQL Users" and "Migrations" and "Model Validations".
  • Bundler has a minimal manual, as a GitHub Pages project. It serves its purpose, but is a little hard to navigate. This is a crazy simple way to write a manual, though. Just write up a bunch of documents in a git repo, and link them to each other. Here is the source for Bundler's GitHub Pages.
  • GMP provides a manual in two different formats: single-page HTML, and PDF. They are able to do this by writing all of their documentation in Texinfo.

A Getting Started Document

Getting Started will of course vary from project to project. I think that these are important for:

  • a project that has many concepts that build on each other, such as Sequel,
  • a project that requires some boilerplate code that is written just once per implementation. Sequel is a good example of this, as is Nokogiri, which provides a nice Synopsis on their home page.
  • a project that has a complicated API, or introduces a whole new programming style, such as EventMachine.

Taking it Seriously

Extensive Manual

Beyond the basic manual, you can create something more extensive. This can be as massive as a beautiful PDF, explaining everything there is to know about your project. GMP, MPFR, and MPC are great examples. Each of these projects has documentation written in Texinfo, which they compile into both HTML and PDF. GMP's latest PDF is 1.1MB, at 145 pages, and FLINT's PDF is 1.3MB, at 192 pages.

I absolutely recommend writing your manual in some format that can be published to multiple others. Pandoc is my current absolute favorite for this kind of work. Anything you hope that Pandoc supports, it does. I use it to translate a Markdown document into HTML and PDF.
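Here is a small sketch of that Markdown-to-HTML-and-PDF workflow, wrapped in Ruby so it could live in a Rakefile or build script. The helper names and the file names are hypothetical; the only Pandoc options used are `--standalone` and `-o`. Note that Pandoc's PDF output goes through a separate PDF engine (LaTeX by default), which has to be installed.

```ruby
# Build the pandoc command lines for one Markdown source.
# `source` and `basename` are whatever your manual is called (hypothetical here).
def pandoc_commands(source, basename)
  [
    # --standalone emits a complete HTML document rather than a fragment
    ["pandoc", "--standalone", source, "-o", "#{basename}.html"],
    # pandoc picks the PDF writer from the .pdf extension; needs a PDF engine
    ["pandoc", source, "-o", "#{basename}.pdf"]
  ]
end

# Run each command, stopping loudly if pandoc is missing or fails.
def build_manual(source, basename)
  pandoc_commands(source, basename).each do |cmd|
    system(*cmd) or raise "pandoc failed: #{cmd.join(' ')}"
  end
end
```

Calling `build_manual("manual.md", "manual")` would then write manual.html and manual.pdf next to the source, assuming pandoc is on the PATH.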

Public Continuous Integration

OK, if you have some tests written and you like Continuous Integration or if you don't like Continuous Integration, you should totally absolutely consider a Public Continuous Integration process. Right now, uhh... I think the only service for this is Travis CI. This service is amazing. You can go to the homepage and poke around, and see all of the projects currently using Travis CI. Their GitHub page is a nice actual introduction to the service. And here's an excellent screencast. Here's a quote from their Goals:

Travis is an attempt to create an open-source, distributed build system for the OSS community that:

  1. Allows open-source projects to effortlessly register their GitHub repository and have their test suites run after pushes
  2. Allows users to contribute build capacity by connecting a machine that runs Travis workers and VMs they use on their underused servers

With Travis CI our vision is to become for builds (i.e. tests, for starters) what services like rubygems.org or Maven Central are for distribution of libraries. We strive to build a rock solid, but dead-easy to use open-source continuous integration service for the open source community.

Travis does require GitHub for their integration (I think...). If you have a DevOps dude sitting around, doing nothing (yeah... right, we all have extra labor lying around), you could try to build it yourself as well. The entire freaking service is open source.

License

No one goes into software development for the fun legal reasons, but it is important to license your code. For some resources on various open source licenses, you can view the Open Source Initiative's list of Open Source Licenses by Category. They have some great (but opinionated) advice on choosing a license, and good definitions:

Also, if this is a Ruby gem, consider documenting the license in your gemspec.
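A minimal sketch of what that looks like, using a made-up gem. Every name and value below is a placeholder; the relevant line is `spec.license`, which RubyGems stores in the gem's metadata. "MIT" is just an example identifier, so substitute whichever license you actually chose.

```ruby
# example_gem.gemspec (hypothetical gem; all values are placeholders)
gemspec = Gem::Specification.new do |spec|
  spec.name    = "example_gem"
  spec.version = "0.1.0"
  spec.summary = "Placeholder summary for illustration"
  spec.authors = ["Your Name"]
  spec.files   = []

  # Declare the license so tools and users can see it without opening LICENSE.
  spec.license = "MIT"
end
```

Once the gem is published, the declared license shows up alongside the rest of the gem's metadata, which saves your users a trip into the source tree.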

If you are writing a (Java) project managed by Maven, you might want to look into Sonatype's Insight (or similar). From their website:

Hidden license obligations can leave your organization exposed to legal and business risk. To eliminate this risk, you need visibility into how components are licensed. And not just the licenses for the components you included, but the entire dependency tree.


Are you aware of what your license obligations are? For the top level dependencies alone, this is an insurmountable task. With our report you will know exactly what those obligations are and where to find them within the tree.

Here is an excellent graph showing the legal ramifications of depending on / including various software in your project.

CONTRIBUTING Document

The need for a CONTRIBUTING document is perhaps most obvious when your project is on GitHub and anyone can clone your repository with a single mouse click. But I think that this document is important in every project you make available. Anyone who downloads your source code from your SVN repository, or your SourceForge page, etc., might want to contribute to your project. Some people will want to fix a typo in the documentation. A common experience I have is Windows users taking it upon themselves to fix incompatibilities that the primary developers never had a chance to test.

A CONTRIBUTING document doesn't need to be very long, but it should provide the information that you would otherwise be asked for over and over, along with tips for your adoring fans on how to get their contributions accepted into your code base. GitHub recently formalized the idea of Contributing Guidelines, and they point to Puppet and factory_girl_rails as two projects with great CONTRIBUTING docs. Here's a list of what you should discuss in your CONTRIBUTING document:

  • How to initially make your contribution known, such as by opening an issue on the project's tracker,
  • How to properly commit and tag your changes,
  • How to run your tests,
  • Whether a potential contributor needs to sign a CLA, such as Puppet Labs', or the CLA for jQuery,
  • Rules for new contributions,
  • What versions of the language or VM need to be tested against,
  • Syntax guidelines or tips.

As far as "Rules for new contributions" go, I like factory_girl_rails's:

"Add a test for your change. Only refactoring and documentation changes require no new tests. If you are adding functionality or fixing a bug, we need a test!"

And Mail's:

"Include tests that fail without your code, and pass with it"

Extensive Tests

Many projects take testing very seriously. Good testing practices, code coverage, and platform compatibility can look very good to someone evaluating your project. I know that I hold Ruby's Mail gem in very high regard when I see the following in the README:

Basically... we do BDD on Mail. No method gets written in Mail without a corresponding or covering spec. We expect as a minimum 100% coverage measured by RCov. While this is not perfect by any measure, it is pretty good. Additionally, all functional tests from TMail are to be passing before the gem gets released.

Presentations

Conference presentations are a great way to get the word out about your project. Now that there are 53424581 conferences for every programming language, paradigm, and web framework out there, you should be able to get a slot if your project is rather general purpose, or if you can extrapolate general lessons from your time creating this project. Local user groups and regional conferences can lead to bigger conferences, and better presentations. I think a great example of this is Cucumber, where they have a few presentation videos all on the homepage.

Screencasts

Screencasts are fun! Fun for both you and the users. It is possible to create screencasts rather cheaply these days. You likely have a laptop with a microphone and a camera. Screencasting software is available and free for Windows, Mac OS X, and Linux.

IRC Channel

We're now definitely getting into the dedicated, benevolent maintainer arena here. Dedicating an IRC channel, and your time as support, to your project is a big step.

However, this can be an invaluable resource for your adoring fans. Having a conversation with the authors of a project, as well as more experienced users, is crazy useful. I've talked to Wayne at #rvm for help using RVM. I think I've chatted in a Ruby chat, a Ruby developers chat, a Scintilla chat, etc.

Mailing List

A mailing list is certainly an easier tool to implement. There are pretty standard places to build a mailing list (i.e. a group), such as Google Groups and Yahoo Groups, and I think even SourceForge will set up a mailing list for you.

Wiki

A wiki is another form of documentation, and again is not mutually exclusive with other forms, like the README, GitHub page, and manual. You can also get wikis for free from a number of places. The simplest, and the one that probably jibes best with the other work you've been up to, is GitHub Wikis (backed by Gollum (backed by Git)).

Support Tracker
