It can be useful to review open source
development processes from time to time. This reddit thread[1] serves
use both as a case study, and also a moment of OSS process introduction
for newbies.
[1]
http://www.reddit.com/r/Bitcoin/comments/2pd0zy/peter_todd_is_saying_shoddy_development_on_v010/
Dirty Laundry
When
building businesses or commercial software projects, outsiders
typically hear little about the internals of project development. The
public only hears what the companies release, which is prepped and
polished. Internal disagreements, schedule slips, engineer fistfights
are all unseen.
Open source development is the opposite.
The goal is radical transparency. Inevitably there is private chatter
(0day bugs etc.), but the default is openness. This means that is it
normal practice to "air dirty laundry in public." Engineers will
disagree, sometimes quietly, sometimes loudly, sometimes rudely and with
ad hominem attacks. On the Internet, there is a pile-on effect, where
informed and uninformed supporters add their 0.02 BTC.
Competing
interests cloud the issues further. Engineers are typically employed
by an organization, as a technology matures. Those organizations have
different strategies and motivations. These organizations will sponsor
work they find beneficial. Sometimes those orgs are non-profit
foundations, sometimes for-profit corporations. Sometimes that work is
maintenance ("keep it running"), sometimes that work is developing new,
competitive features that company feels will give it a better market
position. In a transparent development environment, all parties are
hyperaware of these competing interests. Internet natterers
painstakingly document and repeat every conspiracy theory about Bitcoin
Foundation, Blockstream, BitPay, various altcoin developers, and more as
a result of these competing interests.
Bitcoin and altcoin
development adds an interesting new dimension. Sometimes engineers
have a more direct conflict of interest, in that the technology they are
developing is also potentially their road to instant $millions.
Investors, amateur and professional, have direct stakes in a certain
coin or coin technology. Engineers also have an emotional stake in
technology they design and nurture. This results in incentives where
supporters of a non-bitcoin technology work very hard to thump bitcoin.
And vice versa. Even inside bitcoin, you see "tree chains vs. side
chains" threads of a similar stripe. This can lead to a very skewed
debate.
That should not distract from the engineering
discussion. Starting from first principles, Assume Good Faith[2]. Most
engineers in open source tend to mean what they say. Typically they
speak for themselves first, and their employers value that engineer's
freedom of opinion. Pay attention to the engineers actually working on
the technology, and less attention to the noise bubbling around the
Internet like the kindergarten game of grapevine.
[2]
http://en.wikipedia.org/wiki/Wikipedia:Assume_good_faith
What the fork?
In
this case, a tweet suggests consensus bug risks, which reddit account
"treeorsidechains" hyperbolizes into a dramatic headline[1]. However,
the headline would seem to be the opposite of the truth. Several
changes were merged during 0.10 development which move snippets of
source code into new files and new sub-directories. The general
direction of this work is creating a "libconsensus" library that
carefully encapsulates consensus code in a manner usable by external
projects. This is a good thing.
The development was
performed quite responsibly: Multiple developers would verify each
cosmetic change, ensuring no behavior changes had been accidentally (or
maliciously!) introduced. Each pull request receives a full
multi-platform build + automated testing, over and above individual dev
testing. Comparisons at the assembly language level were sometimes made
in critical areas, to ensure zero before-and-after change. Each
transformation gets the Bitcoin Core codebase to a more sustainable,
more reusable state.
Certainly zero-change is the most
conservative approach. Strictly speaking, that has the lowest consensus
risk. But that is a short term mentality. Both Bitcoin Core and the
larger ecosystem will benefit when the "hairball" pile of source code is
cleaned up. Progress has been made on that front in the past 2 years,
and continues. Long term, combined with the "libconsensus" work, that leads to less community-wide risk.
The
key is balance. Continue software engineering practices -- like those
just mentioned above -- that enable change with least consensus risk.
Part of those practices is review at each step of the development
process: social media thought bubble, mailing list post, pull request,
git merge, pre-release & release. It probably seems chaotic at
times. In effect, git[hub] and the Internet enable a dynamic system of
review and feedback, where each stage provides a check-and-balance for
bad ideas and bad software changes. It's a human process, designed to
acknowledge and handle that human engineers are fallible and might make
mistakes (or be coerced/under duress!). History and field experience
will be the ultimate judge, but I think Bitcoin Core is doing good on
this score, all things considered.
At the end of the
day, while no change is without risk, version 0.10 work was done with
attention to consensus risk at multiple levels (not just short term).
Technical and social debt
Working
on the Linux kernel was an interesting experience that combined
git-driven parallel development and a similar source code hairball. One
of the things that quickly became apparent is that cosmetic patches,
especially code movement, was hugely disruptive. Some even termed it
anti-social. To understand why, it is important to consider how modern
software changes are developed:
Developers work in
parallel on their personal computers to develop XYZ change, then submit
their change "upstream" as a github pull request. Then time passes. If
code movement and refactoring changes are accepted upstream before XYZ,
then the developer is forced to update XYZ -- typically trivial fixes,
re-review XYZ, and re-test XYZ to ensure it remains in a known-working
state.
Seemingly cosmetic changes such as code
movement have a ripple effect on participating developers, and wider
developer community. Every developer who is not immediately merged upstream must bear the costs of updating their unmerged work.
Normally, this is expected. Encouraging developers to build on top of "upstream" produces virtuous cycles.
However,
a constant stream of code movement and cosmetic changes may produce a
constant stream of disruption to developers working on non-trivial
features that take a bit longer to develop before going upstream.
Trivial changes become encouraged, and non-trivial changes face a binary
choice of (a) be merged immediately or (b) bear added re-base, re-view,
re-test costs.
Taken over a timescale of months, I
argue that a steady stream of cosmetic code movement changes serves as a
disincentive to developers working with upstream. Each upstream
breakage has a ripple effect to all developers downstream, and imposes
some added chance of newly introduced bugs on downstream developers.
I'll call this "social debt", a sort of technical debt[4] for
developers.
[4]
http://en.wikipedia.org/wiki/Technical_debt
As
mentioned above, the libconsensus and code movement work is a net
gain. The codebase needs cleaning up. Each change however incurs a
little bit of social debt. Life is a little bit harder on people trying
to get work into the tree. Developers are a little bit more
discouraged at the busy-work they must perform. Non-trivial pull
requests take a little bit longer to approve, because they take a little
bit more work to rebase (again).
A steady flow of code movement and cosmetic breakage into the tree may be a net gain, but it also incurs a lot of social debt. In such situations, developers find that tested, working out-of-tree code repeatedly stops working during the process of trying to get that work in-tree. Taken over time, it discourages working on the tree. It is rational to sit back, not work on the tree, let the breakage stop, and then pick up the pieces.
Paradox Unwound
Bitcoin
Core, then, is pulled in opposite directions by a familiar problem. It
is generally agreed that the codebase needs further refactoring.
That's not just isolated engineer nit-picking. However, for non-trivial
projects, refactoring is always anti-social in the short term. It
impacts projects other than your own, projects you don't even know
about. One change causes work for N developers. Given these twin
opposing goals, the key, as ever, is finding the right balance.
Much
like "feature freeze" in other software projects, developing a policy
that opens and closes windows for code movement and major disruptive
changes seems prudent. One week of code movement & cosmetics
followed by 3 weeks without, for example. Part of open source parallel
development is social signalling: Signal to developers when certain changes are favored or not, then trust they can handle the rest from there.
While
recent code movement commits themselves are individually ACK-worthy,
professionally executed and moving towards a positive goal, I think the
project could strike a better balance when it comes to disruptive
cosmetic changes, a balance that better encourages developers to work on
more involved Bitcoin Core projects.