Monday, 25 October 1999

The lesson of Agincourt

Open source programming underdogs may have more advantages than their critics imagine. But whether they win or lose in the end, it's their story that inspires.

Giving away software is a "strategy of the weak." At least most business analysts consider it so when comparing the strategy against the "embrace and extend" strategies used by powerful software houses like Microsoft.

But let me tell you a story that took place 584 years ago today. A story about the weak and their strategies.

Act I

On October 25, 1415, on Saint Crispin's day, a troop of tired, sick, and hungry men -- the unquestionably weak -- defeated an enemy who not only outnumbered them four-to-one but who fought on their own turf. It was at the Battle of Agincourt, when King Henry V led his troops into one of the greatest victories of military history.

Six thousand English soldiers were marching in retreat. Their destination was the safe port of Calais, from which they hoped to escape to England. They were intercepted at Agincourt by French Constable Charles d'Albret and a French army 25,000 strong, including armored cavalry and infantry. Annihilation of the English troops was imminent; they had no advantages -- except for the rain. It rained, and rained, and rained, and the battlefield became a quagmire. Mired in mud, the French cavalry were easy prey for the English archers. The heavily armored infantrymen slipped and fell. Struggling to rise, they were cut down by the hatchets, hooks, and knives of the highly mobile English raiders.

For every Englishman killed, 25 French were slain. Some 500 French noblemen died on the field. The victory paved the way for English domination of most of France until the middle of the century.

Act II

Like the Constable d'Albret at Agincourt, most open source detractors misjudge the present situation.

They misjudge open source programmers, accusing them of amateurism and claiming that their only motivation is peer recognition. Critics also jump to conclusions about where and how these programmers earn their income, and about their knowledge and skills. The critics are in for a surprise.

And like d'Albret, analysts misjudge the battlefield as well. Today's internetworked world is fundamentally different from that on which the current software giants built their empires. And things are changing rapidly. Today, news of friends or foes, features or bugs, spans the world in a matter of hours. Things are fundamentally different when anyone can post a question and receive more than one authoritative answer from people across the globe. Things are fundamentally different when even your teenage son is among the ones posting the answers. Things are fundamentally different when the computer I carry in my waist pack is several times more powerful than the computer I learned to program on. It's raining, folks. It's pouring!

Epilogue

Business analysts comment with a sneer that it's easy to double your market share every year when you start from zero. They speculate about what will happen when the software giants feel their territory threatened and decide to charge. They predict victory for the corporations, the apparently strong. But in the fast-changing world of the Internet, the advantage may well go to those who are flexible, agile, free from corporate inertia, and brave enough to take the risks.

The Hundred Years' War, of which Agincourt was part, was eventually won by the French, and the English were expelled from French territory. But the legend of the victory at Agincourt lives on.

Likewise, the final outcome of the battles between the open source advocates and traditional business may not be the most important thing. It's the tale of open source and of the souls that fight for it that inspires, and will therefore last.

This story shall the good man teach his son;
And Crispin Crispian shall ne'er go by,
From this day to the ending of the world,
But we in it shall be remembered;

We few, we happy few, we band of hackers;
For he today that shares his code with me
Shall be my brother; be he ne'er so vile,
This day shall gentle his condition:

And programmers of closed source now a-bed
Shall think themselves accurs'd they didn't code,
And hold their skill sets cheap whiles any speaks
That coded with us upon Saint Crispin's day.

By Juancarlo Añez with apologies to William Shakespeare

Originally written for In Publishing LLC
Copyright © 1999 Inprise Corp.

Monday, 18 October 1999

Xtreme testing

Extreme programmers brave the open source slopes.

You can't do it. You can't add the feature. It's too risky. You'd have to change a core part of the program, and who knows what kind of subtle bugs might creep in doing that? You got hold of the source code and made sure that the license allows you to modify it, but it's too dangerous. Isn't it?

Well, the proponents of Extreme Programming say not only that you can, but that you should!

Extreme programmers aim for code that's small, lean, mean, and functional. To achieve the lean and mean, x-programmers always do the simplest thing that could possibly work. They resist the temptation to implement stuff that may be useful later, because they assume they're not going to need it. X-programmers are fearless about modifying the code, so their software is always functional. They will edit, refactor, or reimplement even large sections when they need to add a new feature.

But what about the risks? What about the bugs? The extreme programmers' answer: extreme testing. They write a test case for every feature they implement, even before they start coding the feature. Need to add new functionality? Write a test case. Found a bug? Write a test case. Have doubts about how something really works? Write a test case. The set of test cases thus produced serves to guarantee that the system keeps doing what it should even after important modifications. X-programmers run the whole battery of tests after each change, and they never release the code until all the test cases run flawlessly.

Extreme testing produces a large number of test cases, and the process of running all the tests and evaluating the results could get unwieldy. X-programmers use automation to keep extreme testing under their control. They have written testing frameworks for Java, C++, and several different flavors of Smalltalk. The frameworks are all freely available from the XProgramming software page. The Java testing framework provides one example of how extreme testing can be used.

JUnit

JUnit is a small testing framework written in Java by Kent Beck and Erich Gamma. Beck and Gamma are well known as two of the founders of the Software Patterns movement, and this shows in the design simplicity of JUnit. Using JUnit to test Java programs is very easy. You start by writing a descendant of junit.framework.TestCase. This is an example from my Java Diff package:

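import junit.framework.*;

public class DiffTest extends TestCase {
    public DiffTest(String testName) {
        super(testName);
    }

    // Diffing a text against an empty one should yield a single deletion.
    // (original and empty are assumed to be fixture fields declared in the
    // elided part of the class.)
    public void testDeleteAll() {
        Revision revision = Diff.diff(original, empty);
        // JUnit's assertEquals takes the expected value first, then the actual one.
        assertEquals(1, revision.size());
        assertEquals(DeleteDelta.class, revision.getDelta(0).getClass());
        // assertTrue stands in for the old assert(), because assert is now a Java keyword.
        assertTrue(Diff.compare(revision.patch(original), empty));
    }
    //...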

Each method in a test case exercises a different aspect of the tested software. The aim, as is customary in software testing, is not to show that your program works fine, but to show that it doesn't. Interaction with the JUnit framework is achieved through assertions, which are method calls that check (assert) that given conditions hold at specified points in the test execution.
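To make the idea of an assertion concrete, here is a minimal sketch in plain Java that does not depend on JUnit. The names MiniAssert, check, checkEquals, and passes are hypothetical and are not part of JUnit's API. The sketch only assumes that a failed check is signalled by throwing an error, which whatever runs the test can catch and report.

```java
// A hand-rolled illustration of what an assertion does; this is not JUnit code.
class MiniAssert {

    // Throws if the condition does not hold, so a failure cannot go unnoticed.
    static void check(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Compares the expected value with the value the code actually produced.
    static void checkEquals(Object expected, Object actual) {
        boolean same = (expected == null) ? actual == null : expected.equals(actual);
        check("expected <" + expected + "> but was <" + actual + ">", same);
    }

    // Runs one test body and reports whether all of its checks held.
    static boolean passes(Runnable testBody) {
        try {
            testBody.run();
            return true;
        } catch (AssertionError failure) {
            System.out.println("FAILED: " + failure.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(passes(() -> checkEquals(4, 2 + 2))); // check holds
        System.out.println(passes(() -> checkEquals(5, 2 + 2))); // check fails
    }
}
```

A real framework adds bookkeeping on top of this (counting failures, naming the failing test, distinguishing failures from unexpected errors), but the core contract is the same: a check either passes silently or stops the test.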

JUnit allows for several ways to run the different tests, either individually or in batches, but the simplest one by far is to let the TestSuite collect the tests using Java introspection:

    // Declared static so that it can be called as DiffTest.suite().
    public static TestSuite suite() {
        return new TestSuite(DiffTest.class);
    }
    //...
    TestResult result = DiffTest.suite().run();

When a class is passed to the constructor of a new instance of TestSuite, the instance uses Java introspection to find all the methods in the class with names that start with "test". A subsequent call to TestSuite's run method executes all the tests and returns the results as an instance of TestResult. TestSuite also allows the creation of, well, suites, which include any combination of test cases.
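The discovery step described above can be sketched with the standard java.lang.reflect API. This is an illustration of the technique, not JUnit's actual source: the class TestDiscoverySketch, its nested SampleTest, and the method findTestMethods are hypothetical names, and the filtering rules (public, no arguments, name starting with "test") are an assumption about what such a collector would reasonably accept.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of name-based test discovery; JUnit's real TestSuite may differ in detail.
class TestDiscoverySketch {

    // A hypothetical class under inspection, standing in for DiffTest.
    static class SampleTest {
        public void testAdd() { }
        public void testDelete() { }
        public void helper() { }            // skipped: name does not start with "test"
        public void testWithArg(int n) { }  // skipped: takes a parameter
    }

    // Returns the sorted names of public, no-argument methods starting with "test".
    static List<String> findTestMethods(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getMethods()) {
            boolean isPublic = Modifier.isPublic(method.getModifiers());
            boolean noArgs = method.getParameterCount() == 0;
            if (isPublic && noArgs && method.getName().startsWith("test")) {
                names.add(method.getName());
            }
        }
        Collections.sort(names); // getMethods() guarantees no particular order
        return names;
    }

    public static void main(String[] args) {
        System.out.println(findTestMethods(SampleTest.class)); // [testAdd, testDelete]
    }
}
```

The appeal of this design is that adding a test means adding a method and nothing else; there is no separate registry to keep in sync with the test class.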

For maximum convenience, JUnit provides both text-based and GUI (Swing) user interfaces that facilitate the execution of test suites. Both user interfaces can be run as standalone programs by passing the class name of the test suite to run as the first command-line argument.


[Figure: JUnit's GUI reports progress and results as the tests are executed.]

You can also run the tests from within your favorite integrated development environment (IDE), and have the debugger stop as soon as a problem is found. Your static main method would look something like this:

    public static void main(String[] args) {
        junit.textui.TestRunner.run(DiffTest.suite());
    }

Open tests

Proponents of extreme programming describe their approach as a "lightweight methodology," to contrast it with the heavyweight methodologies that became popular during the 1980s, and that few software projects continue to follow. XP establishes only a small set of common-sense rules and practices, and the documentation generated is no more than the minimum required to keep things flowing smoothly. Almost all software development projects, including open source ones, could benefit from some amount of XP.

Why talk about testing and methodologies in an open source column? Much open source discourse is devoted to the "freedom" programmers enjoy to modify open source code at will. The truth is that this supposed freedom is only partial. Most of the open source software one can get hold of lacks the testing infrastructure required to make sure that adding features to the code doesn't also break it. As things stand, open source programmers must devise their own ways to verify the software they change, resulting in an incalculable amount of duplicated -- and mostly wasted -- effort.

Open source projects greatly benefit when they adopt testing policies and a testing framework from the start. Most open source projects do perform formal testing. All such projects would be improved if the numerous test cases produced could be collected and officially distributed. Adopting the test frameworks provided by the XP people is one approach, but almost any other test framework would do.

Extreme testing needs to become an integral part of open source practice. Only then will we truly experience the freedom of working to improve a piece of code without the fear of breaking it.

Originally written for In Publishing LLC
Copyright © 1999 Inprise Corp.

Monday, 11 October 1999

Start the revolution without me

Confused by the ruckus over open-source software? Bobby McFerrin had it right: Don't worry, keep hacking.

Everybody's talking about the "open software revolution." From geek forums like Slashdot to academic journals like Communications of the ACM, from The New York Times to The Wall Street Journal, everyone is saying that there's something important going on.

The leaders of the open-source movement argue that their way of doing things will change the world of software development forever. Detractors question their leadership and attack their ideas on business, logical, economic, political, sociological, and other grounds. The hubbub is rising so fast I've seen good friends worry that open-source software may destroy their way of life by making it impossible to sell software or write it for hire.

I'm not at home with the rhetoric of any of the factions. I'm not a radical, I'm a programmer. You say you want a revolution? You can count me out.

War of the words

Like many programmers, I have used and written open-source software for a long time. I'm glad that we now have a phrase -- open source -- to describe what we've been doing. But I object to narrow definitions that exclude workable, time-honored, and popular practices. Now that we have a buzzword, people are trying to cram political and economic agendas into the term.

There are those, for instance, who argue that open-source development is the solution to all software development problems. But they do so based on information drawn from just a few successful projects. Most of the open-source projects I have tracked over the past 18 months have been cancelled, only to become data points that feed the well-known statistics about the high failure rate of software projects. Methodologies are a dime a dozen; these projects failed because they never had a feasible business plan.

In the opposite corner are those who see the open-source movement as a threat to the software industry. Never mind that open-source initiatives are law-abiding and -- because of their open nature -- much less likely to fall into the anticompetitive practices that characterize a part of our industry. Never mind that thousands of developers are gainfully employed on open-source projects. Never mind that hundreds of thousands of users entrust their businesses to their work. They are convinced that the sky is falling.

Then there's the debate over "free" software.

On one side are the developers and activists who believe software should not be treated as intellectual property. Or that it should be treated like intellectual property, but developers should waive their property rights.

On the other side are people who oppose free software. Some because they think it's impractical, and some because the initiative seems politically suspect -- too leftish. The extreme left has become extremely unpopular in most circles, and all of this talk of free software smacks of outmoded socialist ideas. As long as those impressions persist, I don't see how the free software cause can make it to a critical mass of supporters.

Intellectual property has been a hairy issue since the invention of speech at least, and intense debates about it precede the Internet era by centuries. The debates are so intense, you have to wonder how any country's founding fathers were able to quiet the ruckus long enough to slip intellectual-property language into their constitutions.

The shouting's so loud, it's tempting to ignore the arguments. But you can't simply ignore the "free" versus "proprietary" software debate -- no more than you would ignore debates between Republicans and Democrats in the U.S. or Greens and Social Democrats in Germany. This debate is truly about ideas. It's worthy of your attention.

It is difficult to write about these issues without making a contribution to the arguments. But I'm determined: I'm not about to take a side on these issues. I do think that open-source development will impact software development in general, and in an important way. But I think that the effect won't be immediate, and it won't be obvious. To know what this "revolution" will actually lead to...well, you have to wait. My take is that the factions will ultimately reconcile.

Strip away the rhetoric and it's clear: There is no revolution. The ideas being argued and shouted are interesting, and important, yes, but their effect will be evolutionary. Like always.

So don't worry. It will be all right.

Originally written for In Publishing LLC
Copyright © 1999 Inprise Corp.

Monday, 4 October 1999

The true meaning behind open source licenses

What does all this licensing stuff really mean?

I develop software for a living, so my position regarding third-party software has always been pragmatic: consider the cost and evaluate how well each solution meets your requirements. Most open source software is licensed free of charge. But does software that is licensed for free really have zero cost?

Property rights -- who has the right to own what -- are at the ideological core of political systems and are one of the fundamental individual liberties in a capitalist society. Software is considered a form of property by law, so talking about software licenses -- including open source licenses -- in that context makes sense.

I'm not a lawyer, so none of my views or interpretations of this matter would necessarily stand up in a court of law. Also, I'm talking primarily about United States copyright law because the copyright licenses I'll discuss were written in the US; the laws of other countries and of the international community might be different.

Property law establishes three basic rights: the right to use, the right to enjoy (profit from), and the right to dispose of the property. When dealing with intellectual property, and more specifically with copyrighted work such as software, such rights are exclusive to the copyright owner. Most software licenses explicitly reference the right to:

  1. perform the work -- that is, to execute the software
  2. prepare derivative works -- that is, to debug, patch, enhance, or otherwise modify the software
  3. make copies of the work
  4. distribute copies of the work
  5. authorize others to exercise the above rights
  6. sublicense the software (charge for it)
  7. prevent others from exercising the above rights
  8. transfer or license the above rights

I want to analyze several well-known (and some lesser-known) open source licenses in terms of the three basic rights property law establishes and the more specific rights held by copyright owners. Politics and philosophy aside, I want to evaluate what the different licenses mean to someone who develops software for a living, like me.

The licenses I'll be considering are those mentioned at the Open Source Initiative's Web site: the BSD license; the MIT License; the Artistic License; the Mozilla Public License; the GNU General Public License and its lesser form, the LGPL; as well as lesser-known licenses like the Java Community Source License and the IBM XML4J (XML for Java) evaluation and commercial licenses.

In the Public Domain

We should first look at one of the most widely used terms in open source distribution that has not yet been certified by the Open Source Initiative: placing the work "in the public domain." A work placed in the public domain has the same legal status as one for which the legal copyright term (usually 50 years after the death of all the authors) has elapsed. This condition is particularly interesting because the rights to use and profit from the work (execute, copy, and distribute the software) can be exercised by anyone, yet the ultimate right of disposing of the work (the right to authorize or restrict others' rights to it) belongs to no one and, therefore, cannot be exercised.

Software placed in the public domain essentially follows the same copyright law as the works of long-dead literary and musical authors. You can perform (execute) the works of Shakespeare, copy them, and distribute them, but remember that derivative works, compilations, and specific performances can have their own lawful copyright owners. You can include public domain software in your own work, modify it, copy it, and talk about it in publications and at conferences. You can't do the same with someone else's work, however, just because it happens to use or contain public domain work. You're otherwise free to derive any profits from public domain software.

Public domain software is unique in that it is the only form of work whose disposal rights are not retained by anyone. Every form of software licensing reserves rights 5, 6, and 7 for some entity, including the right to license work under a different scheme whenever desired. From a business perspective, it's guaranteed that legal claims of copyright violation cannot be brought against users of software that is in the public domain.

The least restrictive licensing

Software licensing schemes always reserve the right to dispose of the work (grant, restrict, transfer, and license the rights). Saying so is redundant, because a license can exist only if someone holds those basic rights, which is not the case for work placed in the public domain. All open source licenses have that element in common, but differ in the other rights they grant or restrict.

The least restrictive form of software licensing is that which reserves only the right to disposition. The BSD and MIT licenses, for example, grant ample rights of use, creation of derivative works, and redistribution (as long as the right to dispose of the work is explicitly reserved to the original authors). Included is the right to make different licensing arrangements with whomever the licensee sees fit. The Artistic (Perl) license is somewhat less clear in its section 3, which addresses copying and modification of the original "package," but does not talk about distribution.

Tightening the reins

Most frequently, open source licenses grant ample rights for use of the software and creation of derivative works, but restrict distribution in some important way. The four most common restrictions on distribution are:

  • You may only distribute the software and the modifications within your company or institution (an aspect of the Artistic License).
  • You may only distribute the software and the modifications in object (compiled) form as long as some value-added is provided (the XML4J commercial license).
  • You may only distribute the software and the modifications for free (Artistic License).
  • You must provide the source to all your modifications, or otherwise guarantee that users of your software can get hold of the source free of charge (the GPL).

These restrictions mainly affect how you profit from the intellectual property. The only restrictions made by the controversial GPL are restrictions on distribution.

Normally, these kinds of licenses don't place restrictions on the work created with the software, which gives such licenses a broad range of uses. This applies to end-user applications -- such as word processors and spreadsheets -- and to most of the tools used in software development, even commercial ones like compilers, libraries, editors, and text manipulation and make tools.

Some other licenses place absolute restrictions on distribution in order to avoid so-called "forking." As a result, most of the enhancements end up in the primary source base. But other licenses restrict distribution just to maintain control over the software. I've known only a few licenses that do this.

The most restrictive licenses

Paradoxically, one of the most restrictive licenses -- the GPL -- is very liberal about redistribution rights. Under the GPL, you can distribute the original work or its derivatives to whomever you like, as long as you make readily available the source to the original work, the modifications you obtained, and any modifications you made. Any licenses you give must also follow the GPL restrictions.

Many source code licenses restrict the use that can be made of the software. Such is the case with the Java Community Source License, which appears to restrict free use to research. Other uses require that compatibility tests be made and royalties be paid. Other licenses, like the XML4J Evaluation License, restrict the use to that which is "lawful and non-commercial." More restrictive licenses grant these rights only for a limited period of time.

Some would argue that source licenses that restrict the use or the creation of derivative works are not really open source. This, however, is the most common form of licensing used in academia, which is driving the so-called open source movement and is a main provider of useful source code.

Surprisingly, licenses that completely prohibit executing the software can still be useful. The trend is a new one, but I've seen it applied to the source code in books about algorithms and coding techniques. These books provide source code, but they restrict its use to study purposes only. You cannot copy the software, be it directly, by optical recognition, or even by typing it in, which eliminates any possibility of passing it through a compiler, much less executing it.

If the license fits

If all you want to do with a given open source package is use it as a tool to produce work separate from the package, then most open source licenses will do. Tools like compilers and text editors fall into this category. Libraries and interpreters might not, so pay attention.

For other purposes, choose public domain software. Or, if you want to distribute your software commercially but do not want to distribute the source to the modifications you made, go with a license like the MIT or the BSD. The GPL and LGPL are good choices if you don't mind distributing your source code.

Licenses like the Java Community Source License are so complicated that I'd keep away from the code unless I could count on the services of a good lawyer.

In all other cases, read carefully and look for the following keywords and phrases: use, copy, distribute, modify, derivative works, sublicense, and charge. Those words and phrases describe the rights that can be claimed on copyrighted work, and they will be specifically mentioned in documents that grant or restrict them. When in doubt, (gulp!) consult a lawyer.
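As a reading aid for the keyword check above, the following sketch lists which of those terms appear in a license text. It is only a pointer to the clauses worth reading closely, not an analysis of what they grant. The class LicenseKeywordScan, the method keywordsFound, and the sample sentence are all made up for illustration; the keyword list is the one given in the paragraph above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Reading aid only: flags which rights-related terms a license text mentions.
class LicenseKeywordScan {

    // The keywords and phrases listed in the preceding paragraph.
    static final String[] KEYWORDS = {
        "use", "copy", "distribute", "modify", "derivative works", "sublicense", "charge"
    };

    // Returns the keywords that occur in the text as whole words, ignoring case.
    static List<String> keywordsFound(String licenseText) {
        String text = licenseText.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (String keyword : KEYWORDS) {
            // \b keeps "use" from matching inside words such as "because".
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(keyword) + "\\b");
            if (wholeWord.matcher(text).find()) {
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample sentence, not taken from any real license.
        String sample = "You may use, copy, and modify this package, "
                      + "but you may not charge a fee for it.";
        System.out.println(keywordsFound(sample));
    }
}
```

Whole-word matching is deliberately strict, so inflected forms such as "copies" or "distribution" are missed. A keyword that is absent from the output therefore still needs a human check of the text.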

If you're planning on distributing your own work as open source, then I suggest you adopt one of the licenses mentioned at the Open Source Initiative's Web site. I hope that my analysis will make the choice easier. These licenses have already been reviewed by lawyers and, just as important, they are free to use. Licenses can also be copyrighted, you know, and not all licenses are open source.

Originally written for In Publishing LLC
Copyright © 1999 Inprise Corp.