Category: Tools

Wikimedia’s Open Source Toolset

There is a rarely explored relationship between the world of open source tools and free knowledge collaboration. This relationship developed naturally and quietly. Some open source tools have become quite essential to projects like Wikipedia, and I’ve started a page on our Meta-Wiki called Open Source Toolset to document the use of open source/free software tools in the Wikimedia Foundation projects. Others have quickly made useful additions (if you see any glaring omissions, please do not hesitate to edit).

Inkscape is an example of a mainstream open source tool that has become essential, even though it has not reached 1.0 yet. It has been used to create thousands of vector drawings in Wikimedia projects. But there are also much more specialized tools, such as Hugin (used for stitching panorama pictures) or PP3 (used for celestial charts). The availability of these tools is incredibly empowering. Anyone with the necessary skills and interest can use them to immediately contribute their knowledge; there is no charge, and the quality of the software almost always increases over time.

The importance of this open source ecosystem of tools can hardly be overestimated. Every new tool, every new feature, directly feeds back into the quality of content that is being generated. Therefore, I strongly believe that we must find ways to support them. Google has an annual Summer of Code, through which it spends a lot of money on student projects. This is very worthwhile indeed. We do not have a lot of money, but we do have global website exposure. Perhaps the Wikimedia Foundation should support its own “Autumn of Collaboration”, providing learning resources and guiding volunteers to work on the projects that make the greatest difference in the collection and development of human knowledge.

A bunch of buttons

The Definition of Free Cultural Works has been officially adopted by the Wikimedia Foundation. This gives us some major visibility, and we want to make use of that visibility by encouraging people to adopt a new set of free culture buttons, designed by amymade.com. Here are some examples:

I asked Amy to make three different colors for different types of works (such as music, scientific papers, or weblogs). I’m very happy with how they turned out — if you have a CC button for a free license on any of your works, you may want to consider replacing it with one of these buttons. I hope they will be adopted within Wikimedia as well.

Phyllobius argentatus

A lovely new featured picture candidate on Commons:

It’s used on Wikispecies. You can voice your opinion here.

Eventually, Wikimedia Commons will become the real ARKive.

Wikipedia Gets Advertising

Wikipedia has finally seen the light and started putting on some animated banner ads. Moreover, they’re not limited to the article space, but placed where editors are most likely to see them: on the pages showing information about users, and the internal communication system (“user talk pages”). It seems that the Wikimedia cabal even recruited the users themselves to add the ads to the pages (possibly through some kind of revenue sharing arrangement). Here are some examples:

It’s good to see that Wikipedia has given up its communist pinko resistance against advertising. Now that they’re raking in the money, I hope they’ll finally stop asking for donations like some maggot infested hippie. Jason Calacanis was right again!

(More ads here)

DBPedia: Querying Wikipedia

Ever since templates were added to Wikipedia, many people have thought about ways to make all the exciting data from the various infobox templates added to articles searchable in some way. DBPedia has done it: they have converted all the template data into RDF and made it browsable and searchable. Using, for example, the Leipzig Query Builder, you can ask questions like “which film composers were born before 1965” or “which soccer player with tricot number >11 from club with stadium with >40000 seats has scored more than 100 goals”.
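To make the idea concrete, here is a minimal sketch of what such a question looks like once infobox data has been reduced to subject/predicate/object statements. It is not DBPedia's actual vocabulary or query engine: the predicate names ("occupation", "birth_year"), the tiny data set and the helper functions are all assumptions made up for illustration, and a real RDF store would be queried with a language like SPARQL rather than Python set operations.

```python
# Hypothetical miniature data set of (subject, predicate, object) triples,
# the same shape of statement that RDF uses. Predicate names and the
# "Example_*" entries are invented for illustration only.
TRIPLES = [
    ("John_Williams", "occupation", "film composer"),
    ("John_Williams", "birth_year", 1932),
    ("Hans_Zimmer", "occupation", "film composer"),
    ("Hans_Zimmer", "birth_year", 1957),
    ("Example_Composer", "occupation", "film composer"),
    ("Example_Composer", "birth_year", 1970),
    ("Example_Footballer", "occupation", "soccer player"),
    ("Example_Footballer", "birth_year", 1950),
]


def subjects_with(triples, predicate, test):
    """Return the subjects that have a triple with this predicate whose object passes test."""
    return {s for s, p, o in triples if p == predicate and test(o)}


def film_composers_born_before(triples, year):
    """Answer "which film composers were born before <year>" by intersecting two patterns."""
    composers = subjects_with(triples, "occupation", lambda o: o == "film composer")
    born_early = subjects_with(triples, "birth_year", lambda o: o < year)
    return sorted(composers & born_early)


if __name__ == "__main__":
    print(film_composers_born_before(TRIPLES, 1965))
```

Each condition in the question becomes one pattern over the triples, and the answer is whatever satisfies all of them at once. That is essentially what a graphical query builder assembles for you behind the scenes.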

Of course, it is desirable to store this data in a structured form in the first place, and to make it searchable in real-time. This is what Semantic MediaWiki and OmegaWiki are aiming for, using different strategies (one provides you with a syntax to annotate wiki pages and stores all annotations in a machine-readable fashion; the other focuses only on the structured data itself and treats it separately from a wiki page). I believe that, after we’ve made some progress on quality annotation (priority number 1), adding content structure to Wikipedia should very much be the next priority for the Wikimedia Foundation.

PediaX: Wikipedia 2.0 (BETA; NOT FOR CHILDREN UNDER 3)

Is it a toothpaste for children? Is it a midget superhero? No, PediaX is a mirror of Wikipedia with some added features. The main added value seems to be:

  • Google Suggest style autocompletion on searches
  • Floating navigation sidebar (left) and table of contents / information sidebar (right)
  • Information sidebar shows collected usage data (most popular pages as well as incoming / outgoing user tracking)
  • Auto-magnification when you mouse over images (load a typical Wikipedia article to test it)
  • Full Google Maps integration for geocoding information
  • Pre-loading of article intros in a pop-up when you hover over links

Overall, my impression is mixed. The entire site doesn’t work properly in Konqueror, so cross-browser compatibility appears to not have been a consideration. Autocompletion is definitely useful; we already support it through the OpenSearch API, e.g., you can add Wikipedia as a search engine to Firefox 2.0 and it will autocomplete the search as you type. There’s also an AJAX autocompletion feature in MediaWiki Subversion.
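For the curious, this is roughly what a client does with the OpenSearch suggestion interface. The sketch below assumes the present-day MediaWiki endpoint at en.wikipedia.org/w/api.php with action=opensearch, and the four-element JSON array defined by the OpenSearch suggestions format; the function names and the sample response are mine, hand-written for illustration rather than captured from a live request, and the code deliberately makes no network call.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the English Wikipedia's MediaWiki API entry point.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_suggest_url(prefix, limit=5):
    """Build the URL a client would request for completions of the given prefix."""
    params = {
        "action": "opensearch",
        "search": prefix,
        "limit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def parse_suggestions(body):
    """Extract the list of suggested titles from an OpenSearch suggestions response.

    The response is a JSON array: [query, [titles], [descriptions], [urls]].
    """
    data = json.loads(body)
    return data[1]


# Hand-written sample in the suggestions format (not a recorded API response).
SAMPLE_RESPONSE = '["inksc", ["Inkscape"], [""], ["https://en.wikipedia.org/wiki/Inkscape"]]'

if __name__ == "__main__":
    print(build_suggest_url("inksc"))
    print(parse_suggestions(SAMPLE_RESPONSE))
```

A browser search box does the same thing on every keystroke: request the URL for the current prefix, then show the titles from the second element of the array.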

MediaWiki used to have a floating sidebar feature before we switched to the current MonoBook skin. It’s certainly helpful to have the navigation links permanently accessible. For some reason or other, PediaX seems to lose focus on the main content area quite often, which means I have to click in the middle column to be able to scroll. It also doesn’t handle pages with lots of interlanguage links in the sidebar well; links below the visible screen area appear to be simply inaccessible.

I think more than one floating sidebar is overkill, but being able to “pin” the left-hand bar would be useful together with an expandable or scrollable interlanguage link list. There appears to be a user script hack which does exactly that, but it kills the logo.

I don’t like the auto-magnification; I consider it bad UI design to shove large things in people’s faces when they move their mouse around. Zocky’s Picture Popups user script seems like a better solution to me; it simply shows nice, draggable previews when you click an image. I’ve been using it on the English Wikipedia for a while; the main problem is that it loads licensing information into the picture area from time to time.

Google Maps integration is interesting, though Wikimedia takes a non-discriminatory approach to external services. We’ve been using Magnus Manske’s “GeoHack” script for a while now to show links to various online mapping resources about a particular set of coordinates (example). If something were to be embedded into Wikipedia itself, it should be freely licensed, not proprietary information controlled by Google. Support for OpenStreetMap would be interesting, for example.

The pop-up preloading is a matter of taste. Again, there is a user script which implements this: Lupin’s navigation pop-ups, which also include a bunch of useful editing tools and image previews and can be customized in numerous ways.

It’s nice to see some systematic usability and feature work happening in one place. Those who think that PediaX shows that Wikipedia uses “outdated” technology and is not hip enough to adopt all the cool Web 2.0 toys ought to consider, though, that implementing changes recklessly would easily break the site for a large portion of our users. Our existing innovation model includes the ability for users to write and activate their own user JavaScript and stylesheets, and this has led to many exciting tools. The best ones, if they degrade gracefully on browsers with insufficient capabilities, should be considered for inclusion in the default view. For the others, we mainly ought to make it easier to select and activate them.

Planet Wikimedia officially launched

Planet Wikimedia is now alive. :-) This is an on-topic aggregator for posts about wikis and Wikimedia projects, by Wikimedia project participants and MediaWiki developers. See the instructions for getting added to the global feed. This is an opt-in process, partially to ensure that everyone submits a filtered feed, but also to avoid copyright concerns about aggregation.

Any developer with MediaWiki Subversion access can edit the planet configuration file, which will be updated regularly. This is not quite as open as putting the configuration in a wiki, but hopefully is a sufficient clue barrier to avoid accidental breakage.

Please spread the word, and submit your blog.

30 Days of Freedom

Slashdot just publicized 30 Days With Ubuntu, a review of the Ubuntu Linux distribution and its strengths and shortcomings. I found the review to be honest and accurate. It is purely technical and does not argue the ethics of free/libre software. I want to use it as an opportunity to reflect on where this movement is today, and why it matters. It’s not an in-depth analysis, but might be interesting to some readers (and writing such things always helps me to hone my beliefs, my arguments, and my rhetoric). In this article, I will refer to free/libre software as “Free Code”, which is a bit of an experiment and which I will explain at the end.

I’ve been a Kubuntu user for a year now (and a Debian user before that for 3 1/2 years), so I have plenty of opinions of my own, but I want to focus on the main criticisms identified by the reviewer:

  1. support for recent hardware is incomplete – some things don’t work or require a lot of tweaking
  2. no commercial games, except through (crappy) emulation
  3. no good video editing code, no good Photoshop replacement
  4. 64-bit version sucks in many ways

These are remarkably few criticisms, which gives an indication of how far Linux and Ubuntu have come. I’ll try to address each one of them briefly.

Hardware support

This is going to be a tricky issue unless and until we get mainstream adoption. That’s not entirely a chicken/egg problem, because it does not concern machines which are specifically selected or built for the purpose of running Linux (which is of course how the vast majority of users get Windows on their computers in the first place). I think three main things have to happen to make progress:

  • We need to build a really, really sexy Linux-based PC that everyone wants to have, get some large vendor to support it, and market it widely. See [[FriendlyPC]] for some ideas on this. The OLPC effort will be a good case study on what to do and what to avoid. (I’m particularly interested in whether Squeak will prove to be a viable development platform or not.) Not likely in 2 years. Maybe use the Wikipedia brand? If WP regains significant credibility, a “Wikipedia PC” could become quite the item.
  • Users of particular hardware need to be able to find each other, and pool resources (money, activism & coding ability), to make hardware support happen. If a major distributor like Ubuntu integrated such a matching tool, it could make a significant difference in the speed with which new hardware is supported. This goes for general code needs as well, of course, but as the review shows, these are fairly well met already.
  • There needs to be a single certification program that is supported by all major Linux vendors & companies. I don’t want to look for “Red Hat enabled” or “Ubuntu supported”; I want to know whether recent versions of any of these distributions will work with the hardware or not. And once the certification program has gained acceptance, you can do a bait and switch and withdraw certification from non-free vendors. :-)

Of course, an increasing market dominance of a single Linux-based platform will also make things easier. My general philosophy on competition is that it should only exist where it makes eminent sense, i.e., it should grow on the basis of irreconcilable disagreements about philosophy, direction, management, or architecture, not on the simple desire of some people to make more money than others. This ties into my belief that we need to gradually change from a profit-driven society to a cooperative one. Hence, I support rallying around the most progressive Linux distribution, which appears to be Ubuntu/Debian at this point. (That said, I would prefer the for-profit which supports Ubuntu, Canonical, to be fully owned by the non-profit, which is the model chosen by the Mozilla Foundation.)

Some users believe strongly in making existing non-free drivers as easy to obtain and use as possible. I’m not completely opposed to the idea, but we need a more creative solution than this. My suggestion is what I call “/usr/bin/indulgence”, something like Automatix, but more deeply embedded into the OS, which would make installing any non-free code (from the lowest to the highest level) a trivial procedure, but which would also ask the user very kindly to support an equivalent Free Code implementation in any of the three ways mentioned earlier: money, activism, or code.

I object to handling such matters carelessly and to the vociferous “You people are idiots for not making this as easy as possible” arguments that are sometimes made; these are short-sighted and uncreative (as is the ideological opposition to even discussing the issue).

Commercial games and emulation

The reviewer points out that Linux is not a state of the art gaming platform. I’m quite happy with that. Free Code games give me an occasional distraction, but are not of sufficient depth to be seriously addictive. (Some Battle for Wesnoth or NetHack players might object to the previous statement.) I think this is something parents should take into account, and we should communicate it as a plus when talking to offices, governments, and home users.

If you know me, you may be surprised to learn that I’m in favor of some governmental regulation on games; not on grounds of violence or sex, but on grounds of addictiveness. That is a topic for another entry, but I personally see mainstream games on Linux as a non-issue. Your Mileage May Vary. If you want more distractions, there are thousands of games from the 1980s and 1990s that will work perfectly in emulators which pretend your machine is a Super Nintendo, a Commodore Amiga, or a DOS PC. Search for “Abandonware” and “ROMs” on BitTorrent et al.

I’m more interested in how the Free Code ethos, combined with continued innovation in funding and collaboration, could result in Free Games that are technically modern, but built with more than just the interest in getting as many players as possible to pay a monthly subscription fee. Games that teach things. Games that make people do things. Games that demand and encourage creativity. Games where the builders care about the lives of the players, and not just their money. I’ll be happy to promote and help with that. Getting the latest Blizzard game to run nicely? Not so much.

Missing code

You know that Linux is ready for governments and businesses when a 30 day review points out DVD and photo editing as the main weaknesses, and not because there are no Free Code replacements, but because they aren’t quite good enough yet. The reviewer only tried two applications, GIMP and Kino. I share his feelings towards the GIMP photo editor, which I regard as an “old school” Free Code project where the developers would rather tell the users why their program is, in fact, highly usable than conduct serious usability tests and make improvements. To be fair, the existing GIMP user base, which is used to the current implementation, may also resist significant changes.

That is not to say that the quite remarkable GIMP functionality could not be wrapped into a nicer user interface. GIMPShop is one such attempt, which I have not tried. I hope that it will become a well-maintained fork; I don’t have much hope for GIMP itself to improve in the UI department. I am personally partial to Krita which, while still young, seems to have generally made the right implementation decisions, and is truly user-focused (as is all of KDE — I love those guys). I am not a professional photo editor, so I don’t know how mature Krita is for serious work. It is good enough for everything I do.

As for video editing, it would have been nice to read about some alternatives to Kino, such as the popular Cinelerra. As an outsider, I expect it will take a couple of years for one of these solutions to become truly competitive. Fortunately, many end users have very basic needs (cutting, titling, some effects), which the existing solutions cover, as the reviewer acknowledges.

There are, of course, countless areas which the reviewer does not touch upon. Personally, I think two important missing components for home users are high quality speech recognition (which many users will expect now that Vista is introducing it to the mainstream), and Free OCR. Here I’m not even talking about quality; the existing packages are, frankly, useless (I have scanned, OCR’d and proofread books, so I know a little bit about the topic). While setting my father up with Kubuntu, I played with the Tesseract engine, which was recently freed. It produced the best results of all the ones I’ve tested, but still did poorly compared to proprietary solutions (perhaps due to a lack of preprocessing). Frankly, without some serious financial support I doubt we’ll make much headway anytime soon. If the Wikimedia Foundation had more resources to spare, I’d advocate funding this, as it would greatly benefit Wikisource.

64 bit support

The reviewer points out a few glitches in the apps and drivers of the 64 bit version of Ubuntu. I cannot see a single argument in the article why this even matters to end users, except that “Vista has it” and that Ubuntu will have to provide good 64 bit support “to be taken seriously.” For the most part, this seems to be an issue of buzzword compliance, and I’m confident that the glitches will be worked out long before it starts to be relevant (i.e. when regular users start to operate on huge amounts of data in the course of their daily work).

Instead of chasing after an implementation detail of doubtful significance, I find the opposite direction of innovation much more fascinating: getting Linux to run well on low end machines and embedded systems. If you want, you can run Linux on a 486 PC from 15 years ago using DeLi Linux, though you’ll have to do without some modern applications simply because they demand more than such hardware can deliver. Others, however, are up-to-date, and there are even well maintained text-mode web browsers for Linux, not to mention highly capable text-based authoring environments like emacs and vim.


Where next?

Given the state of Linux today, there is really no excuse for any government not to switch to Free Code. Certainly, there are still unmet needs, but if governments collaborate, these will be addressed quickly. Which, incidentally, is also a lovely way to bring people from different cultures together. Imagine Free Code hackers from Venezuela, Norway, and Pakistan working together on the same codebase. Such scenarios are, of course, already playing out, but would be much more common if any taxpayer-funded code needed to be freely licensed.

I believe this is inevitable, and that we’re going to continue to see significant progress within governments and academia. One could compare the Free Code progress to any civil rights struggle. Some will consider this an exaggeration, but one should not underestimate how much is connected to it, especially 20 to 30 years down the line: the involvement of developing nations in the information society, the control over the media platform through which everything will be created and played, the potential for reform to capitalism itself — and more. While Free Code may not save lives in ways which are as obvious as, say, an HIV drug or a law against anti-gay violence, it is the enabling support layer of a Free Culture which is very much connected to such issues: to education and awareness, to media decentralization, to intellectual property law, and so on and so forth.

Accordingly, we can expect some governments to take reactionary positions for decades (regardless of the quality of free solutions and undeniable cost savings), while others are already well in the process of migrating. This may sound obvious, but is a bit counterintuitive to many politically naive hackers: No matter how good a job we do, we will continue to meet resistance and opposition on all levels.

As for businesses migrating their desktops, I believe it’s going to be a slower process. There are many contributing factors here: the ignorance of decision-makers, the feeling that anything “free” is suspicious, the sustained marketing effort of proprietary vendors, the reluctance to commit to approaches which require cooperation with competitors, the short term thinking that often dominates company policies, and last but not least, the common lack of any ethical perspective. Of course, there are also more practical grounds for opposition, but in general, I think the corporate sector (in spite of the adoption of Free Code on the server) is going to be the hardest to convert. I’m supportive of helping especially small businesses make the switch, in spite of my reservations about the profit motive. (Realistically, we’re going to have to live with capitalism for some time …)

Let me explain the reason I used the idiosyncratic term “Free Code” in this article. “Free Software” is disappointingly ambiguous, and “Open Source” is morally sterile. The word “code” points to what matters most: the instructions, the recipes underlying an application, which can be remixed, shared and rearranged. And “Free Code”, like “Free Culture”, calls to mind freedom over any vague notion of “openness” of “sources”. Last but not least, it carries the ominous double meaning that Lessig pointed out: code is law. Code influences and shapes our society increasingly, the more networked it becomes. Who would you rather have in charge of writing law: a monopoly-centered oligarchy, or anyone who has the will and the ability to do so? If you realize what code truly is, the conclusions become inescapable.

“Free Code” is perhaps unlikely to catch on, but sometimes I just want to try out a phrase to see how it feels. I’ve also always found that “software” is an undeniably silly term. It’s easy to ridicule people who would care about such a thing. It’s “soft.” It’s a commodity. To call a coder a software developer industrializes and trivializes their role. Coding should be a creative process which is deeply rooted in social and political awareness. People should be proud to say: “I am not a ‘software developer’. I am a Free Coder.”

LibriVox Dramatic Recordings

I’ve been fond of the LibriVox project for some time, where volunteers contribute spoken recordings of public domain texts (see the Wikinews interview I did last year). It’s a wonderful example of what becomes possible when a work is no longer trapped by copyright law. But I only today discovered the Dramatic Works section of their catalog. Here, multiple readers distribute the speaking of lines from dramatic works like Shakespeare’s “King Lear”, and the result is edited into a single recording. The entire process is coordinated through the LibriVox forums. I love it.

Granted, the results are of varying quality, and only a handful of works have been completed so far. But the technology that enables such collaborations to happen is also still in its infancy. The very existence of high quality open source audio editing software like Audacity has already driven a great many projects (including our own Spoken Wikipedia); imagine what kind of creativity an open source audio collaboration suite could unleash.

Improvements, of course, often come in very small steps. A nice example is the Shtooka software, an open source program specifically designed for the purpose of creating pronunciation files. It is not rocket science, but according to Gerard, who has recorded hundreds of such files, it makes the process so much simpler. I wouldn’t be surprised if the folks at LibriVox come up with their own “Shtooka” solution to distributing the workload of complex dramatic recordings.

Add me as a friend on change.org

If you read this blog, I’d like to encourage you to add me as a friend on change.org. It’s really the only social networking site I can be bothered to spend time with. Why manage friends on LinkedIn, Friendster, Orkut, Amazon.com, or indeed any social networking site while the world is still in desperate need of social and political change? change.org allows you to meet people, but you meet them around specific causes you support, rather than because of their professional background or taste in music.

The site still needs to expand to become fully international in scope, but it’s already doing a great job tracking some of the most important global issues just weeks after its launch. You can find me, of course, through my pet cause number one: free educational content.