Looks like Meraki is getting some open source competition. An open source mesh networking platform could be exciting, especially when combined with a mini-webserver built into the platform to host content even without Internet access.
Yesterday I set up a first unofficial demo of stable versions in the configuration which I would like to see used on the English Wikipedia. It’s based on the FlaggedRevs extension by Aaron Schulz & Jörg Baach. Unofficial, because Brion is currently reviewing the code for security and scalability. I’m still slightly worried that we might hit a snag, as the extension goes pretty deeply into the way MediaWiki serves pages, and of course Wikipedia can’t afford to run something that causes major slowdowns.
Still, if you want to get a feeling for the kind of configuration that I, Jimmy Wales and Florence Devouard have expressed support for, play with it. The main new thing you’ll notice is a top right corner icon which indicates the status of the page you’re looking at. On selected pages, the last revision shown to unregistered users is the most recently vandalism-patrolled one, while the rest of the wiki behaves as before. Essentially, in this configuration, it accomplishes four things:
- Allows us to open up semi-protected pages to editing, since edits have to be reviewed and vandalism does not affect what the general public sees;
- Allows us, as a consequence, to also use this kind of “quality protection” more widely, e.g. when an article has reached a very high quality and is unlikely to be substantially improved by massively collaborative editing;
- Improves our vandalism patrolling abilities, since vandalism doesn’t have to be repeatedly checked: we record who has looked at which changes. Also, changes by trusted users don’t have to be reviewed. The tagging system ties into “recent changes patrolling”, an old feature that has never scaled well.
- Allows us to make processes like “Featured article candidates” and “Good article candidates” revision-based rather than page-based. Thus we can more easily track when an article reached a certain quality stage, and can better examine whether changes past that point have increased or decreased its quality.
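The display rule behind this configuration can be sketched in a few lines of Python. This is only an illustration, not FlaggedRevs code: the `Revision` class, its `reviewed` flag, and the `revision_to_show` function are hypothetical names. The sketch assumes that registered users always see the latest revision, and that a selected page with no reviewed revision falls back to the latest one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    rev_id: int
    reviewed: bool  # True once a patroller has checked this revision for vandalism


def revision_to_show(history: List[Revision], page_is_selected: bool,
                     user_is_registered: bool) -> Revision:
    """Pick the revision to display; `history` is ordered oldest to newest."""
    if not history:
        raise ValueError("page has no revisions")
    latest = history[-1]
    # Assumption: unselected pages and registered users behave as before.
    if not page_is_selected or user_is_registered:
        return latest
    # Unregistered readers of a selected page get the newest reviewed revision.
    for rev in reversed(history):
        if rev.reviewed:
            return rev
    # Assumption: with no reviewed revision yet, fall back to the latest one.
    return latest
```

With a history whose newest edit is still unreviewed, an unregistered reader of a selected page gets the previous reviewed revision, while everyone else sees the newest edit.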
Some wikis would like to use more restrictive configurations than the one shown here. For example, a group in the German Wikipedia community is advocating that all edits by unregistered users be reviewed before they are applied, rather than only doing so on selected pages. I’ve argued against this at some length here. I think the current configuration strikes a good balance of openness, quality, and transparency to the reader. Of course there’s already a big wishlist, some of which should be addressed before taking this feature live.
Volapük is one of those lovely constructed languages which has made it into the 100K list on our multilingual Wikipedia portal. Of course, this is due to some bot imports of census and map information. It’s all fair – the English Wikipedia also bot-generated tens of thousands of “articles” in its early history. It does raise the question whether a freetext wiki is really a good way to maintain such data.
Over at OmegaWiki, we’re experimenting with adding structured data to the multilingual concept entities we call “DefinedMeanings”. A DefinedMeaning can be accessed using any synonym or translation associated with it. An example is the DefinedMeaning Denmark. If you expand the “Annotation” section, you’ll see information such as the countries Denmark borders on, its capital, or currency.
But there’s more to it than meets the eye: the fact that a country can border on another one is expressed by giving a DefinedMeaning a class membership. The class “state” defines what relationships an entity of this type can possess. In this way we define an ontology.
Note that if you log in and change your UI language, all the names will be automatically translated, if translations are available, as each relation type is a DefinedMeaning in its own right. OmegaWiki also supports free-text annotation and hyperlinks. We’ll add more attribute types as we need them.
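To make the idea concrete, here is a rough Python sketch of such a data model. It is not OmegaWiki's actual schema: the class layout, the method names (`label`, `relate`, `annotations`), the fallback to English, and the sample labels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(eq=False)  # compare by identity, since meanings reference each other
class DefinedMeaning:
    # One expression per language code (synonyms omitted for brevity).
    expressions: Dict[str, str]
    # Classes this meaning is a member of, e.g. "state".
    classes: List["DefinedMeaning"] = field(default_factory=list)
    # When this meaning is used as a class: relation types its members may have.
    allowed_relations: List["DefinedMeaning"] = field(default_factory=list)
    # Relations actually recorded: (relation type, target meaning).
    relations: List[Tuple["DefinedMeaning", "DefinedMeaning"]] = field(default_factory=list)

    def label(self, lang: str, fallback: str = "en") -> str:
        """Return the expression in `lang`, or in the fallback language if missing."""
        if lang in self.expressions:
            return self.expressions[lang]
        return self.expressions[fallback]

    def relate(self, relation: "DefinedMeaning", target: "DefinedMeaning") -> None:
        """Record a relation, but only if one of this meaning's classes permits it."""
        if not any(relation in klass.allowed_relations for klass in self.classes):
            raise ValueError(f"{self.label('en')} cannot have {relation.label('en')}")
        self.relations.append((relation, target))

    def annotations(self, lang: str) -> List[Tuple[str, str]]:
        """Render all relations with labels translated into `lang`."""
        return [(rel.label(lang), tgt.label(lang)) for rel, tgt in self.relations]


# Sample data for illustration only.
borders_on = DefinedMeaning({"en": "borders on", "de": "grenzt an"})
state = DefinedMeaning({"en": "state", "de": "Staat"}, allowed_relations=[borders_on])
germany = DefinedMeaning({"en": "Germany", "de": "Deutschland"}, classes=[state])
denmark = DefinedMeaning({"en": "Denmark", "de": "Dänemark"}, classes=[state])
denmark.relate(borders_on, germany)
```

Calling `denmark.annotations("de")` renders the same relation with German labels, because the relation type is itself a DefinedMeaning with its own translations. A relation type that the “state” class does not allow is rejected by `relate`.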
What does that have to do with the Volapük Wikipedia? Well, OmegaWiki is the kind of technology you could build upon to maintain a central repository of the almanac data that was imported. Then, with some additional development, you could generate infoboxes with fully translated labels directly from OmegaWiki. And, within the free-text Wikipedia, you could focus on writing actual encyclopedia articles that reference this data.
Note that OmegaWiki is not alone in the semantic wiki space. Semantic MediaWiki adds data structure by enriching the wiki syntax itself. Freebase is a closed-source free-content “database of everything.” I’m not interested in closed source technology, but I like a lot of what I see in Semantic MediaWiki, and I expect that the technologies will converge in the end, if only conceptually.
Florence Devouard (Chair of the Wikimedia Foundation) and I disagree a bit about the value of open source software in the Wikimedia Foundation projects. Lately Florence has been taking a more “best tool for the job”, “don’t reinvent the wheel” approach, especially when it comes to tools we use internally, or as web services (a recent discussion was about survey tools). I don’t consider myself an ideological person; I discard beliefs as quickly as I adopt them if they aren’t useful. Maximizing open-source use internally and elsewhere simply strikes me as a best practice for a non-profit like the Wikimedia Foundation.
Let’s take the example of survey tools. For a user survey, you could use a number of web services, or you could build an open source extension to MediaWiki that is used for collecting information. If you use the former, you might get a deal with a company that lets you use their service for free, in return for the exposure that being “advertised” through its use on Wikipedia will give them. But consider that you might want to run a similar or different survey again the next year, to check whether certain trends (like gender participation) have been affected by your actions.
As a vast online community dealing with all imaginable topics, Wikipedia has a huge number of detractors, including some deeply malicious or even mentally disturbed trolls. This means your software likely has to be more secure, because malicious hackers are more likely to try to pollute your survey with nonsense. With a proprietary survey vendor, there’s no way to let the community inspect the code for very common security vulnerabilities like SQL injection attacks. Given that they’d be running on an external server, it would also be harder to generate reliable (anonymized) user identifiers that can’t be easily hacked using a Perl script, to protect your survey against systematic data pollution. It’s not inconceivable that such an attack would even come from within the Wikipedia community itself, as a reaction to the use of proprietary software (believing in open source doesn’t mean that you’re not a dick).
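One common way to get identifiers of that kind is a keyed hash (HMAC) over the wiki user ID, computed on a server that holds a secret key. The sketch below is an assumption about how this could be done, not something taken from an existing MediaWiki extension; `survey_token`, `token_is_valid`, and the key handling are hypothetical names.

```python
import hashlib
import hmac


def survey_token(user_id: int, secret_key: bytes) -> str:
    """Derive a stable pseudonymous token for a user from a server-side secret."""
    message = str(user_id).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def token_is_valid(user_id: int, token: str, secret_key: bytes) -> bool:
    """Check a submitted token using a constant-time comparison."""
    return hmac.compare_digest(survey_token(user_id, secret_key), token)
```

The same user always maps to the same token, so duplicate submissions can be detected, but without the key a script cannot produce valid tokens for other users or map a token back to an account.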
Open source software is open for security auditing. Software which is committed to our own Subversion repository can also be fairly openly modified by a large number of committers, thanks to a liberal policy of granting access to the repository. In effect, the code is almost like its own little wiki world, with reverts and edit wars, but also a constant collaborative drive towards more quality. People from all parts of the MediaWiki ecosystem contribute to it (I’ve often said that MediaWiki is almost like a Linux kernel of the free culture movement), and are likely to share improvements if they need them, if only out of the self-interest to see them maintained in the official codebase.
If you need to retool your survey for, say, doing a usability inquiry into video use, an existing open source toolset makes it fairly easy to build upon what you have. And if you want to do a survey/poll that isn’t anonymized, hooking into MediaWiki will again make your life easier.
You might say: “Gee, Erik, you’re making this sound a lot more complicated than it is. A survey is just a bunch of questions and answers – what do you need complex software for? Can’t you just drop in a different piece of proprietary software whenever needed?” If you believe that, I recommend having a conversation with Erik Zachte, the creator of WikiStats. Erik knows a thing or two about analyzing data. He explained to me that one of the things you want to ensure is that the results you collect follow a standardized format. For example, if a user is asked to select a country they are from, you’ll want a list of countries to choose from, rather than asking them to type a string.
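As a small illustration of the standardized-format point, answers can be stored as codes from a fixed list instead of free text. The list below is a tiny sample keyed by ISO 3166-1 alpha-2 codes, and `normalize_country` and `tally` are hypothetical helper names, not part of any existing survey tool.

```python
from collections import Counter
from typing import Iterable

# Small illustrative subset; a real survey would load the full country list.
COUNTRIES = {"DE": "Germany", "DK": "Denmark", "NL": "Netherlands", "US": "United States"}


def normalize_country(code: str) -> str:
    """Return the canonical code for a submitted answer, or reject it."""
    key = code.strip().upper()
    if key not in COUNTRIES:
        raise ValueError(f"unknown country code: {code!r}")
    return key


def tally(responses: Iterable[str]) -> Counter:
    """Count responses per country; every answer must come from the fixed list."""
    return Counter(normalize_country(r) for r in responses)
```

Because every stored answer is one of a known set of codes, results from different years or different language versions of the survey can be compared directly, and only the display labels need translating.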
Moreover, you want this data to be translated into as many languages as possible. This is already being done in MediaWiki for the user interface, through the innovative “MediaWiki:” namespace, where users can edit user interface messages through the wiki itself. This is how we’ve managed to build a truly multilingual site even in minority languages: by making the users part of the translation process.
So, if you work with your proprietary survey vendor, you have to convince them to manage a truckload of translations for you, and you have to make damn sure that all the translated data is well-structured and re-usable should you ever decide to switch the survey tool. Otherwise you’ll be spending weeks just porting the data from one toolset to another. You can try to have them work on the data with you, but you’ll be spending a lot of your time trying to push your proprietary vendor to behave in a semi-open manner, when you could have simply decided to follow best practices to begin with. Companies that aren’t committed to open standards to begin with will always be driven towards a greater need to “control and protect our IP” from their internal forces: investors, boards, lawyers, managers.
Sure, you might have a higher upfront investment if there’s no existing toolset you can build on. But I find it quite funny that the same companies who go on and on about protecting their “intellectual property” are often so very quick to give up theirs: Open source software effectively belongs to you (and everyone else), with everything that entails. And it’s an ecosystem that gets richer every day. Instead of literally or metaphorically “buying into” someone else’s ideas, open source maximizes progress through cooperation. I cannot think of a better fit for our wiki world.
The reason to default to open source best practices is not ideological. It’s deeply pragmatic, but with a view on the long term perspectives of your organization. So while I agree with Florence that we should keep open (no pun intended) the option of using proprietary software in some areas of Wikimedia (particularly internal use), I would posit that any cost-benefit analysis has to take the very large number of long term benefits of the open source approach into account.
[UPDATE] LimeSurvey looks like a decent open source survey tool that we could use if we don’t care that much about deep integration.
WikiEducator, a wiki-based education community with a strong focus on developing countries, is the first production wiki to use LiquidThreads, a MediaWiki extension for threaded discussions. The project has a long and interesting history: three years ago, I reached the conclusion that regular wiki discussion pages were inadequate as a communication tool for many reasons, and wrote very basic specifications and created a mock-up for an alternative discussion system called LiquidThreads.
The core idea was to replace talk pages with discussion threads, which could be flexibly attached and re-attached to multiple different points in the wiki (hence the “liquidity”). Archiving was to be done automatically, but only if a summary had been created for a thread.
As these things go, it was merely an idea that would probably have been destined to be abandoned — until David McCabe picked it up and used it as a basis for a Summer of Code application in 2006. He demonstrated a fairly slick prototype at Wikimania 2006 (Hacking Days), but the project was on hold until Wikia and the Commonwealth of Learning joined forces to pay David for further work on it. I have played a project management role during this time. By now, LQT is still beta software, but it’s largely feature-complete.
It has turned out to be less liquid than we originally intended. It does not have the “one thread on multiple talk pages” feature, for one thing. It does implement the summary-based archiving, as well as page moves and a fairly cool message watchlist that replaces the “You have new messages” notice. Essentially you can get notified about any reply to threads you’ve been posting on. Currently the notifications are shown on the watchlist, but David is also working on showing a general notification for user talk messages.
There’ll be more debugging and usability work as we try this out with a real community, and the other major remaining step is to turn it into a proper MediaWiki extension (because it touches so many pieces of the code, it wasn’t possible to implement without touching the core codebase — we’ll have to get those hooks merged back into MW proper and get rid of anything idiosyncratic). After that, my hope is that it’ll be increasingly widely used as an alternative to talk pages, hopefully also on Wikimedia projects. With a basic framework like this in place, it’s now more realistic to think about things like in-wiki chat and e-mail-to-wiki gateways as well. All it really needs is a budget. If you’re interested in funding such projects, let me know.
Project Peach is the latest “open source movie” that will be made by the good folks behind Blender, an open source 3D content creation suite. Unlike the first movie, Elephants Dream (which was codenamed “Project Orange”), Project Peach will have a cute story with cute furry animals. Like Elephants Dream, Peach will be released under Creative Commons Attribution License, allowing anyone to use any part of the work for any purpose.
People who pre-order the Peach DVD before October 1 can have their name listed in the credits. They’ve already taken almost a thousand pre-orders. What this means, in simple terms, is that the Blender Institute now has enough reputation to fund free culture projects to the tune of EUR 30-50K through online donations. Not huge, but still pretty exciting.
So Sabine is trying to bootstrap a campaign where people show in a photo how and why they love Wikipedia. Well, Gerard and I weren’t particularly creative, but at least we did make a small visual contribution:
The point? During the next fundraiser we can show the faces of all the people who participated and give it a more human feel. So even if you cannot or do not want to donate, please consider taking out your camera and photographing yourself or your friends in ways that show your support for Wikipedia and Wikimedia.
My father is running (K)ubuntu Linux. A few days ago I helped him update from the previous release (6.10) to the most recent one (7.04). After the upgrade, his scanner and sound card stopped working. In the case of the scanner, the culprit was bug #85488, which was triggered due to an experimental feature enabled in the kernel of the latest version. I managed to pull a workaround from one of the comments, but I spent several hours on these two problems. (ALSA sound configuration is almost as horrible as CUPS.)
Rolling out upgrades that kill hardware == bad. Not even rolling out any kind of fix or workaround == worse. “It’s going to be fixed in the next release” is terrible policy when you introduce new problems that the user didn’t have before! If my choice on Ubuntu is between an out of date “long-term support” release and an extremely unreliable up-to-date “current” release, I might as well go back to Debian.
While I was at it, I also gave the Google-sponsored Tesseract and Ocropus projects a whirl. Unfortunately, from what I can tell, it is still going to take years until we have commercial grade open source desktop OCR under Linux. (The Tesseract command line client produced passable results on some input files, and zero byte files on others; Ocropus produced crap even on its own test files.)
One of the greatest things about the Wikimedia community is its annual conference, Wikimania. With a new location chosen every year, it gives interested Wikimedians an opportunity to meet like-minded wiki nuts from all over the planet. I have attended all the conferences so far, including this year’s in Taipei (August 3-5).
The Foundation’s Board of Trustees used the opportunity of Wikimania 2007 for the first-ever meeting with many members of its Advisory Board. This is a belated summary of my impressions from both events.
Advisory Board meeting
Angela, who is the Chair of the Advisory Board, is still working on a detailed report, so I will keep this one short. The following Advisory Board members managed to attend: Ward Cunningham, Heather Ford, Melissa Hagemann, Teemu Leinonen, Rebecca MacKinnon, Wayne Mackintosh, Benjamin Mako Hill, Erin McKean, Achal Prabhala, and Raoul Weiler. We met in a small room at the Chien Tan Oversea Youth Activity Center, the main conference venue.
The two-day meeting was facilitated by Manon Ress; the Agenda is publicly available. I will say this much upfront: The single most important function of this meeting was for Board and Advisory Board members to get to know and trust each other, and to figure out how we can actively work together in the future. I believe the meeting reached this goal. For many Advisory Board members, visiting Taipei also must have been the equivalent of drinking from a firehose of knowledge about wikis. Of course there are exceptions. 😉
Naturally, there were also some very specific exercises which will hopefully have practical use. For instance, randomized groups tried to identify the key values of the Foundation. The group I was in started out by humorously defining all the things we don’t want to be — extremely hierarchical, exclusive, western-centric, etc. — and then compared those with positive value statements. (For some reason, “world domination” ended up in both lists.) I suggested the slogan “knowledge without borders” or “knowledge without boundaries” as a possible framework for many of the key values we found: access to knowledge (in a participatory sense) on a global scale, multilingual and multicultural diversity, content that can be freely shared and modified, etc.
I don’t know if this particular slogan will catch on, but I like the idea of trying to express key principles in a short catchphrase. The list of values itself could also be useful for messaging and policymaking. There wasn’t a lot of notetaking on the wiki, so I hope Angela has some notes from the different groups available to her. We also tried to identify some key goals, and the list my group worked on (with some predictable disagreements about the meaning of “goal” etc.) included primarily these items, read through my own personal bias:
- Increasing quality of content in our projects, but also challenging and changing misperceptions.
- Massively scaling volunteer participation in all areas of organizational and project-level work, to live up to our ambitious mission & vision.
- Building greater awareness of Wikimedia’s mission & purpose. (Gregory Maxwell recently proposed adding an essay I started, 10 things you did not know about Wikipedia, to the notice displayed to unregistered users of Wikipedia. I think that’s an interesting experiment in changing people’s perceptions — we should use our own website properties more often to actually communicate with our readers.)
- Building capacities among partners and users: the ability to participate, to create and deploy new tools, and so on.
I believe that there is a deep and complex challenge of what I call “meta-management” — the Foundation has such a diversity of projects (Wikibooks, Wikinews, Wikisource, etc.) and goals that any approach which does not scale massively will not serve our community well. So, yes, we should of course hire coordinators for grants and projects, and get better at business development, and improve our technical infrastructure, and so forth. But I think networking and empowering volunteers to do many of the things we hope to pay more people to do is a much more scalable approach.
This is a very difficult idea to promote in a group of highly intelligent people, as it’s much more exciting to focus on more specific problems, so I’m not sure if I got this particular point across well in our discussion.
Unsurprisingly, then, I also found the biggest practical value in an exercise where I moderated a group on the topic of Volunteerism (link goes to the notes), consisting of Angela, Mako, and Achal. I was especially intrigued to hear more about Ubuntu’s process for creating and coordinating volunteer teams. I am left with the conclusion that we need more semi-formal ways for Wikimedians to self-organize than the heavy process of starting a chapter. Check the notes for some other interesting ideas. We got some more suggestions later, such as an “Edit Wikipedia Day” and other online events that could be held every year to encourage different types of participation.
I missed some of the more creative and physical elements that we used during the Frankfurt Board+Chapter Retreat which happened last year. For example, our facilitator there had an interesting exercise where she asked all participants to come up with a gesture to identify themselves (I used Columbo’s famous “Just one more thing”). While a bit repetitive in the end, I thought it was a fun bonding game that also helps very practically to remember people, faces and names. This makes me think that it would be good to have someone in-house for facilitation work, to build upon knowledge from previous events. Also, is there a wiki to document these kinds of processes? 😉
In practice, we will continue to use our Advisory Board mailing list to consult with A.B. members as a group — these are typically strategic questions, or “fishing” of the type “Does anyone know someone who ..”. But I believe the more frequent interactions will be with individuals, around issues in their domain. And while I initially viewed the A.B. as only being truly connected to the Board, I am increasingly coming to the opinion that we should encourage them to interact with staff and community members as well (and possibly sometimes chapters, though I hope these will also set up their own advisory bodies).
What is the ideal size for an Advisory Board? During the meeting I believe the consensus was against significant further expansion before we’re happy with the utilization of the current Advisory Board (this is not the case for the actual Board, where there is a consensus leaning towards another expansion). I suspect there’s another reason to keep it roughly at the present size: it allows us to get to know the members socially, to form a collegial trust relationship, which can lead to very different types of useful interaction than you would have with someone you merely suspect of knowing something. It also keeps it manageable to prune members, to invite them into a single place, and so forth.
At the same time, if you think about expansion because “more is better”, then any size would be too small — you’ll want to manage knowledge and trust on a global scale. Wait a minute, knowledge and trust on a global scale? That sounds like a familiar problem! 😉 I suspect indeed that innovations of internal knowledge management will be driven by our project communities. And I don’t think that a massively decentralized approach of acquiring information from trustworthy sources and a fairly stable group of passionate advisors would be mutually exclusive.
The main conference was absolutely wonderful. We cannot thank the organizers enough for putting together an event that, I think, nobody who was there will soon forget. I will have to resort to the maligned bullet point list to even begin to enumerate all the things that were done well:
- a well-chosen venue (a youth hostel) with plenty of spaces to mingle
- a large number of sponsorships that never appeared obtrusive in any way
- a highly committed local team that went out of its way to assist with anything (starting with welcoming people at the airport)
- a compelling program with talks that were truly interesting to any Wikimedian
- many opportunities for ad hoc events (laptop content bundles, lightning talks, workshops, and so forth)
- side events – citizen journalism, hacking days, party, etc.
- excellent catering
- Taipei itself
- all of you who made it 😉
- and a million other things.
You owe it to yourself to come to Wikimania 2008, which is currently accepting city bids. Swim if you have to. And block the first week of August in your calendar.
I do think we should try to have more content that appeals to wiki newbies next time: editing workshops, project tours, exhibits, etc. Whether that’s the intent or not, many people who have barely seen an edit page will always be inclined to visit a conference like this — just because it’s about this crazy new wiki thing. That’s doubly true if the conference is in a location where the community isn’t yet as strong as in the U.S., parts of Asia, or Western Europe.
I did enjoy Taipei itself, especially a fun little tour with Shun-ling Chen, Mel from OLPC, and the Semantic MediaWiki developers. There is also some incriminating video evidence from another occasion that Kat will probably use against me sooner or later. I derived the greatest enjoyment from making new friends, having interesting conversations, and discovering new patterns (in reality in general and Wikipedia in particular). In that respect, I especially cherish the new things I learned from people like Luca de Alfaro (trust and reputation in Wikipedia), Michael Dale (Metavid), Shay David (Kaltura), and Brian Mingus (quality heuristics – let’s chat some more about this soon). I think all Board members had great conversations with Sue Gardner, our new “Special Advisor”, and Mike Godwin, our new Legal Counsel. And of course, it was great to connect again and catch up with many old friends in an unlikely location.
There was definitely a language barrier to connect more with the local folks. English isn’t that commonly spoken in Taiwan, and I found it difficult to converse much beyond smalltalk. Not much that can be done about that other than learning Chinese, which I’m afraid is unlikely to make my to-do list anytime soon. I tried to be accessible to anyone who did want to speak with me and gave an interview to a local magazine about OmegaWiki.
I will find it very interesting to look back on this Wikimania in context, and to hear more from others about it. I for one think it was a complete success. But I felt the same way about Boston and Frankfurt, so I hope there will also be some constructive criticism and maybe even some trolling. 😉 I’m also keen to see more wiki-events small and large. I won’t be able to make it to all or even most of them, but that’s OK. One way or another, it is wonderful to see the global community for free culture thrive. As a community, as friends; constructive in conflict, united in diversity.
Delphine and I are sharing a precious moment. “I ate what?!” I’m sure you can come up with a funnier caption for this one. I really have no memory of what actually happened.
Photo: Kat Walsh, CC-BY-SA.