Archive for May, 2008

links for 2008-05-29


First Black Squirrel in the garden


from Lebeus

There’s been a bit of a fuss about how quickly the invading Black Squirrel has swept through England but it was thrilling to see one in the garden. He/she was on the lawn, staring up at the bird feeders. I didn’t have time to get a photo, but I found this nice one by Lebeus taken near us in Cambridge.

links for 2008-05-23


How bureaucratic is Wikipedia and is it getting worse?


Tomáš Gabzdil Libertíny’s “Honeycomb Vase” taken by annamatic3000

Recently I posted an entry bemoaning the recent ‘criticisms’ I’d heard of Wikipedia. In particular I corrected Matthijs den Besten’s graph from his talk “Wikipedia: the organizational capabilities of a peer production effort” to show that, rather than increasing (as his graph sort of implies), the number of Wikipedia administrators per page has actually fallen by over 50%, from 0.000526 at the end of the third quarter of 2002 to 0.000191 at the end of the third quarter of 2006.
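As a quick sanity check on the “over 50%” figure, here is a minimal sketch that computes the relative fall from the two ratios quoted above. The variable and function names are mine, chosen for illustration; only the two numbers come from the post.

```python
# Administrators-per-page ratios quoted in the post.
admins_per_page_2002_q3 = 0.000526
admins_per_page_2006_q3 = 0.000191


def relative_fall(before: float, after: float) -> float:
    """Return the fractional decrease going from `before` to `after`."""
    return (before - after) / before


fall = relative_fall(admins_per_page_2002_q3, admins_per_page_2006_q3)
print(f"Fall in administrators per page: {fall:.1%}")
```

On these figures the fall works out at roughly 64%, so “over 50%” is, if anything, an understatement.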

Three things are causing me to doubt the implications I drew.

1) Matthijs left a thought-provoking rebuttal in the comments section of my post.

2) I’ve just finished Clay Shirky’s “Here Comes Everybody” in which he says “Wikipedia, which looks like a reference work to the average viewer, is in fact a bureaucracy mainly given over to arguing. The articles are the residue of the argument.” (p. 279)

3) Both Matthijs and Clay refer to two recent papers from Fernanda Viégas (and colleagues): “The Hidden Order of Wikipedia” (HCII 2007) and “Talk Before You Type: Coordination in Wikipedia” (HICSS-40, 2007)

Matthijs makes two points. Firstly he corrects me by pointing out that although the number of administrators per page may be falling, the number of administrators overall is increasing, and “size matters”. Quoting some possible failures, he suggests that the co-ordination required to keep all the administrators harmonized cannot be done informally once their numbers grow large. But these are still volunteers, and the structure is bottom-up. Fernanda borrows the principle of “Collective Choice Arrangements” from Elinor Ostrom’s analysis of “successful self-governed common-pool resources communities”. Collective-choice arrangements mean that most of the individuals affected by the rules of a community should be able to change those rules, and that the cost of altering the rules should be small.

There is one fragment of Matthijs’ comment that crystallises my objection: “provided we can equate administrators with managers”; or, as one of the slides in Matthijs’ talk was titled, “Wikipedia as a firm”. I’m not comfortable with that. Luckily Fernanda doesn’t want to go that far either. In her HCII 2007 paper “The Hidden Order of Wikipedia” she says of Wikipedia’s Featured Article (FA) process: “the FA endeavour starts to sound very much like a modern-day enterprise workflow process. It is not, however.”

Matthijs’ second point is interesting too. He points out that

>>> Further, it would seem likely that many of the articles in the long tail of the encyclopedia are dormant. That is, they have reached a satisfactory quality, are read relatively infrequently and are hardly changed at all. Sure, those articles won’t require much bureaucratic interventions. However, what matters more in perceptions of bureaucracy is the likelihood that someone who edits a page is rebuffed by someone else (e.g. ratio edits/reverts) or the likelihood that people encounter papers that are restricted (e.g. percentage of top 100 articles in terms of page views that are locked). <<<

This is interesting. One can imagine measures that would capture readers’ and occasional editors’ perceptions of Wikipedia bureaucracy. I’m uncomfortable with Matthijs’ conjecture that the dormant articles are the unread ones, but I suppose there is logic there. Only a small proportion of readers ever edit a page, so no readers implies no editors, which is what we’d mean by dormant. But an article could become dormant through acceptable quality being reached, and still attract readers. It would be interesting to see the graph from Matthijs’ presentation redrawn with one of these measures instead of the number of administrators overall: the ratio of edits to reverts, the likelihood that an edited page is locked soon after, etc. Matthijs’ other measure, “the likelihood that people encounter papers that are restricted (e.g. percentage of top 100 articles in terms of page views that are locked)”, doesn’t seem so telling. I’d expect contentious subjects to be more likely to lead to a locked article and also to be more popular among readers.
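To make one of those measures concrete, here is a rough sketch of how a per-quarter revert measure might be computed. Everything in it is an assumption for illustration: the `Edit` record, the quarter labels and the function name are hypothetical, and it reports the share of edits that are reverts (reverts divided by edits) rather than Matthijs’ edits/reverts ratio, simply because a share stays between 0 and 1 and is defined even when a quarter has no reverts.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Edit(NamedTuple):
    """One edit to an article (a hypothetical record for this sketch)."""
    quarter: str     # e.g. "2006Q3"; the label format is an assumption
    is_revert: bool  # True if the edit undid someone else's change


def revert_share_by_quarter(edits: Iterable[Edit]) -> Dict[str, float]:
    """Return, for each quarter, the fraction of edits that were reverts."""
    totals: Dict[str, int] = defaultdict(int)
    reverts: Dict[str, int] = defaultdict(int)
    for edit in edits:
        totals[edit.quarter] += 1
        if edit.is_revert:
            reverts[edit.quarter] += 1
    return {quarter: reverts[quarter] / totals[quarter] for quarter in totals}


# Made-up edits, purely to show the shape of the input.
toy_edits = [
    Edit("2002Q3", False),
    Edit("2002Q3", False),
    Edit("2006Q3", True),
    Edit("2006Q3", False),
]
print(revert_share_by_quarter(toy_edits))
```

Plotting that share over time would say more about how often an editor is rebuffed than a raw count of administrators does.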

Clay’s point I just don’t get. Saying that the articles in Wikipedia are the by-product of the arguments seems like saying that Fabergé exists to employ jewellers, rather than to make jewellery. He does have a neat turn of phrase though.

The organisation of Wikipedia is clearly more complex than I had appreciated. But does that mean it’s less like the anarchist utopia I naively imagined, and more like a large corporation? I think not.

I’ll be getting my Thursdays from a banana


“Camel” by bgiguere2

Like many programmers I have a long list of programming languages that I’ve been meaning to try, meaning to master, or just meaning to dust off and revisit. For example my boss wants me to check out F#. In fact lots of the people at work are using F# on an increasing variety of projects. For some folk this is clearly a great idea. Take Ralf for example. Ralf’s work often relies on applying clever Bayesian models to new problem domains. When Ralf codes the mathematical model and something doesn’t work he needs to deduce whether it’s something with the model or something in its translation into code that went wrong. So a language like F# that moves the implementation close to the maths without sacrificing the ability to call rich libraries is perfect for Ralf. But most of my code is user interface (or data handling at present) so I doubt I’d see the same advantage. Still, F# is high up on my ‘to try’ list.

Another entry is Processing. I’m increasingly doing lots of data visualization work and some of the most beautiful interactive visualizations I encounter on the web are written in Processing. But I am a tad confused (not least about whether it is called Processing or Proce55ing). As I understand it Processing is a cut-down Java that lets designers approach programming in a sketchbook style. But I can code full Java so why would I do that? Wouldn’t it be like putting stabilisers or training wheels on my racing bike? Possibly not; it looks like the iterative nature of the Processing environment is its power. So that’s on my list too.

But the catalyst for this post is Perl. I did go through a brief Perl phase back in the late nineties, but I never got beyond the struggling phase. I remember once working for several hours on a VRML file manipulation script in Perl before I finally got so stuck that I didn’t mind revealing my lack of knowledge and asking for help. At the time we had John Dent (aka Denty) interning in my group so I popped my head up over the divide between our desks and explained what I was trying to do. Denty started typing at the command prompt, and one line of Perl later he hit return. The computer thought for a few seconds and out popped the answer I needed. One line of Perl may give the impression it was a brief script, but IIRC it was a few hundred characters long. Wow. So Perl has been on my ‘to try and master’ list for a decade now.

So what bumped it up? I’ve just finished Clay Shirky’s book “Here Comes Everybody”. There’s lots to say about this excellent book, so I’ll weave it into more posts. Today it had me laughing on the train – and it is rare that a work book makes you laugh out loud in public. It was the bit where Clay is recalling his days as a Perl programmer. Here are the two passages that had me LOLing.

>>> Where, [the AT&T engineers] asked, did we get our commercial support for Perl? We told them we didn’t have any, which brought on yet more shocked reactions: We didn’t have any support? “We didn’t say that,” we replied. “We just don’t have any commercial support. We get our support from the Perl community.”

It was as if we’d told them, “We get our Thursdays from a banana” <<< (p. 256)

>>> Perl is a viable programming language today because millions of people woke up today loving Perl and, more important, loving one another in the context of Perl. <<< (pp. 257-258)

Transparent Wikipedia visualization


At the coffee machine the other day I was talking to John Winn about my forthcoming intern project with Linda Becker, and about the new word tree visualization on Many Eyes that I found. Fernanda Viégas and Martin Wattenberg gave a riveting PARC talk about Many Eyes which I picked up from Andrew’s post on his Information Aesthetics blog. In it they mention the surprising (to them) number of text based data sets (e.g. Shakespeare plays) which were uploaded to Many Eyes. But Many Eyes only had one simple text visualization – the tag cloud; so Fernanda and Martin locked themselves away for a week and brainstormed hundreds of text visualizations. Then their team implemented the best one of them, the word tree. I do like the word tree. Here’s how it is described on the Many Eyes site:

A word tree is a visual search tool for unstructured text, such as a book, article, speech or poem. It lets you pick a word or phrase and shows you all the different contexts in which it appears. The contexts are arranged in a tree-like branching structure to reveal recurrent themes and phrases.
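To show the branching structure that description implies, here is a minimal sketch of the underlying data structure. It is not the Many Eyes implementation: the function name, the `depth` cut-off and the restriction to a single-word root (rather than a phrase) are all simplifying assumptions of mine.

```python
import re


def build_word_tree(text: str, root: str, depth: int = 3) -> dict:
    """Nest the words that follow each occurrence of `root`, up to `depth` words deep."""
    # Crude tokenizer: lower-case runs of letters, digits and apostrophes.
    words = re.findall(r"[\w']+", text.lower())
    tree: dict = {}
    for i, word in enumerate(words):
        if word != root.lower():
            continue
        node = tree
        # Walk down the tree, sharing branches where contexts start with the same words.
        for follower in words[i + 1 : i + 1 + depth]:
            node = node.setdefault(follower, {})
    return tree


tree = build_word_tree("I love Perl and I love one another", "love", depth=2)
print(tree)  # {'perl': {'and': {}}, 'one': {'another': {}}}
```

Contexts that begin with the same words share a branch, which is what lets the real visualization reveal recurrent phrases.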

Martin gives a number of examples of its use on Many Eyes, from political speeches, through literary texts, to a funny example of the text people use in lonely hearts ads:

Back to the subject of this post. John wasn’t familiar with Fernanda’s work visualizing the history of Wikipedia articles. So I explained History Flow, the 2003/2004 work building visualizations like this one to show the build-up of different authors’ edits of a Wikipedia article. History Flow is written up in a brilliant CHI paper that shows just how much Wikipedia behaviour can be gleaned from studying these diagrams.

But the diagram, the visualization, is separate from the page itself. One couldn’t stare at the diagram and thus read the source article. It turns out that John had done his own visualization of Wikipedia pages. John reasoned that edits to a page can be thought of as a quality metric, i.e. a piece of text that survives multiple edits is likely to be of reasonable quality. Here’s that example of John’s idea again:

John describes the idea on his Wikipedia user page: the age of the text is reflected by its colour, so that text over two years old is rendered as standard text whereas text that is only ten minutes old is rendered on a red background. I’m not sure this is the best way to do it – the red colouring both draws attention to the new text and also makes it harder to read – but there is something interesting about the data visualization not obstructing one’s reading of the source article.
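One way such an age-to-colour mapping could work is sketched below. Only the two end points (red at ten minutes, standard text at two years) come from John’s description; the logarithmic fade in between, the white “standard” background and all the names are my assumptions, not his implementation.

```python
import math

TEN_MINUTES = 10 * 60               # seconds
TWO_YEARS = 2 * 365 * 24 * 60 * 60  # seconds


def background_for_age(age_seconds: float) -> str:
    """Return a CSS hex colour: red for text ten minutes old or newer, white for two years or older."""
    clamped = min(max(age_seconds, TEN_MINUTES), TWO_YEARS)
    # Position on a log scale: 0.0 at ten minutes, 1.0 at two years (assumed fade).
    t = (math.log(clamped) - math.log(TEN_MINUTES)) / (math.log(TWO_YEARS) - math.log(TEN_MINUTES))
    fade = round(255 * t)  # green and blue channels rise from 0 to 255 as the text ages
    return f"#ff{fade:02x}{fade:02x}"


print(background_for_age(5 * 60))         # brand new text
print(background_for_age(3 * TWO_YEARS))  # long-standing text
```

A log scale is used here because text ages span minutes to years; a linear fade would leave almost everything either fully red or fully white.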

I’m not on the first page of Google’s results for Tim Regan :-(


“Sad Google Screenshot”
from dumbledad

Boo hoo. Type Tim Regan into Google and my work homepage used to be top of the list. I don’t think that was a fair reflection of my fame or popularity versus other Tim Regans, just that computer scientists arrived early at the web. But it was cool. Today is a bad day: I’m not even listed on the first page 😦

———- edit ———-

Ahhh, it’s just the UK site – phew:

I must stop fretting about this – it’s not healthy.

Eek, "blib" is not a word!


from dominocat

At our regular Monday catch-up meeting I mentioned that I’d seen the adorable baby William and James on Sunday. I didn’t use the word adorable though, I used the word “blib” meaning cute, small, and … well … adorable. I was met with blank stares. So after the meeting I looked it up in the Oxford English Dictionary and various online slang dictionaries, but it wasn’t there. I now wonder where I got it from. Was it an eighties Essex word? How strange, I really thought blib was a real word.

You can say "social" on the radio


“What a radio looks like, 2” from genmon

A few years back we were doing a project (zCast) looking at novel uses of radio spectrum to deliver datacast content to mobile devices. One of the things that struck me was the incredible amount of innovation and excitement in the radio industry at present. That excitement spills over into other disciplines, which are using the radio as a metaphor in creative ways. On an aborted work blog for our team Anab wrote a post looking at some of the results of the Radio Project given to interaction design students at the RCA. Similarly, one of the treats on Richard’s and my recent trip up to the School of Design at Dundee was the fun “single station radios” that last year’s students had designed, including a Radio 3 radio shaped like the scroll of a violin.

Many of these designs use the radio as a thought piece. A great quote along those lines was that tuning a radio dial is a useful analogy for browsing the web. (NB I cannot remember the exact quote, nor who made it [Bill Buxton, Clay Shirky, no, not either of them, who???])

BBC Research and other innovative BBC groups are another excellent place for novel radio uses. This isn’t ‘radio as metaphor’, this is innovating radio listening itself. Take Tristan Ferne and Tom Coates et al’s Annotatable Audio (later renamed Find Listen Label). The idea is that the playback of a radio programme is enhanced by the addition of a wiki, whereby listeners can annotate sections of the programme. One imagines users’ different interests (particular topics, particular voice actors, etc) leading to multi-faceted rich annotations. As with Wikipedia some users might be good at starting a topic, while others might be good at making sure the segment boundaries are accurately described. Sadly it’s an archived prototype, rather than something we can use, though I guess the BBC’s listen again feature is only available for seven days whereas Find Listen Label would clearly work best with more permanent collections. Since stumbling across this work on Tom’s blog I’ve been hoping that I’ll get some flash of inspiration about how to build on this work – perhaps as some mobile media tagging prototype.

Another side of making radio more explicitly social is to share what you are listening to with your friends – either the music itself (a la Three Degrees or iTrip) or the fact that you are listening (a la Last FM or any ‘listening to’ tag line on a blog). This isn’t a clear-cut good idea. Back in 2003 I did a study of a prototype I’d built called Media Center Buddies. The idea was to explore what it was like to merge instant messaging with TV viewing. I built a ‘working’ prototype (well, it worked enough for short bursts in our usability labs) and recruited 32 participants to come and try it out. We recruited 16 heavy IM users and got them to bring a close friend. I wanted participants doubled up with friends. It bothers me when technological studies of TV use ignore the fact that TV is often viewed socially, and this seems especially problematic for social software since if several people are interacting in the same room it’s not clear whose buddies the system should connect to.

I won’t go into the results here (I presented them at NordiCHI 2004 and am working on a book chapter version) but as in any user study there were unexpected results. One was about sharing what you are watching with friends and family. My prototype didn’t include that feature, but I had mocked up a screen-shot showing a buddy-list resplendent with details of what buddies were watching. During the discussion phase I’d ask my participants what they felt about the idea: would they like to know what their buddies were watching, and would they like their buddies to know what they were watching? Everyone wanted to know what their buddies were watching, but the other question divided on gender. All my women participants (about 15 people) felt it was a great idea while all my men participants (about 17 people) felt it was an awful idea.

When I probed them as to why it was bad the answers that occurred more than once were “I don’t want my mom to know I’m watching porn” and “I don’t want my friends to know I’m watching Martha Stewart”. It’s tempting to think that this split results from differences between men’s and women’s viewing habits, and indeed that is probably most of the reason, but interestingly one of my women participants made a point of saying that she watched a lot of pornography. I wondered if another contributing difference was that men were more likely than women to watch things that they were ashamed of. You see a similar (though reversed) split in the sociological literature about alcoholism. It affects men and women but men’s drinking is often public, while women’s is (was?) often private. Interesting though this line of enquiry into the privacy of viewing habits was, it didn’t seem very useful for Microsoft so I haven’t followed it up. But one of my take-aways was that a service which offered the sharing of information about current viewing between buddies would work best if it was targeted at women.

“Three units looking left” from Schulze

But what about radio? Certainly the kind of issue I found with video shouldn’t affect radio. Sure, I might be embarrassed that I occasionally enjoy BBC Radio 2 but it’s not as strong – I’m not ashamed. Likewise for Last FM, it does bother me a bit that my listening becomes a visible part of my web identity (and it bothers me a lot that Last FM misses all my BBC iPlayer listening) but the pros outweigh the cons. Enter Olinda. I picked this up on the Make blog, though I should have spotted it on the BBC Radio Labs blog. The product design is wonderful, and the modular nature of the hardware is fascinating, but it’s the social computing that’s really intriguing. The Olinda is a DAB digital radio that connects to your home wi-fi network so that you can find out what your friends are listening to and they can find out about you. It’s done by Schulze and Webb for BBC Audio & Music Interactive. Wonderful. I do have some questions though. Radio is sometimes a solitary experience (e.g. the commute to work) but it is also playing in the heart of the family home – in the kitchen. Then whose taste is it reflecting? Is it my penchant for BBC Radio 3 and BBC Radio 4, my kid’s preference for BBC Radio 1 or Q103, or my wife’s preference for silence? Whose buddies is it sharing this knowledge with – mine, my daughter’s, the union, the intersection, etc? That’s the kind of issue we’ve been grappling with through our Epigraph project (and its successors) and it would be great to glean the Olinda team’s views on this. That said there are some wonderful sharing ideas in the explanatory pamphlet, e.g. klippit (cf. Grab-and-Share) and volume voting.

links for 2008-05-09