Skip to content

{ Monthly Archives } November 2008

bias mining in political bloggers’ link patterns

I was pretty excited by the work that Andy Baio and Joshua Schachter did to identify and show the political leanings in the link behavior of blogs that are monitored by Memeorandum. They used singular value decomposition [1] on an adjacency matrix between sources and items based on link data from 360 snapshots of Memeorandum’s front page.

For the political news aggregator project, we’ve been gathering link data from about 500 blogs. Our list of sources is less than half of theirs (I only include blogs that make full posts available in their feeds), but we do have full link data rather than snapshots, so I was curious if we would get similar results.

The first 10 columns of two different U matrices are below. They are both based on link data from 3 October to 7 November; the first includes items that had an in-degree of at least 4 (5934 items), the second includes items with an in-degree of at least 3 (9722 items). In the first, the second column (v2) seems to correspond fairly well to the political leaning of the blog; in the second, the second column (v3) is better.

I’ll be the first to say that I haven’t had much time look at these results in any detail, and, as some of the commenters on Andy’s post noted, there are probably better approaches for identifying bias than SVD. If you’d like to play too, you can download a csv file with the sources and all links with an in-degree >= 2 (21517 items, 481 sources). Each row consists of the source title, source url, and then a list of the items the source linked to from 3 October to 7 November. Some sources were added part way though this window, and I didn’t collect link data from before they were added.

[1] One of the more helpful singular value decomposition tutorials I found was written by Kirk Baker and is available in PDF.

US political news and opinion aggregation

Working with Paul Resnick and Xiaodan Zhou, I’ve started a project to build political news aggregators that better reflect diversity and represent their users, even when there is an unknown political bias in the inputs. We’ll have more on this to say later, but for now we’re making available a Google gadget based on a prototype aggregator’s results.

The list of links is generated from link data from about 500 blogs and refreshed every 30 minutes. Some of the results will be news stories, some will be op-ed columns from major media services, others will be blog posts, and there are also some other assorted links.

At this early point in our work, the results tend to be more politically diverse than an aggregator such as Digg, but suffer from problems with redundancy (we aren’t clustering links about the same story yet). As our results get better, the set of links the gadget shows should improve.

Update 15 December: I twittered last week that I’ve added bias highlighting to the widget, but I should expand a bit on that here.

Inspired by Baio and Schachter’s coloring of political bias on Memeorandum, I’ve added a similar feature to the news aggregator widget. Links are colored according the average bias of the blogs linking to them. This is not always a good predictor of the item’s bias or whether it better supports a liberal or conservative view. Sometimes a conservative blogger writes a post to which more liberal bloggers than conservative bloggers, and in that case, the link will be colored blue.

If you don’t like the highlighting, you can turn it off in the settings.

wikis in organizations

Antero Aunesluoma presents at WikiFest

In early September, I attended WikiSym 08 in Porto, Portugal, so this post is nearly two months overdue. In addition to presenting a short paper on the use of a wiki to enhance organizational memory and sharing in a Boeing workgroup, I participated on the WikiFest panel organized by Stewart Mader.

Since then, a couple of people have asked me to post the outline of my presentation for the WikiFest panel. These notes are reflections from the Medshelf, CSS-D, SI, and Boeing workgroup wiki projects and are meant for those thinking about or getting started with deploying a wiki in a team. For those that have been working with wikis and other collaborative tools for a while, there probably aren’t many surprises here.

  1. Consider the wiki within your ecosystem of tools. For CSS-D and MedShelf, the wikis were able to offload many of the frequently asked questions (and, to an even greater extent, the frequent responses) from the corresponding email lists. This helps to increase the signal to noise ratio on the lists for list members that have been around for a while, and increasing their satisfaction with the lists and perhaps making them more likely to stick around.

    Another major benefit of moving some of this content from the mailing lists to the wiki is that new readers had less to read to get an answer. If you’ve ever search for the answer to a problem and found part of the solution in a message board or mailing list archive, you may be familiar with the experience of having to read through several proposed, partial solutions, synthesizing as you go, before arriving at the information you need. If all of that information is consolidated as users add it to the wiki, it can reduce the burden of synthesizing information from each time it is accessed to just each time someone adds new information to the wiki.

    In addition to considering how a wiki (or really, any other new tool) will complement your existing tools, consider what it can replace. At Boeing, the wiki meant that workgroup members could stop using another tool they didn’t like. If there was a directive to use the wiki in addition to the other tool, it probably wouldn’t have been as enthusiastically adopted. One of the reasons that the SI Wiki has floundered a bit is that there are at least three other digital places this sort of information is stored: two CTools sites and an intranet site. When people don’t know where to put things, sometimes we just don’t put them at all.

  2. Sometimes value comes from aggregation rather than synthesis. In the previous point, I made a big deal out of the value of using the wiki to synthesize information from threaded discussions and various other sources. When we started the MedShelf project, I was expecting all wikis to be used this way, but I was very wrong. With Medshelf, a lot of the value comes from individuals’ stories about coping with the illness. Trying to synthesize that into a single narrative or neutral article would have meant losing these individual voices, and for content like this, it aggregation — putting it all in the same place — can be the best approach.

    The importance of these individual voices also meant that many more pages than I expected were single-authored.

  3. Don’t estimate the value of a searchable & browsable collection. Using the workgroup wiki, team members have found the information need because they knew about one project and then were able to browse links to documentation other, related projects that had the information they needed. Browsing between a project page and a team member’s profile has also helped people to identify experts on a given topic. The previous tools for documenting projects didn’t allow for connections between different project repositories and made it hard to browse to the most helpful information. But this only works if you are adding links between related content on the wiki, or if your wiki engine automatically adds related links.

    For the wikis tied to mailing lists (CSS-D and Medshelf), some people arriving at the wiki through a search engine, looking for a solution to a particular problem, have browsed to the list information and eventually joined the list. This is certainly something that happens with mailing list archives, but which makes a better front door — the typical mailing list archive or a wiki?

  4. Have new users arrive in parallel rather than serial (after seeding the wiki with content).
  5. The Boeing workgroup wiki stagnated when it was initially launched, and did not really take off until the wiki evangelist organized a “wiki party” (snacks provided) where people could come and get started on documenting their past projects. Others call this a Barn Raising. This sort of event can give potential users both a bit of peer (or management) pressure and necessary technical support to get started adding content. It also serves the valuable additional role of giving community members a chance to express their opinions about how the tool can/should be used, and to negotiate group norms and expectations for the new wiki.

    Even if you can’t physically get people together — for the mailing list wikis, this was not practical — it’s good to have them arrive at the same time, and to have both some existing content and suggestions for future additions ready and waiting for them.

  6. Make your contributors feel appreciated. Wikis typically don’t offer the same affordances for showing gratitude as a threaded discussion, where it is usually easy to add a quick “thank you” reply or to acknowledge someone else’s contribution while adding new information. With wikis, thanks are sometimes rare, and users may see revisions to content they added as a sign that they did something wrong, rather than provided a good starting point to which others added. It can make a big difference to acknowledge particularly good writeups publicly in a staff meeting or on the mailing list, or to privately express thanks or give a compliment.

Continue reading ›