

bias mining in political bloggers’ link patterns

I was pretty excited by the work that Andy Baio and Joshua Schachter did to identify and show the political leanings in the link behavior of blogs that are monitored by Memeorandum. They used singular value decomposition [1] on an adjacency matrix between sources and items based on link data from 360 snapshots of Memeorandum’s front page.

For the political news aggregator project, we’ve been gathering link data from about 500 blogs. Our list of sources is less than half of theirs (I only include blogs that make full posts available in their feeds), but we do have full link data rather than snapshots, so I was curious if we would get similar results.

The first 10 columns of two different U matrices are below. They are both based on link data from 3 October to 7 November; the first includes items that had an in-degree of at least 4 (5934 items), the second includes items with an in-degree of at least 3 (9722 items). In the first, the second column (v2) seems to correspond fairly well to the political leaning of the blog; in the second, the third column (v3) is better.
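For readers who want to try this kind of decomposition, here is a minimal sketch in Python with NumPy. It is not the code used for the results in this post. The function name `source_factors`, the `links` dictionary layout, and the choice of a binary (0/1) matrix are all assumptions made for illustration. It builds a source-by-item adjacency matrix, keeps only items whose in-degree meets a threshold, and returns the first k columns of U.

```python
import numpy as np

def source_factors(links, min_in_degree=4, k=10):
    """Hypothetical helper: links maps a source name to the item URLs it linked to."""
    sources = sorted(links)
    # In-degree of an item = number of distinct sources linking to it.
    in_degree = {}
    for items in links.values():
        for item in set(items):
            in_degree[item] = in_degree.get(item, 0) + 1
    kept = sorted(i for i, d in in_degree.items() if d >= min_in_degree)
    col = {item: j for j, item in enumerate(kept)}
    # Binary adjacency matrix (an assumption): rows are sources, columns are items.
    A = np.zeros((len(sources), len(kept)))
    for r, s in enumerate(sources):
        for item in links[s]:
            if item in col:
                A[r, col[item]] = 1.0
    # Thin SVD: A = U @ diag(S) @ Vt. Each row of U describes one source.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    return sources, U[:, :k], S[:k]
```

The sign of each column of U is arbitrary, so a column can separate two groups of blogs without saying which group is which. Someone still has to look at a few known blogs to label the ends.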

I’ll be the first to say that I haven’t had much time to look at these results in any detail, and, as some of the commenters on Andy’s post noted, there are probably better approaches for identifying bias than SVD. If you’d like to play too, you can download a csv file with the sources and all links with an in-degree >= 2 (21517 items, 481 sources). Each row consists of the source title, source url, and then a list of the items the source linked to from 3 October to 7 November. Some sources were added partway through this window, and I didn’t collect link data from before they were added.
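If you want to load the file, something like the following sketch may help. It assumes that each linked item is its own comma-separated field after the title and url, which you should check against the actual file. The function name `read_links` and the sample rows are made up for illustration.

```python
import csv
import io

def read_links(handle):
    """Return {source_url: (title, set_of_item_urls)} from an open CSV handle."""
    links = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        title, url, *items = row
        # Assumption: every remaining field is one linked item.
        links[url] = (title, {item for item in items if item})
    return links

# Made-up rows in the assumed layout, standing in for the real download.
sample = io.StringIO(
    "Blog A,http://a.example/,http://x.example/1,http://x.example/2\n"
    "Blog B,http://b.example/,http://x.example/1\n"
)
links = read_links(sample)
```

For the real file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` sample.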

[1] One of the more helpful singular value decomposition tutorials I found was written by Kirk Baker and is available in PDF.

US political news and opinion aggregation

Working with Paul Resnick and Xiaodan Zhou, I’ve started a project to build political news aggregators that better reflect diversity and represent their users, even when there is an unknown political bias in the inputs. We’ll have more to say on this later, but for now we’re making available a Google gadget based on a prototype aggregator’s results.

The list of links is generated from link data from about 500 blogs and refreshed every 30 minutes. Some of the results will be news stories, some will be op-ed columns from major media services, others will be blog posts, and there are also some other assorted links.

At this early point in our work, the results tend to be more politically diverse than those of an aggregator such as Digg, but suffer from problems with redundancy (we aren’t clustering links about the same story yet). As our results get better, the set of links the gadget shows should improve.

Update 15 December: I twittered last week that I’ve added bias highlighting to the widget, but I should expand a bit on that here.

Inspired by Baio and Schachter’s coloring of political bias on Memeorandum, I’ve added a similar feature to the news aggregator widget. Links are colored according to the average bias of the blogs linking to them. This is not always a good predictor of the item’s bias or whether it better supports a liberal or conservative view. Sometimes a conservative blogger writes a post to which more liberal bloggers than conservative bloggers link, and in that case, the link will be colored blue.
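As a rough illustration of the averaging rule, here is a small Python sketch. It is not the widget’s actual code. The score scale (negative for liberal, positive for conservative), the `threshold` value, the use of red for conservative, and the function name `link_color` are all assumptions.

```python
def link_color(linking_blogs, blog_bias, threshold=0.1):
    """Color a link by the mean bias score of the blogs that link to it.

    blog_bias maps a blog name to a score; the scale is an assumption:
    negative means liberal, positive means conservative.
    """
    scores = [blog_bias[b] for b in linking_blogs if b in blog_bias]
    if not scores:
        return "none"  # no scored blog links to the item
    mean = sum(scores) / len(scores)
    if mean <= -threshold:
        return "blue"
    if mean >= threshold:
        return "red"
    return "none"  # too close to the middle to highlight

# Hypothetical scores: two liberal blogs and one conservative blog.
blog_bias = {"lib1": -0.8, "lib2": -0.6, "con1": 0.7}
color = link_color(["lib1", "lib2", "con1"], blog_bias)
```

In this sketch, the author of the linked post plays no part in the color. A conservative blogger’s post that draws mostly liberal links comes out blue, which is the case described above.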

If you don’t like the highlighting, you can turn it off in the settings.