
privacy on twitter vs. privacy on facebook

In a post describing some teens’ use of Twitter and Facebook (Twitter is for friends; Facebook is everybody; some teens are using private Twitter accounts for communication with friends because Facebook is too public), danah boyd poses the following question:

My guess is that if Twitter does take off among teens and Dylan’s friends feel pressured to let peers and parents and everyone else follow them, the same problem will arise and Twitter will become public in the same sense as Facebook. This of course raises a critical question: will teens continue to be passionate about systems that become “public” (to all that matter) simply because there’s social pressure to connect to “everyone”?

I believe that Twitter may actually be much more resistant to both this pressure and subsequent switch to less “public” platforms than Facebook for two reasons: account norms and Twitter clients.

Account Norms, Privacy, and Collapsed Contexts
On Facebook, everyone pretty much gets one account.1 This leaves me with a choice of collapsed contexts (same profile for everyone) or only friending people from a particular context or set of contexts. There are many fine-grained privacy controls, but this all adds up to a more-is-less experience, at least for me. There are so many controls that I don’t particularly remember what I’ve set to be visible to whom. When I comment on something in a friend’s profile (or am tagged in one of their photos), I don’t know who can see that.

With Twitter, people can have multiple accounts, and for private accounts, they know exactly who can see their posts: only people to whom they give permission. This is not to say Twitter is without privacy pitfalls – e.g. plenty of private tweets get retweeted or replied to on others’ public accounts – but I have a much clearer idea of who can see a status update or reply on Twitter than I do of who can see similar content on Facebook at the time of posting. I suspect that many users of private Twitter accounts do so just to avoid the “what if so-and-so sees this?” question. So it seems reasonable that people could have different accounts for their work, family, friends, etc. personas, though there’s a point at which it probably would be too many.

Twitter Clients
Having multiple accounts wouldn’t work well without an appropriate interface, and here Twitter benefits hugely from its API and the many, many Twitter clients available. Using more than one Facebook account, especially simultaneously, is an ordeal – multiple web browsers, no aggregation. With the right client, reading from and posting to multiple Twitter accounts is a breeze.

So while there may eventually be an exit from a more public Twitter, I think there is more room to move within the same service, diversifying accounts, than there might be on Facebook. This will only work, though, if people are willing to set boundaries and accept boundaries – and probably not if mom and dad insist on following the Twitter account their kids use to communicate with friends from school, or if colleagues regularly feel insulted when a coworker-acquaintance declines their request to follow an account they use to communicate with close friends.

1 I believe this used to be part of the terms of service, but I don’t see it anymore and can’t be sure that it was ever there.


Sidelines at ICWSM

Last week I presented our first Sidelines paper (with Daniel Zhou and Paul Resnick) at ICWSM in San Jose. Slides (hosted on slideshare) are embedded below.

Opinion and topic diversity in the output sets can provide individual and societal benefits. News aggregators that rely on votes and links to select subsets of the large quantity of news and opinion items generated each day may not yield as much diversity as is present in the overall pool of votes and links if they simply select the most popular items.

To help measure how well any given approach does at achieving these goals, we developed three diversity metrics that address different dimensions of diversity: inclusion/exclusion, nonalienation, and proportional representation (based on KL divergence).
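As a rough illustration of the third metric, KL divergence compares two distributions. The sketch below assumes the comparison is between each opinion group's share of votes and its share of selected items. That pairing, the function name, and the smoothing floor `eps` are assumptions made for illustration, not the definitions used in the paper.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """Return D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats.

    p and q are sequences of probabilities over the same groups.
    eps is an illustrative floor that keeps the result finite when a
    group with p_i > 0 has no representation at all in q.
    """
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical shares: 60% of votes come from one group, 40% from another.
vote_share = [0.6, 0.4]
print(kl_divergence(vote_share, [0.5, 0.5]))  # result set split evenly
print(kl_divergence(vote_share, [1.0, 0.0]))  # second group shut out
```

Lower values mean the result set mirrors the pool of votes more closely; identical shares give zero.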

To increase diversity in result sets chosen based on user votes (or things like votes), we developed the sidelines algorithm. This algorithm temporarily suppresses a voter’s preferences after a preferred item has been selected. In comparison to collections of the most popular items, from user votes on Digg.com and links from a panel of political blogs, the Sidelines algorithm increased inclusion while decreasing alienation. For the blog links, a set with known political preferences, we also found that Sidelines improved proportional representation.
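The idea of temporarily suppressing voters can be sketched in a few lines. This is a minimal sketch built only from the one-sentence description above, not the implementation from the paper: the `turns` parameter, the alphabetical tie-breaking, and the fallback when every remaining voter is sidelined are all assumptions.

```python
from collections import Counter


def tally(votes, candidates, skip=()):
    """Count votes for each candidate item, ignoring voters in skip."""
    counts = Counter()
    for voter, items in votes.items():
        if voter in skip:
            continue
        counts.update(item for item in items if item in candidates)
    return counts


def most_popular(votes, k):
    """Baseline: the k items with the most votes (ties broken by name)."""
    counts = tally(votes, set().union(*votes.values()))
    return sorted(counts, key=lambda item: (-counts[item], item))[:k]


def sidelines(votes, k, turns=1):
    """Greedy selection that sidelines a winner's voters for `turns` rounds.

    votes maps each voter to the set of items that voter supports.
    """
    candidates = set().union(*votes.values())
    benched = {}  # voter -> rounds left on the sidelines
    selected = []
    while candidates and len(selected) < k:
        counts = tally(votes, candidates, skip=benched)
        if not counts:
            # Assumption: if nobody active has a vote left, count everyone.
            counts = tally(votes, candidates)
        winner = min(counts, key=lambda item: (-counts[item], item))
        selected.append(winner)
        candidates.discard(winner)
        # Age the existing sidelines, then sideline the winner's voters.
        benched = {v: n - 1 for v, n in benched.items() if n > 1}
        for voter, items in votes.items():
            if winner in items:
                benched[voter] = turns
    return selected


# Hypothetical votes: a three-voter majority and a two-voter minority.
votes = {
    "a": {"x", "y"}, "b": {"x", "y"}, "c": {"x", "y"},
    "d": {"z"}, "e": {"z"},
}
print(most_popular(votes, 2))  # ['x', 'y']
print(sidelines(votes, 2))     # ['x', 'z']
```

In this toy example the popularity baseline gives the minority voters nothing they voted for, while the sidelined version includes one of their items.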

Our approach differs from and is complementary to work that selects for diversity or identifies bias based on classifying content (e.g. Park et al, NewsCube) or by classifying referring blogs or voters (e.g. Gamon et al, BLEWS). While Sidelines requires votes (or something like votes), it doesn’t require any information about content, voters, or long term voting histories. This is particularly useful for emerging topics and opinion groups, as well as for non-textual items.


SI182 Final Projects

A belated congrats to all of the EECS182/SI182 students on finishing the semester. For those not familiar with the course, SI182 is an intro to programming course in the informatics program at UM. Paul Resnick and I taught it this past semester, and arranged the course around pulling data from public feeds, processing this data, and presenting it again, online, in a way that adds value.

Here’s a sampling of the final projects:

Also, a huge thanks to Chuck Severance, who got this course started and gave us early chapters of his book Using Google App Engine, which gave us the confidence to use App Engine in the course and which we were able to rely on for class readings.


ann arbor craigslist housing ads mapped

I tend to begin my housing search on Craigslist, with one or more general areas where I’d like to live in mind. Because location matters to me, I’ve found HousingMaps.com to be incredibly helpful. Unfortunately, it doesn’t include Ann Arbor. Other sites do, but aren’t really compatible with how I search for housing – I tend to search across rooms & shares and apartment & house rentals. I do have a definite price ceiling. I’ll often have a potential housemate or two, but with some flexibility in case preferences ultimately diverge. I also want to be able to limit a display to just new listings since I last checked.

So, after growing impatient with existing tools, I put together a map of Ann Arbor Craigslist housing listings (sorry, I don’t include homes for sale. It’s not the right time to buy, anyway). You can:

  • simultaneously display each of the listing types,
  • filter by price, date posted, and number of bedrooms (all as sliders with min and max), and
  • adjust the price filters to work on price per bedroom.

This was pretty fast to throw together thanks to BeautifulSoup, the Google Maps API, and YUI, but I’m sure that there are some rough edges that will need to get worked out.
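The price filter, including the per-bedroom option, comes down to a small predicate. The sketch below is only an illustration of that logic: the `Listing` fields and the `matches_price` function are hypothetical names, not the code behind the map, and treating a room with zero listed bedrooms as one bedroom is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """A hypothetical, already-parsed housing ad."""
    title: str
    price: int      # monthly rent in dollars
    bedrooms: int
    kind: str       # e.g. "rooms & shares" or "apartment & house rentals"


def matches_price(listing, min_price, max_price, per_bedroom=False):
    """Return True if the listing's price falls inside the slider range.

    With per_bedroom=True the range is applied to rent divided by the
    number of bedrooms (assumed to be at least 1).
    """
    price = listing.price
    if per_bedroom:
        price = price / max(listing.bedrooms, 1)
    return min_price <= price <= max_price


house = Listing("3br house near campus", 1800, 3, "apartment & house rentals")
print(matches_price(house, 500, 700))                    # False
print(matches_price(house, 500, 700, per_bedroom=True))  # True
```

Filtering per bedroom is what lets a shared house and a single room show up in the same search with one price ceiling.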

Also, I don’t know or care whether it works in IE.

Update, 19 February: More locations now available.


visualizing political blogs’ linking

There are a number of visualizations of political bloggers’ linking behavior, notably Adamic and Glance’s 2005 work that found political bloggers of one bias tend to link to others of the same bias. Also check out Linkfluence’s Presidential Watch 08 map, which indicates similar behavior.

These visualizations are based on graphs of when one blog links to another. I was curious to what extent this two-community behavior occurs if you include all of the links from these blogs (such as links to news items, etc). Since I have link data for about 500 blogs from the news aggregator work, it was straightforward to visualize a projection of the bipartite blog->item graph. To classify each blog as liberal, conservative, or independent, I used a combination of the coding from Presidential Watch, Wonkosphere, and my own reading.


Projection of links from political blogs to items (Oct - Nov 2008). Layout using GEM algorithm in GUESS.

The visualization shows blogs as nodes. Edges represent shared links (at least 6 items must be shared before drawing an edge) and are sized based on their weight. Blue edges run between liberal blogs, red edges between conservative blogs, maroon between conservative and independent, violet blue between liberal and independent, purple between independent blogs, and orange between liberal and conservative blogs. Nodes are sized as a log of their total degree. This visualization is formatted to appear similar to the Adamic and Glance graph, though there are some important differences, principally because this graph is undirected and because I have included independent blogs in the sample.
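For readers who want to try something similar, the projection and edge coloring described above can be sketched without any graph library. This is a minimal sketch, not the script used for the figure: the function names and the toy data are made up, and node sizing and layout (done in GUESS for the figure) are left out.

```python
from itertools import combinations

# Edge colors as described in the text, keyed by the pair of leanings.
EDGE_COLORS = {
    frozenset(["liberal"]): "blue",
    frozenset(["conservative"]): "red",
    frozenset(["independent"]): "purple",
    frozenset(["conservative", "independent"]): "maroon",
    frozenset(["liberal", "independent"]): "violet blue",
    frozenset(["liberal", "conservative"]): "orange",
}


def project(links, min_shared=6):
    """Project the bipartite blog->item graph onto blogs.

    links maps each blog to the set of items it linked to. Two blogs get
    an undirected edge, weighted by the number of shared items, when they
    share at least min_shared items.
    """
    edges = {}
    for a, b in combinations(sorted(links), 2):
        shared = len(links[a] & links[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges


def edge_color(leaning_a, leaning_b):
    """Look up the edge color for two blogs' leanings."""
    return EDGE_COLORS[frozenset([leaning_a, leaning_b])]


# Toy data with a low threshold so the example produces an edge.
links = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {4, 5}}
print(project(links, min_shared=2))          # {('A', 'B'): 2}
print(edge_color("liberal", "conservative"))  # orange
```

Comparing every pair of blogs is quadratic in the number of blogs, which is fine for a few hundred sources.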

This is just a quick look, but we can see that the overall linking behavior still produces two fairly distinct communities, though a bit more connected than just the graph of blog to blog links. It’d be fun to remove the linked blog posts from this data (leaving mostly linked news items) to see if that changes the picture much. Are some media sources setting the agenda for bloggers of both parties, or are the conservative bloggers reading and reacting to one set of news items and liberal bloggers reading and reacting to another? I.e., is the homophily primarily in links to opinion articles, or does it also extend to the linked news items?

I’m out of time at this point in the semester, though, so that will have to wait.


bias mining in political bloggers’ link patterns

I was pretty excited by the work that Andy Baio and Joshua Schachter did to identify and show the political leanings in the link behavior of blogs that are monitored by Memeorandum. They used singular value decomposition [1] on an adjacency matrix between sources and items based on link data from 360 snapshots of Memeorandum’s front page.

For the political news aggregator project, we’ve been gathering link data from about 500 blogs. Our list of sources is less than half of theirs (I only include blogs that make full posts available in their feeds), but we do have full link data rather than snapshots, so I was curious if we would get similar results.

The first 10 columns of two different U matrices are below. They are both based on link data from 3 October to 7 November; the first includes items that had an in-degree of at least 4 (5934 items), the second includes items with an in-degree of at least 3 (9722 items). In the first, the second column (v2) seems to correspond fairly well to the political leaning of the blog; in the second, the third column (v3) is better.
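To make the idea concrete, here is a minimal sketch of SVD on a tiny source-by-item matrix using NumPy. The six sources and seven items are invented, and the matrix is deliberately idealized (every source in a group links to the same items, plus one item everyone links to), so it only shows why a column of U can line up with political leaning; real link data is much noisier.

```python
import numpy as np

sources = ["lib-1", "lib-2", "lib-3", "con-1", "con-2", "con-3"]

# Rows are sources, columns are items; 1 means the source linked to the item.
# The last column is an item that every source linked to.
A = np.array([
    [1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
], dtype=float)

# U holds one row per source; its columns are the left singular vectors.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The first column mostly reflects overall linking activity. The second
# column separates the two groups: one group gets positive values and the
# other negative (which group gets which sign is arbitrary).
second = U[:, 1]
for name, value in zip(sources, second):
    print(f"{name}: {value:+.2f}")
```

Because the sign of a singular vector is arbitrary, only the split between the groups is meaningful, not which side comes out positive.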

I’ll be the first to say that I haven’t had much time to look at these results in any detail, and, as some of the commenters on Andy’s post noted, there are probably better approaches for identifying bias than SVD. If you’d like to play too, you can download a csv file with the sources and all links with an in-degree >= 2 (21517 items, 481 sources). Each row consists of the source title, source url, and then a list of the items the source linked to from 3 October to 7 November. Some sources were added partway through this window, and I didn’t collect link data from before they were added.
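If you do download the file, something like the following could be a starting point for loading it. This is a sketch under an assumption about the layout: that the linked items simply follow the title and url as extra columns in each row. The function names and the sample rows are made up.

```python
import csv
import io
from collections import Counter


def read_links(handle):
    """Read rows of (source title, source url, item, item, ...).

    Returns a dict mapping each source url to the set of items it linked to.
    """
    links = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        _title, url, *items = row
        links[url] = {item for item in items if item}
    return links


def popular_items(links, min_in_degree):
    """Return the items linked to by at least min_in_degree sources."""
    in_degree = Counter(item for items in links.values() for item in items)
    return {item for item, n in in_degree.items() if n >= min_in_degree}


# Invented sample rows in the assumed layout.
sample = (
    "Blog A,http://a.example/,http://n.example/1,http://n.example/2\n"
    "Blog B,http://b.example/,http://n.example/2\n"
)
links = read_links(io.StringIO(sample))
print(popular_items(links, 2))  # {'http://n.example/2'}
```

Raising `min_in_degree` reproduces the kind of thresholding used for the two matrices above (in-degree of at least 3 or 4).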

[1] One of the more helpful singular value decomposition tutorials I found was written by Kirk Baker and is available in PDF.


US political news and opinion aggregation

Working with Paul Resnick and Xiaodan Zhou, I’ve started a project to build political news aggregators that better reflect diversity and represent their users, even when there is an unknown political bias in the inputs. We’ll have more on this to say later, but for now we’re making available a Google gadget based on a prototype aggregator’s results.

The list of links is generated from link data from about 500 blogs and refreshed every 30 minutes. Some of the results will be news stories, some will be op-ed columns from major media services, others will be blog posts, and there are also some other assorted links.

At this early point in our work, the results tend to be more politically diverse than an aggregator such as Digg, but suffer from problems with redundancy (we aren’t clustering links about the same story yet). As our results get better, the set of links the gadget shows should improve.

Update 15 December: I twittered last week that I’ve added bias highlighting to the widget, but I should expand a bit on that here.

Inspired by Baio and Schachter’s coloring of political bias on Memeorandum, I’ve added a similar feature to the news aggregator widget. Links are colored according to the average bias of the blogs linking to them. This is not always a good predictor of the item’s bias or whether it better supports a liberal or conservative view. Sometimes a conservative blogger writes a post to which more liberal bloggers than conservative bloggers link, and in that case, the link will be colored blue.

If you don’t like the highlighting, you can turn it off in the settings.
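The coloring rule can be sketched as an average over the linking blogs. This is an illustration rather than the widget's code: the -1 (liberal) to +1 (conservative) scale, the use of red for conservative-leaning links, and the function name are assumptions.

```python
def link_color(linking_blogs, bias):
    """Color a link by the average bias of the blogs that link to it.

    bias maps a blog to a score from -1 (liberal) to +1 (conservative).
    Blogs without a known score are ignored.
    """
    scores = [bias[blog] for blog in linking_blogs if blog in bias]
    if not scores:
        return "none"
    mean = sum(scores) / len(scores)
    if mean < 0:
        return "blue"
    if mean > 0:
        return "red"
    return "none"


# Hypothetical scores: two liberal blogs and one conservative blog.
bias = {"lib-a": -1.0, "lib-b": -1.0, "con-a": 1.0}

# A post linked to by more liberal than conservative blogs is colored blue,
# whoever wrote it.
print(link_color(["lib-a", "lib-b", "con-a"], bias))  # blue
```

The color describes who is linking to an item, which is why it can disagree with the item's own slant.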


wikis in organizations

Antero Aunesluoma presents at WikiFest

In early September, I attended WikiSym 08 in Porto, Portugal, so this post is nearly two months overdue. In addition to presenting a short paper on the use of a wiki to enhance organizational memory and sharing in a Boeing workgroup, I participated on the WikiFest panel organized by Stewart Mader.

Since then, a couple of people have asked me to post the outline of my presentation for the WikiFest panel. These notes are reflections from the Medshelf, CSS-D, SI, and Boeing workgroup wiki projects and are meant for those thinking about or getting started with deploying a wiki in a team. For those that have been working with wikis and other collaborative tools for a while, there probably aren’t many surprises here.

  1. Consider the wiki within your ecosystem of tools. For CSS-D and MedShelf, the wikis were able to offload many of the frequently asked questions (and, to an even greater extent, the frequent responses) from the corresponding email lists. This helps to increase the signal to noise ratio on the lists for list members that have been around for a while, increasing their satisfaction with the lists and perhaps making them more likely to stick around.

    Another major benefit of moving some of this content from the mailing lists to the wiki is that new readers had less to read to get an answer. If you’ve ever searched for the answer to a problem and found part of the solution in a message board or mailing list archive, you may be familiar with the experience of having to read through several proposed, partial solutions, synthesizing as you go, before arriving at the information you need. If all of that information is consolidated as users add it to the wiki, it can reduce the burden of synthesizing information from each time it is accessed to just each time someone adds new information to the wiki.

    In addition to considering how a wiki (or really, any other new tool) will complement your existing tools, consider what it can replace. At Boeing, the wiki meant that workgroup members could stop using another tool they didn’t like. If there had been a directive to use the wiki in addition to the other tool, it probably wouldn’t have been as enthusiastically adopted. One of the reasons that the SI Wiki has floundered a bit is that there are at least three other digital places where this sort of information is stored: two CTools sites and an intranet site. When people don’t know where to put things, sometimes we just don’t put them at all.

  2. Sometimes value comes from aggregation rather than synthesis. In the previous point, I made a big deal out of the value of using the wiki to synthesize information from threaded discussions and various other sources. When we started the MedShelf project, I was expecting all wikis to be used this way, but I was very wrong. With MedShelf, a lot of the value comes from individuals’ stories about coping with the illness. Trying to synthesize that into a single narrative or neutral article would have meant losing these individual voices, and for content like this, aggregation (putting it all in the same place) can be the best approach.

    The importance of these individual voices also meant that many more pages than I expected were single-authored.

  3. Don’t underestimate the value of a searchable & browsable collection. Using the workgroup wiki, team members have found information because they knew about one project and were then able to browse links to documentation for other, related projects that had the information they needed. Browsing between a project page and a team member’s profile has also helped people to identify experts on a given topic. The previous tools for documenting projects didn’t allow for connections between different project repositories and made it hard to browse to the most helpful information. But this only works if you are adding links between related content on the wiki, or if your wiki engine automatically adds related links.

    For the wikis tied to mailing lists (CSS-D and Medshelf), some people arriving at the wiki through a search engine, looking for a solution to a particular problem, have browsed to the list information and eventually joined the list. This is certainly something that happens with mailing list archives, but which makes a better front door — the typical mailing list archive or a wiki?

  4. Have new users arrive in parallel rather than serial (after seeding the wiki with content). The Boeing workgroup wiki stagnated when it was initially launched, and did not really take off until the wiki evangelist organized a “wiki party” (snacks provided) where people could come and get started on documenting their past projects. Others call this a Barn Raising. This sort of event can give potential users both a bit of peer (or management) pressure and necessary technical support to get started adding content. It also serves the valuable additional role of giving community members a chance to express their opinions about how the tool can/should be used, and to negotiate group norms and expectations for the new wiki.

    Even if you can’t physically get people together (for the mailing list wikis, this was not practical), it’s good to have them arrive at the same time, and to have both some existing content and suggestions for future additions ready and waiting for them.

  5. Make your contributors feel appreciated. Wikis typically don’t offer the same affordances for showing gratitude as a threaded discussion, where it is usually easy to add a quick “thank you” reply or to acknowledge someone else’s contribution while adding new information. With wikis, thanks are sometimes rare, and users may see revisions to content they added as a sign that they did something wrong, rather than that they provided a good starting point to which others added. It can make a big difference to acknowledge particularly good writeups publicly in a staff meeting or on the mailing list, or to privately express thanks or give a compliment.



palace ball

For nearly a month now, I’ve been obligated to write a post explaining the rules of Palace Ball (during times of heightened nationalism, it may also be called Freedom Ball). There isn’t a whole lot to it beyond what you see in the above video, but it works roughly as follows:


Palace ball field

  1. You can play on a rectangular or square field; something about the size of a tennis court or a little larger should work. There are end zones at opposite ends. There is no out of bounds at the sides.
  2. The primary ball (or palace ball) starts in the center of the field. It should probably be 18-20″ in diameter, maybe a bit more. Something like this ball should work well. Experiment for best results.
  3. Each of the two players starts with a ball for bowling / throwing at the primary ball. These are called bowlers. They should be about 10″ in diameter and can either be kickballs or playground balls like the palace ball. The bowlers should be different colors, since each player can only use his or her own bowler. Unlike in the video, each player should have the same type and size of bowler, or it’s just unfair (I still contend that this is the only reason Ben won the match in the video).
  4. When the game starts, each player tries to repeatedly throw / toss / bowl their bowler into the palace ball, pushing it into the opponent’s end zone. They must release their bowler at least 3 feet from the palace ball.
  5. You can play for a set period of time, or to a certain score.

The game would probably work quite well with doubles (still one bowler per team), but more than that is probably a bit much.


Smart Mobs, iPhone 3G, and AT&T’s Direct Fulfillment process

This is an iPhone post. I’d been waiting to replace my sometimes-barely functioning phone for a good while, so, like many others, I showed up at a local AT&T store on Friday in hopes of getting my iPhone. After spending an embarrassing amount of time in line, and shortly before getting to the front, we were told that the store was out. No problem, I’d place an order and get it when it shows up.

I didn’t think too much about it until a few days later when someone who ordered the same model and color phone at the same store several hours after me mentioned that their phone had shipped. Mine hadn’t, so the sequence of order fulfillment seemed a bit strange. Curious and confused, I turned to Google. This led me to several threads and blog posts discussing AT&T’s Direct Fulfillment system, the longest of which is a now 220-page thread on AT&T’s own customer support system. The discussion in this thread is interesting to me as a customer and as a student. Though the thread contains a bit of vitriol, misinformation, and even paranoia, the posters are able to work together to build a fairly coherent model of AT&T’s direct fulfillment process.

The thread starts out with questions about whether others have received their phones — customers’ questions that can help them calibrate their own expectations. Some eager customers soon noticed that in addition to checking their own order status, AT&T’s order status system allows users to view and track orders from others in their zip code by simply incrementing the order number in the URL. From this, users notice that some orders, placed after their own unshipped orders, have already shipped — is the system unfair somehow, or are some models just shipping sooner? The posters share information and anecdotes that confirm that at least some orders for the same model and color of phone are being posted out of order.

Elsewhere on the web, Greg de Vitry builds a tool that scrapes a range of order numbers and aggregates data from several users to count total daily shipments. The tool’s users see the tool’s shipment tally and begin questioning AT&T’s official statement that they are shipping tens of thousands of orders per day. Greg soon updates his tool to collect model numbers, which again confirms that orders are not being shipped according to first-in-first-out. As more users enter their information, it becomes plausible (if not likely) that forum readers and users of the tool have a better overview of the direct fulfillment process than many of AT&T’s own frontline employees.
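The two checks the forum users were effectively running, a daily shipment tally and a test for orders filled out of sequence, can be sketched as follows. This is not Greg's tool: the `Order` record, the order numbers, models, and dates are all invented, and the sketch assumes that order numbers increase in the order the orders were placed.

```python
from collections import Counter, namedtuple

# shipped_on is a date string, or None if the order has not shipped.
Order = namedtuple("Order", "number model shipped_on")


def daily_shipments(orders):
    """Count shipped orders per day."""
    return Counter(o.shipped_on for o in orders if o.shipped_on)


def skipped_orders(orders):
    """Return unshipped orders that a later order of the same model passed.

    If shipping were strictly first-in-first-out per model, this list
    would be empty.
    """
    latest_shipped = {}
    for o in orders:
        if o.shipped_on:
            latest_shipped[o.model] = max(
                latest_shipped.get(o.model, o.number), o.number
            )
    return [
        o for o in orders
        if not o.shipped_on and o.number < latest_shipped.get(o.model, -1)
    ]


# Invented records pooled from several users.
orders = [
    Order(1001, "16GB black", None),
    Order(1002, "16GB black", "day 1"),
    Order(1003, "8GB black", "day 1"),
    Order(1004, "16GB black", "day 2"),
]
print(daily_shipments(orders))
print(skipped_orders(orders))
```

Grouping by model matters here: without the model numbers, an earlier unshipped order could simply be for a model that was shipping later.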

The thread’s users eventually begin to seek media attention, hoping that if they expose the number of unshipped orders and haphazard fashion in which they are being filled, Apple and AT&T will be embarrassed enough to ship them their phones faster or compensate them. Users post to CNN’s iReport and email Fox News.

In addition to sharing information, the thread’s posters are telling jokes, commiserating together, and wishing each other luck. The conversation feels very similar to the conversations in the line outside of the AT&T store on Friday, except the forum posters have more diversity in information and can share it with the entire virtual line much more easily than they could with the local lines.
