Thursday, September 04, 2008

Tales from the arXiv: Narcissistic Edition

My collaborators and I finally submitted our paper on Facebook networks last night. I normally post blog entries about my papers after the official journal version comes out, but this one was a royal pain in the ass ('particularly arduous' in more official language) to write, so I'm going to make an exception. Besides, this article is 38 journal-pages of sheer joy and community detection, so how can I not post about it now? (This is the longest article I've ever written, and it's longer than many doctoral theses. As I said, writing it was a pain in the ass, but I'm quite proud of the article and I hope it has a nice impact.)

First, here is the link to the arXiv posting.

Before I give the abstract, let me also shout out to my coauthors Mandi Traud (UNC student), Eric Kelsic (my 2005 Caltech SURF student, which is where this project all started), and Peter Mucha (UNC applied mathematician). Let me also give a shout-out to Aaron Clauset and James Fowler, whose thorough commentary on an older version of the manuscript was inordinately valuable and way above and beyond the call of duty. (I can't overstate my thanks for your help!) Without further ado...

Article title: Community Structure in Online Collegiate Social Networks

Article abstract: We apply the tools of network analysis to study the roles of university organizations and affiliations in structuring the social networks of students by examining the graphs of Facebook "friendships" at five American universities at a single point in time. In particular, we investigate each single-institution network's community structure, which we obtain by partitioning the graphs using an eigenvector method. We employ both graphical and quantitative tools, including pair-counting methods that we interpret through statistical analysis and permutation tests, to measure the correlations between the network communities and a set of self-identified user characteristics (residence, class year, major, and high school). We additionally investigate single-gender subsets of the university networks and also examine the impact of incomplete demographic information in the data. Our study across five universities allows one to make comparative observations about the online social lives at the different institutions, which can in turn be used to infer differences in offline lives. It also illustrates how to examine different instances of social networks constructed in similar environments, while emphasizing the array of social forces that combine to form simplified "communities" obtainable by the consideration of the friendship links. In an appendix, we review the basic properties and statistics of the employed pair-counting similarity coefficients and recall, in simplified notation, a useful analytical formula for the z-score of the Rand coefficient.
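
For anyone who hasn't met pair counting before, here is a minimal Python sketch of the idea, not the code from the paper. It computes the Rand coefficient (the fraction of node pairs that two partitions treat the same way) and then estimates a z-score by shuffling one set of labels. Note that the shuffling step is a stand-in: the paper's appendix gives an analytical formula for the z-score, which is not reproduced here. The function names, the toy labels, and the parameters n_perm and seed are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev
import random


def rand_coefficient(labels_a, labels_b):
    """Fraction of node pairs on which the two partitions agree.

    A pair "agrees" when both partitions put the two nodes together,
    or both partitions put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same nodes")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total


def permutation_z_score(labels_a, labels_b, n_perm=1000, seed=0):
    """Estimate a z-score for the Rand coefficient by shuffling labels_b.

    This is a Monte Carlo approximation (an assumption of this sketch),
    not the analytical formula mentioned in the abstract.
    """
    rng = random.Random(seed)
    observed = rand_coefficient(labels_a, labels_b)
    shuffled = list(labels_b)
    samples = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        samples.append(rand_coefficient(labels_a, shuffled))
    spread = pstdev(samples)
    if spread == 0:
        return float("nan")
    return (observed - mean(samples)) / spread


if __name__ == "__main__":
    # Hypothetical toy data: detected communities vs. a self-reported trait.
    communities = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
    residences = ["a", "a", "b", "b", "b", "b", "c", "c", "c", "a"]
    print(rand_coefficient(communities, residences))
    print(permutation_z_score(communities, residences, n_perm=500))
```

A large positive z-score would suggest that the communities line up with the characteristic more closely than a random relabeling would, which is the kind of comparison the abstract describes across residence, class year, major, and high school.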
