Thursday, October 18, 2007

Interesting new paper (on topics near and dear to my heart)

Here is a really interesting new paper that just got posted to the arXiv.

It discusses a very interesting way of constructing datasets for social networks, one that I find very appealing and would like to use in future research. (In fact, even though I haven't had the chance to read it closely yet --- I started rectifying that earlier this evening, though I don't plan to finish reading it until tomorrow --- this paper has already given me some new ideas for undergraduate and/or Masters research projects for this summer.) Moreover, the working examples in this paper cover three of my favorite groups of people: physicists, U.S. Senators, and baseball players. (It's like the authors are trying to steal my heart: not only did they do something really interesting with real-world networks, but look at the examples they chose! And they even cited one of my papers. If I ever decide to deal with children, I think I'll simply have to adopt some of the authors of this paper.) Also, I very much appreciate the use of the word "googling" in the paper's title. I use the word a lot (and I know others among my crowd at least use it occasionally), but I had never previously seen it in the title of a scientific paper.

Here is the paper's abstract:

Recently, massive digital records have made it possible to analyze a huge amount of data in social sciences such as social network theory. We investigate social networks between people by extracting information on the World Wide Web. Using famous search engines such as Google, we quantify relatedness between two people as the number of Web pages including both of their names and construct weighted social relatedness networks. The weight and strength distributions are found to be quite broad. A class of measure called the Rényi disparity, characterizing the homogeneity of weight distribution for each node, is presented. We introduce the maximum relatedness subnetwork, which extracts the most essential relation for each individual. We analyze the members of the 109th United States Senate as an example and demonstrate that the methods of construction and analysis are applicable to various other social groups and weighted networks.
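
To make the abstract's pipeline a bit more concrete, here is a minimal Python sketch of how I read it. Everything in it is my own assumption rather than the authors' code: the names `pair_hits`, `build_network`, `strength`, `renyi_effective_neighbors`, and `maximum_relatedness_edges` are hypothetical, the hit counts are invented placeholders (no search engine is actually queried), and the Rényi quantity is simply the exponential of the standard order-alpha Rényi entropy of a node's normalized weights, which may not match the paper's exact definition of the Rényi disparity. Likewise, "keep each person's single heaviest link" is only my guess at what the maximum relatedness subnetwork does.

```python
import math
from collections import defaultdict

# Hypothetical pairwise hit counts: the number of Web pages containing both names.
# These numbers are invented placeholders, not real search results.
pair_hits = {
    ("A", "B"): 120,
    ("A", "C"): 30,
    ("B", "C"): 45,
    ("C", "D"): 10,
}


def build_network(pair_hits):
    """Return a symmetric weighted adjacency dict, skipping pairs with zero hits."""
    w = defaultdict(dict)
    for (u, v), hits in pair_hits.items():
        if hits > 0:
            w[u][v] = hits
            w[v][u] = hits
    return dict(w)


def strength(w, node):
    """Node strength: the sum of the weights on all links attached to the node."""
    return sum(w[node].values())


def renyi_effective_neighbors(w, node, alpha=2.0):
    """Exponential of the order-alpha Renyi entropy of a node's normalized weights.

    Assumption: this is one natural homogeneity measure (an "effective number
    of neighbors"); the paper's own Renyi disparity may be defined differently.
    """
    s = strength(w, node)
    probs = [x / s for x in w[node].values()]
    if math.isclose(alpha, 1.0):
        # The alpha -> 1 limit is the exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in probs))
    return sum(p ** alpha for p in probs) ** (1.0 / (1.0 - alpha))


def maximum_relatedness_edges(w):
    """Map each node to the neighbor it shares the largest weight with.

    Assumption: this is my reading of "the most essential relation for each
    individual"; ties are broken by dictionary order.
    """
    return {u: max(nbrs, key=nbrs.get) for u, nbrs in w.items()}


network = build_network(pair_hits)
strongest = maximum_relatedness_edges(network)
```

With real data, `pair_hits` would be filled in by whatever search tool one has access to. A node whose weight is spread evenly over k neighbors gets an effective neighbor count of k, while a node dominated by one heavy link gets a value close to 1.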


Their idea for gathering data is awesome! Because I haven't read the paper closely yet, I'll reserve comments as to the analysis they do with that data. However, their choice of methodology has already inspired several ideas in my head, so that alone makes this paper's existence extremely worthwhile in my book.

The most straightforward of my ideas is simply to use their method of data collection to construct other networks that interest me (such as collaborations among mathematicians and various kinds of collaborative connections in the U.K. parliament). However, I'm also wondering if I can come up with some sort of variant involving Google Battle. (For the record, Oxford beats Cambridge according to this index. Sadly, Caltech doesn't fare nearly as well against MIT. In fact, this latter battle was downright embarrassing for the home team.)
