Saturday, July 12, 2014

"A Simple Generative Model of Collective Online Behavior"

Last Monday, PNAS published, in its early edition (so there are no volume or page numbers yet), a new article by my collaborators and me about how to model complex systems in a way that incorporates data directly into a simple model. The example in the article also nicely illustrates a case in which several different scenarios (in particular, different parameter values, with qualitatively different mechanisms in different regions of parameter space) give the same qualitative long-time behavior (and very similar quantitative long-time behavior), so one needs to consider the temporal dynamics explicitly to distinguish between the mechanisms. There are many papers that seem to assume mechanisms based on statistical fitting of long-time (or even equilibrium) behavior, and that is very dangerous. Here are the details about the article.


Title: A Simple Generative Model of Collective Online Behavior

Authors: James P. Gleeson, Davide Cellai, Jukka-Pekka Onnela, Mason A. Porter, and Felix Reed-Tsochas

Abstract: Human activities increasingly take place in online environments, providing novel opportunities for relating individual behaviors to population-level outcomes. In this paper, we introduce a simple generative model for the collective behavior of millions of social networking site users who are deciding between different software applications. Our model incorporates two distinct mechanisms: one is associated with recent decisions of users, and the other reflects the cumulative popularity of each application. Importantly, although various combinations of the two mechanisms yield long-time behavior that is consistent with data, the only models that reproduce the observed temporal dynamics are those that strongly emphasize the recent popularity of applications over their cumulative popularity. This demonstrates--even when using purely observational data without experimental design--that temporal data-driven modeling can effectively distinguish between competing microscopic mechanisms, allowing us to uncover previously unidentified aspects of collective online behavior.

Significance Paragraph (which is now a part of PNAS papers): One of the most common strategies in studying complex systems is to investigate and interpret whether any "hidden order" is present by fitting observed statistical regularities via data analysis and then reproducing such regularities with long-time or equilibrium dynamics from some generative model. Unfortunately, many different models can possess indistinguishable long-time dynamics, so the above recipe is often insufficient to discern the relative quality of competing models. In this paper, we use the example of collective online behavior to illustrate that, by contrast, time-dependent modeling can be very effective at disentangling competing generative models of a complex system.
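
To make the two mechanisms from the abstract concrete, here is a toy sketch in Python. It is not the model from the paper: the function name `simulate_installs`, the mixing probability `p_cumulative`, the fixed-size `window` that defines "recent", and the rule of one install per time step are all assumptions made purely for illustration. At each step, a hypothetical user either copies an app in proportion to its all-time install count or copies one of the most recent installs.

```python
import random
from collections import deque


def simulate_installs(num_apps=50, num_steps=5000, p_cumulative=0.1,
                      window=200, seed=0):
    """Toy simulation (illustrative assumptions only): one install per step.

    With probability p_cumulative, the user picks an app in proportion to its
    all-time install count. Otherwise, the user copies an install drawn
    uniformly from the last `window` installs (recent popularity).
    """
    rng = random.Random(seed)
    # Start every app with one install so that each can be chosen.
    history = list(range(num_apps))          # app id of every install so far
    recent = deque(history, maxlen=window)   # only the most recent installs
    totals = [1] * num_apps                  # cumulative installs per app

    for _ in range(num_steps):
        if rng.random() < p_cumulative:
            # A uniform draw from the full history selects an app with
            # probability proportional to its cumulative install count.
            app = rng.choice(history)
        else:
            # A uniform draw from the window selects an app with
            # probability proportional to its recent install count.
            app = rng.choice(recent)
        history.append(app)
        recent.append(app)
        totals[app] += 1
    return totals, history


if __name__ == "__main__":
    # Compare a recency-dominated run with a cumulative-dominated run.
    for p in (0.1, 0.9):
        totals, history = simulate_installs(p_cumulative=p)
        print(p, sorted(totals, reverse=True)[:5])
```

The function returns the full install `history` in addition to the final `totals` because the point above is about temporal dynamics: two parameter choices can yield similar-looking final popularity distributions, and it is the time series that one would need to examine to tell them apart.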


This paper has gotten a little bit of press coverage. You can find links to it (and to coverage of my other articles) on the press part of my web page.
