Detecting Illicit Activity by Examining Communication Network Structure

This article from The Atlantic's website describes some fascinating research by Brandy Aven at CMU's Tepper School that demonstrates how communication networks discussing illicit activity differ from those discussing routine matters by examining the Enron email archives. It's a great example of how the structure of a network can reveal information about the process that generated it.

Another Social Media Disaster: The Milk "Everything I Do is Wrong" campaign

The New York Times chronicles another example of an online ad campaign gone bad.  This time the California Milk Processor Board ran a campaign (since replaced) touting the ability of milk to help reduce PMS symptoms, in a way that clearly made light of women from a stereotyped male perspective.  As we well know by now, in the age of social media, a misstep like this can quickly turn into a disaster.

The Christakis and Fowler Social Networks Influence Brouhaha

Recently there has been a spirited conversation kicked off by the publication of an article, "The Spread of Evidence-Poor Medicine via Flawed Social-Network Analysis," by Russell Lyons regarding the well-publicized work of Nicholas Christakis and James Fowler on social contagion of obesity, smoking, happiness, and divorce.  The discussion has been primarily confined to the specialized circle of social network scholars, but now that conversation has spilled out into the public arena in the form of an article by Dave Johns in Slate (Dave Johns has written about the Christakis-Fowler work in Slate before).  Christakis and Fowler's work has received a huge amount of attention, appearing on the cover of the New York Times magazine, on the Colbert Report TV program, and a ton of other places (see James's website for more links).  Many others have made detailed comments on Lyons's article and on the original Christakis-Fowler papers.  I wish to address some of the related issues raised in Slate about scientists in the media.

The article seems to criticize Christakis and Fowler for their media appearances, as though this publicity is inappropriate for scientists who should be diligently but silently working in the background, leaving it up to policy makers and the media to make public commentary and recommendations.  I think this criticism is not only wrong, but dangerous.  Many if not most researchers do work silently in the background, shunning the spotlight and scrutiny of the media, not out of shyness or fear of embarrassment, but because of a pervasive misunderstanding of scientific uncertainty.  Hard science is simply much softer than many people realize.

ALL scientific conclusions, from physics to sociology, come with uncertainty (this does not apply to mathematics, which is actually not a science).  A "scientific truth" is actually something that we're only pretty sure is true.  We'll never be 100% sure; that's just how science works.  When one scientist says to another, "we have observed that X causes Y," it is understood that what is meant is, "the probability that the observed relationship between X and Y is due to chance is very small."  But statements like that don't make for good news stories.  Not only are they uninteresting, but for most people they're unintelligible (which is not to say that the public is stupid; the concepts of uncertainty and statistical significance are extremely subtle and often misunderstood even by well-trained scientists).  So, many scientists avoid the media because we're asked to make definitive statements where no definitive statements are possible, or we make statements that include uncertainty that are ignored or misunderstood.

But we need scientists in the media.  Only a fraction of Americans believe the planet is warming and 40% of Americans believe in creationism.  Scientists in the media can help correct these misperceptions and guide public policy.  And, maybe even more importantly, scientists in the media can make science sexy.  We already live in a world where science and politics are often at odds, and in which scientists who avoid the media are often overruled by politicians who seek it out.  Scientists are already wary of making public statements that implicitly contain uncertainty for fear of them being interpreted as definitive.  Christakis and Fowler have done us a great service by taking the risk of making statements and recommendations in the public arena based on the best of their knowledge, by raising public awareness of the science of networks, and by making science fun, interesting, and relevant.

Social Networks in the Classroom

Today's New York Times has an article on an educational software start-up that "has a social-networking twist."  The company, Piazza, provides a course page where students can ask and answer questions with moderation from the instructor.  I'm not sure what the "social networking" component of this site is.  From the Times article, it sounds simply like a message board with a few bells and whistles.  A quick search for the company's website left me empty-handed, so I can only speculate about whether there is actually something more here.

In passing, the article raised another interesting point though: "As in the case of Facebook, the wildly popular social network that sprang from a Harvard dorm room, the close-knit nature of college campuses has helped accelerate the adoption of Piazza."  The idea that close-knit communities lead to increased technology adoption is something that I prove in my recent paper, "Friendship-based Games."  The idea of closely knit communities is captured by the clustering coefficient of a network.  This metric measures the probability that two individuals who share a mutual friend are friends with one another.  In the paper, I show using a game theoretic model that new (beneficial) technologies have an easier time breaking into a market in networks with high clustering.  The basic idea is that small communities of users can adopt the new technology and interact mainly with one another, protecting themselves from the incumbent.  This may be one of the reasons that college campuses, which probably exhibit higher clustering than many other social networks, prove to be such fertile ground for the adoption of innovations.
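
To make the clustering coefficient concrete, here is a minimal Python sketch of the definition given above: the fraction of pairs of people sharing a mutual friend who are themselves friends (often called transitivity).  The two tiny networks and their names are made up for illustration, and this is not code from the "Friendship-based Games" paper.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected friendship graph as a dict of neighbor sets."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def transitivity(adjacency):
    """Fraction of 'two people with a mutual friend' pairs who are also friends."""
    closed = 0
    total = 0
    for friends in adjacency.values():
        # Every pair of this person's friends shares this person as a mutual friend.
        for a, b in combinations(sorted(friends), 2):
            total += 1
            if b in adjacency[a]:
                closed += 1
    return closed / total if total else 0.0


# Hypothetical data: a tight triangle of friends plus one loosely attached person...
campus = build_graph([("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")])
# ...versus the same four people arranged in a chain with no closed triangles.
chain = build_graph([("ann", "bob"), ("bob", "cat"), ("cat", "dan")])

print(transitivity(campus))
print(transitivity(chain))
```

The close-knit network scores well above the chain, which is the sense in which a campus might be "more clustered" than a sparser social network.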

Nicholas Christakis at WIDS@LIDS

Today and tomorrow I’m at the Interdisciplinary Workshop on Information and Decision in Social Networks at the MIT Laboratory for Information and Decision Systems (WIDS@LIDS).  Nicholas Christakis gave a thought-provoking talk this morning drawing on a lot of material from his book, Connected, written with James Fowler.  One of the first ideas he raised is that humans are unique in having a social pressure on our evolution.  Humans, like other species, also face evolutionary pressures from the environment and from other species.  But he argued that humans are unique in this social pressure because we live in close proximity and other human groups are one of the biggest threats that we face.  He went on to say that possibly this unique social pressure is responsible for humans evolving intelligence, because in order to navigate the complexities of social interactions, we need substantial intelligence.  I’m not sure that I buy this argument though.  What about ants, bees, wolf packs, ... ?  All of these species work in groups, cooperate, and face competitive pressure from other groups, but none of them have evolved intelligence on a human scale.

Christakis ended his talk asking about why certain ideas are “sticky.”  I think this is a super interesting and super difficult question.  I’ve been talking with Adam Berinsky in the Political Science Department at MIT about this question in relation to political rumors.  Why does the rumor that Obama was born in another country stick around, but other rumors die out?  Christakis suggested that this might somehow be a tractable question, but I think it is much more subtle.  First of all, there are no natural metrics for judging ideas.  Second of all, we can’t just look at which ideas have actually taken off and which haven’t, because so many other chance factors come into play.  Because of the big positive feedbacks involved in the spread of ideas, this process is highly susceptible to chance tipping (see the work by Salganik, Dodds, and Watts).  It’s very easy to fall into the trap that Duncan Watts sums up in the title of his recent book, Everything is Obvious Once You Know the Answer.  Once an idea does “go viral,” like the Birther rumor, it is tempting to make up a narrative that says, well of course that rumor spread because it has attributes x, y, and z.  But, if the rumor had died we could just as easily construct a different narrative explaining its failure.  Paul Lazarsfeld’s paper on The American Soldier gives a fantastic example of how we can trick ourselves into believing this kind of after-the-fact rationalization.
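
The role of chance tipping can be illustrated with a toy simulation.  The sketch below is a simple rich-get-richer (Pólya-urn style) model that I am assuming purely for illustration; it is not the experimental design used by Salganik, Dodds, and Watts.  Every idea is identical, and each new adopter picks an idea with probability proportional to its current popularity.

```python
import random


def simulate_market(num_ideas=5, num_adopters=2000, seed=0):
    """Rich-get-richer toy model: identical ideas, popularity-weighted adoption."""
    rng = random.Random(seed)
    counts = [1] * num_ideas  # every idea starts with a single adopter
    for _ in range(num_adopters):
        # The chance of picking an idea is proportional to its current count.
        chosen = rng.choices(range(num_ideas), weights=counts)[0]
        counts[chosen] += 1
    return counts


# Re-run the same "world" 20 times with different random seeds
# and record which idea ends up most popular in each run.
winners = []
for seed in range(20):
    counts = simulate_market(seed=seed)
    winners.append(max(range(len(counts)), key=lambda i: counts[i]))

print(winners)
```

Since the ideas have no intrinsic differences, any story about why the winner of a particular run "deserved" to win would be exactly the kind of after-the-fact rationalization described above.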

Homophily and Information Spread

This article in Wired covers new research on networks and information by Sinan Aral (Northwestern B.A. in Political Science, MIT Sloan PhD, now at NYU Stern) and Marshall Van Alstyne.  The article describes research on the email communications of members of an executive recruiting firm, and says, “those who relied on a tight cluster of homophilic contacts received more novel information per unit of time.”  The article is confusing though because it mixes several distinct network concepts: homophily, strong ties, clustering, and “bandwidth.”  Homophily is the tendency for people to be connected to other people who are similar to them; birds of a feather flock together.  In his seminal paper, “The Strength of Weak Ties,” Mark Granovetter defined the strength of a tie as “a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie.”  Clustering measures the tendency of our friends to be friends with each other.  And bandwidth is a less standard term in the social networks literature that captures the total amount of information that flows through a given tie per unit time (and thus is about the same thing as strength of a tie).

After reading the Wired piece, I’m left wondering if it is

  1. strong or “high bandwidth” ties through which we communicate a lot of total information,
  2. homophilic ties with people that are similar to us,
  3. ties with people that are members of a tightly knit cluster of friends, or
  4. all of the above

that provide us with the most novelty in our information diet.

A look at the original research article makes it clearer why the Wired article was so confusing.  The actual argument has a lot of moving pieces.  The first piece is that structurally diverse networks tend to have lower bandwidth ties.  Here structurally diverse appears to mean not highly clustered.  So, you talk more to the people in your personal clique than to people outside of your tightly knit group.  The second piece relates structural diversity to information diversity.  They find that the more structurally diverse the network, the more diverse the information that flows through it.  So far, this seems to line up with the standard Granovetter weak ties story.  The third piece is that increasing bandwidth also increases information diversity, and more importantly, increasing bandwidth increases the total volume of new (non-redundant) information that an individual receives.  The idea here is that if you get tons of information from someone, some of it is going to be new.

Finally, since both structural diversity and bandwidth increase information diversity, but structural diversity decreases with increased bandwidth, they set up a head-to-head battle to see whether the information diversity benefits of increasing bandwidth outweigh the costs of reducing structural diversity.  They have three main findings on this front that characterize when bandwidth is beneficial:

  • “All else equal, we expect that the greater the information overlap among alters, the less valuable structural diversity will be in providing access to novel information.”
  • “All else equal, the broader the topic space, the more valuable channel bandwidth will be in providing access to novel information.”
  • “All else equal, ... the higher the refresh rate, the more valuable channel bandwidth will be in providing access to novel information.”
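
The intuition behind the second finding can be sketched with a toy calculation.  This is my own simplification, not the model in the research article: assume each message from a contact covers one topic drawn uniformly at random from a topic space of a given size, and count how many distinct topics you expect to see.

```python
def expected_distinct_topics(num_messages, topic_space):
    """Expected number of distinct topics after uniform random draws.

    Each topic is missed by one message with probability (1 - 1/topic_space),
    so it is seen at least once with probability 1 - (1 - 1/topic_space) ** n.
    """
    return topic_space * (1 - (1 - 1 / topic_space) ** num_messages)


def bandwidth_gain(low, high, topic_space):
    """Extra distinct topics gained by moving from `low` to `high` messages."""
    return (expected_distinct_topics(high, topic_space)
            - expected_distinct_topics(low, topic_space))


# Hypothetical numbers: raise one tie's bandwidth from 10 to 100 messages.
narrow = bandwidth_gain(10, 100, topic_space=10)
broad = bandwidth_gain(10, 100, topic_space=1000)
print(narrow, broad)
```

With a narrow topic space, the extra messages mostly repeat topics already seen, so the gain is small; with a broad topic space, nearly every extra message brings something new.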

Crowdsourcing a clinical trial

Ars Technica has an article today about a crowdsourced clinical trial to evaluate the effectiveness of using lithium for treating ALS (Lou Gehrig’s Disease).  Over 3500 patients participated in tracking their disease symptoms online and 150 of them were treated with the drug.  The results showed no significant impact of the drug on ALS symptoms.  The company that ran the study, PatientsLikeMe, was founded by three MIT engineers, and they published an article describing the trial in Nature Biotechnology.

From the press release:

“This is the first time a social network has been used to evaluate a treatment in a patient population in real time,” says ALS pioneer and PatientsLikeMe Co-Founder Jamie Heywood. “While not a replacement for the gold standard double blind clinical trial, our platform can provide supplementary data to support effective decision-making in medicine and discovery. Patients win when reliable data is made available, sooner.”