Revisiting Stone and Roberts’ classic checkerboard score.

I was looking over the classic Stone and Roberts (1990) paper in Oecologia that presented a simple way to check for nestedness in community occurrence data.  I’ve been playing with different metrics of nestedness for a community data set.  However, I think I found an error in the paper.  They define the number of checkerboard units as:

C_ij = (r_i - S_ij)(r_j - S_ij)                (1)

Where r_i is the total number of occurrences of species i (its row sum), r_j is the total number of occurrences of species j, and S_ij is the number of sites where they co-occur, so that (r_i - S_ij) counts the times species i occurs without species j. C_ij gives the number of checkerboard units for a species pair, and this will be highly dependent on your sample size. They suggest standardizing by the total number of possible species pairs (P), which for M species is

P = M(M-1)/2                       (2)

So then the C-score, which represents the mean number of checkerboard units per species pair, would be:

C-score = C_ij/P.                   (3)

Stone and Roberts (1990) are quite clear that equation (3) defines the C-score: “We define the checkerboard score for a particular colonisation pattern (matrix) as the mean number of checkerboard units per species-pair of the community” (italics are mine).

Luckily, they include some of the matrices they use as examples right in the paper. They define two matrices, U and V, which have the same row and column sums, but matrix U has a lot of checkerboards relative to matrix V.


They calculate C(U) as 52.6 and C(V) as 30.3. But these values represent the mean of C_ij for each matrix (equation 1), not the C-score calculated from equation 3. If we use equation 3, we get a C-score of 0.28 for matrix U and 0.16 for matrix V. Indeed, even though they define the C-score as equation 3, every time they present a C-score it is actually the mean of equation 1. For example, on page 76, they discuss matrices B, D, E and F and present C(B)=C(E)=C(F)=2.67 and C(D)=2. Again, these are means of C_ij. If I calculate the C-score as they define it using equation 3, I get 0.45 for B, E and F, and 0.33 for D. Later in the paper they analyze two data sets. I don’t have access to those data sets, but given the scale of the C-scores they present (i.e. they are much greater than 1), I strongly suspect these were also calculated as the mean of C_ij, not as the C-score. What does this mean for their conclusions?

Well, I did some playing of my own, generating random matrices and calculating the mean of C_ij like Stone and Roberts actually did, and calculating the C-score like Stone and Roberts say they did. The two are perfectly linearly correlated, which makes sense: for matrices of a fixed size, the two quantities differ only by the constant factor P. The mean of C_ij is just a couple of orders of magnitude larger than the C-score (1467 is the slope of the linear regression):
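My experiment looked roughly like the sketch below (not my exact script; the matrix dimensions, fill probability, and replicate count are arbitrary choices, so the slope differs from the 1467 I quote above, but it always comes out equal to P for whatever dimensions you pick):

```python
import numpy as np

rng = np.random.default_rng(0)

def c_stats(m):
    """Return (mean of C_ij, that mean divided by P) for a presence/absence matrix."""
    M = m.shape[0]
    r = m.sum(axis=1)                        # row totals r_i
    S = m @ m.T                              # co-occurrence counts S_ij
    C = (r[:, None] - S) * (r[None, :] - S)  # equation (1)
    cij = C[np.triu_indices(M, k=1)]         # one value per species pair
    P = M * (M - 1) / 2                      # equation (2)
    return cij.mean(), cij.mean() / P

# Random 50-species x 30-site presence/absence matrices
stats = [c_stats(rng.integers(0, 2, size=(50, 30))) for _ in range(200)]
mean_cij, cscore = map(np.array, zip(*stats))

# Regress one measure on the other; since they differ only by the
# constant factor P, the fit is exact and the slope is P itself
slope = np.polyfit(cscore, mean_cij, 1)[0]
print(slope, 50 * 49 / 2)   # slope equals P = 1225
```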


So what does all of this mean? At the end of the day, I don’t think this changes anything but the scale of their analyses; the numbers they present are ~3 orders of magnitude larger than they should be, but it probably doesn’t change their conclusions. So if it doesn’t matter, why did I just waste your time on this silly blog? Well, I spent the afternoon trying to figure out why I was getting the wrong answers from matrices U and V, so I thought I would share this with the internets.

I’ve started a blog: Here’s my philosophy of science

This is my first blog post. So, I thought I would start all the way at the beginning and write a bit about some of the philosophy of science that guides my daily approach to problem solving, and to science. None of this is new; I’m sure you can find blog posts by the barrel from scientists pontificating about philosophy, written all over the internet in permanent marker. But I really do think these ideas are critical for any scientist to keep on the tip of their brain. It matters how we do science, what we think science is, how we gather evidence, and how we might make our approach to science better by thinking about these sorts of problems. So, here’s my take.

Most scientists have probably encountered Karl Popper’s idea of falsification as one solution to the problem of how to demarcate science from non-science. It’s often taught to scientists this way, but I always got the impression that Popper was a bit of a reluctant philosopher of science. If you read Popper closely, he was more of an epistemologist than a philosopher of science, and his idea of falsification was conceived as a workaround for the problem of induction. Still, it can be useful to teach Popper’s falsification to young scientists to get them thinking in a hypothetico-deductive way even if they haven’t yet encountered Hume’s problem of induction. (That is, we want scientists to use deductive reasoning instead of inductive reasoning.) Falsification does a reasonably good job of sidestepping Hume’s problem: scientists absolutely should be wary of induction, and by formulating bold assertions or hypotheses and attempting to disprove them, we gain a kind of certainty that is impossible if we try to prove a hypothesis. We teach this to young scientists, but is falsification really what we do as scientists?

Many scientists might also be familiar with Thomas Kuhn’s ideas about paradigm shifts and the structure of scientific revolutions. Kuhn observed that, while scientists do seem to use Popperian falsification - in that they use the hypothetico-deductive method - they often don’t wholesale reject the major paradigms that guide their field (Kuhn defines paradigms somewhat vaguely via examples, a classic one being Newtonian mechanics versus Einstein’s relativity as alternative paradigms). Instead, scientists appear to treat individual falsifications as anomalies, and it takes a lot of falsifications to shake a scientist’s faith in their overarching research paradigm. For example, an evolutionary biologist is probably prepared to accept that not all traits are adaptive, but is probably not prepared to discard the theory of evolution by natural selection very easily.

If Kuhn is correct, this gives a cyclical view of science where (after some early stumbling around with “pre-science”) scientists are engaged in what Kuhn called “normal science”. In normal science, everyone is working within a similar research paradigm, the paradigm is working well to explain most data, and most scientists are content. Individual falsifications are usually explained away - they are perhaps the fault of the observer or a problem with the instrument - and so we don’t usually ditch our major paradigms. However, as these falsifications build up, the field reaches what Kuhn called a “crisis” and the anomalies in the data can no longer be ignored. When this happens, a new paradigm may arise, and the new paradigm should be capable of explaining the old data as well as the problem data that led to the crisis. Kuhn called such a shift in paradigms a scientific revolution, and once the revolution is over, we settle back into a period of “normal” science built upon the new paradigm until the falsifications start to build up again.
It’s important to realize that Kuhn also thought that alternative paradigms were mutually exclusive. That is, alternative paradigms often synthesize the same observations, but in fundamentally different ways, and often with fundamentally different explanations and conclusions, rendering them incompatible: one must be utterly abandoned before another can take its place. Kuhn’s book, The Structure of Scientific Revolutions, is one of my favourite books, and everyone should give it a read.

At first glance Kuhn’s account might sound really good. We still have elements of the hypothetico-deductive method, but Kuhn’s account might be closer to how scientists actually use falsification. However, are Kuhn’s catastrophic revolutionary cycles really what goes on in science either? The idea of a paradigm as an immovable force that is only shaken by a “crisis” is probably not accurate. Instead, we scientists often make small (and ideally progressive) revisions to our thinking over time. Some hypotheses are discarded, while others are retained. This can lead to incremental changes and revisions in a theory without one of Kuhn’s catastrophic crises leading to a revolution; these quiet revolutions can occur quite slowly. It probably also isn’t true that alternative paradigms are wholly mutually exclusive. Newtonian mechanics has largely been replaced by relativistic mechanics, but there are some domains (e.g. objects moving at low velocity) for which both theories give pretty similar answers. But can we do better still with our understanding of how science does work, or should work?

You might also have heard of Imre Lakatos, though I find fewer people have heard of Lakatos than of Popper or Kuhn. Lakatos was a student of Karl Popper, and he developed the idea of “research programmes” as a way of combining Popper’s falsification with Kuhn’s ideas of paradigms and revolutionary science. Personally, I think Lakatos offers a really useful conceptual framework for thinking about science, and I would say I’m Lakatosian in my approach. The key difference between Kuhn and Lakatos is the idea of hard core and auxiliary hypotheses as essential features of a scientific research programme. Hard core hypotheses are those that cannot be discarded without abandoning the entire research programme - abandoning and replacing them can still produce one of Kuhn’s scientific revolutions. Where Lakatos departs from Kuhn (and Popper) is with auxiliary hypotheses. Auxiliary hypotheses are those that scientists test and are prepared to discard based on experimental falsification - and this retains Popper’s falsification. This account seems much closer to how we actually behave as scientists and answers many of the objections raised above. But I think it also tells us something about how we should behave. Lakatos made a distinction between progressive and degenerate research programmes, and the distinction is primarily about explanatory power. Scientists working within a Lakatosian framework are constantly revising their world views as they discard old auxiliary hypotheses and adopt new ones. There is a danger in these sorts of ad hoc revisions (Popper was very concerned about them, and thought that we should never use them!). Progressive research programmes are those that gain explanatory power through the constant abandonment of old auxiliary hypotheses and the development of new ones.
Degenerate research programmes, on the other hand, are forced to make revisions to their auxiliary hypotheses out of necessity, to deal with troublesome data, and these revisions do not lead to improved explanatory power. Another way to think about it: a degenerate research programme constructs auxiliary hypotheses to prop up its crumbling hard core rather than to gain a real understanding of the world through falsification. A persistently degenerate research programme is headed towards something like a Kuhnian crisis, and this means that eventually the hard core hypotheses - and the entire research programme with them - will need to be replaced via a sort of revolution.

There is one more philosopher worth talking about in this modern world of “big data”. Francis Bacon was a 16th-17th century philosopher and early scientist who was instrumental in the development of the modern scientific method. Bacon predates Hume’s articulation of the problem of induction, and he was an advocate of reasoning by induction. This approach is directly opposed to Popper’s falsification: where Popper advocated rejecting theories as a means to gain certainty, Bacon advocated accumulating data to support theories via induction. Bacon wrote that “hypotheses have no place in experimental science”, and thought that by simply building up enough observations and synthesizing them we could gain understanding. This should seem troubling to you: because of the problem of induction, we can never know whether explanations built in this way are true. Modern technology has allowed us to collect data at an unprecedented rate, and I’ve met a number of ecologists who got very excited about this tool or that tool, ran out and collected a bunch of data, and now don’t know what to do with it. They come and give talks in the department and are essentially searching for hypotheses after they have collected the data. I’m slightly fearful that we are experiencing a bit of a Baconian revolution, and if you liked anything I wrote before this paragraph, you should be worried about this.

Ok, if you’ve lasted this long, you’re probably wondering: what good is any of this to a real scientist? I think there are four lessons here. The first lesson is that the basic idea of falsification is valuable: the problem of induction is real, and deductive reasoning is a much more reliable way of gaining knowledge. We’ve learned since Bacon’s time that his inductive approach of searching for synthesis in data without a priori hypotheses is not very reliable if we want truth, and I think it is more important than ever to be mindful of the problem of induction in this era of big data. However, though falsification gives a practical solution to the problem of induction, it wouldn’t be a good idea to toss the major theories of our field out the window because of one strange experimental result - and indeed, scientists don’t do this. The second lesson, from Kuhn, is to remember that all of our scientific theories are essentially wrong and are destined for the dustbin. As G.E.P. Box famously said: “all models are wrong, but some are useful.” It’s worth keeping this in the back of your mind as you falsify your favourite hypothesis and then have to think about why it was falsified. The third lesson comes from Lakatos, who reminds us to be wary of the ad hoc revisions we make as scientists when faced with the falsification of an auxiliary hypothesis; we should be conscious of Lakatos’ distinction between progressive and degenerate research programmes as we make these revisions on a daily basis. Fourth, the ability to articulate the hard core and auxiliary hypotheses of your own research programme is an extraordinarily useful exercise. I tend to filter much of the scientific world through these four lessons, and I think it makes me better at problem solving.

Personally, I think ecology and evolution are near what Kuhn might call a crisis, and what Lakatos might perceive as a number of competing research programmes. But this post is already quite long, so that will have to wait for a future blog post.

Acknowledgement: My B.Sc. was in philosophy and ecology, and I’ve worried about these things for a long time. However, a good portion of these ideas evolved through conversations with Joel S. Brown at the University of Illinois at Chicago.