After talking with Léa Roumazeilles, PhD candidate in Neurobiology at Oxford, we now present the second interview in our series ‘JOTE in conversation with researchers.’ Today we talk from a meta-level with an expert on how practices, experimental methods, statistical approaches and funding affect the way errors are framed and published. Daniël Lakens, Associate Professor in the Human-Technology Interaction (HTI) group at Eindhoven University of Technology (TU/e), is a metascientist; metascience is a field of rising importance that has a stronghold in the Netherlands. Daniël himself was the first to receive NWO support to launch a pilot project funding researchers to do replication research in 2017. His work consists of ‘developing methods for critically reviewing and optimally structuring studies.’ Daniël aims to put to constructive use the critical lens through which metascience assesses the validity of scientific studies. Today Max Bautista Perpinyà, JOTE’s co-founder and creative director, talks with Daniël Lakens about metascience, the replication crisis in psychology, and how many other experimental sciences could learn from a meta-perspective.
As I set the recorder to record, I told Daniël: ‘You don’t have to talk very loudly, the recorder is pretty good.’ He responded, ‘Alright, I’ll try.’ We chuckled. More than the volume of his voice, Daniël’s words are loud and travel far, because his reflections on the status of science and the ethics of researchers may be poignant to some listeners. We hope this interview gives you something to reflect on.
What is metascience? How did it originate?
Metascience has been done for a long time, known as epistemology. However, metascience as we know it became very popular in psychology 10 years ago. To help you understand, let me briefly explain what has happened in the field since then.
First, in psychology there was always an understanding that not everything was perfect, but we didn’t really know how bad it was, and it was easy just to tell each other, ‘yeah, things are not perfect, but how bad can it be?’ It was extremely common, especially as PhD students, that we would tell each other, ‘we are doing these things, and flexibly analysing the data.’ It was a very convenient thing to say because you didn’t really have to think about how bad it actually was. At the same time, I remember my supervisor once told me, ‘yeah, this finding is not really reliable. We know this. We should have an independent committee that replicates this sort of finding, so that it becomes widely known in the literature.’ But nothing much was going on. That was sort of the state of the field.
Then in 2010, a paper came out that became really, really well known. This was a paper by Daryl Bem on precognition, the ability of people to predict the future. He did a classical task in psychology, with a variation. In the original task, you are presented with a picture on the left or on the right and press a button saying whether the stimulus is on the right or on the left. He flipped the task temporally, meaning that you had to press the buttons before the stimuli appeared, and he counted how often people guessed better than the guessing average. Apparently, most people could tell better than the guessing average. This was a paper with 9 studies that appeared in 2010 and circulated a lot around 2011. The editor said, ‘I don’t know what to do with this. This looks like any other psych paper that we have. It may be a crazy topic, but it’s 9 studies, it’s super convincing. I should accept this.’
After this, people had two choices: either to say that precognition is real since the evidence was extremely convincing, or that this indicated that there was a problem in the way that we worked. So many people opted for the second, unsurprisingly. From the moment the Bem paper came out, there were many responses of many different kinds, saying things like: the statistical analysis is not good, it’s based on p-values and we should report Bayes factors, etc. There were other strategies to understand what happened in Bem’s studies. Then, some researchers (Simmons, Simonsohn, and Nelson, 2011) published what became a classic of the field, ‘false-positive psychology.’ First, they reported that they had played the song ‘When I’m Sixty-Four’ by The Beatles, and a control song, to people; they showed that those people listening to ‘When I’m Sixty-Four’ aged. They were older. And you were reading and said, ‘that’s impossible, that’s a nonsense finding.’ The authors said, ‘of course this is a nonsense finding. But this is not all that we did in the study.’ Then they added all the things that they left out, the selective reporting they did: ‘we didn’t just ask this, we also asked this, this, and this. And we not only tested this, but also this, this and this. And if you have flexibility in your testing, then the Type I error rate will inflate up to the point that it is really easy to find significant results.’ This paper made it very clear that the answer to the question ‘how bad can it be?’ about the way that we work was ‘well, it can be really, really bad.’ The important thing was that many people recognised the practices that this article described, and thought ‘okay, am I doing things the wrong way?’ People came around to the fact that methods were extremely important.
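The point about flexible testing inflating the Type I error rate can be illustrated with a small simulation. The sketch below is not the simulation from Simmons, Simonsohn, and Nelson (2011); it is a simplified illustration under stated assumptions: two groups drawn from the same distribution (so there is no true effect), several independent outcome measures, a z-test with known variance instead of a t-test, and a researcher who reports a result if any outcome reaches p < .05. The function names and default values are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist, mean


def any_significant(n_per_group, n_outcomes, alpha, rng):
    """Return True if at least one null outcome reaches p < alpha."""
    std_normal = NormalDist()
    for _ in range(n_outcomes):
        # Both groups come from the same distribution: any "effect" is a false positive.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # Two-sample z-test with known variance 1 (assumed simplification of a t-test).
        z = (mean(group_a) - mean(group_b)) / (2 / n_per_group) ** 0.5
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            return True
    return False


def false_positive_rate(n_outcomes, n_per_group=20, alpha=0.05, n_sims=4000, seed=1):
    """Share of simulated null studies with at least one 'significant' outcome."""
    rng = random.Random(seed)
    hits = sum(
        any_significant(n_per_group, n_outcomes, alpha, rng) for _ in range(n_sims)
    )
    return hits / n_sims


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        # For independent tests the theoretical rate is 1 - (1 - alpha)^k.
        theoretical = 1 - (1 - 0.05) ** k
        print(f"outcomes={k}  simulated={false_positive_rate(k):.3f}  "
              f"theoretical={theoretical:.3f}")
```

With a single pre-specified outcome the simulated rate should stay close to the nominal 5%, while testing several outcomes and reporting whichever one works pushes it well above that. Real studies typically have correlated outcomes and other forms of flexibility, so the exact numbers here should not be read as estimates for any actual literature.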
In parallel, in the Netherlands there was the case of fraud by Diederik Stapel, who just made up data. And that’s kind of boring in terms of the process, because he just typed random data. You are not supposed to do it, so everyone agreed. There wasn’t much to discuss. So in the Netherlands you had these two things together, close in time. The Daryl Bem studies were the rational part, and the fraud case of Diederik Stapel was the emotional one. So it was the time when many people realised for the first time that there is a big issue with the way that we work, and we were all strongly motivated. We knew we had to do something about it. Now, the Netherlands is regarded by most people as one of the countries leading the change on these issues.
They have a big role. Part of the key issues in disseminating tools for better science is educational. There’s a new journal – Advances in Methods and Practices in Psychological Science – that helps people to improve the way that they work in a very broad spectrum. There are papers about theory, measurement, concept formation, how we should collaborate and store data, informed consent – everything. All of this follows from each other: if you want more transparency you need to share your data and for this you need to get informed consent. But how do you do this for fMRI research where you scan people and have their face, for example? All sorts of practical and ethical considerations have come out of this, which has kept people busy for a very long time.
Another key development is the emergence of open access publications – especially the ones like PLoS One. For example, there was a classic case of a failure to replicate. It was a very highly-cited study about elderly priming, where people are presented subliminally with words related to the elderly and then they walk slower.
During a conference in 2009 or so, I remember talking to Stéphane Doyen, who was the first author on a paper that attempted to replicate it. He presented the study’s failure to replicate at the conference and I thought: ‘this is super cool that you’ve done this. Credit to you for trying to replicate it!’ I remember emailing him a couple of years later saying ‘hey, I was thinking about your study. Where is it?’ He said ‘yeah, unsurprisingly all journals rejected it’ because of course the original author was most likely one of the reviewers.
Then eventually it was published in PLoS One because it had an explicit rule: we don’t care about novelty, we care about the reliability of the findings. So there started to be outlets that gave people room to publish, for example, failures to replicate. This was an important development. There was a huge discussion about this finding; the original author wrote very harsh, impolite blog posts about it. It’s all very interesting to look back at. I’m sure he regrets it now but it tells you something about the state of the field: you couldn’t challenge a finding like this; people really thought that this was etched in stone but if we look at it now, these are studies with 14 participants in each condition. C’mon.
There is a journal called Meta-Psychology that publishes about metascience. The journal itself is also a little bit of an experiment: it has open peer review and the submission is basically uploading your pre-print somewhere. It will take into account any comments that are made on the pre-print on social media or using Hypothes.is, which is a tool where you can annotate the web page and leave comments on the pre-print. It’s zero APC (article processing charges) and they also have formats similar to what you try to do in your journal at JOTE; they have an ‘empty your file drawer’ format. This type of journal is not super popular yet but there are a couple of examples now. The problem is getting people to actually submit null results.
In an indirect way, registered reports. Brian Nosek and I did a call for a special issue on this in 2012. Around the same time, Chris Chambers also proposed a similar idea, and he really pushed this at many journals – hundreds of journals now have it – where the decision to publish is made before the data is collected. It combats publication bias and it improves the quality a lot because the peer-review processes move to the front before the study is actually performed. This is a development that in practice works very well in reducing publication bias. Many people are trying it out. Not a lot of people, but some.
Not a lot is happening when it comes to funding. Sander L. Koole and I wrote a paper, also in 2012, about the importance of rewarding replication research, because who’s going to do a replication study when novel studies are so much more rewarded? And when we wrote that paper I thought that we should at least try to reach out to somebody from NWO about this. We can just write a paper and put it in a journal but that seemed a bit boring. So we thought: if we want to be serious about this, we should send a letter to the NWO asking them to fund replication research. And so we did.
Initially they wrote a reply saying ‘we will fund replication research as long as it is innovative.’ [Silence]
[Chuckles] Yes – I mean of course it’s not innovative. You’re just repeating something. So we thought, how can they officially reply like this? I think they didn’t completely understand. So I sent a letter back saying that basically this was destroying the foundations of science. If this is what you think then this is really problematic.
But even though their first response was odd, they’re actually a really nice organisation. They invited me over to talk about this topic and I tried to convince them that it would be a good idea to fund replication research. Crazy enough – it took them a long time, three years or something, but now they have a call to fund replication studies! It’s a pilot programme, so they’re evaluating it and taking a look at how they should extend it into the future, which is pretty nice.
A couple of other funders paid attention and thought: hmm okay, we should also take responsibility in making science more reproducible. This is good news, because I think we need to reserve a part of the funding exclusively for replication research, otherwise novel research will always be valued more. The key point is that nobody will build a career by replicating other people’s work but we know that somebody needs to do it.
Anyone who would really follow the Code of Conduct in their lab nowadays would encounter pushback. If you ask them who’s responsible they might say ‘the head of my group is responsible’ or ‘I won’t get my PhD if I don’t do what my supervisor wants.’ In turn, the supervisor is going to say, ‘the dean of the faculty tells me that I have to publish this much, and do this and get these kinds of grants and so on. And how do I do that? By doing this kind of research.’ The dean is going to say ‘well, the university says “the government is forcing us to be such and such.”’ So if you ask, ‘who’s responsible?’, the truth is that everybody’s pointing but nobody is doing anything.
I’m interested in giving advice to the government or funders about these kinds of things. I don’t think it’s very popular amongst scientists to actually say people need to force us to do the right thing because they feel that it will make their lives more difficult. And yes, it will. But, honestly, I don’t care – I want to make my life more difficult. It is tax money that we’re spending and we should make sure that it is spent very efficiently.
Well first of all, will the general scientific community listen to metascience? No, of course not, because they will say ‘you’re just trying to get as much power as possible and do whatever your political agenda is.’ So I wouldn’t say that metascientists are supposed to determine these kinds of rules.
You should have a team of people who are interested in giving advice for policy. I don’t think NWO should listen to just any single person or group of metascientists because they are a skeptical bunch of people with very specific views about certain things and there are alternative viewpoints. Some people think we should focus on the big and the good stuff, for example.
I think metascientists can at least ask the questions that weren’t asked. It’s amazing how bottom-up science organised itself over the last 500 years and how we keep things in place because ‘this is how it is.’ I think the role of metascientists is to ask: why is it like this? What are we doing? This doesn’t make sense.
For example, publication bias is a disaster for science. When I discuss it in workshops I don’t think people go home very happy and proud about being a scientist. I think they go home with at least some motivation not to contribute to publication bias too much in the future. This is the best I can achieve when I teach young researchers about this now.
Of course we will do things wrongly. This is a process where you try your best and in 50 years someone’s going to say: look at these idiots. They thought they were improving science in 2020. They missed this and this and this. That’s just the logical process in how we work.
You can’t justify everything. The real thing is to justify a little bit more than you’re doing now and ask yourself: why am I actually doing this? That’s the goal. I want to be able to stop following rules – especially norms.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Journal of Trial and Error | ISSN (Online) 2667-1204