Recently, there has been considerable discussion about research misconduct in pursuit of high scores in university rankings. A study reviewed in Nature has claimed that several South Asian and Middle Eastern universities have engaged in questionable research practices, including fabricating authorship and duplicating images. Consequently, there has been a high rate of retractions of and corrections to papers from those universities.
It is regrettable that so many papers are flawed. But one wonders whether there is a degree of bias here, with a focus on very recent papers from newly emerging universities.
Everybody should be concerned that papers need to be withdrawn, but surely it is worse if bad research is not retracted and is allowed to dominate entire fields of research and their public perception.
There was a classic example a few weeks ago. In a Substack post, Freddie deBoer lambasted Jane Pratt, editor of New York, who had written about The Great Pretender, Susannah Cahalan’s book revisiting a famous experiment by David Rosenhan in which “pseudopatients” infiltrated mental hospitals, supposedly showing that doctors could not tell whether someone was sane. Rosenhan’s paper, ‘On being sane in insane places’, was published in Science, one of the world’s most prestigious journals, and was immensely influential. Rosenhan had degrees from Yeshiva and Columbia, taught at Harvard, and was Professor of Law and Psychology at Stanford, a visiting Fellow at Oxford, and a consultant for the Educational Testing Service.
Pratt praised the book, apparently thinking it was recounting the experiment. In fact, the book was an exposé of what was a very flawed, bordering on fraudulent, exercise. The study has not been retracted, and Rosenhan went on to enjoy a long and distinguished career. He never got around to presenting his research in book form, even though it would probably have been a best seller.
DeBoer noted the absurdity of a very well-paid editor claiming to have read a book that she hadn’t read. At least I assume that she hadn’t read it: if she had and failed to understand it, that would be even worse.
The case of David Rosenhan is an example of how flawed research can have a massive influence. Rosenhan’s work was not totally off target, but it contributed to an exaggerated scepticism about psychiatry and accelerated the “deinstitutionalization” of mentally ill persons and their return to “community care.”
That is not the only instance of flawed research becoming entrenched in the academic world and the popular imagination. Perhaps the most striking is a book, Pygmalion in the Classroom (PITC), published in 1968 by Robert Rosenthal, a distinguished professor at Harvard and the University of California, and Lenore Jacobson, the principal of an elementary school in San Francisco.
The study was based on a test of cognitive skills given to elementary school students. Some of them were designated as likely to start blooming intellectually. The students were not told this, but their teachers were, and lo and behold, when the tests were readministered, the “bloomers” had indeed bloomed remarkably. This was attributed to the expectations of their teachers, who supposedly treated the bloomers in ways that allowed them to flourish while their non-blooming peers did not.
PITC became a classic in educational studies. It became commonplace for teachers, human resources officers, and policymakers to proclaim that “we gotta have faith”, that everyone must be encouraged, and that our children are shaped by our biases and prejudices. The idea reached its peak with the ill-fated No Child Left Behind policy, which was based on the belief that group disparities in education and elsewhere were caused by the “soft bigotry of low expectations.”
The book has been massively cited, but ultimately, the study proved to be a hollow one, based on data that can be described as worthless.
There is now, in many disciplines, especially in medicine, education, and the social sciences, a significant replication crisis. Many highly cited studies cannot be replicated. Yet papers continue to be solemnly cited by policymakers and the mass media, while criticism is confined to blogs and the work of independent scholars.
One example that I have mentioned before is ‘Recursive processes in self-affirmation: Intervening to close the minority achievement gap’ by Geoffrey L. Cohen, Julio Garcia, Nancy Apfel, and Allison Master, which reported that having students spend a few minutes writing about their personal values would set off a process of self-affirmation that would close the achievement gap between Hispanic and Black students and their White and Asian peers.
I think that anyone who has been in a secondary school classroom or has had contact with low-achieving students would find it implausible that such a short assignment could have such substantial effects. Cohen et al. said in their original study: “our apparently disproportionate results rested on an obvious precondition: the existence in the school of adequate material, social, and psychological resources and support to permit and sustain positive academic outcomes. Students must also have had the skills to perform significantly better. What appear to be small or brief events in isolation may, in reality, be the last element required to set in motion a process whose other necessary conditions already lay, not fully realized, in the situation.”
If I understand correctly, the exercise only works if plentiful resources are available and if students possess a range of relevant skills. The question then arises: how do you disentangle the effects of the resources and skills from those of the self-affirmation exercise? How can we tell whether the claimed improvement was due to self-affirmation or to the resources and skills?
The Cohen et al. study and a follow-up study were the subject of a large-scale replication by Hanselman et al. (2018) in the Journal of Educational Psychology. Their conclusion is clear: there was “no evidence of effects in this replication study.” Furthermore, “we found no evidence of treatment benefits for the targeted population in the new study.” Nonetheless, the Cohen et al. paper remains highly cited and unretracted.
So, perhaps someone could produce a ranking of universities according to studies that have not been retracted but cannot be replicated. I suspect that if the ranking were weighted by citation counts, Harvard and Stanford would be near the top.