#chalder — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #chalder, aggregated by home.social.
-
Claims Built on Fraudulent Trials Should Be Ignored
By David Tuller, DrPH
When researchers cite fraudulent studies in support of their claims, it is best not to take anything they write at face value. That is certainly the case with a recent paper titled “Persistent physical symptoms not explained by structural abnormalities or disease processes: a primary care approach to promote recovery,” published earlier this month in the Scandinavian Journal of Primary Health Care. (I use “fraudulent” here not in the legal sense but in the sense of “deceptive” or “deceitful.”) As evidence of something or other, the paper’s references include both the fraudulent PACE trial, whose reported findings have been discredited and rejected by leading medical authorities, and a fraudulent pediatric trial of the Lightning Process, in which the investigators violated core methodological principles of scientific research. (The Lightning Process, a woo-woo “brain retraining” program, was created by osteopath and former spiritual healer Phil Parker, who once claimed to be able to diagnose people’s ailments by stepping into their bodies for a look-see.) The Scandinavian Journal of Primary Health Care has emerged as something of a house organ for members of the biopsychosocial ideological brigades, including prominent non-Nordic fellow travelers like Professor Paul Garner and Professor Trudie Chalder. (The former is the corresponding author of this paper; the latter is one of multiple co-authors.) Both were also co-authors of a similarly misguided document published by the same journal in 2023, a manifesto from the self-styled Oslo Chronic Fatigue Consortium called “Chronic fatigue syndromes: real illnesses that people can recover from.” The new paper’s goal is to offer primary care physicians a short summary of “contemporary theories of PPS” along with purported “evidence-informed pathways” for treating patients.
The research involved a “narrative literature review and consensus development with experienced practitioners.” In other words, the paper presents the beliefs, opinions and … https://trialbyerror.org/2026/04/01/claims-built-on-fraudulent-trials-should-be-ignored/
-
By David Tuller, DrPH
The CODES trial investigated cognitive behavior therapy (CBT) as a treatment for dissociative seizures (DS), a sub-category of what is now called functional neurological disorder (FND). The intervention was a course of CBT specifically designed to address the variety of factors presumed to be triggering the seizures. (I have previously critiqued CODES here, here, and here.)
However, the trial was a bust, with null findings for the self-reported primary outcome–seizure reduction 12 months after randomization. In fact, in what must have been a major embarrassment for the investigators, the group that did not receive the intervention reported a greater reduction of seizures than the group that did, although this difference was not statistically significant.
(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)
Since these null CODES findings were published in 2020, FND experts have tried to reframe them by, among other strategies, suggesting that seizure reduction wasn’t the most appropriate or relevant primary outcome after all. The investigators themselves raised this notion in the initial paper reporting the CODES results. In an accompanying commentary, a colleague of the investigators promoted a similar notion, suggesting that quality-of-life measures were perhaps a better primary outcome than seizure reduction.
Now they’re at it again. In a new paper called “Reflections on the CODES trial for adults with dissociative seizures: what we found and considerations for future studies,” published this month by BMJ Neurology Open, the key CODES investigators present additional analyses of the trial data and try to argue that the results weren’t really so bad. The paper includes the following sentence: “Overall, inspection of our data does not support others’ suggestions that our treatment did not sustainably reduce DS frequency.”
This is a bizarre remark. It obviously implies that the data from CODES support the idea that the specialized treatment did, in fact, “sustainably reduce DS frequency.” However, CODES provided no evidence that the treatment did any such thing. The primary goal of the trial was not even to investigate whether participants in the intervention arm had reduced seizure frequency but whether the intervention showed benefit—that is, whether those who received the intervention did better than those who did not. And that didn’t happen.
In CODES, both arms experienced some seizure reduction, but the intervention did not provide any advantages in that regard. The reduction in seizure frequency cannot be attributed to the intervention, even if the investigators now appear to be claiming otherwise.
(I could be misinterpreting the above sentence, but I don’t think so. I think the investigators truly believe, notwithstanding the evidence, that their trial documented some impact from the intervention.)
With 368 participants, CODES was the largest clinical trial to date of a treatment for FND. The senior author was the factually and mathematically challenged Trudie Chalder, a professor of cognitive behavior therapy at King’s College London (KCL). The press release from KCL scammed the public by burying the disastrous findings for the primary outcome and instead touting the trial as a big success—a claim based on some subjective secondary outcomes with modestly positive findings that really mean nothing at all.
The new paper explains that, in the CODES model, “DS are maintained by a vicious circle of behavioural, cognitive, affective, physiological and social factors of which fear and avoidance are particularly salient.” This framework, the paper notes, “lends itself to the application of CBT interventions, particularly graded exposure to feared (avoided) situations and seizure interruption and control techniques.”
In the new paper, the investigators provide, perhaps inadvertently, a clue as to why CODES was destined to be a failure. As they explain, seizure reduction six months after the end of treatment was the primary outcome in a pilot study of CBT for DS, published in 2010: “In the pilot RCT [randomized controlled trial], 6 months after treatment there was an observed post-randomisation difference in favour of the DS-CBT group, but it could not be shown to be statistically significant.”
Exactly–the pilot study had null results. And yet the investigators were able to convince funders that the evidence warranted a test of the intervention in a full-scale trial. Is there something wrong with this picture? Why is anyone surprised that the full trial also had null results for seizure reduction at follow-up?
In the new paper, the investigators again present creative reasons to re-interpret the null findings from CODES. They note that the CODES comparison arm provided more than the standard care patients would have received outside the trial context. The participants in the comparison arm received some of the explanatory information and coping guidance that was available to those in the intervention arm, even though they did not receive the intervention’s active CBT component. From the perspective of the investigators, then, the null results for the primary outcome seem to mean that both arms benefited from the approach embodied by the intervention—not that the intervention was ineffective.
(I think I’m understanding their point, although I can’t be sure.)
**********
Primary and secondary outcomes
The new paper includes a lengthy discussion of the choice of primary outcome. The investigators first mention that funders required it—even though they themselves have a long history of defending seizure reduction as the primary outcome. In the pilot study, the investigators explicitly rejected the idea that other metrics might be more suitable. Presumably they took that step after careful consideration of other possibilities.
Here’s what they wrote in the pilot:
“Our CBT approach is predicated on the assumption that PNES represent dissociative responses to arousal, occurring when the person is faced with fearful or intolerable circumstances. Our treatment model emphasizes seizure reduction techniques especially in the early treatment sessions. While the usefulness of seizure remission as an outcome measure has been questioned, seizures are the reason for patients’ referral for treatment.”
That reasoning still makes sense. Since the investigators specifically designed the intervention to achieve seizure reduction based on their hypothetical understanding of the etiology of the disorder, it is not immediately clear why seizure reduction should not be the primary outcome. If they are now abandoning this metric as not so important after all, are they also questioning the biopsychosocial theories that informed the creation of the intervention? If not, why not?
Failure of an intervention should lead smart investigators to question their assumptions—but that doesn’t seem to have happened with CODES. The investigators still seem to believe the trial should be viewed as a success, making much of the fact that nine of their 16 secondary measures had findings that were statistically significant. But let’s be clear: This was an unblinded study relying on self-reported (or, in one case, physician-reported) outcomes—a trial design subject to an enormous amount of possible bias. It would be unexpected for the intervention group not to report modestly better outcomes from bias alone.
(The primary outcome and three of the secondary outcomes involved patients’ reports of the number of seizures. The self-reporting of seizures has the appearance as well as some aspects of objectivity, but it is still subjective and potentially influenced by bias.)
My colleague Philip Stark, a professor of statistics at UC Berkeley, made the following assessment of CODES:
“The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”
I highlighted Professor Stark’s assessment in a 2020 post, which also included my own observations about the secondary outcomes. Here’s the relevant passage:
“The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.
“Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.
“So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.
“If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible, unless they have to pay a statistical penalty for having boosted their odds of apparent success.
“The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.
“The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of ‘method for handling multiple comparisons’: ‘There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.’
“In other words, the investigators decided not to perform a routine statistical test despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.
“Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.
“Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment and the clinician who treated that participant would be more likely to rate the participant’s health as improved than would their counterparts in the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.
“None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.”
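To make the "spaghetti at the wall" point in the quoted passage concrete, here is a minimal Python sketch of the arithmetic behind multiple comparisons. It is an illustration only, not a reconstruction of the CODES analysis: the source does not say which correction the investigators applied, so the Bonferroni and Holm procedures below are swapped in as two standard examples. The function names are hypothetical, the 0.05 threshold is the conventional default, and the first function assumes the tests are independent, which secondary outcomes in a single trial usually are not.

```python
from typing import List, Sequence


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across num_tests tests.

    Assumption: the tests are independent and no real effect exists.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / num_tests


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure.

    Returns one boolean per input p-value, in the original order:
    True means the result stays significant after correction.
    """
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Once one test fails, every larger p-value fails too.
            break
    return reject
```

Under these assumptions, `familywise_error_rate(16)` comes to roughly 0.56, meaning that 16 uncorrected tests of a treatment with no effect would more often than not produce at least one "significant" result. `bonferroni_threshold(16)` gives 0.003125 as the stricter per-outcome cutoff. Feeding a list of p-values to `holm_reject` shows which findings survive that kind of adjustment.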
********
Even as they present all of their additional analyses, the CODES investigators are ignoring the admonition they themselves included in their protocol—that “care should be taken when interpreting the numerous secondary outcomes.” As this latest paper shows, they have taken zero such care. The new paper doesn’t even mention that only five of the secondary outcomes were statistically significant after adjustment for multiple comparisons—a telling omission. The whole thing reads like a desperate attempt to portray their intervention as having had some meaningful effect. The CODES data tell a different story.
-
By David Tuller, DrPH
The CODES trial investigated cognitive behavior therapy (CBT) as a treatment for dissociative seizures (DS), a sub-category of what is now called functional neurological disorder (FND). The intervention was a course of CBT specifically designed to address the variety of factors presumed to be triggering the seizures. (I have previously critiqued CODES here, here, and here,)
However, the trial was a bust, with null findings for the self-reported primary outcome–seizure reduction 12 months after randomization. In fact, in what must have been a major embarrassment for the investigators, the group that did not receive the intervention reported a greater reduction of seizures than the group that did, although this difference was not statistically significant.
(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)
Since these null CODES findings were published in 2020, FND experts have tried to reframe them by, among other strategies, suggesting that seizure reduction wasn’t the most appropriate or relevant primary outcome after all. The investigators themselves raised this notion in the initial paper reporting the CODES results. In an accompanying commentary, a colleague of the investigators promoted a similar notion, suggesting that quality-of-life measures were perhaps a better primary outcome than seizure reduction.
Now they’re at it again. In a new paper called “Reflections on the CODES trial for adults with dissociative seizures: what we found and considerations for future studies,” published this month by BMJ Neurology Open, the key CODES investigators present additional analyses of the trial data and try to argue that the results weren’t really so bad. The paper includes the following sentence: “Overall, inspection of our data does not support others’ suggestions that our treatment did not sustainably reduce DS frequency.”
This is a bizarre remark. It obviously implies that the data from CODES support the idea that the specialized treatment did, in fact, “sustainably reduce DS frequency.” However, CODES provided no evidence that the treatment did any such thing. The primary goal of the trial was not even to investigate whether participants in the intervention arm had reduced seizure frequency but whether the intervention showed benefit—that is, whether those who received the intervention did better than those who did not. And that didn’t happen.
In CODES, both arms experienced some seizure reduction, but the intervention did not provide any advantages in that regard. The reduction in seizure frequency cannot be attributed to the intervention, even if the investigators now appear to be claiming otherwise.
(I could be misinterpreting the above sentence, but I don’t think so. I think the investigators truly believe, notwithstanding the evidence, that their trial documented some impact from the intervention.)
With 368 participants, CODES was the largest clinical trial to date of a treatment for FND. The senior author was the factually and mathematically challenged Trudie Chalder, a professor of cognitive behavior therapy at King’s College London (KCL). The press release from KCL scammed the public by burying the disastrous findings for the primary outcome and instead touting the trial as a big success—a claim based on some subjective secondary outcomes with modestly positive findings that really mean nothing at all.
The new paper explains that, in the CODES model, “DS are maintained by a vicious circle of behavioural, cognitive, affective, physiological and social factors of which fear and avoidance are particularly salient.” This framework, the paper notes, “lends itself to the application of CBT interventions, particularly graded exposure to feared (avoided) situations and seizure interruption and control techniques.”
In the new paper, the investigators provide, perhaps inadvertently, a clue into why CODES was destined to be a failure. As they explain, seizure reduction six months after the end of treatment was the primary outcome in a pilot study of CBT for DS, published in 2010: “In the pilot RCT [randomized controlled trial], 6 months after treatment there was an observed post-randomisation difference in favour of the DS-CBT group, but it could not be shown to be statistically significant.”
Exactly–the pilot study had null results. And yet the investigators were able to convince funders that the evidence warranted a test of the intervention in a full-scale trial. Is there something wrong with this picture? Why is anyone surprised that the full trial also had null results for seizure reduction at follow-up?
In the new paper, the investigators again present creative reasons to re-interpret the null findings from CODES. They note that the CODES comparison arm provided more than the standard care patients would have received outside the trial context. The participants in the comparison arm received some of the explanatory information and coping guidance that was available to those in the intervention arm, even though they did not receive the intervention’s active CBT component. From the perspective of the investigators, then, the null results for the primary outcome seem to mean that both arms benefited from the approach embodied by the intervention—not that the intervention was ineffective.
(I think I’m understanding their point, although I can’t be sure.)
**********
Primary and secondary outcomes
The new paper includes a lengthy discussion of the choice of primary outcome. The investigators first mention that funders required it—even though they themselves have a long history of defending seizure reduction as the primary outcome. In the pilot study, the investigators explicitly rejected the idea that other metrics might be more suitable. Presumably they took that step after careful consideration of other possibilities.
Here’s what they wrote in the pilot:
“Our CBT approach is predicated on the assumption that PNES represent dissociative responses to arousal, occurring when the person is faced with fearful or intolerable circumstances. Our treatment model emphasizes seizure reduction techniques especially in the early treatment sessions. While the usefulness of seizure remission as an outcome measure has been questioned, seizures are the reason for patients’ referral for treatment.”
That reasoning still makes sense. Since the investigators specifically designed the intervention to achieve seizure reduction based on their hypothetical understanding of the etiology disorder, it is not immediately clear why seizure reduction should not be the primary outcome. If they are now abandoning this metric as not so important after all, are they also questioning the biopsychosocial theories that informed the creation of the intervention? If not, why not?
Failure of an intervention should lead smart investigators to question their assumptions—but that doesn’t seem to have happened with CODES. The investigators still seem to believe the trial should be viewed as a success, making much of the fact that nine of their 16 secondary measures had findings that were statistically significant. But let’s be clear: This was an unblinded study relying on self-reported (or, in one case, physician-reported) outcomes—a trial design subject to an enormous amount of possible bias. It would be unexpected for the intervention group not to report modestly better outcomes from bias alone.
(The primary outcome and three of the secondary outcomes involved patients’ reports of the number of seizures. The self-reporting of seizures has the appearance as well as some aspects of objectivity, but it is still subjective and potentially influenced by bias.)
My colleague Philip Stark, a professor of statistics at UC Berkeley, made the following assessment of CODES:
“The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”
I highlighted Professor Stark’s assessment in a 2020 post, which also included my own observations about the secondary outcomes. Here’s the relevant passage:
“The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.
“Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.
“So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.
“If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible, unless they have to pay a statistical penalty for having boosted their odds of apparent success.
“The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.
“The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of “method for handling multiple comparisons”: ‘There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.’
“In other words, the investigators decided not to perform a routine statistical test despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.
“Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.
“Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment and the clinician who treated that participant would be more likely to rate the participant’s health as improved than when compared to the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.
“None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.”
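The spaghetti point in the quoted passage can be made concrete with some rough arithmetic. The sketch below is illustrative only: it assumes 16 independent tests at the conventional 0.05 threshold, which overstates the independence of correlated trial outcomes, and the p-values fed to the Holm function are invented for the example. Bonferroni and Holm are shown as two common corrections; nothing here indicates which procedure the CODES paper actually used.

```python
from typing import List


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests independent
    tests when no true effect exists (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def holm_rejections(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure: test the smallest p-value against
    alpha/m, the next against alpha/(m-1), and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # every larger p-value also fails
    return rejected


if __name__ == "__main__":
    # With 16 outcomes and no real effect, about 0.56.
    print(round(familywise_error_rate(16), 2))
    # Bonferroni threshold for 16 outcomes: 0.003125.
    print(bonferroni_threshold(16))
    # Hypothetical p-values, not taken from CODES.
    print(holm_rejections([0.001, 0.04, 0.03, 0.002]))
```

Under these assumptions, a study with 16 secondary outcomes and no true effect would still have better than even odds of producing at least one "significant" result, which is why the uncorrected tally of nine findings overstates the case.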
********
Even as they present all of their additional analyses, the CODES investigators are ignoring the admonition they themselves included in their protocol—that “care should be taken when interpreting the numerous secondary outcomes.” As this latest paper shows, they have taken zero such care. The new paper doesn’t even mention that only five of the secondary outcomes were statistically significant after adjustment for multiple comparisons—a telling omission. The whole thing reads like a desperate attempt to portray their intervention as having had some meaningful effect. The CODES data tell a different story.
-
By David Tuller, DrPH
The CODES trial investigated cognitive behavior therapy (CBT) as a treatment for dissociative seizures (DS), a sub-category of what is now called functional neurological disorder (FND). The intervention was a course of CBT specifically designed to address the variety of factors presumed to be triggering the seizures. (I have previously critiqued CODES here, here, and here,)
However, the trial was a bust, with null findings for the self-reported primary outcome–seizure reduction 12 months after randomization. In fact, in what must have been a major embarrassment for the investigators, the group that did not receive the intervention reported a greater reduction of seizures than the group that did, although this difference was not statistically significant.
(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)
Since these null CODES findings were published in 2020, FND experts have tried to reframe them by, among other strategies, suggesting that seizure reduction wasn’t the most appropriate or relevant primary outcome after all. The investigators themselves raised this notion in the initial paper reporting the CODES results. In an accompanying commentary, a colleague of the investigators promoted a similar notion, suggesting that quality-of-life measures were perhaps a better primary outcome than seizure reduction.
Now they’re at it again. In a new paper called “Reflections on the CODES trial for adults with dissociative seizures: what we found and considerations for future studies,” published this month by BMJ Neurology Open, the key CODES investigators present additional analyses of the trial data and try to argue that the results weren’t really so bad. The paper includes the following sentence: “Overall, inspection of our data does not support others’ suggestions that our treatment did not sustainably reduce DS frequency.”
This is a bizarre remark. It obviously implies that the data from CODES support the idea that the specialized treatment did, in fact, “sustainably reduce DS frequency.” However, CODES provided no evidence that the treatment did any such thing. The primary goal of the trial was not even to investigate whether participants in the intervention arm had reduced seizure frequency but whether the intervention showed benefit—that is, whether those who received the intervention did better than those who did not. And that didn’t happen.
In CODES, both arms experienced some seizure reduction, but the intervention did not provide any advantages in that regard. The reduction in seizure frequency cannot be attributed to the intervention, even if the investigators now appear to be claiming otherwise.
(I could be misinterpreting the above sentence, but I don’t think so. I think the investigators truly believe, notwithstanding the evidence, that their trial documented some impact from the intervention.)
With 368 participants, CODES was the largest clinical trial to date of a treatment for FND. The senior author was the factually and mathematically challenged Trudie Chalder, a professor of cognitive behavior therapy at King’s College London (KCL). The press release from KCL scammed the public by burying the disastrous findings for the primary outcome and instead touting the trial as a big success—a claim based on some subjective secondary outcomes with modestly positive findings that really mean nothing at all.
The new paper explains that, in the CODES model, “DS are maintained by a vicious circle of behavioural, cognitive, affective, physiological and social factors of which fear and avoidance are particularly salient.” This framework, the paper notes, “lends itself to the application of CBT interventions, particularly graded exposure to feared (avoided) situations and seizure interruption and control techniques.”
In the new paper, the investigators provide, perhaps inadvertently, a clue into why CODES was destined to be a failure. As they explain, seizure reduction six months after the end of treatment was the primary outcome in a pilot study of CBT for DS, published in 2010: “In the pilot RCT [randomized controlled trial], 6 months after treatment there was an observed post-randomisation difference in favour of the DS-CBT group, but it could not be shown to be statistically significant.”
Exactly–the pilot study had null results. And yet the investigators were able to convince funders that the evidence warranted a test of the intervention in a full-scale trial. Is there something wrong with this picture? Why is anyone surprised that the full trial also had null results for seizure reduction at follow-up?
In the new paper, the investigators again present creative reasons to re-interpret the null findings from CODES. They note that the CODES comparison arm provided more than the standard care patients would have received outside the trial context. The participants in the comparison arm received some of the explanatory information and coping guidance that was available to those in the intervention arm, even though they did not receive the intervention’s active CBT component. From the perspective of the investigators, then, the null results for the primary outcome seem to mean that both arms benefited from the approach embodied by the intervention—not that the intervention was ineffective.
(I think I’m understanding their point, although I can’t be sure.)
**********
Primary and secondary outcomes
The new paper includes a lengthy discussion of the choice of primary outcome. The investigators first mention that funders required it—even though they themselves have a long history of defending seizure reduction as the primary outcome. In the pilot study, the investigators explicitly rejected the idea that other metrics might be more suitable. Presumably they took that step after careful consideration of other possibilities.
Here’s what they wrote in the pilot:
“Our CBT approach is predicated on the assumption that PNES represent dissociative responses to arousal, occurring when the person is faced with fearful or intolerable circumstances. Our treatment model emphasizes seizure reduction techniques especially in the early treatment sessions. While the usefulness of seizure remission as an outcome measure has been questioned, seizures are the reason for patients’ referral for treatment.”
That reasoning still makes sense. Since the investigators specifically designed the intervention to achieve seizure reduction based on their hypothetical understanding of the etiology disorder, it is not immediately clear why seizure reduction should not be the primary outcome. If they are now abandoning this metric as not so important after all, are they also questioning the biopsychosocial theories that informed the creation of the intervention? If not, why not?
Failure of an intervention should lead smart investigators to question their assumptions—but that doesn’t seem to have happened with CODES. The investigators still seem to believe the trial should be viewed as a success, making much of the fact that nine of their 16 secondary measures had findings that were statistically significant. But let’s be clear: This was an unblinded study relying on self-reported (or, in one case, physician-reported) outcomes—a trial design subject to an enormous amount of possible bias. It would be unexpected for the intervention group not to report modestly better outcomes from bias alone.
(The primary outcome and three of the secondary outcomes involved patients’ reports of the number of seizures. The self-reporting of seizures has the appearance as well as some aspects of objectivity, but it is still subjective and potentially influenced by bias.)
My colleague Philip Stark, a professor of statistics at the UC Berkeley, made the following assessment of CODES:
“The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”
I highlighted Professor Stark’s assessment in a 2020 post, which also included my own observations about the secondary outcomes. Here’s the relevant passage:
“The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.
“Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.
“So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.
“If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible, unless they have to pay a statistical penalty for having boosted their odds of apparent success.
“The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.
“The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of “method for handling multiple comparisons”: ‘There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.’
“In other words, the investigators decided not to perform a routine statistical test despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.
“Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.
“Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment and the clinician who treated that participant would be more likely to rate the participant’s health as improved than would be the case in the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.
“None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.”
********
Even as they present all of their additional analyses, the CODES investigators are ignoring the admonition they themselves included in their protocol—that “care should be taken when interpreting the numerous secondary outcomes.” As this latest paper shows, they have taken zero such care. The new paper doesn’t even mention that only five of the secondary outcomes were statistically significant after adjustment for multiple comparisons—a telling omission. The whole thing reads like a desperate attempt to portray their intervention as having had some meaningful effect. The CODES data tell a different story.
-
By David Tuller, DrPH
The CODES trial investigated cognitive behavior therapy (CBT) as a treatment for dissociative seizures (DS), a sub-category of what is now called functional neurological disorder (FND). The intervention was a course of CBT specifically designed to address the variety of factors presumed to be triggering the seizures. (I have previously critiqued CODES here, here, and here.)
However, the trial was a bust, with null findings for the self-reported primary outcome–seizure reduction 12 months after randomization. In fact, in what must have been a major embarrassment for the investigators, the group that did not receive the intervention reported a greater reduction of seizures than the group that did, although this difference was not statistically significant.
(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)
Since these null CODES findings were published in 2020, FND experts have tried to reframe them by, among other strategies, suggesting that seizure reduction wasn’t the most appropriate or relevant primary outcome after all. The investigators themselves raised this notion in the initial paper reporting the CODES results. In an accompanying commentary, a colleague of the investigators promoted a similar notion, suggesting that quality-of-life measures were perhaps a better primary outcome than seizure reduction.
Now they’re at it again. In a new paper called “Reflections on the CODES trial for adults with dissociative seizures: what we found and considerations for future studies,” published this month by BMJ Neurology Open, the key CODES investigators present additional analyses of the trial data and try to argue that the results weren’t really so bad. The paper includes the following sentence: “Overall, inspection of our data does not support others’ suggestions that our treatment did not sustainably reduce DS frequency.”
This is a bizarre remark. It obviously implies that the data from CODES support the idea that the specialized treatment did, in fact, “sustainably reduce DS frequency.” However, CODES provided no evidence that the treatment did any such thing. The primary goal of the trial was not even to investigate whether participants in the intervention arm had reduced seizure frequency but whether the intervention showed benefit—that is, whether those who received the intervention did better than those who did not. And that didn’t happen.
In CODES, both arms experienced some seizure reduction, but the intervention did not provide any advantages in that regard. The reduction in seizure frequency cannot be attributed to the intervention, even if the investigators now appear to be claiming otherwise.
(I could be misinterpreting the above sentence, but I don’t think so. I think the investigators truly believe, notwithstanding the evidence, that their trial documented some impact from the intervention.)
With 368 participants, CODES was the largest clinical trial to date of a treatment for FND. The senior author was the factually and mathematically challenged Trudie Chalder, a professor of cognitive behavior therapy at King’s College London (KCL). The press release from KCL scammed the public by burying the disastrous findings for the primary outcome and instead touting the trial as a big success—a claim based on some subjective secondary outcomes with modestly positive findings that really mean nothing at all.
The new paper explains that, in the CODES model, “DS are maintained by a vicious circle of behavioural, cognitive, affective, physiological and social factors of which fear and avoidance are particularly salient.” This framework, the paper notes, “lends itself to the application of CBT interventions, particularly graded exposure to feared (avoided) situations and seizure interruption and control techniques.”
In the new paper, the investigators provide, perhaps inadvertently, a clue into why CODES was destined to be a failure. As they explain, seizure reduction six months after the end of treatment was the primary outcome in a pilot study of CBT for DS, published in 2010: “In the pilot RCT [randomized controlled trial], 6 months after treatment there was an observed post-randomisation difference in favour of the DS-CBT group, but it could not be shown to be statistically significant.”
Exactly–the pilot study had null results. And yet the investigators were able to convince funders that the evidence warranted a test of the intervention in a full-scale trial. Is there something wrong with this picture? Why is anyone surprised that the full trial also had null results for seizure reduction at follow-up?
In the new paper, the investigators again present creative reasons to re-interpret the null findings from CODES. They note that the CODES comparison arm provided more than the standard care patients would have received outside the trial context. The participants in the comparison arm received some of the explanatory information and coping guidance that was available to those in the intervention arm, even though they did not receive the intervention’s active CBT component. From the perspective of the investigators, then, the null results for the primary outcome seem to mean that both arms benefited from the approach embodied by the intervention—not that the intervention was ineffective.
(I think I’m understanding their point, although I can’t be sure.)
**********
Primary and secondary outcomes
The new paper includes a lengthy discussion of the choice of primary outcome. The investigators first mention that funders required it—even though they themselves have a long history of defending seizure reduction as the primary outcome. In the pilot study, the investigators explicitly rejected the idea that other metrics might be more suitable. Presumably they took that step after careful consideration of other possibilities.
Here’s what they wrote in the pilot:
“Our CBT approach is predicated on the assumption that PNES represent dissociative responses to arousal, occurring when the person is faced with fearful or intolerable circumstances. Our treatment model emphasizes seizure reduction techniques especially in the early treatment sessions. While the usefulness of seizure remission as an outcome measure has been questioned, seizures are the reason for patients’ referral for treatment.”
That reasoning still makes sense. Since the investigators specifically designed the intervention to achieve seizure reduction based on their hypothetical understanding of the disorder’s etiology, it is not immediately clear why seizure reduction should not be the primary outcome. If they are now abandoning this metric as not so important after all, are they also questioning the biopsychosocial theories that informed the creation of the intervention? If not, why not?
Failure of an intervention should lead smart investigators to question their assumptions—but that doesn’t seem to have happened with CODES. The investigators still seem to believe the trial should be viewed as a success, making much of the fact that nine of their 16 secondary measures had findings that were statistically significant. But let’s be clear: This was an unblinded study relying on self-reported (or, in one case, physician-reported) outcomes—a trial design subject to an enormous amount of possible bias. It would be unexpected for the intervention group not to report modestly better outcomes from bias alone.
(The primary outcome and three of the secondary outcomes involved patients’ reports of the number of seizures. The self-reporting of seizures has the appearance as well as some aspects of objectivity, but it is still subjective and potentially influenced by bias.)
My colleague Philip Stark, a professor of statistics at UC Berkeley, made the following assessment of CODES:
“The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”
I highlighted Professor Stark’s assessment in a 2020 post, which also included my own observations about the secondary outcomes. Here’s the relevant passage:
“The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.
“Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.
“So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.
“If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible, unless they have to pay a statistical penalty for having boosted their odds of apparent success.
“The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.
“The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of “method for handling multiple comparisons”: ‘There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.’
“In other words, the investigators decided not to perform a routine statistical test despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.
“Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.
“Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment and the clinician who treated that participant would be more likely to rate the participant’s health as improved than would be the case in the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.
“None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.”
********
Even as they present all of their additional analyses, the CODES investigators are ignoring the admonition they themselves included in their protocol—that “care should be taken when interpreting the numerous secondary outcomes.” As this latest paper shows, they have taken zero such care. The new paper doesn’t even mention that only five of the secondary outcomes were statistically significant after adjustment for multiple comparisons—a telling omission. The whole thing reads like a desperate attempt to portray their intervention as having had some meaningful effect. The CODES data tell a different story.
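The spaghetti-at-the-wall point can be made concrete with some simple arithmetic. The sketch below is illustrative only: it assumes the 16 secondary outcomes are independent and each tested at the conventional 0.05 threshold, neither of which is stated in the CODES papers, and it uses a Bonferroni correction as one common example of adjusting for multiple comparisons, not necessarily the method the investigators applied. The function names are made up for this illustration.

```python
# Illustrative arithmetic only; independence and alpha = 0.05 are assumptions.

def familywise_error_rate(num_outcomes: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_outcomes independent tests comes up
    'significant' when there is no true effect on any of them."""
    return 1.0 - (1.0 - alpha) ** num_outcomes


def bonferroni_threshold(num_outcomes: int, alpha: float = 0.05) -> float:
    """Per-outcome p-value cutoff after a Bonferroni correction."""
    return alpha / num_outcomes


if __name__ == "__main__":
    n = 16  # number of secondary outcomes in CODES
    print(f"Chance of at least one false positive: {familywise_error_rate(n):.2f}")
    print(f"Bonferroni cutoff per outcome: {bonferroni_threshold(n):.5f}")
```

Under those assumptions, the chance of at least one spurious "significant" secondary outcome is a bit better than even, which is why an uncorrected tally of positive secondary findings says so little on its own.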
-
By David Tuller, DrPH
I’ve said it before and will undoubtedly say it again. Trudie Chalder, King’s College London’s mathematically and factually challenged professor of cognitive behavior therapy (CBT), is a one-trick pony. She writes what is essentially the same bad paper based on the same unfounded assumptions over and over again. Her apparent professional success represents, at least when it comes to this specific domain of inquiry, the broken state of scientific research, the triumph of mediocrity and incompetence, and the ethical bankruptcy of leading medical journals. From outside the culture of slavish deference to authority that seems to infuse British academia, Professor Chalder’s continuing ability to obtain major grant funding and publish endless muck is astonishing. Is there no mercy in this world for those of us forced by professional obligations to read this unceasing stream of sewage?
In Professor Chalder’s most recent study, she and her colleagues find an association between worsening fatigue and two of her favorite constructs–“all-or-nothing behavior” and “catastrophic thinking”–in patients with inflammatory bowel disease (IBD). The odds ratios are pretty tiny, however, at 1.13 for all-or-nothing behavior and 1.18 for catastrophic thinking—meaning that the odds of worsening fatigue are only marginally higher in those demonstrating these purportedly unhelpful behavioral or cognitive patterns. (The study was published by the journal Inflammatory Bowel Diseases.)
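To get a feel for how small these odds ratios are, it can help to convert them into probabilities. The sketch below is a rough illustration, not a reanalysis: the 30% baseline chance of worsening fatigue is a made-up figure, not one taken from the study, and the calculation treats each odds ratio as a simple comparison between two groups, which simplifies how such ratios are estimated in a regression model.

```python
def probability_after_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Convert a baseline probability to odds, apply the odds ratio,
    and convert the result back to a probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    hypothetical_baseline = 0.30  # assumed for illustration, not from the paper
    for label, odds_ratio in [("all-or-nothing behavior", 1.13),
                              ("catastrophic thinking", 1.18)]:
        p = probability_after_odds_ratio(hypothetical_baseline, odds_ratio)
        print(f"{label}: {hypothetical_baseline:.0%} -> {p:.1%}")
```

With that hypothetical baseline, both odds ratios shift the probability of worsening fatigue by only a few percentage points.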
However banal and inconsequential, this sort of statistical finding is catnip for Professor Chalder. She seems never to have met an association she couldn’t try to spin as a causal relationship in order to justify the promotion of CBT as a solution. In this case, she seems to interpret this very minimal association to mean that these two identified patterns—all-or-nothing behavior and catastrophic thinking—are major factors in generating the reported worsening of fatigue.
This interpretation is implied, although not stated explicitly, in the headline: “All-or-Nothing Behavior and Catastrophic Thinking Predict Fatigue in Inflammatory Bowel Disease: A Prospective Cohort Study.” The word “predict” is doing a lot of work here. It creates the impression that these cognitive and behavioral patterns are to blame for the reported fatigue. But the association itself cannot be construed as evidence of that.
In reality, logic and common sense suggest that any causal relationships, however small, could easily run in the opposite direction than that presumed by Professor Chalder and her colleagues–that is, the fatigue itself is likely leading to the reported behavioral and cognitive patterns. Given fluctuating or worsening levels of fatigue, it makes sense that IBD patients would try to get as much done as possible when they felt well and less so when they felt worse. Moreover, if someone with pronounced or worsening levels of fatigue responds to questionnaires with a realistic appraisal of their condition, it could easily be interpreted by Professor Chalder and similarly biased investigators as “catastrophizing.”
In this study, 167 patients filled out questionnaires at baseline and three months later. The primary outcome was Professor Chalder’s eponymous instrument, the Chalder Fatigue Scale. (Let’s put aside for now the various concerns that have been raised about the usefulness and accuracy of this questionnaire.) To assess possible “explanatory variables,” the investigators used an instrument called the Cognitive and Behavioural Responses to Symptoms Questionnaire (CBRQ). This instrument was designed and validated by experts, including Professor Chalder herself, who believe that fatigue and other “medically unexplained” symptoms should be treated with psychological and behavioral interventions, which they further believe are what they call “evidence-based.”
The CBRQ includes two behavioral subscales and five cognitive ones. The behavioral subscales are all-or-nothing behavior and excessive resting behavior. And here’s how the paper describes the cognitive ones:
“The cognitive subscales are (1) fear avoidance, which focuses on avoidance of exercise due to fear of worsening symptoms; (2) damage beliefs, which measure the belief that symptoms and their severity reflects true damage to the body; (3) catastrophizing about symptoms, which captures negative and inflated beliefs in anticipation of symptoms; (4) embarrassment avoidance, which quantifies avoidance of social situations due to feelings of shame about symptoms and concerns about others’ opinion of symptoms; and (5) symptom focusing, which assesses attentional focus towards symptoms.”
Each of these subscales only makes sense if the underlying hypothesis of the CBRQ and this study—that these cognitive factors serve to perpetuate and exacerbate the fatigue—is valid. If the fatigue is driven instead by poorly understood pathophysiological dysfunctions connected to the primary diagnosis of IBD, then it would not be surprising for respondents to have higher scores on all of these subscales. These investigators, however, do not appear to acknowledge the possibility of such pathophysiological dysfunctions, so they view such straightforward accounts as “unhelpful” cognitions that need to be reversed through CBT.
Moreover, as one of the three lead investigators of the discredited and arguably fraudulent PACE trial, Professor Chalder still cites it in this paper as support for her claims, such as that CBT has been shown to be effective in producing “significant improvements” in fatigue in patients with what is referred to as “chronic fatigue syndrome.” (It is an insult to patients that even now Professor Chalder chooses to use this rejected name for the illness.) It goes without saying that PACE should never be cited as if the findings can be taken seriously, and no peer reviewers should allow such a citation to pass unchallenged.
As a secondary measure, the study included an instrument called the Work and Social Adjustment Scale (WSAS). As with fatigue, there was an association between worsening scores on the WSAS and the CBRQ subscales of all-or-nothing behavior and catastrophizing; the other behavioral subscale of excessive resting and the cognitive subscale of fear avoidance were similarly associated with worsening WSAS scores. The odds ratios in all four of these associations were as tiny as the associations involving fatigue. (Is it just me, or is there something confusing about having positive associations with both of CBRQ’s behavioral subscales–all-or-nothing behavior and excessive resting behavior? Don’t those two behavioral patterns kind of cancel each other out?)
In sum, Professor Chalder’s study of IBD patients yielded marginal associations between the primary and secondary outcomes and so-called “explanatory variables.” These findings would be negligible even if the investigators were correct in interpreting the relationships as causal in their favored direction and were not basing their interpretations upon their unproven hypotheses and questionable assumptions. The study’s inevitable conclusion–that “CBT interventions targeting all-or-nothing behavior and catastrophic thinking in IBD are warranted”–is self-serving bullshit.
(Originally posted on Virology Blog.)
https://trialbyerror.org/2023/09/04/just-the-latest-gibberish-from-professor-chalder/
-
By David Tuller, DrPH
*April is crowdfunding month at UC Berkeley. If you like my work, consider making a tax-deductible donation to Berkeley’s School of Public Health to support the Trial By Error project: https://crowdfund.berkeley.edu/project/37217
No human being should ever have to read as many papers as I have from Professor Esther Crawley, Bristol University’s methodologically and ethically challenged pediatrician, and Professor Trudie Chalder, King’s College London’s statistically and factually challenged cognitive behavior therapy specialist. Most recently, I had to ask the UK’s Health Research Authority to track down why most of the papers that Professor Crawley was ordered to correct a few years ago, per the results of an investigation into her work, had not been corrected. And Professor Chalder makes one egregious error after another–as when she declared at a PACE press conference that people in the trial got “back to normal,” a serious misstatement of the findings.
Really, I’ve had it up to here with the crap that they publish. Perhaps that’s why I have so far avoided paying attention to a major study in which they both play a role. But the time has come to discuss the project called Children & young people (CYP) with Long Covid—which the authors shorthand as the CLoCk study. (Enough with these stupid pseudo-acronyms! What the @#$ does “clock” have to do with anything???)
Professors Chalder and Crawley are not the main investigators but are members of a larger consortium of researchers across multiple universities. Nonetheless, their participation is certainly a red flag. The prospective study is sponsored by University College London’s Great Ormond Street Institute of Child Health and has been awarded £1.35 million from UK funding agencies. Here is a description of the project from UCL’s site:
“This project has identified test positive and test negative 11–17-year-olds through Public Health England’s database. We will be contacting families 3, 6, 12 and 24 months after the young person’s COVID-19 test asking them to complete a questionnaire about the young person’s physical and mental health. We will compare the symptoms between those who have tested positive and those who do not, and also track symptoms over time.”
In a problematic sign, the study is housed under the umbrella of UCL’s “psychological medicine research.” That suggests that this project is not likely to demonstrate a sense of urgency about the need to uncover any pathophysiological mechanisms related to long Covid. It seems designed in a way likely to maximize the chances of generating data that would allow the authors to attribute reported symptoms to the effects of lockdown and related challenging circumstances.
It is undeniable that some children identified as having long Covid are likely suffering from anxiety, depression and other pandemic-related mental health issues, and that such mood states can manifest as physical symptoms. But that truism does not mean that emotional distress is the major cause of the current wave of pediatric medical complaints. And it certainly does not explain why so many formerly healthy kids are experiencing disabling conditions—just like these hypothesized psychological mechanisms cannot explain why so many adults now find themselves unable to engage in their regular daily activities, including work.
Unfortunately, it is always easy to use broad criteria and inappropriate or otherwise problematic comparison groups to generate results suggesting that prolonged symptoms are related to “psychosocial” factors. That happened most recently with the Norwegian study published in JAMA Network Open, which I analyzed here and here. The study was conducted in conjunction with Recovery Norway, an organization with strong links to the mind-body “retraining” program known as the Lightning Process. The study’s conclusion—that reports of prolonged symptoms are mainly related to factors other than coronavirus infection—is not credible. But it aligns with the bias of leaders of the research team.
With the CLoCk study, I’m coming late to the party. Luckily, others have already weighed in on this one, so I’m highlighting the steps they’ve already taken to address the situation. In particular, Claire Higham, who writes a newsletter called Long Covid Advocacy and lives in England, sent a letter to UCL last July expressing a number of cogent concerns. The letter was co-signed by several physicians, researchers and other experts who have become known for speaking out about long Covid. Higham has a long history of post-acute viral illness, and her daughter has long Covid. According to a statement on the newsletter, “The purpose of Long Covid Advocacy is to highlight systemic injustice in the fields of Long Covid and ME/CFS.”
Higham finally received a response from UCL late last month and has posted it here. (Long Covid Advocacy has posted about the letter and responded to UCL’s response, here and here.)
Long Covid Advocacy’s letter highlighted the following issues (read the full letter for details):
*Ethical breaches in recruitment and data collection: Informed consent; Deception; Fishing and Data Protection.
*Long Covid and ME/CFS are clinically linked; CLoCk draws heavily on psychosocial research into ME/CFS that contravenes the current NICE ME/CFS Guidelines, and ignores the report from APPG4ME: Rethinking ME.
*Psychosocial aspects such as ‘Lockdown Anxiety’ and ‘abnormal thoughts’ are being inappropriately researched in CLoCk.
*Insufficient public patient involvement: issues raised are being inadequately addressed from a PPIE perspective resulting in major flaws in the proposed trial.
*Harmful exercise programmes may be used with CYP on the CLoCk study.
*CYP may face a misdiagnosis of PRS or FII or if parents refuse or pull out of treatment or research; this may further prompt inappropriate involvement from Child Protection Services.
The UCL response, needless to say, was disappointing. Some of the explanations also demonstrated a level of bureaucratic bungling and incompetence that is especially concerning when the people involved are dealing with vulnerable pediatric patients. For example, Higham’s complaint was apparently passed around among multiple offices and agencies because no one seemed to know who was responsible for responding to it. That’s why it took almost a year for UCL to respond. And some of the forms used by the investigators created confusion about whether pediatric patients at clinics were or were not being enrolled in the study. Etc.
Beyond the specifics of the CLoCk study, a major problem–at least from my observation–is that people seem to fail upwards in this domain of UK academia, which appears to operate under Trumpian approaches to logic and integrity. Professors Crawley and Chalder should by now be considered major embarrassments to their respective institutions, given their documented methodological errors, ethical missteps and false public statements. Both of these prolific and high-profile investigators have committed what I consider to be serious research misconduct, and arguably worse—Professor Chalder with the discredited PACE trial in The Lancet and Professor Crawley with the disastrous pediatric trial of the Lightning Process in Archives of Disease in Childhood, a BMJ journal. The latter now carries a 3,000-word correction and a 1,000-word editor’s note explaining with tortured logic why the paper wasn’t retracted. But it still gets cited authoritatively.
The CLoCk study is only the latest example of questionable work connected to this bunch. It seems largely based on long-held but unproven assumptions that unexplained symptoms are mainly caused by psychosocial factors. Professors Crawley and Chalder and their colleagues are one-trick ponies. Despite the obvious flaws in their research, they retain undeserved levels of professional applause and continue to pull in grant money from leading UK funders. It is hard to fathom.
(Originally posted on Virology Blog.)