How Your Doctor Looks at Research: the Error Factor

November 8, 2005

By Robert H. Shmerling, M.D.
Beth Israel Deaconess Medical Center

If you look at most medical news stories years later, there is often little to show for them. The latest "groundbreaking research" has gone nowhere and the news has moved on to other "breakthroughs." Why does this happen?

In some ways, the answer is predictable. It happens because medical research is designed and performed by fallible humans using imperfect methods, because the news media want to attract and hold onto an audience, and because we all want hopeful, positive results. Each of these factors is at play and each plays havoc with our ability to tell the difference between an interesting but preliminary finding, with uncertain importance, and the "next big thing."

Of these, perhaps the one factor that dooms many scientific studies and is most difficult to eliminate is error. The fact is, no matter how careful and talented researchers may be, error is common in medical research.

Types of Error

In broad terms, medical researchers identify two main types of error:

Type 1 error is the incorrect finding of positive results (also called a false-positive study result). For example, imagine a study reporting that people who eat lots of flaxseed tend to live longer. The news flash might be: The more flaxseed you eat, the longer you live, so eat more! Sure, it could be true that flaxseed makes you healthier and live longer, but there could be other factors (called confounders) that account for the findings. Maybe people who add flaxseed to their diet exercise more, smoke less and watch what they eat more carefully than people who aren't into flaxseed. If researchers take these confounders into account, the apparent advantage of flaxseed may disappear; but if these factors are not accounted for (perhaps because no one thought of them), attributing longevity to flaxseed is a Type 1 error.
Type 2 error is the incorrect finding of negative results (also called a false-negative result). For example, suppose flaxseed actually does help you live longer. If researchers perform a study and find people have the same lifespan regardless of how much flaxseed they consume, the conclusion would be an example of a Type 2 error. Perhaps several of the flaxseed eaters happened to die prematurely for unrelated reasons. Their demise could have thrown off the results and apparently eliminated the beneficial effects of flaxseed.

Researchers take pains to minimize the chances of either type of error. For example, a study will try to anticipate every potential confounder. Once information about these factors is collected, there are statistical methods to account for them, eliminating their contribution to the results. Frequently, however, there is an important variable that was not anticipated, because our knowledge of the relevant factors is simply incomplete. And studies may not recruit enough subjects to "even out" the occasional unusual or irrelevant result.

The Limits of a Limited Time Frame

Because studies are expensive and require volunteers and the work of many physicians, nurses, pharmacists and other health professionals, they often last for the minimum amount of time thought necessary to answer the question of interest. While studies lasting only a few months or up to a few years are common, we really need information about the longer term because:

In "real life," people take medications for many years or even decades. It's important to know how effective and how safe a medicine is many years down the line.

Long-term follow-up will eliminate some of the random variation or "noise" in the measured outcomes. For example, blood pressure can vary under normal circumstances. If only one or two measurements are made over two or three months, results will not be as reliable as many more measurements over many years.

People change over time.
If other diseases develop or other medicines are taken over time (which is a fact of life for the majority of people as they age), the findings of a short-term study may be reversed.

Short-term studies are not reliable measures of the long term. Unless the disease or symptom is a short-term issue, the inadequate follow-up of many studies is a significant limitation and source of error. Yet there are major challenges to performing long-term research, including turnover among personnel (who change jobs or graduate), difficulty keeping track of the volunteers (who move away or grow weary of the monitoring), and changes in technology. For example, it may be impossible to compare the results of an MRI today with results from 15 years ago because the quality of images obtained has improved over that span of time.

Could You Repeat the Question?

Research begins by asking a question and then designing a way to answer it. But because of how research is performed, the question may change to something related that is easier to study, but not exactly the ideal question.

For example, there is a lot of focus on cholesterol, but it's cardiovascular disease and death that we really care about. Knowing that there's a connection between cholesterol and these other important outcomes has led researchers to focus on cholesterol results. So, instead of asking, "Does this medication prevent heart attack and prolong life?" we ask, "Does this medicine reduce cholesterol?" Research that relies on "proxies" (variables that "fill in" for the ones you're really interested in) is not as reliable as research that looks directly at the key endpoints of interest.

Animal studies are another way that the question is revised for the sake of study design. If you want to ask, "Does this medicine prevent age-related changes in the brain?" you can actually examine the brains of rats and mice in a way that you cannot for humans.
So, the question becomes, "Does this medicine prevent age-related changes in the brains of mice?"

Finally, the question may be shaped based on who is paying for the research. While you may want to know, "Does this medicine work better than another medicine?" a pharmaceutical company may change it to "Is our medicine better than nothing?" Why? Because that's the minimum requirement of the U.S. Food and Drug Administration for approving a drug and allowing it to be marketed in this country. That question may be much less interesting: do you really want to take a medicine that's "better than nothing" or the best one among all options for treating your problem?

Conclusions

Medical research is vitally important, but it has some significant limitations, and some of them contribute to how the medical news you hear today may be contradicted by other research next week or next year. Here's my advice: Remain skeptical, recognize that it's rarely a good idea to bank on a single study, and keep in mind that there could always be error, no matter how good the research (or the researcher) sounds.

Someday, there may be major changes in our system of funding and executing medical research. We need funding sources that do not alter the question that is asked, we need better ways to minimize error, and we need research designs with a long-term vision. One suggestion is to require studies to continue even after a new drug is approved, but that would take cooperation (and funding) that seems unlikely in the near future. Until that happens, take your medical news with a grain of salt, or perhaps a couple.

Robert H. Shmerling, M.D., is associate physician at Beth Israel Deaconess Medical Center and associate professor at Harvard Medical School. He has been a practicing rheumatologist for over 20 years at Beth Israel Deaconess Medical Center. He is an active teacher in the Internal Medicine Residency Program, serving as the Robinson Firm Chief.
He is also a teacher in the Rheumatology Fellowship Program.