National Information Center on Health Services Research and Health Care Technology (NICHSR)
HTA 101: Introduction to Health Technology Assessment Transcript
Event Started: 8/31/2011
Hello and welcome to HTA 101, Introduction to Health Technology Assessment. Our speaker today is Cliff Goodman. Cliff is a senior vice president and principal at the Lewin Group, a policy consulting firm located in Falls Church, Virginia. Cliff has 30 years of experience working with government, industry, and nonprofit organizations in health care evaluation, primarily in health technology assessment and policy analysis with related expertise in evidence-based medicine, outcomes research, health economics, regulatory policy, third-party payment and technological innovation.
In 1999, Cliff Goodman developed HTA 101, an introductory primer on health technology assessment for the National Library of Medicine, updated it in 2004, and is currently completing a new revision for 2011. Cliff is the chair of the Medicare Evidence Development and Coverage Advisory Committee (MEDCAC) for the Centers for Medicare and Medicaid Services. He is President of the professional society Health Technology Assessment international (HTAi) and is a fellow of the American Institute for Medical and Biological Engineering. He received a Doctor of Philosophy from the Wharton School of the University of Pennsylvania, a Master of Science from the Georgia Institute of Technology, and a Bachelor of Arts from Cornell University.
Thank you very much. I am glad to be here to reintroduce HTA 101, Introduction to Health Technology Assessment. We will cover a broad overview of information in the next hour. I will refer you to the slides, which will be posted later on, and to the posted version of the full HTA 101, which will be out in about a month or so. Here is a basic outline of what we will cover this afternoon. We will talk a little bit about the origins of technology assessment, then some description of what we mean by "health technology", some discussion of HTA and its role in health care, followed by an overview of health technology assessment methods. There will be a bit of discussion of priority setting in health technology assessment, as well as the timing of HTA and bibliographic sources for HTA, and finally we will talk about ten current trends in HTA.
Here is some information a lot of people do not even know. Technology assessment arose in the mid 1960s from a broader appreciation of the critical role of technology in our society, particularly its potential for unintended and sometimes even harmful consequences. The term technology assessment was first used in the mid 1960s on the floor of the U.S. House of Representatives. You will see when we talk about some examples of early technology assessments that their subjects are still familiar to us; for example, the implications of things like off-shore oil drilling, pesticides, automobile pollution, nuclear power plants, supersonic airplanes, and artificial hearts. Some of these problems about technology's unintended, as well as intended, consequences are still with us. I find them still fascinating as time goes on. What about health technology itself and some early assessments of it? These really got started in the very late 1960s and on into the mid 1970s. As you can see, several of the original topics have to do with things that challenge social or ethical norms. Oftentimes technology, and advances in technology, challenge norms in those areas, and these became some of the original subjects of technology assessments. In fact, some of those topics are still of interest today. That is a little bit about the background and origins of HTA.
Let’s make sure we first understand what we mean by health care technology. First of all, think of technology in a broad sense, not just as some kind of mechanical gizmo or a particular kind of molecule or something like that. It really is the practical application of knowledge, and in health care, it is the practical application of knowledge in the health care delivery system. Three ways to describe health care technology are: physical nature, clinical purpose, and stage of diffusion. I will go over these briefly. First is physical nature, with which I am sure you are most familiar. Drugs come in all different kinds. Biologics range from vaccines to blood products to even biotechnology-derived substances. Devices, equipment, and supplies range from cardiac pacemakers to MRI scanners to mosquito netting, of all things; this is a very broad definition. Medical and surgical procedures include acupuncture, bariatric surgery, and cesarean section; you see there is a broad range there. Now, some folks do not realize that technology in health care goes even beyond that scope. It also includes support systems like clinical laboratory systems, drug formularies, or electronic health record systems, and beyond that, even organizational and managerial systems. So when we talk about a vaccination program, or even a particular health care payment system, that is a form of applied knowledge and a form of health technology.
Technology can also be considered by its purpose. Prevention is not the same as screening. In screening, you look at populations of asymptomatic people. That is not the same as prevention, although it can ultimately contribute to prevention. Diagnosis is not the same as screening. Diagnosis refers to populations or individuals that have symptoms; there seems to be something wrong with them and we try to figure out what it is. Then there is treatment. Rehabilitation, let's say, is like after someone has had a stroke or a hip joint replacement. Palliation really is not the same as treatment; what we are trying to do is make life easier for people and improve their quality of life, in instances where they have gone beyond the point where you can treat them effectively. There are a half dozen or more types of purposes in health technology, and some devices or instrumentation can be used for multiple ones. Some can be used in diagnosis as well as treatment, so there is some crossover there. But again, there are multiple purposes of health technology. You can also think of technology with regard to its stage of diffusion, or where it is in its life cycle. It can be future, a kind of drawing-board stage, where you can envision it but have not put it together yet. Experimental typically refers to laboratory testing or testing in animals. We use the term investigational typically for clinical studies or testing in humans. Established refers to what is considered the standard approach, standard of care, or mainstream care. Then obsolete: toward the end of the life cycle of a technology, something may be outmoded or overtaken by other forms of technology. It is helpful to think of technology by its physical nature, its purpose, and its stage of diffusion. All three of those dimensions of health technology are important in the field of technology assessment in health care.
Now, I would like to remind folks about why we are doing this. As I noted earlier in one of the first slides, one of the concerns of HTA has to do with unintended, undesirable, or unexpected consequences of technology. Unfortunately, the list is long of technologies that were put into widespread practice probably a little bit too early, or were not taken out of practice until some harm had actually been done. Here are just ten examples, and the one at the top may be the best known: thalidomide, used for sedation in pregnant women. This was used in the very late 1950s and was pulled off the market in 1961 because it led to very severe birth defects. It also led to changes in how we regulate health care products in the United States and around the world. But there is a longer list of technologies that we thought at some point made sense, but once we got them out and started using them, things did not go as intended. In most of these cases we did not do a good enough job of assessing these technologies. I will just call your attention to another one in the middle, ABMT-HDC (autologous bone marrow transplantation with high-dose chemotherapy), used in women with metastatic breast cancer. That was brought into widespread use based on some substandard studies. It took us too long to do the really rigorous trials to figure out that ABMT-HDC really did not work, and unfortunately, a lot of women were harmed because of this. We learned many lessons about how to assess technologies in instances like that. This is a very interesting list, with some of these more recent than others. The idea here is: what are we doing prospectively, and while the technology is being used, to collect the right data to really keep on top of whether or not benefits continue to outweigh the risks? At the same time, we need to be cognizant of technologies that really do work well and might even be cost effective on top of that.
What are we doing to make sure the evidence for those is understood and disseminated to decision makers, including patients, practitioners, payers, and others? Here is a list of ten technologies that are underused today, for various reasons. We need to learn more about how to document these and share the information. The field of health technology assessment is not just about trying to keep bad things away; it is about trying to understand why good things work, demonstrating that, and making sure we provide the evidence in a way that is understandable and can be used by people in practice. There you have ten examples of the many technologies that are underused and might even be cost effective relative to those that are not so cost effective.
Now, let us get to health technology assessment itself. Here are five descriptors of health technology assessment. First of all, it is the systematic evaluation of the properties, effects, and other impacts of health technology. The main purpose is to inform policymaking. Note that it does not say to make the policies; HTA supports policymaking by providing evidence-based information. HTA may examine the direct and intended consequences of technologies, as well as their indirect and unintended consequences. I just gave you some examples a few slides ago of what can happen with the indirect and unintended consequences. HTA is typically conducted by interdisciplinary groups, where different types of experts with different viewpoints and perspectives work together, and it uses explicit analytical frameworks and a variety of methods. Those are five core characteristics of health technology assessment.
Who does health technology assessment? Well, here are half a dozen ways in which it can be done or performed by various parties. It can be used to advise a regulatory agency about allowing the marketing or use of a technology. It can be used to advise payers, such as national health authorities, health plans, or commercial insurers, about technology reimbursement. Technology reimbursement means coverage (whether or not to pay), coding, and payment amounts. It can be used to advise clinicians and patients. It can help managers of hospitals and other health care organizations make decisions about acquiring technology. It can support decisions by health technology companies themselves about development and marketing. Oftentimes they use the phrase "go/no-go decisions" when health technology companies are developing new drugs or biologics or devices. They need to do their own internal assessment to figure out whether there is an appropriate market opportunity and whether this is a technology that will work well in the market and whose benefits will outweigh its risks. Sometimes technology assessment is even done to support decisions by financial groups. I often get calls from venture capitalists wanting to know what we think about a technology. That is typically a shorter-turnaround technology assessment, but it does illustrate, as the full set does, the breadth of how and why technology assessment is performed and used. In recent years, much of the growth of HTA is attributable to greater interest on the part of payers in supporting their decisions. They are responsible for thousands or millions or even tens of millions of beneficiaries, so they need better information for making those kinds of decisions.
In HTA, what are the properties or impacts of technology that are assessed? In other words, what are we trying to figure out about technologies? Here are five main categories. The first is technical properties. For something along the lines of a magnetic resonance imaging unit or a CAT scanner, this might be the resolution of the picture or image and similar technical characteristics. Safety is an indicator of harm or of willingness to accept the risk of a technology. Sometimes you hear the phrase adverse events associated with technology, so safety is a category as well. Efficacy and effectiveness refer to how well the technology does what it is designed to do and how well it accomplishes those goals. There is an important distinction between those two terms, and we will get to that in a minute. The fourth main category deals with cost and other economic attributes: cost effectiveness, cost benefit, cost utility, and so forth, which we will mention a bit later. Finally, a broad category regards social, legal, ethical, and political impacts. Some technologies have very little impact along these lines, and others may have extraordinary impact. Consider the example of in vitro fertilization; certainly, that may have social, ethical, and legal impacts in certain circles. Other technology today, like stem cell use, is viewed as having certain social or legal implications. Oftentimes HTA is called upon to look at those properties and impacts as well. Whenever you think about HTA, generally the things that people care about will fall into one of these five main categories.
Let's return to this distinction between efficacy and effectiveness. This is an important distinction that has become much more pronounced over the last 20 years, and it goes like this. We use the term efficacy to refer to how well a technology works under ideal conditions of use. For example, under the strict protocol of a randomized controlled trial at a center of excellence, you would expect the performance of a technology to be about as high as it is going to get. Oftentimes it is efficacy data that we see when a product is approved, for example by the U.S. Food and Drug Administration; the data or evidence generated for that approval is typically generated under ideal conditions. But more and more we are saying, "That is interesting, but what I really want to know is how well this thing works in routine practice, in general use, and in community settings." For that, we reserve the term effectiveness. So effectiveness refers to how well a technology works under general or routine conditions, and you might consider that something whose efficacy is 90% under ideal conditions may have effectiveness of only 50% or 60%. If you are a doctor or a patient or even a payer, you increasingly want to know about effectiveness.
How do we measure efficacy or effectiveness? In other words, what are some of the attributes or dimensions of those? One major category is health outcomes, sometimes called end points, which measure benefits and harms: typically mortality, morbidity, and adverse events. Another category has to do with quality of life, a broad term that is sometimes mixed in with functional status or even patient satisfaction. We often ask not just about the traditional health outcomes (mortality, morbidity, and so forth); we also ask how this technology might affect the quality of life for a patient or their satisfaction with their state of health. Another category we call intermediate end points. Some intermediate end points are also surrogates for the final end points I mentioned before: things like blood pressure, laboratory values, or an EKG. These are often called biomarkers, and their significance is that it can sometimes be difficult to measure those final health outcomes, especially in the short term. If we know that an intermediate end point or biomarker is highly correlated with, or highly predictive of, mortality or morbidity, we sometimes have reason to use the intermediate end point as a surrogate for a health outcome or end point. Another category has to do with the accuracy of tests. Sometimes we ask, "How effective is this screening or diagnostic test?" We say that its efficacy or effectiveness is measured in terms of its sensitivity and specificity, its positive predictive value and negative predictive value, and so forth. Those are the main categories for measuring efficacy and effectiveness.
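The test-accuracy measures just named can all be computed from a simple 2x2 table of test results against true disease status. As a minimal sketch, with entirely hypothetical counts not drawn from any study in this talk:

```python
# Test-accuracy measures from a hypothetical 2x2 table of screening results.
# The counts below are illustrative only.
tp, fp = 90, 30    # positive test: true positives, false positives
fn, tn = 10, 870   # negative test: false negatives, true negatives

sensitivity = tp / (tp + fn)   # true-positive rate among the diseased
specificity = tn / (tn + fp)   # true-negative rate among the healthy
ppv = tp / (tp + fp)           # P(disease | positive test)
npv = tn / (tn + fn)           # P(no disease | negative test)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.3f} "
      f"PPV={ppv:.2f} NPV={npv:.3f}")
```

Note that sensitivity and specificity are properties of the test itself, while the predictive values also depend on how common the disease is in the screened population.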
Now, a particular type of outcome that is of considerable interest in the United States and around the world is one called the quality adjusted life year. We will use this as an example a little later in the hour, but let's just briefly talk about it now. It is widely accepted that a year of life spent in a good state of health is preferred to a year spent in a poor state of health. Here we start to talk about the term 'utility' or 'patient utility'. That refers to the relative preference or value that an individual, or even society, has for a particular state of health, and there are ways to measure it. Among them are the time tradeoff and the standard gamble, both derived from game theory, and there are some indexes such as the standard Short Form 36 (SF-36), the EuroQol, the Health Utilities Index, and the Quality of Well-Being scale. These are ways to actually assess patient utility. The quality adjusted life year is the unit for measuring outcomes of health care, and it combines length of life with quality of life. That is, years of life that are affected or increased by an intervention are weighted by one's utility for the quality of life during those years.
QALYs may be used as the unit of patient outcomes in cost utility analysis; we will talk about that a little later. What I would like to do is show you a picture of this concept. Again, a quality adjusted life year is length of life times its quality weight, as you see at the top. Here is a picture depicting someone toward the end of his or her life. Look at the years: zero (0) is today, and this person we expect will live another five years. That is length of life. On the vertical axis, we have the quality weight, which can range from zero to one, from death to perfect health respectively. You can see that in this circumstance the person today has a pretty good quality of life at 0.8, but this person is going to decline and will die in five years. So this picture describes this portion of the person's life in terms of quality of life and length of life. The person, by the way, is under some current therapy and will be under that therapy for the remainder of his or her life. What happens if we provide some sort of new therapy and ask how it affects this person's life? As you can see, the graph is doing a couple of things. One, the new therapy would lengthen the person's life by one year, from five to six, and for any given year of life, that person's quality of life is better. So the net difference conferred by the new therapy is shown in this yellow space, and basically what we are saying is that this new therapy has provided a gain in quality adjusted life years. It is represented by that yellow field. Hold on to that thought, because we will come back to it a little later.
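The yellow area on the slide can be approximated numerically: add up each year of life weighted by its quality weight, for each therapy, and take the difference. The year-by-year weights below are illustrative assumptions; only the 0.8 starting utility and the five- versus six-year survival come from the slide.

```python
# QALYs ~= sum over each year lived of that year's quality weight
# (0 = death, 1 = perfect health). Weights are hypothetical examples.
current = [0.8, 0.65, 0.5, 0.35, 0.2]      # declining; death at year 5
new_tx  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]   # higher each year; death at year 6

qalys_current = sum(current)               # 2.5 QALYs
qalys_new = sum(new_tx)                    # 3.9 QALYs
gain = qalys_new - qalys_current           # the "yellow area": 1.4 QALYs

print(f"QALYs gained by the new therapy: {gain:.1f}")
```

A fuller model would integrate a continuous quality-weight curve rather than summing yearly values, but the idea is the same: the gain combines both the extra year of life and the better quality in every year.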
Let's move on to three main groups of methods used in HTA. One can spend years going over these, and people have devoted their careers to advancing these methods, so we are going to provide a quick overview. The first category is primary data collection, which involves collecting original data, for example in a clinical trial or a set of observational studies (prospective studies, retrospective studies, or other kinds of studies that draw data directly from patients or subjects). Secondary or integrative analyses do not generate new data; what they do is combine or integrate data from existing sources. A secondary method might take data from multiple clinical trials and combine or integrate it. The third main category is economic analysis, where we weigh costs and, typically, costs against benefits, which may be outcomes or other results. We look at things like cost effectiveness analysis and others. So the three main categories are primary data collection, secondary or integrative analyses, and economic analysis. I would like to briefly go over some examples of each of those categories. First, for primary data methods, you have heard about things like strength of evidence or the principles of evidence-based medicine. These seven categories summarize what makes certain studies stronger than others. Prospective studies are stronger than retrospective studies, controlled studies are typically stronger than uncontrolled studies, and contemporaneous controls are stronger than historical controls. Randomization is typically a strengthening attribute of a clinical trial. A large study that has enough patients or subjects to detect a true treatment effect is typically stronger than a small study; sample size does matter.
Blinded studies, where patients and providers do not know which intervention is being used, are better than those that are not blinded, because their assessment of outcomes will not be biased by that knowledge. Finally, and this is important: when you read about a study in a journal or elsewhere, studies that clearly define their study populations, interventions, and outcome measures are typically stronger than those that do not. These are better-documented studies.
Those are seven principles or attributes that tend to strengthen evidence, and they are reflected when we use evidence appraisal for HTA: evidence hierarchies, evidence grading, and the like. Those hierarchies and grading schemes are based largely on these seven principles. As a matter of fact, here is a basic evidence hierarchy. This one actually starts with systematic reviews and meta-analyses of randomized controlled trials. What you see is the stronger material at the top and what tends to be the weaker material at the bottom. Toward the top are randomized controlled trials, then non-randomized trials, followed by observational studies, non-experimental studies, and finally expert opinion. These hierarchies come in different forms, but this is the general approach to them. In this one, RCTs are very strong, but if you can actually get a systematic review or meta-analysis that combines the existing RCTs, it may comprise even stronger evidence. Here is an example of a particular evidence grading framework. As a matter of fact, this one is called GRADE, and it is a fairly basic framework, but it still captures these concepts. You will see two broad categories, randomized trials and observational studies, on the left. This grading framework does not even consider things like expert opinion and other types. Behind the framework is a lot of documentation about what comprises each type of evidence: high-quality evidence, moderate-quality evidence, low-quality evidence, and very low. By the way, you will notice that an observational study starts as low-quality evidence at best, while randomized trials can start as high or sometimes moderate. What I like about GRADE is that, while it has criteria for quality of evidence based on study design, it can also bump up or push down the grade of a study based on other considerations, such as risk of bias, inconsistency, or indirectness.
It can also bump up a study to a higher grade based on things like the size of its treatment effect and other aspects that would suggest you are seeing evidence of a true causal effect between an intervention and an outcome. So, there are many types of these frameworks or hierarchies for grading evidence. I just wanted to show you one or two to give you a general idea, but there is a very rich literature on these. The good thing is that they continue to evolve; they are becoming more flexible and more adaptable to different kinds of interventions. I can tell you that the general trend is always to look for stronger evidence and better documentation of that evidence.
Let's look quickly now, just briefly, at some of these secondary or integrative analyses. Here is an array of six of them. These are not in any particular order or hierarchy, but do note that they are quite different in quality. Expert opinion may be just one expert's opinion, and you might think, well, that is probably not very strong. You might also get people together in a group for a group judgment or consensus development; at least that provides more input. An unstructured literature review is when somebody, or a group of people, looks at parts of the literature and puts together a review of it in some format. However, if it is unstructured, you have to wonder how systematic they were about doing this and what the strength of the findings was. A systematic literature review is something that says, prospectively: here is what we are going to do, here is our purpose, here are our inclusion criteria for studies, here are our exclusion criteria for studies, and this is how we are going to grade the evidence, and so forth. The state of the art of systematic literature reviews has advanced quite a lot in recent years. What often happens is that HTA groups will do a systematic literature review, or contract for one, and then provide it to groups of experts. Those experts will walk through the evidence, based on the systematic literature review, and render a report. Oftentimes they offer expert input, but they start with a systematic literature review. Meta-analysis refers to a statistical technique for combining or pooling data or findings from multiple studies. Modeling is a way to try to represent decision making in a dynamic way that may draw on multiple sources of evidence.
All of these are secondary or integrative analysis approaches. Let's look at a couple in particular. As I said, a systematic review is a structured form of literature review that addresses one or more key questions. You can see it involves specific steps and a well-planned approach to reviewing the literature. This is a better way of documenting it, replicating it, and holding it accountable. Some systematic reviews may include meta-analyses; sometimes one is actually part of the other. As I suggested earlier, systematic reviews are a key part of most health technology assessments these days. Let's talk briefly about meta-analysis. Meta-analysis refers to a set of statistical procedures for combining results from different studies. This combination may produce a stronger conclusion than you might get from any single study. Meta-analysis is often appropriate when the existing studies do not provide a definitive result; they may point in different directions, that is, they may be in disagreement, or the available studies may be too small to yield statistically significant findings. We may say that no single study answers our question, and/or the group of studies may point in different directions, so let's try to combine the data to get a better answer. That is basically a meta-analysis. Now, I will not go over this very good example in detail, but I want to tell you that this is a classic example of meta-analysis. I will talk about it briefly to give you an idea of what we mean when we talk about meta-analysis. This is a meta-analysis conducted on a series of studies of a clot-busting drug, streptokinase, used for people who are having an acute myocardial infarction, a heart attack. All these studies were comparisons of this drug versus some other standard of care or no care. There were 33 studies, conducted from 1959 all the way through 1988.
Along the left-hand side, you see these 33 studies. For each study, you see the year it was conducted and the number of patients in the study. What this graph shows is a ratio: the mortality rate of streptokinase patients compared to the mortality rate of people in the control groups. For streptokinase to show up well, you want that ratio to be less than one. So whenever you see a dot to the left of one, the ratio favored streptokinase; it had a lower mortality rate. Of course, you can see that in some instances the dot (the point estimate) fell above one, which means the control group actually did better. It is interesting to note that some of the small studies may have provided point estimates that were less than one, but you also see a horizontal line for each one. The horizontal lines are confidence intervals around those point estimates, typically 95% confidence intervals. If a confidence interval around a point estimate crosses the ratio of one, it means you are really not very sure that the point estimate is true, that is, that streptokinase actually did better than the control. So any time you see the confidence interval crossing one, you cannot draw a statistically significant conclusion about which treatment worked better, the streptokinase or the control. You can see right away that many of these lines cross one, which means we cannot tell. What do you do? Maybe you do a meta-analysis that combines the data. In short, what they did was pool the data for all of the patients in all these studies, and you can see down here that the total comes close to 37,000 patients. Now that is a big sample size! You see two things in the result. One, you see a point estimate, and that point estimate is indeed below one, which means the drug did have a favorable effect on mortality.
The other thing you notice is that you can barely see a horizontal line there for the confidence interval, and clearly that confidence interval does not come close to one. That means you can be statistically rather certain that the observed effect of the drug is a true effect.
What the researchers went on to do, which I think is quite fascinating, is this: they said, now we see that streptokinase is better after having done all these 33 trials. What if we had been doing the meta-analysis on a rolling or cumulative basis? That means every time a new clinical trial came in, we would combine it with the previous ones to see where we stood. That is what the authors did over here on the right with a cumulative meta-analysis. You can see under the 'number of patients' column that they were adding patients as they went, and in the end they still came up with that final total. What is important is that they found that if this cumulative meta-analysis had been done before the full meta-analysis was published in 1992, we could have known as early as 1977 that streptokinase worked, with an acceptable p value of .001. You see the confidence interval does not cross one, so the confidence interval is within bounds, and the p value at that point reaches a very acceptable level; they said you could have known more than a decade earlier that this stuff works. That is why I thought this was a quite interesting application of meta-analysis, and indeed it is a classic article in the field. That was a quick overview of meta-analysis. I will give you a few more seconds to look at this interesting set of graphs. Before we leave this, I encourage you to return to this slide and think about what a meta-analysis is and how it is used. Think about learning over time. Think about the magnitude of sample sizes, point estimates, and so forth, and think of this interesting application the authors made of cumulative meta-analysis.
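The pooling described above can be sketched in a few lines. This is a minimal fixed-effect meta-analysis using the inverse-variance method on log risk ratios, one common approach; the trial counts are hypothetical stand-ins, not the actual streptokinase data.

```python
import math

# Fixed-effect meta-analysis of risk ratios (inverse-variance on the log
# scale). Each tuple is a hypothetical trial:
# (deaths_treated, n_treated, deaths_control, n_control).
trials = [(12, 100, 20, 100), (30, 300, 45, 300), (8, 150, 15, 150)]

wsum = wx = 0.0
for dt, nt, dc, nc in trials:
    rr = (dt / nt) / (dc / nc)           # risk ratio for this trial
    var = 1/dt - 1/nt + 1/dc - 1/nc      # approximate variance of log(RR)
    w = 1 / var                          # inverse-variance weight
    wsum += w
    wx += w * math.log(rr)

mean_log = wx / wsum
pooled = math.exp(mean_log)              # pooled risk ratio
se = math.sqrt(1 / wsum)
lo, hi = math.exp(mean_log - 1.96*se), math.exp(mean_log + 1.96*se)
print(f"pooled RR = {pooled:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

With these illustrative numbers the pooled confidence interval lies entirely below one, which is exactly the situation the streptokinase pooling produced: individual trials too small to be conclusive, but a decisive combined estimate. A cumulative meta-analysis simply reruns this pooling each time a new trial is appended to the list.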
Modeling is another interesting approach to combining data from multiple sources and trying to draw inferences from those sources. Modeling involves a set of analytical techniques to simulate or represent real processes involving decisions and their outcomes. Oftentimes in modeling, you have a set of choices, and we need data from multiple sources to make a decision among these multiple choices. You will hear terms like Markov chain processes, decision analysis or decision trees, Monte Carlo simulations, even some very complicated simulations of disease processes, health care interventions, or health care systems. There is a whole variety of types of modeling that can be used. I want to show you one I think is interesting. Here is an instance of how to treat patients who have recurrent angina after coronary artery bypass graft surgery. In this instance, there were three ways to treat patients: medical management using drugs, PTCA (percutaneous transluminal coronary angioplasty), or having a second CABG (coronary artery bypass grafting). The group mapped out what would happen. If you went this route with medical treatment, you could get better, you could get worse, or you could die. If you do PTCA, what could happen? They would say you could have these five types of outcomes. With CABG, you could have these three types of outcomes. This shows why modeling is an integrative approach. Then you have to go to the literature and/or get expert input on the probability of each of these things occurring. If you have this health problem and you go with medical treatment with drugs, your chances of improving are 60%, your chances of getting worse are 34%, and your chance of dying is 6%. Those, of course, add up to one. The group went further and asked, well, what about the value of each outcome? How do you value the outcome? They might value this improvement at 0.8 on a scale of zero to one, value deterioration at 0.2, and value death at zero.
They calculated the expected value of each of these three interventions, where the expected value combines the probability of each outcome with the value placed on that outcome. In this instance, the calculated values indicated the preferred intervention, which in this case looks to be PTCA. You see this expected value is greater than this expected value, and this expected value is greater than this one. Of course, those of you who are aficionados of modeling know that it is important to conduct sensitivity analyses around all of these estimates so that you can get a better idea of the validity of these findings on expected value. This is the basic modeling example that is a decision tree. You use the literature and other kinds of input to calculate what happens, by combining the probability of certain outcomes with the values placed on those outcomes.
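The expected-value arithmetic just described can be sketched as follows. The medical-therapy probabilities (60%/34%/6%) and outcome values (0.8/0.2/0) come from the example above; the PTCA and CABG branch numbers are hypothetical stand-ins, chosen only so that PTCA comes out preferred, as in the talk, and are not the figures from the original model.

```python
# Expected value for each branch of a decision tree: sum of
# (probability of outcome) x (value of outcome) over the branch.
# Medical-arm numbers are from the talk; PTCA and CABG are hypothetical.

branches = {
    "medical": [(0.60, 0.8), (0.34, 0.2), (0.06, 0.0)],
    "PTCA":    [(0.70, 0.8), (0.24, 0.2), (0.06, 0.0)],  # hypothetical
    "CABG":    [(0.65, 0.8), (0.25, 0.2), (0.10, 0.0)],  # hypothetical
}

def expected_value(outcomes):
    # Probabilities on each branch must sum to one.
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
    return sum(p * v for p, v in outcomes)

for name, outcomes in branches.items():
    print(f"{name}: expected value = {expected_value(outcomes):.3f}")

best = max(branches, key=lambda k: expected_value(branches[k]))
print("preferred intervention:", best)
```

For the medical arm, the calculation is 0.60 x 0.8 + 0.34 x 0.2 + 0.06 x 0 = 0.548; a real analysis would then vary these inputs in a sensitivity analysis, as noted above.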
Let's take a breath now and look at economic evaluation. Economic evaluation typically involves looking at costs and consequences, or costs and outcomes. Even more so, we tend to think about the difference in cost between what we are doing now and something new, and the differences in the consequences or outcomes between what we are doing now and something new. That is the basic tradeoff, or balance, of economic evaluation. There are many types of economic evaluation used in HTA, and here are half a dozen or more of those. Cost of illness analysis involves looking at the economic impact of an illness or condition. Sometimes it does not even involve the health outcomes. What it really asks is: what does this cost? For example, what is the cost of asthma in the United States annually? What is the cost of diabetes in the United States annually, and so forth? Cost minimization analysis says, well, if you have alternative interventions, and if, and only if, we can assume that these alternatives produce equivalent outcomes, which is preferred? The least costly one; that is cost minimization. Cost effectiveness analysis looks at tradeoffs between the economic value of the investment weighed against the outcomes. It might be cost per death averted or cost per heart attack averted. The outcomes are typically these sorts of natural outcomes.
Cost consequence and cost utility analysis are types of cost effectiveness analysis. Cost utility analysis uses quality adjusted life years or other units of patient or societal utility for the outcomes, as opposed to cost effectiveness analysis in the narrower sense, which generally looks at natural outcomes such as deaths, heart attacks, and so forth. Cost benefit analysis is often very hard to do because not only do you look at the cost of providing a technology; you also need to assign a monetary or economic value to the outcomes. This is hard to do. Budget impact analysis asks: what is the impact of a new intervention or other technology on my budget, on my formulary budget, or on my overall health system budget? It is asking a particular question about how it is going to affect my budget or my budget silo. Those are more than half a dozen kinds of economic studies. Here is what one would look like insofar as the basic math. This is a cost effectiveness ratio. You can see that, as we pointed out in that slide with the balance in the beginning, we look at the difference in cost divided by the difference in effectiveness or outcomes. This is the difference between the cost of a new intervention and a comparator, divided by the difference in effectiveness between the new intervention and the comparator. What would the answer sound like? The answer might sound something like $45,000 per life-year saved, or $10,000 per lung cancer case averted. That is what a cost effectiveness ratio would sound like. You may also hear the term incremental cost effectiveness ratio, sometimes known as an ICER. That is a term you will often hear in this context. That is the basic equation for calculating cost effectiveness.
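Here is a minimal sketch of that ratio: difference in cost divided by difference in effectiveness. The drug costs and life-years below are invented, chosen only so the result lands on the $45,000-per-life-year figure mentioned above as an example of what an answer sounds like.

```python
# Incremental cost-effectiveness ratio (ICER):
# (cost_new - cost_comparator) / (effect_new - effect_comparator).
# All figures here are made-up illustrations.

def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio, e.g. dollars per life-year gained."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("no incremental effect; the ICER is undefined")
    return delta_cost / delta_effect

# Hypothetical: a new intervention costs $30,000 and yields 4.5 life-years;
# the comparator costs $12,000 and yields 4.1 life-years.
ratio = icer(30_000, 12_000, 4.5, 4.1)
print(f"ICER = ${ratio:,.0f} per life-year gained")
```

Note that the ICER only makes sense relative to a named comparator, which is why declaring the comparator is one of the study attributes discussed below.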
Now, sometimes when these things start getting complicated, I like to put it into two dimensions and think about cost effectiveness in these four quadrants. What do you have here? This is a representation of the cost and effectiveness of some current intervention. Here it is right here. The thing costs this much on the vertical scale, and its effectiveness is here on the horizontal scale. Here we are - this is our comparator, if you will. This is where we are today with our current intervention or standard of care, with a certain cost and a certain effectiveness. Now we introduce a new technology or new intervention and we ask, "Well, where does that fall within these four quadrants?" For example, it might be more effective and less costly and fall here. It may be less effective and less costly and fall in this quadrant. It may be less effective and more costly, or it may be more effective as well as more costly. We have begun to break this problem down a little bit. As a matter of fact, here is what we might be considering. If something is more effective and less costly, we are ready to adopt it. On the other hand, if it is less effective and more costly, we are more likely to reject it. Things get interesting in the other two quadrants. If something is more effective and more costly, we need to run a cost effectiveness analysis to really examine the tradeoffs there. In some instances we find ourselves in the quadrant where something is less effective but less costly, and in some settings or circumstances we might be interested in doing that. These four quadrants are a good way to think about or break down any cost effectiveness analysis. Stuff gets interesting when you ask, well, where in these quadrants are we willing to accept something? That is when we take a look as follows.
Now we say, hmm, well, in this quadrant where something is more effective and more costly: if it is a lot more effective and just a little more costly, we are more likely to adopt. But if it is just a little bit more effective and a lot more costly, we are maybe more likely to reject. Here is where you get these interesting tradeoffs. We are basically saying, it is better and it costs more, so how much are you willing to spend to get that incremental benefit? Are you here, or are you here? Maybe there is some line below which we are always going to be ready to adopt and above which we always need to reject, but where does that line go? Is there some kind of cutoff, some kind of implicit or explicit threshold for that? We get interesting discussions when we look around the world at some of the other wealthy nations and how they are making choices in their health care systems, and at whether or not they have thresholds in terms of cost per quality adjusted life year.
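The quadrant logic above, plus a threshold for the "more effective and more costly" quadrant, can be sketched as a small decision function. The $50,000-per-QALY willingness-to-pay threshold is purely illustrative and not a figure from the talk; real systems set such cutoffs implicitly or explicitly, as just discussed.

```python
# Classify a new intervention against the comparator using the four
# cost-effectiveness quadrants, with an illustrative willingness-to-pay
# (WTP) threshold applied in the "more effective AND more costly" quadrant.

def decision(d_cost, d_effect, wtp_threshold=50_000):
    """d_cost: incremental cost (new minus comparator).
    d_effect: incremental effectiveness, e.g. QALYs gained."""
    if d_effect > 0 and d_cost <= 0:
        return "adopt"                 # more effective, not more costly
    if d_effect <= 0 and d_cost >= 0:
        return "reject"                # less effective, not less costly
    if d_effect > 0 and d_cost > 0:
        # The interesting quadrant: is the extra benefit worth the extra cost?
        icer = d_cost / d_effect
        return "adopt" if icer <= wtp_threshold else "reject"
    return "consider"                  # less effective but less costly

print(decision(-1_000, 0.2))   # cheaper and better
print(decision(20_000, 0.5))   # ICER of 40,000, under the threshold
print(decision(40_000, 0.5))   # ICER of 80,000, over the threshold
```

The "consider" case mirrors the point above that a less effective but less costly option may still be attractive in some settings or circumstances.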
Well, when we look at cost study attributes - whenever I see a publication on a cost effectiveness analysis, I ask questions of these types, and there are many. This is also a checklist you can use to ask: how good is this study? Of course, we cannot go over all of these, but they include the comparator and the economic perspective, and whether they are using efficacy or effectiveness data. Are they using direct costs as well as indirect costs? Indirect costs meaning loss of productivity and so forth. There is a whole set of criteria that you might use to check out a cost study, and I just want to talk about a couple. One is economic perspective. You might say, cost for whom? Well, it depends on your perspective. The costs of doing something and the outcomes accrue differently from the standpoint of a patient compared to the perspective of a clinician, compared to the standpoint of a hospital, a national payer, a commercial payer, or even society at large. Is there some right economic perspective? Many economists will say that the true economic perspective is always that of society at large, and I understand those arguments. On the other hand, people will say that society is not the payer or the decision maker. The decision maker may be a patient, or the decision maker may be a payer making a coverage decision, and the costs look different from their respective perspectives. A cost study should always declare its economic perspective. Another interesting attribute is the time horizon of a study. Here is an interesting example of how costs and health accrue differently over time. Here is a rough rendering of some kind of intervention that costs money that accumulates over time, and then we look at the benefit to health over time.
You can see clearly that we invest in some kind of a program, let's say a smoking cessation program; we start spending money and it accumulates over time, and we might have spent most of our money after ten years and continue to spend some after 20 years. Just because we invested in this health program does not mean the health impact starts immediately. In this example, the health impact does not start to kick in until close to a decade has passed, and then it starts to improve. So here, the health improvement lags the investment. Why is this important? It is important if you are going to do a cost effectiveness analysis and ask: when do we do this? At what point do we ask how costs relate to benefits? At what point do the outcomes outweigh the costs? Well, here is a rendering of that. If we did our cost effectiveness analysis at this point, we would say we have started to spend quite a bit of money and almost nothing has happened to the health impact. After ten years, we would say we have spent most of our money and the health impact is just starting to be felt. It is not until 15 years that we finally experience most of our health impact. Then after 20 years, we realize we have received nearly all of our health benefits. The fair question is: when is the right time to conduct this cost effectiveness analysis? You always want to pay careful attention to the time point and study duration the investigator chooses. This is very interesting to consider.
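A small numerical sketch of this point: with invented cumulative cost and QALY figures shaped like the smoking cessation example (spending front-loaded, benefit lagging), the apparent cost per QALY depends dramatically on where the analyst stops the clock. None of these numbers come from the talk.

```python
# Hypothetical cumulative spending ($) and cumulative QALYs gained for a
# prevention program evaluated at different time horizons. Spending is
# front-loaded; health benefit lags by about a decade, as in the example.

horizon  = [5, 10, 15, 20]                                  # years
cum_cost = [4_000_000, 7_000_000, 8_000_000, 8_500_000]     # cumulative $
cum_qalys = [10, 120, 600, 900]                             # cumulative QALYs

for years, cost, qalys in zip(horizon, cum_cost, cum_qalys):
    print(f"at {years:2d} years: ${cost / qalys:,.0f} per QALY gained")
```

Evaluated at five years the program looks wildly unattractive, while at twenty years it looks quite reasonable, which is exactly why the investigator's choice of time horizon deserves scrutiny.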
Back quickly to cost utility analysis. You will remember this drawing. Now, my question to you is (the drawing has not changed since before): how much are we willing to spend for this difference here? Certainly this person has experienced an improvement or a gain in quality adjusted life years, but what about the investment in that? I give you this classic example done 20 years ago by Alan Maynard in the U.K. He and his team put together the existing data at the time on the cost per quality adjusted life year of a whole set of interventions compared to their standard-of-care comparators. If you were going to invest in cholesterol testing and diet therapy for this group and wanted to buy the next quality adjusted life year, you would have to spend 220 British pounds sterling. On the other hand, if you were going to invest in a kidney transplant and wanted to buy your next quality adjusted life year, you would have to spend 4,700. If you were spending on erythropoietin treatment for people with dialysis anemia, who are very sick as a matter of fact, the assumption here is that there is no increase in survival but you might have a small improvement in quality of life. If you wanted to buy one quality adjusted life year doing that, you would have to spend 126,000. The point made by Maynard and others was: if you have fixed resources and you are trying to optimize the health gain, how do you start spending your money? The point they tried to make was that you want to buy as many quality adjusted life years as you can with the money you have, so you basically start spending at the top of this list. You unfortunately might run out of money before you got to the bottom of the list. The point was that there are real tradeoffs being made here, and it is important and useful to consider what you are buying in health care in terms of cost per quality adjusted life year. Now, of course, there is a lot of controversy involved here, and there is a lot of methodological discussion about the right way to do this.
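Maynard's point about buying QALYs cheapest-first under a fixed budget can be sketched as a greedy allocation. The cost-per-QALY figures echo the ones above (in pounds); the total program costs and the budget are hypothetical numbers invented for the illustration.

```python
# "League table" reasoning: with a fixed budget, fund the interventions
# that buy QALYs most cheaply first. Cost-per-QALY figures are from the
# Maynard example above; program sizes and the budget are hypothetical.

interventions = [
    # (name, cost per QALY in GBP, total cost to fund the program in full)
    ("cholesterol testing + diet therapy", 220, 1_000_000),
    ("kidney transplant", 4_700, 3_000_000),
    ("EPO for dialysis anemia", 126_000, 5_000_000),
]

def allocate(budget):
    """Greedy allocation: buy the cheapest QALYs first until the budget runs out."""
    funded, qalys = [], 0.0
    for name, cost_per_qaly, program_cost in sorted(interventions, key=lambda x: x[1]):
        spend = min(program_cost, budget)
        if spend <= 0:
            break
        qalys += spend / cost_per_qaly
        budget -= spend
        funded.append((name, spend))
    return funded, qalys

funded, total_qalys = allocate(budget=4_000_000)
for name, spend in funded:
    print(f"fund {name}: spend £{spend:,}")
print(f"total QALYs purchased: {total_qalys:,.0f}")
```

With this made-up budget, the money runs out before the bottom of the list is reached, which is precisely the tradeoff Maynard wanted to make explicit.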
However, it is a useful way of making explicit what we are getting, in terms of health for our beneficiaries and our population, from these investments. A current example of cost utility analysis can be found in the cost-effectiveness registry put together by Tufts Medical Center. I will not go over this example in detail, but it is an example of CUA that compared aspirin and clopidogrel, which are antiplatelet therapies for secondary prevention of coronary heart disease. They looked at RCT data to figure out what we knew about the effects of each of these antiplatelet therapies. Sometimes they may be used independently or together. They noted that there is a cost difference: aspirin was only four cents a day; clopidogrel was three dollars a day when the study was done. They put together a simulation to try to figure out what the cost per quality adjusted life year would be of alternative ways of using aspirin and clopidogrel. Here you see on the right-hand side the cost per quality adjusted life year gained with different ways of using these. $14,000 and $39,000 per quality adjusted life year tend to be generally acceptable levels in cost utility terms. But if you start doing some other combinations, there are different results. For example, you give clopidogrel to all patients, compared to giving aspirin to all eligible patients and clopidogrel only to those patients who cannot tolerate aspirin. Now you are starting to talk about $320,000 per quality adjusted life year, which is a lot heftier than some of those other amounts. In any case, I refer you to the cost effectiveness analysis registry at Tufts for a whole catalog of these kinds of calculations. It really provides very useful insights about cost utility analysis.
In health technology assessment, what gets the attention? How are priorities set in HTA? Different kinds of HTA organizations have different sets of priorities or criteria. Typically you will see sets that look like this, because we cannot assess everything all the time. What gets their attention? Well, things that involve a high individual burden of mortality or morbidity, things that affect a lot of patients, and things that have a high cost, either per person or over a population - a high unit or aggregate cost of technology. Sometimes we see substantial variation in practice, which suggests that we do not know enough about how the technology is being used. There is a whole set of criteria that we might use in different combinations to try to understand at what point, and how, we make choices about which technologies we should assess. What about when we assess the technology? We call this the timing of the assessment. A colleague, Martin Buxton in the U.K., has this well-recognized quote: "It is always too early to assess a technology until suddenly it is too late." Martin made this remark in the context of looking at heart transplantation in the U.K. He and others observed that those who are advocates for a new technology really want to see it used. They say, "It is new, so do not assess it quite yet, because we do not have all the data; we are still perfecting it." But if you wait too long, something may be well out in the marketplace or the delivery system, and we might realize too late that it is really not providing a good balance of costs and benefits, or benefits and harms. There is always that tradeoff about too early or too late. Here are some things to think about insofar as the timing of assessment. There is no single correct time to do an HTA in the course or lifespan of a technology. There may be multiple times when it is important to do an HTA. It is also conducted to meet the needs of various sorts of policymakers.
The stakeholders always desire some sort of transparency and predictability in when these things are done, and again, there are tradeoffs about when to assess. Finally, there is the moving target problem. We call it the moving target problem because by the time you decide to assess something, collect all the information, conduct a systematic review and other analyses, and get a report out, things have changed. The technology you are assessing may have changed or evolved, the comparator may have changed, and/or the populations in which the technology is used may have changed. Therefore, we always have to deal with the moving target problem in health technology assessment.
When we do HTAs, especially when we are doing systematic literature reviews and pulling the evidence together, it is important to go to useful sources to get those data and literature in an efficient way. Here you see some of the main categories of bibliographic databases, starting with PubMed, which includes MEDLINE and other databases put together by the National Library of Medicine. Embase is a commercial bibliographic database that pulls from many sources. The Cochrane Library has a whole set of resources that are often used in technology assessment. In particular, the Library includes databases of systematic reviews, which are very valuable. There are a couple of economic databases of cost effectiveness analyses and so forth. One is put together by the NHS in the U.K., and there is something else called the Health Economic Evaluations Database. There is also the CEA registry by Tufts, among others. These are certainly not the full set of bibliographic databases for HTA, but these tend to be the ones most often used to support health technology assessment.
Let's finish off with ten current trends in health technology assessment. There are more trends, of course, but I think these provide a pretty good overview of what is going on. First of all, there is greater demand for HTA overall. More stakeholders, more decision makers, and more policymakers need HTA to support their decisions. Second, over the years, the processes of HTA have become more transparent, more systematic, and more consultative. They are more open; they involve more people, more public notice, and more input from patients and other stakeholders. Third, as I mentioned before, standards of evidence are being raised and refined, and broader evidence appraisal hierarchies are being used. Fourth, in line with the efficacy and effectiveness distinction, there is more interest in evidence from real-world practice. We do not get these data just from clinical trials; oftentimes, RCTs do not capture them well at all. We get more from registries, surveillance, and practical clinical trials, where we capture real-world data. You have heard the term comparative effectiveness research, which draws on head-to-head comparisons of effectiveness, not efficacy. Number five is more specificity in HTA findings by patient subgroup, practice setting, and provider experience. We not only want to know how something works in general; we also want to know how it works for different kinds of patients: elderly versus younger, people with different socioeconomic characteristics, people with certain comorbidities, and so on. We want the findings of technology assessment to be broken down for more specific groups. Next, we see a greater emphasis on cost effectiveness and related economic impacts, so the state of the art of those cost studies I showed you before continues to improve. There is also greater and broader use of systematic reviews, meta-analyses, and other synthesis methods.
We are getting better at developing these methods and applying them in a more systematic and documented way. Especially in this age, there is also instant, usually low-cost, international access to published evidence, to most completed HTA reports, and to awareness of ongoing health technology assessments. So when an assessment is done, it is up there, it is out there, and people around the world can get it instantly. That is an important consideration in the diffusion of HTAs themselves. There is greater international collaboration and exchange in HTA methods, expertise, and the reports themselves. And then finally, and this is quite interesting, especially in the past few years, there is greater attention to the need to coordinate or align HTA to support market approval and payment functions. That is, there are efforts to coordinate and share information among those who do HTA, those who regulate products like drugs, biologics, and devices, and those who pay for them. These people are talking to each other more; they all care about evidence, and they all care about improving care for their beneficiaries or populations. They also care to talk more about how they can, together, encourage stronger evidence, encourage evidence about subgroups, encourage more alignment of the kinds of health outcomes and patient-reported outcomes that are used, and so forth. That is a quick set of ten current trends in HTA.