
Quality of Antibiotic Use for Lower Respiratory Tract Infections at Hospitals: (How) Can We Measure It?

J. A. Schouten (1, 2, 3), M. E. J. L. Hulscher (1), H. Wollersheim (1, 3), J. Braspenning (1), B. J. Kullberg (2, 3), J. W. M. van der Meer (2, 3), and R. P. T. M. Grol (1)

  1. Centre for Quality of Care Research, Nijmegen, The Netherlands
  2. Nijmegen University Centre for Infectious Diseases, Nijmegen, The Netherlands
  3. Department of General Internal Medicine, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands

Reprints or correspondence: Dr. J. A. Schouten, Centre for Quality of Care Research (KWAZO 229), Radboud University Nijmegen Medical Centre, P.O. Box 9101, Nijmegen, The Netherlands (J.Schouten{at}aig.umcn.nl).

Abstract

Background. To assess and improve the quality of antibiotic use in patients with community-acquired pneumonia (CAP) and acute exacerbation of chronic bronchitis or chronic obstructive pulmonary disease (AECB), a valid set of quality indicators is required. This set should also be applicable in practice.

Methods. Guidelines and literature were reviewed to derive potential indicators for quality of antibiotic use in treating hospitalized patients with lower respiratory tract infection (LRTI). To assess the evidence base of each indicator, a literature review was performed. Grade A recommendations were considered valid. For grade B–D recommendations, an expert panel performed a consensus procedure on the indicator's relevance to patient health, reduction of antimicrobial resistance, and cost containment. To test applicability in practice, feasibility, opportunity for improvement, reliability, and case-mix stability were determined for a data set of 899 hospitalized patients with LRTI.

Results. None of the potential indicators from guidelines and literature were supported by grade A evidence. Nineteen indicators were selected by consensus procedure (12 indicators for CAP and 7 indicators for AECB). Lack of feasibility and of opportunity for improvement led to the exclusion of 4 indicators. A final set of 15 indicators was defined (9 indicators for CAP and 6 indicators for AECB).

Conclusions. A valid set of quality indicators for antibiotic use in hospitalized patients with LRTI was developed by combining evidence and expert opinion in a carefully planned procedure. Subjecting indicators to an applicability test is essential before using them in quality-improvement projects. In our demonstration setting, 4 of the 19 indicators were inapplicable in practice.

Community-acquired lower respiratory tract infection (LRTI) is a common cause of acute illness in adults. The spectrum of disease ranges from mild mucosal colonization or infection, to acute bronchitis or acute exacerbation of chronic bronchitis or chronic obstructive pulmonary disease (AECB), to overwhelming parenchymal infection in patients with community-acquired pneumonia (CAP). Antibiotic treatment is rarely indicated for acute bronchitis and is sometimes indicated for the more severe cases of AECB, but it is always indicated for CAP. It may be difficult to differentiate between viral and bacterial LRTI or between bronchitis, AECB, and CAP. This may be one of the reasons why antibiotics are prescribed to more than two-thirds of patients with LRTI in Europe and the United States. In view of the worldwide development of antibiotic resistance, this is not a desirable situation [1].

Recommendations for the rational use of antibiotics in hospitalized patients with LRTI have been formulated in national and international guidelines [2–7]. Guidelines describe, in essence, “the right thing to do.” They assist in making practitioner and patient decisions prospectively for specific clinical circumstances. To make a valid and reliable assessment of current practice in patients with LRTI, key recommendations from these guidelines can be translated into measurable elements, so-called “indicators” [8]. Indicators serve as measurable elements of practice performance for which there is evidence or consensus that they can be used to assess the quality (and, therefore, change in the quality) of care provided [9].

Several articles have suggested quality indicators for the management of CAP, but the results have often varied in terms of their relevance, scientific soundness, and interpretability [10]. To our knowledge, no quality indicators have been suggested for antibiotic use in hospitalized patients with AECB. Where possible, indicators should be based directly on scientific evidence. However, as in many fields of medicine, there is only a limited scientific basis for recommendations regarding antibiotic use in cases of LRTI [10]. To develop valid quality indicators, it is necessary to use a systematic procedure that combines available evidence and expert opinion to assess additional aspects of care for which evidence alone is insufficient, absent, or methodologically weak [11, 12].

Development of valid quality indicators is, however, not enough. Validity in itself does not guarantee applicability in a specific setting. To assess applicability, again, a rigorous approach is required: indicators should be tested on important clinimetric characteristics, such as feasibility, reliability, opportunity for improvement, and case-mix stability [13].

In this article, we describe how we systematically developed a valid set of quality indicators for antibiotic use in hospitalized patients with LRTI. In addition, we describe a method to assess their applicability in daily practice. We used our quality-improvement project for antibiotic use in LRTI in 8 Dutch hospitals as a test case.

Methods

A set of indicators was developed in 4 steps. Applicability was tested in 5 steps. Figure 1 shows a flowchart detailing the steps involved in developing the indicators and testing their applicability.

Figure 1

Procedural flowchart showing the steps involved in the development of indicators for assessing and improving the quality of antibiotic use in patients with community-acquired pneumonia (CAP) and acute exacerbation of chronic bronchitis or chronic obstructive pulmonary disease (AECB), as well as the steps involved in testing the applicability of those indicators. ATS, American Thoracic Society; BTS, British Thoracic Society; ERS, European Respiratory Society; IDSA, Infectious Diseases Society of America; QI, quality improvement.

Development of a Valid Set of Indicators

Preselection of potential indicators. Four independent investigators (J.S., M.H., and 2 guideline experts) preselected key recommendations from national guidelines [6, 7]. Quality indicators that were published in international guidelines for CAP [2–5] or in the literature [10, 14–33] were added to the list of potential indicators (table 1 and table A1 in the Appendix). For the latter purpose, a literature search was performed (table 2).

Table 1

Rating and adding procedure for the development of quality indicators for antibiotic use in lower respiratory tract infection.

Table 2

Summary of a systematic literature search for quality indicators for antibiotic use in the treatment of community-acquired pneumonia (CAP) and acute exacerbation of chronic bronchitis or chronic obstructive pulmonary disease (COPD).

Evidence-based assessment of the indicators. Every potential indicator was investigated to determine the degree of scientific evidence that linked indicator performance to outcome (i.e., mortality, morbidity, length of hospitalization, and cost-effectiveness). We started by reviewing whether the source (guideline or literature) of the potential indicator specified any references. A search of the PubMed database was then performed using search terms specific to the quality indicator topic. On the basis of the available literature, all of the potential indicators were given 1 of 4 grades (A–D) of supporting evidence (table A2 in the Appendix). Potential indicators with contradictory evidence were excluded, and grade A recommendations were immediately accepted as valid (i.e., evidence-based) indicators. The remaining indicators (grades B, C, or D) were tested further in an expert consensus procedure.

Table 3

Applicability of quality indicators for antibiotic use in 443 patients hospitalized with community-acquired pneumonia in 8 Dutch hospitals.

Table 4

Applicability of quality indicators for antibiotic use in 456 patients hospitalized with acute exacerbation of chronic bronchitis or chronic obstructive pulmonary disease.

Table A1

International comparison of quality indicators for community-acquired pneumonia (CAP), by group or organization.

Table A2

Level of supporting evidence linking indicator performance to outcome.

Rating and adding procedure by an expert panel. A panel of 11 opinion leaders in medical microbiology, infectious diseases, respiratory medicine, and quality-of-care medicine was asked to conduct a consensus procedure for the preselected set of indicators [11, 34]. In the 2-round consensus procedure, the panel judged the potential indicators on the basis of 3 criteria: (1) clinical relevance to the patient health benefit, (2) relevance to reducing antimicrobial resistance, and (3) relevance to cost-effectiveness. A 5-category Likert scale was used that varied from “completely disagree” (category 1) to “completely agree” (category 5). An extra answer category could be marked if the expert could not decide about a particular question. A definition of these constructs was provided in the covering letter. In the second round, the expert panel had the opportunity to comment on the proposed indicators and to add or modify potential indicators for evaluation.

Only indicators with >70% agreement between the experts on 1 criterion were selected in the first round. Indicators with >70% disagreement on all 3 criteria were rejected [34]. All of the other indicators, including those added or modified by the experts, were reevaluated in the second round.
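The first-round decision rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes that "agreement" means a Likert rating of 4 or 5, that "disagreement" means a rating of 1 or 2, that "cannot decide" answers (coded here as None) are left out of the denominator, and that agreement on at least 1 criterion is sufficient for selection. The function names and criterion labels are hypothetical.

```python
from typing import Dict, List, Optional

# One list of ratings per criterion; None marks the "cannot decide" answer.
Ratings = List[Optional[int]]


def share(ratings: Ratings, low: int, high: int) -> float:
    """Fraction of decided ratings that fall in the range [low, high]."""
    decided = [r for r in ratings if r is not None]
    if not decided:
        return 0.0
    return sum(low <= r <= high for r in decided) / len(decided)


def first_round_decision(by_criterion: Dict[str, Ratings],
                         threshold: float = 0.70) -> str:
    """Apply the >70% agreement / >70% disagreement rule to one indicator."""
    # Assumption: ratings 4-5 count as agreement, ratings 1-2 as disagreement.
    agree = [share(r, 4, 5) for r in by_criterion.values()]
    disagree = [share(r, 1, 2) for r in by_criterion.values()]
    if any(a > threshold for a in agree):
        return "selected"
    if all(d > threshold for d in disagree):
        return "rejected"
    return "second round"


# Made-up ratings from an 11-member panel for a single indicator.
example = {
    "patient_health": [5, 5, 4, 4, 4, 5, 4, 4, 3, 2, None],
    "resistance": [3] * 11,
    "cost": [2] * 11,
}
decision = first_round_decision(example)
```

With these invented ratings, 8 of the 10 decided experts agree on the patient health criterion, so the indicator is selected in the first round.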

Final set of indicators. The final step in devising the set of indicators consisted of operationalizing them by defining numerators and denominators. An algorithm for every indicator revealed how it had been deduced from the available data.

Assessment of Applicability of Quality Indicators in a Specific Patient Sample

Setting and study population. To test applicability in a specific setting, feasibility of data collection, reliability, opportunity for improvement, and case-mix stability were determined in a demonstration data set (Dutch LRTI quality-improvement project). A prospective observational audit was performed at 8 medium-sized hospitals, including both teaching and nonteaching facilities, in the southeastern part of The Netherlands. Patients with CAP and AECB were selected on the basis of formal inclusion criteria.

Data collection. During a 6-month period, trained research assistants made twice-weekly reviews of the charts of all of the patients admitted to internal medicine and respiratory medicine hospital wards. All of the relevant patients were followed up during their period of hospitalization and until 30 days after discharge from the hospital. Data were collected from admission sheets, medical and nursing records, medication charts, and microbiological and radiological testing reports. After recording data on the preprinted standardized data forms, 2 assistants entered the results into a database.

Applicability steps and analysis. Clinimetric characteristics of quality indicators—including feasibility, reliability, opportunity for improvement, and case-mix stability—were defined and determined in the demonstration data set. In addition, factor analysis was performed for data (i.e., indicator) reduction purposes.

Feasibility. Feasibility of data abstraction was defined as the percentage of missing values per indicator (i.e., the percentage of indicator values that could not be calculated because ⩾1 element of the algorithm could not be retrieved from the available records). Feasibility was considered poor if this percentage exceeded 25%.
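As an illustration of this definition, the feasibility check might be computed as follows. The sketch assumes that each patient contributes one indicator value and that a value that could not be calculated is stored as None; the function names and the sample data are hypothetical.

```python
from typing import List, Optional


def missing_percentage(values: List[Optional[bool]]) -> float:
    """Percentage of patients for whom the indicator could not be calculated."""
    if not values:
        raise ValueError("at least one patient record is required")
    missing = sum(1 for v in values if v is None)
    return 100.0 * missing / len(values)


def feasibility_is_poor(values: List[Optional[bool]],
                        cutoff: float = 25.0) -> bool:
    """Feasibility is poor when the missing percentage exceeds the cutoff."""
    return missing_percentage(values) > cutoff


# Made-up sample: 100 patients, 55 of whom lack a usable value.
sample = [None] * 55 + [True] * 30 + [False] * 15
pct_missing = missing_percentage(sample)
poor = feasibility_is_poor(sample)
```

Because the text says feasibility is poor when the percentage "exceeded" 25%, the comparison is strict, so an indicator with exactly 25% missing values is retained.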

Reliability. To assess the reliability of our data collection, the percentage of agreement between 2 data reviewers on the level of indicator outcome, corrected for chance, was expressed in κ coefficients. A sample consisting of 10% of the records of 2 hospitals was collected by 2 independent data reviewers. Scores of 0.41 ⩽ κ ⩽ 0.6 were considered to be moderate, 0.61 ⩽ κ ⩽ 0.8 were considered to be good, and κ > 0.8 were considered to be very good [35]. Values of <0.4 were considered to be poor and led to elimination of the indicator.
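A chance-corrected agreement coefficient for 2 reviewers can be sketched as below. The sketch assumes that the κ coefficient is Cohen's κ, which the text does not state explicitly. The thresholds in the text leave small gaps (between 0.4 and 0.41 and between 0.6 and 0.61); the way these are closed in classify_kappa is also an assumption, as are the function names and the ratings.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters scoring the same records."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty sample")
    n = len(rater_a)
    # Observed proportion of records on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1.0 - expected)


def classify_kappa(kappa: float) -> str:
    """Map kappa to the labels used in the text (gap handling is assumed)."""
    if kappa > 0.8:
        return "very good"
    if kappa > 0.6:
        return "good"
    if kappa > 0.4:
        return "moderate"
    return "poor"


# Made-up indicator outcomes (1 = met, 0 = not met) from two reviewers.
reviewer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
reviewer_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

In this invented sample the reviewers agree on 8 of 10 records and chance agreement is 0.5, which gives a κ of about 0.6.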

Potential opportunity for quality improvement. Quality measures must be capable of detecting changes in the quality of care to discriminate between and within subjects. If indicator performance is invariably high, with little variation, this renders an indicator less sensitive and thus less successful as an indicator. From the viewpoint of internal quality improvement, indicators with a performance score >85% were defined as having limited room for improvement. Indicators with a performance score >85% in all participating hospitals were not selected [8, 36].
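The exclusion rule for limited room for improvement can be expressed as a short check. This is a sketch under the assumption that a performance rate is the numerator divided by the denominator of the indicator, expressed as a percentage; the hospital labels and rates are invented for illustration.

```python
from typing import Dict


def performance_rate(numerator: int, denominator: int) -> float:
    """Percentage of eligible patients in whom the indicator was met."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def limited_room_for_improvement(rates: Dict[str, float],
                                 cutoff: float = 85.0) -> bool:
    """True only when every hospital scores above the cutoff."""
    return all(rate > cutoff for rate in rates.values())


# Made-up rates for 8 hospitals: uniformly high performance.
uniformly_high = {f"hospital_{i}": 92.0 + i for i in range(8)}
# Same rates, but one outlier hospital performs at 73%.
with_outlier = dict(uniformly_high, hospital_3=73.0)

exclude_first = limited_room_for_improvement(uniformly_high)
exclude_second = limited_room_for_improvement(with_outlier)
```

A single hospital at or below the cutoff is enough to keep the indicator in the set, because the rule requires high performance in all participating hospitals.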

Case-mix stability. Case-mix stability is an important indicator asset, enabling application of an indicator to monitor quality in a specific hospital over time and to compare hospitals of different sizes and settings [13]. The relationship between certain patient characteristics and the indicator result was analyzed to decide whether correction for case mix was necessary. In the CAP indicators, we studied the distribution of outcome according to age (either ⩾70 years or <70 years), sex, and Pneumonia Severity Index [37] (either ⩽III or >III). For AECB, no validated severity-of-illness score was available, so we used the most recent forced expiratory volume in 1 second (FEV1) value (expressed as a percentage of the predicted value) as a substitute. The need for case-mix correction did not lead to exclusion from our final set.
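One way to examine whether an indicator result depends on a dichotomous patient characteristic is a test on a 2 × 2 table. The text does not name the statistical test that was used, so the Pearson χ² test without continuity correction shown here is an assumption, and the counts are invented. For 1 degree of freedom, the P value can be obtained from the complementary error function in the standard library.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df, no continuity correction).

    a, b: indicator met / not met in group 1 (e.g., age >= 70 years)
    c, d: indicator met / not met in group 2 (e.g., age < 70 years)
    Returns the chi-square statistic and its P value.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, P(X > chi2) equals P(|Z| > sqrt(chi2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: indicator met in 80 of 100 older and 50 of 100 younger patients.
chi2, p_value = chi_square_2x2(80, 20, 50, 50)
needs_case_mix_correction = p_value < 0.05
```

A small P value would suggest that performance scores on this indicator should be interpreted with the patient characteristic in mind, which matches how the need for case-mix correction is handled in this study: it informs interpretation but does not exclude the indicator.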

Factor analysis. Factor analysis was performed to detect relationships between indicators, thus potentially leading to a reduction in the number of indicators [36]. To perform data reduction through factor analysis, a minimum of correlation between the items is required. We used Bartlett's sphericity test (in which P should be <.05), Kaiser-Meyer-Olkin measure of sampling adequacy (MSA) (in which MSA should be >0.5), and R² (in which R² should be >0.20). This procedure was performed separately for the sets of CAP and AECB indicators.
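For reference, the first two criteria have standard textbook definitions, written out below. The notation is an assumption and does not come from the article: n is the number of patients, p is the number of indicators, R is the p × p correlation matrix of the indicators, r_ij are its off-diagonal entries, and a_ij are the corresponding partial correlations.

```latex
% Bartlett's test of sphericity: tests whether R is an identity matrix
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert\mathbf{R}\rvert,
\qquad \mathrm{df} = \frac{p(p-1)}{2}

% Kaiser-Meyer-Olkin measure of sampling adequacy
\mathrm{MSA} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}
```

When the indicators are nearly uncorrelated, the determinant of R is close to 1, the χ² statistic is small, and the test is not significant, so there is little shared variance for factor analysis to summarize.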

Results

Development of a valid set of indicators. In the first step, 10 potential indicators for CAP and 5 for AECB were preselected from national and international guidelines and the literature (figure 1 and table 2). No “good supporting evidence” (grade A) could be found that linked process to outcome in any of these indicators. None of the indicators had to be excluded because of contradictory evidence (table 1).

All 15 potential indicators were entered into the iterated consensus procedure. In the first round, 4 recommendations were immediately selected for both CAP and AECB. No indicators were eliminated. Six new items (4 for CAP and 2 for AECB) were added by the expert panel. One potential indicator was revised, and another was split into 2 separate indicators. In the second round, 8 potential indicators (5 for CAP and 3 for AECB) out of 14 (11 for CAP and 3 for AECB) remaining recommendations were selected. The ultimate set was considered to be valid. It consisted of 12 indicators for CAP and 7 indicators for AECB.

Assessment of applicability of quality indicators in a specific patient sample. All 19 validated indicators were tested using a sample of 443 hospitalized patients with CAP and 456 hospitalized patients with AECB (tables 3 and 4). A review was made of the distribution of performance of the indicators over the 8 hospitals. Although there was wide variability in outcome (table 3), this was not because of the patient mix (data not shown).

Feasibility. Feasibility of 3 indicators was poor in our demonstration data set of Dutch patients with LRTI. Performing cultures of blood and sputum samples before the first antibiotic dose was administered showed poor feasibility: 55% of the subjects had missing values for timely culturing of blood samples, and 75% had missing values for timely culturing of sputum samples. These 2 indicators were rejected. It proved to be impossible to construct an algorithm for 1 of the 12 indicators for CAP (“change antibiotic therapy if no clinical improvement within 72 h of initiation”). We were unable to operationalize “no clinical improvement.” This indicator was considered to be nonfeasible (100% of the subjects had missing values) and was also rejected.

Reliability. One indicator (timeliness of antibiotic administration) received a score of κ = 0.5, indicating moderate interobserver reliability. All other indicators showed κ scores of >0.6 (i.e., good or very good). No indicator was rejected.

Opportunity for improvement. The AECB recommendation to not prescribe macrolides as a first-choice antibiotic showed a high outcome (performed in >90% of cases) in each participating hospital, and thus it showed little room for improvement. This indicator was rejected for use as a quality indicator in our quality-improvement project. For the AECB indicator “adapt dose and dose interval to renal function,” there was a high median performance rate (96%), but an outlier hospital with a performance rate of 73% of cases was detected, and the indicator was, therefore, not rejected.

Need for case-mix correction. Two CAP indicators needed correction for age: “adapting dose and dose interval of antibiotics to renal function” (P = .0001) and “obtaining samples for blood cultures” (P = .003). Regarding sex, sputum samples were obtained significantly more often from men than from women (P = .001). All of the other CAP indicators showed stable patterns of distribution over the 3 patient characteristics (age, sex, and severity of illness). In the population of patients with AECB, sputum cultures were performed more consistently in patients with an FEV1 value of ⩽60% (P = .033). The need for case-mix correction did not lead to exclusion from our set but should be taken into account for interpretation of performance scores in our group of hospitals.

Factor analysis. For CAP and AECB, Bartlett's sphericity test (P = .091 and P = .161), Kaiser-Meyer-Olkin MSA (0.463 and 0.489), and R² (0.147 and 0.137) indicated that no relevant correlation was detected between the indicators. Consequently, further factor analysis, performed in an attempt to reduce the number of indicators, was not considered useful. Our set comprises intrinsically strong indicators.

Discussion

On the basis of a carefully planned procedure that combined evidence and expert opinion, we developed a set of valid quality indicators for antibiotic use in hospitalized patients with LRTIs. In addition, we showed the importance of subjecting these indicators to a practice test before using them to measure and improve the quality of care in a specific setting. In our example, only a part of the valid set (15 of 19 indicators) turned out to be applicable in daily practice.

None of our potential indicators could rely on a firm body of evidence that linked process to outcome of care. “Timely administration of antibiotics” and “prescription of an empirical antibiotic regimen according to current guidelines” were consistently associated with improved survival in patients with CAP, but this was only in observational, retrospective studies [38, 39]. Several prospective interventional trials have demonstrated that early-switch strategies are cost-effective and safe, but no randomized, controlled trials have yet confirmed these results [40, 41]. No firm associations were found between outcome and most of the other suggested indicators. Results from our expert consensus procedure demonstrated that these non–evidence-based recommendations may still be regarded as valuable by professionals. Changing from broad-spectrum to narrow-spectrum therapy once culture results become available will probably not directly affect short-term outcome for the individual patient. However, it has a theoretical effect on reducing the development of resistance, and it may thus turn out to be crucial for the outcome of future patients [42]. Unfortunately, studies that link process indicators with resistance patterns are confronted with large methodological difficulties, so it will be difficult to prove any definite relationship. Using a technique that systematically combined evidence and consensus enabled us to assess (and thus improve) a broader range of aspects than would have been possible if quality indicators had been restricted to evidence only.

Even if our set of indicators is considered to be valid, its applicability in daily practice has several other important prerequisites. In our demonstration data set of hospitalized Dutch patients with LRTI, most of the indicators showed reasonable applicability (i.e., they were found to be feasible and reliable and showed room for improvement). Unfortunately, the feasibility of data collection turned out to be poor for some indicators. These findings support our belief that Dutch hospitals do not have systematic and robust registration systems (e.g., for registering the timing of hospital procedures). This currently constitutes a major barrier against the application of these kinds of quality indicators in The Netherlands, not only for research purposes, but also for monitoring the quality of daily practice. Once Dutch hospitals are required to collect these data—for example, as part of their normal review process—documentation will probably pick up. Timing of procedures caused major feasibility problems in our example, but in other countries, the feasibility of these indicators may be very different. US hospitals, for example, readily collect data for antibiotic timing as part of their normal review and accreditation process. In the United States, however, other data collection problems may arise, jeopardizing feasibility. This underlines the importance of performing an applicability test before using indicators to measure and improve the quality of care in a specific setting.

In our applicability test, we used a performance rate of 85% of cases (for each participating hospital) as a cut-off value to exclude indicators. From the viewpoint of internal quality improvement, indicators that score >85% in all hospitals have little room for improvement. Quality measures must be capable of detecting changes to discriminate between and within subjects. If indicator performance is invariably high with little interhospital variation, this renders an indicator less sensitive and thus less successful as an indicator [36]. The main goal of subjecting our set of indicators to this criterion was to prioritize the indicators most in need of improvement in a quality-improvement project (i.e., those indicators with low performance rates and/or large interhospital variation). Using “opportunity for quality improvement” as a selection criterion is particularly important for internal quality-improvement efforts. If, on the other hand, the indicator set is to be used for accreditation purposes, room for improvement might not be desirable as a selection criterion; the trend among regulating and accrediting organizations is to provide indicator sets that highlight excellent performance in addition to showing whether minimal standards are met [13].

Earlier sets of indicators, developed using somewhat different methodology, show many similarities to our set [10, 24, 43]. In some of these studies, clinimetric criteria, such as feasibility, reliability, and opportunity for improvement, were appraised by the clinical judgement of experts [10] and not on the basis of empirical data from real practice. However, in our experience, the feasibility of data collection is often overrated by professionals. All members of our expert panel believed that it was feasible to measure the time lag between performance of blood cultures and first antibiotic administration, but in reality, this could be done for only 25% of patients.

In summary, we developed a robust set of intrinsically strong indicators using rigorous methodology that combined the available evidence and expert opinion. Performance assessment in a practical test showed that some indicators were flawed by poor feasibility of data collection. Our experience demonstrates that, before implementation of a theoretically sound set of indicators, a practice test should be performed to assess its applicability in daily practice.

Acknowledgments

We thank B. Frijling and W. Wijnands, for independently selecting key recommendations from national guidelines; the panel of 11 experts who performed the consensus procedure (A. Cox, H. Wollersheim, E. Stobberingh, I. Gyssens, T. Casparie, W. Wijnands, Y. Hekster, B. J. Kullberg, W. Boersma, A. Schreurs, and R. Aleva); and Janine Trap, for administrative and statistical support.

Financial support. Grant support from Zon/Mw and the Dutch Department of Health.

Potential conflicts of interest. All authors: no conflicts.

  • Received January 10, 2005.
  • Accepted April 6, 2005.

References
