Randomized trials of adjunctive treatment of bacterial sepsis with polyclonal immunoglobulin show conflicting results. We performed a systematic review and a meta-analysis of the results of randomized trials that compared reductions in mortality rates in patient groups treated with polyclonal immunoglobulin versus either placebo or no treatment in addition to conventional treatment. High-quality trials had adequate concealment of allocation, were double-blinded and placebo-controlled, and made data available for intention-to-treat analyses. Twenty-one trials were included. Meta-analysis of all trials showed a relative risk of death with immunoglobulin treatment of 0.77 (95% confidence interval [CI], 0.68–0.88). High-quality trials (involving a total of 763 patients, 255 of whom died) showed a relative risk of 1.02 (95% CI, 0.84–1.24), whereas other trials (involving a total of 948 patients, 292 of whom died) showed a relative risk of 0.61 (95% CI, 0.50–0.73). Because high-quality trials failed to demonstrate a reduction in mortality, polyclonal immunoglobulin should not be used for treatment of sepsis except in randomized clinical trials.
Randomized trials of polyclonal immunoglobulin for treatment of sepsis have yielded conflicting results [1, 2]. Systematic reviews have also come to different conclusions. Alejandria et al. [1] found that polyclonal immunoglobulin reduced mortality substantially and significantly among adults (relative risk [RR], 0.62; 95% CI, 0.49–0.79), but not among neonates (RR, 0.70; 95% CI, 0.42–1.18). A review by Ohlsson and Lacy [2] reported a marginally statistically significant reduction in mortality among neonates with suspected sepsis (RR, 0.63; 95% CI, 0.40–1.00).
At our hospital, immunoglobulin constitutes the second largest drug cost. That expenditure may be justified if it saves lives. Most of the evidence supporting its use is provided by small trials (which have a large random error) with methodological shortcomings (including increased risk of systematic error [i.e., bias]). Thus, we decided to perform an independent systematic review, with emphasis on the methodological quality of the studies.
Study selection and search strategy. We selected clinical trials described as randomized by the investigators that compared mortality in any patient group with suspected or proved sepsis or septic shock treated with polyclonal immunoglobulin versus placebo or no treatment, in addition to conventional treatment. Studies focusing solely on prevention of sepsis were excluded. A free-text literature search of all records in the databases of PubMed, Embase, and the Cochrane Library was last updated on 21 January 2004. The search strategy included bacterial infection, to allow identification of studies containing results derived from subgroups with sepsis. The following groups of terms were searched: (1) “sepsis OR septicemia OR septicaemia OR shock-septic OR bacteriemia OR bacteraemia OR bacteremia,” (2) “bacterial infections OR bacterial infection OR bacterial-infections,” (3) “immunoglobulin OR immunoglobulins OR antibodies OR antibody OR polyclonal,” (4) “randomi* OR controlled OR blind* OR placebo” OR “controlled ? trial,” and (5) the combination of the terms listed in (3), (4), and either (1) or (2).
The database-specific indexing term is one of the synonyms in each of the first 3 search strings. No restrictions were applied. Decisions on which of the retrieved trials to include were made independently by the 2 reviewers. The first authors of the included trials were asked if they were aware of any unpublished trials. Reference lists were scanned for additional trials.
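The way the five term groups combine can be made explicit with a short sketch. It only assembles the Boolean logic described above as a plain string; the constant and function names are illustrative assumptions, and the output is not valid query syntax for PubMed, Embase, or the Cochrane Library, each of which has its own conventions for phrases, wildcards, and indexing terms.

```python
# Term groups (1)-(4) as listed in the search strategy.
SEPSIS = ["sepsis", "septicemia", "septicaemia", "shock-septic",
          "bacteriemia", "bacteraemia", "bacteremia"]
BACTERIAL_INFECTION = ["bacterial infections", "bacterial infection",
                       "bacterial-infections"]
IMMUNOGLOBULIN = ["immunoglobulin", "immunoglobulins", "antibodies",
                  "antibody", "polyclonal"]
DESIGN = ["randomi*", "controlled", "blind*", "placebo", "controlled ? trial"]


def any_of(terms):
    """Join synonyms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query():
    """Group (5): (3) AND (4) AND ((1) OR (2))."""
    condition = "(" + any_of(SEPSIS) + " OR " + any_of(BACTERIAL_INFECTION) + ")"
    return " AND ".join([any_of(IMMUNOGLOBULIN), any_of(DESIGN), condition])


print(build_query())
```

Including group (2) alongside group (1) widens the condition filter, which is what allows trials of bacterial infection with a septic subgroup to be retrieved.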
Outcomes. According to the protocol for the review, the primary aim was to assess whether treatment with immunoglobulin reduced total 30-day mortality in patients with suspected or proved sepsis. Secondary outcomes were number of days in hospital (if separate data for survivors and nonsurvivors were available, because pooled data can be misleading), complications of the infection, and adverse effects of immunoglobulin treatment.
The following sensitivity analyses were planned according to the protocol: high- versus lower-quality trials (a priori primary subgroup analysis); sepsis due to gram-negative organisms versus sepsis due to gram-positive organisms; neonates versus nonneonatal patients; immunocompetent versus nonimmunocompetent patients; underlying diseases; and albumin as placebo versus other placebos or no placebo (because albumin has been implied to increase mortality in seriously ill patients [3]).
Quality assessment. Trials were considered high quality if they (1) had adequate concealment of allocation, (2) were double-blinded and placebo controlled, and (3) applied an intention-to-treat analysis or data were available that allowed an intention-to-treat analysis [4]. Trials failing to meet ⩾1 of these criteria were considered lower quality. We restrict the use of the term “quality” to refer to these criteria.
We considered concealment of allocation adequate if there was central randomization; serially numbered, opaque, sealed envelopes; sequentially numbered but otherwise identical vehicles, including their contents; or other descriptions of convincing concealment of allocation. Concealment was inadequate if there was alternation; reference to case record numbers or date of birth; or an open table of random numbers (unless the vehicles were correspondingly numbered and the blinding impeccable). Unclear concealment meant that there was no description of the method or that the description did not allow a clear distinction.
Data extraction. The 2 investigators independently extracted the data. Disagreements were rare and were the result of simple errors. All first authors of the included trials were contacted and asked for additional information on trial quality.
Data analysis. RRs were combined in a meta-analysis by the Mantel-Haenszel method with use of RevMan software, version 4.2.3 (Cochrane; available from http://www.cochrane.org) [5]. A fixed-effect model was used, which assumes that the true effect of the intervention is the same in all of the included trials, differences between study results being ascribed to sampling error. Variation in study results not ascribable to sampling error was referred to as heterogeneity. Large studies with high event rates received the most weight in the meta-analysis. The a priori primary hypothesis for exploring sources of heterogeneity was the influence of methodological quality, followed by the other sensitivity analyses.
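As an illustration of the pooling step, the following is a minimal sketch of a Mantel-Haenszel fixed-effect relative risk with a 95% CI based on the Greenland-Robins variance of the log RR. It is not the RevMan implementation; the function name and the two example trials are hypothetical and do not correspond to any trial in this review.

```python
import math


def mantel_haenszel_rr(trials):
    """Pool relative risks with the Mantel-Haenszel fixed-effect method.

    trials: iterable of (deaths_treated, n_treated, deaths_control, n_control).
    Returns (pooled_rr, ci_lower, ci_upper) with a 95% confidence interval.
    """
    num = den = var_top = 0.0
    for a, n1, c, n0 in trials:
        total = n1 + n0
        num += a * n0 / total          # numerator contribution
        den += c * n1 / total          # Mantel-Haenszel weight of this trial
        var_top += (n1 * n0 * (a + c) - a * c * total) / total ** 2
    rr = num / den
    se_log = math.sqrt(var_top / (num * den))  # Greenland-Robins SE of log RR
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


# Hypothetical data: (deaths, patients) in treated and control arms.
example = [(10, 50, 20, 50), (30, 200, 30, 200)]
rr, lower, upper = mantel_haenszel_rr(example)
print(f"pooled RR = {rr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

Each trial's weight is its control-arm deaths times the treated-arm size divided by the trial total, which is why large trials with many events dominate the pooled estimate.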
According to the protocol, tests for heterogeneity were to be performed with use of the method of DerSimonian and Laird [6] and a test for interaction [7]. The former method [6] was replaced by a more sensitive test (I²) [8] that became available during the preparation of our article. Post hoc analyses to explore alternative explanations of heterogeneity included a random-effects model (assuming that the true effect varies around an overall average treatment effect) and a stepwise backward random-effects metaregression of the logarithm of the RR on quality, small-studies effect, age group, baseline risk, immunoglobulin preparation, and total dose provided within a week. The small-studies effect was present if the effect estimate varied with smaller study size (which may occur, for example, as a result of publication bias) [5]. Baseline risk is the underlying risk at trial entry. Because few trial reports provided this information as baseline sepsis score, we used the control group event rate instead (although this will tend to overestimate the association with treatment effect, because the control group event rate itself enters into the treatment effect estimate). High quality was coded as 1 and lower quality as 0; small-studies effect was modeled as the standard error of the logarithm of the RR; age groups were defined as neonates versus nonneonates (i.e., adults, except for very few school-age children) and were coded as 0 and 1, respectively; IgG preparations were coded as 1, and IgG preparations enriched with IgM and IgA (IgGMA) were coded as 0. The total dose was expressed as milligrams per kilogram of body weight. The metaregression was performed with use of Stata software, version 8 (StataCorp) [9, 10].
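The heterogeneity quantities mentioned here can be sketched as follows. This is an illustration under stated assumptions rather than the RevMan or Stata calculation: it uses inverse-variance weights on the log RR scale to obtain Cochran's Q, derives I² and the DerSimonian-Laird between-trial variance from Q, and then pools with random-effects weights. Function names and the example inputs are hypothetical.

```python
import math


def heterogeneity(log_rr, se):
    """Return (Q, I2 in percent, DerSimonian-Laird tau^2) for per-trial log RRs."""
    w = [1.0 / s ** 2 for s in se]                       # inverse-variance weights
    pooled = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    q = sum(wi * (y - pooled) ** 2 for wi, y in zip(w, log_rr))
    df = len(log_rr) - 1
    # I2: share of total variation not ascribable to sampling error.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def random_effects_pool(log_rr, se, tau2):
    """Pool log RRs with weights 1 / (se^2 + tau^2); return (estimate, SE)."""
    w = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    return pooled, math.sqrt(1.0 / sum(w))


# Hypothetical inputs: two trials with log RRs of 0.0 and 1.0, each with SE 0.5.
q, i2, tau2 = heterogeneity([0.0, 1.0], [0.5, 0.5])
estimate, se_estimate = random_effects_pool([0.0, 1.0], [0.5, 0.5], tau2)
print(f"Q = {q:.2f}, I2 = {i2:.1f}%, tau2 = {tau2:.2f}, "
      f"pooled RR = {math.exp(estimate):.2f}")
```

When tau² is estimated as zero, the random-effects weights reduce to the fixed-effect inverse-variance weights, so the two models give the same pooled estimate.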
Description of studies. Twenty-nine trial reports were identified (figure 1) [11–39]. Eight reports were excluded for the following reasons: no mortality data available in the subgroup of septic patients [25, 36]; unclear whether the deaths among septic patients occurred in the intervention or the control group [12, 15]; fundamental design problems [32]; an interim analysis of a later full trial report [17]; and duplicate publications [28, 31].
The 21 included trials comprised 1711 patients and 547 deaths. Thirteen of the 21 corresponding authors answered our questions (see Acknowledgments), and 4 studies [13, 24, 34, 38] were reclassified from lower quality to high quality as a consequence of these responses. The characteristics of the trials are shown in table 1. One large trial by Werdan et al. [34] involved 624 patients and 239 deaths, and it provided 38% of the weight in the meta-analysis. The mortality data from this trial have previously only been reported qualitatively (“the 28-day mortality was not reduced” [34]) in an abstract. However, the authors have provided us with quantitative intention-to-treat data, and the trial was performed according to a detailed, published protocol, so the quality of the trial could be assessed [40]. Seven of the trials comprised neonates [14, 19, 21, 24, 26, 29, 30], and 14 of the trials comprised nonneonates (i.e., adults, except for very few school-age children) [11, 13, 16, 18, 20, 22, 23, 27, 33–35, 37–39].
The methodological quality of the studies was highly variable (table 1), and only 4 of the studies met all 3 quality criteria and were categorized as high quality [13, 24, 34, 38]. Nine studies had adequate concealment of allocation [13, 21, 23, 24, 27, 29, 34, 38, 39], 8 had unclear concealment of allocation [11, 14, 16, 18, 20, 22, 26, 35], and 4 were inadequately concealed [19, 30, 33, 37]. Thirteen were not double-blinded [16, 19–22, 26, 27, 29, 30, 33, 37–39], and 7 did not make data available for intention-to-treat analysis [11, 14, 19, 22, 23, 26, 30].
Four studies reported follow-up until death or discharge [18, 22, 29, 33]. In 5 studies, the length of follow-up was not available [19, 21, 23, 26, 35], and in the remaining studies, it varied and was often imprecisely reported. Thus, we report mortality data at the length of follow-up provided by the authors (table 1) and did not include length of follow-up in the metaregression.
Mortality. When data from all trials were pooled, immunoglobulin treatment appeared to be beneficial, with an RR of death of 0.77 (95% CI, 0.68–0.88; P = .0001). However, 23.2% of the variability between the study results could not be ascribed to sampling error (I², 23.2%; figure 2). When the trials were analyzed in separate subgroups of high and lower quality, heterogeneity was no longer detectable (I², 0%). The pooled RR for the 4 high-quality trials was 1.02 (95% CI, 0.84–1.24; P = .87). In contrast, the 17 lower-quality trials had a pooled RR of 0.61 (95% CI, 0.50–0.73; P < .00001) (figure 2). The difference between the estimates from the trials of high methodological quality versus those from the trials of lower methodological quality was highly statistically significant (P = .0002).
Figure 2. Meta-analysis of relative risk of all-cause mortality comparing patients with sepsis treated with polyclonal immunoglobulin (Immunoglobulin) with patients receiving placebo or no additional treatment for sepsis (Control). Subtotals designate the subgroup analysis of trials of high quality and lower quality. Bars, 95% CI; n, number of deaths; N, number of patients; RR, relative risk; Fixed, fixed-effect model. I² quantifies the percentage of variation between study results that is not ascribable to sampling error.
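The comparison between the two quality subgroups can be approximated from the published summary figures alone. The sketch below assumes a z test on the difference of the log RRs, with standard errors back-calculated from the reported 95% CIs; it is one common form of a test for interaction and is not necessarily the exact computation used for the review. Function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log RR recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_test(rr_a, ci_a, rr_b, ci_b):
    """z test for the difference between two independent subgroup RRs.

    Returns (ratio of RRs, z statistic, two-sided P value).
    """
    diff = math.log(rr_a) - math.log(rr_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return math.exp(diff), z, p


# Subgroup estimates reported in the text: high-quality vs lower-quality trials.
ratio, z, p = interaction_test(1.02, (0.84, 1.24), 0.61, (0.50, 0.73))
print(f"ratio of RRs = {ratio:.2f}, z = {z:.2f}, P = {p:.4f}")
```

Because the inputs are rounded to 2 decimal places, the result should come out close to, but not necessarily identical with, the reported P = .0002.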
The large study had a loss to follow-up of 4.3% of patients, but even extreme-case scenarios in favor of immunoglobulin treatment did not alter the finding that high-quality trials did not show a statistically significant effect on mortality (table 2). Loss to follow-up was not reported in other studies.
Table 2. Sensitivity analysis of the loss to follow-up in the trial by Werdan et al. [34].
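The logic of an extreme-case analysis for loss to follow-up can be sketched as below. The arm-level counts of the trial by Werdan et al. are not given in the text, so the numbers used here are purely hypothetical, and the function name and argument layout are assumptions made for illustration.

```python
def extreme_case_rrs(deaths_t, n_t, lost_t, deaths_c, n_c, lost_c):
    """Return (best_case_rr, worst_case_rr) from the treatment's point of view.

    deaths_*: deaths observed among patients with known outcome
    n_*:      patients randomized to the arm (including those lost)
    lost_*:   patients whose vital status is unknown
    """
    # Best case for treatment: every lost treated patient survived and
    # every lost control patient died.
    best = (deaths_t / n_t) / ((deaths_c + lost_c) / n_c)
    # Worst case for treatment: the reverse assumption.
    worst = ((deaths_t + lost_t) / n_t) / (deaths_c / n_c)
    return best, worst


# Hypothetical counts, NOT the data from the trial by Werdan et al.
best, worst = extreme_case_rrs(40, 100, 5, 40, 100, 5)
print(f"best-case RR = {best:.2f}, worst-case RR = {worst:.2f}")
```

If the pooled estimate stays nonsignificant even when the trial is entered with its best-case RR, the conclusion does not depend on what happened to the patients who were lost.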
Sensitivity analyses of mortality. The results were similar if a random-effects model was applied. The overall estimate of RR was 0.70 (95% CI, 0.59–0.85), the RR for high-quality trials only was 1.03 (95% CI, 0.85–1.25), and the RR for lower-quality trials only was 0.64 (95% CI, 0.54–0.76). Levels of heterogeneity were unaltered.
If the quality criteria for high-quality trials were reduced to require that only the most important criterion of methodological quality (i.e., concealment of allocation [41]) be fulfilled, the results would be as follows: 9 trials with adequately concealed allocation (RR, 0.91; 95% CI, 0.76–1.09; I², 40%) versus 12 trials with unclear or inadequate concealment of allocation (RR, 0.63; 95% CI, 0.52–0.77; I², 0%). The introduction of 40% heterogeneity in the high-quality trial group indicates that the lack of double blinding in the 5 reclassified trials that had adequate concealment made an important difference.
Most trials comprised a mixture of patients with sepsis due to gram-negative organisms and patients with sepsis due to gram-positive organisms, as well as immunocompetent and immunoincompetent patients with different underlying diseases. In general, separate mortality data were not provided for any of these subgroups, which precluded the planned sensitivity analyses. In only 2 studies was albumin used as placebo in a concentration within the range that has been suggested to increase mortality [11, 38].
The strong association between study quality and the RR of death would confound the planned subgroup analysis. Instead, we did an exploratory stepwise backward random-effects metaregression (table 3). It confirmed the strong association between study quality and effect, but it found no evidence for an association of the effect with age groups, baseline risk, immunoglobulin preparation, or total immunoglobulin dose. When the covariables were modeled alone, the only other covariable apart from quality with a P value suggestive of an association with the effect was the small-studies effect (P = .032). When both variables were included in the model, the P value for the regression coefficient for quality was .01; it was .18 for the small-studies effect. This reflects that many of the small studies also had lower quality, and after controlling for the quality of the study, the association of small study size with larger effect estimates was no longer significant. Thus, trial quality was the only variable that explained a statistically significant amount of variation in the outcomes of the included trials.
Length of hospital stay. For nonsurvivors, patients in the immunoglobulin group died 2.7 days earlier than others (95% CI, 0.2–5.3). For survivors, there was no statistically significant difference in length of hospital stay between groups (3.8 days, 95% CI, -2.3 to 9.9) [18, 35].
Complications and adverse effects. The information on complications and adverse effects was too scarce to be combined in a meta-analysis.
Major findings and possible explanations. Our most reliable estimate of the effect of treatment with intravenous, polyvalent immunoglobulin on mortality in patients with sepsis relied on the 41% of the statistical information that came from the high-quality trials designed to minimize bias. This estimate was a RR of 1.02 (95% CI, 0.84–1.24), which is compatible with a 16% reduction in mortality as well as with a 24% increase.
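The percentages quoted for the confidence limits follow directly from the distance of each limit from an RR of 1; the short worked calculation below only restates that arithmetic.

```latex
\begin{align*}
\text{lower limit: } & (1 - 0.84) \times 100\% = 16\% \text{ relative reduction in mortality} \\
\text{upper limit: } & (1.24 - 1) \times 100\% = 24\% \text{ relative increase in mortality}
\end{align*}
```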
The overall pooled estimate, based on all of the trials, showed a large and significant reduction in mortality with immunoglobulin treatment, but one-fourth of the variation between the study results could not be ascribed to sampling error; this unexplained variability disappeared when high-quality and lower-quality studies were analyzed in subgroups. The difference in the results from these subgroups is large, but it is consistent with the expected influence of methodological quality. Trials with inadequate or unclear concealment of allocation exaggerated the effect of the experimental intervention by ∼30%, on average (when measured as a ratio of ORs) in 4 out of the 5 empirical studies of bias [41, 42]. Furthermore, the difference is highly statistically significant, and it is the result of our primary subgroup analysis as defined a priori. Thus, it is likely to reflect a true difference between high-quality and lower-quality trials. This result remained robust to the metaregression that explored whether differences in other trial characteristics (including age group, type of immunoglobulin preparation, etc.) were better explanations for the heterogeneity between the results of the individual trials.
Placebo treatment to ensure blinding of patients and care providers may seem unimportant when the outcome is mortality. However, if lack of blinding coincides with a lack of intention-to-treat analysis or with a lack of predetermined stopping rules, the risk of bias is obvious. Lack of intention-to-treat data may imply that the patients who did not receive the full intervention were not accounted for in the published report [43]. Some patients in the intervention group may not have received full treatment because of rapid deterioration and subsequent death, whereas similar patients in the control group are not excluded because no placebo intervention was required. Hence, differential exclusions could lead to bias in favor of the intervention group. In 7 studies, the number of patients withdrawn or excluded from analysis was not available, and there was no statement that there were no exclusions [11, 14, 19, 22, 23, 26, 30]. Four of these studies did not apply placebo treatment [19, 22, 23, 26].
Lack of predetermined stopping rules increases the risk of spurious findings because of multiple looks at the data. If there is no blinding, the number of informal interim analyses can be large. The trial that reported the largest statistically significant effect was unblinded and prematurely terminated, and it stated that 12 interim analyses had been performed [27]. Predetermined stopping rules were not mentioned in 8 of the 11 trials without double blinding.
Two unblinded studies had predetermined sample-size goals [26, 29], but the sponsoring company (Sandoz India) withdrew support while the trials were ongoing, which caused their premature termination. In one of the studies, the sponsor also made blinding impossible by refusing to provide identical vials with placebo [29].
Previous systematic reviews of immunoglobulin for treatment of sepsis. Alejandria et al. [1] find that polyclonal immunoglobulin significantly reduces mortality, both when all studies (including those involving adults and neonates) are pooled (RR, 0.64; 95% CI, 0.51–0.80) and when only high-quality studies are considered (RR, 0.30; 95% CI, 0.09–0.99) [21, 32]. The discrepancy with our findings can be explained by their less sensitive search strategy, their less rigorous application of quality assessment, and their retrieval of less information from the authors of the trials.
Ohlsson and Lacy [2] report results of trials from 2 settings. The first addresses mortality in neonates with clinically suspected sepsis; there is a borderline statistically significant reduction in mortality, as mentioned above. In the other setting (neonates with subsequently proven sepsis), they find a markedly reduced RR of 0.55 (95% CI, 0.31–0.98). A trial with a fundamental error in study design (in which inclusion of patients was dependent on the effect of the treatment) is included in the second setting [32]. If it were excluded, as in our analysis, then the combined result would no longer be statistically significant. Ohlsson and Lacy [2] do not present sensitivity analyses of the influence of the quality of the trials, but they cautiously conclude that there are insufficient data to support routine use of immunoglobulin for treatment of sepsis in neonates.
A recent review (without a meta-analysis) mentions the negative finding of the large high-quality trial and some of the methodological shortcomings of 6 of the smaller trials included here [44]. What our study adds to this is the presentation of 15 additional randomized controlled trials, with more detail on the methodological quality of these trials and a quantitative analysis of the sum of the evidence.
Strengths and limitations of our study. Our review demonstrated that the overall effect estimate of immunoglobulin on mortality among septic patients hinges not only on the precision provided by the largest trial, but also on the methodological quality of the trials. The intermediate publication status of the large study by Werdan et al. [34] entails some uncertainty, because we cannot know why it has not been fully published yet.
The classification of trials as lower quality did not indicate that they were necessarily all of low quality. Some trials classified as lower quality may even have been high-quality but failed to report the measures taken to ensure this. Further, lack of guarding against bias did not prove that bias occurred, just that it may have occurred. But with different results derived from well-guarded versus uncertainly or less well-guarded trials, we recommend trusting the former.
Combining trials that occurred in different settings and involved different severities of sepsis may seem counterintuitive, but the results of a sepsis trial are likely to be extrapolated beyond the particular inclusion criteria. In addition, if there is no effect of the treatment, then any trial result deviating from no effect would be ascribable to sampling error or bias, and then it would be legitimate to combine all trials according to their susceptibility to bias. We explored whether there was any evidence against this assumption in the metaregression and found none.
The metaregression could be used to gauge whether there were obvious alternative explanations (other than quality) for the observed heterogeneity, but we could not exclude an effect of immunoglobulin treatment in defined patient subgroups. This would require large studies, such as the one currently being conducted by Brocklehurst et al. [45], who plan to include 5000 neonates.
Implications for practice. Most of the immunoglobulin used in the United States is used off-label [46], and such use could be spurred by undue emphasis on results found in subgroups of the trials included here. However, the present review should serve to avoid this undue emphasis. For a common condition like sepsis, the burden of proof should be statistically and clinically significant treatment effects derived from high-quality randomized trials. Such evidence is not available, and we therefore do not recommend polyvalent immunoglobulin for treatment of sepsis in clinical practice. Exceptions could exist for rare conditions like streptococcal toxic shock syndrome, but guidelines will have to rest on a comprehensive analysis of the totality of the relevant data, including safety issues such as the risk of acute renal failure [47].
We thank the following corresponding authors for providing additional information on their trials: E. R. Burns, A. Norrby-Teglund (for Darenberg et al. [38]), R. Grundmann, K. N. Haque, L. Lindquist, J. Mancilla-Ramirez, I. Schedel, A. Shenoi, C. Wesoly, K. Werdan, M. Yakut, S. Tugrul, and S. Karatzas.
Financial support. The Nordic Cochrane Centre is financed by the Copenhagen Hospital Corporation. The views expressed in this article represent those of the authors and are not necessarily the views of other members of the Cochrane Collaboration.
Conflict of interest. All authors: No conflict.