ARTICLE
Caffeine intake and pregnancy outcomes: a meta-analytic review
Consumo de cafeína na gravidez e desfechos perinatais
Iná S. Santos ^{1}

^{1} Programa de Pós-Graduação em Epidemiologia, Universidade Federal de Pelotas, Caixa Postal 464, Pelotas, RS 96100-970, Brazil.
^{2} Department of Epidemiology and Population Sciences, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.

Abstract
Epidemiological publications on the relationship of caffeine to birth weight and duration of human pregnancy, published from 1966 to 1995, were identified through a Medline search. Each study was treated as the stratification variable, and its weight in the weighted average was proportional to the inverse of its variance. Twenty-six studies were located. Among the twenty-two studies on birth weight, eleven were on mean birth weight, nine on low birth weight (LBW), and four on intrauterine growth retardation (IUGR). Combined analysis of the mean birth weight results showed a significant decrease in birth weight of nearly 43 g among newborns of the heaviest caffeine-consuming mothers. For LBW, IUGR, and preterm delivery, the homogeneity tests were statistically significant, indicating that a pooled estimate should not be taken as an adequate measure. The high heterogeneity of the available literature on the effects of caffeine on LBW, IUGR, and preterm delivery prevents reliable pooled estimates from being derived through meta-analysis. Further assessment of caffeine intake during pregnancy is needed in future research.

Key words
Caffeine; Low Birth Weight; Fetal Growth Retardation; Premature Infant

Resumo (translated from the Portuguese)
Epidemiological publications from 1966 to 1995 on the association between caffeine and birth weight and duration of human pregnancy were identified through a Medline search. Each study was treated as a category of a variable, and its weight was proportional to the inverse of its variance. Twenty-six studies were located. Among the twenty-two studies on birth weight, eleven were on mean birth weight, nine on low birth weight (LBW), and four on intrauterine growth retardation (IUGR). The pooled effect on mean birth weight showed a statistically significant reduction of 43 grams among newborns of mothers who consumed the largest amounts of caffeine. The pooled analysis of the effect on LBW, IUGR, and preterm births yielded a statistically significant homogeneity test, indicating that a combined estimate would not be reliable. The great heterogeneity of the available literature regarding the effect of caffeine on LBW, IUGR, and preterm delivery does not allow reliable pooled estimates to be calculated through meta-analysis. A more careful assessment of caffeine consumption during pregnancy is needed in future studies.
Introduction
The effect of caffeine consumption on birth weight and duration of pregnancy has been the subject of numerous epidemiological studies in recent years. Major sources of caffeine are coffee, black tea, maté, chocolate/cocoa, and cola soft drinks. It has also been estimated that nearly 200 nonprescription drugs contain caffeine, and this may be an important source for a minority of people.
Caffeine (1,3,7-trimethylxanthine) is a plant alkaloid, structurally related to DNA purine bases, that has been the focus of studies with laboratory animals. Caffeine is a pharmacologically active substance with effects on many different organ systems. Interest in the effects of caffeine on human pregnancy, however, is based on the fact that its clearance is delayed in pregnant women, mainly in the second and third trimesters, when it is decreased to one half and one third the normal rate, respectively (Aldridge et al., 1979). Caffeine crosses the placental barrier, so that maternal and fetal blood levels are virtually the same (Goldstein & Warren, 1962). The enzymes needed for caffeine metabolism are absent both in the fetus and in the infant until the eighth month after delivery (Hornung et al., 1985 apud James & Paull, 1985). Concern about the possible harmful effects of caffeine on pregnancy has evolved mainly from studies in animals that have indicated a decrease in intrauterine fetal growth, reduced birth weight, and skeletal abnormalities (Heller, 1987; Dlugosz & Bracken, 1992). Nevertheless, the implications of these findings for human beings are unclear because of differences in the mode of exposure to caffeine, the amounts consumed, and metabolism of the drug. Moreover, despite a proliferation of studies on pregnant women in recent years, their conclusions remain controversial. A meta-analytic approach was used here to produce a quantitative summary of these studies.
Material and methods
Study selection
A Medline search from 1966 to 1995 produced twenty-six epidemiological publications on the relationship between caffeine and birth weight and duration of human pregnancy. These studies had investigated the effect of caffeine on low birth weight (LBW: birth weight less than 2,500 grams), intrauterine growth retardation (IUGR: birth weight under the 10th percentile for gestational age), and preterm delivery (gestational age of less than 37 weeks). A scoring system developed by the UK Nutritional Epidemiology Group for the Nutrition Society (UKNEGNS, 1993) was adapted to rank the overall quality of papers in an objective manner. Based on this system, separate scoring methods were used for case-control and cohort studies. For case-control studies, three areas were scored: quality of caffeine assessment, recruitment of subjects, and analysis of the results. Cohort studies were scored in four areas: caffeine assessment, definition of the cohort, ascertainment of outcome, and analysis of results. This scoring system allowed the studies to be classified on a scale from zero to 10.
Data extraction
When the effect estimate and its standard error for heaviest caffeine consumption as compared to none or lowest consumption were provided, they were copied directly from the report. When confidence limits were provided rather than the standard error, the latter was calculated. Given that the 95% confidence interval for an estimate is equal to (estimate ± 1.96 × standard error), then for a risk estimate with a given 95% confidence interval the standard error was calculated on the log scale as follows: standard error = (log upper limit of the confidence interval - log estimate of risk)/1.96. When only a p value was given, instead of a standard error or confidence interval, a test-based standard error was estimated as follows: standard error = log estimate/(t or z value), or standard error = coefficient estimate/(t or z value), where t or z is the value of the statistic corresponding to the p value (e.g., Z_{p} = 1.96 if p = 0.05, two-tailed test). Assuming that the study outcomes were rare in all populations and subgroups under review, relative risks and odds ratios were pooled together.
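The extraction rules above can be sketched in Python as follows. This is a minimal illustration rather than the authors' own procedure: the function names and the numerical inputs are hypothetical, ratio measures are assumed to be handled on the natural-log scale, and the p value is assumed to be two-tailed.

```python
import math
from statistics import NormalDist


def se_from_ci(estimate, upper_limit, z=1.96):
    """Standard error of a log ratio from its upper 95% confidence limit."""
    return (math.log(upper_limit) - math.log(estimate)) / z


def z_from_p(p_value):
    """Two-tailed z statistic corresponding to a p value (about 1.96 for p = 0.05)."""
    return NormalDist().inv_cdf(1 - p_value / 2)


def se_from_p(estimate, p_value, is_ratio=True):
    """Test-based standard error: (log) estimate divided by its z statistic."""
    effect = math.log(estimate) if is_ratio else estimate
    return abs(effect) / z_from_p(p_value)


# Hypothetical inputs, chosen only for illustration.
se_ci = se_from_ci(estimate=1.5, upper_limit=2.2)
se_p = se_from_p(estimate=1.5, p_value=0.05)
print(round(se_ci, 4), round(se_p, 4))
```

Passing `is_ratio=False` applies the same test-based rule to a coefficient such as a difference in mean birth weight, where no log transformation is needed.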
Statistical meta-analysis
Once standard errors had been obtained for the individual studies, statistical meta-analysis was performed using the weighted average of study-specific results (Kleinbaum et al., 1982). Each study result was treated as a stratum, with a corresponding weight. When two studies had comparable methodological quality but different sample sizes, the effect reported in the larger one was assumed to be more precise. Accordingly, the statistical component of each study weight was the inverse of the variance of the result, computed from the estimated standard error as 1/SE^{2} (Greenland, 1987). The pooled summary of the study results was calculated as the exponential of the weighted average of the results, i.e., the weighted sum Σ (log estimate × 1/variance) divided by the sum of the weights Σ (1/variance). The standard error of this estimate was calculated as the inverse of the square root of the sum of the weights. The 95% confidence interval for the estimate was calculated as (pooled estimate ± 1.96 × standard error). A test of whether the assumed common value was zero was given by Z = (pooled estimate/standard error), which has a standard normal distribution.
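The weighted average described above can be sketched as below. The study values are hypothetical, the function name is illustrative, and the confidence interval is assumed to be computed on the log scale and then back-transformed, which is one common reading of the description rather than a confirmed detail of the original analysis.

```python
import math


def pool_fixed_effect(log_estimates, standard_errors):
    """Inverse-variance weighted average of study results on the log scale."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, log_estimates)) / total_weight
    pooled_se = 1 / math.sqrt(total_weight)
    return pooled_log, pooled_se


# Hypothetical relative risks and standard errors (of the log RR) for three studies.
relative_risks = [1.2, 1.5, 1.1]
standard_errors = [0.10, 0.20, 0.15]

pooled_log, pooled_se = pool_fixed_effect(
    [math.log(rr) for rr in relative_risks], standard_errors
)
pooled_rr = math.exp(pooled_log)
ci_low = math.exp(pooled_log - 1.96 * pooled_se)
ci_high = math.exp(pooled_log + 1.96 * pooled_se)
z = pooled_log / pooled_se  # test of the null value (log RR = 0)
print(f"pooled RR = {pooled_rr:.2f} ({ci_low:.2f} to {ci_high:.2f}), z = {z:.2f}")
```

Because the weights are 1/SE^{2}, the most precise study (here the one with SE = 0.10) dominates the pooled value.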
However, the adequacy of the pooled estimate as a meta-analytic summary of the effect under study depends on the homogeneity assumption, i.e., that the studies are estimating the same value for the effect and that, apart from bias, the differences observed among them are due to random error. The statistical test of homogeneity used was chi-squared = Σ [study weight × (study estimate - pooled estimate)^{2}], which has a chi-squared distribution with degrees of freedom one less than the number of studies.
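A minimal sketch of the homogeneity statistic follows, using hypothetical log relative risks and standard errors. The function name is illustrative and the estimates are assumed to be on the same (log) scale as the pooled estimate.

```python
import math


def homogeneity_q(estimates, standard_errors):
    """Return (Q, degrees of freedom) for the chi-squared homogeneity test."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    return q, len(estimates) - 1


# Hypothetical log relative risks and standard errors for four studies.
log_rr = [math.log(rr) for rr in (1.1, 1.2, 2.4, 0.9)]
se = [0.10, 0.12, 0.15, 0.20]

q, df = homogeneity_q(log_rr, se)
print(f"Q = {q:.2f} on {df} degrees of freedom")
```

Q is then referred to a chi-squared distribution with `df` degrees of freedom, for instance from a printed table or, if SciPy is available, with `scipy.stats.chi2.sf(q, df)`. A value above the 5% critical point (about 7.81 for 3 degrees of freedom) would indicate significant heterogeneity.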
Results
Caffeine and birth weight
Twenty-two studies (Mau & Netter, 1974; van den Berg, 1977; Arnandova & Katsulov, 1978; Soika, 1979; Kuzma & Sokol, 1982; Linn et al., 1982; Furuhashi et al., 1985; Watkinson & Fried, 1985; Beaulac-Baillargeon & Desrosiers, 1987; Martin & Bracken, 1987; Muñoz et al., 1988; Brooke et al., 1989; Caan & Goldhaber, 1989; Fenster et al., 1991; Olsen et al., 1991; Peacock et al., 1991; McDonald et al., 1992; Godel et al., 1992; Mills et al., 1993; Fortier et al., 1993; Larroque et al., 1993; Shu et al., 1995) focused on birth weight. Most of them showed, in the crude analysis, that coffee or caffeine was related to lower birth weights (Mau & Netter, 1974; van den Berg, 1977; Arnandova & Katsulov, 1978; Soika, 1979; Linn et al., 1982; Furuhashi et al., 1985; Watkinson & Fried, 1985; Martin & Bracken, 1987; Caan & Goldhaber, 1989; Godel et al., 1992; Fortier et al., 1993; Mills et al., 1993), the majority of these differences being statistically significant. Dose-response effects were demonstrated in eight studies (Mau & Netter, 1974; van den Berg, 1977; Martin & Bracken, 1987; Brooke et al., 1989; Fenster et al., 1991; Peacock et al., 1991; McDonald et al., 1992; Fortier et al., 1993). It was also clear from these studies that drinking coffee was related to smoking (van den Berg, 1977; Arnandova & Katsulov, 1978; Linn et al., 1982; Watkinson & Fried, 1985; Martin & Bracken, 1987; Fenster et al., 1991; Olsen et al., 1991; Godel et al., 1992; Larroque et al., 1993; Mills et al., 1993) and, in some of them, to alcoholic beverage consumption (Arnandova & Katsulov, 1978; Martin & Bracken, 1987; Fenster et al., 1991; Olsen et al., 1991; Mills et al., 1993).
After allowing for smoking, twelve of seventeen studies showed that caffeine was still related to LBW (Mau & Netter, 1974; Kuzma & Sokol, 1982; Watkinson & Fried, 1985; Beaulac-Baillargeon & Desrosiers, 1987; Martin & Bracken, 1987; Muñoz et al., 1988; Caan & Goldhaber, 1989; Fenster et al., 1991; Olsen et al., 1991; Peacock et al., 1991; McDonald et al., 1992; Fortier et al., 1993). One of these studies was restricted to smokers, and the association remained present (Peacock et al., 1991).
Among these studies, only thirteen were used for the pooled estimation (Mau & Netter, 1974; van den Berg, 1977; Kuzma & Sokol, 1982; Linn et al., 1982; Martin & Bracken, 1987; Caan & Goldhaber, 1989; Fenster et al., 1991; Olsen et al., 1991; McDonald et al., 1992; Mills et al., 1993; Fortier et al., 1993; Larroque et al., 1993; Shu et al., 1995). The remaining nine reports neither provided the estimate and its standard error nor published the relative risks, confidence intervals, or p values from which standard errors could be extracted. Three studies were on the effect of caffeine on mean birth weight (Kuzma & Sokol, 1982; Larroque et al., 1993; Shu et al., 1995), seven on LBW (Mau & Netter, 1974; van den Berg, 1977; Linn et al., 1982; Martin & Bracken, 1987; Caan & Goldhaber, 1989; Olsen et al., 1991; Fortier et al., 1993), one on IUGR (Mills et al., 1993), and two investigated both LBW and IUGR (Fenster et al., 1991; McDonald et al., 1992). More detailed information and quality scoring of these studies are shown in Table 1.
Table 2 presents the results of reanalyses of the studies. The studies by Kuzma & Sokol (1982), Larroque et al. (1993), and Shu et al. (1995) presented results in terms of mean birth weight according to caffeine consumption. The weighted mean decrease in birth weight was 42.99 g (95% CI: 32.04 to 53.94 g; p<0.001), indicating that, among heavy consumers, newborns were about 43 g lighter than newborns of mothers who consumed lower amounts or no caffeine. The homogeneity test was not significant (p>0.05).
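As a quick consistency check, the reported interval can be converted back into a standard error and a z statistic. The sketch below assumes the interval was computed as estimate ± 1.96 × standard error, as described in the methods. The variable names are illustrative.

```python
from statistics import NormalDist

# Values reported in the text for the pooled mean birth weight difference (grams).
pooled_difference = 42.99
ci_lower, ci_upper = 32.04, 53.94

# Standard error implied by the upper confidence limit.
implied_se = (ci_upper - pooled_difference) / 1.96
# The interval is symmetric, so the lower limit implies the same value.
implied_se_lower = (pooled_difference - ci_lower) / 1.96

z = pooled_difference / implied_se
p_two_tailed = 2 * (1 - NormalDist().cdf(z))
print(f"SE = {implied_se:.2f} g, z = {z:.2f}, p < 0.001: {p_two_tailed < 0.001}")
```

The implied standard error is roughly 5.6 g, and the resulting z statistic is well beyond the two-tailed 0.001 threshold, in line with the reported p value.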
Among the nine studies reporting LBW, the pooled effect of greatest caffeine intake compared to none or lower amounts was 1.33 (95% CI: 1.22-1.44; p<0.001) (Figure 1). This pooled estimate suggests that there is a 33% increase in risk of LBW among mothers who consumed the largest amounts of caffeine throughout pregnancy. The homogeneity test, however, showed a significant result, indicating that the studies were highly heterogeneous and should thus not be summarized in a single estimate (p<0.001). Later calculations excluding the two outlier studies (Martin & Bracken, 1987; Caan & Goldhaber, 1989) produced a summary relative risk of 1.29 (95% CI: 1.18-1.41) and a homogeneity test with a p value greater than 0.10.
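The kind of sensitivity analysis described here, re-pooling after setting aside outlying studies, can be sketched as follows. The study labels and values are hypothetical and do not correspond to the LBW studies in this review. The helper name is likewise an assumption.

```python
import math


def pool_and_q(studies):
    """Inverse-variance pooled log RR, its SE, and the homogeneity Q statistic."""
    weights = {name: 1 / se ** 2 for name, (_, se) in studies.items()}
    total = sum(weights.values())
    pooled = sum(weights[n] * math.log(rr) for n, (rr, _) in studies.items()) / total
    q = sum(weights[n] * (math.log(rr) - pooled) ** 2 for n, (rr, _) in studies.items())
    return pooled, 1 / math.sqrt(total), q


# Hypothetical studies: name -> (relative risk, standard error of log RR).
studies = {
    "A": (1.2, 0.10),
    "B": (1.3, 0.12),
    "C": (2.6, 0.15),  # deliberately extreme
    "D": (1.1, 0.20),
}

for excluded in ([], ["C"]):
    kept = {n: v for n, v in studies.items() if n not in excluded}
    pooled, se, q = pool_and_q(kept)
    print(f"excluded={excluded}: RR={math.exp(pooled):.2f}, Q={q:.2f}, df={len(kept) - 1}")
```

Comparing Q and its degrees of freedom before and after the exclusion shows whether the remaining studies are compatible with a single underlying effect.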
When IUGR was the outcome of interest, the pooled effect of the three studies which satisfied inclusion criteria was 1.24 (95% CI: 1.05-1.43; p<0.01), suggesting that heavy consumers have a 24% increase in risk of delivering a small-for-gestational-age child. However, the homogeneity test was also significant, indicating that deriving a pooled estimate would not be adequate (p<0.001).
Caffeine and preterm delivery
Eleven studies focused on gestational age (van den Berg, 1977; Weathersbee et al., 1977; Berkowitz et al., 1982; Linn et al., 1982; Watkinson & Fried, 1985; Martin & Bracken, 1987; Olsen et al., 1991; Williams et al., 1992; McDonald et al., 1992; Fortier et al., 1993; Pastore & Savitz, 1995), and only three found a significant association (Weathersbee et al., 1977; van den Berg, 1977; Williams et al., 1992) (Table 1). The eight remaining studies, including two specifically designed to measure this outcome (Berkowitz et al., 1982; Pastore & Savitz, 1995), did not detect any association. Under the assumption of homogeneity among the studies, the weighted analysis of the effect of caffeine on gestational age showed a 24% increase in risk (combined estimate: 1.24; 95% CI: 1.11-1.38; p<0.001) among heavy consumers (Figure 2). This result was obtained from estimates reported by eight studies (Weathersbee et al., 1977; Berkowitz et al., 1982; Linn et al., 1982; Olsen et al., 1991; Williams et al., 1992; McDonald et al., 1992; Fortier et al., 1993; Pastore & Savitz, 1995). However, the homogeneity test result was statistically significant (p<0.001), indicating that a pooled estimate would not be valid. The exclusion of the outlier study of Weathersbee et al. (1977) did not enhance homogeneity among the remaining studies (p<0.001).
Discussion
Meta-analysis was used for contrasting and combining results of different studies on the effect of caffeine on human pregnancy outcomes, particularly birth weight and gestational age at birth. Apart from providing a combined effect, meta-analysis is useful for investigating whether the pooled studies are quantitatively consistent. Taken together, in a purely descriptive analysis, these studies suggested a probable effect of caffeine on birth weight, while gestational age did not seem to be affected. However, weighted analyses showed that it was not possible to derive a summary estimate based on the majority of available published studies, since heterogeneity among the studies was highly significant. The significant results of the homogeneity tests are direct evidence of the inconsistency of the studies on the effect of caffeine in human pregnancy.
An association between an exposure and a disease observed in a single study may depend on the population sampled, the level of exposure in the study population, the definition of outcome, and the study's various methodological characteristics (Greenland, 1987; Spector & Simon, 1991; Dickersin & Berlin, 1992; Hasselblad et al., 1995). The scoring system provided suitable evidence that the reported studies were not highly discrepant in terms of methodological quality. Definition of outcomes was also well established and in agreement with the current literature. However, prior qualitative reviews had already suggested the need for more careful measurement of caffeine consumption (Heller, 1987; Narod et al., 1991; Shiono & Klebonoff, 1993). Some studies evaluated only coffee (Mau & Netter, 1974; van den Berg, 1977; McDonald et al., 1992), while others included a variety of caffeine sources. In most populations, coffee accounts for most of the caffeine consumed, but other sources may be equally important, including maté (a caffeine-containing infusion widely used in South America), black tea, cola drinks, chocolate, cocoa, and medicines. Moreover, caffeine content in coffee varies with the strength and method of preparation. It is thus possible that observed differences in effect magnitudes in different studies result from inadvertent misclassification due to different levels of exposure in the distinct populations.
The possibility of bias induced by the exclusion of some studies is another explanation. Unfortunately, these studies failed to provide direct information on, or measures from which to infer, needed parameters for the pooled analyses.
In summary, this meta-analysis showed that the effects of caffeine consumption during pregnancy on birth weight and gestational age at birth are quantitatively too inconsistent for a valid summary estimate to be derived. From the above results, it is evident that available information on the effect of caffeine on pregnancy outcomes is incomplete and remains controversial. Measurement of exposure to caffeine is an issue requiring a more in-depth approach in future research.
References
ALDRIDGE, A.; ARANDA, J. V. & NEIMS, M. D., 1979. Caffeine metabolism in the newborn. Clinical Pharmacology and Therapeutics, 25:447-453.
ARNANDOVA, R. & KATSULOV, A., 1978. Coffee and pregnancy. Akusherstvo i Ginekologiia (Sofiia), 17:57-61.
BEAULAC-BAILLARGEON, L. & DESROSIERS, C., 1987. Caffeine-cigarette interaction on fetal growth. American Journal of Obstetrics and Gynecology, 157:1236-1240.
BERKOWITZ, G. S.; HOLFORD, T. R. & BERKOWITZ, R. L., 1982. Effects of cigarette smoking, alcohol, coffee and tea consumption on preterm delivery. Early Development, 7:239-250.
BROOKE, O. G.; ANDERSON, H. R.; BLAND, J. M.; PEACOCK, J. L. & STEWART, C. M., 1989. Effects on birth weight of smoking, alcohol, caffeine, socioeconomic factors and psychosocial stress. British Medical Journal, 298:795-801.
CAAN, B. J. & GOLDHABER, M. K., 1989. Caffeinated beverages and low birthweight: a case-control study. American Journal of Public Health, 79:1299-1300.
DICKERSIN, K. & BERLIN, J. A., 1992. Meta-analysis: state-of-the-science. Epidemiologic Reviews, 14:154-176.
DLUGOSZ, L. & BRACKEN, M. B., 1992. Reproductive effects of caffeine: a review and theoretical analysis. Epidemiologic Reviews, 14:83-100.
FENSTER, L.; ESKENAZI, B.; WINDHAM, G. C. & SWAN, S., 1991. Caffeine consumption during pregnancy and fetal growth. American Journal of Public Health, 81:458-461.
FORTIER, I.; MARCOUX, S. & BEAULAC-BAILLARGEON, L., 1993. Relation of caffeine intake during pregnancy to intrauterine growth retardation and preterm birth. American Journal of Epidemiology, 137:931-940.
FURUHASHI, N.; SATO, S.; SUZUKI, M.; HIRUTA, M.; TANAKA, M. & TAKAHASHI, T., 1985. Effects of caffeine ingestion during pregnancy. Gynecologic and Obstetric Investigation, 19:187-191.
GODEL, J. C.; PABST, H. F.; HODGES, P. E.; JOHNSON, K. E.; FROESE, G. J. & JOFFRES, M. R., 1992. Smoking and caffeine and alcohol intake during pregnancy in a northern population: effect on fetal growth. Canadian Medical Association Journal, 147:181-188.
GOLDSTEIN, A. & WARREN, R., 1962. Passage of caffeine into human gonadal and fetal tissue. Biochemical Pharmacology, 11:166-168.
GREENLAND, S., 1987. Quantitative methods in the review of epidemiologic literature. Epidemiologic Reviews, 9:1-30.
HASSELBLAD, V.; MOSTELLER, F.; LITTEMBERG, B.; CHALMERS, T. C.; HUNINK, M. G.; TURNER, J. A.; MORTON, S. C.; DIEHR, P. & WONG, J. B., 1995. A survey of current problems in meta-analysis. Medical Care, 33:202-220.
HELLER, J., 1987. What do we know about the risks of caffeine consumption in pregnancy? British Journal of Addiction, 82:885-889.
JAMES, J. E. & PAULL, I., 1985. Caffeine and human reproduction. Reviews of Environmental Health, 5:151-167.
KLEINBAUM, D. G.; KUPPER, L. L. & MORGENSTERN, H., 1982. Epidemiologic Research: Principles and Quantitative Methods. New York: Van Nostrand Reinhold.
KUZMA, J. W. & SOKOL, R. J., 1982. Maternal drinking behavior and decreased intrauterine growth. Alcoholism: Clinical and Experimental Research, 6:396-402.
LARROQUE, B.; KAMINSKI, M.; LELONG, N.; SUBTIL, D. & DEHAENE, P., 1993. Effects on birth weight of alcohol and caffeine consumption during pregnancy. American Journal of Epidemiology, 137:941-950.
LINN, S.; SCHOENBAUM, S. C.; MONSON, R. R.; ROSNER, B.; STUBBLEFIELD, P. G. & RYAN, K. J., 1982. No association between coffee consumption and adverse outcomes of pregnancy. New England Journal of Medicine, 306:141-145.
MAU, G. & NETTER, P., 1974. Kaffee- und Alkoholkonsum - Risikofaktoren in der Schwangerschaft? Geburtshilfe Frauenheilkd, 34:1018-1022.
MARTIN, T. R. & BRACKEN, M. B., 1987. The association between low birth weight and caffeine consumption during pregnancy. American Journal of Epidemiology, 126:813-821.
MCDONALD, A. D.; ARMSTRONG, B. G. & SLOAN, M., 1992. Cigarette, alcohol, and coffee consumption and prematurity. American Journal of Public Health, 82:87-90.
MILLS, J. L.; HOLMES, L. B.; AARONS, J. H.; SIMPSON, J. L.; BROWN, Z. A.; JOVANOVIC-PETERSON, L. G.; CONLEY, M. R.; GRAUBARD, B. I.; KNOPP, R. H. & METZGER, B. E., 1993. Moderate caffeine use and risk of spontaneous abortion and intrauterine growth retardation. Journal of the American Medical Association, 269:593-597.
MUÑOZ, L. M.; LONNERDAL, B.; KEEN, C. L. & DEWEY, K. G., 1988. Coffee consumption as a factor in iron deficiency anemia among pregnant women and their infants in Costa Rica. American Journal of Clinical Nutrition, 48:645-651.
NAROD, S. A.; SANJOSE, S. & VICTORA, C. G., 1991. Coffee during pregnancy: a reproductive hazard? American Journal of Obstetrics and Gynecology, 164:1109-1114.
OLSEN, J.; OVERVAD, K. & FRISCHE, G., 1991. Coffee consumption, birthweight, and reproductive failures. Epidemiology, 2:370-374.
PASTORE, L. M. & SAVITZ, D. A., 1995. Case-control study of caffeinated beverages and preterm delivery. American Journal of Epidemiology, 141:61-69.
PEACOCK, J. L.; BLAND, M. & ANDERSON, H. R., 1991. Effects on birth weight of alcohol and caffeine consumption in smoking women. Journal of Epidemiology and Community Health, 45:159-163.
SHIONO, P. H. & KLEBONOFF, M. A., 1993. Invited commentary: caffeine and birth outcomes. American Journal of Epidemiology, 137:951-954.
SHU, X. O.; HATCH, M. C.; MILLS, J.; CLEMENS, J. & SUSSER, M., 1995. Maternal smoking, alcohol drinking, caffeine consumption, and fetal growth: results from a prospective study. Epidemiology, 6:115-120.
SOIKA, L. F., 1979. Effects of methylxanthines on the fetus. Clinical Perinatology, 6:37-51.
SPECTOR, T. D. & SIMON, G. T., 1991. The potential and limitations of meta-analysis. Journal of Epidemiology and Community Health, 45:89-92.
UKNEGNS (The United Kingdom Nutritional Epidemiology Group for the Nutrition Society), 1993. Diet and Cancer: a Review of the Epidemiological Literature. Appendix 1: the Scoring System. London: UKNEGNS.
VAN DEN BERG, B. J., 1977. Epidemiologic observations of prematurity: effects of tobacco, coffee and alcohol. In: The Epidemiology of Prematurity (M. D. Reed & F. J. Stanley, eds.), pp. 157-176. Baltimore: Urban & Schwarzenberg.
WATKINSON, B. & FRIED, P. A., 1985. Maternal caffeine use before, during and after pregnancy and effects upon offspring. Neurobehavioral Toxicology and Teratology, 7:9-17.
WEATHERSBEE, P. S.; OLSEN, L. K. & LODGE, J. R., 1977. Caffeine and pregnancy: a retrospective survey. Postgraduate Medicine, 62:64-69.
WILLIAMS, M. A.; MITTENDORF, R.; STUBBLEFIELD, P. G.; LIEBERMAN, E.; SCHOENBAUM, S. C. & MONSON, R., 1992. Cigarettes, coffee, and preterm premature rupture of the membranes. American Journal of Epidemiology, 135:895-903.