Survival, Longitudinal, and Machine-Learning approaches to Reproductive, Obstetric, and Child Health

  • Rajeshwari Sundaram, PhD, MStat, Senior Investigator, Social and Behavioral Sciences Branch, DiPHR
  • Abhisek Saha, PhD, MStat, Research Fellow
  • Yuchen Mao, PhD, Visiting Fellow

My research interests include the development of statistical methodology for multivariate survival and longitudinal data (measured on differing time scales), joint modeling, dynamic risk prediction, complex environmental mixtures, and dimension reduction, with applications in population health and medicine. New and current projects include investigating the health effects of exposure to mixtures of environmental contaminants (particularly in relation to reproductive health, fetal growth, and child health), addressing general questions in reproductive and environmental epidemiology, and identifying patterns of, and delays in, labor progression in obstetrics.

Statistical challenges in assessing human fecundity and labor progression

Reproductive Health

Fecundity, the biologic capacity for reproduction in humans, is a health outcome of considerable interest, with an impact on later-onset adult diseases. Many prospective pregnancy studies have been conducted at the Division of Population Health Research (DiPHR), NICHD, to assess fecundity, including the Longitudinal Investigation of Fertility and the Environment Study, EAGER, IDEAL, and FAZST, providing a rich source of data for studying this outcome. A unique feature of “pregnancy” as a time-to-event outcome is that it requires an intermediate event to occur: a woman or couple must first be at risk for pregnancy. Thus, standard time-to-event analysis cannot be applied directly to time to pregnancy (TTP). Similar data arise in studies of sexually transmitted diseases and HIV, where one must likewise be at risk for infection, and our proposed methods can be readily adapted to them.

Furthermore, depending on the study design (retrospective, prospective, or current duration), the amount of data collected can vary significantly, ranging from participant-level information to cycle-level (monthly) and daily-level data on both partners. This poses significant statistical challenges: the longitudinal processes of interest may be measured on a different timescale (days or cycles) from the time-to-event of interest, which in turn informatively censors the longitudinal process. My research seeks to build appropriate statistical models for such complex data, including joint models (with a view towards dynamic prediction), that account for the impact of various biological and behavioral processes on the endpoints of interest and relax statistical assumptions that may not be biologically valid. The eventual goal of the research is to build an online risk calculator for infertility.

Environmental researchers have called for epidemiological studies to move beyond analyzing the health effects of individual chemical toxicants towards the study of chemical mixtures. This has been identified as one of the research priorities of the U.S. National Institute of Environmental Health Sciences (NIEHS) and is recognized by the U.S. National Academy of Sciences, which recommends that risk assessments consider the possibility of cumulative effects from multiple chemical exposures. However, progress has been thwarted by the complexities of such studies; they place a high demand on the quantity and quality of data and require advanced statistical methods, the development of which is an active area of research.

The statistical challenges are considerable: high correlation between exposures can lead to inflated standard errors and instability in effect estimates; differing degrees of measurement error among exposures can bias effect estimates; and there may be insufficient power to estimate small effects in the presence of measurement error, multicollinearity, small sample size, interactions, and non-linearity. Furthermore, chemical toxicants may be found in trace amounts, with a significant proportion of values below limits of detection.

The study of chemical mixtures addresses some of the shortcomings of single-chemical epidemiology. Promising approaches have been proposed in the literature, such as Bayesian kernel machine regression [Bobb JF et al, Biostatistics 2015;16(3):493] and weighted quantile sum regression [Gennings C et al, Epidemiology 2010;21(Suppl 4):S77]; a detailed review can be found in Lazarevic N et al, Environ Health Perspect 2019;127(2):026001. Our focus is on developing methods to identify the “important drivers” in mixtures of chemical toxicants that are significantly associated with health outcomes, particularly time-to-pregnancy and menstrual cycle characteristics.
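The effect of correlated exposures on standard errors, noted above, can be illustrated with the variance inflation factor (VIF). The sketch below is only an illustration under simplifying assumptions: it considers two exposures in an ordinary linear model, where VIF = 1 / (1 - r^2) and the standard error of each coefficient grows by a factor of sqrt(VIF) relative to uncorrelated exposures. The concentration values and the names chem_a, chem_b, and chem_c are hypothetical and are not taken from any DiPHR study.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_exposures(x, y):
    """VIF for either exposure when exactly two exposures enter a linear model."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical log-concentrations of three chemicals in six participants.
chem_a = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
chem_b = [0.3, 0.4, 1.1, 1.3, 2.0, 2.2]  # tracks chem_a closely
chem_c = [1.0, 0.1, 1.9, 0.4, 1.2, 0.8]  # only weakly related to chem_a

vif_ab = vif_two_exposures(chem_a, chem_b)
vif_ac = vif_two_exposures(chem_a, chem_c)
print(f"A with B: VIF = {vif_ab:.1f}, standard error factor = {math.sqrt(vif_ab):.1f}")
print(f"A with C: VIF = {vif_ac:.1f}, standard error factor = {math.sqrt(vif_ac):.1f}")
```

With more than two exposures, or in survival models, the same idea applies but the calculation differs; this is one reason mixture analyses often turn to regularization or dimension reduction rather than fitting every exposure in a single unpenalized model.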

Our focus is threefold: to develop regularization-based regression approaches for discrete survival data with left truncation and right censoring, with frailty in the model; to address the impact of missing data in this context; and, lastly, to account for limits of detection. We also plan to extend these approaches to joint modeling of longitudinal and survival data, so as to tease apart associations with menstrual cycle characteristics from those with fecundity.
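To make the first aim more concrete, here is a minimal sketch of a penalized likelihood for discrete time-to-pregnancy data with left truncation and right censoring. It is an illustration under stated assumptions, not the method of the publications listed below: it assumes a logit link for the cycle-specific conception probability, uses a lasso (L1) penalty as one example of regularization, and omits the frailty term entirely. The function name, record layout, and parameter names are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def penalized_neg_loglik(records, baseline, beta, lam):
    """Lasso-penalized negative log-likelihood for discrete survival data.

    records  : list of (entry, exit_cycle, event, x), where
               entry      = cycles already at risk before enrollment (left truncation),
               exit_cycle = last cycle observed,
               event      = 1 if pregnancy occurred in exit_cycle, 0 if right-censored,
               x          = list of covariates (e.g. exposure concentrations).
    baseline : dict mapping cycle number to the baseline hazard on the logit scale.
    beta     : list of regression coefficients, one per covariate.
    lam      : non-negative L1 penalty weight.
    """
    nll = 0.0
    for entry, exit_cycle, event, x in records:
        linear = sum(b * v for b, v in zip(beta, x))
        # Left truncation: only cycles observed after enrollment contribute.
        for t in range(entry + 1, exit_cycle + 1):
            hazard = expit(baseline[t] + linear)
            conceived = 1 if (event == 1 and t == exit_cycle) else 0
            nll -= conceived * math.log(hazard) + (1 - conceived) * math.log(1.0 - hazard)
    return nll + lam * sum(abs(b) for b in beta)


# Two hypothetical couples with one exposure each.
records = [
    (0, 3, 1, [0.5]),  # enrolled at the start, conceived in cycle 3
    (2, 4, 0, [1.0]),  # enrolled after 2 cycles of trying, censored after cycle 4
]
baseline = {t: 0.0 for t in range(1, 5)}
print(penalized_neg_loglik(records, baseline, beta=[0.3], lam=0.5))
```

Minimizing this function over the baseline terms and beta, for a suitable lam, shrinks the coefficients of unimportant exposures towards zero. Adding a couple-level frailty would require integrating over its distribution, and under left truncation that distribution must itself be conditioned on not having conceived before enrollment.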

Statistical methods for labor progression

My interest also lies in developing state-of-the-art statistical methods to assess labor progression in pregnant women. Labor curves are useful in understanding the progression of labor in women and set the standards of care for women in labor. This was the focus of the Consortium on Safe Labor Study, conducted at the DiPHR, NICHD.

The progression of spontaneous labor is highly complex and is divided into three stages. The first stage is characterized by cervical dilation from 0 cm to 10 cm; the second stage is defined by the period during which a woman pushes out the fetus, with fetal descent measured from –3 to +3; and the third and final stage ends with delivery of the placenta. Typically, recorded cervical dilation ranges from 1 cm to 10 cm, with the woman fully dilated at 10 cm. Measurements are recorded in 1 cm increments, so the information tracked is the progression of dilation from 1 cm → 2 cm → 3 cm → … → 9 cm → 10 cm. Progress in the second stage of labor, measured through fetal station (descent of the fetal head), is labeled –3 → –2 → … → +3 → +4.

Given the changes in modern obstetrical practice, understanding labor progression in the presence of various medical interventions (for example, the use of an epidural) and the modern socio-demographic distribution of women (later age at pregnancy) is of considerable interest. The questions of interest range from understanding the rate of change of cervical dilation, which is important in identifying whether a woman is in active labor, to determining whether the active phase has clear acceleration and deceleration phases. Understanding whether a woman’s labor is arrested, i.e., held up at a certain cervical dilation, is an important factor in deciding whether she undergoes a cesarean delivery. The relationship between a prolonged first stage of labor and the second stage of labor is also of considerable interest. Statistically, the data provide interesting challenges and require innovative methods for addressing these questions.
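As a rough illustration of two of the quantities discussed above, the rate of change of cervical dilation and the time spent held up at a given dilation, the following sketch works through a single hypothetical sequence of pelvic exams. It is purely descriptive and makes no clinical claim: the exam times and values are invented, the function names are hypothetical, and no threshold for active labor or arrest is implied.

```python
def dilation_rates(exams):
    """Rate of dilation (cm per hour) between consecutive exams.

    exams: list of (hours_since_admission, dilation_cm), ordered by time.
    Returns a list of (start_hour, end_hour, start_cm, rate_cm_per_hour).
    """
    rates = []
    for (t0, d0), (t1, d1) in zip(exams, exams[1:]):
        if t1 <= t0:
            raise ValueError("exam times must be strictly increasing")
        rates.append((t0, t1, d0, (d1 - d0) / (t1 - t0)))
    return rates


def longest_plateau(exams):
    """Longest observed stretch (hours) with no change in recorded dilation.

    Returns (hours, dilation_cm); dilation_cm is None if no plateau was observed.
    """
    best_hours, best_cm = 0.0, None
    start_time, current_cm = exams[0]
    for time, cm in exams[1:]:
        if cm == current_cm:
            if time - start_time > best_hours:
                best_hours, best_cm = time - start_time, cm
        else:
            # Dilation changed: a new candidate plateau starts at this exam.
            start_time, current_cm = time, cm
    return best_hours, best_cm


# Hypothetical exam record for one woman: (hours since admission, dilation in cm).
exams = [(0.0, 4), (2.0, 5), (4.0, 6), (6.0, 6), (8.5, 6), (10.0, 8), (11.0, 10)]

for start, end, cm, rate in dilation_rates(exams):
    print(f"{start:4.1f}-{end:4.1f} h, from {cm} cm: {rate:.2f} cm/h")
print("longest plateau (hours, cm):", longest_plateau(exams))
```

Because dilation is only observed at exam times, the moment each centimeter is reached is interval-censored, so exam-to-exam rates and plateau lengths like these are crude summaries. This is part of why such data call for dedicated statistical methods rather than simple differencing.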

Publications

  1. Saha A, Sundaram R. Variable selection for discrete survival model with frailty in presence of left truncation and right censoring: Studying association of environmental toxicants on time‐to‐pregnancy. Stat Med 2023;42(2):193–208
  2. Saha A, Ma L, Biswas A, Sundaram R. Joint modeling of geometric features of longitudinal process and discrete survival time measured on nested timescales: an application to fecundity studies. Stat Biosci 2024;16(1):86–106
  3. Saha A, Putnick DL, Lin H, Yeung E, Sundaram R, Peddada SD. Multiple imputation for compositional data (MICoDa) adjusting for covariates. In: Larriba Y, ed. Statistics at the Forefront of the Biomedical Advances. Springer; 2023:157–184
  4. Lee M, Saha A, Sundaram R, Albert PS, Zhao S. Accommodating detection limits of multiple exposures in environmental mixture analyses: an overview of statistical approaches. Environ Health 2024;23:48
  5. Trees I, Saha A, Putnick DL, Clayton PK, Mendola P, Bell E, Sundaram R, Yeung E. Prenatal exposure to air pollutant mixtures and birthweight in the upstate KIDS cohort. Environ Int 2024;187:108692