Food and Nutrition

Methane prediction equations including genera of rumen bacteria as predictor variables improve prediction accuracy

Published in Scientific Reports

Authors: Boyang Zhang, Shili Lin, Luis Moraes, Jeffrey Firkins, Alexander N Hristov, Ermias Kebreab, Peter H Janssen, André Bannink, Alireza R Bayat, Les A Crompton, Jan Dijkstra, Maguy A Eugène, Michael Kreuzer, Mark Mcgee, Christopher K Reynolds, Angela Schwarm, David R Yáñez-Ruiz, Zhongtang Yu

Methane (CH₄) emissions from ruminants are a significant environmental concern, necessitating accurate prediction for emission inventories. Existing models rely solely on dietary and host animal-related data, ignoring the predictive power of the rumen microbiota, the source of CH₄. To address this limitation, we developed novel CH₄ prediction models that incorporate rumen microbes as predictors, alongside animal- and feed-related predictors, using four statistical/machine learning (ML) methods. These include random forest combined with boosting (RF-B), least absolute shrinkage and selection operator (LASSO), generalized linear mixed model with LASSO (glmmLasso), and smoothly clipped absolute deviation (SCAD) implemented on linear mixed models. Using a sheep dataset (218 observations) containing both animal data and rumen microbiota data (relative sequence abundance of 330 genera of rumen bacteria, archaea, protozoa, and fungi), we developed linear mixed models to predict CH₄ production (g CH₄/animal·d, ANIM-B models) and CH₄ yield (g CH₄/kg of dry matter intake, DMI-B models). We also developed models based solely on animal-related data. Prediction performance was evaluated 200 times with random data splits, while fitting performance was assessed without data splitting. The inclusion of microbial predictors improved the models, as indicated by decreased root mean square prediction error (RMSPE) and mean absolute error (MAE), and increased Lin's concordance correlation coefficient (CCC). Both glmmLasso and SCAD reduced the Akaike information criterion (AIC) and Bayesian information criterion (BIC) for both the ANIM-B and the DMI-B models, while the other two ML methods had mixed outcomes. By balancing prediction performance and fitting performance, we obtained one ANIM-B model (containing 10 genera of bacteria and 3 animal-related variables) fitted using glmmLasso and one DMI-B model (5 genera of bacteria and 1 animal-related variable) fitted using SCAD.
This study highlights the importance of incorporating rumen microbiota data in CH₄ prediction models to enhance accuracy and robustness. Additionally, ML methods facilitate the selection of microbial predictors from high-dimensional metataxonomic data of the rumen microbiota without overfitting. Moreover, the identified microbial predictors can serve as biomarkers of CH₄ emissions from sheep, providing valuable insights for future research and mitigation strategies.
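
The evaluation scheme described in the abstract (repeated random data splits scored by RMSPE, MAE, and Lin's CCC) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 30% test fraction, and the single-predictor least-squares fit are all assumptions made for illustration; the abstract does not state the split ratio, and the study itself fitted linear mixed models with penalized variable selection, which this sketch does not attempt. RMSPE is reported here in the units of the response, although it is sometimes expressed as a percentage of the observed mean.

```python
import math
import random
from statistics import fmean


def rmspe(observed, predicted):
    """Root mean square prediction error, in the units of the response."""
    return math.sqrt(fmean((o - p) ** 2 for o, p in zip(observed, predicted)))


def mae(observed, predicted):
    """Mean absolute error."""
    return fmean(abs(o - p) for o, p in zip(observed, predicted))


def lin_ccc(observed, predicted):
    """Lin's concordance correlation coefficient (1/n variance estimates)."""
    mo, mp = fmean(observed), fmean(predicted)
    var_o = fmean((o - mo) ** 2 for o in observed)
    var_p = fmean((p - mp) ** 2 for p in predicted)
    cov = fmean((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    return 2 * cov / (var_o + var_p + (mo - mp) ** 2)


def fit_simple_linear(x_train, y_train):
    """Hypothetical stand-in model: one-predictor ordinary least squares.

    Returns a function mapping a single x value to a prediction.
    """
    mx, my = fmean(x_train), fmean(y_train)
    sxx = sum((xi - mx) ** 2 for xi in x_train)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x_train, y_train))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda xi: intercept + slope * xi


def repeated_split_eval(x, y, fit, n_repeats=200, test_fraction=0.3, seed=0):
    """Average RMSPE, MAE, and CCC over repeated random train/test splits.

    test_fraction is an assumed value, not one taken from the abstract.
    """
    rng = random.Random(seed)
    n = len(y)
    n_test = max(2, round(n * test_fraction))
    scores = {"rmspe": [], "mae": [], "ccc": []}
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Fit on the training part only, then score on the held-out part.
        predict = fit([x[i] for i in train], [y[i] for i in train])
        obs = [y[i] for i in test]
        pred = [predict(x[i]) for i in test]
        scores["rmspe"].append(rmspe(obs, pred))
        scores["mae"].append(mae(obs, pred))
        scores["ccc"].append(lin_ccc(obs, pred))
    return {name: fmean(values) for name, values in scores.items()}
```

Lower RMSPE and MAE together with a CCC closer to 1 indicate better predictions, which is the pattern the abstract reports when microbial predictors are added. Passing a different `fit` callable to `repeated_split_eval` lets two candidate models be compared on the same random splits by reusing the same `seed`.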