Sex stratification combined with SHAP for subgroup exploration: building an interpretable machine learning model to predict lacosamide concentration in children with epilepsy, and its validation
Sex Stratification Combined with SHAP-Driven Secondary Subgroup Mining: Construction of an Interpretable Machine Learning Model for Steady-State Concentration of Lacosamide in Pediatric Epilepsy and Validation
In brief
[Preprint - preliminary results] Researchers developed a machine learning model to predict blood concentrations of the drug lacosamide in children with epilepsy, taking into account differences between boys and girls. The study included more than 3,200 children and found that the dose of the drug and the patient's sex strongly influence how the body processes it. The new model proved more accurate and reliable than the single-layer models it was compared with, which may help physicians tailor treatment to the individual needs of young patients.
Original abstract (English)
Objective This study aimed to construct machine learning prediction models for the steady-state plasma concentration of lacosamide in children based on gender stratification.
Methods A total of 3,235 children with epilepsy treated with lacosamide at Kunming Children's Hospital from May 2022 to May 2024 were retrospectively included as the modeling cohort. Following gender stratification, the cohort was divided into training and internal validation sets at a 7:3 ratio. Additionally, 1,600 patients selected from June 2024 to January 2026 served as the external validation cohort and were synchronously stratified. Initially, basic machine learning models for plasma concentration were constructed for males and females separately. SHAP analysis was employed to screen core influencing indicators, which were then combined with gender to establish a dual-layer stratified optimization model. The predictive performance of single-layer and dual-layer models was compared using R², RMSE, and MAE. Model fitting effects and generalization capabilities were verified using both internal and external cohorts.
Results Univariate stratified feature screening revealed significant differences in the core variables influencing plasma concentrations between male and female pediatric patients. Comparative analysis across multiple models confirmed that LightGBM was the optimal prediction model for both sexes, demonstrating robust goodness-of-fit in both internal and external validation. SHAP interpretability analysis identified dosage as the core regulatory factor for both males and females. The results of the "sex + dosage" two-tier stratified modeling showed a significant increase in R² compared to the single-tier model, accompanied by substantial improvements in model stability and precision. Furthermore, interpretability analysis and decision tree analysis revealed significant heterogeneity in the regulatory pathways of plasma concentrations among different patient subgroups.
Conclusion Plasma concentrations of lacosamide in children exhibit gender- and dose-related pharmacokinetic heterogeneity. A two-layer machine learning model stratified by gender and dose demonstrates superior predictive performance, calibration, and generalizability compared to single-layer models.
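The abstract mentions a 7:3 split performed within each sex stratum and a comparison of models by R², RMSE, and MAE. The sketch below illustrates only those two steps in plain Python. It is not the authors' pipeline: the function names, the record fields (`id`, `sex`), and all numbers are hypothetical toy values, and the LightGBM and SHAP stages are deliberately left out.

```python
import math
import random
from collections import defaultdict


def r2_rmse_mae(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n), mae


def stratified_split(records, key, train_frac=0.7, seed=0):
    """Split records into train/validation sets separately within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)
    train, valid = [], []
    for group in strata.values():
        group = group[:]  # copy so the caller's list order is untouched
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        valid.extend(group[cut:])
    return train, valid


# Hypothetical toy cohort: 10 boys and 10 girls (field names are illustrative).
patients = [{"id": i, "sex": "M" if i < 10 else "F"} for i in range(20)]
train, valid = stratified_split(patients, key="sex")

# Hypothetical observed vs. predicted concentrations (arbitrary units).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2, rmse, mae = r2_rmse_mae(observed, predicted)
```

Splitting inside each stratum keeps the male-to-female ratio identical in the training and validation sets, which is presumably why the abstract describes stratifying before dividing the cohort. In a two-tier setup the same function could be called with a combined key (for example sex plus a dose band), though the abstract does not say how the dose strata were defined.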
Publication metadata
Journal
Preprint (medRxiv/bioRxiv)
Publication date
29.06.2026
DOI
10.22541/authorea.15005407/v1
Europe PMC ID
PPR1262653
Authors
li F, Lihui G, li F, Linbo L, Lilin Z, Jing Z, Aihua Y, Shuang L, Yi R