Systematic identification of disease-associated 3D neighborhoods in protein structures
In brief
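[Preprint - preliminary results] The authors present a scalable framework that maps rare missense variants from cases and controls onto three-dimensional protein structures and identifies spatially localized regions enriched for case variants. The framework builds on the 3D Neighborhood Test (3DNT), which the authors previously introduced in a single-gene analysis of ATP2B2. They applied it to Mendelian disease variants from ClinVar, de novo mutations from 37,486 autism spectrum disorder probands, and case-control exome sequencing cohorts for epilepsy and schizophrenia. Significant clusters were found in 872 genes for Mendelian disease, 70 genes for autism, one gene for epilepsy, and three genes for schizophrenia. The clusters are strongly enriched for known functional sites, so the method localizes disease-associated variation to specific regions of a protein, something that standard gene-level analyses largely miss.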
Original abstract
Rare variant association studies (RVAS) have identified hundreds of genes contributing to human disease, yet gene-level signals provide limited insight into the molecular mechanisms underlying pathogenicity. Missense variants, which can be mapped onto three-dimensional protein structures, offer an opportunity to gain novel mechanistic insights. Here, we develop a scalable framework for systematically mapping case and control variants onto protein structures and identifying spatially localized regions enriched for case variants. Our framework builds on the 3D Neighborhood Test (3DNT), which we recently introduced in a single-gene analysis of ATP2B2, and enables the genome-wide analysis of rare coding variation beyond standard gene-level approaches. We applied 3DNT across multiple large-scale datasets, including Mendelian disease variants from ClinVar, de novo mutations from 37,486 autism spectrum disorder (ASD) probands, and case-control exome sequencing cohorts for epilepsy and schizophrenia. We identified significant clusters in 872 genes for Mendelian disease, in 70 genes for autism, in one gene for epilepsy, and in three genes for schizophrenia. These clusters are strongly enriched for known functional sites and provide insight into both known and previously unrecognized disease genes. Our results demonstrate that scalably integrating RVAS data with protein structure predictions localizes disease-associated variation to specific functional regions and reveals a layer of disease biology that is largely invisible to standard analyses.
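To make the idea of a "spatially localized region enriched for case variants" concrete, here is a minimal sketch of a neighborhood scan. It is not the authors' 3DNT implementation: the abstract does not specify the statistic, so a one-sided Fisher's exact test is swapped in as a stand-in. The function names, the 10 Å radius, and the toy coordinates and counts are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb, dist


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table
    [[a, b], [c, d]] = [[case inside, control inside],
                        [case outside, control outside]].
    Returns P(X >= a) under the hypergeometric null."""
    n_in = a + b          # variants inside the neighborhood
    n_case = a + c        # all case variants
    total = a + b + c + d
    upper = min(n_in, n_case)
    numerator = sum(
        comb(n_case, k) * comb(total - n_case, n_in - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, n_in)


def neighborhood_scan(coords, case_counts, control_counts, radius=10.0):
    """For each residue, test whether the sphere of `radius` angstroms
    around it holds more case variants than expected by chance.

    coords: residue index -> (x, y, z), e.g. C-alpha positions (assumed).
    case_counts / control_counts: residue index -> variant count;
    every key is assumed to be present in `coords`.
    """
    total_case = sum(case_counts.values())
    total_control = sum(control_counts.values())
    p_values = {}
    for center, xyz in coords.items():
        members = [r for r, p in coords.items() if dist(xyz, p) <= radius]
        a = sum(case_counts.get(r, 0) for r in members)
        b = sum(control_counts.get(r, 0) for r in members)
        p_values[center] = fisher_one_sided(
            a, b, total_case - a, total_control - b
        )
    return p_values


# Toy data (hypothetical): residues 1-3 sit close together and carry
# only case variants; residues 4-6 sit 30 angstroms away and carry
# only control variants.
coords = {
    1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (0.0, 3.8, 0.0),
    4: (30.0, 0.0, 0.0), 5: (33.8, 0.0, 0.0), 6: (30.0, 3.8, 0.0),
}
case_counts = {1: 3, 2: 2, 3: 2}
control_counts = {4: 3, 5: 2, 6: 2}

p_values = neighborhood_scan(coords, case_counts, control_counts)
best = min(p_values, key=p_values.get)
print(best, p_values[best])
```

In this toy case the neighborhoods centered on residues 1 to 3 receive small p-values, while those centered on residues 4 to 6 do not. A genome-wide application would also need a correction for the number of neighborhoods and genes tested, which this sketch omits.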