02079nas a2200277 4500000000100000008004100001260001500042100002300057700001800080700001900098700001500117700002500132700002000157700002200177700001600199700001600215700002100231700001800252700002100270700002200291245013600313856005500449300001300504490000700517520127700524 2024 d c2024-05-081 aLawrence Middleton1 aIoannis Melas1 aChirag Vasavda1 aArwa Raies1 aBenedek Rozemberczki1 aRyan S. Dhindsa1 aJustin S. Dhindsa1 aBlake Weido1 aQuanli Wang1 aAndrew R. Harper1 aGavin Edwards1 aSlavé Petrovski1 aDimitrios Vitsios00aPhenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data uhttps://www.science.org/doi/10.1126/sciadv.adj1424 aeadj14240 v103 aThe ongoing expansion of human genomic datasets propels therapeutic target identification; however, extracting gene-disease associations from gene annotations remains challenging. Here, we introduce Mantis-ML 2.0, a framework integrating AstraZeneca’s Biological Insights Knowledge Graph and numerous tabular datasets, to assess gene-disease probabilities throughout the phenome. We use graph neural networks, capturing the graph’s holistic structure, and train them on hundreds of balanced datasets via a robust semi-supervised learning framework to provide gene-disease probabilities across the human exome. Mantis-ML 2.0 incorporates natural language processing to automate disease-relevant feature selection for thousands of diseases. The enhanced models demonstrate a 6.9% average classification power boost, achieving a median receiver operating characteristic (ROC) area under curve (AUC) score of 0.90 across 5220 diseases from Human Phenotype Ontology, OpenTargets, and Genomics England. Notably, Mantis-ML 2.0 prioritizes associations from an independent UK Biobank phenome-wide association study (PheWAS), providing a stronger form of triaging and mitigating against underpowered PheWAS associations. Results are exposed through an interactive web resource.