DANN Score

DANN (Deleterious Annotation of genetic variants using Neural Networks) is a computational tool that predicts whether genetic variants are likely to cause disease.

It uses deep learning to analyze DNA changes and assigns each variant a score between 0 and 1, helping researchers prioritize which variants deserve further investigation.

How to Use DANN Scores in Practice

DANN scores fall on a continuous scale from 0 to 1:

Scores near 0.0 mean the variant is likely harmless
Scores near 1.0 mean the variant is likely harmful

Research suggests a threshold of 0.96 can effectively balance sensitivity and specificity, though optimal thresholds may vary by application.
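To see how a threshold trades sensitivity against specificity, the sketch below computes both for a few cutoffs. It assumes you have a set of variants with known labels; the function name `sensitivity_specificity` and the toy scores and labels are hypothetical and made up for illustration, not taken from any DANN benchmark.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float],
    is_harmful: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores above `threshold` are called harmful."""
    if len(scores) != len(is_harmful):
        raise ValueError("scores and is_harmful must have the same length")
    tp = fn = tn = fp = 0
    for score, harmful in zip(scores, is_harmful):
        called_harmful = score > threshold
        if harmful and called_harmful:
            tp += 1  # harmful variant correctly flagged
        elif harmful:
            fn += 1  # harmful variant missed
        elif called_harmful:
            fp += 1  # harmless variant wrongly flagged
        else:
            tn += 1  # harmless variant correctly passed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one harmful and one harmless variant")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and labels, invented purely for illustration.
toy_scores = [0.99, 0.97, 0.90, 0.98, 0.50, 0.10]
toy_labels = [True, True, True, False, False, False]

for cutoff in (0.50, 0.96, 0.985):
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff)
    print(f"threshold={cutoff:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cutoff flags fewer variants, so specificity rises while sensitivity falls. Running the same loop on your own labeled variants is one way to check whether 0.96 suits your application.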

Interpretation

A DANN score higher than 0.96 suggests that the variant is likely harmful.
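The rule above can be sketched as a small helper that labels and ranks variants. This is a minimal illustration, not part of any DANN software: the function name `interpret_dann`, the variant identifiers, and the scores are all hypothetical, and the 0.96 default is only the threshold quoted in the text.

```python
from typing import Optional

# Threshold quoted in the text; treat it as a tunable default, not a fixed rule.
DANN_THRESHOLD = 0.96


def interpret_dann(score: Optional[float], threshold: float = DANN_THRESHOLD) -> str:
    """Map a DANN score to a coarse label using a strict 'higher than' cutoff."""
    if score is None:
        return "no score available"
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"DANN scores lie in [0, 1], got {score}")
    return "likely harmful" if score > threshold else "below threshold"


# Hypothetical variants and scores, for illustration only.
variants = {
    "variant_a": 0.998,
    "variant_b": 0.42,
    "variant_c": 0.96,
    "variant_d": None,
}

# Rank highest score first so the most suspicious variants are reviewed first;
# variants without a score sort to the end.
ranked = sorted(
    variants.items(),
    key=lambda item: item[1] if item[1] is not None else -1.0,
    reverse=True,
)

for name, score in ranked:
    print(name, score, interpret_dann(score))
```

A score of exactly 0.96 is not flagged here, because the rule is stated as "higher than". A variant that falls below the threshold is simply lower priority; the label does not establish that it is benign.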

History and Methods Behind DANN

DANN was developed by researchers at UC Irvine as an improvement over the CADD method. While CADD used a linear support vector machine (SVM), DANN employs a deep neural network to capture complex, non-linear relationships in genetic data that a linear model cannot represent.

Training Data

The algorithm analyzes 949 different genomic features and was trained on over 16 million genetic variants. It contrasts naturally occurring variants (likely benign due to evolutionary selection) against simulated variants enriched for harmful effects.

Key Advantages

Scores everything: Unlike tools that only work on protein-coding genes, DANN can analyze the entire genome, including regulatory regions
Continuous scoring: Provides scores from 0 to 1, where higher scores indicate more likely harmful variants
Superior performance: Shows a 19% relative reduction in error rate and a 14% relative increase in area under the curve (AUC) compared with CADD's SVM approach
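The percentages in the last item are easiest to read as relative changes, which is the usual convention for figures like these. The sketch below shows the arithmetic under that assumption. The baseline error rate and AUC are hypothetical placeholders chosen only to make the calculation concrete; they are not CADD's published results.

```python
def apply_relative_change(baseline: float, relative_change: float) -> float:
    """Scale `baseline` by a relative change, e.g. -0.19 for a 19% reduction."""
    return baseline * (1.0 + relative_change)


# Hypothetical baseline values, NOT measured results from any tool.
hypothetical_baseline_error = 0.40
hypothetical_baseline_auc = 0.70

new_error = apply_relative_change(hypothetical_baseline_error, -0.19)
new_auc = apply_relative_change(hypothetical_baseline_auc, 0.14)

print(f"error rate: {hypothetical_baseline_error:.3f} -> {new_error:.3f}")
print(f"AUC:        {hypothetical_baseline_auc:.3f} -> {new_auc:.3f}")
```

A relative reduction of 19% therefore does not mean the error rate drops by 19 percentage points; the absolute change depends on the baseline.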

Additional Reading

Devon Jensen wrote a post in 2015 reviewing the DANN method and its results: "The Best Variant Prediction Method That No One Is Using"