PRIN Grant 2022TE5B7X (2023-2025) “Explainable Model for Protein Design”

Abstract

The aim of the project is to improve energy-based models for protein sequences and to adapt them for high-potential applications. Energy-based models are often preferable to black-box neural network architectures because they combine simplicity with state-of-the-art performance on several tasks. Neural networks, on the other hand, are in principle able to model more complex patterns in the data and scale easily to very large datasets, albeit at the cost of interpretability in terms of the underlying biology. The project will thus have two main aspects: i) combining Potts models with neural networks to develop models that are both expressive and interpretable, and ii) applying the tools of sensitivity analysis and explainable AI to neural networks trained for protein design and other tasks, in order to understand what biological information these models capture from the data.