DPCfam is a new unsupervised procedure that uses alignments and Density Peak Clustering to automatically classify homologous protein regions.
Applied to protein sequences, it assists manual annotation (e.g. domain discovery and boosting of clan membership) and can be used as a stand-alone tool for unsupervised classification of sparsely annotated protein datasets, such as those from metagenomics studies. It has recently been adapted to structures predicted in silico with AlphaFold (AF) for structural domain discovery.
Results on relevant biological cases are openly released to the public and can be accessed through dedicated web services.
The tool is developed and maintained by LADE at Area Science Park.
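To give a rough idea of the clustering step named above, the following is a minimal sketch of generic Density Peak Clustering on toy one-dimensional data. It is not DPCfam's actual implementation: the function name `density_peak_clustering`, the cutoff `d_c`, the `n_clusters` parameter, the neighbour-count density, and the choice of centres by the product of density and separation are all assumptions made for illustration. DPCfam works on aligned protein regions with its own distance, which is not reproduced here.

```python
def density_peak_clustering(points, dist, d_c, n_clusters):
    """Toy Density Peak Clustering (illustrative sketch, not DPCfam's code)."""
    n = len(points)
    # Pairwise distance matrix under the user-supplied distance function.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
           for i in range(n)]

    # Visit points by decreasing density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in this order
    # (i.e. has higher density); parent: index of that point.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, delta is its largest distance.
            delta[i] = max(d[i])
            continue
        j = min(order[:rank], key=lambda k: d[i][k])
        delta[i] = d[i][j]
        parent[i] = j

    # Centres: points with the largest rho * delta (dense AND well separated).
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other point inherits the label of its denser nearest neighbour,
    # which has already been labelled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Two well-separated groups on a line (hypothetical data).
points = [10, 11, 12, 13, 50, 51, 52, 53, 54]
labels = density_peak_clustering(points, lambda a, b: abs(a - b),
                                 d_c=2.5, n_clusters=2)
print(labels)  # [1, 1, 1, 1, 0, 0, 0, 0, 0]
```

In this toy run the densest point (52) becomes the first centre, and 11 becomes the second because it is both dense and far from any denser point. The remaining points follow their denser nearest neighbours.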
It is a state-of-the-art solution for predicting changes in protein thermodynamic stability caused by single amino acid mutations. It builds on the MSA Transformer architecture, which exploits the evolutionary information encoded in families of aligned homologous sequences, and stands out for its performance and efficiency. The algorithm is designed to be flexible, so it can be easily adapted to other downstream tasks.
The tool is developed and maintained by LADE at Area Science Park.