Model Name | Architecture | Training Data | Validated Applications | References |
---|---|---|---|---|
UniRep | Multiplicative Long Short-Term Memory (mLSTM) | UniRef50 sequences [52] | Predicting binding affinity, stability, and expression levels | [53] |
ESM-1b | Transformer | UniRef50 sequences | Secondary structure prediction, contact map prediction, remote homology detection | [55] |
ESM-2 | Transformer | UniRef50 and UniRef90 sequences | Atomic-level protein structure prediction, protein function prediction | [56] |
ESM-IF1 | Transformer with Geometric Vector Perceptron (GVP) layers | Sequences and structures from CATH [81]; UniRef50 sequences with structures predicted by AlphaFold2 | Inverse protein folding (predicting sequence from structure) | [73] |
ESM-3 | Bidirectional transformer | Sequences from UniRef, MGnify [82], JGI [82, 83], and OAS [59]; sequences and structures from PDB, AlphaFoldDB, and ESMAtlas | Multimodal protein generation (sequence, structure, function), protein design | [64] |
AntiBERTy | BERT | OAS | Understanding the antibody affinity maturation process, generating diverse antibody sequences | [57] |
AbLang | Transformer | OAS | Completing antibody sequences, identifying functionally relevant mutations, designing novel antibodies | [58] |
IgBert | BERT | OAS (unpaired + paired) | Antibody sequence recovery, binding affinity prediction | [60] |
IgT5 | Text-to-Text Transfer Transformer (T5) | OAS (unpaired + paired) | Antibody sequence recovery, binding affinity prediction | [60] |