MoLFormers is a suite of molecular foundation models that accelerate the development of chemicals, materials, and biologics. It includes a number of state-of-the-art pre-trained and fine-tuned models for molecular data analysis, property prediction, nearest-neighbor retrieval, and discovery tasks such as the design of catalysts, stability agents, antibodies, and protein inhibitors, thus enabling molecular science with higher accuracy and success rates, at lower cost and greater speed. The suite contains families of large molecular models (encoder-only, encoder-decoder, and decoder-only) that combine best-of-breed architectural components (e.g., rotary embeddings, multi-modal embeddings, prompt tuning, linear attention, and geometry encoders) and provide predictive and generative AI capabilities. One instantiation of MoLFormer is MoLFormer-XL, a chemical SMILES encoder trained on 1.1 billion chemicals using rotary positional encodings and linear attention. When fine-tuned on downstream task data, MoLFormer-XL performs well on a wide range of property prediction benchmarks, from quantum-chemical to physiological properties and beyond. The model also shows emergent behavior, such as geometry learning. Another example is CogMol, a deep generative foundation model for target-specific molecule discovery, which has been shown to accelerate molecule discovery. This interactive web demo allows AI researchers and practitioners to explore MoLFormer predictions, generations, and explanations for their own queries.
Related Peer-Reviewed Publications: