August 6, 2024
Conference Paper
A Deep Multimodal Representation Learning Framework for Accurate Molecular Properties Prediction
Abstract
Drug discovery is a complex and challenging process that requires optimizing candidate compounds to identify those with the potential to become safe and effective drugs. Predicting molecular properties is an indispensable step in the drug discovery pipeline. Traditionally, this step is costly and time-intensive, involving multiple rounds of experiments and clinical trials, which makes it impractical to carry out for every candidate compound. Deep learning has emerged as a promising way to reduce the cost and time required to identify novel drugs. However, most deep learning research on molecular property prediction has focused on single-modal models, which use a single modality of data and neglect the potential benefits of combining different modalities. To overcome this limitation, we introduce MRL-Mol, a deep Multimodal Representation Learning framework for accurate Molecular properties prediction. MRL-Mol draws on three data modalities (sequence, graph, and image) to build a richer representation of each molecule. Using a large-scale unlabeled dataset (~1M unique molecules), we pretrain MRL-Mol to extract inter- and intra-modal information. Our study demonstrates the superior performance of MRL-Mol in predicting molecular properties across six benchmark datasets covering both classification and regression tasks. Notably, MRL-Mol outperforms other state-of-the-art molecular property prediction models. These findings suggest that by combining information from multiple data modalities, MRL-Mol can represent molecules better than single-modal deep learning models and predict molecular properties more accurately.