December 8, 2017
Conference Paper

ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction

Abstract

With access to large datasets, deep neural networks have, through representation learning, been able to identify patterns in raw data, achieving human-level accuracy in image and speech recognition tasks. In chemistry, however, large standardized and labelled datasets are scarce, and because there is a multitude of chemical properties of interest, chemical data is inherently small and fragmented. In this work, we explore transfer learning techniques in conjunction with the existing Chemception CNN model to create a transferable and generalizable deep neural network for small-molecule property prediction. Our latest model, ChemNet, learns in a semi-supervised manner from inexpensive labels computed from the ChEMBL database. When fine-tuned on the Tox21, HIV, and FreeSolv datasets, three separate chemical tasks that ChemNet was not originally trained on, ChemNet exceeds the performance of existing Chemception models and of contemporary MLP models that train on molecular fingerprints, and it matches the performance of the ConvGraph algorithm, the current state of the art. Furthermore, because ChemNet has been pre-trained on a large, diverse chemical database, it can be used as a universal “plug-and-play” deep neural network, which accelerates the deployment of deep neural networks for predicting novel small-molecule chemical properties.

Revised: December 28, 2017 | Published: December 8, 2017

Citation

Goh G.B., C.M. Siegel, A. Vishnu, and N.O. Hodas. 2017. ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction. In Machine Learning for Molecules and Materials (NIPS 2017 Workshop), December 8, 2017, Long Beach, California. La Jolla, California: Neural Information Processing Systems Foundation, Inc. PNNL-SA-129942.