Chapter 20: Machine Learning for In Silico ADMET Prediction
Reading notes on "Artificial Intelligence in Drug Design"
Table of contents
- 1.Introduction
- 2.Materials
- 2.1.Dataset Overview
- Descriptor Set Overview
1.Introduction
- Multitask deep neural networks (MT-DNN) and graph convolutional neural networks (GCNN) play an important role in improving the accuracy of ADMET prediction.
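To make the multitask idea concrete, here is a minimal sketch of a forward pass through a network with one shared hidden layer and one output head per endpoint. It is not the architecture from the chapter: the layer sizes, the random weights, the endpoint names (`solubility`, `logD`, `clearance`), and the masked loss that skips unmeasured endpoints are all assumptions made for illustration.

```python
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Shared trunk followed by one scalar regression head per task."""

    def __init__(self, n_features, n_hidden, task_names, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_features, n_hidden, rng)
        self.heads = {name: init_layer(n_hidden, 1, rng) for name in task_names}

    def forward(self, descriptors):
        # Every task reads the same hidden representation.
        hidden = relu(dense(descriptors, *self.shared))
        return {name: dense(hidden, *head)[0] for name, head in self.heads.items()}


def masked_mse(predictions, labels):
    """Mean squared error over the endpoints that were actually measured.

    A label of None marks a missing measurement and is skipped.
    """
    errors = [(predictions[task] - y) ** 2
              for task, y in labels.items() if y is not None]
    return sum(errors) / len(errors) if errors else 0.0


# Hypothetical endpoints and a made-up 4-descriptor input vector.
net = MultiTaskNet(n_features=4, n_hidden=8,
                   task_names=["solubility", "logD", "clearance"])
preds = net.forward([0.2, 1.0, 0.0, 3.1])
loss = masked_mse(preds, {"solubility": -2.5, "logD": None, "clearance": 0.7})
```

Because the hidden layer is shared, gradients from every measured endpoint would update the same weights during training, which is the usual explanation for why multitask models can help on sparsely measured endpoints.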
2.Materials
2.1.Dataset Overview
- PubChem is a large-scale chemical database of bioactive molecules with drug-like properties.
- ChEMBL, PubChem’s European counterpart, is another database that provides small-molecule datasets for machine learning.
- Some additional well-curated databases include the Aquasol database for aqueous solubility and Tox21 for toxicity.
Descriptor Set Overview
- 2D molecular descriptors are the most popular for traditional ADMET modeling. The set used for the chapter's ADMET modeling includes cLogP (BioByte Corp., Claremont, CA), Kier connectivity, shape, and E-state indices, a subset of MOE descriptors (Chemical Computing Group Inc., 2004, http://www.chemcomp.com), and a set of ADMET keys, which are structural features.
- Some of the descriptors, such as the Kier shape indices, contain implicit 3D information. Explicit 3D molecular descriptors were not routinely used, both to avoid biasing the analysis with predicted conformational effects and to keep calculation fast enough for rapid prediction.
- In the deep learning approach, a molecular graph convolutional neural network was applied to transform molecular structures into embeddings.
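The step from a molecular structure to an embedding can be sketched with a single graph convolution layer followed by a sum readout. These notes do not record the specific GCNN used in the chapter, so what follows is a generic message-passing sketch under stated assumptions: the toy molecule (the heavy atoms of ethanol, C-C-O), the two-element one-hot atom features, and the fixed weight matrix are all hypothetical.

```python
# Hypothetical one-hot atom features: [is_carbon, is_oxygen].
ATOM_FEATURES = {"C": [1.0, 0.0], "O": [0.0, 1.0]}


def graph_conv(features, adjacency, weights):
    """One graph convolution layer.

    Each atom sums its own feature vector with those of its bonded
    neighbours, then applies a linear map followed by ReLU.
    `weights` holds one row per output unit.
    """
    updated = []
    for atom, own in enumerate(features):
        aggregated = list(own)
        for neighbour in adjacency[atom]:
            aggregated = [a + b for a, b in zip(aggregated, features[neighbour])]
        updated.append([max(0.0, sum(w * a for w, a in zip(row, aggregated)))
                        for row in weights])
    return updated


def readout(features):
    """Sum-pool the atom vectors into one molecule-level embedding."""
    return [sum(column) for column in zip(*features)]


# Ethanol heavy atoms: C0 - C1 - O2 (hydrogens suppressed).
atoms = ["C", "C", "O"]
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = [ATOM_FEATURES[symbol] for symbol in atoms]

# Fixed illustrative weights (3 output units, 2 inputs); a trained model
# would learn these instead.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

hidden = graph_conv(features, adjacency, weights)
embedding = readout(hidden)
print(embedding)  # [5.0, 2.0, 3.5]
```

The sum readout makes the embedding independent of the order in which atoms are listed, which is one reason graph-based representations suit molecules. In practice, several such layers are stacked and the resulting embedding feeds a prediction network.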