A New Graph-Based Reinforcement Learning Environment for Targeted Molecular Generation and Optimization

Document Type

Conference Proceeding

Publication Date

Winter 1-24-2024

Abstract

Generating a new molecule that satisfies desired objectives, or optimizing an existing molecule to meet additional requirements, remains a crucial part of computer-aided drug design. Many studies have sought to improve this process in order to reduce the time and cost of bringing a new drug to market. Moreover, progress in generating or optimizing useful molecules would help reduce the risk of clinical trials and prevent potential side effects, including severe ones. In this paper, we propose MolGraphEnv, a new multi-objective molecular generation and optimization environment that models the process of generating and/or optimizing molecules as a Markov Decision Process (MDP) and integrates smoothly with the graph machine learning framework PyTorch Geometric (PyG) and with RDKit [2, 7, 15]. In the proposed environment, molecules are modeled as graphs in which atoms are represented by nodes and bonds by edges. Observations are stored as a PyG Data object that holds the computed features for each node (atom) and each edge (bond). Some of these features come from the chemical domain, such as hybridization and atomic number, while others come from pure graph theory, such as node degree. By integrating features from both the chemistry and the graph domains, we ensure a better representation of the atoms and their interrelationships. The action space is multi-discrete and is inherited from the Gymnasium library for better functionality. We show that the proposed environment provides a smooth and flexible experience for the end user by designing a reward system that biases the search toward desired properties, such as obtaining molecules with higher QED (Quantitative Estimate of Drug-likeness) and ensuring chemical and structural validity.
MolGraphEnv represents a significant step forward in computer-aided drug design, providing a powerful platform for generating and optimizing molecules with specific objectives. Its seamless integration with established graph machine learning tools and cheminformatics frameworks makes it a valuable resource for researchers in the field.
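
To make the graph representation and the multi-discrete action concrete, the following is a minimal, dependency-free sketch and not MolGraphEnv's actual code. Everything named here (MolGraph, apply_action, step, the element and valence tables, the three-part action layout, and the +1/-1 reward) is an assumption made for illustration. The real environment stores observations in a PyG Data object and can reward properties such as QED computed with RDKit, neither of which is reproduced below.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables for illustration only; not taken from MolGraphEnv.
ELEMENTS = ["C", "N", "O"]
ATOMIC_NUMBER = {"C": 6, "N": 7, "O": 8}
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDERS = [1, 2, 3]


@dataclass
class MolGraph:
    """A molecule as a graph: atoms are nodes, bonds are edges."""
    atoms: list[str] = field(default_factory=list)
    # Maps an atom index pair (i, j) with i < j to a bond order.
    bonds: dict[tuple[int, int], int] = field(default_factory=dict)

    def degree(self, i: int) -> int:
        # Graph-theoretic feature: number of bonds touching atom i.
        return sum(1 for pair in self.bonds if i in pair)

    def used_valence(self, i: int) -> int:
        # Sum of bond orders on atom i (implicit hydrogens are ignored).
        return sum(order for pair, order in self.bonds.items() if i in pair)

    def node_features(self) -> list[list[int]]:
        # One row per atom: [atomic number (chemistry), degree (graph theory)].
        return [[ATOMIC_NUMBER[s], self.degree(i)]
                for i, s in enumerate(self.atoms)]

    def edge_index(self) -> list[list[int]]:
        # Two rows (sources, targets), each bond listed in both directions.
        src, dst = [], []
        for i, j in sorted(self.bonds):
            src += [i, j]
            dst += [j, i]
        return [src, dst]


def apply_action(mol: MolGraph, action: tuple[int, int, int]) -> bool:
    """Attach a new atom; action = (anchor atom, element index, bond order index).

    Each component is an integer from its own discrete range, which is
    what makes the action multi-discrete. Returns False, leaving the
    molecule unchanged, if the move would break the simple valence rule.
    """
    anchor, elem_idx, order_idx = action
    if not 0 <= anchor < len(mol.atoms):
        return False
    symbol, order = ELEMENTS[elem_idx], BOND_ORDERS[order_idx]
    if order > MAX_VALENCE[symbol]:
        return False
    if mol.used_valence(anchor) + order > MAX_VALENCE[mol.atoms[anchor]]:
        return False
    mol.atoms.append(symbol)
    mol.bonds[(anchor, len(mol.atoms) - 1)] = order
    return True


def step(mol: MolGraph, action: tuple[int, int, int]):
    """One MDP transition with a placeholder validity reward (+1 / -1)."""
    valid = apply_action(mol, action)
    reward = 1.0 if valid else -1.0
    observation = {"x": mol.node_features(), "edge_index": mol.edge_index()}
    return observation, reward


mol = MolGraph(atoms=["C"])
obs, reward = step(mol, (0, 2, 1))  # attach O to atom 0 with a double bond
print(obs, reward)
```

In this sketch an invalid move leaves the molecule untouched and returns a negative reward, which is one simple way to bias the search toward chemically valid structures. A property-based term such as QED would be added to the reward in the same place.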
