Mapping and bridging protein cavity space and chemical space
The Computer-Aided Drug Discovery (CADD) group leverages the vast amount of proprietary and public data on interactions of small molecules with biological macromolecules in order to enhance the drug discovery process. We work to improve understanding of the observed data and provide reliable predictive models relevant to pharmaceutical research.
One key contribution of CADD is to provide medicinal chemistry projects with ideas in the form of novel molecules that have adequate intrinsic (physico-chemical) and extrinsic (relative to the biological targets and/or anti-targets) properties. Identifying these novel molecules can be likened to traversing the so-called “Chemical Space”. Similarly, the repertoire of possible biomacromolecule cavities (or pockets) can be seen as a “Cavity Space”. While the chemical space is too vast to explore exhaustively, the cavitome, i.e., the ensemble of available cavities on biomacromolecule surfaces, exemplified by binding sites in proteins, DNA, and RNA, is large but still computationally tractable.
One of our key objectives is to map the cavitome and create novel descriptors (fingerprints) that can bridge cavity space and chemical space, thus linking the biological and chemical universes. Bridging these two spaces will enable the identification of novel compounds by matching compounds to the binding sites of selected proteins or other biomacromolecules. Conversely, it will allow biological targets to be identified starting from the compounds. Integrating the two spaces opens up potential developments in machine learning (e.g., classifying pockets and compounds, or identifying non-obvious links and patterns) and involves manipulating large amounts of data.
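The matching idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that cavities and compounds have already been encoded as fingerprints in a shared bit space, and it uses Tanimoto similarity as a stand-in comparison measure; the text does not specify the actual descriptors or metric. The names tanimoto and rank_pockets, the pocket labels, and the bit values are all hypothetical toy data.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled here as the set of "on" bit indices.
# Assumption: cavity and compound fingerprints share one bit space.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit-set fingerprints."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_pockets(
    compound_fp: Fingerprint, pocket_fps: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Rank pockets by decreasing similarity to a compound fingerprint."""
    scores = [(name, tanimoto(compound_fp, fp)) for name, fp in pocket_fps.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data, not real descriptors.
compound = frozenset({1, 4, 7, 9})
pockets = {
    "pocket_A": frozenset({1, 4, 7, 8}),
    "pocket_B": frozenset({2, 3, 5}),
    "pocket_C": frozenset({1, 9}),
}

ranking = rank_pockets(compound, pockets)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Swapping the roles of the arguments (one pocket fingerprint ranked against many compound fingerprints) would give the converse use case, going from a binding site to candidate compounds.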
The project has many relevant applications in pharma research and involves interactions between the CADD group and many other Novartis groups: the computational biology group, the cheminformatics and bioinformatics groups, and potentially, beyond the computational groups, the structural biologists and screening scientists. In addition, strong cooperation with our scientific computing department is key, given the demands on storage (the “big data” integrated and generated by the project) and on computational power (machine learning and potential applications of the project).