Learning rules for comparing drug activities
Introduction to the drug structure-activity
Most pharmaceutical R&D is based on finding slightly improved variants
of patented active drugs (292 out of 348 US drugs introduced between 1981
and 1988 were of this kind). In doing this, it is essential to understand
the relationships between chemical structure and activity. In most cases,
these relationships cannot be derived solely from physical theory, so experimental
evidence is essential. Such empirically derived relationships are called
Structure Activity Relationships (SARs). In a typical SAR problem, a set
of chemicals of known structure and activity are given, and the problem
is to construct a predictive theory relating the structure of a compound
to its activity. This relationship can then be used to select structures
with high or low activity. Typically, knowledge of such relationships forms
the basis for devising clinically effective, non-toxic drugs.
At present, much of this identification is done manually: the designer
displays a small number of molecules using 3-D graphics and tries to fit
``equivalent'' atoms or groups which have matching chemical properties.
The main automatic method is statistical correlation (using linear regression)
of biological activity with bulk chemical properties such as acidity.
Both methods are inadequate. Manual matching can only be used on a few
molecules at a time, because there is too much 3-D information for the
designer to handle: molecular shapes are horrendously convoluted, and it
is hard to comprehend 3-D plots.
Statistical correlation only works within a series of closely related
or ``homologous'' molecules, where for example each member differs from
the last in the length of a hydrocarbon chain or the basicity of an amine
group. Better ways to automatically discover the chemical properties which
affect the activity of drugs could greatly reduce pharmaceutical R&D
costs - at present, the average cost of developing a new drug is $230 million,
and the average development time is 10 years.
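The regression-based correlation described above can be illustrated with a minimal sketch. It assumes a single homologous series in which only the hydrocarbon chain length varies; the descriptor values, activity values, and the names `fit_line` and `predict` are invented for illustration and do not come from any study.

```python
# Minimal sketch of statistical correlation by least-squares linear regression.
# All numbers below are made up for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical homologous series: one bulk descriptor (chain length) per compound.
chain_length = [1, 2, 3, 4, 5]
activity = [0.5, 1.0, 1.5, 2.0, 2.5]

slope, intercept = fit_line(chain_length, activity)

def predict(length):
    """Predict activity for an untested member of the same series."""
    return slope * length + intercept
```

A fit of this kind only has one number per compound to work with, which is why it cannot be expected to carry over to molecules outside the series it was fitted on.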
Our research, which includes collaboration with the Imperial Cancer
Research Fund, has shown that ILP can construct rules which predict
the activity of untried drugs, given examples of drugs whose medicinal
activity is already known. We found these rules to be more accurate
than statistical correlations. More importantly, because the examples are
expressed in logic, it is possible to describe arbitrary properties of,
and relations between, atoms and groups. This means that the examples need
not be restricted to one homologous series. Finally, the logical nature
of the rules also makes them easy to understand and can provide key insights,
allowing considerable reductions in the numbers of compounds that need
to be tested.
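To make the idea of a logical rule over atoms and groups concrete, here is a rough sketch of how one rule comparing two drugs might be represented and applied. ILP systems normally express such rules as logic clauses; Python is used here only for illustration. The drug names, substituents, property values, and the rule itself are hypothetical and are not the rules learned in this research.

```python
# Sketch of applying one relational rule that compares two drugs.
# Every fact and the rule itself are hypothetical, for illustration only.

# Structure facts: drug -> {ring position: substituent group}.
drugs = {
    "d1": {3: "OCH3", 4: "OCH3"},
    "d2": {3: "H", 4: "Cl"},
    "d3": {3: "OCH3", 4: "H"},
}

# Background knowledge: coarse, invented property scores for each group.
polarity = {"OCH3": 2, "Cl": 3, "H": 0}

def more_active(a, b):
    """Hypothetical rule: a is predicted more active than b if the group
    at position 3 of a is more polar than the group at position 3 of b."""
    return polarity[drugs[a][3]] > polarity[drugs[b][3]]

# Apply the rule to every ordered pair of distinct drugs.
predicted_pairs = [
    (a, b) for a in drugs for b in drugs if a != b and more_active(a, b)
]
```

Because the rule refers to a position and a property of a group rather than to one bulk measurement, it applies to any molecule for which those facts can be stated, and it can be read directly as a chemical hypothesis.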
For further information, please see the pages on our work with drugs
against Alzheimer's disease, drugs for inhibition
of E. coli Dihydrofolate Reductase, and suramin.