RE: ORGLIST: Desperately seeking collaborator.


From: Dmitry Korkin (z17b3$##$unb.ca)
Date: Sun Aug 19 2001 - 21:18:37 EDT


Dear Jacob,

Thank you very much for your reply. The discussion of new ideas is, in my
opinion, extremely important, especially at the beginning of these ideas'
evolution.
Now I guess it's my turn to throw in some firewood (it might turn into a big
fire). Coming from a non-chemical area, though, I now feel like Daniel in
the lions' den. :-)

>1. I'm all for new approaches to everything. That's the basis for
>scientific and social advancement.

>2. I'm even for starting with any conceptual approach far removed from
>chemistry.

Let me ask you to compare which model is farther from chemistry, and
especially from organic chemistry: an approach that treats chemical objects as
structural ones, or, for example, the approximation and interpolation methods
of numerical analysis on which all QSAR methods are based, or the differential
equations and functional analysis that underlie quantum mechanics, or the
mathematical statistics behind Monte Carlo simulation of molecular systems?
(I'm talking about the basic models in chemistry, and I could easily continue
this list.) So far, the vast majority of theoretical methods in chemistry
(which is about chemical structures, I believe) have been taken from
mathematical formalisms that were developed much earlier, and definitely not
for the purpose of chemical modelling. All of these models are numerical
(moreover, they belong to continuous, or "classical", mathematics), while in
chemistry we are dealing with discrete structural objects. So, based on these
arguments, would it be fair if I replaced your phrase "conceptual approach far
removed from chemistry" with "new conceptual approach to chemistry, different
from existing ones"?

>3. When you start approaching your "customers" (say, the chemists) with
>your new product, you may find that it is difficult to sell for several
>reasons:
>
>a) A language gap
>
>-- you do not understand each other because the same words (e.g.,
>structure, bonds, properties) carry a different meaning.

The thing is, we needed more or less formal definitions of those basic
concepts, but unfortunately we couldn't find any. One of the few scientists
who tried to formalize chemistry was Linus Pauling, and we used some of his
ideas, for example the idea that a bond might be considered a structural
unit. Perhaps you could help us with a "chemical" formal definition of a
chemical structure. I have asked this question several times on several
forums but still don't have a satisfactory answer. So I would
appreciate it greatly if you or any other ORGLIST subscribers could give me
any ideas or sources for these concepts.

>b) A wrong conception
>
>-- did you ask yourself why does a chemist need a structural
>representation? I assume that most probably yes, at least when you started
>working.

We are still asking this question ;-) And it does reflect on the model we are
developing.

>-- did you ask chemists why do they need structural representations?
>Possibly, but probably not many of them.

We do consult with chemists (mostly organic chemists). And I think
you've noticed that we keep seeking collaboration.

>-- did you compare your own answer with that of the chemists? What did
>you decide about the differences?

This is how the model has been modified since its creation.

>c) Inflated package of information (an example)
>-- it's possible to give quite "mathematical" names to organic chemicals.
>Let's take for example
>w-keto-x,y-dimethyl-tetracyclo(j.k.l.m.n)heptadecan-z-ol, where the single
>letters represent certain specific numbers that I'm too lazy to sort out
>now, and add some little wiggles to make clear where exactly the bridging of
>the structure occurs, and some other to express the chirality of the
>asymmetric carbons. A chemist knowledgeable of the nomenclature will be
>able to draw a structure or translate it to your system. However, I doubt
>this strictly correct name will remind him of the package of knowledge and
>physiological associations that the equivalent name "testosterone" could
>evoke.

Certainly, the idea of our model is neither to come up with new formal names
for chemical structures nor to enumerate them (although those problems can
be solved as special cases in our model). The idea of the model was to come
up with new formal notions of a chemical structure, of a chemical class
description, and of the inductive learning process (learning a class
description from a finite set of its representatives). Another important
issue was the automation of the process you described above. For a computer
to work with chemical structures, one has to give it a formal description of
the model one is planning to work with. Unfortunately, so far a computer
cannot be taught the intuitive understanding of objects and their
interactions that most scientists have. It is important to note that the
model is based on the general ETS framework, which has the same goals but in
a more abstract form.

>-- certainly information about synthesis of testosterone need not be
>included in the structural package. Access to the information on synthesis
>using the structure is fine, but its inclusion in the structure may be
>totally misleading.

Why can it be totally misleading? (Could you give me any examples?) In my
opinion, having some extra correct information on a structure should only
improve our understanding of the structure, especially when you're comparing
this structure to another one.

>4. Is that effort really necessary?
>
>-- The answer may be a qualified yes. The trouble is that many structures
>are not quite definitely known but certainly are of great interest. Some
>examples can be the humic acids, lignins, tars, the polysaccharides of the
>cell membranes, alumosilicates, inorganic complexes, etc.. The effort to
>develop a mathematical system to include such representations may be
>unwieldy.

Why is it so important to have a structural model? First of all, let's try to
analyze (very briefly) how organic chemists think about chemical structures
(I have asked several organic chemists about this, but please correct me if
I'm wrong). Well, they try to extract structural information such as
functional groups, their mutual positions, etc. In a class of drugs, knowledge
of the pharmacophore is extremely important since it relates to the biological
activity of this class. Functional groups and pharmacophores are
substructures of the given chemical structures. So, organic chemists work on
defining those substructures together with their mutual structural positions.
The model, in fact, describes the same ideas. When learning a class
description from a given training set of chemical objects, one has to
extract the common structural parts of these objects (building blocks) and put
those structural parts into mutual relation by representing the above
chemical objects in terms of these building blocks (here is the "synthetic"
way of thinking).
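
To make the "extract the common building blocks" step a little more concrete,
here is a rough toy sketch in Python. It is only an illustration under strong
assumptions and is not the ETS model: each molecule is encoded as a bag of
fragment labels (the labels and the names TRAINING_SET,
common_building_blocks and describe are hypothetical), and the way the
fragments are connected to each other is ignored entirely.

```python
from collections import Counter
from functools import reduce

# Hypothetical toy encoding: each molecule is a multiset of fragment labels.
# Connectivity between fragments is deliberately ignored in this sketch.
TRAINING_SET = {
    "ethanol": Counter({"CH3": 1, "CH2": 1, "OH": 1}),
    "1-propanol": Counter({"CH3": 1, "CH2": 2, "OH": 1}),
    "isopropanol": Counter({"CH3": 2, "CH": 1, "OH": 1}),
}


def common_building_blocks(molecules):
    """Return the fragments (with multiplicity) shared by every molecule."""
    # Counter & Counter keeps the minimum count of each label.
    return reduce(lambda a, b: a & b, molecules.values())


def describe(molecule, blocks):
    """Express one molecule as shared building blocks plus what is left over."""
    return {"shared": molecule & blocks, "rest": molecule - blocks}


blocks = common_building_blocks(TRAINING_SET)
for name, molecule in TRAINING_SET.items():
    print(name, describe(molecule, blocks))
```

In this toy training set of three alcohols, the shared blocks are one CH3 and
one OH. A real class description would also have to record how the blocks
are positioned relative to one another, which this sketch leaves out.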

>-- Do you know about the linear representation of cyclic structures the
>Chemical Society (I think) tried to introduce more than 50 years ago?

Yes, I do, and as far as I know some attempts have been made recently. The
problem with such a representation is that once you put your chemical
structures into a lower-dimensional space, you lose some of the structural
information, and thus it becomes harder to work with the representation (I
could represent all chemical objects as dots on a real-valued axis, but it
wouldn't be very "workable"). In order to work effectively with structural
objects, the encoding of structural objects in a model should be very
flexible and involve much more than just simple encoding.


Thank you again for your comments. I really hope to hear soon from you (or
maybe from other scientists on this forum who had the patience to read to
the end of the message).

Sincerely,

Dmitry Korkin,
                               PhD candidate
                               Faculty of Computer Science
                               University of New Brunswick
Tel: (506) 451-6931
Fax: (506) 453-3566
WWW: www.cs.unb.ca/~dima/
e-mail: z17b3$##$unb.ca

"...A journey of a thousand miles begins with a single step..."
                                                      Confucius

__________________

ORGLIST - Organic Chemistry Mailing List
Website / Archive / FAQ: http://www.orglist.net/
List coordinator: Joao Aires de Sousa (jas$##$mail.fct.unl.pt)





This archive was generated by hypermail 2.1.4 : Fri Sep 19 2003 - 12:16:28 EDT