Date: Thu Feb 08 2001 - 20:37:17 EST

At 8:04 PM -0500 2/8/01, Avatar wrote:
>I was wondering if anybody has experience using SMILES strings for
>substructure searches? Is it possible? Is it efficient? Do users get the
>hang of it fairly quickly?
>What other mechanisms or representations would you suggest? SLN? Others?

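If you're contemplating doing substructure searches by comparing textual
substrings of SMILES strings, don't bother: you can't do substructure
searches that way. A single molecule can be written as many different
SMILES strings, and atoms that are bonded in the structure need not be
adjacent in the text.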
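
To make the problem concrete, here is a minimal sketch in Python. The
three-entry "database" and the naive_text_search helper are made up for
illustration; the SMILES strings themselves are valid. Plain substring
matching misses real hits (ethanol written as OCC contains a C-O bond, and
cyclohexane contains a six-carbon chain) and returns a false one (the "C"
in "Cl" is not a carbon atom).

```python
# Hypothetical mini-database: compound names mapped to valid SMILES strings.
database = {
    "ethanol": "OCC",                  # same molecule as "CCO" or "C(O)C"
    "cyclohexane": "C1CCCCC1",         # ring-closure digits interrupt the text
    "sodium chloride": "[Na+].[Cl-]",  # contains no carbon at all
}


def naive_text_search(query, smiles_by_name):
    """Return names whose SMILES text contains `query` as a plain substring."""
    return sorted(name for name, smiles in smiles_by_name.items()
                  if query in smiles)


# False negative: ethanol has a C-O bond, but "CO" never appears in "OCC".
co_hits = naive_text_search("CO", database)

# False negative: cyclohexane contains a chain of six carbons, but the
# ring-closure digit breaks up the run of C characters.
chain_hits = naive_text_search("CCCCCC", database)

# False positive: a search for carbon also hits the "C" of "Cl".
carbon_hits = naive_text_search("C", database)

print(co_hits, chain_hits, carbon_hits)
```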

If you're looking at using SMILES strings simply as an input mechanism to
your existing connection table-based substructure search engine, then sure,
that would work. But why bother restricting yourself to SMILES strings?
Take one of the existing Plugin- or Java-based chemical structure
sketchers, and let users draw their queries graphically. See (or any of several other sites) for examples.
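
For what it's worth, here is a rough Python sketch of the "SMILES as input
to a connection-table search" idea. It is not any particular product's
engine. parse_smiles and has_substructure are hypothetical names, the parser
handles only a tiny SMILES subset (single-letter atoms, single bonds,
branches, single-digit ring closures), implicit hydrogens are ignored, and
the matcher is a brute-force backtracking search with none of the
screening a real engine would use.

```python
def parse_smiles(smiles):
    """Parse a tiny SMILES subset into a connection table (atoms, bonds).

    Assumed subset: atoms C N O S P F I B, parentheses for branches, and
    single-digit ring closures. Bond orders, aromatic atoms, charges, and
    bracket atoms are NOT handled and raise ValueError.
    """
    atoms = []       # element symbol for each atom index
    bonds = set()    # each bond stored as frozenset({i, j})
    stack = []       # atom to return to when a branch closes
    ring_open = {}   # ring-closure digit -> atom index that opened it
    prev = None      # index of the atom the next atom bonds to
    for ch in smiles:
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in ring_open:
                bonds.add(frozenset((ring_open.pop(ch), prev)))
            else:
                ring_open[ch] = prev
        elif ch in "CNOSPFIB":
            atoms.append(ch)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.add(frozenset((prev, idx)))
            prev = idx
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds


def has_substructure(target, query):
    """True if the query connection table maps onto part of the target."""
    t_atoms, t_bonds = target
    q_atoms, q_bonds = query

    def extend(mapping):
        # mapping[i] is the target atom assigned to query atom i
        q = len(mapping)
        if q == len(q_atoms):
            return True
        for t in range(len(t_atoms)):
            if t in mapping or t_atoms[t] != q_atoms[q]:
                continue
            # every query bond back to an already-mapped atom must also
            # exist between the corresponding target atoms
            if all(frozenset((mapping[p], t)) in t_bonds
                   for p in range(q) if frozenset((p, q)) in q_bonds):
                if extend(mapping + [t]):
                    return True
        return False

    return extend([])


# The same query matches however the target happens to be written.
query = parse_smiles("CO")
results = [has_substructure(parse_smiles(s), query)
           for s in ("CCO", "OCC", "C(O)C")]
print(results)
```

The point is only that the matching happens on atoms and bonds, so the way
a structure was typed in no longer matters.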

Is this efficient? Depends entirely on your search engine and the size of
your database, and what sort of response time you can live with. Any
decent structure search engine should be able to produce results in a few
seconds for databases of fewer than a million entries, which puts the
search time in the same order of magnitude as the general HTTP overhead
(I'm assuming you're looking at a web-based search).

Do users get the hang of it? Depends on what users you're talking about,
and how interested they are in doing searches. If you have a database of
fewer than a thousand compounds or so, it's always going to be easier for
the user simply to scan through a list of names. Substructure searching
doesn't really offer benefits until you're dealing with several thousand
compounds, and doesn't really come into its own until you get several tens
of thousands.

Jonathan Brecher
CambridgeSoft Corporation


