CTS-Lite

UC Davis Fiehn Lab

Documentation

CTS-Lite is a lightweight chemical translation service based on the Fiehn Lab's original CTS. It allows users to match InChIs, InChIKeys, SMILES, and Molecular Formulas against an augmented PubChemLite dataset.

Using CTS-Lite

To use CTS-Lite, enter any mix of InChIs, InChIKeys, SMILES, and Molecular Formulas into the input box on the main page, separating entries with spaces, tabs, or newlines. Click Match to display the results in the results section. Results can be downloaded in JSON or CSV format using the buttons provided.

Note: Very large queries may take some time to process. Please be patient while the server handles your request.

REST API

To make queries using the REST API, use the following format:

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"queries":"query1 query2 ..."}' \
  "cts-lite.metabolomics.us/match"

To get results in CSV format, add an Accept: text/csv header to the command:

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: text/csv" \
  -d '{"queries":"query1 query2 ..."}' \
  "cts-lite.metabolomics.us/match"
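
The same request can be assembled from a script. The following is a minimal sketch in Python using only the standard library, not an official client. The https:// scheme, the MATCH_URL constant, and the function names build_match_request and send are assumptions made for illustration; the documentation above gives only the host and path.

```python
import json
import urllib.request

# Assumption: the service is reachable over HTTPS at the documented host and path.
MATCH_URL = "https://cts-lite.metabolomics.us/match"


def build_match_request(queries, as_csv=False, url=MATCH_URL):
    """Build (but do not send) a POST request for the match endpoint."""
    # The API takes one whitespace-separated string under the "queries" key.
    body = json.dumps({"queries": " ".join(queries)}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if as_csv:
        # Ask for CSV instead of the default JSON response.
        headers["Accept"] = "text/csv"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=60):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it possible to inspect the headers and body before contacting the server, for example send(build_match_request(["XMBWDFGMSWQBCA-UHFFFAOYSA-N"])).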

JSON Example Response

Example query: XMBWDFGMSWQBCA-UHDFADDYSA-N bad_query

[
  {
    "query": "XMBWDFGMSWQBCA-UHDFADDYSA-N",
    "query_type": "inchikey",
    "found_match": true,
    "match_level": "First Block",
    "matches": [
      {
        "inchikey": "XMBWDFGMSWQBCA-UHFFFAOYSA-N",
        "first_block": "XMBWDFGMSWQBCA",
        "inchi": "InChI=1S/HI/h1H",
        "smiles": "I",
        "compound_name": "iodane",
        "molecular_formula": "HI",
        "pubmed_count": "10707",
        "patent_count": "363920"
      }
    ],
    "error_message": ""
  },
  {
    "query": "bad_query",
    "query_type": "unidentified",
    "found_match": false,
    "match_level": "",
    "matches": null,
    "error_message": "Invalid query type, could not identify"
  }
]
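
A response in this shape can be reduced to a per-query summary. The sketch below assumes only the fields visible in the example response; the function name summarize and the keys of the returned dictionaries are hypothetical. Note that matches is null for failed queries and that the counts arrive as strings.

```python
import json


def summarize(response_text):
    """Map each query string to a small summary of its result."""
    summary = {}
    for entry in json.loads(response_text):
        # "matches" is null (None in Python) when nothing was found.
        matches = entry.get("matches") or []
        summary[entry["query"]] = {
            "found": entry["found_match"],
            "level": entry["match_level"],
            "inchikeys": [m["inchikey"] for m in matches],
            # Counts are JSON strings in the example, so convert explicitly.
            "pubmed_counts": [int(m["pubmed_count"]) for m in matches],
            "error": entry["error_message"],
        }
    return summary
```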

Query Types

Query types are parsed using the following logic:

  • InChIKeys must be in the format XXXXXXXXXXXXXX-XXXXXXXXXX-X (14-10-1, all uppercase letters)
  • InChIs must start with InChI= (case-sensitive)
  • SMILES are first identified by the presence of structural characters: = # - / \ : . @ + [ ] ( )
  • Molecular Formulas are recognized by starting with letters that cannot be at the start of SMILES: ADEGHKLMRTUVWXYZ
  • SMILES/Mol. Formula: Some queries, like C, are ambiguous and can be either SMILES or Molecular Formulas. In these cases, the query is first matched as a Molecular Formula, and then as SMILES.
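
The rules above can be sketched as a small classifier. This is an illustration of the documented rules applied in the listed order, not the service's actual code. The function name classify and the labels other than inchikey are assumptions. The sketch also labels every leftover query as ambiguous, whereas the service evidently applies further checks (the example response marks bad_query as unidentified).

```python
import re

# 14-10-1 blocks of uppercase letters.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")
# Structural characters that mark a query as SMILES.
SMILES_STRUCTURAL = set("=#-/\\:.@+[]()")
# Letters that cannot begin a SMILES string, per the rules above.
FORMULA_ONLY_START = set("ADEGHKLMRTUVWXYZ")


def classify(query):
    """Return a hypothetical type label for one query string."""
    if INCHIKEY_RE.fullmatch(query):
        return "inchikey"
    if query.startswith("InChI="):  # case-sensitive prefix
        return "inchi"
    if any(ch in SMILES_STRUCTURAL for ch in query):
        return "smiles"
    if query[:1] in FORMULA_ONLY_START:
        return "molecular_formula"
    # Could be either; the service tries Molecular Formula first, then SMILES.
    return "ambiguous"
```

The InChIKey and InChI checks run first because both contain characters (- and =) that would otherwise trigger the SMILES rule.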

Malformed Queries

Malformed queries are identified as follows:

  • InChIKeys: queries that are not in the strict 14-10-1 uppercase format but match the looser regex pattern ^[a-zA-Z]{12,16}-[a-zA-Z]{9,11}-[a-zA-Z]{0,2}$
  • InChIs: queries that start with InChI= but with improper capitalization (for example, inchi=)
  • Unidentified: queries that fit none of the above criteria

Match Levels

An InChIKey query that has no exact match can still match on its first block (the 14 letters before the first hyphen). Such results have the match level First Block. SMILES, InChI, and Molecular Formula queries can only be Exact matches.
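
The InChIKey fallback can be illustrated against an in-memory list of known keys. This is a sketch only: the function name match_level and the known_keys argument are hypothetical, and the literal label "Exact" is assumed from the wording above ("First Block" and the empty string appear in the example response).

```python
def match_level(query_key, known_keys):
    """Return (level, matching_keys) for an InChIKey query."""
    if query_key in known_keys:
        return "Exact", [query_key]
    # Fall back to comparing only the first block (text before the first hyphen).
    first_block = query_key.split("-", 1)[0]
    hits = [key for key in known_keys if key.split("-", 1)[0] == first_block]
    if hits:
        return "First Block", hits
    return "", []
```

With the keys from the example response, match_level("XMBWDFGMSWQBCA-UHDFADDYSA-N", ["XMBWDFGMSWQBCA-UHFFFAOYSA-N"]) falls through to the first-block comparison, because the two keys differ only in their second block.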