Atom Expression

From jV

To specify a group of atoms in a molecule, the following kinds of expression are available.

The standard boolean connectives 'and', 'or' and 'not', and brackets can be used to construct a complex expression. The boolean connectives may be abbreviated as '&', '|', and '!', respectively.
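The connectives can be read as ordinary set operations on groups of atoms. The following is a minimal sketch of that reading, not jV's implementation: atoms are modelled as integer IDs, and the set contents, `ALL_ATOMS`, and the helper names are hypothetical.

```python
# Hypothetical toy model: every atom is an integer ID and every named set is
# a Python set of IDs. None of these names or memberships come from jV.
ALL_ATOMS = {1, 2, 3, 4, 5, 6}
NAMED_SETS = {
    "hydrophobic": {1, 2, 3},
    "aromatic": {3, 4},
    "backbone": {1, 4, 5},
}


def select_and(a, b):
    """'a and b' (or 'a & b'): atoms that belong to both sets."""
    return a & b


def select_or(a, b):
    """'a or b' (or 'a | b'): atoms that belong to either set."""
    return a | b


def select_not(a):
    """'not a' (or '!a'): atoms of the molecule that are not in the set."""
    return ALL_ATOMS - a


# Models the expression: (hydrophobic and not aromatic) or backbone
result = select_or(
    select_and(NAMED_SETS["hydrophobic"], select_not(NAMED_SETS["aromatic"])),
    NAMED_SETS["backbone"],
)
```

Brackets in an expression correspond to the nesting of the function calls here.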

predefined set

element name

Element names, such as carbon or nitrogen, can be used to select atoms.

set based on residue property

a) amino acids in protein molecule

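Set          Members
Acidic       ASP, GLU
Acyclic      ALA, ARG, ASN, ASP, CYS, GLU, GLN, GLY, ILE, LEU, LYS, MET, SER, THR, VAL
Aliphatic    ALA, GLY, ILE, LEU, VAL
Aromatic     HIS, PHE, TRP, TYR
Basic        ARG, HIS, LYS
Buried       ALA, CYS, ILE, LEU, MET, PHE, TRP, VAL
Charged      ARG, ASP, GLU, HIS, LYS
Cyclic       HIS, PHE, PRO, TRP, TYR
Hydrophobic  ALA, GLY, ILE, LEU, MET, PHE, PRO, TRP, TYR, VAL
Large        ARG, GLU, GLN, HIS, ILE, LEU, LYS, MET, PHE, TRP, TYR
Medium       ASN, ASP, CYS, PRO, THR, VAL
Negative     ASP, GLU
Neutral      ALA, ASN, CYS, GLN, GLY, HIS, ILE, LEU, MET, PHE, PRO, SER, THR, TRP, TYR, VAL
Polar        ARG, ASN, ASP, CYS, GLU, GLN, HIS, LYS, SER, THR
Positive     ARG, HIS, LYS
Small        ALA, GLY, SER
Surface      ARG, ASN, ASP, GLU, GLN, GLY, HIS, LYS, PRO, SER, THR, TYR
Cysteine     CYS
Amino        the 20 amino acids above, plus other amino acids
Protein      the Amino set, plus UNK and ACE, plus FOR (PDB version 2.3 only)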

Here, 'other amino acids' stands for components whose type in the Chemical Component Dictionary is one of the following:
D-PEPTIDE LINKING
D-PEPTIDE NH3 AMINO TERMINUS
L-PEPTIDE COOH CARBOXY TERMINUS
L-PEPTIDE LINKING
L-PEPTIDE NH3 AMINO TERMINUS
PEPTIDE LINKING
PEPTIDE-LIKE

b1) nucleotides (for PDB version 2.3)

Set          Members
AT           A, T
CG           C, G
Purine       A, G
Pyrimidine   C, T
DNA          A, C, G, T
RNA          A, C, G, U, +U, I, 1MA, 5MC, OMC, 1MG, 2MG, M2G, 7MG, OMG, YG, H2U, 5MU, PSU
Nucleic      A, C, G, T, U, +U, I, 1MA, 5MC, OMC, 1MG, 2MG, M2G, 7MG, OMG, YG, H2U, 5MU, PSU

b2) nucleotides (for PDB version 3)

Set          Members
AT           DA, DT
CG           DC, DG
Purine       DA, DG, A, G
Pyrimidine   DC, DT, C, U
DNA          DA, DC, DG, DT, other DNA
RNA          A, C, G, U, other RNA
Nucleic      DA, DC, DG, DT, A, C, G, U, other DNA, other RNA

Here, 'other DNA' and 'other RNA' stand for components whose type in the Chemical Component Dictionary is DNA LINKING and RNA LINKING, respectively.

c) others

Set          Members
Water        HOH, DOD
Ions         SO4, PO4
Solvent      HOH, DOD, SO4, PO4

others

Alpha

atoms whose name is CA.

Hetero

atoms written as HETATM in PDB files or atoms in Solvent set.

Ligand

atoms in Hetero set and not in Solvent set.
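The Hetero and Ligand definitions build on the Solvent set, and the dependency can be sketched in a few lines. This is only an illustration of the definitions as stated: the atom records and their field names (`record`, `resname`, `name`) are hypothetical, and the example residue HEM is just a sample non-solvent HETATM group.

```python
# Residue names in the Solvent set (Water plus Ions) as tabulated above.
SOLVENT_RESIDUES = {"HOH", "DOD", "SO4", "PO4"}


def is_solvent(atom):
    return atom["resname"].upper() in SOLVENT_RESIDUES


def is_hetero(atom):
    # Hetero: written as HETATM in the PDB file, or a member of Solvent.
    return atom["record"] == "HETATM" or is_solvent(atom)


def is_ligand(atom):
    # Ligand: in Hetero but not in Solvent.
    return is_hetero(atom) and not is_solvent(atom)


# Hypothetical atom records; the field names are illustrative only.
atoms = [
    {"record": "ATOM", "resname": "ALA", "name": "CA"},
    {"record": "HETATM", "resname": "HOH", "name": "O"},
    {"record": "HETATM", "resname": "HEM", "name": "FE"},
]
ligand_names = [a["resname"] for a in atoms if is_ligand(a)]
```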

Backbone
Mainchain

atoms in Amino set whose name is N, CA, C or O, or atoms in Nucleic set whose name is P, O1P, O2P, O5*, C5*, C4*, O4*, C3*, O3*, C2*, O2* or C1*.

Sidechain

atoms in Amino set or Nucleic set and not in Backbone set.

Bonded

atoms that are connected to at least one other atom.

Selected

atoms currently selected.

Helix

atoms in the alpha-helix structure.

Sheet

atoms in the beta-sheet structure.

Turn

atoms in the turn structure.

comparison operators

Parts of a molecule can be selected using equality, inequality and ordering operators on their properties. Possible operators are =, ==, <>, !=, /=, <, <=, > and >=, and possible property names are as follows.

AtomNo       atom ID in PDB files
ElemNo       atomic number
ResNo        residue ID in PDB files
Radius       radius of the ball image of an atom
Temperature  temperature factor of an atom
Model        model ID in PDB files
File         file ID
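As a rough illustration of how such comparisons filter atoms, the sketch below maps each operator spelling to a Python comparison. It assumes that '=' and '==' both mean equality and that '<>', '!=' and '/=' all mean inequality. The atom records and the helper `select_by_property` are hypothetical and not part of jV.

```python
import operator

# Every operator spelling listed above, mapped to a Python comparison.
# Assumption: '=' and '==' are synonyms, as are '<>', '!=' and '/='.
OPERATORS = {
    "=": operator.eq, "==": operator.eq,
    "<>": operator.ne, "!=": operator.ne, "/=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def select_by_property(atoms, prop, op, value):
    """Keep the atoms whose property compares true against value."""
    compare = OPERATORS[op]
    # Property names are looked up in lower case (case-insensitive match).
    return [atom for atom in atoms if compare(atom[prop.lower()], value)]


# Hypothetical atom records keyed by lower-cased property name.
atoms = [
    {"atomno": 1, "elemno": 7, "resno": 1, "temperature": 12.5},
    {"atomno": 2, "elemno": 6, "resno": 1, "temperature": 30.0},
    {"atomno": 3, "elemno": 6, "resno": 2, "temperature": 18.0},
]
carbons = select_by_property(atoms, "ElemNo", "=", 6)
cool = select_by_property(atoms, "Temperature", "<", 20.0)
```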

residue range

A group of atoms in a molecule can be selected by the residue ID. For example, command 'select 3' selects atoms whose residue ID is 3, and 'select 3-10' selects atoms whose residue ID is larger than or equal to 3 and smaller than or equal to 10. Optionally, the chain ID can be specified after residue range with a colon. For example, command 'select 3:A' selects atoms whose residue ID is 3 in the A chain. The chain ID can be specified in several ways (see primitive expression).
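The forms '3', '3-10' and '3:A' can be modelled with a small parser. This is a simplified sketch rather than jV's grammar: it assumes non-negative residue IDs and a single-character chain ID, it does not handle the prefixed chain forms described under primitive expression, and the function names and atom fields are hypothetical.

```python
import re

# Accepts '3', '3-10', '3:A' and '3-10:A'. Simplifying assumptions:
# non-negative residue IDs and a one-character chain ID.
RANGE_PATTERN = re.compile(r"^(\d+)(?:-(\d+))?(?::([A-Za-z0-9]))?$")


def parse_residue_range(text):
    """Return (first, last, chain); chain is None when no chain is given."""
    match = RANGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a residue range: {text!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return first, last, match.group(3)


def matches(atom, spec):
    """True when the atom's residue ID (and chain, if given) fits spec."""
    first, last, chain = parse_residue_range(spec)
    if not first <= atom["resno"] <= last:
        return False
    # Plain chain IDs are compared case-insensitively in this sketch.
    return chain is None or atom["chain"].upper() == chain.upper()
```

For example, `matches({"resno": 5, "chain": "B"}, "3-10")` is true because both bounds are inclusive, while the same atom fails `"3-10:A"` on the chain test.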

within expression

A within expression selects atoms that lie within a specified distance of another set of atoms. For example, 'select within(3.0, backbone)' selects atoms within a 3.0 Angstrom radius of any atom in a protein or nucleic acid backbone.
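The distance test can be sketched as a brute-force search over atom pairs. Two points here are assumptions rather than documented jV behaviour: the boundary is treated as inclusive (`<=`), and the reference atoms themselves are part of the result because their distance to themselves is zero. The atom records are hypothetical.

```python
import math


def within(distance, atoms, reference):
    """Return the atoms lying within distance of any atom in reference."""
    return [
        atom for atom in atoms
        if any(math.dist(atom["xyz"], ref["xyz"]) <= distance for ref in reference)
    ]


# Hypothetical atoms on a line, with coordinates in Angstroms.
atoms = [
    {"id": 1, "xyz": (0.0, 0.0, 0.0), "backbone": True},
    {"id": 2, "xyz": (2.0, 0.0, 0.0), "backbone": False},
    {"id": 3, "xyz": (5.0, 0.0, 0.0), "backbone": False},
]
backbone = [atom for atom in atoms if atom["backbone"]]
near = within(3.0, atoms, backbone)
```

A real viewer would likely use a spatial index rather than comparing every pair, but the selected set is the same.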

PDBj expression

A PDBj expression selects a group of atoms according to the molecule's properties defined in the PDBMLplus file. It takes the following form:

pdbj:keyword

For example, 'select pdbj:binding'. The keywords available for each molecule are obtained by the 'show pdbj' command.

primitive expression

A primitive expression takes the following form:

residue name[residue ID][:chain ID][.atom name][;alternate location][/model ID][@file ID]

Here, the residue name and the atom name are three-letter and four-letter names, respectively, and the terms in square brackets can be omitted. For example, the command 'select SER.CA' selects all alpha carbon atoms in serine residues. Residue names that contain numeric characters should be enclosed in square brackets, as in '[SO4]'.

Expressions are case-insensitive. However, a chain ID can be matched case-sensitively by writing it as 'a_A' or 'a_a', which select the chains whose auth_asym_id is 'A' or 'a', respectively. In the same way, label_asym_id can be used to specify chains case-sensitively, as in 'l_A'. Note that label_asym_id is available only for PDBML files, not for PDB flat files.
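The field order of a primitive expression can be made concrete with a regular expression. This is an illustrative sketch, not jV's parser. It assumes that unbracketed residue names are purely alphabetic, that the alternate location is a single character, and that model and file IDs are plain tokens. The function name and the returned dictionary keys are hypothetical.

```python
import re

# residue name[residue ID][:chain ID][.atom name][;alt loc][/model ID][@file ID]
PRIMITIVE = re.compile(
    r"""^
    (?:\[(?P<bracketed>[^\]]+)\]|(?P<plain>[A-Za-z]+))  # residue name
    (?P<resid>\d+)?                                     # residue ID
    (?::(?P<chain>\w+))?                                # chain ID
    (?:\.(?P<atom>[^;/@]+))?                            # atom name
    (?:;(?P<alt>\w))?                                   # alternate location
    (?:/(?P<model>\w+))?                                # model ID
    (?:@(?P<file>\w+))?                                 # file ID
    $""",
    re.VERBOSE,
)


def parse_primitive(text):
    """Split a primitive expression into its fields (None when omitted)."""
    match = PRIMITIVE.match(text.strip())
    if match is None:
        raise ValueError(f"not a primitive expression: {text!r}")
    fields = match.groupdict()
    bracketed = fields.pop("bracketed")
    plain = fields.pop("plain")
    # Residue and atom names are case-insensitive, so normalise to upper case.
    fields["residue"] = (bracketed or plain).upper()
    if fields["atom"] is not None:
        fields["atom"] = fields["atom"].upper()
    if fields["resid"] is not None:
        fields["resid"] = int(fields["resid"])
    # The chain ID is left untouched because forms such as 'a_A' carry case.
    return fields
```

For instance, `parse_primitive("SER.CA")` yields residue `SER` and atom `CA` with every other field `None`, and `parse_primitive("[SO4]")` yields residue `SO4`.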
