Atom Expression
From jV
To specify a group of atoms in a molecule, the following kinds of expression are available.
The standard boolean connectives 'and', 'or' and 'not', and brackets can be used to construct a complex expression. The boolean connectives may be abbreviated as '&', '|', and '!', respectively.
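As a rough illustration of how the connectives combine atom groups, the following Python sketch models each expression as a set of atom indices. The atom indices, the set contents and the helper names (`and_`, `or_`, `not_`) are hypothetical and are not part of jV; only the set semantics are shown.

```python
# Hypothetical model: every atom has an index, and each named group is a
# Python set of indices. This is not jV code, only the boolean semantics.
ALL_ATOMS = set(range(8))
hydrophobic = {0, 1, 2, 3}
aromatic = {2, 3, 4}


def and_(a, b):
    """'a and b' (or 'a & b'): atoms that are in both groups."""
    return a & b


def or_(a, b):
    """'a or b' (or 'a | b'): atoms that are in either group."""
    return a | b


def not_(a, universe=ALL_ATOMS):
    """'not a' (or '!a'): atoms of the molecule that are not in the group."""
    return universe - a


# Equivalent of the expression 'hydrophobic and not aromatic'.
result = and_(hydrophobic, not_(aromatic))
print(sorted(result))  # [0, 1]
```

Brackets in an expression correspond to nesting the calls, for example `not_(or_(hydrophobic, aromatic))` for 'not (hydrophobic or aromatic)'.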
predefined set
element name
Element names, such as carbon or nitrogen, can be used to select atoms.
set based on residue property
a) amino acids in protein molecule
| ALA | ARG | ASN | ASP | CYS | GLU | GLN | GLY | HIS | ILE | LEU | LYS | MET | PHE | PRO | SER | THR | TRP | TYR | VAL | |
| Acidic | * | * | ||||||||||||||||||
| Acyclic | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | |||||
| Aliphatic | * | * | * | * | * | |||||||||||||||
| Aromatic | * | * | * | * | ||||||||||||||||
| Basic | * | * | * | |||||||||||||||||
| Buried | * | * | * | * | * | * | * | * | ||||||||||||
| Charged | * | * | * | * | * | |||||||||||||||
| Cyclic | * | * | * | * | * | |||||||||||||||
| Hydrophobic | * | * | * | * | * | * | * | * | * | * | ||||||||||
| Large | * | * | * | * | * | * | * | * | * | * | * | |||||||||
| Medium | * | * | * | * | * | * | ||||||||||||||
| Negative | * | * | ||||||||||||||||||
| Neutral | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | ||||
| Polar | * | * | * | * | * | * | * | * | * | * | ||||||||||
| Positive | * | * | * | |||||||||||||||||
| Small | * | * | ||||||||||||||||||
| Surface | * | * | * | * | * | * | * | * | * | * | * | * | ||||||||
| Cysteine | * | |||||||||||||||||||
| Amino | the 20 amino acids above + other amino acids |
| Protein | the 'Amino' set above + UNK, ACE + FOR (PDB version 2.3 only) |
Here, 'other amino acids' stands for components whose type in the Chemical Component Dictionary is one of the following.
D-PEPTIDE LINKING
D-PEPTIDE NH3 AMINO TERMINUS
L-PEPTIDE COOH CARBOXY TERMINUS
L-PEPTIDE LINKING
L-PEPTIDE NH3 AMINO TERMINUS
PEPTIDE LINKING
PEPTIDE-LIKE
b1) nucleotides (for PDB version 2.3)
| A | C | G | T | U | +U | I | 1MA | 5MC | OMC | 1MG | 2MG | M2G | 7MG | OMG | YG | H2U | 5MU | PSU | |
| AT | * | * | |||||||||||||||||
| CG | * | * | |||||||||||||||||
| Purine | * | * | |||||||||||||||||
| Pyrimidine | * | * | |||||||||||||||||
| DNA | * | * | * | * | |||||||||||||||
| RNA | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | |
| Nucleic | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * |
b2) nucleotides (for PDB version 3)
| DA | DC | DG | DT | A | C | G | U | other DNA | other RNA | |
| AT | * | * | ||||||||
| CG | * | * | ||||||||
| Purine | * | * | * | * | ||||||
| Pyrimidine | * | * | * | * | ||||||
| DNA | * | * | * | * | * | |||||
| RNA | * | * | * | * | * | |||||
| Nucleic | * | * | * | * | * | * | * | * | * | * |
Here, 'other DNA' and 'other RNA' stand for components whose type in the Chemical Component Dictionary is DNA LINKING and RNA LINKING, respectively.
c) others
| | HOH | DOD | SO4 | PO4 |
| Water | * | * | | |
| Ions | | | * | * |
| Solvent | * | * | * | * |
others
| Alpha | atoms whose name is CA. |
| Hetero | atoms written as HETATM in PDB files or atoms in Solvent set. |
| Ligand | atoms in Hetero set and not in Solvent set. |
| Backbone, Mainchain | atoms in Amino set whose name is N, CA, C or O, or atoms in Nucleic set whose name is P, O1P, O2P, O5*, C5*, C4*, O4*, C3*, O3*, C2*, O2* or C1*. |
| Sidechain | atoms in Amino set or Nucleic set and not in Backbone set. |
| Bonded | atoms that are connected to at least one other atom. |
| Selected | atoms currently selected. |
| Helix | atoms in the alpha-helix structure. |
| Sheet | atoms in the beta-sheet structure. |
| Turn | atoms in the turn structure. |
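Several of these sets are defined in terms of other sets. The Python sketch below spells out the Alpha, Hetero and Ligand definitions as predicates. The atom records (dicts with `name`, `resname` and `hetatm` keys) are a hypothetical representation chosen for this example, and the Solvent membership is taken from the HOH, DOD, SO4 and PO4 columns of table c); none of this is jV's internal code.

```python
# Residues making up the Solvent set, as listed in table c).
SOLVENT_RESIDUES = {"HOH", "DOD", "SO4", "PO4"}


def is_solvent(atom):
    return atom["resname"] in SOLVENT_RESIDUES


def is_alpha(atom):
    # Alpha: atoms whose name is CA.
    return atom["name"] == "CA"


def is_hetero(atom):
    # Hetero: HETATM records, or atoms in the Solvent set.
    return atom["hetatm"] or is_solvent(atom)


def is_ligand(atom):
    # Ligand: in the Hetero set and not in the Solvent set.
    return is_hetero(atom) and not is_solvent(atom)


# Hypothetical atom records; "XYZ" is a made-up residue name.
atoms = [
    {"name": "CA", "resname": "SER", "hetatm": False},
    {"name": "O", "resname": "HOH", "hetatm": True},
    {"name": "C1", "resname": "XYZ", "hetatm": True},
]
print([a["resname"] for a in atoms if is_ligand(a)])  # ['XYZ']
```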
comparison operators
Parts of a molecule can be selected using equality, inequality and ordering operators on their properties. Possible operators are =, ==, <>, !=, /=, <, <=, > and >=, and possible property names are as follows.
| AtomNo | atom ID in PDB files. |
| ElemNo | atomic number. |
| ResNo | residue ID in PDB files. |
| Radius | radius of a ball image of atoms. |
| Temperature | temperature factor of atoms. |
| Model | model ID in PDB files. |
| File | File ID. |
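To make the operator list concrete, here is a minimal Python sketch of a property comparison. It assumes that '=' and '==' both mean equality and that '<>', '!=' and '/=' all mean inequality; the function `select_by_property` and the atom records are hypothetical and do not reflect how jV stores atoms.

```python
import operator

# Map each operator spelling listed above to a Python comparison.
# Assumption: the three inequality spellings are interchangeable.
OPERATORS = {
    "=": operator.eq, "==": operator.eq,
    "<>": operator.ne, "!=": operator.ne, "/=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def select_by_property(atoms, prop, op, value):
    """Return the atoms whose property satisfies 'prop op value'."""
    compare = OPERATORS[op]
    key = prop.lower()  # property names are matched case-insensitively here
    return [a for a in atoms if compare(a[key], value)]


# Hypothetical atom records keyed by lower-cased property name.
atoms = [
    {"atomno": 1, "resno": 3, "temperature": 12.5},
    {"atomno": 2, "resno": 3, "temperature": 31.0},
    {"atomno": 3, "resno": 4, "temperature": 8.2},
]
hot = select_by_property(atoms, "Temperature", ">=", 12.5)
print([a["atomno"] for a in hot])  # [1, 2]
```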
residue range
A group of atoms in a molecule can be selected by the residue ID. For example, command 'select 3' selects atoms whose residue ID is 3, and 'select 3-10' selects atoms whose residue ID is larger than or equal to 3 and smaller than or equal to 10. Optionally, the chain ID can be specified after residue range with a colon. For example, command 'select 3:A' selects atoms whose residue ID is 3 in the A chain. The chain ID can be specified in several ways (see primitive expression).
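The three example forms ('3', '3-10' and '3:A') can be sketched as a small parser. This Python example is an illustration only: the regular expression, the function names and the atom records are assumptions, it handles neither negative residue IDs nor the prefixed chain forms described under primitive expression, and it does not claim to match jV's own parser.

```python
import re

# Hypothetical grammar for this sketch: START[-END][:CHAIN]
RANGE_RE = re.compile(r"^(\d+)(?:-(\d+))?(?::([A-Za-z0-9]+))?$")


def parse_residue_range(text):
    """Parse '3', '3-10' or '3:A' into (start, end, chain)."""
    m = RANGE_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a residue range: {text!r}")
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) else start
    return start, end, m.group(3)


def in_range(atom, start, end, chain):
    """True if the atom's residue ID lies in [start, end] on the given chain."""
    if not start <= atom["resno"] <= end:
        return False
    # Chain IDs are compared case-insensitively in this sketch.
    return chain is None or atom["chain"].upper() == chain.upper()


print(parse_residue_range("3-10"))  # (3, 10, None)
```

For example, `in_range({"resno": 3, "chain": "A"}, *parse_residue_range("3:A"))` evaluates to `True`, while the same atom on chain B would be rejected.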
within expression
A within expression selects atoms that lie within a specified distance of another set of atoms. For example, 'select within(3.0, backbone)' selects atoms within a 3.0 Angstrom radius of any atom in a protein or nucleic acid backbone.
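The distance test behind a within expression can be sketched in Python as follows. The function `within`, the atom records and the coordinates are hypothetical. Whether jV treats the boundary distance as inclusive, and whether the reference atoms themselves are part of the result, is not stated above; this sketch assumes both.

```python
import math


def within(distance, targets, atoms):
    """Return the atoms lying within `distance` of any atom in `targets`."""
    selected = []
    for atom in atoms:
        # Assumption: the boundary is inclusive (<=). Target atoms that also
        # appear in `atoms` are selected because their distance is zero.
        if any(math.dist(atom["xyz"], t["xyz"]) <= distance for t in targets):
            selected.append(atom)
    return selected


# Hypothetical coordinates in Angstrom.
atoms = [
    {"name": "N", "xyz": (0.0, 0.0, 0.0)},
    {"name": "CB", "xyz": (2.0, 0.0, 0.0)},
    {"name": "OW", "xyz": (0.0, 4.0, 0.0)},
]
backbone = [atoms[0]]
print([a["name"] for a in within(3.0, backbone, atoms)])  # ['N', 'CB']
```

This brute-force loop compares every atom with every target atom, which is fine for illustration but slow for large structures; a spatial grid or k-d tree is the usual remedy.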
PDBj expression
A PDBj expression selects a group of atoms according to the molecule's properties defined in the PDBMLplus file. It takes the following form:
pdbj:keyword
For example, 'select pdbj:binding'. The keywords available for each molecule can be listed with the 'show pdbj' command.
primitive expression
A primitive expression takes the following form:
residue name[residue ID][:chain ID][.atom name][;alternate location][/model ID][@file ID]
Here, the residue name and the atom name are three-letter and four-letter names, respectively, and the terms in square brackets can be omitted. For example, the command 'select SER.CA' selects all alpha carbon atoms in serine residues. Residue names that contain numeric characters should be enclosed in square brackets, as in '[SO4]'. Expressions are case-insensitive. However, the chain ID can be specified case-sensitively, as in 'a_A' or 'a_a', which mean the chains whose auth_asym_id is 'A' or 'a', respectively. In the same way, label_asym_id can be used to specify chains case-sensitively, as in 'l_A'. Note that label_asym_id is available only for PDBML files, not for PDB flat files.
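The form above can be broken into its fields with a regular expression. The following Python sketch is an assumption-laden illustration, not jV's parser: the pattern, the function `parse_primitive` and the field names are hypothetical, residue, model and file IDs are assumed to be integers, and the chain ID is returned exactly as typed without interpreting the 'a_' or 'l_' prefixes.

```python
import re

# residue name[residue ID][:chain ID][.atom name][;alt loc][/model ID][@file ID]
PRIMITIVE_RE = re.compile(
    r"^(?P<resname>\[[^\]]+\]|[A-Za-z]+)"  # bracketed if it contains digits
    r"(?P<resno>-?\d+)?"                   # assumption: integer residue ID
    r"(?::(?P<chain>\w+))?"                # kept as typed, e.g. 'A' or 'a_A'
    r"(?:\.(?P<atom>[^;/@]+))?"
    r"(?:;(?P<altloc>\w))?"
    r"(?:/(?P<model>\d+))?"                # assumption: integer model ID
    r"(?:@(?P<file>\d+))?$"                # assumption: integer file ID
)


def parse_primitive(text):
    """Split a primitive expression into a dict of its fields."""
    m = PRIMITIVE_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a primitive expression: {text!r}")
    fields = m.groupdict()
    # Residue and atom names are case-insensitive, so normalise to upper case.
    fields["resname"] = fields["resname"].strip("[]").upper()
    if fields["atom"] is not None:
        fields["atom"] = fields["atom"].upper()
    for key in ("resno", "model", "file"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


print(parse_primitive("SER.CA")["atom"])  # CA
```

Omitted terms come back as `None`, so `parse_primitive("[SO4]")` yields the residue name `SO4` with every other field unset.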
