Monomer Table | Biologics | SARvision
How to Prepare Monomer Tables for Biologic Research
by Mark Hansen, Ph.D.
The Monomer table (also referred to as the Residue Table) contains information about each monomer used in a sequence. Monomer information and parameters give meaning to the sequence and enhance the resulting sequence analysis. These parameters can color monomers by type and property, sort monomers in a coherent way, give structural context in mouse-overs, and feed into calculations.
The sequence table shown below illustrates what data in the monomer table can add to sequence analyses. Monomer font colors are red, green and black for natural, enantiomer and unnatural amino acids, respectively. The background color of each cell is driven by the BKG:Hydrophobicity-HW column, which colors the cell according to the hydrophobicity of the monomer. Mousing over a monomer displays its chemical structure and formal name. Finally, sorting a sequence column sorts by the SORTORDER column in the monomer table, so that like monomers are grouped together. Collectively, these properties significantly enhance the interpretability of sequence tables used to analyze activity.
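As a rough illustration of how a sort-order column can group like monomers, the sketch below sorts one column of residues by a numeric key looked up in a small in-memory table. Only the SORTORDER column name comes from the article; the residue names, the numeric values, the `sort_column` helper and the choice to place unknown residues last are all assumptions made for the example, and this is not SARvision's own implementation.

```python
# Hypothetical monomer rows keyed by short name. The SORTORDER values are
# made up; here "a" (D-alanine) is placed right after "A" so the two group together.
MONOMER_TABLE = {
    "G": {"SORTORDER": 10},
    "A": {"SORTORDER": 20},
    "a": {"SORTORDER": 21},
    "V": {"SORTORDER": 30},
    "F": {"SORTORDER": 40},
}


def sort_column(residues, table=MONOMER_TABLE):
    """Sort residues by their SORTORDER value.

    Residues missing from the table sort last (an assumption of this sketch).
    """
    def key(residue):
        row = table.get(residue)
        return row["SORTORDER"] if row is not None else float("inf")

    # sorted() is stable, so residues with equal keys keep their input order.
    return sorted(residues, key=key)


column = ["F", "a", "G", "X", "A", "V"]
sorted_column = sort_column(column)
```

With these made-up values, the D-enantiomer sorts next to its natural counterpart instead of wherever plain alphabetical order would put it, which is the kind of grouping a dedicated sort column makes possible.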
At the bare minimum, a monomer table should contain a structure, naming conventions, and any physico-chemical and coloring parameters that may augment analysis. Additional columns should include a sorting column so that sequence columns in an alignment can be sorted, the closest natural residue to be used as a substitution for alignment algorithms, a category field (e.g. hydrophobic, aromatic, charged, ...) and a font color to help designate type (e.g. black: natural residue, red: enantiomer, green: unnatural residue, blue: N-methylated, ...). An example monomer table is shown below.
For monomer names (short names, medium names and synonyms), ‘|’ (pipe) is a chain break and must not be used as a character in a name. Single and double quotes, brackets, ‘#’ and periods are likewise reserved and must not be used. Long names, however, may contain any character. Note that if a residue name is used twice for two different structures, only the last occurrence is retained, so duplicate names should be avoided.
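The naming rules above can be checked before a table is loaded. The following is a minimal sketch, not part of SARvision: the function names and the example rows are hypothetical, and "brackets" is assumed to mean round, square and curly brackets. It flags names containing a reserved character and reports names that occur more than once, keeping the last occurrence as the text describes.

```python
import re

# Characters the article reserves in short/medium names and synonyms:
# pipe (chain break), single and double quotes, brackets, '#', and periods.
# Assumption: "brackets" covers (), [] and {}.
RESERVED = re.compile(r"""[|'"\[\](){}#.]""")


def invalid_names(names):
    """Return the names that contain at least one reserved character."""
    return [name for name in names if RESERVED.search(name)]


def build_lookup(rows):
    """Map each name to its structure.

    A repeated name overwrites the earlier entry (last occurrence wins)
    and is also collected in the returned list of duplicates.
    """
    lookup, duplicates = {}, []
    for name, structure in rows:
        if name in lookup:
            duplicates.append(name)
        lookup[name] = structure
    return lookup, duplicates


# Illustrative rows only: (short name, structure as a SMILES string).
rows = [
    ("Nle", "CCCC[C@H](N)C(=O)O"),
    ("Aib", "CC(C)(N)C(=O)O"),
    ("Nle", "CCCCC(N)C(=O)O"),
]
bad = invalid_names(["Ala", "D-Ala", "N.Me", "A|B", "Nle#"])
lookup, duplicates = build_lookup(rows)
```

Reporting duplicates rather than silently overwriting them makes it easier to catch the case where two different structures were given the same name by mistake.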
The Monomer table can reside in any of several places. SARvision comes with a default Monomer table stored locally, which can be extended manually using Excel or a molecular spreadsheet program. Alternatively, the monomers can be stored in Oracle using a molecule registration system and retrieved as necessary by SARvision. One good solution is to use CDDVault to register and store monomers for on-demand retrieval by SARvision. This is an excellent way to keep monomers up to date and consistent across multiple research groups.
In addition to the monomer table, SARvision supports the use of a Modifier table. It is not used often, but it offers the ability to annotate sequences with modifications that do not exist in the monomer table. For example, isotope labeling or pegylation can be added here. Like the Monomer table, the Modifier table can be edited in Excel and saved as a file, or stored in a database system such as Oracle or CDDVault for on-demand retrieval by the program. An example of a Modifier table is shown below. Modifiers use many of the same fields as the monomer table and behave similarly.