Protein Structure Modeling - Documentation



ProtMod is a protein modeling server that predicts the three-dimensional structures of proteins from their sequences. Instead of generating only one model for a given protein, ProtMod produces a list of protein models and ranks them by several model evaluation scores. The protein modeling pipeline in ProtMod is implemented as a tree. Calculating a single protein model consists of four steps:

(1) Finding a protein structure template related to the target.
(2) Aligning the target sequence to the template sequence and structure.
(3) Building a 3D model of the target protein.
(4) Evaluating the model.

In the tree-structured modeling pipeline implemented in ProtMod, several different methods are used at each of these steps:

(1) Several templates are used if more than one protein template is available.
(2) Different alignments are calculated for each template, using different methods of generating alternative or suboptimal alignments.
(3) Protein models are built using different modeling programs.
(4) The models are evaluated using different scoring methods, and an optimal model is selected.
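The branching described above can be pictured as a tree in which every root-to-leaf path (one template, one alignment, one modeling program) yields one candidate model. The following is a minimal sketch of that idea only. The template, alignment, and program names are hypothetical placeholders, and the scoring function is left to the caller, since this document does not list the methods ProtMod actually uses.

```python
from itertools import product

# Hypothetical placeholder names; the real templates, alignment methods,
# and modeling programs used by the server are not listed in this document.
templates = ["template_A", "template_B"]
alignment_methods = ["optimal", "suboptimal_1"]
modeling_programs = ["modeler_X", "modeler_Y"]


def enumerate_models(templates, alignment_methods, modeling_programs):
    """Return every root-to-leaf path of the tree as one candidate model."""
    return [
        {"template": t, "alignment": a, "program": p}
        for t, a, p in product(templates, alignment_methods, modeling_programs)
    ]


def rank_models(models, score):
    """Order candidate models by a caller-supplied score, highest first."""
    return sorted(models, key=score, reverse=True)


candidates = enumerate_models(templates, alignment_methods, modeling_programs)
print(len(candidates))  # 2 templates x 2 alignments x 2 programs = 8
```

The number of candidate models grows as the product of the choices at each level, which is why the server returns a ranked list rather than a single model.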

Input Format

There are 3 different input types that you can use to predict the structure of a protein: Query By Sequence, Query By Template, and Query By Alignment. In that order, these 3 input types require increasing knowledge of the query protein and the modeling process from the user.

With Query By Sequence, you only need to give the protein sequence of the query (target), and the server performs all 4 steps of the modeling pipeline.

With Query By Template, you need to give both the query's sequence and the template structure you want to use to build the query's structure. Thus, the first step of the modeling pipeline is done by the user. The server does not choose a template; instead, it starts by building the alignment between the query and the template using the alignment method the user chooses.

With Query By Alignment, you need to give not only the query's sequence and the template structure but also the alignment between the sequences of the query and the template. Therefore, the server only needs to do the last 2 steps, that is, build the models using the alignment and the template that the user gave, and then evaluate them. This input type gives users the most control over the modeling process, but it requires a better understanding of the query protein and the modeling pipeline.
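The relationship between the three input types and the four pipeline steps can be summarized in a few lines. This is an illustrative sketch only: the step names and dictionary keys are hypothetical labels chosen for this example, not identifiers used by the server.

```python
# The four pipeline steps, in order (labels are illustrative).
PIPELINE_STEPS = ["find_template", "align", "build_model", "evaluate"]

# How many leading steps the user has effectively completed by
# supplying the inputs required for each input type.
USER_COMPLETED_STEPS = {
    "query_by_sequence": 0,   # user gives: sequence
    "query_by_template": 1,   # user gives: sequence + template
    "query_by_alignment": 2,  # user gives: sequence + template + alignment
}


def server_steps(input_type):
    """Return the pipeline steps left for the server to perform."""
    return PIPELINE_STEPS[USER_COMPLETED_STEPS[input_type]:]


for input_type in USER_COMPLETED_STEPS:
    print(input_type, "->", server_steps(input_type))
```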

Input Format for Sequence Alignment

(You can copy/paste the alignment output from the FFAS server directly into the alignment input area of this server)

Two input formats are available.
One is to input the starting indices of the sequences separately from the sequence alignment, such as:

Starting Index of target sequence: 3    Starting Index of template sequence: 10


The other is to include the starting indices of the sequences in the sequence alignment, such as:

Starting Index of target sequence:   Starting Index of template sequence:


Thus, you can give the starting indices of the sequences either outside or inside the sequence alignment. If you give them in both places, the indices included in the sequence alignment are always used.

In addition, the sequence alignment can be broken into multiple pairs of alignment lines, with the target sequence in the first line and the template sequence in the second line of each pair, such as:



------ OR omit the sequence indices in the alignment, but give them separately --------



You can include any amount of white space anywhere, but newline characters are NOT allowed inside a sequence line. Moreover, only the starting indices from the first pair of alignment lines are used, if they are given. The remaining starting indices and all ending indices are ignored.
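The rules above can be illustrated with a small parser. This is a sketch under stated assumptions, not the server's own parser: it assumes each alignment line consists of an optional starting index, the gapped sequence (letters and "-"), and an optional ending index, with no text labels on the line. The function name and the returned dictionary keys are hypothetical.

```python
import re


def parse_alignment(text, target_start=None, template_start=None):
    """Parse alternating target/template lines into two gapped sequences.

    Assumed line layout (not confirmed by this documentation): an optional
    starting index, the gapped sequence, and an optional ending index.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("alignment lines must come in target/template pairs")

    seqs = ["", ""]
    starts = [target_start, template_start]
    for i, line in enumerate(lines):
        which = i % 2  # 0 = target line, 1 = template line
        leading_index = re.match(r"\s*(\d+)", line)
        if leading_index and i < 2:
            # Indices inside the first pair override separately given ones;
            # indices in later pairs and all ending indices are ignored.
            starts[which] = int(leading_index.group(1))
        # Keep residue letters and gap characters; drop spaces and digits.
        seqs[which] += "".join(re.findall(r"[A-Za-z-]", line))

    if len(seqs[0]) != len(seqs[1]):
        raise ValueError("target and template differ in aligned length")
    return {
        "target": seqs[0],
        "template": seqs[1],
        "target_start": starts[0],
        "template_start": starts[1],
    }


example = "3 ACD-EF 7\n10 ACDGEF 15\n8 GH-K 10\n16 GHIK 19\n"
result = parse_alignment(example, target_start=1, template_start=1)
print(result)
```

In this example the separately supplied indices (1 and 1) are overridden by the indices 3 and 10 found in the first pair of lines, and the second pair is simply appended to the first.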


Model Types

Here, the term "Model Types" is used in the context of protein homology modelling. Protein homology modelling predicts the structure of a protein (the target) based on the structure of its homolog (the template), given the sequence alignment between the target protein and the template protein.

Usually, modelling programs try to build models for all residues of the target sequence included in the alignment, although some programs don't build models for target residues that have no equivalent template residues in the sequence alignment. We call a model built by this approach an "All-Atom Model" because essentially all template residues are replaced by target residues according to the sequence alignment.

In contrast, the "Mixed Model" is built by replacing template residues that are not conserved between the target and the template with serine, except for glycines and alanines. The mixed model is usually used for the molecular replacement phasing method in crystallography when the All-Atom Model is not accurate enough for that purpose.
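The residue-replacement rule for the mixed model can be sketched at the sequence level as follows. Several points here are assumptions, because the text above does not specify them: gaps are written as "-", alignment columns containing a gap are skipped, and the glycine/alanine exception is applied to the template residue. The function name is hypothetical, and the sketch only decides which residue type goes at each position; it does not build coordinates.

```python
def mixed_model_sequence(target_aln, template_aln):
    """Return the residue type to place at each aligned template position.

    Assumptions (not stated in the source): gaps are '-', columns where
    either sequence has a gap are skipped, and the glycine/alanine
    exception applies to the template residue.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    residues = []
    for tgt, tpl in zip(target_aln.upper(), template_aln.upper()):
        if tgt == "-" or tpl == "-":
            continue  # no one-to-one residue pair in this column
        if tgt == tpl:
            residues.append(tpl)  # conserved: keep the residue
        elif tpl in ("G", "A"):
            residues.append(tpl)  # glycine/alanine are left unchanged
        else:
            residues.append("S")  # not conserved: replace with serine
    return "".join(residues)


print(mixed_model_sequence("ACDEF", "ACGKF"))  # ACGSF
```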

If you are not sure which model type you should build, you most likely need the "All-Atom Model", which is also the default model type the server builds.

Alignment Refinement

Sometimes the input alignment, or the alignment resulting from the template search, is not used directly for structure modelling. When you click on the alignment link of a specific model, the server usually shows the history of the alignment refinement, from the original alignment to the alignment used directly for modelling, along with the method by which the alignment was refined.

Currently there are 2 types of alignment refining methods: (1) User; (2) Blast.

User - The alignment is refined manually by the user.

Blast - The alignment is refined by "Blast". The server does this automatically for almost all given alignments before they are sent to the modelling programs, because most modelling programs require the template sequence in the alignment to be exactly consistent with the sequence in the COOR (coordinate) section of the template PDB file. Very often, however, the template sequence in the given alignment is the full template sequence, which is usually longer than, or has extra residues compared to, the sequence in the template PDB file. The server uses the "Blast" program to correct the alignment by aligning the template sequence from the alignment with the template sequence from the PDB file.
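The idea behind this correction step can be sketched as mapping each residue of the full template sequence onto the sequence actually present in the PDB coordinates. The sketch below uses Python's `difflib.SequenceMatcher` purely as a simple stand-in for Blast; it is not how the server performs the step, and the function name and example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def map_to_pdb(full_template, pdb_seq):
    """Map each residue of the full template sequence to its position in
    the PDB coordinate sequence, or None when it has no coordinates.

    difflib is only a stand-in here; the server itself uses Blast.
    """
    mapping = [None] * len(full_template)
    matcher = SequenceMatcher(None, full_template, pdb_seq, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


# The full sequence has two extra N-terminal residues and one extra
# C-terminal residue that are absent from the coordinates.
print(map_to_pdb("MKTAYIAKQR", "TAYIAKQ"))
```

Template residues that map to None have no coordinates, so the corresponding alignment columns would have to be removed or adjusted before the alignment is passed to a modelling program.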


  1. Why do I have to log in?
    With a private account, your data can't be viewed by others. In addition, you can view, modify, and download your data at any time. However, because disk storage is limited, please remember that we remove data that are more than 6 months old. Also, if your account remains inactive for more than 1 year, it will be removed too.
