# Introduction to CONSRANK

CONSRANK (CONSensus-RANKing) is a web service to easily and effectively analyse and rank docking models of protein-protein and protein-nucleic acid complexes, based on the frequency of inter-residue contacts.

The user only needs to:
i) upload her/his own PDB formatted files and
ii) specify the chain identifiers for the molecules involved in the interaction to be analysed. Please note that more than one chain can be selected for each interacting partner.

CONSRANK (CONSensus-RANKing) is a web service to easily and effectively analyse and rank docking models of biological complexes (such as protein-protein, protein-DNA and protein-RNA complexes), based on the conservation (frequency) of inter-residue contacts.

A user-friendly interface allows the user to upload her/his own input PDB files separately or in an archive file. The user is also requested to specify the chain identifiers for the molecules involved in the interaction to be analysed. A maximum of two chains can be selected for each interacting partner.

CONSRANK outputs are displayed on the results HTML page for one week and archived as downloadable compressed files. CONSRANK output includes:
• i) a list of the inter-residue contacts observed in the ensemble of models, with the corresponding conservation rates (CRkl), which is also visualized as a Sankey diagram;
• ii) the ranking of the submitted models based on the CONSRANK normalized score (\bar S_i);
• iii) parameters (C50, C70, C90) reflecting the overall conservation of inter-residue contacts in the ensemble of models.

Furthermore, CONSRANK gives as output a “consensus map”, i.e. an intermolecular contact map where the conservation of inter-residue contacts is reported on a gray scale. This is an interactive map that can be zoomed and navigated to visualize the identity of the residue pair corresponding to a given contact (dot) and its conservation rate. An interactive 3D representation of the consensus map, where the third dimension is given by the conservation rate (CRkl) of each inter-residue contact, is also provided. Note that only contacts with a conservation rate of at least 0.01 are visualized in the 3D map (for targets having a maximum conservation rate of 0.015 or below, the threshold is lowered to 0.005). This is the output that is sent to the e-mail address (if provided).
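The threshold rule for the 3D map can be illustrated with a minimal sketch. The function name `visible_contacts_3d` and the dictionary layout are hypothetical, chosen only for this example; they are not taken from the CONSRANK code.

```python
def visible_contacts_3d(conservation):
    """Return the contacts that pass the 3D-map threshold.

    conservation: dict mapping a (residue_k, residue_l) pair to its CRkl.
    (Hypothetical data layout, used only for illustration.)
    """
    if not conservation:
        return {}
    # Default threshold is 0.01; it is lowered to 0.005 when the maximum
    # conservation rate in the ensemble is 0.015 or below.
    threshold = 0.005 if max(conservation.values()) <= 0.015 else 0.01
    return {pair: cr for pair, cr in conservation.items() if cr >= threshold}


# Example with made-up residue labels and rates.
well_conserved = {("A:10", "B:5"): 0.5, ("A:11", "B:5"): 0.008}
poorly_conserved = {("A:10", "B:5"): 0.012, ("A:11", "B:5"): 0.006,
                    ("A:12", "B:5"): 0.004}
print(visible_contacts_3d(well_conserved))
print(visible_contacts_3d(poorly_conserved))
```

In the first ensemble only the contact with rate 0.5 passes the 0.01 threshold; in the second, the lowered threshold of 0.005 applies and the contact with rate 0.004 is dropped.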

Once the general output has been generated, the user can choose to perform further analyses on single models. Clicking on a specific model name activates two routines.
I) First, CONSRANK calculates the inter-molecular contact map for the single model and visualizes the corresponding contacts, colored and superimposed on the consensus map. The 3D consensus map is also updated to report, in the same color, the inter-residue contacts present in the selected model. The last three selected models are shown simultaneously, in different colors, in this representation. This helps the user to see how much the models resemble each other and how well each of them reflects the overall consensus, in terms of inter-residue contacts.
II) Second, the COCOMAPS server is launched on the selected model(s) to provide detailed information on the interface. This includes tables reporting the interacting residues of the model, the residues at the interface (defined on the basis of the surface buried upon complex formation) and the inter-molecular H-bonds, plus an online 3D visualization of the complex in Jmol (http://www.jmol.org/) and a downloadable, ready-to-run PyMOL (DeLano Scientific, 2002) script, which generates a visualization of the interface.

Given an ensemble of N models of the same biomolecular complex, for each inter-residue contact we define the conservation rate, CRkl, as in Eq. 1,

$$CR_{kl} = \frac{nc_{kl}}{N} \qquad (1)$$

where nckl is the total number of models in which residues k and l are in contact. The conservation rate thus ranges from CRkl = 0, if the contact between residues k and l is never observed, to CRkl = 1, if the contact is observed in all the models. Once the conservation rates have been calculated, the models in the ensemble are ranked according to their ability to match the most conserved inter-residue contacts. To this end, for each model i we first calculate a score as in Eq. 2:

$$S_i = \sum_{1}^{M_i} CR_{kl} \qquad (2)$$

where Mi is the total number of contacts in model i. Then, we calculate a normalized score, \bar S_i, as in Eq. 3:

$$\bar S_i = \frac{S_i}{M_i} \qquad (3)$$

Note that the normalized score \bar S_i of Eq. 3 coincides with the average conservation of the inter-residue contacts in each model. Models are ranked according to their \bar S_i value. Within this work, two residues are defined in contact if any pair of atoms belonging to the two residues is closer than a cut-off distance of 5 Å.
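Eqs. 1 to 3 can be followed step by step in a short sketch. The function names, the representation of each model as a set of residue pairs, and the residue labels are assumptions made for illustration; giving a score of 0 to a model without contacts is also an assumption, since the text does not cover that case.

```python
from collections import Counter


def conservation_rates(models):
    """Eq. 1: CRkl = nckl / N for every contact seen in the ensemble.

    models: dict mapping a model name to the set of its contacts, each
    contact being a (residue_k, residue_l) tuple (hypothetical layout).
    """
    n_models = len(models)
    counts = Counter(c for contacts in models.values() for c in contacts)
    return {contact: n / n_models for contact, n in counts.items()}


def rank_models(models):
    """Return (model name, normalized score) pairs, best score first."""
    cr = conservation_rates(models)
    scores = {}
    for name, contacts in models.items():
        if not contacts:
            scores[name] = 0.0  # assumption: no contacts means score 0
            continue
        s_i = sum(cr[c] for c in contacts)   # Eq. 2
        scores[name] = s_i / len(contacts)   # Eq. 3, M_i = len(contacts)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy ensemble of four models with made-up residue labels.
models = {
    "m1": {("A:10", "B:5"), ("A:11", "B:6")},
    "m2": {("A:10", "B:5")},
    "m3": {("A:10", "B:5"), ("A:12", "B:7")},
    "m4": {("A:20", "B:30")},
}
for name, score in rank_models(models):
    print(name, score)
```

Here the contact A:10/B:5 occurs in three of the four models (CRkl = 0.75) and every other contact in one (CRkl = 0.25), so m2 ranks first with a normalized score of 0.75 and m4 last with 0.25.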

C50 represents the fraction of inter-residue contacts common to 50% of the analysed models and is calculated as in Eq. 4, where nc50 is the total number of inter-residue contacts conserved in 50% of the analysed models.

$$C_{50} = \frac{nc_{50}}{\frac{1}{N}\sum_{i=1}^{N} nc_i} \qquad (4)$$

C70 represents the fraction of inter-residue contacts common to 70% of the analysed models and is calculated as in Eq. 5, where nc70 is the total number of inter-residue contacts conserved in 70% of the analysed models.

$$C_{70} = \frac{nc_{70}}{\frac{1}{N}\sum_{i=1}^{N} nc_i} \qquad (5)$$

C90 represents the fraction of inter-residue contacts common to 90% of the analysed models and is calculated as in Eq. 6, where nc90 is the total number of inter-residue contacts conserved in 90% of the analysed models.

$$C_{90} = \frac{nc_{90}}{\frac{1}{N}\sum_{i=1}^{N} nc_i} \qquad (6)$$

The total number of inter-residue contacts in an ensemble of N models, Nt, is calculated as in Eq. 7.

$$N_t = \sum_{i=1}^{N} nc_i \qquad (7)$$
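A minimal sketch of Eqs. 4 to 7 follows. It rests on two assumptions that the text does not state explicitly: nc_i is taken to be the number of contacts in model i, and a contact is counted as conserved in X% of the models when it occurs in at least X% of them. The function name and data layout are hypothetical.

```python
from collections import Counter


def overall_conservation(models, percents=(50, 70, 90)):
    """Compute Nt (Eq. 7) and C50, C70, C90 (Eqs. 4 to 6).

    models: dict mapping a model name to the set of its contacts, each
    contact being a (residue_k, residue_l) tuple (hypothetical layout).
    """
    n_models = len(models)
    if n_models == 0:
        raise ValueError("at least one model is required")
    # Number of models in which each contact occurs (nckl).
    counts = Counter(c for contacts in models.values() for c in contacts)
    # Eq. 7: total number of contacts over all models (nc_i assumed to be
    # the number of contacts in model i).
    n_t = sum(len(contacts) for contacts in models.values())
    result = {"Nt": n_t}
    for p in percents:
        # Assumption: "conserved in p% of the models" is read as
        # "present in at least p% of the models".
        nc_p = sum(1 for n in counts.values() if n * 100 >= p * n_models)
        # Denominator of Eqs. 4 to 6: average number of contacts per model.
        result[f"C{p}"] = nc_p * n_models / n_t if n_t else 0.0
    return result


# Toy ensemble of four models with made-up residue labels.
models = {
    "m1": {("A:10", "B:5"), ("A:11", "B:6")},
    "m2": {("A:10", "B:5")},
    "m3": {("A:10", "B:5"), ("A:12", "B:7")},
    "m4": {("A:20", "B:30")},
}
print(overall_conservation(models))
```

In this toy ensemble Nt is 6, so the average number of contacts per model is 1.5. Only one contact occurs in at least 50% (and at least 70%) of the models, and none reaches 90%.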

Job Name: Optional. Any string can be used to identify your results. If you leave it blank, a default name will be used. Do not use white-space separated words.
E-mail: Optional. If a valid e-mail address is provided, a link to a page on the server is e-mailed to the user, where she/he can access her/his results (which are also displayed online) for one month. Please note that the e-mailed link points to the general output only and does not include later detailed analyses on single models.
Chain/s Molecule 1: Enter one or more white-space separated single letters identifying the chain/s that belong to the first interacting partner (e.g. A or A B).
Chain/s Molecule 2: Enter one or more white-space separated single letters identifying the chain/s that belong to the second interacting partner (e.g. C or C D).
Name Molecule 1: Optional. Insert the name of the first interacting molecule of the complex (e.g. barnase). Otherwise, the generic name "Molecule 1" will be used.
Name Molecule 2: Optional. Insert the name of the second interacting molecule of the complex (e.g. barstar). Otherwise, the generic name "Molecule 2" will be used.
File: Upload your input PDB files separately, using the “+” button, or as an archive (.zip, .tar) file.
Submit: Click on this button to submit the analysis.

The output section “Table of inter-residue contacts with relative conservation rates (CRkl)” reports all the inter-residue contacts found in at least one of the analysed models with corresponding conservation rates (CRkl). It can be sorted according to different parameters and downloaded in several formats (.xls, .pdf, etc.).

The output section “Models ranking” reports the models ranking according to CONSRANK. By clicking on a model name it is possible to perform further analyses on it...

The output section “Consensus map” reports the consensus map, i.e. an intermolecular contact map where the conservation of inter-residue contacts is reported on a gray scale. It presents a dot at the crossover of two residues k and l, belonging to Molecule1 and Molecule2, respectively, if any pair of heavy atoms from the two residues is closer than 5 Å in at least one analysed model. The higher the conservation rate (CRkl) of the contact in the analysed ensemble of models, the darker the dot.
By clicking on it, an interactive map is generated, which can be zoomed and navigated to visualize the identity of the residue pair corresponding to a given contact (dot) and its conservation rate. An interactive 3D representation of the consensus map, where the third dimension is given by the conservation rate (CRkl) of each inter-residue contact, is also provided. This is the output that is sent to the e-mail address (if provided).
If further analyses have been performed on single models, they will be visualized on the interactive Consensus map ...
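
The 5 Å contact criterion behind each dot of the consensus map can be sketched as a brute-force search over atom pairs. The function name `residue_contacts` and the input layout (residue identifier mapped to a list of heavy-atom coordinates) are assumptions made for illustration; the sketch does not parse PDB files and is not the server's implementation.

```python
import math


def residue_contacts(mol1, mol2, cutoff=5.0):
    """Return the residue pairs in contact between two molecules.

    mol1, mol2: dict mapping a residue identifier to a list of (x, y, z)
    heavy-atom coordinates in Å (hypothetical layout).
    Two residues are in contact if any pair of their atoms is closer
    than the cut-off distance.
    """
    contacts = set()
    for res_k, atoms_k in mol1.items():
        for res_l, atoms_l in mol2.items():
            if any(math.dist(a, b) < cutoff
                   for a in atoms_k for b in atoms_l):
                contacts.add((res_k, res_l))
    return contacts


# Made-up coordinates: B:5 lies 3 Å from an atom of A:10, B:6 is far away.
molecule1 = {"A:10": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}
molecule2 = {"B:5": [(4.0, 0.0, 0.0)], "B:6": [(10.0, 0.0, 0.0)]}
print(residue_contacts(molecule1, molecule2))
```

Running this function on every model of an ensemble would give per-model contact sets from which conservation rates can be counted. For large complexes, a spatial index would be faster than the all-pairs loop used here.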

The output section “Overall conservation scores” reports three parameters, C50, C70, C90, which reflect the overall conservation of the inter-residue contacts in the analysed ensemble of models. In particular, C50, C70 and C90 represent the fraction of inter-residue contacts common to 50%, 70% and 90% of the models, respectively (see Parameters).

In the 2D and 3D maps each model name is preceded by its rank and followed by the total number of contacts it presents (in brackets).