Match-Box Web Server 1.3

Guy Baudoux, Christophe Lambert, Ernest Feytmans and Eric Depiereux.

Laboratory of Structural Molecular Biology, The University of Namur, Belgium.

Project supported by the Walloon Government, Federal State of Belgium.

Return to SUBMIT | HELP page.


The Mammalian Serine Proteases example.

The mammalian serine proteases are used by Greer (1,2) to present the methods of comparative modelling of protein structures. If you send the following data to the Match-Box server:

>>>>>>>>>>>> cut here - begin of file - cut here <<<<<<<<<<<<<<
>CHT
IVNGEEAVPGSWPWQVSLQDK
TGFHFCGGSLINENWVVTAAHCGVTTSDVVVAGEFDQGSSSEKIQKLKIA
KVFKNSKYNSLTINNDITLLKLSTAASFSQTVSAVCLPSA
SDDFAAGTTCVTTGWGLTRYTNANTPDRLQQASLPLLSNTNCKKYWGTKI
KDAMICAGASGVSSCMGDSGGPLVCKKNGAWTLVGIVSWGSSTCS
TSTPGVYARVTALVNWVQQTLAAN
>TRP
IVGGYTCGANTVPYQVSLN
SGYHFCGGSLINSQWVVSAAHCYKSGIQVRLGQDNINVVEGNQQFISASK
SIVHPSYNSNTLNNDIMLIKLKSAASLNSRVASISLPTS
CASAGTQCLISGWGNTKSSGTSYPDVLKCLKAPILSNSSCKSAYPGQI
TSNMFCAGYLQGGKDSCQGDSGGPVVCSGKLQGIVSWGSGCAQ
KNKPGVYTKVCNYVSWIKQTIASN
>ELA
VVASLVLYGHSTQDFPETNARVVGGTEAQRNSWPSQISLQYRSGS
SWAHTCGGTLIRQNWVMTAAHCVDRELTFRVVVGEHNLNQNNGTEQYVGV
QKIVVHPYWNTDDVAAGYDIALLRLAQSVTLNSYVQLGVLPRA
GTILANNSPCYITGWGLTRTNGQLAQTLQQAYLPTVDYAICSSSSYWGST
VKNSMVCAGGDGVRSGCQGDSGGPLHCLVNGQYAVHGVTSFVSRLGCNV
TRKPTVFTRVSAYISWINNVIASN
>MCP
IIGGVESIPHSRPYMAHLDIVTEK
GLRVICGGFLISRQFVLTAAHCKGREITVILGAHDVRKAESTQQKIKVEK
QIIHESYNSVPNLHDIMLLKLEKKVELTPAVNVVPLPSP
SDFIHPGAMCWAAGWGKTGVRDPTSYTLREVELRIMDEKACVDYRYYEYK
FQVCVGSPTTLRAAFMGDSGGPLLCAGVAHGIVSYGHPD
AKPPAIFTRVSTYVPTINAVIN
>SGT
VVGGTRAAQGEFPFMV
RLSMGCGGALYAQDIVLTAAHCVSGSGNNTSITATGGVVDLQSGAAVKVR
STKVLQAPGYNGTGKDWALIKLAQPINQPTLKIATTTAYN
QGTFTVAGWGANREGGSQQRYLLKANVPFVSDAACRSAYGNELV
ANEEICAGYPDTGGVDTCQGDSGGPMFRKDNADEWIQVGIVSWGYGCAR
PGYPGVYTEVSTFASAIASAARTL
>TON
IVGGYKCEKNSQPWQVAV
INEYLCGGVLIDPSWVITAAHCYSNNYQVLLGRNNLFKDEPFAQRRLVRQ
SFRHPDYIPLIVTNDTEQPVHDHSNDLMLLHLSEPADITGGVKVIDLPTK
EPKVGSTCLASGWGSTNPSEMVVSHDLQCVNIHLLSNEKCIETYKDNV
TDVMLCAGEMEGGKDTCAGDSGGPLICDGVLQGITSGGATPCAK
PKTPAIYAKLIKFTSWIKKVMKENP
>>>>>>>>>>>> cut here - end of file - cut here <<<<<<<<<<<<<<
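
Before sending, the input can be checked locally. The following is a minimal Python sketch, not part of Match-Box: the function name read_fasta is an illustration only, and it assumes that each sequence starts with a ">NAME" line, that the residues may span several lines, and that the ">>>> cut here <<<<" delimiters are not part of the file.

```python
def read_fasta(text):
    """Parse FASTA-style text into a list of (name, sequence) pairs.

    Assumption: lines starting with ">>" are the "cut here" delimiters
    shown on this page and are ignored.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">>"):
            continue  # blank line or "cut here" delimiter
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # residues may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def sequence_lengths(text):
    """Return (name, length) pairs for every sequence in the text."""
    return [(name, len(seq)) for name, seq in read_fasta(text)]
```

The lengths returned by sequence_lengths can then be compared with the "Sequences number, length and name" table of the server listing.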

The Match-Box server will give you the following alignment:

Sequences number, length and name
_________________________________

 1   230 CHT         2   223 TRP         3   261 ELA         4   224 MCP       
 5   223 SGT         6   235 TON       

           10        20        30        40        50        60        70
            +         +         +         +         +         +         +
1  ---------------------ivngeeavpgswpwqvsLQD---ktgfhfcggslinenwvvtaahcgvt
2  ---------------------ivggytcgantvpyqvsL-----nsgyhfcggslinsqwvvsaahcyks
3  VVASLVLYGHSTQDFPETNARvvggteaqrnswpsqisLQYRSGsswahtcggtlirqnwvmtaahcvdr
4  ---------------------iiggvesiphsrpymahLDIVTEkglrvicggflisrqfvltaahckgr
5  ---------------------vvggtraaqgefpfm--------vrlsmgcggalyaqdivltaahcvsg
6  ---------------------ivggykceknsqpwqva------vineylcggvlidpswvitaahcysn

                        66666666666666666      66555555555555444444444444

           80        90       100       110       120       130       140
            +         +         +         +         +         +         +
1  tsdvVVAGEFDQGSSSEKIQKLKIA--kvfknskynsLTI-----------nnditllklstaasfSQTV
2  giqvRLGQDNINVVEGNQQFISAS---ksivhpsynsNTL-----------nndimliklksaaslNSRV
3  eltfRVVVGEHNLNQNNGTEQYVGVQ-kivvhpywntDDVAA---------gydiallrlaqsvtlNSYV
4  eitvILGAHDVRKAESTQQKIKVE---kqiihesynsVPN-----------lhdimllklekkvelTPAV
5  sgnnTSITATGGVVDLQSGAAVKVRSTkvlqapgyngT-------------gkdwaliklaqpinqPTLK
6  nyqvLLGRNNLFKDEPFAQRRLVR---qsfrhpdyipLIVTNDTEQPVHDHsndlmllhlsepadiTGGV

   5566                       6666666666              555555555555556    

          150       160       170       180       190       200       210
            +         +         +         +         +         +         +
1  SAVCLPSASDDFAAGTTCvttgwgltrytnaNTPDRLQQASLPLLSNTNCKKYWGT-kikdamicagasG
2  ASISLPTSCASAGTQC--lisgwgntkssgtSYPDVLKCLKAPILSNSSCKSAYPG-qitsnmfcagylQ
3  QLGVLPRAGTILANNSPCyitgwgltrtngqLAQTLQQAYLPTVDYAICSSSSYWGStvknsmvcaggdG
4  NVVPLPSPSDFIHPGAMCwaagwgktgvrdpTSYTLREVEL----------------rimdekacvdyrY
5  IATTTAYNQGTF------tvagwganreggsQQRYLLKANVPFVSDAACRSAYGNE-lvaneeicagypD
6  KVIDLPTKEPKVGSTC--lasgwgstnpsemVVSHDLQCVNIHLLSNEKCIETYKD-nvtdvmlcagemE

                     4444444444456                          555555555666 

          220       230       240       250       260       270       280
            +         +         +         +         +         +         +
1  V---------------sscmgdsggplvckkngAWT-lvgivswgsstCS--tstpgvyarvtalvnwvQ
2  GGK-------------dscqgdsggpvvcsgk-----lqgivswgsgcAQ--knkpgvytkvcnyvswiK
3  VR--------------sgcqgdsggplhclvngQYA-vhgvtsfvsrlGCNVtrkptvftrvsayiswiN
4  YEYKFQVCVGSPTTLRaafmgdsggpllcagv-----ahgivsyghpd----akppaiftrvstyvptiN
5  TGGV------------dtcqgdsggpmfrkdnaDEWIqvgivswgygcAR--pgypgvytevstfasaiA
6  GGK-------------dtcagdsggplicdgv-----lqgitsggatpCAK-pktpaiyaklikftswiK

                   44444444444444556    66666666666    55555555555566666 

          290       300       310       320       330       340       350
            +         +         +         +         +         +         +
1  QTLAAN
2  QTIASN
3  NVIASN
4  AVIN--
5  SAARTL
6  KVMKEN


Discussion:

Amino acids shown in lower case are aligned; those shown in upper case are not. The Match-Box program looks for "boxes", that is, segments that are similar in all the sequences submitted by the user (comparable to the blocks of the BLOCKMAKER (3) server).
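
The lower case convention makes the aligned segments easy to pull out of the listing. The sketch below is an illustration in Python, not Match-Box code; the function name lowercase_runs is hypothetical. It reports each run of lower case letters in one alignment row with its column range. A run may correspond to one box or to several adjacent boxes; only the list of boxes in the listing tells them apart.

```python
import re


def lowercase_runs(row):
    """Return (first_column, last_column, residues) for each lower case run.

    Columns are 1-based and inclusive, counted from the start of the row.
    """
    return [(m.start() + 1, m.end(), m.group())
            for m in re.finditer(r"[a-z]+", row)]


# Row 2 (TRP) of the first alignment block, rebuilt from its parts.
trp_row = "-" * 21 + "ivggytcgantvpyqvs" + "L" + "-" * 5 + "nsgyhfcggslinsqwvvsaahcyks"

for first, last, residues in lowercase_runs(trp_row):
    print(first, last, residues)
```

For this row the two runs cover columns 22 to 38 and 45 to 70; the upper case L at column 39 and the gaps that follow it fall between them.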

Lowercase amino acids are aligned with gaps where two boxes overlap over several positions. The overlapping region can be delineated precisely from the list of boxes in table 3 of the alignment listing.
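
Such columns can be located directly in the listing. The Python sketch below is an illustration only (the function name and the first_column parameter are hypothetical): it flags the columns in which at least one row carries a lower case residue while another row carries a gap. It does not identify the overlapping boxes themselves, which still requires the list of boxes.

```python
def lowercase_vs_gap_columns(rows, first_column=1):
    """Return the columns where a lower case residue faces a gap ("-").

    rows         -- aligned rows covering the same columns
    first_column -- column number of the first character of each row
    """
    width = min(len(row) for row in rows)
    columns = []
    for i in range(width):
        column = [row[i] for row in rows]
        if "-" in column and any(c.islower() for c in column):
            columns.append(first_column + i)
    return columns


# Columns 34 to 46 of rows 1 (CHT) and 5 (SGT) in the first alignment block.
excerpt = ["pwqvsLQD---kt",
           "pfm--------vr"]
print(lowercase_vs_gap_columns(excerpt, first_column=34))
```

For this excerpt the flagged columns are 37 and 38, where the lower case "vs" of CHT faces gaps in SGT.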

This occurs when several possible alignments give a comparable level of similarity. Additional information, such as comparison with known structures or site-directed mutagenesis results, may be used to fix the gap position.


References:

  1. Greer, J. (1990). Comparative Modeling Methods: Application to the Family of the Mammalian Serine Proteases. Proteins: Struct. Funct. Genet. 7, 317-334.
  2. Greer, J. (1991). Comparative Modeling of Homologous Proteins. Methods Enzymol. 202, 232-252.
  3. Henikoff, S., Henikoff, J.G., Alford, W.J., Pietrokovski, S. (1995). Automated construction and graphical presentation of protein blocks from unaligned sequences. Gene-COMBIS, Gene 163, GC17-26.
