GENSCAN 1.0 Date run: 1-Aug-2000 Time: 16:43:38
Sequence HSBA536C5 : 168628 bp : 49.21% C+G : Isochore 2 (43 - 51 C+G%)
Parameter matrix: HumanIso.smat
Predicted genes/exons:
Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 2.04 PlyA -   7901   7896    6                               1.05
 2.03 Term -  10642  10463  180  1  0   28   43   120 0.957  -0.89
 2.02 Intr -  11044  10815  230  2  2   84   44   310 0.981  23.79
 2.01 Init -  14499  13650  850  0  1  126   53  2079 0.818 202.23
 2.00 Prom -  16112  16073   40                              -5.56

 3.00 Prom +  18327  18366   40                              -5.06
 3.01 Init +  18680  18726   47  1  2   84  105    30 0.585   4.46
 3.02 Intr +  23250  23284   35  0  2  151   69    35 0.533   5.77
 3.03 Term +  26615  26664   50  0  2  108   43    36 0.267  -1.43
 3.04 PlyA +  27305  27310    6                               1.05

 8.32 PlyA - 114694 114689    6                               1.05
 8.31 Term - 117609 117581   29  1  2  139   37    35 0.986   1.74
 8.30 Intr - 118004 117913   92  1  2  126   77   101 0.988  12.44
 8.29 Intr - 121211 121110  102  1  0   85   89    95 0.997   8.59
 8.28 Intr - 121457 121327  131  2  2  130   51   125 0.999  12.49
 8.27 Intr - 125623 125478  146  2  2  108   92   121 0.958  14.50
 8.26 Intr - 126663 126540  124  0  1  113   58   151 0.981  14.76
 8.25 Intr - 127050 126896  155  1  2   72   91   196 0.685  18.09
 8.24 Intr - 128563 128395  169  1  1   91   72   343 0.999  32.52
 8.23 Intr - 129031 128881  151  0  1   68   95   202 0.996  19.06
 8.22 Intr - 129561 129425  137  0  2  113   94   171 0.999  19.57
 8.21 Intr - 131557 131385  173  2  2  121   94    69 0.957  10.46
 8.20 Intr - 131891 131702  190  2  1  126   66   153 0.780  16.06
 8.19 Intr - 135872 135738  135  2  0   37   92   171 0.802  13.16
 8.18 Intr - 136182 136073  110  1  2  139   33   122 0.867  11.80
 8.17 Intr - 136622 136424  199  2  1   96   22   400 0.999  33.12
 8.16 Intr - 138994 138726  269  2  2   89   74   152 0.257  11.15
 8.15 Intr - 143743 143626  118  1  1  100   63   113 0.289  10.04
 8.14 Intr - 144150 144016  135  0  0   43  100   129 0.999  10.36
 8.13 Intr - 147107 146994  114  2  0  102   91   154 0.995  17.74
 8.12 Intr - 148107 147904  204  0  0  104   92    97 0.839  11.10
 8.11 Intr - 149987 149928   60  2  0  114  113    90 0.999  13.03
 8.10 Intr - 151157 150965  193  1  1   75   77   125 0.355   9.59
 8.09 Intr - 161359 161278   82  2  1  105   95    51 0.520   6.20
 8.08 Intr - 163259 163168   92  1  2  117   91   174 0.980  20.24
 8.07 Intr - 163512 163411  102  2  0  141   89    85 0.999  13.19
 8.06 Intr - 166251 166121  131  0  2  113   81   212 0.999  22.49
 8.05 Intr - 166582 166437  146  2  2  111   92   215 0.999  24.20
 8.04 Intr - 166905 166782  124  0  1  107   70   221 0.999  22.36
 8.03 Intr - 167313 167159  155  1  2  116   89   268 0.999  29.49
 8.02 Intr - 167718 167550  169  0  1   96   72   360 0.999  34.72
 8.01 Intr - 168007 167857  151  0  1   75   99   227 0.984  22.66
Predicted peptide sequence(s):
Predicted coding sequence(s):
>HSBA536C5|GENSCAN_predicted_peptide_2|419_aa
MAQENAAFSPGQEEPPRRRGRQRYVEKDGRCNVQQGNVRETYRYLTDLFTTLVDLQWRLS
LLFFVLAYALTWLFFGAIWWLIAYGRGDLEHLEDTAWTPCVNNLNGFVAAFLFSIETETT
IGYGHRVITDQCPEGIVLLLLQAILGSMVNAFMVGCMFVKISQPNKRAATLVFSSHAVVS
LRDGRLCLMFRVGDLRSSHIVEASIRAKLIRSRQTLEGEFIPLHQTDLSVGFDTGDDRLF
LVSPLVISHEIDAASPFWEASRRALERDDFEIVVILEGMVEATGMTCQARSSYLVDEGLW
GHRFTSVLTLEDGFYEVDYASFHETFEVPTPSCSARELAEAAARLDAHLYWSIPSRLDEK
RVSPRCDQLPPDPCGRPGARHRYMGNCISEVVEEEEEEEGKAPGNVLKLESPRPPEPQV
>HSBA536C5|GENSCAN_predicted_CDS_2|1260_bp
atggcgcaggagaacgcggccttctcgcccgggcaggaggagccgccgcggcgccgcggc
cgccagcgctacgtggagaaggatggccggtgcaacgtgcagcagggcaacgtgcgcgag
acataccgctacctgacggacctgttcaccacgctggtggacctgcagtggcgcctcagc
ctgttgttcttcgtcctggcctacgcgctcacctggctcttcttcggcgccatctggtgg
ctgatcgcctacggccgcggcgacctggagcacctggaggacaccgcgtggacgccgtgc
gtcaacaacctcaacggcttcgtggccgccttcctcttctccatcgagaccgagaccacc
atcggctacgggcaccgcgtcatcaccgaccagtgccccgagggcatcgtgctgctgctg
ctgcaggccatcctgggctccatggtgaacgccttcatggtgggctgcatgttcgtcaag
atctcgcagcccaacaagcgcgcagccacgctcgtcttctcctcgcacgccgtggtgtcg
ctgcgcgacgggcgcctctgcctcatgttccgcgtgggcgacttgcgctcctcacacata
gtggaggcctccatccgcgccaagctcatccgctcgcgccagacgctggagggcgagttc
atcccgctgcaccagaccgacctcagcgtgggcttcgacacgggagacgaccgcctcttc
ctcgtctcgccgctggttatcagccacgagatcgacgccgccagccccttctgggaggcg
tcgcgccgtgccctcgagagggacgacttcgagatcgtcgttatcctcgagggcatggtg
gaagccacgggaatgacatgccaagctcggagctcctacctggtagacgaggggctgtgg
ggccaccgcttcacgtcagtgctgactctggaggacggcttctacgaagtggactatgcc
agctttcacgagacttttgaggtgcccacaccttcgtgcagtgctcgagagctggcagag
gctgccgcccgccttgatgcccatctctactggtccatccccagccggctggatgagaag
agagtgagtccaaggtgtgaccagcttcctccagacccctgtggcagaccgggggccaga
cacagatacatggggaactgcatatcggaggtggtggaggaggaggaggaggaggaaggc
aaagcccctggaaatgtgctaaagttggaaagtccccgtcccccagaacctcaagtctag
>HSBA536C5|GENSCAN_predicted_peptide_3|43_aa
MNTAAINIHRQIFMWTSSVVKTSFTVTFSSPGVIPPRLPYARE
>HSBA536C5|GENSCAN_predicted_CDS_3|132_bp
atgaatacagctgctataaacatccatcggcagattttcatgtggacgtcttctgtggtg
aagacctccttcactgtgaccttctcctcaccaggtgtgatcccccccaggctcccctat
gcccgtgaatga
>HSBA536C5|GENSCAN_predicted_peptide_8|1429_aa
XEAKACVVHGSDLKDMTSEQLDEILKNHTEIVFARTSPQQKLIIVEGCQRQGAIVAVTGD
GVNDSPALKKADIGIAMGISGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIA
YTLTSNIPEITPFLLFIIANIPLPLGTVTILCIDLGTDMVPAISLAYEAAESDIMKRQPR
NSQTDKLVNERLISMAYGQIGMIQALGGFFTYFVILAENGFLPSRLLGIRLDWDDRTMND
LEDSYGQEWTYEQRKVVEFTCHTAFFASIVVVQWADLIICKTRRNSVFQQGMKNKILIFG
LLEETALAAFLSYCPGMGVALRMYPLKVTWWFCAFPYSLLIFIYDEVRKLILRRYPGDLA
ITKGSSGECKSLRLEKVDLSPSRGCFLPTVELGQLFLGIAMGLWGKKGTVAPHDQSPRRR
PKKGLIKKKMVKREKQKRNMEELKKEVVMDDHKLTLEELSTKYSVDLTKGHSHQRAKEIL
TRGGPNTVTPPPTTPEWVKFCKQLFGGFSLLLWTGAILCFVAYSIQIYFNEEPTKDNLYL
SIVLSVVVIVTGCFSYYQEAKSSKIMESFKNMVPQQALVIRGGEKMQINVQEVVLGDLVE
IKGGDRVPADLRLISAQGCKVDNSSLTGESEPQSRSPDFTHENPLETRNICFFSTNCVEG
TARGIVIATGDSTVMGRIASLTSGLAVGQTPIAAEIEHFIHLITVVAVFLGVTFFALSLL
LGYGWLEAIIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTST
ICSDKTGTLTQNRMTVAHMWFDMTVYEADTTEEQTGKTFTKSSDTWFMLARIAGLCNRAD
FKANQEILPIAKRATTGDASESALLKFIEQSYSSVAEMREKNPKVAEIPFNSTNKYQMSI
HLREDSSQTHVLMMKGAPERILEFCSTFLLNGQEYSMNDEMKEAFQNAYLELGGLGERVL
GFCFLNLPSSFSKGFPFNTDEINFPMDNLCFVGLISMIDPPRAAVPDAVSKCRSAGIKVI
MVTGDHPITAKAIAKGVGIISEGTETAEEVAARLKIPISKVDASAAKAIVVHGAELKDIQ
SKQLDQILQNHPEIVFARTSPQQKLIIVEGCQRLGAVVAVTGDGVNDSPALKKADIGIAM
GISGSDVSKQAADMILLDDNFASIVTGVEEGRLIFDNLKKSIMYTLTSNIPEITPFLMFI
ILGIPLPLGTITILCIDLGTDMVPAISLAYESAESDIMKRLPRNPKTDNLVNHRLIGMAY
GQIGMIQALAGFFTYFVILAENGFRPVDLLGIRLHWEDKYLNDLEDSYGQQWTYEQRKVV
EFTCQTAFFVTIVVVQWADLIISKTRRNSLFQQGMRNKVLIFGILEETLLAAFLSYTPGM
DVALRMYPLKITWWLCAIPYSILIFVYDEIRKLLIRQHPDGWVERETYY
>HSBA536C5|GENSCAN_predicted_CDS_8|4290_bp
nnagaagccaaggcatgcgtggtgcacggctctgacctgaaggacatgacatcggagcag
ctcgatgagatcctcaagaaccacacagagatcgtctttgctcgaacgtctccccagcag
aagctcatcattgtggagggatgtcagaggcagggagccattgtggccgtgacgggtgac
ggggtgaacgactcccctgcattgaagaaggctgacattggcattgccatgggcatctct
ggctctgacgtctctaagcaggcagccgacatgatcctgctggatgacaactttgcctcc
atcgtcacgggggtggaggagggccgcctgatctttgacaacttgaagaaatccatcgcc
tacaccctgaccagcaacatccccgagatcacccccttcctgctgttcatcattgccaac
atccccctacctctgggcactgtgaccatcctttgcattgacctgggcacagatatggtc
cctgccatctccttggcctatgaggcagctgagagtgatatcatgaagcggcagccacga
aactcccagacggacaagctggtgaatgagaggctcatcagcatggcctacggacagatc
gggatgatccaggcactgggtggcttcttcacctactttgtgatcctggcagagaacggt
ttcctgccatcacggctactgggaatccgcctcgactgggatgaccggaccatgaatgat
ctggaggacagctatggacaggagtggacctatgagcagcggaaggtggtggagttcacg
tgccacacggcattctttgccagcatcgtggtggtgcagtgggctgacctcatcatctgc
aagacccgccgcaactcagtcttccagcagggcatgaagaacaagatcctgatttttggg
ctcctggaggagacggcgttggctgcctttctctcttactgcccaggcatgggtgtagcc
ctccgcatgtacccgctcaaagtcacctggtggttctgcgccttcccctacagcctcctc
atcttcatctatgatgaggtccgaaagctcatcctgcggcggtatcctggtgaccttgca
atcacaaaaggttcttctggtgagtgcaagagcctgagactggaaaaggtggacttgtct
cccagtcgaggctgctttcttcccacagttgagctcgggcagctctttctggggatagct
atggggctttgggggaagaaagggacagtggctccccatgaccagagtccaagacgaaga
cctaaaaaagggcttatcaagaaaaaaatggtgaagagggaaaaacagaagcgcaatatg
gaggaactgaagaaggaagtggtcatggatgatcacaaattaaccttggaagagctgagc
accaagtactccgtggacctgacaaagggccatagccaccaaagggcaaaggaaatcctg
actcgaggtggacccaatactgttaccccaccccccaccactccagaatgggtcaaattc
tgtaagcaactgttcggaggcttctccctcctactatggactggggccattctctgcttt
gtggcctacagcatccagatatatttcaatgaggagcctaccaaagacaacctctacctg
agcatcgtactgtccgtcgtggtcatcgtcactggctgcttctcctattatcaggaggcc
aagagctccaagatcatggagtcttttaagaacatggtgcctcagcaagctctggtaatt
cgaggaggagagaagatgcaaattaatgtacaagaggtggtgttgggagacctggtggaa
atcaagggtggagaccgagtccctgctgacctccggcttatctctgcacaaggatgtaag
gtggacaactcatccttgactggggagtcagaaccccagagccgctcccctgacttcacc
catgagaaccctctggagacccgaaacatctgcttcttttccaccaactgtgtggaagga
accgcccggggtattgtgattgctacgggagactccacagtgatgggcagaattgcctcc
ctgacgtcaggcctggcggttggccagacacctatcgctgctgagatcgaacacttcatc
catctgatcactgtggtggccgtcttccttggtgtcactttttttgcgctctcacttctc
ttgggctatggttggctggaggctatcatttttctcattggcatcattgtggccaatgtg
cctgaggggctgttggccacagtcactgtgtgcctgaccctcacagccaagcgcatggcg
cggaagaactgcctggtgaagaacctggaggcggtggagacgctgggctccacgtccacc
atctgctcagacaagacgggcaccctcacccagaaccgcatgaccgtcgcccacatgtgg
tttgatatgaccgtgtatgaggccgacaccactgaagaacagactggaaaaacatttacc
aagagctctgatacctggtttatgctggcccgaatcgctggcctctgcaaccgggctgac
tttaaggctaatcaggagatcctgcccattgctaagagggccacaacaggtgatgcttcc
gagtcagccctcctcaagttcatcgagcagtcttacagctctgtggcggagatgagagag
aaaaaccccaaggtggcagagattccctttaattctaccaacaagtaccagatgtccatc
caccttcgggaggacagctcccagacccacgtactgatgatgaagggtgctccggagagg
atcttggagttttgttctacctttcttctgaatgggcaggagtactcaatgaacgatgaa
atgaaggaagccttccaaaatgcctacttagaactgggaggtctgggggaacgtgtgcta
ggcttctgcttcttgaatctgcctagcagcttctccaagggattcccatttaatacagat
gaaataaatttccccatggacaacctttgttttgtgggcctcatatccatgattgaccct
ccccgagctgcagtgcctgatgctgtgagcaagtgtcgcagtgcaggaattaaggtgatc
atggtaacaggagatcatcccattacagctaaggccattgccaagggtgtgggcatcatc
tcagaaggcactgagacggcagaggaagtcgctgcccggcttaagatccctatcagcaag
gtcgatgccagtgctgccaaagccattgtggtgcatggtgcagaactgaaggacatacag
tccaagcagcttgatcagatcctccagaaccaccctgagatcgtgtttgctcggacctcc
cctcagcagaagctcatcattgtcgagggatgtcagaggctgggagccgttgtggccgtg
acaggtgacggggtgaacgactcccctgcgctgaagaaggctgacattggcattgccatg
ggcatctctggctctgacgtctctaagcaggcagccgacatgatcctgctggatgacaac
tttgcctccatcgtcacgggggtggaggagggccgcctgatctttgacaacctgaagaaa
tccatcatgtacaccctgaccagcaacatccccgagatcacgcccttcctgatgttcatc
atcctcggtatacccctgcctctgggaaccataaccatcctctgcattgatctcggcact
gacatggtccctgccatctccttggcttatgagtcagctgaaagcgacatcatgaagagg
cttccaaggaacccaaagacggataatctggtgaaccaccgtctcattggcatggcctat
ggacagattgggatgatccaggctctggctggattctttacctactttgtaatcctggct
gagaatggttttaggcctgttgatctgctgggcatccgcctccactgggaagataaatac
ttgaatgacctggaggacagctacggacagcagtggacctatgagcaacgaaaagttgtg
gagttcacatgccaaacggccttttttgtcaccatcgtggttgtgcagtgggcggatctc
atcatctccaagactcgccgcaactcacttttccagcagggcatgagaaacaaagtctta
atatttgggatcctggaggagacactcttggctgcatttctgtcctacactccaggcatg
gacgtggccctgcgaatgtacccactcaagataacctggtggctctgtgccattccctac
agtattctcatcttcgtctatgatgaaatcagaaaactcctcatccgtcagcacccggat
ggctgggtggaaagggagacgtactactaa
Explanation
Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon
        Intr = Internal exon
        Term = Terminal exon
        Sngl = Single-exon gene
        Prom = Promoter
        PlyA = poly-A signal
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a codon ending at x is in frame f = x mod 3)
Ph    : net phase of exon (length mod 3)
I/Ac  : initiation signal or acceptor splice site score (x 10)
Do/T  : donor splice site or termination signal score (x 10)
CodRg : coding region score (x 10)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
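
The column definitions above can be read programmatically. The following Python is a minimal sketch, not part of GENSCAN: the names Feature, parse_feature and is_consistent are hypothetical, and it assumes whitespace-separated rows with 13 fields for exons and 7 fields for Prom/PlyA signals, as in the table in this report. The consistency check uses only what the explanation states (Len is the span from Begin to End, Ph is length mod 3).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """One row of the predicted genes/exons table (hypothetical container)."""
    gene: int
    exon: int
    kind: str            # Init, Intr, Term, Sngl, Prom or PlyA
    strand: str          # "+" or "-"
    begin: int
    end: int
    length: int
    frame: Optional[int] = None
    phase: Optional[int] = None
    i_ac: Optional[int] = None
    do_t: Optional[int] = None
    cod_rg: Optional[int] = None
    prob: Optional[float] = None
    tscr: Optional[float] = None


def parse_feature(line: str) -> Feature:
    """Parse one table row; assumes whitespace-separated fields."""
    f = line.split()
    gene, exon = f[0].split(".")
    base = dict(gene=int(gene), exon=int(exon), kind=f[1], strand=f[2],
                begin=int(f[3]), end=int(f[4]), length=int(f[5]))
    if len(f) == 7:    # Prom / PlyA rows carry only a score
        return Feature(**base, tscr=float(f[6]))
    if len(f) == 13:   # exon rows carry all columns
        return Feature(**base, frame=int(f[6]), phase=int(f[7]),
                       i_ac=int(f[8]), do_t=int(f[9]), cod_rg=int(f[10]),
                       prob=float(f[11]), tscr=float(f[12]))
    raise ValueError(f"unexpected number of fields: {len(f)}")


def is_consistent(feat: Feature) -> bool:
    """Check Len against Begin/End and, for exons, Ph against Len mod 3."""
    if feat.length != abs(feat.begin - feat.end) + 1:
        return False
    return feat.phase is None or feat.phase == feat.length % 3


row = parse_feature("2.03 Term - 10642 10463 180 1 0 28 43 120 0.957 -0.89")
print(row.kind, row.length, row.prob, is_consistent(row))
```

On the minus strand Begin is larger than End, which is why the length check takes the absolute difference.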
Comments
The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. Thus, for example, a predicted donor splice site with
score > 100 is excellent; 50-100 is acceptable; 0-50 is weak; and
below 0 is poor (probably not a real donor site).
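
The score bands for a donor splice site can be expressed as a small lookup. This is an illustrative sketch only: the function name classify_site_score is hypothetical, the thresholds are assumed to apply to the values as printed in the Do/T column, and the handling of the exact boundary values (0, 50, 100) is an assumption, since the text above does not specify it.

```python
def classify_site_score(score: float) -> str:
    """Map a donor splice site score to the quality bands quoted above.

    Assumption: a score of exactly 50 or 100 counts as "acceptable" and a
    score of exactly 0 counts as "weak"; the report does not say.
    """
    if score > 100:
        return "excellent"
    if score >= 50:
        return "acceptable"
    if score >= 0:
        return "weak"
    return "poor"


# Do/T values taken from exons 3.01, 2.01 and 2.03 in the table of this report
for do_t in (105, 53, 43):
    print(do_t, classify_site_score(do_t))
```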
The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties. This information can be used to assess the reliability of the
predicted exon, e.g., it would be better to design PCR primers based on
a predicted exon with probability > 0.95 than one with lower probability.
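
As a sketch of the primer-design advice above, exons can be filtered by their P value before further use. The function name high_confidence and the (label, probability) tuple layout are hypothetical; the 0.95 cutoff is the one suggested in the comment above, and the sample values are the three exons of predicted gene 2 in this report.

```python
from typing import List, Tuple


def high_confidence(exons: List[Tuple[str, float]],
                    threshold: float = 0.95) -> List[str]:
    """Return labels of exons whose probability exceeds the threshold."""
    return [label for label, prob in exons if prob > threshold]


# (Gn.Ex, P) pairs for gene 2, copied from the table of this report
gene2 = [("2.03", 0.957), ("2.02", 0.981), ("2.01", 0.818)]
print(high_confidence(gene2))
```

With these values the initial exon 2.01 (P = 0.818) falls below the cutoff, so only 2.03 and 2.02 are kept.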