MEME job 14367 MAST analysis: Dean's Sequences
From: John W. Fondon III (Trey)
Subject: MEME job 14367 MAST analysis: Dean's Sequences
Date: Thu, 12 Feb 1998 15:06:35 -0600
>From: address@hidden
>Date: Tue, 10 Feb 1998 14:03:32 +0100 (MET)
>To: address@hidden
>Subject: MEME job 14367 MAST analysis: Dean's Sequences
>
>********************************************************************************
>MAST - Motif Alignment and Search Tool
>********************************************************************************
> MAST version 2.0 (Release date: 1996/11/17 00:39:06)
>
> For further information on how to interpret these results or to get
> a copy of the MAST software please access http://www.sdsc.edu/MEME.
>********************************************************************************
>
>
>********************************************************************************
>REFERENCE
>********************************************************************************
> If you use this program in your research, please cite:
>
> Timothy L. Bailey and Charles Elkan,
> "Fitting a mixture model by expectation maximization to discover
> motifs in biopolymers", Proceedings of the Second International
> Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
> AAAI Press, Menlo Park, California, 1994.
>********************************************************************************
>
>
>********************************************************************************
>DATABASE AND MOTIFS
>********************************************************************************
> DATABASE meme.14367.data
> Last updated on Tue Feb 10 14:02:21 1998
> Database contains 5 sequences, 748 residues
>
> MOTIFS meme.14367.results
> MOTIF WIDTH BEST POSSIBLE MATCH
> ----- ----- -------------------
> 1 8 EVKCYMAC
> 2 8 CLNETGAT
> 3 11 PEDHCEAAFAY
>********************************************************************************
>
>
>********************************************************************************
>EXPLANATION OF RESULTS
>********************************************************************************
> SECTION I: HIGH-SCORING SEQUENCES
> - the names of sequences containing occurrences of the
> motif(s)
> SECTION II: MOTIF DIAGRAMS
> - the order and spacing of non-overlapping occurrences
> of the motif(s) in each of the high-scoring sequences
> SECTION III: ANNOTATED SEQUENCES
> - the high-scoring sequences annotated with the
> positions and strengths of the motif occurrences
>********************************************************************************
>
>
>********************************************************************************
>SECTION I: HIGH-SCORING SEQUENCES
>********************************************************************************
> - Each of the following 5 sequences has e-value of less than 10.
> - The e-value of a sequence is the expected number of sequences
> in a random database of the same size that would match the motifs as
> well as the sequence does and is equal to the combined p-value of the
> sequence times the number of sequences in the database.
> - The combined p-value of a sequence measures the strength of the
> match of the sequence to all the motifs and is calculated by
> o finding the score of the single best match of each motif
> to the sequence (best matches may overlap),
> o calculating the sequence p-value of each score,
> o forming the product of the p-values,
> o taking the p-value of the product.
> - The sequence p-value of a score is defined as the
> probability of a random sequence of the same length containing
> some match with as good or better a score.
> - The score for the match of a position in a sequence to a motif
> is computed by summing the appropriate entry from each column of
> the position-dependent scoring matrix that represents the motif.
> - Sequences shorter than one or more of the motifs are skipped.
> - The table is sorted by increasing e-value.
>********************************************************************************
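The definitions above can be checked against the numbers in this report. The Python sketch below multiplies a combined p-value by the number of sequences in the database (5 here) to get the e-value, and shows one standard way to take "the p-value of the product" of n p-values. The function names are mine, and the product formula assumes the individual p-values are independent and uniformly distributed under the null model; whether MAST 2.0 computes it in exactly this way is an assumption, not something stated in the output.

```python
import math


def product_p_value(product: float, n: int) -> float:
    """P(product of n independent Uniform(0,1) values <= product).

    Closed form: product * sum_{i=0}^{n-1} (-ln product)^i / i!
    Assumes independent, uniformly distributed p-values (an assumption here).
    """
    if product <= 0.0:
        return 0.0
    if product >= 1.0:
        return 1.0
    log_term = -math.log(product)
    return product * sum(log_term ** i / math.factorial(i) for i in range(n))


def e_value(combined_p: float, num_sequences: int) -> float:
    """E-value as defined in the report: combined p-value times database size."""
    return combined_p * num_sequences


# Combined p-values copied from Section III of this report.
combined = {
    "PBP-2": 3.35e-19,
    "PBP-5": 4.95e-17,
    "PBP-1": 3.97e-10,
    "PBP-3": 8.73e-09,
    "LUSH": 2.97e-03,
}

for name, p in combined.items():
    print(f"{name:6s} combined p = {p:.2e}  e-value = {e_value(p, 5):.2g}")
```

To two significant figures, the printed e-values agree with the E-VALUE column of the Section I table.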
>
>SEQUENCE NAME  DESCRIPTION  E-VALUE   LENGTH
>-------------  -----------  --------  ------
>PBP-2                       1.7e-18      150
>PBP-5                       2.5e-16      143
>PBP-1                         2e-09      148
>PBP-3                       4.4e-08      154
>LUSH                          0.015      153
>
>
>********************************************************************************
>SECTION II: MOTIF DIAGRAMS
>********************************************************************************
> - The ordering and spacing of all non-overlapping motif occurrences
> are shown for each high-scoring sequence listed in Section I.
> - A motif occurrence is defined as a position in the sequence whose
> match to the motif has position p-value less than the value
> given below in the legend.
> - The position p-value of a match is the probability of a single
> random subsequence of the length of the motif scoring at least as well
> as the observed match.
> - For each sequence, all motif occurrences are shown unless there
> are overlaps. In that case, a motif occurrence is shown only if its
> p-value is less than the product of the p-values of the other
> (lower-numbered) motif occurrences that it overlaps.
> - The table also shows the e-value of each sequence.
>
> LEGEND
> ----------------------------------------------------------------------
> -d- `d' residues separate the end of the preceding motif occurrence
> and the start of the following motif occurrence
> [n] occurrence of motif `n' with p-value less than 0.0001
>********************************************************************************
>
>SEQUENCE NAME E-VALUE MOTIF DIAGRAM
>------------- -------- -------------
>PBP-2 1.7e-18 16-[2]-16-[2]-16-[1]-49-[3]-18
>PBP-5 2.5e-16 37-[2]-16-[1]-49-[3]-14
>PBP-1 2e-09 41-[2]-17-[1]-52-[1]-14
>PBP-3 4.4e-08 54-[2]-16-[1]-68
>LUSH 0.015 71-[1]-74
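The motif diagrams above can be read mechanically: each bare number is a spacer length and each [n] is an occurrence of motif n, whose width is listed under DATABASE AND MOTIFS. The Python sketch below (the function and variable names are mine, for illustration only) recovers the 1-based start position of each occurrence and checks that spacers plus motif widths add up to the reported sequence length.

```python
import re

# Motif widths as listed in the DATABASE AND MOTIFS section of this report.
MOTIF_WIDTHS = {1: 8, 2: 8, 3: 11}


def parse_diagram(diagram, widths):
    """Return ([(motif_number, start_position), ...], total_length).

    Start positions are 1-based. Tokens are either spacers ("16")
    or motif occurrences ("[2]"), separated by hyphens.
    """
    position = 1
    occurrences = []
    for motif, spacer in re.findall(r"\[(\d+)\]|(\d+)", diagram):
        if motif:
            occurrences.append((int(motif), position))
            position += widths[int(motif)]
        else:
            position += int(spacer)
    return occurrences, position - 1


# Diagrams and lengths copied from this report.
diagrams = {
    "PBP-2": ("16-[2]-16-[2]-16-[1]-49-[3]-18", 150),
    "PBP-5": ("37-[2]-16-[1]-49-[3]-14", 143),
    "PBP-1": ("41-[2]-17-[1]-52-[1]-14", 148),
    "PBP-3": ("54-[2]-16-[1]-68", 154),
    "LUSH": ("71-[1]-74", 153),
}

for name, (diagram, reported_length) in diagrams.items():
    occurrences, length = parse_diagram(diagram, MOTIF_WIDTHS)
    print(name, occurrences, length == reported_length)
```

For PBP-2 this places motif 3 at position 122, which is where PEDHCDAAFAY sits in the annotated sequence in Section III.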
>
>
>
>********************************************************************************
>SECTION III: ANNOTATED SEQUENCES
>********************************************************************************
> - The positions and p-values of the non-overlapping motif occurrences
> are shown above the actual sequence for each of the high-scoring
> sequences from Section I.
> - A motif occurrence is defined as a position in the sequence whose
> match to the motif has position p-value less than 0.0001 as
> defined in Section II.
> - For each sequence, the first line specifies the name of the sequence.
> - The second (and possibly more) lines give a description of the
> sequence.
> - Following the description line(s) is a line giving the length,
> combined p-value, and e-value of the sequence as defined in Section I.
> - The next line reproduces the motif diagram from Section II.
> - The entire sequence is printed on the following lines.
> - Motif occurrences are indicated directly above their positions in the
> sequence on four lines showing
> o the motif number of the occurrence,
> o the position p-value of the occurrence,
> o the best possible match to the motif, and
> o columns whose match to the motif has a positive score (indicated by
> a plus sign).
>********************************************************************************
>
>
>PBP-2
>
> LENGTH = 150 COMBINED P-VALUE = 3.35e-19 E-VALUE = 1.7e-18
> DIAGRAM: 16-[2]-16-[2]-16-[1]-49-[3]-18
>
> [2] [2] [1]
> 7.1e-05 4.7e-08 3.4e-07
> CLNETGAT CLNETGAT EVKCYMAC
> ++ + + + ++++++ ++++ +++
> 1 MSHLVHLTVLLLVGILCLGATSAKPHEEINRDHLLELANECKAETGATDEDVEQLMSHDLPERHEAKCLRACVMK
>
> [3]
> 5.6e-15
> PEDHCEAAFAY
> +++++++++++
> 76 KLQIMDESGKLNKEHAIELVKVMSKHDAEKEDAPAEVVAKCEAIETPEDHCDAAFAYEECIYEQMREHGLELEEH
>
>
>PBP-5
>
> LENGTH = 143 COMBINED P-VALUE = 4.95e-17 E-VALUE = 2.5e-16
> DIAGRAM: 37-[2]-16-[1]-49-[3]-14
>
> [2] [1]
> 2.8e-07 4.4e-06
> CLNETGAT EVKCYMAC
> ++ + +++ ++ +++
> 1 MQSTPIILVAIVLLGAALVRAFDEKEALAKLMESAESCMPEVGATDADLQEMVKKQPASTYAGKCLRACVMKNIG
>
> [3]
> 1.6e-14
> PEDHCEAAFAY
> +++++++++++
> 76 ILDANGKLDTEAGHEKAKQYTGNDPAKLKIALDIGETCAAITVPDDHCEAAEAYGTCFRGEAKKHGLL
>
>
>PBP-1
>
> LENGTH = 148 COMBINED P-VALUE = 3.97e-10 E-VALUE = 2e-09
> DIAGRAM: 41-[2]-17-[1]-52-[1]-14
>
> [2] [1]
> 3.6e-08 4.9e-08
> CLNETGAT EVKCYMAC
> ++++++++ ++++++ +
> 1 MVARHFSFFLALLILYDLIPSNQGVEINPTIIKQVRKLRMRCLNQTGASVDVIDKSVKNRILPTDPEIKCFLYCM
>
> [1]
> 1.5e-05
> EVKCYMAC
> ++++++
> 76 FDMFGLIDSQNIMHLEALLEVLPEEIYKTINGLVSSCGTQKGKDGCDTAYETVKCYIAVNGKFIWEEIIVLLG
>
>
>PBP-3
>
> LENGTH = 154 COMBINED P-VALUE = 8.73e-09 E-VALUE = 4.4e-08
> DIAGRAM: 54-[2]-16-[1]-68
>
> [2]
> 4.0e-07
> CLNETGAT
> ++++++ +
> 1 MALNGFGRRVSASVLLIALSLLSGALILPPAAAQRDENYPPPGILKMAKPFHDACVEKTGVTEAAIKEFSDGEIH
>
> [1]
> 2.3e-08
> EVKCYMAC
> ++++++++
> 76 EDEKLKCYMNCFFHEIEVVDDNGDVHLEKLFATVPLSMRDKLMEMSKGCVHPEGDTLCHKAWWFHQCWKKADPKH
>
> 151 YFLP
>
>
>LUSH
>
> LENGTH = 153 COMBINED P-VALUE = 2.97e-03 E-VALUE = 0.015
> DIAGRAM: 71-[1]-74
>
> [1]
> 5.2e
> EVKC
> ++ +
> 1 MKHWKRRSSAVFAIVLQVLVLLLPDPAVAMTMEQFLTSLDMIRSGCAPKFKLKTEDLDRLRVGDFNFPPSQDLMC
>
> -06
> YMAC
> ++++
> 76 YTKCVSLMAGTVNKKGEFNAPKALAQLPHLVPPEMMEMSRKSVEACRDTHKQFKESCERVYQTAKCFSENADGQF
>
> 151 MWP
>
>CPU: cleopatre.pasteur.fr
>Time 0.054656 secs.
>
>/local/gensoft/lib/meme/bin/decalpha/mast mast.logodds.14447.tmp meme.14367.data ACDEFGHIKLMNPQRSTVWY -nostatus -mf meme.14367.results
>
==================================
Swarm-Support is for discussion of the technical details of the day
to day usage of Swarm. For list administration needs (esp.
[un]subscribing), please send a message to <address@hidden>
with "help" in the body of the message.
==================================