Masked
Latest revision as of 09:21, 1 February 2016
Masked genomes or sequences are genomic sequences that have been scanned for a particular type of internal sequence and then had the matching regions converted to "X". Usually it is repeat sequences that are identified and masked, because sequence comparison algorithms otherwise spend a lot of time identifying and matching them. Using repeat-masked genomes in CoGe is recommended whenever the option is offered for whole-genome comparisons (e.g. in SynMap).
Masked sequences come in two general flavors:
Hard mask: Masked sequence is converted to "X"
Soft mask: Masked sequence is converted to lower-case letters (a, t, c, g)
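To make the two flavors concrete, here is a minimal Python sketch. It is an illustration only, not how CoGe or RepeatMasker implement masking. The function names, the example sequence, and the 0-based, end-exclusive (start, end) region coordinates are all assumptions made for this example.

```python
def soft_mask(seq, regions):
    """Return seq in upper case with the given regions converted to lower case.

    regions is a list of (start, end) pairs, 0-based and end-exclusive
    (a coordinate convention assumed for this sketch).
    """
    chars = list(seq.upper())
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower()
    return "".join(chars)


def hard_mask(seq, regions, mask_char="X"):
    """Return seq in upper case with the given regions replaced by mask_char."""
    chars = list(seq.upper())
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


def soft_to_hard(seq, mask_char="X"):
    """Convert a soft-masked sequence to a hard-masked one.

    Every lower-case character is treated as masked and replaced.
    """
    return "".join(mask_char if c.islower() else c for c in seq)


if __name__ == "__main__":
    sequence = "ATGCATATATATGGCC"   # made-up example sequence
    repeats = [(4, 12)]             # hypothetical repeat region: the AT run

    print(soft_mask(sequence, repeats))   # ATGCatatatatGGCC
    print(hard_mask(sequence, repeats))   # ATGCXXXXXXXXGGCC
```

A soft mask keeps the original bases, so it can always be turned into a hard mask later (as `soft_to_hard` does), whereas a hard mask discards the masked bases and cannot be reversed.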
For a popular repeat sequence identification program, see RepeatMasker (http://www.repeatmasker.org/). For users who are logged in, CoGe provides an option to mask a genome through GenomeInfo.