Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

File Type:
PDF
Item Type:
Journal Article
Date:
2011
Citation:
Sean S OhEigeartaigh, David Armisen, Kevin P Byrne and Kenneth H Wolfe, Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments, BMC GENOMICS, 12, 377, 2011

Abstract:
Background: In standard BLAST searches, no information other than the sequences of the query and the database
entries is considered. However, in situations where two genes from different species have only borderline similarity
in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can
provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be
useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has
often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been
implemented systematically.
Results: We made use of the synteny information contained in the Yeast Gene Order Browser database for 11
yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original
annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have
been overlooked because they are short, highly divergent, or contain introns. The key features of our software,
called SearchDOGS, are that the database entries are classified into sets of genomic segments that are already
known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location
is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11
yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating
pheromone a-factor in six species including Kluyveromyces lactis.
Conclusions: SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes.
We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes.
More generally, the concept of doing sequence similarity searches against databases to which external information
has been added may prove useful in other settings.
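
The filtering idea described in the Results (keep a very weak BLAST hit only when its genomic location is similar to that of the query) can be illustrated with a minimal sketch. This is not the SearchDOGS implementation: the `Hit` record, the notion of a shared integer "position" for query and hit, and all three thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit, with hypothetical synteny coordinates attached."""
    query_pos: int   # assumed: index of the query gene in a shared ortholog coordinate system
    hit_pos: int     # assumed: index of the hit's genomic location in the same system
    evalue: float    # BLAST E-value of the hit


# All thresholds below are illustrative assumptions, not values from the paper.
STRONG_EVALUE = 1e-10   # hits at or below this are kept regardless of location
WEAK_EVALUE = 10.0      # weakest E-value still considered when the hit is syntenic
MAX_OFFSET = 5          # how far apart query and hit positions may be to count as syntenic


def keep_hit(hit: Hit) -> bool:
    """Keep strong hits outright; keep weak hits only if they are syntenic."""
    if hit.evalue <= STRONG_EVALUE:
        return True
    syntenic = abs(hit.query_pos - hit.hit_pos) <= MAX_OFFSET
    return syntenic and hit.evalue <= WEAK_EVALUE


def filter_hits(hits: list[Hit]) -> list[Hit]:
    """Return the hits that pass the synteny-aware filter, in input order."""
    return [h for h in hits if keep_hit(h)]
```

Under these assumptions, a hit with E-value 5.0 located one position from the query would be retained, while the same hit 300 positions away would be discarded.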
Sponsor:
Science Foundation Ireland
Grant Number:
07/IN1/B911
Author's Homepage:
http://people.tcd.ie/khwolfe
http://people.tcd.ie/byrneke
Description:
PUBLISHED
Publisher:
BioMed Central
Type of material:
Journal Article
Series/Report no:
BMC GENOMICS; 12; 377
Availability:
Full text available