What's New in SGD in 2004

December 20, 2004

  • The Batch Download Tool has been updated so that Gene Ontology (GO) annotations can be downloaded for a list of standard gene names or systematic names. The GO annotations can be downloaded simultaneously with DNA sequences, protein sequences, and chromosomal coordinate information for the same list of gene names or systematic names. In addition, links to view "Flanking Features" have been added to the "Maps & Displays" pull-down menu on the Locus page and Gene/Sequence Resources so that neighboring chromosomal features can be viewed and downloaded.

December 20, 2004

  • Several new links and features have been added to the Locus page:
  • An "Expression Summary" histogram summarizes the number of microarray experiments (samples) in which the expression changed for that ORF. This histogram can be found in the lower right hand corner of the Locus page under the "Functional Analysis" pull-down menu. It is a hyperlink to the Expression Connection summary page.
  • Links to view ATCC/WashU Clones in Gbrowse are available from the "Maps & Displays" pull-down menu.
  • Links to GPM DB, a database summarizing mass spec data, are available from the "Protein Info & Structure" pull-down menu. Thanks to Ron Beavis for helping set up the links.

November 18, 2004

  • SGD sends out its quarterly newsletter to colleagues designated as contacts in SGD. An HTML version of the newsletter is available. If you would like to receive this newsletter in the future, please use the Colleague Submission/Update form to let us know.

November 17, 2004

  • At the recent Seattle Yeast Genetics & Molecular Biology Meeting (YGM), SGD curators circulated copies of a user survey to all meeting attendees. The purpose of the survey was to collect feedback from our users regarding the current features and tools that SGD provides, as well as to gather ideas about new information and tools that our users would like to see in the future. The results of the 2004 SGD survey are now available.
  • Updating the reference sequence is an ongoing project at SGD. All sequence updates are based on the re-sequencing of portions of the S288C strain background, and many result in annotation changes, such as altering the start or stop codons. In the past 6 months, the sequence of 16 ORFs has been updated, 3 new ORFs have been added, and one ORF has been "merged" with a neighboring gene. Click here for more details on these recent changes, or the Table of Updates to the Systematic Sequence for comprehensive summaries of all changes in the systematic sequence.

November 16, 2004

  • In response to comments received in the SGD survey taken at the YGM meeting this summer, SGD is in the process of expanding its literature search to add more relevant papers to the database. As a result, you may notice more references in the Literature Guide of the Locus pages. We are in the process of curating these new references and deleting any that are not relevant. Please feel free to alert us to any problems that you notice.

October 21, 2004

  • The CEN1-CEN16 centromere pages have been expanded to include sequence annotations for CDEI, CDEII, and CDEIII, the three adjacent DNA elements that comprise each yeast centromere. CDEI, the smallest domain, is an approximately 8 bp consensus sequence that is bound by centromere binding factor 1 (Cbf1p). CDEII, the central centromeric domain, is AT-rich and 75-100 bp in length. CDEIII consists of a 26-bp consensus sequence that provides the binding site for the centromere DNA binding factor 3 (CBF3) complex. These expanded pages also contain a new summary paragraph and figure describing the function of the centromere and proteins that bind to CDEI, II, and III. SGD thanks Dr. Sue Biggins for advice about this project.

September 4, 2004

  • To reflect changes in the database, several files on the ftp site have changed. The SGD_features.tab file replaces the chromosomal_feature.tab and intron_exon.tab files. The dbxref.tab file replaces the external_id.tab file. More information can be found in the README located in this directory: http://www.yeastgenome.org/download-data/curation
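
Scripts that still refer to the retired file names can be pointed at the replacements with a small lookup. The sketch below encodes only the renames listed above; the helper name `current_file_name` is hypothetical, and the column layouts of the new files are not covered here (see the README for those).

```python
# Retired ftp file name -> replacement file name, as listed in this announcement.
RENAMED_FILES = {
    "chromosomal_feature.tab": "SGD_features.tab",
    "intron_exon.tab": "SGD_features.tab",
    "external_id.tab": "dbxref.tab",
}


def current_file_name(name: str) -> str:
    """Return the replacement for a retired file name, or the name unchanged."""
    # Hypothetical helper for illustration; not part of any SGD tool.
    return RENAMED_FILES.get(name, name)
```

Note that two old files map to the same new file, so a script that previously joined them will likely need more than a rename.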

September 2, 2004

  • Over the past six months SGD staff have worked to significantly enhance the way data are stored at SGD. During this time, much of the data has been checked and updated as needed. These changes will allow SGD to incorporate new data and datatypes, such as additional chromosomal features, with greater ease.


Highlights of changes to the SGD schema include:

  • Adopting aspects of the CHADO database schema
  • Increased flexibility in relating features to other features
  • Increased flexibility in expanding the types of data that can be added
  • Defining explicit relationships between the SGDIDs of features at SGD to unique identifiers at other databases


In addition, the following new options have been added to existing tools:

  • Additional GO Slim sets have been added to the GO Slim Mapper
  • A customized set of GOIDs can be entered into the Chromosomal Features Search in order to retrieve the desired features
  • Genomic DNA, coding and intron sequences can be retrieved from the "Sequence Information" section of the Locus page
  • A Phenotype Resources pull-down menu has been added as an option on the right hand side of the Locus page


As a consequence of the changes to the way data are stored at SGD, the following changes and additions have been made to the available data at SGD:

  • Expansion of SGDIDs from 8 characters to 10 characters. Two additional padding zeros were added to the numerical portion of the SGDID (for example, S0000981 is now S000000981). The shorter 8-character SGDID will still be supported via web interfaces, but only the longer 10-character SGDID will be used in files available on the ftp site
  • SGDIDs now associated with references. These SGDIDs are now the official database identifier for references within the database and will be used on SGD reference pages and within any file in which we provide reference information, such as the gene_associations.sgd file available from the GO Consortium and SGD ftp sites
  • Implementation of Sequence Ontology terms. The majority of feature types and subfeature types have remained the same. However, one significant change is the use of CDS instead of exon. SGD has chosen to implement the use of CDS in order to reserve the word exon for instances when we have data that provide the entire exon, including any 5'-UTR or 3'-UTR sequences
  • Embedded features displayed on Locus page. Chromosomal features that are fully contained within the given feature are listed in the "Sequence Information" section of the Locus page.
  • Dates associated with chromosomal features. These dates are displayed in the "Sequence Information" section of the Locus page. The dates are in the following format: year-month-day. There are two types of dates:
  1. Coordinate dates indicate the date the coordinates of the feature were last changed. In most cases this is likely due to an insertion or deletion to the left of the feature, resulting in a shift of all chromosomal coordinates for features located to the right of the insertion or deletion.
  2. Sequence dates indicate the date that the sequence of the feature was last changed. This can be due to a sequence change within the feature, a change in the intron/CDS structure of the feature, or a change in the start or stop coordinate of the feature that extends or shortens the feature. At the present time, the oldest date displayed is 2000-05-19.
  • Reorganized pull-down menus. The order and location of items on the pull-down menus on the right hand side of the Locus page have changed. The majority of options are now alphabetical within each pull-down menu. Please see the help page for more details on the location of pull-down menu items.
  • References associated with standard gene names and aliases. Citations referring to the standard gene name or to an alias name are available on the Locus History page for that gene.
  • Categorization of notes. In order to clarify the types of notes displayed on the Locus History page, they have been divided into sections, such as Nomenclature History, Mapping Notes, and Sequence Annotation Notes. Please see the help page for more information.
  • Reorganized GO Term page. Please see the help page for more information.
  • Modified Clone pages. Tools and resources that were previously available on the Clone page, such as the Physical map, are currently still under development.
  • Creation of Chromosome pages. Additional tools and resources for Chromosome pages are currently under development.
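
For anyone updating local files or scripts, the SGDID expansion and the feature date format described above can be handled in a few lines. This is a minimal sketch, not SGD's own code: the function names are hypothetical, and the assumption that an old-style identifier is "S" followed by exactly seven digits is inferred from the single example given (S0000981).

```python
import re
from datetime import date

# Assumed old style: "S" plus 7 digits. New style: "S" plus 9 digits.
OLD_SGDID = re.compile(r"S(\d{7})")
NEW_SGDID = re.compile(r"S(\d{9})")


def expand_sgdid(sgdid: str) -> str:
    """Convert an old-style SGDID to the longer form by adding two padding zeros."""
    sgdid = sgdid.strip()
    if NEW_SGDID.fullmatch(sgdid):
        return sgdid  # already in the longer form
    match = OLD_SGDID.fullmatch(sgdid)
    if match is None:
        raise ValueError(f"not a recognizable SGDID: {sgdid!r}")
    return "S00" + match.group(1)


def parse_feature_date(text: str) -> date:
    """Parse a coordinate or sequence date written as year-month-day."""
    return date.fromisoformat(text.strip())


print(expand_sgdid("S0000981"))          # the example from the announcement
print(parse_feature_date("2000-05-19"))  # the oldest date currently displayed
```

Rejecting anything that matches neither pattern is a deliberate choice here, so that malformed identifiers surface as errors instead of being padded silently.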

August 23, 2004

August 19, 2004

August 11, 2004

  • The two-hybrid results described in Hazbun et al. (2003) Mol Cell 12:1353-1365, have been added to SGD and are now shown on individual locus pages, under the Physical Interactions section. Thanks to Michael Riffle for providing the data files to SGD.

August 5, 2004

  • The SGD Locus page "Comparison Resources" pull-down menu now includes links between S. cerevisiae genes and Ashbya gossypii homologs at the Ashbya Genome Database (AGD). These homology determinations were based on analyses performed by Dietrich, et al., 2004 and Brachat, et al., 2003. In addition, Ashbya gossypii predicted protein sequences have been added to SGD's "Model Organism BLASTP Best Hits" resource. Links to this resource can also be found on each Locus page under the "Comparison Resources" pull-down menu. Thanks to Michael Primig and Leandro Hermida for setting up the links.

July 23, 2004

  • SGD sends its quarterly newsletter to colleagues designated as contacts in SGD. An HTML version of the newsletter is available.

July 23, 2004

  • YGM Abstracts and links to the YGM meeting site are available online from SGD. All abstracts can be searched via the online search form or individually browsed.

July 22, 2004

  • GBrowse, an interactive genome browser that can be customized to show selected chromosomal features as well as display user provided annotations, has been implemented at SGD. It is currently available from the Analysis & Tools contents page and from the BLAST results pages. GBrowse was developed by the Generic Model Organism Database (GMOD) project. Thanks to Scott Cain of GMOD for assistance during testing.

July 8, 2004

  • SGD announces the release of a Batch Download Tool that allows simultaneous retrieval of DNA sequences, protein sequences and chromosomal coordinates for a list of Standard Gene Names or Feature Names. This tool can be accessed through links on the Analysis & Tools and Download Data contents pages.

June 9, 2004

  • SGD announces the release of SGD Lite, a lightweight version of SGD in a freely available database system, which can be easily downloaded and installed on a personal desktop for your own research or development purposes. SGD Lite has been implemented in its entirety by using the components provided by the Generic Model Organism Database Construction Set as part of the GMOD project.

June 3, 2004

  • SGD has just released a GFF3 file for the S. cerevisiae genome that provides genomic features, gene names and aliases, gene descriptions, GO annotations, and more. This file meets the current GFF3 specifications and is fully compatible with GBrowse and Chado loading scripts. Thanks to Scott Cain of GMOD for assistance during testing. The GFF3 file is updated once a week from the SGD database and available for download from the SGD FTP site.
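
As a rough illustration of what reading such a file involves, the sketch below parses a single GFF3 feature line into its nine tab-separated columns and splits the attributes column into tag/value pairs, following the general GFF3 layout. The function name and the sample line are hypothetical and are not taken from the SGD file; directives, comments, and any embedded sequence section are outside its scope.

```python
from urllib.parse import unquote

GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line: str):
    """Parse one GFF3 feature line; return None for blank lines and '#' lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes are 'tag=value' pairs separated by ';'. A value may hold
    # several comma-separated entries, and reserved characters are URL-escaped.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Made-up line for illustration only; not copied from the SGD GFF3 file.
sample = "chrI\tSGD\tgene\t100\t400\t.\t+\t.\tID=YFG1;Name=YFG1;Alias=alias1,alias2"
print(parse_gff3_line(sample))
```

Every attribute value is returned as a list, so single-valued tags such as ID and multi-valued tags such as Alias can be handled the same way.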

May 12, 2004

  • The following new links to databases describing different phenotypes of deletion strains have been added under the Functional Analysis pull-down menu on the locus page:
  • PROPHECY: provides quantitative information regarding growth rate, growth efficiency, and adaptation time for haploid deletion strains. Thanks to Luciano Fernandez-Ricaud for setting up the links to PROPHECY.
  • SCMD: a collection of micrographs of budding yeast mutants. Thanks to Shinichi Morishita and Yoshikazu Ohya for setting up the links to SCMD.

May 10, 2004

  • A new data set has been added to SGD's Expression Connection tool, which allows you to search gene expression data from multiple microarray studies. The data set added is entitled "Gene regulation by SWR1, HTZ1, and SIR2" (Meneghini et al. (2003) Cell 112:725-36 and Kobor et al. (2004) PLoS Biol 2:E131). The associated data are also available for download on the SGD ftp site. Thanks to Hiten Madhani for assisting SGD in loading his data from Meneghini et al. and Kobor et al.

May 5, 2004

  • SGD releases a new resource, "Model Organism BLASTP Best Hits", that displays the results of NCBI BLASTP analysis using the protein sequence of each S. cerevisiae open reading frame as the query against the predicted protein sequences of several model organisms.

April 26, 2004

  • SGD sends its quarterly newsletter to colleagues designated as contacts in SGD. An HTML version of the newsletter is available.

April 12, 2004

  • SGD releases a new Fungal BLAST interface that can be used to do BLASTN or TBLASTN searches of any sequence of choice against any combination of fungal sequence datasets, including genome sequences of fungal model organisms and pathogens, ESTs, and other fungal sequence sets in GenBank.

March 31, 2004

  • Four data sets have been added to SGD's Expression Connection tool, which allows you to search gene expression data from multiple microarray studies. The data sets that have been added are:
  • Ploidy regulation of gene expression (Galitski et al. (1999) Science 285:251-4)
  • Gene regulation by Swr1, Htz1, and Ino80 (Mizuguchi et al. (2004) Science 303:343-8)
  • Expression regulated by the calcineurin/Crz1 pathway (Yoshimoto et al. (2002) J Biol Chem 277:31079-88)
  • Expression during the unfolded protein response (Travers et al. (2000) Cell 101:249-58)


These data are also available for download on the SGD ftp site. Thanks to Joe Landry for assisting SGD in loading his data from Mizuguchi et al.

March 8, 2004

  • SGD has expanded its annotation of the S. cerevisiae rDNA locus, RDN1, which consists of 100-200 tandem copies of a 9.1-kb repeat covering approximately 1-2 Mb on the right arm of Chromosome XII. In addition to the three ribosomal RNA coding sequences (for the 18S, 5.8S, and 25S rRNAs), each of the two annotated repeat units (RDN37-1 and RDN37-2) now contains annotated 5' ETS, 3' ETS, ITS1, ITS2, NTS1, and NTS2 spacer regions, including respective locus pages. Thanks to David Tollervey and Christophe Dez for verifying coordinates and providing feedback for these expanded yeast rDNA annotations.
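
The approximate 1-2 Mb size follows directly from the copy number and repeat length quoted above. A quick check, using only those two figures (the function name is hypothetical):

```python
REPEAT_LENGTH_KB = 9.1  # length of one rDNA repeat, from the text above


def array_size_mb(copies: int, repeat_kb: float = REPEAT_LENGTH_KB) -> float:
    """Total size in Mb of a tandem array of identical repeats (1 Mb = 1000 kb)."""
    return copies * repeat_kb / 1000


for copies in (100, 200):
    print(f"{copies} copies: about {array_size_mb(copies):.2f} Mb")
```

So 100 copies come to roughly 0.9 Mb and 200 copies to roughly 1.8 Mb, consistent with the rounded range in the announcement.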

February 6, 2004

  • Results from a large-scale yeast genetic footprinting study described in Dunn et al. have been incorporated into SGD. Phenotypic profiling of individual genes was achieved by assaying the fitness of multiple independent mutants of each gene during competitive growth under five different conditions. These results are displayed in the Phenotype section of individual SGD locus pages. In addition, you can download all the data from the SGD ftp site. Thanks to Barbara Dunn for releasing these data pre-publication.

February 6, 2004

  • Results from the large-scale genetic interaction study described in Tong et al. (2004) Science 303:808-813 have been incorporated into SGD. These genetic interaction data are displayed on individual SGD locus pages. In addition, they can be downloaded from the SGD ftp site, or from the paper's supplemental web site. Thanks to Amy Tong and Charlie Boone for assisting SGD in incorporating their results in order to provide this valuable resource to the yeast community.

February 2, 2004

  • SGD is currently in the process of updating the systematic sequence. These sequence changes are based on the re-sequencing of portions of the S288C strain background. Some of these sequence changes also result in annotation changes, such as altering the start or stop codons. In addition, we will be correcting the annotations for some genes in cases where the sequence was correct, but the original gene call was not (e.g. new exons/introns, change in the start methionine, etc.).
  • Sequence and/or annotation updates have been completed on the following chromosomes: I, II, V, VII, VIII, XI, and XVI. More information about the completed sequence updates can be found at the Table of Updates to the Systematic Sequence.
  • Sequence and/or annotation updates are still in process on the following chromosomes: III, IV, VI, X, XII, XIII, XIV, and XV.
  • A new issue of the SGD quarterly newsletter has been sent to colleagues designated as contacts in SGD. An HTML version of the newsletter can be viewed here.

January 7, 2004

  • Based on the analysis published by Hazbun et al. (2003) Mol Cell 12:1353-1365, SGD curators have updated GO annotations for 61 uncharacterized but essential ORFs. In addition, links to the Yeast Resource Center Informatics Platform have been added to the localization, interaction, and/or protein info pull-down menu as appropriate for each ORF. Thanks to Michael Riffle and Trisha Davis for setting up the links.

January 1, 2004

  • The 2004 Nucleic Acids Research Database issue is now available. In this issue, the SGD staff wrote a paper (html|pdf) describing four new tools in SGD:
  • Fungal Alignment Viewer (for a ClustalW alignment of related fungal sequences)
  • Sequence Similarity Query tool (for results of a PSI-BLAST query to find related sequences from any organism)
  • Yeast Biochemical Pathways tool (for information about metabolic pathways in S. cerevisiae)
  • Find Chromosomal Features search interface (for an advanced search based on criteria like molecular weight or pI)