SGD Newsletter, Summer 2024

Revision as of 15:30, 12 June 2024 by Stacia (talk | contribs) (Give a Gift / Support SGD)

About this newsletter:
This is the Summer 2024 issue of the SGD newsletter. The goal of this newsletter is to inform our users about new features in SGD and to foster communication within the yeast community. You can view this newsletter, as well as previous newsletters, on the SGD Community Wiki.

Ref genome update R64.5

Extend gene coordinates in GFF

The saccharomyces_cerevisiae.gff file contains the sequence features of Saccharomyces cerevisiae and related information such as locus descriptions and GO annotations. The file is fully compatible with Generic Feature Format Version 3 (GFF3) and is updated weekly.
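For readers who work with the file programmatically, the following is a minimal sketch of reading one GFF3 feature line in Python. It relies only on the general GFF3 layout (nine tab-separated columns, with the last column holding semicolon-separated key=value attributes). The `Feature` class, the `parse_gff3_line` function, and the example line with its coordinates are hypothetical and are not taken from the SGD file.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass
class Feature:
    seqid: str
    source: str
    type: str
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    strand: str
    attributes: dict


def parse_gff3_line(line: str) -> Feature:
    """Parse one tab-separated GFF3 feature line into a Feature."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, ftype, start, end, _score, strand, _phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            # GFF3 attribute values are percent-encoded.
            attributes[key] = unquote(value)
    return Feature(seqid, source, ftype, int(start), int(end), strand, attributes)


# Hypothetical example line; the coordinates are made up for illustration.
example = "chrI\tSGD\tgene\t1000\t2000\t.\t+\t.\tID=YAL061W;Name=BDH2"
feature = parse_gff3_line(example)
print(feature.type, feature.start, feature.end, feature.attributes["Name"])
```

Comment lines (starting with "#") and any embedded FASTA section would need to be skipped before calling a parser like this one.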

After November 2020, SGD updated the transcripts in the GFF file to reflect the experimentally determined transcripts (Pelechano et al. 2013, Ng et al. 2020), when possible. The longest transcripts were determined for two different growth media: galactose and dextrose. When available, experimentally determined transcripts for one or both conditions were added for a gene. When these data were absent, transcripts matching the start and stop coordinates of an open reading frame (ORF) were used.

Figure: Old version of BDH2/YAL061W, with the longest transcripts expressed in GAL and in YPD.

Beginning in February 2024, SGD extended the start and stop coordinates of genes to encompass the start and stop coordinates of the longest experimentally determined transcripts, regardless of condition. This change was made to comply with JBrowse 2, a newer and more extensible genome browser, which requires that parent features in GFF files (genes) span their child features (mRNA, CDS, etc.) (Diesh et al., 2023).

Figure: BDH2/YAL061W after February 2024, with extended start/stop coordinates.

GFF is a standard format used by many groups. SGD uses the GFF file to load the reference tracks in its genome browser.
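The coordinate extension described above can be sketched in a few lines of Python. This is a minimal illustration, not SGD's actual pipeline: the function names `extend_gene` and `parent_spans_children` and all coordinates are hypothetical, and features are represented as simple (start, end) tuples with 1-based inclusive coordinates, as in GFF3.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive as in GFF3


def extend_gene(gene: Interval, transcripts: Iterable[Interval]) -> Interval:
    """Return gene coordinates widened to cover every transcript."""
    start, end = gene
    for t_start, t_end in transcripts:
        start = min(start, t_start)
        end = max(end, t_end)
    return start, end


def parent_spans_children(parent: Interval, children: Iterable[Interval]) -> bool:
    """Check that every child lies within the parent's coordinates."""
    return all(parent[0] <= c_start and c_end <= parent[1]
               for c_start, c_end in children)


# Hypothetical coordinates, for illustration only.
orf_gene = (1000, 2000)                   # gene matching the ORF
transcripts = [(950, 2100), (980, 2050)]  # e.g. longest transcript per condition

extended = extend_gene(orf_gene, transcripts)
print(extended)                                      # (950, 2100)
print(parent_spans_children(orf_gene, transcripts))  # False
print(parent_spans_children(extended, transcripts))  # True
```

A check like `parent_spans_children` can also be used to confirm that a locally modified GFF file still satisfies the parent/child requirement before loading it into a genome browser.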

Updates to SGD search

datasets
complex aliases
allele descriptions, SGDIDs
RNAcentral IDs

PubTator link on SGD reference pages

microPublications - latest yeast papers

microPublication Biology is part of the emerging genre of rapidly published research communications. We are seeing a strong set of microPublications come through the database and are glad to have this venue for publishing brief, novel findings, negative and/or reproduced results, and results that may initially lack a broader scientific narrative. Each article is peer-reviewed, assigned a DOI, and indexed through PubMed and PubMed Central.

Consider microPublications when you have a result that doesn't necessarily fit into a larger story but will be of value to others.

Latest yeast microPublications:

All yeast microPublications can be found in SGD.

Alliance of Genome Resources - Release 7.1

The Alliance of Genome Resources, a collaborative effort between SGD and other model organism databases (MODs), released version 7.1 in May 2024.

The 7.1.0 release updates the Disease pages’ Associated Genes table:

  • A new column contains Disease Qualifier, which describes whether an allele may be, for example, implicated in the onset of a disease or implicated in the severity of a disease.
  • The “Annotation Details” pop-up now includes more information: Association, Additional Implicated Genes, Genetic Modifiers, Genetic Sex, Strain Background, Notes, and Annotation Type.
  • The Download file from the disease page Associated Genes table now includes Additional Implicated Gene ID, Additional Implicated Gene Symbol, Gene Association, Genetic Entity Association, Disease Qualifier, Evidence Code Abbreviation, Experimental Conditions, Genetic Modifier Relation, Genetic Modifier IDs, Genetic Modifier Names, Strain Background ID, Strain Background Name, Genetic Sex, Notes, Annotation Type, and Source URL.
  • On the highest-level, generic “disease” page and other applicable pages:
    • Downloads are now limited to 90,000 rows at a time.
    • There is a message in red text at the bottom of any table displaying more than 90,000 rows: “The table above cannot be downloaded because there are too many rows in the unfiltered table. Please apply filter(s) to limit the number of rows to less than 90000 to enable the Download button or visit our Downloads page to download the entire data set.”
    • This table now loads quickly and consistently; previously, it had been slow to load because of the amount of data.

Upcoming conferences and courses