Skip Header

 

Release 11.0

Published May 29, 2007

Full statistics and release notes

Headlines

UniProtKB/Swiss-Prot major release (53.0)

Release 53.0 of 29-May-07 of UniProtKB/Swiss-Prot contains 269'293 sequence entries, comprising 98'902758 amino acids abstracted from 156'204 references. 9'228 sequences have been added since release 52.0: this represents a 3.5% increase. In addition, the sequence data of 734 existing entries have been updated and the annotations of 210'454 entries have been revised.

The following improvements were carried out in the last 3 months:

  • We have created a new comment subsection: SEQUENCE CAUTION. This subsection is used for all the comments related to submitted sequences which differ from the sequence shown in the entry because of conflicts, such as frameshifts, erroneous gene model predictions, erroneous translation or other discrepancies that cannot be described in the feature 'Conflict'.
  • We have added cross-references to 3 new databases: BuruList, a database dedicated to the analysis of Mycobacterium ulcerans genome, Orphanet, a database dedicated to information on rare diseases and orphan drugs, and PseudoCAP, a Pseudomonas genome database.

UniProt Knowledgebase release 11.0 includes Swiss-Prot release 53.0 and TrEMBL release 36.0.

We are pleased to announce a new UniProt database, the UniProt Metagenomic and Environmental Sequences (UniMES) database, a repository specifically developed for metagenomic and environmental data. UniMES is available in FASTA format on the UniProt ftp servers, in the new subdirectory current_release/unimes:

UniProtKB News

New ftp directory for UniProt Metagenomic and Environmental Sequences (UniMES)

We have added a new subdirectory, current_release/unimes, to the UniProt ftp servers

to distribute metagenomic and environmental sequences.

New comment subsection SEQUENCE CAUTION

We have introduced a new subsection in the General Annotation (Comments) section, SEQUENCE CAUTION, to describe protein sequence reports that differ from the sequence that is shown in UniProtKB due to conflicts that are not described in the SEQUENCE CONFLICT lines (Features), such as frameshifts, erroneous gene model predictions, etc. This kind of information was before reported in the topic CAUTION together with other warnings that are unrelated to sequence conflicts.

The format of the SEQUENCE CAUTION topic is:

     CC   -!- SEQUENCE CAUTION:
     Sequence=Sequence; Type=Type;[ Positions=Positions;][ Note=Note;]
    

Where:

These lines are not wrapped and their length may therefore exceed 75 characters.

Examples:

Q93W20:
     Previous annotation:
     CC   -!- CAUTION: Ref.2 (BAA97015) sequence differs from that shown due to
     CC       erroneous gene model prediction. The predicted gene At5g49940 has
     CC       been split into 2 genes: At5g49940 and At5g49945.
     New annotation:
     CC   -!- SEQUENCE CAUTION:
     CC       Sequence=BAA97015.1; Type=Erroneous gene model prediction; Note=The predicted gene At5g49940 has been split into 2 genes: At5g49940 and At5g49945;
    
Q83M39:
     Previous annotation:
     CC   -!- CAUTION: Ref.1 and Ref.2 sequences differ from that shown due to a
     CC       stop codon at position 273 which was translated as Gln to extend
     CC       the sequence.
     New annotation:
     CC   -!- SEQUENCE CAUTION:
     CC       Sequence=AAN42076.1; Type=Erroneous termination; Positions=273; Note=Translated as Gln;
     CC       Sequence=AAP15953.1; Type=Erroneous termination; Positions=273; Note=Translated as Gln;
    
P17814:
     Previous annotation:
     CC   -!- CAUTION: Ref.1 (CAA36850) sequence differs from that shown due to
     CC       a frameshift in position 496.
     CC   -!- CAUTION: Ref.1 (CAA36850) sequence differs from that shown due to
     CC       erroneous gene model prediction.
     New annotation:
     CC   -!- SEQUENCE CAUTION:
     CC       Sequence=CAA36850.1; Type=Erroneous gene model prediction;
     CC       Sequence=CAA36850.1; Type=Frameshift; Positions=496;
    
P0A7B3:
     Previous annotation:
     CC   -!- CAUTION: Ref.4 (X07863) sequence differs from that shown due to
     CC       several frameshifts.
     CC   -!- CAUTION: Ref.5 (Y00357) sequence differs from that shown due to
     CC       frameshifts in positions 204, 215 and 282.
     New annotation:
     CC   -!- SEQUENCE CAUTION:
     CC       Sequence=X07863; Type=Frameshift; Positions=Several;
     CC       Sequence=Y00357; Type=Frameshift; Positions=204, 215, 282;
    
P27612:
     Previous annotation:
     CC   -!- CAUTION: Ref.2 (AAA39943) sequence differs from that shown due to
     CC       frameshifts in positions 4, 32, and 42.
     CC   -!- CAUTION: Ref.2 (AAA39943) sequence differs from that shown due to
     CC       contaminating sequence.
     CC   -!- CAUTION: Ref.3 sequence differs from that shown due to a
     CC       frameshift in position 697.
     Current annotation:
     CC   -!- SEQUENCE CAUTION:
     CC       Sequence=AAA39943.1; Type=Miscellaneous discrepancy; Note=Several frameshifts and contaminating sequence;
     CC       Sequence=Ref.3; Type=Frameshift; Positions=697;
    

Change in comment subsection SUBCELLULAR LOCATION

From now on, the comment subsection SUBCELLULAR LOCATION may occur more than once per entry.

Cross-references to PseudoCAP

Cross-references have been added to the Pseudomonas aeruginosa Community Annotation Project database. This database provides genome annotation of P. aeruginosa strain PAO1 and of other Pseudomonas species, acting as a valuable comparative resource for P. aeruginosa research, as well as being useful for the larger Pseudomonas research community. Over the coming year this database will be further enhanced toward more focus on comparative analysis of P. aeruginosa isolates and more specific information about putative drug and vaccine targets.

The Pseudomonas aeruginosa Community Annotation Project database is available at http://www.pseudomonas.com/.

The format of the explicit link is:

Data bank identifier PseudoCAP
Primary identifier The primary identifier consists of the ordered locus name.
Secondary identifier None; a dash '-' is stored in that field.
Example
        Q9I576:
        DR   PseudoCAP; PA0865; -.
       

Cross-references to Orphanet

Cross-references have been added to the Orphanet database. This database is dedicated to information on rare diseases and orphan drugs. It aims to improve management and treatment of genetic, auto-immune or infectious rare diseases, rare cancers, or not yet classified rare diseases. ORPHANET offers services adapted to the needs of patients and their families, health professionals and researchers, support groups and industry.

The Orphanet database is available at http://www.orpha.net/consor/cgi-bin/home.php?Lng=GB.

The format of the explicit link is:

Data bank identifier Orphanet
Primary identifier The primary identifier consists of the Orpha unique disease identifier.
Secondary identifier The secondary identifier consists of the name of the disease.
Example
        P26439:
        DR   Orphanet; 418; Adrenal hyperplasia, congenital.
        DR   Orphanet; 3185; Stein-Leventhal syndrome.
       

Changes concerning keywords

New keyword: