
Sequin Revision History

This file lists changes that have been made to the Sequin program. Where appropriate, there are also links to the relevant section of the Sequin help documentation.

Version 2.80--January 27, 1999

  • Setting up Sequin to communicate over the network is now much easier.
    Sequin can function as either a stand-alone or a network-aware program. The stand-alone version is all that is needed to perform most sequence submissions. In its network-aware mode, Sequin can also communicate with the NCBI to download sequences from Entrez, perform PowerBLAST searches and Entrez queries, and screen for the presence of vector sequences or repeat elements.
    The Network Configuration program is located under the Misc menu, both on the initial Welcome to Sequin page and in the record viewer. Most users can select a "Normal" connection and click on Accept to begin the configuration. If you are behind a firewall, you may need to contact your system administrator in order to fill in the Proxy and Port fields. Users outside the United States or with a slow or unreliable Internet connection may need to increase the Timeout, the length of time for which Sequin will wait for a response from the network.

  • Asnload files are no longer included in the Sequin distribution
    Due to improvements in NCBI services, Asnload files are no longer necessary on any platform for recent NCBI software. Therefore, when you download and install the new version of Sequin, the Asnload folder will no longer be present.

Version 2.70--September 14, 1998

  • This version is capable of editing complete bacterial chromosomes or large eukaryotic chromosomal segments in a single record. The generation of reports (for example, the GenBank and Graphic views) and validation are now much faster.

  • Sequin can now annotate features by reading in a tab-delimited table. The table specifies the location and type of feature, and Sequin processes the feature intervals and translates any CDSs. The table is read in the record viewer (after the sequence has been imported) using the File-->Open menu. The table must follow a defined format. The first line starts with >Feature, a space, and then the Sequence ID of the sequence you are annotating. In the example below, eIF4E is the Sequence ID. The table is composed of five columns: start, stop, feature key, qualifier key, and qualifier value. The columns are separated by tabs. The first row has start, stop, and feature key. Additional feature intervals just have start and stop. The qualifiers follow on lines starting with three tabs.

    For example, a table which looks like this:

    >Features eIF4E
    80	2881	gene
    			gene	eIF4E
    
    201	224	CDS
    1550	1920
    1986	2085
    2317	2404
    2466	2629
    			product	eukaryotic initiation factor 4E-II
    
    1402	1458	CDS
    1550	1920
    1986	2085
    2317	2404
    2466	2629
    			product	eukaryotic initiation factor 4E-I
    			note	encoded by two messenger RNAs
    
    80	224	mRNA
    1550	1920
    1986	2085
    2317	2404
    2466	2881
    			product	eukaryotic initiation factor 4E-II
    
    80	224	mRNA
    892	1458
    1550	1920
    1986	2085
    2317	2404
    2466	2881
    			product	eukaryotic initiation factor 4E-I
    
    80	224	mRNA
    1129	1458
    1550	1920
    1986	2085
    2317	2404
    2466	2881
    			product	eukaryotic initiation factor 4E-I
    
    

    will result in a GenBank flatfile which contains this:

    
         mRNA            join(80..224,1129..1458,1550..1920,1986..2085,2317..2404,
                         2466..2881)
                         /gene="eIF4E"
                         /product="eukaryotic initiation factor 4E-I"
         mRNA            join(80..224,892..1458,1550..1920,1986..2085,2317..2404,
                         2466..2881)
                         /gene="eIF4E"
                         /product="eukaryotic initiation factor 4E-I"
         mRNA            join(80..224,1550..1920,1986..2085,2317..2404,2466..2881)
                         /gene="eIF4E"
                         /product="eukaryotic initiation factor 4E-II"
         gene            80..2881
                         /gene="eIF4E"
         CDS             join(201..224,1550..1920,1986..2085,2317..2404,2466..2629)
                         /gene="eIF4E"
                         /codon_start=1
                         /product="eukaryotic initiation factor 4E-II"
                         /translation="MVVLETEKTSAPSTEQGRPEPPTSAAAPAEAKDVKPKEDPQETG
                         EPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWEDMQNEITSFDTV
                         EDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVITLNKSSKTDLDN
                         LWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAALEIGHKLRDAL
                         RLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
         CDS             join(1402..1458,1550..1920,1986..2085,2317..2404,
                         2466..2629)
                         /gene="eIF4E"
                         /note="encoded by two messenger RNAs"
                         /codon_start=1
                         /product="eukaryotic initiation factor 4E-I"
                         /translation="MQSDFHRMKNFANPKSMFKTSAPSTEQGRPEPPTSAAAPAEAKD
                         VKPKEDPQETGEPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWED
                         MQNEITSFDTVEDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVIT
                         LNKSSKTDLDNLWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAA
                         LEIGHKLRDALRLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
    
    
    

    Note that if the gene feature spans the intervals of the CDS and mRNA features for that gene, you don't need to include gene "qualifiers" in those features, since they will be picked up by overlap.

    Features which are on the complementary strand are indicated by reversing the interval locations. For example, the table:

    >Features dna2
    2710	2639	tRNA
    			note	codon recognised: GAA
    			product	tRNA-Glu
    			anticodon	(pos:2675..2677, aa:Glu)
    

    will result in a GenBank flatfile containing:

    
         tRNA            complement(2639..2710)
                         /note="codon recognised: GAA"
                         /product="tRNA-Glu"
                         /anticodon=(pos:2675..2677, aa:Glu)
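
The table layout described above can be read with a few lines of Python. The following is only a sketch of the format as this document describes it, not Sequin's own parser: the function names `parse_feature_table` and `genbank_location` are hypothetical, and the handling of multi-interval features on the complementary strand (reversing the interval order inside `complement(join(...))`) is an assumption, since the example above shows only a single-interval case.

```python
def parse_feature_table(text):
    """Parse a five-column feature table into (seq_id, list of feature dicts)."""
    seq_id, features = None, []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: ">Features", a space, then the Sequence ID.
            parts = line.split(None, 1)
            seq_id = parts[1].strip() if len(parts) > 1 else None
            continue
        cols = line.split("\t")
        if len(cols) >= 4 and cols[:3] == ["", "", ""]:
            # Qualifier line: three leading tabs, then key and value.
            value = cols[4] if len(cols) > 4 else ""
            features[-1]["qualifiers"].append((cols[3], value))
        elif len(cols) >= 3 and cols[2]:
            # First row of a feature: start, stop, feature key.
            features.append({"key": cols[2],
                             "intervals": [(int(cols[0]), int(cols[1]))],
                             "qualifiers": []})
        else:
            # Additional interval of the current feature: start, stop only.
            features[-1]["intervals"].append((int(cols[0]), int(cols[1])))
    return seq_id, features


def genbank_location(intervals):
    """Format intervals as a flatfile location; start > stop means complement."""
    minus = intervals[0][0] > intervals[0][1]
    spans = [f"{min(a, b)}..{max(a, b)}" for a, b in intervals]
    if minus:
        spans.reverse()  # assumption: list minus-strand spans in ascending order
    loc = spans[0] if len(spans) == 1 else "join(" + ",".join(spans) + ")"
    return f"complement({loc})" if minus else loc


TABLE = (
    ">Features eIF4E\n"
    "80\t2881\tgene\n"
    "\t\t\tgene\teIF4E\n"
    "201\t224\tCDS\n"
    "1550\t1920\n"
    "1986\t2085\n"
    "2317\t2404\n"
    "2466\t2629\n"
    "\t\t\tproduct\teukaryotic initiation factor 4E-II\n"
)

seq_id, features = parse_feature_table(TABLE)
for feature in features:
    print(feature["key"], genbank_location(feature["intervals"]))
print(genbank_location([(2710, 2639)]))
```

The table is written with explicit `\t` escapes because the tabs are significant: a line that begins with three tabs is a qualifier, while a line with only two columns adds an interval to the current feature.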
    
    

Version 2.60--June 2, 1998

  • You can now open a FASTA-formatted DNA sequence file in Sequin without first creating a Sequin record. On the Welcome to Sequin Form, click on "Read Existing Record" to read in your sequence and open it in the record viewer. Or, if you are already viewing a record in Sequin, choose File-->Open to open a FASTA-formatted DNA sequence. However, although the sequence will be displayed in Sequin and can be analysed with tools such as PowerBLAST, Vector Screen, or ORF Finder, it should not be submitted, because it does not have the appropriate annotations or the required contact information to make it a valid submission.

  • A variety of minor bugs have been fixed (affecting all platforms).

Version 2.45--March 3, 1998

  • Easier sequence annotations
    You can now use the Sequence Editor Feature menu, as well as the main Sequin Annotate menu, to annotate features on the sequence. The features listed are identical, and the instructions for adding them are the same, with one exception. If you annotate them in the Annotate menu, you must provide the nucleotide sequence location of the feature. However, if you add features from the Sequence Editor, you do not need to enter the nucleotide coordinates manually. Simply highlight the sequence which the feature covers, and the location of the sequence will be automatically entered in the feature location box.

  • New PowerBLAST features
    PowerBLAST capabilities have been enhanced. When you do a PowerBLAST from within Sequin, you can limit a search either for or against an organism or taxonomic group. Under Organism Filter, click on "Restrict to" to limit your search to a particular organism. Or, conversely, click on "Filter against" to search against all organisms except one. Type the scientific name of the organism (e.g., Homo sapiens) or taxonomic group (e.g., Mammalia) in the "Name" box. After you do a PowerBLAST search, additional controls will be added to the bottom of the record viewer window. These controls allow you to retrieve the PowerBLAST hits from Entrez, and then look for Entrez neighbors. Use the alignment pop-up to select the type of alignment (search) that was performed. If multiple BLAST search types were run in one PowerBLAST search, this pop-up lets you retrieve one type at a time. Then click on the Retrieve button to retrieve the records in a document summary window, where you can view Medline, Protein, Nucleotide, Structure, and Genome neighbors of the sequence(s). Click on the Refine button to open a query refinement window in which you can further refine the PowerBLAST hits by selecting other Entrez terms, such as Author name, to view sequences belonging to a specific author.

  • Replacing or updating your sequence
    As announced for version 2.28, you can replace or merge the sequence in the record with a new sequence without going into the Sequence Editor. This option is available in the Update Sequence submenu of the main Sequin Edit menu. In addition to being able to import a sequence in FASTA format (Read FASTA File), import a sequence record in ASN.1 format (Read Sequence Record), or download a sequence record from Entrez (Download Accession), you can now import a sequence from a Sequin PowerBLAST alignment (Selected Alignment). Note that in all cases, both the target and the imported sequence must be nucleotide sequences. The alignment between your original sequence and the imported sequence can be viewed in a separate window. You can then choose to merge the 5' or 3' end of the imported sequence with the target sequence in the record, or replace the target sequence with the imported sequence. The features on the imported sequence will be automatically copied to the original sequence. You can also choose only to propagate features from the imported sequence record to the target.

  • Contact information for future submissions
    The contact, authors, and affiliation information you provide on the Submitting Authors form can now be saved as a block and used for subsequent submissions. For your first Sequin submission, fill in the requested information. Then, in the record viewer, click on "Edit Submitter Info" under the Edit menu, and then on Export Submitter Info under the File menu in the resulting Submission Instructions form. For subsequent Sequin submissions, click on Import Submitter Info on the first page of the Submitting Authors form. You must still fill in the manuscript title on this page, though.

  • Formatting segmented population/phylogenetic sets
    Sequin can now read segmented sets of sequences which are parts of phylogenetic, population, or mutation studies. A segmented set is a collection of non-overlapping sequences which cover a specified genetic region, such as a set of exons along with fragments of flanking introns. The sequences must be in FASTA or FASTA+GAP format. Each segment should have its own sequence identifier (the term immediately following the ">"), but organism name and source modifiers should only be indicated for the first segment from each sequence. Square brackets are used to delimit the members of a set. For example:
    
    [
    >bioseq1part1 [org=Mus musculus] [strain=BALB/c]
    CAGATGGCTCC
    >bioseq1part2
    ATAATGACAGCTTCATAATGGCAGTGGGTGAGCCCCTGGTGCACATCAG
    ]
    [
    >bioseq2part1 [org=Rattus norvegicus] [strain=Sprague-Dawley]
    CAGTCGGCTCC
    >bioseq2part2
    ATAATGATGTCTTCATAATGGCAGAAAGTGAGCCCCTGGTGCACATCAG
    ]
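
As an illustration of the bracketed layout, the sketch below groups the segments of each set. It is not Sequin's reader: the function name `parse_segmented_sets` is hypothetical, and the code assumes that each "[" and "]" delimiter sits on its own line and that every definition line and its sequence are on separate lines.

```python
def parse_segmented_sets(text):
    """Return a list of sets; each set is a list of (seq_id, title, sequence)."""
    sets, current, record = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "[":
            # Opening bracket: start a new segmented set.
            current, record = [], None
        elif line == "]":
            # Closing bracket: the set is complete.
            sets.append([tuple(r) for r in current])
            current, record = None, None
        elif line.startswith(">"):
            # Definition line: identifier, then optional modifiers or title.
            seq_id, _, title = line[1:].partition(" ")
            record = [seq_id, title.strip(), ""]
            current.append(record)
        else:
            # Sequence data, possibly spread over several lines.
            record[2] += line
    return sets


EXAMPLE = """
[
>bioseq1part1 [org=Mus musculus] [strain=BALB/c]
CAGATGGCTCC
>bioseq1part2
ATAATGACAGCTTCATAATGGCAGTGGGTGAGCCCCTGGTGCACATCAG
]
[
>bioseq2part1 [org=Rattus norvegicus] [strain=Sprague-Dawley]
CAGTCGGCTCC
>bioseq2part2
ATAATGATGTCTTCATAATGGCAGAAAGTGAGCCCCTGGTGCACATCAG
]
"""

segmented = parse_segmented_sets(EXAMPLE)
for members in segmented:
    print([seq_id for seq_id, _, _ in members])
```

Only the first segment of each set carries the organism and source modifiers, so a caller would read them from the title of the first tuple in each set.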
    
    

  • Creating automatic definition lines
    Sequin can now create definition lines (sequence titles) automatically based on information provided in the record. This option works for single sequences as well as sets of sequences, and can handle complex annotations with multiple features. The definition lines will follow standard GenBank conventions. Use the function "Generate Definition Line" under the Sequin Annotate menu.

  • Encoding new information in definition line
    If you are submitting the sequence for an organism which is not present in the NCBI taxonomy database, you can indicate the lineage of the organism on the first line (definition line) of your FASTA-formatted nucleotide sequence. Use the modifier [lineage=lineage] on the line where other modifiers are indicated. For example,
     >dna1 [org=Neworganism] [strain=A] [lineage=Newlineage] 
      GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
      AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC 
      
    For information about the organisms presently in GenBank, see the NCBI taxonomy browser at http://www.ncbi.nlm.nih.gov/Taxonomy/
    Additional source information can also be encoded directly in the definition line. You can now indicate [location={genomic,chloroplast,kinetoplast,mitochondrion,macronuclear,extrachromosomal,plasmid,transposon,insertion sequence,cyanelle,proviral,virion}] and [molecule={dna,rna}]. In each case you can pick one item from the list in {}, so a sample definition line could be:

    >dna1 [org=Homo sapiens] [location=genomic] [molecule=dna]
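
The bracketed modifiers can be pulled out of a definition line with a regular expression. This is a minimal sketch rather than Sequin's implementation: `parse_defline` is a hypothetical name, and the check against the location and molecule lists simply mirrors the values listed above.

```python
import re

# Allowed values, copied from the lists in the text above.
LOCATIONS = {
    "genomic", "chloroplast", "kinetoplast", "mitochondrion", "macronuclear",
    "extrachromosomal", "plasmid", "transposon", "insertion sequence",
    "cyanelle", "proviral", "virion",
}
MOLECULES = {"dna", "rna"}

# One [key=value] pair; the value may contain spaces but not "]".
MODIFIER = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def parse_defline(line):
    """Return (seq_id, modifiers) for a FASTA definition line."""
    if not line.startswith(">"):
        raise ValueError("definition line must start with '>'")
    seq_id = line[1:].split(None, 1)[0]
    mods = {key.strip(): value.strip() for key, value in MODIFIER.findall(line)}
    if "location" in mods and mods["location"] not in LOCATIONS:
        raise ValueError("unknown location: " + mods["location"])
    if "molecule" in mods and mods["molecule"] not in MOLECULES:
        raise ValueError("unknown molecule: " + mods["molecule"])
    return seq_id, mods


print(parse_defline(">dna1 [org=Homo sapiens] [location=genomic] [molecule=dna]"))
```

The same pattern also picks up modifiers such as [strain=A] or [lineage=Newlineage], which are returned without validation.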
    
    
    

  • Direct submission information
    The DIRECT SUBMISSION reference for your new submission will now appear as it will once the record is released to the public. In the past, this information was stored by Sequin, but not displayed. This citation lists the authors who should receive scientific credit for the sequence, and may have as many authors as you see fit. It should have, at the very least, one author. Author names are initially entered on the Submitting Authors form. You can modify the list by double clicking on the reference.

  • Minor changes
    • Reference publication types now include Proceedings (meetings) and Proceedings Chapter (meeting abstracts).
    • When you highlight a range of sequence in the Sequence Editor, the selected sequence is shown as a box in the Graphic view of the record.
    • An option called "Select Target" was added to the Sequin Search menu. This option changes the sequence which is selected in the Target Sequence pop-up.

 

Version 2.28--September 19, 1997

  • You can now replace or merge the sequence in the record with a new sequence without going into the Sequence Editor. In the main Sequin window, choose one of the three items in the Edit-->Update Sequence submenu. You can import a sequence in FASTA format (Read FASTA File), a sequence record in ASN.1 format (Read Sequence Record), or, if you are running Sequin in its Network Aware mode, download a sequence record from Entrez (Download Accession). The alignment between your original sequence and the imported sequence can be viewed in a separate window. You can then choose to merge the 5' or 3' end of the imported sequence with the target sequence in the record, or replace the target sequence with the imported sequence. The features on the imported sequence will be automatically copied to the original sequence. You can also choose only to propagate features from the imported sequence record to the target.

  • If you are submitting an aligned set of sequences, and one or more of the sequences are already present in the GenBank/EMBL/DDBJ database, you can mark those sequences so that they do not receive new accession numbers. Instead of providing such a sequence with a new Sequence Identifier, add 'acc' in front of the existing accession or gi number. For instance, use the identifier accU12345, where U12345 is the existing accession number. The sequence does not need a title since it is not being resubmitted to the database. Thus, an example of a nucleotide definition line would be:

    >accU54469
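
A script that prepares an aligned set could separate the already-databased sequences from the new ones by checking for the 'acc' prefix. The helper names below are hypothetical, and the check is deliberately naive: it assumes that no new Sequence Identifier happens to begin with "acc".

```python
def existing_accession(seq_id):
    """Return the accession or gi number behind an 'acc' prefix, else None."""
    if seq_id.startswith("acc") and len(seq_id) > 3:
        return seq_id[3:]
    return None


def needs_new_accession(defline):
    """True if the definition line names a sequence not yet in the database."""
    seq_id = defline.lstrip(">").split(None, 1)[0]
    return existing_accession(seq_id) is None


print(existing_accession("accU54469"))
print(needs_new_accession(">accU54469"))
print(needs_new_accession(">dna1 [org=Homo sapiens]"))
```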

  • You can now encode a comment in a protein definition line, and the text of the comment will turn into a /note on the CDS feature. For example, if the definition line for the protein sequence is
    >aa1 [gene=eIF4E] [prot=eukaryotic initiation factor 4E-I]
     [comment=alternative splice product] 
    Drosophila melanogaster eukaryotic initiation factor 4E-I, complete sequence
    
    the corresponding CDS feature will have the following fields:
    /gene="eIF4E"
    /note="alternative splice product"
    /product="eukaryotic initiation factor 4E-I"
    
  • PowerBLAST capabilities have been enhanced. From within Sequin, you can perform blastn or tblastn searches of the sequence(s) in the record against many NCBI-supported nucleotide databases, and blastp or blastx searches against protein databases. PowerBLAST can now also handle large sequences. For additional information, see the BLAST home page at http://www.ncbi.nlm.nih.gov/BLAST/.
  • A variety of minor bugs have been fixed (affecting all platforms).

Version 2.20--July 29, 1997

  • In the Sequence Editor, the "Find" command under the Edit menu now searches the displayed nucleotide sequence for amino acid as well as nucleotide sequence patterns. If you type in an amino acid sequence, Sequin will search for that sequence in a three-frame translation of the nucleotide sequence. For example,
    • CDLPEYC finds DNA sequences encoding CDLPEYC
    • [CRQ]DLPEYC finds DNA sequences encoding C or R or Q followed by DLPEYC
    • XDLPEYC finds DNA sequences encoding any amino acid followed by DLPEYC
    • CDL(3)EYC finds DNA sequences encoding CDLEEEYC
    • CDL(1:3)PE finds DNA sequences encoding CDLPE, CDLPPE, and CDLPPPE
    • CDL(1:3)XE finds DNA sequences encoding CDL, followed by 1-3 occurrences of any amino acid, followed by E, i.e., CDLAAE, CDLRSE, or CDLAPQE
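
The idea behind the amino acid search, translating the nucleotide sequence in three forward frames and looking for the peptide in each, can be sketched as follows. This is not Sequin's code: `translate` and `find_peptide` are hypothetical names, the standard genetic code is assumed, and only a literal peptide is handled, not the bracket, X, or repeat-count syntax listed above.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1, or 2); unknown codons give X."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(frame, len(dna) - 2, 3)
    )


def find_peptide(dna, peptide):
    """Return (frame, nucleotide_start) pairs, both 1-based, for each hit."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        pos = protein.find(peptide)
        while pos != -1:
            hits.append((frame + 1, frame + 3 * pos + 1))
            pos = protein.find(peptide, pos + 1)
    return hits


# One extra base in front puts the coding region in the second frame.
dna = "A" + "TGTGATCTGCCGGAATATTGC"
print(find_peptide(dna, "CDLPEYC"))
```

Reporting the nucleotide start rather than the protein position makes it easy to jump to the hit in the displayed DNA sequence.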

  • For gene features on a segmented set, the location is now specified by an "order" rather than a "join". In the past, a gene feature on a segmented set was indicated on the last record in the set as follows:
    	gene            join(AF000001:1..2000,1..5388)
    	                /gene="testbar"
    
    In this version of Sequin, the location of the gene feature is shown on the last record as follows:
    	gene            order(AF000001:1..2000,1..5388)
     	                /gene="testbar"
    
  • The order in which the gene, CDS, and other features are displayed has been changed. In the past, the order of these features, if they covered the same sequence interval, was random. Now, features with the same interval are always displayed in the following order:
    gene
    CDS
    any_feature
    

    Please note that, as always, the order in which features are displayed depends on their left-most sequence position. Thus, the source feature, whose left-most position is "1", is always the first feature. For example,

    	source		1..4495
    			/organism="Homo sapiens"
    	gene		86..4339
    			/gene="ABC1"
    	CDS		86..4339
    			/gene="ABC1"
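
The display order described above amounts to a sort on the left-most position followed by a fixed ranking of feature types. The sketch below is one way to express it; `display_order` is a hypothetical name, and breaking ties on the right-most position before the feature type is an assumption, since the text only specifies the order for features that share the same interval.

```python
# gene first, then CDS, then any other feature.
RANK = {"gene": 0, "CDS": 1}


def display_order(features):
    """Sort (key, start, stop) tuples into the flatfile display order."""
    return sorted(
        features,
        key=lambda f: (min(f[1], f[2]), max(f[1], f[2]), RANK.get(f[0], 2)),
    )


features = [
    ("CDS", 86, 4339),
    ("source", 1, 4495),
    ("mRNA", 86, 4339),
    ("gene", 86, 4339),
]
for key, start, stop in display_order(features):
    print(f"{key:<8}{start}..{stop}")
```

The source feature needs no special case here: it starts at position 1, so it sorts ahead of the other features in this example.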
    
  • Under the Sequin Search menu, the command previously called "Find" has been renamed "Find ASN.1". This command allows you to find and replace strings of text in your submission.
  • Under the Sequin Search menu, a new command called "Find FlatFile" has been added. This command allows you to find strings of text in your submission.
  • Under the Sequin Annotate menu, an additional choice called "Remaining Features" has been added. Now any feature which is legal under the DDBJ/EMBL/GenBank feature table can be added to a submission.
  • The default HUP date is now one year from the current date. The HUP (hold until published) date is the date on which you specify that the sequence can be released to the public.
  • In the Sequence Editor, the "Label" option is now available under the View menu. This option allows you to choose how sequence names are displayed in an alignment.
  • A number of bugs in the Sequence Editor have been fixed.
  • A number of bugs specific to the DEC Alpha OSF1 version of Sequin have been fixed.

Version 2.14--July 5, 1997

  • Network Entrez, an NCBI tool for accessing bibliographic, sequence, and structure records, is now fully integrated into Sequin. As before, users can download sequences from Entrez to view or to edit and resubmit to the databases. Now, users can also view any type of record on demand from within Sequin. Users are reminded that they can only update their own records this way.
  • PowerBLAST, a version of the client for the popular BLAST software for sequence comparisons, is now available from within Sequin. Users can compare their sequence to sequences in the nucleotide and protein databases and view the results within Sequin.
  • A new algorithm is now used to calculate global sequence alignments. You can have Sequin calculate and display the alignment between a sequence in the record and another sequence in a file. In the sequence editor, select the option "Align with" under the Edit menu.
  • Sequin will now recognise sequence alignments which have been saved in either of two NEXUS formats: NEXUS Interleaved and NEXUS Contiguous.
  • Records can now be viewed with two new Display Formats, Summary and Sequence. Summary format shows the range of any sequence alignments in the record. Sequence format shows the sequence(s) in the record along with any associated features.
  • The location of certain menu items has been changed. All features and descriptors are now accessed from the Annotate menu.
  • A variety of minor bugs have been fixed (affecting all platforms).

Version 1.94--March 12, 1997

  • We have increased support for phylogenetic/population study submissions. The supporting documentation has been extended and now includes instructions on how to propagate features (such as CDS or rRNA) through alignments.
  • We have added pattern matching for nucleotide sequences using regular expressions. This is used via the "Find" command in the Sequence Editor Edit menu. For example,
    • TCAGGGC finds the sequence TCAGGGC
    • [TCA]CAGGGC finds T or C or A followed by CAGGGC
    • NCAGGGC finds T or C or G or A followed by CAGGGC
    • TCA(3)GC finds the sequence TCAGGGC
    • TCA(1:3)GC finds the sequences TCAGC, TCAGGC, and TCAGGGC
    • TCA(1:3)NC finds the sequence TCA, followed by 1-3 occurrences of G,A,T,or C, followed by C, i.e., TCATC or TCATTC or TCAATGC
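
Judging from the examples above, a count in parentheses applies to the symbol that follows it, and N stands for any base. Under that reading, the pattern syntax can be translated into a Python regular expression as sketched below. This is an interpretation of the examples, not Sequin's matcher, and `sequin_pattern_to_regex` and `find_all` are hypothetical names.

```python
import re


def sequin_pattern_to_regex(pattern, wildcard="N", alphabet="ACGT"):
    """Translate a Sequin-style Find pattern into Python regex syntax."""
    out, i, pending = [], 0, ""
    while i < len(pattern):
        ch = pattern[i]
        if ch == "(":
            # "(3)" or "(1:3)": a repeat count for the NEXT symbol.
            end = pattern.index(")", i)
            low, _, high = pattern[i + 1:end].partition(":")
            pending = "{%s,%s}" % (low, high) if high else "{%s}" % low
            i = end + 1
            continue
        if ch == "[":
            # "[TCA]": any one of the listed symbols.
            end = pattern.index("]", i)
            atom = pattern[i:end + 1]
            i = end + 1
        else:
            atom = "[" + alphabet + "]" if ch == wildcard else re.escape(ch)
            i += 1
        out.append(atom + pending)
        pending = ""
    return "".join(out)


def find_all(pattern, sequence):
    """Return the non-overlapping substrings of sequence that match pattern."""
    regex = sequin_pattern_to_regex(pattern)
    return [m.group() for m in re.finditer(regex, sequence.upper())]


for pat in ["TCAGGGC", "[TCA]CAGGGC", "NCAGGGC", "TCA(3)GC", "TCA(1:3)GC", "TCA(1:3)NC"]:
    print(pat, "->", sequin_pattern_to_regex(pat))
```

Passing `wildcard="X"` and a 20-letter amino acid alphabet would give the corresponding translation for protein patterns, again as an assumption about the syntax rather than a description of Sequin's internals.
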
  • You must now enter the scientific, not the common, organism name when preparing a new submission. Sequin, as well as the GenBank/EMBL/DDBJ record, will still list both scientific and common names, however.
  • A variety of minor bugs have been fixed (affecting all platforms).

Revised June 2, 1998
Comments and questions to: info@ncbi.nlm.nih.gov (NCBI) or http://www.ebi.ac.uk/support/ (EBI)
