
Sequin FAQ


  1. Network-aware Sequin
  2. Indicating organism names for phylogenetic studies
  3. Adding titles to sets of sequences
  4. Annotating selenocysteines
  5. Library problems under Solaris
  6. Adding modifiers (e.g., strain, chromosome, cell-type) to biological source information
  7. Appending a reference to a feature
  8. Propagating features
  9. Export versus Save
  10. Problems importing sequences
  11. Automatic Definition Line Generation

  1. How do I change between the stand-alone and network-aware modes of Sequin?

    There are two ways to change between the stand-alone and network-aware modes of Sequin. (1) When you launch the Sequin program, you will see a menu called Misc on the Welcome to Sequin form. Select Net Configure under this menu. (2) If you are already running Sequin, select the option under the Sequin Misc menu called Net Configure. In either case, Sequin will prompt you to set certain preferences, and will then run a network configuration program. In most cases, the default preferences are sufficient. To switch Sequin back into its stand-alone mode, select the Net Configure option again. You must restart Sequin before any changes to the network mode take effect. For additional information, see the Sequin help documentation under Net Configure.


  2. I am submitting a set of sequences as part of a phylogenetic study. Each sequence comes from a different organism. How do I indicate which sequence comes from which organism?

    There are two ways to indicate the organism. First, you can encode the organism name directly into the file which contains the nucleotide sequence. Second, you can indicate the organism name on the Source Modifiers form which appears after the Organism and Sequences form.

    For either method, your sequences must be in FASTA, FASTA+GAP, PHYLIP, NEXUS Contiguous, or NEXUS Interleaved format. FASTA format is used for single unaligned sequences. It consists of a "definition line" followed by lines of sequence. FASTA+GAP, PHYLIP, and NEXUS formats are used for sets of aligned sequences. FASTA+GAP format is similar to FASTA format, except that gaps, indicated by a "-", are allowed. PHYLIP and NEXUS formats are generated by certain sequence analysis packages.

    To encode the organism name into the file which contains the nucleotide sequence:

    If your sequences are in FASTA or FASTA+GAP format, insert the phrase [org=organism scientific name], such as [org=Mus musculus] or [org=Drosophila melanogaster], in the definition line of each sequence. The definition line is the line starting with a ">" character which immediately precedes your sequence. Use the full scientific name of the organism, not an abbreviation such as D. melanogaster. The first word immediately following the ">" character is the SeqID, a unique identifier which you provide for your sequence.

    If your sequences are in PHYLIP or NEXUS format, you must create a FASTA-style definition line for each sequence. Place this definition line containing the [org=organism scientific name] phrase at the bottom of the PHYLIP or NEXUS file.

    You can also encode other modifiers in the definition line. Another FAQ discusses how to add modifiers (e.g., strain, chromosome, cell-type) to biological source information. Here are some examples:

    sample file containing sequences in FASTA format:

     
    											
    >dna1 [org=Mus musculus] [strain=A]
    GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC
    >dna2 [org=Drosophila melanogaster] [strain=B]
    GGGGGTGGGGAAAAAAAAAAAAAAATTTTTTTATTTTTTTTCCCCGGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTATTTTCCCCACCCCCCCCC
    >dna3 [org=Saccharomyces cerevisiae] [strain=C]
    GGGGGGCGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTCTTTTCCCCCCCCCCCCCC
    											

    sample file containing sequences in PHYLIP format:

     
    											
          3    100
    ABC-1      GGGGGGGGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCCCCC
    ABC-2      GGGGGTGGGG AAAAAAAAAA AAAAATTTTT TTATTTTTTT TCCCCGGCCC
    ABC-3      GGGGGGCGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCGCCC

               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT TTTTTCCCCC CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT ATTTTCCCCA CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT CTTTTCCCCC CCCCCCCCCC
    >[org=Mus musculus] [strain=A]
    >[org=Drosophila melanogaster] [strain=B]
    >[org=Saccharomyces cerevisiae] [strain=C]
    											

    sample file containing sequences in NEXUS Interleaved format:

     
    #NEXUS
    [!This data assembled using Sequencher*, from Gene Codes Corporation.]
    begin data;
    dimensions ntax=3 nchar=100;
    format datatype=dna gap=: interleave;
    matrix
    3 100
    ABC-1      GGGGGGGGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCCCCC
    ABC-2      GGGGGTGGGG AAAAAAAAAA AAAAATTTTT TTATTTTTTT TCCCCGGCCC
    ABC-3      GGGGGGCGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCGCCC

    ABC-1      CCCCGGGGGG GGGGGAAAAA AATTTTTTTT TTTTTCCCCC CCCCCCCCCC
    ABC-2      CCCCGGGGGG GGGGGAAAAA AATTTTTTTT ATTTTCCCCA CCCCCCCCCC
    ABC-3      CCCCGGGGGG GGGGGAAAAA AATTTTTTTT CTTTTCCCCC CCCCCCCCCC
    >[org=Mus musculus] [strain=A]
    >[org=Drosophila melanogaster] [strain=B]
    >[org=Saccharomyces cerevisiae] [strain=C]

    sample file containing sequences in NEXUS Contiguous format:

     
    #NEXUS
    BEGIN DATA;
    DIMENSIONS NTAX=3 NCHAR=100;
    FORMAT MISSING=? GAP=- DATAtype=DNA ;
    MATRIX
    ABC-1 GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC
    ABC-2 GGGGGTGGGGAAAAAAAAAAAAAAATTTTTTTATTTTTTTTCCCCGGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTATTTTCCCCACCCCCCCCC
    ABC-3 GGGGGGCGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTCTTTTCCCCCCCCCCCCCC
    >[org=Mus musculus] [strain=A]
    >[org=Drosophila melanogaster] [strain=B]
    >[org=Saccharomyces cerevisiae] [strain=C]
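
    Before importing a large file, it can help to check that every definition line carries the modifiers you expect. The following Python sketch is not part of Sequin; the function name parse_definition_line is hypothetical, and the sketch assumes only the bracketed [name=value] syntax shown in the samples above.

```python
import re

# Matches one bracketed modifier such as [org=Mus musculus] or [strain=A].
MODIFIER_RE = re.compile(r"\[([^=\[\]]+)=([^\[\]]*)\]")


def parse_definition_line(line):
    """Split a FASTA-style definition line into (seq_id, modifiers, title).

    Hypothetical helper for checking files by hand; it only mirrors the
    syntax used in the samples and is not Sequin's own parser.
    """
    if not line.startswith(">"):
        raise ValueError("a definition line must start with '>'")
    body = line[1:].strip()
    modifiers = {name.strip().lower(): value.strip()
                 for name, value in MODIFIER_RE.findall(body)}
    # Words that remain once the bracketed modifiers are removed.
    rest = MODIFIER_RE.sub(" ", body).split()
    # In a FASTA file the first word is the SeqID. The lines appended to a
    # PHYLIP or NEXUS file start with a modifier and carry no SeqID.
    has_seq_id = bool(rest) and not body.startswith("[")
    seq_id = rest[0] if has_seq_id else None
    title = " ".join(rest[1:] if has_seq_id else rest)
    return seq_id, modifiers, title
```

    For example, parse_definition_line(">dna1 [org=Mus musculus] [strain=A]") yields the SeqID dna1 and the modifiers org and strain, with an empty title.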
                                           

    To enter the organism name on the Source Modifiers form:

    Although each sequence should have a SeqID, a unique identifier, you do not need to add any additional information. After you fill out the Organism and Sequences form, you will see the Source Modifiers form. From the top pop-up menu, choose the modifier you want to annotate, in this case, Organism. The left column lists the sequences by their SeqID. Type the scientific organism name for each sequence in the corresponding box labelled Value. Do not use abbreviations such as D. melanogaster.

    You can also add additional optional modifiers on this page, such as strain or chromosome. Another FAQ discusses how to add modifiers (e.g., strain, chromosome, cell-type) to biological source information.

    sample file containing sequences in FASTA format:

     
    >dna1
    GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC
    >dna2
    GGGGGTGGGGAAAAAAAAAAAAAAATTTTTTTATTTTTTTTCCCCGGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTATTTTCCCCACCCCCCCCC
    >dna3
    GGGGGGCGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTCTTTTCCCCCCCCCCCCCC

    sample file containing sequences in PHYLIP format:

     
          3    100
    ABC-1      GGGGGGGGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCCCCC
    ABC-2      GGGGGTGGGG AAAAAAAAAA AAAAATTTTT TTATTTTTTT TCCCCGGCCC
    ABC-3      GGGGGGCGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCGCCC

               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT TTTTTCCCCC CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT ATTTTCCCCA CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT CTTTTCCCCC CCCCCCCCCC
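
    The interleaved layout of these PHYLIP samples (a header with the number of sequences and their length, a first block with names, then blocks without names) can be checked with a short script before import. This is a minimal sketch, not Sequin's parser. The function name parse_interleaved_phylip is hypothetical, and it assumes names contain no spaces and are separated from the sequence by whitespace, as in the samples.

```python
def parse_interleaved_phylip(text):
    """Parse relaxed interleaved PHYLIP text.

    Returns ({name: sequence}, definition_lines). Lines starting with '>'
    (definition lines appended at the bottom of the file) are collected
    separately. Assumes whitespace-delimited names without spaces.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(tok) for tok in lines[0].split()[:2])
    deflines = [ln for ln in lines[1:] if ln.startswith(">")]
    rows = [ln for ln in lines[1:] if not ln.startswith(">")]
    if ntax < 1 or len(rows) % ntax != 0:
        raise ValueError("number of data rows is not a multiple of ntax")

    names, seqs = [], []
    for i, row in enumerate(rows):
        parts = row.split()
        if i < ntax:
            # First block: sequence name followed by sequence chunks.
            names.append(parts[0])
            seqs.append("".join(parts[1:]))
        else:
            # Later blocks: sequence chunks only, in the same order.
            seqs[i % ntax] += "".join(parts)

    for name, seq in zip(names, seqs):
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} bases, got {len(seq)}")
    return dict(zip(names, seqs)), deflines
```

    A length mismatch reported here usually means a block was lost or a line was wrapped when the file was saved.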


  3. I am submitting a large set of sequences. I want each sequence to have the same title, but I don't want to add all the titles to all the definition lines by hand. Can Sequin add titles to sequences automatically?

    There is no need to add a title to the definition line for each sequence. On the Annotation Page, which is part of the Organism and Sequences form, you can add the title which you would like to apply to all the sequences. The title should start with the name of the organism. If your sequences all come from different organisms, you can instruct Sequin to prefix the title with the organism name.
    Examples of sequence titles are:

    Arabidopsis thaliana pyruvate dehydrogenase E1 alpha subunit mRNA, nuclear gene encoding mitochondrial protein, complete cds.

    Bos taurus retinal pigment (RPE1) mRNA, 3' end.

    Ophraella conferta 16S mitochondrial ribosomal RNA, large subunit, mitochondrial gene.

    Sequin can also create titles automatically from information you provide in the record. See the FAQ on the Automatic Definition Line Generation, below.

    For additional information on formatting or adding titles, see the help documentation for the Nucleotide definition line or the Annotation Page, respectively.


  4. How do I indicate the position of a selenocysteine residue?

    Open the Coding Region feature form by double clicking on the CDS in the record viewer. Select the Exceptions subpage of the Coding Region page by clicking on the appropriate folder tabs. In the box labelled Position, indicate, with a single number, the amino acid location of the selenocysteine. Select Selenocysteine from the Amino Acid popup menu, and click on Accept.


  5. I am running Sequin under Solaris, and it won't start without a library named libresolv.so.2. I do not have this library on my machine.

    The library libresolv.so.2 is a security patch issued by Sun. Your system administrator should be able to install it. Alternatively, the library called libresolv.so.1 can substitute for libresolv.so.2. Copying libresolv.so.1 to libresolv.so.2 by typing

    cp libresolv.so.1 libresolv.so.2

    will also solve the problem.


  6. I would like to add some additional information about the biological source from which the sequence is derived. What modifiers can I include, and how should I annotate them on the sequence?

    You can add biological source information which describes the organism or source from which the sequence was derived. If you are submitting a single sequence, you must encode the information directly in the definition line. If you are submitting multiple sequences as part of a phylogenetic, population, or mutation study, you can either encode the information directly in the definition line or enter it on the Source Modifiers form which follows the Organism and Sequences form.

    The file containing your sequences may be in FASTA, FASTA+GAP, PHYLIP, NEXUS Interleaved, or NEXUS Contiguous format. FASTA format is used for single unaligned sequences. It consists of a "definition line" followed by lines of sequence. FASTA+GAP, PHYLIP, and NEXUS formats are used for sets of aligned sequences. FASTA+GAP format is similar to FASTA format, except that gaps, indicated by a "-", are allowed. PHYLIP and NEXUS formats are generated by certain sequence analysis packages.

    To encode biological source information into the definition line:

    The definition line is the line starting with a ">" character. Place each modifier, such as [strain=BALB/c] or [chromosome=2], in brackets ahead of the sequence title. Two other FAQs, Adding titles to sets of sequences and Automatic Definition Line Generation, describe how to add titles to your sequences.

    For example, if your sequences are in FASTA or FASTA+GAP format, you could indicate the name of the organism, as well as the strain, chromosome, and cell type:

    sample file containing sequences in FASTA format:

     
    >dna1 [org=Mus musculus] [strain=BALB/c] [chromosome=2] [cell-type=leukocyte] Mus musculus example (XMP) mRNA, partial cds
    GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC
    >dna2 [org=Rattus norvegicus] [strain=Sprague-Dawley] [chromosome=5] [cell-type=macrophage] Rattus norvegicus example (XMP) mRNA, partial cds
    GGGGGTGGGGAAAAAAAAAAAAAAATTTTTTTATTTTTTTTCCCCGGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTATTTTCCCCACCCCCCCCC
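
    If you keep the source information in a table, you can generate definition lines like these with a script instead of typing them. The Python sketch below is one way to do that. The names make_definition_line and write_fasta are hypothetical, and the 70-character line width is an arbitrary choice, not a Sequin requirement.

```python
def make_definition_line(seq_id, modifiers, title=""):
    """Return '>SeqID [name=value] ... title' for one sequence."""
    if not seq_id or any(ch.isspace() for ch in seq_id):
        raise ValueError("SeqID must be a single word")
    parts = [">" + seq_id]
    for name, value in modifiers.items():
        # Brackets and '=' would break the [name=value] syntax.
        if any(ch in "[]=" for ch in name) or any(ch in "[]" for ch in value):
            raise ValueError(f"unsupported character in modifier {name!r}")
        parts.append(f"[{name}={value}]")
    if title:
        parts.append(title)  # modifiers go ahead of the title
    return " ".join(parts)


def write_fasta(records, width=70):
    """Build FASTA text from (seq_id, modifiers, title, sequence) tuples."""
    out = []
    for seq_id, modifiers, title, sequence in records:
        # The definition line stays on one line, followed by a real line break.
        out.append(make_definition_line(seq_id, modifiers, title))
        out.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(out) + "\n"
```

    Writing the result to a plain text file, rather than pasting it into a word processor, avoids the wrapped definition lines that prevent import.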

    You can include the same modifiers at the very end of your PHYLIP or NEXUS file:

    sample file containing sequences in PHYLIP format:

     
          2    100
    ABC-1      GGGGGGGGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCCCCC
    ABC-2      GGGGGTGGGG AAAAAAAAAA AAAAATTTTT TTATTTTTTT TCCCCGGCCC

               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT TTTTTCCCCC CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT ATTTTCCCCA CCCCCCCCCC
    >[org=Mus musculus] [strain=BALB/c] [chromosome=2] [cell-type=leukocyte] Mus musculus example (XMP) mRNA, partial cds
    >[org=Rattus norvegicus] [strain=Sprague-Dawley] [chromosome=5] [cell-type=macrophage] Rattus norvegicus example (XMP) mRNA, partial cds
                                             

    To enter biological source information on the Source Modifiers form:

    You will only have access to this form if you are submitting a set of sequences as part of a phylogenetic, population, or mutation study. You do not need to include any information about the biological source on the definition line. Rather, from the top pop-up menu on the Source Modifiers form, choose the modifier you want to add. The left column lists the sequences by their SeqID, the unique identifier which you provided for each sequence. Type the modifier for each sequence in the corresponding box labelled Value. For example, if you select the Strain modifier, you might type BALB/c in the first Value box, Sprague-Dawley in the second, etc.

    A FASTA file might look like this:

    >dna1 
    GGGGGGGGGGAAAAAAAAAAAAAAATTTTTTTTTTTTTTTTCCCCCCCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTTTTTTCCCCCCCCCCCCCC
    
    >dna2 
    GGGGGTGGGGAAAAAAAAAAAAAAATTTTTTTATTTTTTTTCCCCGGCCCCCCCGGGGGGGGGGG
    AAAAAAATTTTTTTTATTTTCCCCACCCCCCCCC
    
    

    A PHYLIP file might look like this:

          2    100
    ABC-1      GGGGGGGGGG AAAAAAAAAA AAAAATTTTT TTTTTTTTTT TCCCCCCCCC  
    ABC-2      GGGGGTGGGG AAAAAAAAAA AAAAATTTTT TTATTTTTTT TCCCCGGCCC 
    
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT TTTTTCCCCC CCCCCCCCCC
               CCCCGGGGGG GGGGGAAAAA AATTTTTTTT ATTTTCCCCA CCCCCCCCCC
    
    

    Alternatively, you can provide the modifiers after your sequences have been imported into Sequin. Under the Sequin Misc menu, click on Create New Descriptor --> Biological Source. Select the Modifiers Page, and either the Source or Organism subpage. Additional information about the biological source is available in the help documentation under Biological Source.

    The following is a list of preferred modifiers. You may add as many modifiers as you wish.

    Source modifiers:
    Usage: [sex=xxx] or [transposon-name=xxx]

    • Chromosome: Chromosome to which the gene maps.
    • Map: Map location of the gene.
    • Clone: Name of clone from which sequence was obtained.
    • Subclone: Name of subclone from which sequence was obtained.
    • Haplotype: Haplotype of the organism.
    • Genotype: Genotype of the organism.
    • Sex: Sex of the organism from which the sequence derives.
    • Cell-line: Cell line from which sequence derives.
    • Cell-type: Type of cell from which sequence derives.
    • Tissue-type: Type of tissue from which sequence derives.
    • Clone-lib: Name of library from which sequence was obtained.
    • Dev-stage: Developmental stage of organism.
    • Frequency: The fraction of population carrying the variation, expressed as a decimal fraction.
    • Germline: Used for immunoglobulin genes. Not rearranged.
    • Rearranged: Used for immunoglobulin genes.
    • Lab-host: Laboratory host used to propagate the organism from which the sequence was derived.
    • Pop-variant: Population variant.
    • Tissue-lib: Tissue library from which the sequence was obtained.
    • Plasmid-name: Name of plasmid from which the sequence was obtained.
    • Transposon-name: Name of transposable element from which the sequence was obtained.
    • Ins-seq-name: Insertion sequence name.
    • Plastid-name: Name of plastid from which the sequence was obtained.
    • Country: The country of origin of DNA samples used for epidemiological or population studies.

    Source locations
    Usage: [location=macronuclear] or [location=virion]

    • genomic
    • chloroplast
    • chromoplast
    • kinetoplast
    • mitochondrion
    • plastid
    • macronuclear
    • extrachromosomal
    • plasmid
    • transposon
    • insertion sequence
    • cyanelle
    • proviral
    • virion

    Molecule types
    Usage: [molecule=dna] or [molecule=rna]

    • dna
    • rna

    Organism modifiers
    Usage: [cultivar=xxx] or [sub-species=xxx]

    • Strain: Strain of organism from which sequence was obtained.
    • Substrain: The substrain of organism from which sequence was obtained.
    • Type: The type of organism from which sequence was obtained.
    • Subtype: The subtype of organism from which sequence was obtained.
    • Variety: Variety of plant from which sequence was obtained.
    • Serotype: The serotype of organism from which sequence was obtained.
    • Serogroup: The serogroup of organism from which sequence was obtained.
    • Serovar: Defined by the serotype.
    • Cultivar: Common name for cultivated strain of plant, e.g., Big Boy tomato or Silver Queen corn.
    • Pathovar: Defined by the host on which the organism is pathogenic.
    • Chemovar: Defined by a chemical characteristic.
    • Biovar: Defined by a biological characteristic.
    • Biotype: Defined by a biological characteristic.
    • Group: The group of organism from which sequence was obtained.
    • Subgroup: The subgroup of organism from which sequence was obtained.
    • Isolate: Identification or description of the specific individual from which this sequence was obtained, for example, Patient X14.
    • Common: Common name of the organism.
    • Acronym: Acronym for organism, usually found for viruses, such as HIV-1.
    • Natural Host: When the sequence submission is from an organism that exists in a symbiotic, parasitic or other special relationship with some second organism, the 'natural host' modifier can be used to identify the name of the host species.
    • Sub-species: Sub-species of organism from which sequence was obtained.
    • Specimen voucher: An identifier of the individual or collection of the source organism and the place where it is currently stored, usually an institution.
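
    A misspelled modifier name is easy to miss in a long file. The Python sketch below compares the bracketed names in a definition line with the lists above. It is an illustration only, not a Sequin feature. The function name check_modifiers is hypothetical, and the hyphenated spellings natural-host and specimen-voucher are assumptions, since the lists above do not show how those two names are written inside brackets.

```python
import re

# Names from the lists above, lower-cased.
SOURCE_MODIFIERS = {
    "chromosome", "map", "clone", "subclone", "haplotype", "genotype", "sex",
    "cell-line", "cell-type", "tissue-type", "clone-lib", "dev-stage",
    "frequency", "germline", "rearranged", "lab-host", "pop-variant",
    "tissue-lib", "plasmid-name", "transposon-name", "ins-seq-name",
    "plastid-name", "country",
}
ORGANISM_MODIFIERS = {
    "strain", "substrain", "type", "subtype", "variety", "serotype",
    "serogroup", "serovar", "cultivar", "pathovar", "chemovar", "biovar",
    "biotype", "group", "subgroup", "isolate", "common", "acronym",
    "sub-species",
    "natural-host", "specimen-voucher",  # assumed spellings
}
# Modifiers whose value must come from a fixed list.
RESTRICTED = {
    "location": {
        "genomic", "chloroplast", "chromoplast", "kinetoplast",
        "mitochondrion", "plastid", "macronuclear", "extrachromosomal",
        "plasmid", "transposon", "insertion sequence", "cyanelle",
        "proviral", "virion",
    },
    "molecule": {"dna", "rna"},
}
# "org" is the organism name used throughout the examples.
KNOWN = SOURCE_MODIFIERS | ORGANISM_MODIFIERS | {"org"}


def check_modifiers(definition_line):
    """Return a list of messages about modifiers not found in the lists."""
    problems = []
    pairs = re.findall(r"\[([^=\[\]]+)=([^\[\]]*)\]", definition_line)
    for name, value in pairs:
        key, value = name.strip().lower(), value.strip()
        if key in RESTRICTED:
            if value.lower() not in RESTRICTED[key]:
                problems.append(f"unknown {key}: {value}")
        elif key not in KNOWN:
            problems.append(f"not in preferred list: {key}")
        elif not value:
            problems.append(f"empty value for {key}")
    return problems
```

    A message from this check is a prompt to look again, not necessarily an error, because the lists above contain preferred modifiers rather than an exhaustive set.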

    If you are including an amino acid translation of a nucleotide sequence, we suggest that you include the gene and protein names in the FASTA definition line. No other modifiers may be included in protein definition lines. For example,

    >aa1 [gene=XMP] [prot=example] Mus musculus example (XMP) protein, partial sequence
    GGGKKKKKFFFFFSPPPPGGGEKNFFFFPPPPP
    
    >aa2 [gene=XMP] [prot=example] Rattus norvegicus example (XMP) protein, partial sequence
    GVGKKKKKFFYFFSPAPPGGGEKNFFYFPHPPP
    
    


  7. How do I append a publication reference to a single feature, e.g., a CDS?

    There are two types of publications in a database record, publication descriptors and publication features. Publication descriptors, like all descriptors, refer to the entire sequence. Publication features, like all features, refer to a part of a sequence, like a CDS or mRNA feature.

    First, you need to add the publication to the record. Under the Sequin Misc menu, click on Create New Publication --> Publication Feature. Then fill in the information for the citation. If you are referencing a published citation, this process is much easier if you have made Sequin network-aware. If Sequin is network-aware, go to the Journal page, and fill out the MUID (MEDLINE unique identifier). Click on Lookup By MUID, and the pages will be filled out automatically. MUIDs can be found by looking up the citation in Entrez. Alternatively, enter the Journal, Volume, Pages, and Year. Then select "Lookup Article." Sequin will retrieve the missing Title and Authors information. If the citation does not yet have a MUID, or if you are not running Sequin in its network-aware mode, fill out the Title, Authors, and Journal pages by hand.

    Next, enter the interval that the citation refers to. Go to the Location page, and type the nucleotide sequence range in the boxes. Click on Accept when you are finished. The publication will now appear near the top of the record.

    Finally, add the publication to the feature. Double click on the feature to open the form for that feature. Click on the Properties page, then the Citations subpage. Click on the button labeled Edit Citations. Choose the desired citation by clicking in the box next to it, and click on Accept.


  8. I have an alignment of multiple sequences. I would like to annotate the same feature (such as rRNA) on all the sequences. Is there an easy way to do this with Sequin?

    You can use Sequin to propagate any kind of features from one sequence to a complete set.

    1. Import the set of sequences in a pre-aligned format, for example, in PHYLIP, NEXUS, or gapped FASTA format.
    2. After you import the sequences, Sequin will open a window (the record viewer) showing the GenBank flat file format of the first sequence.
    3. Annotate the desired feature(s) on the first sequence by choosing the appropriate item from the Annotate menu and entering the base span in the Location page of the dialog box that appears.
    4. Now, select ALL SEQUENCES under the 'target' option and set the display format to Alignment.
    5. You will see a red border around the graphical view of the sequences. The sequences are displayed as bars.
    6. Click anywhere within the red border to highlight the alignment. Launch the alignment editor by selecting Edit Alignment under the Sequin Edit menu. You will see your sequences aligned as you imported them (as bases).
    7. Click on the first button ("Show Feat") in the lower left corner of the window to show the features in the alignment. You can scroll down the alignment to verify the accuracy of the feature locations.
    8. Select Propagate in the Features menu. A dialog box will appear with three panels.
    9. In the first panel, select the source sequence, that is, the sequence that has the feature you want to propagate. In another panel, select the feature(s) to propagate. In the final panel, select the target sequences, that is, all the sequences onto which you want to propagate features (all the sequences EXCEPT the one that already has them). To select multiple, nonadjacent sequences, hold the "Ctrl" key while clicking with the mouse. To select a range of sequences, highlight the first sequence by clicking with the mouse and then "Shift-click" (hold the "Shift" key while clicking with the mouse) to select the last.
    10. Choose whether you want the feature to be split at sequence gaps or extended over sequence gaps. If you want to extend the translation after an internal stop codon in the target sequence, click on that box.
    11. Click on the 'propagate' button.
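
    To see why the alignment matters in step 10, it may help to picture propagation as mapping a feature's base span through the alignment columns. The Python sketch below illustrates that idea only. It is not Sequin's implementation, the function names column_of and propagate are hypothetical, and it ignores translation and internal stop codons.

```python
def column_of(aligned, position, gap="-"):
    """Return the 0-based alignment column of the 1-based base `position`."""
    seen = 0
    for col, ch in enumerate(aligned):
        if ch != gap:
            seen += 1
            if seen == position:
                return col
    raise ValueError("position lies beyond the end of the sequence")


def propagate(source, target, start, end, split_at_gaps=True, gap="-"):
    """Map the 1-based span start..end on `source` onto `target`.

    `source` and `target` are rows of the same alignment (equal length).
    Returns a list of (start, end) spans in 1-based target coordinates:
    several spans when the feature is split at target gaps, one span when
    it is extended over them.
    """
    if len(source) != len(target):
        raise ValueError("aligned sequences must have the same length")
    first = column_of(source, start, gap)
    last = column_of(source, end, gap)

    spans, run_start, run_end, pos = [], None, None, 0
    for col, ch in enumerate(target):
        if ch != gap:
            pos += 1  # 1-based coordinate of this base in the target
        if col < first or col > last:
            continue
        if ch != gap:
            if run_start is None:
                run_start = pos
            run_end = pos
        elif run_start is not None and split_at_gaps:
            # A gap in the target closes the current span.
            spans.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        spans.append((run_start, run_end))
    return spans
```

    With source "ACGTACGTAC", target "ACG--CGTAC", and a feature on bases 2 to 8 of the source, splitting at gaps gives the spans 2..3 and 4..6 on the target, while extending over gaps gives the single span 2..6.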


  9. What is the difference between the Export and Save functions under the Sequin File menu?

    The Sequin Save and Save As functions save the record so that it can be re-opened by Sequin. The record is saved in a format called ASN.1, a data description language used by the NCBI. Be sure to save the record before you exit Sequin or you will lose all of your work. ASN.1 is also the format in which the file is saved when you prepare your record for submission (by clicking the Done button on the record viewer or selecting Prepare Submission under the File menu). The database staff use the ASN.1 file to build your sequence submission.

    The Sequin Export function exports the current view of the record. The information which is exported depends on the option that is selected in the Display Format pop-up menu. In Sequence display format, Sequin will export a text file which shows the sequence and any annotated features, such as a CDS or mRNA. In GenBank or EMBL display format, Sequin will export a copy of the record as it would appear in GenBank or EMBL format, respectively. In FASTA display format, Sequin will export the DNA sequence in FASTA format. In ASN.1 display format, Sequin will export a copy of the record in ASN.1.

    Note that Exporting the record is not the same as Saving. A file which has been created by Exporting cannot be re-opened by Sequin, and should not be submitted to the database.


  10. I've formatted my sequence as you suggest, but I can't import it into Sequin. What could be wrong?

    If there is no line break (carriage return) between the definition line and the first line of sequence, you will not be able to import the sequence. Some word processors will break a single line into two lines without actually adding a carriage return. In this case, although the definition line and sequence will appear to be on two different lines, they really are on a single line. If you are unsure whether there is a carriage return, you can either set up your word processor so it shows invisible characters like carriage returns, or view the file in a text editor which does not create artificial line breaks.
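
    A script can also flag this problem. The Python sketch below reports definition lines that end in a long run of nucleotide letters, which suggests that the sequence was never separated from the definition line by a real line break. The function name suspect_definition_lines and the threshold of 20 letters are assumptions chosen for illustration.

```python
import re


def suspect_definition_lines(text, min_run=20):
    """Return (line_number, line) pairs for suspicious definition lines.

    A definition line is flagged when it ends in at least `min_run`
    consecutive nucleotide or gap characters, which normally belong on
    the following line.
    """
    pattern = re.compile(r"[ACGTUNacgtun-]{%d,}\s*$" % min_run)
    hits = []
    # splitlines() only breaks at real line-break characters, so a line
    # that merely looks wrapped in a word processor stays in one piece.
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(">") and pattern.search(line):
            hits.append((number, line))
    return hits
```

    Read the file as plain text and pass its contents to the function. An empty result means no definition line appears to have sequence attached to it.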


  11. Can Sequin make sequence titles (definition lines) automatically?

    There is an option in the Sequin Annotate Menu, called Generate Definition Line, which will generate a title for your sequence based on the information provided in the record. This option will work for single sequences as well as sets of sequences, and can handle complex annotations with multiple features. The title will follow GenBank conventions, but may be modified by the database staff if it is not appropriate. The title you enter here will replace any title you entered elsewhere in the submission, for example, any title which was attached to the nucleotide sequence. An additional way of adding titles to sets of sequences is described above.


Revised January 21, 1999

Comments and questions to: info@ncbi.nlm.nih.gov (NCBI) or http://www.ebi.ac.uk/support/ (EBI)
