EMBOSS Seqret

Introduction

EMBOSS Seqret reads and writes (returns) sequences. It is useful for a variety of tasks, including extracting sequences from databases, displaying sequences, reformatting sequences, producing the reverse complement of a sequence, extracting fragments of a sequence, sequence case conversion or any combination of the above functions.


How to use this tool

Running a tool from the web form is a simple multi-step process: start at the top of the page and follow the steps to the bottom.

Each tool has at least 2 steps, but most of them have more:

  • The first steps are usually where the user sets the tool input (e.g. sequences, databases).
  • In the following steps, the user can change the default tool parameters.
  • The last step is always the tool submission step, where the user can specify a title to be associated with the results and an email address for email notification. Pressing the submit button sends the information entered in the form to the server and launches the tool.

Note that the parameters are validated before the tool is launched on the server. If a parameter is missing or a combination of parameters is invalid, the user is notified directly in the form.

Step 1 - Input Sequences

Input Sequence

One or more sequences to be processed can be entered directly into this form. Sequences can be in GCG, FASTA, EMBL, GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Note that pasting data directly from word processors may yield unpredictable results because hidden/control characters may be present. There is a limit of 2MB.

Sequence File Upload

A file containing one or more valid sequences in any supported format (GCG, FASTA, EMBL, GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot) can be uploaded and used as input. Word processor files may yield unpredictable results because hidden/control characters may be present in the files. It is best to save files with the Unix format option to avoid hidden Windows characters. There is a limit of 2MB.
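
The hidden-character and line-ending problems described for pasted and uploaded input can be reduced by cleaning the text before submitting it. The following is a minimal sketch, not part of the tool itself: the function names are hypothetical, and reading the 2MB limit as 2 x 1024 x 1024 bytes is an assumption.

```python
# Assumption: the documented "2MB" limit is interpreted as 2 MiB.
MAX_INPUT_BYTES = 2 * 1024 * 1024


def clean_sequence_text(text: str) -> str:
    """Normalise line endings and drop hidden control characters."""
    # Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Keep newlines, tabs and printable characters; drop everything else
    # (NUL bytes, byte-order marks, non-breaking spaces and similar).
    text = "".join(ch for ch in text if ch in "\n\t" or ch.isprintable())
    # End the input with exactly one newline ("adding a return").
    return text.rstrip("\n") + "\n"


def within_size_limit(text: str) -> bool:
    """Check the cleaned text against the assumed size limit."""
    return len(text.encode("utf-8")) <= MAX_INPUT_BYTES
```

For example, clean_sequence_text(">seq1\r\nACGT\r\n") gives a Unix-style text that ends in a single newline.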

Sequence Type

Indicates whether the input sequence is protein, DNA or RNA. This forces the input to be interpreted as the specified type of sequence, which prevents issues with nucleotide sequences that contain many ambiguous residues.

Type [abbreviation]
PROTEIN [protein]
DNA [dna]
RNA [rna]

Step 2 - Select parameters

Input Format

Input format name

Format name [value]
Unknown format [unknown]
GCG sequence format [gcg]
GCG old (version 8) sequence format [gcg8]
EMBL entry format [embl]
Swissprot entry format [swiss]
NBRF/PIR entry format [nbrf]
PDB protein databank format ATOM lines [pdb]
PDB protein databank format SEQRES lines [pdbseq]
PDB protein databank format nucleotide ATOM lines [pdbnuc]
PDB protein databank format nucleotide SEQRES lines [pdbnucseq]
FASTA format including NCBI-style IDs [fasta]
Plain old fasta format with IDs not parsed further [pearson]
FASTQ short read format ignoring quality scores [fastq]
FASTQ short read format with phred quality [fastq-sanger]
FASTQ Illumina 1.3 short read format [fastq-illumina]
FASTQ Solexa/Illumina 1.0 short read format [fastq-solexa]
Sequence Alignment/Map (SAM) format [sam]
Genbank entry format [genbank]
Genbank/DDBJ entry format (alias) [ddbj]
Refseq entry format [refseq]
Refseq protein entry format [refseqp]
Codata entry format [codata]
DNA strider output format [strider]
Clustalw output format [clustal]
Phylip interleaved and non-interleaved formats [phylip]
Phylip non-interleaved format [phylipnon]
ACE sequence format [ace]
ACE sequence format [consed]
ACEDB sequence format [acedb]
Fasta format variant with database name before ID [dbid]
GCG MSF (multiple sequence file) file format [msf]
Hennig86 output format [hennig86]
Jackknifer interleaved and non-interleaved formats [jackknifer]
Nexus/paup interleaved format [nexus]
Treecon output format [treecon]
Mega interleaved and non-interleaved formats [mega]
Intelligenetics sequence format strict parser [igstrict]
Intelligenetics sequence format [ig]
Old staden package sequence format [staden]
Plain text [text]
GFF feature file with sequence in the header [gff2]
GFF3 feature file with sequence [gff3]
GFF3 feature file with sequence [gff]
Stockholm (pfam) format [stockholm]
Selex format [selex]
Fitch program format [fitch]
Biomart tab-delimited results [biomart]
Mase program format [mase]
Raw sequence with no non-sequence characters [raw]
Staden experiment file [experiment]
ABI trace file [abi]
Binary Sequence Alignment/Map (BAM) format [bam]
Ensembl SQL format [ensembl]

Default value is: Unknown format [unknown]
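
When the input format is left as unknown, the format has to be worked out from the data. The sketch below illustrates the idea with a first-line heuristic for a handful of the formats listed above. It is an illustration only and not the detection logic EMBOSS uses. The function name and the choice of signatures are assumptions, and formats that share an opening line (for example EMBL and Swissprot entries) are deliberately left out.

```python
# Opening-line signatures for a few text formats (illustrative subset).
_SIGNATURES = [
    (">", "fasta"),
    ("LOCUS", "genbank"),
    ("CLUSTAL", "clustal"),
    ("#NEXUS", "nexus"),
    ("# STOCKHOLM", "stockholm"),
]


def guess_format(text: str) -> str:
    """Guess a format value from the first non-blank line, else 'unknown'."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        for prefix, format_value in _SIGNATURES:
            if line.startswith(prefix):
                return format_value
        break  # only the first non-blank line is inspected
    return "unknown"
```

Setting the input format explicitly avoids relying on any such guess.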

Output Format

Output format name.

Format name [value]
EMBL entry format [embl]
Unknown format [unknown]
GCG sequence format [gcg]
GCG old (version 8) sequence format [gcg8]
EMBL new entry format [emblnew]
Swissprot entry format [swiss]
Swissprot entry format (swold) [swold]
Swissprot entry format (swissold) [swissold]
Swissprot entry format (swissprotold) [swissprotold]
Swissprot entry format (swissnew) [swissnew]
FASTA format [fasta]
NCBI fasta format with NCBI-style IDs [ncbi]
NCBI fasta format with NCBI-style IDs using GI number [gifasta]
NBRF/PIR entry format [nbrf]
Genbank entry format [genbank]
Genbank/DDBJ entry format (alias) [ddbj]
Genpept entry format [genpept]
Refseq entry format [refseq]
Refseqp entry format [refseqp]
GFF2 feature file with sequence in the header [gff2]
GFF3 feature file with sequence in FASTA format after [gff3]
GFF3 feature file with sequence in FASTA format after [gff]
Intelligenetics sequence format [ig]
Codata entry format [codata]
DNA strider output format [strider]
ACEDB sequence format [acedb]
Staden experiment file [experiment]
Old staden package sequence format [staden]
Plain text [text]
Fitch program format [fitch]
GCG MSF (multiple sequence file) file format [msf]
Clustalw multiple alignment format [clustal]
Selex format [selex]
Phylip interleaved format [phylip]
Phylip non-interleaved format [phylipnon]
NCBI ASN.1 format [asn1]
Hennig86 output format [hennig86]
Mega interleaved output format [mega]
Mega non-interleaved output format [meganon]
Nexus/paup interleaved format [nexus]
Nexus/paup non-interleaved format [nexusnon]
Jackknifer output interleaved format [jackknifer]
Jackknifer output non-interleaved format [jackknifernon]
Treecon output format [treecon]
Mase program format [mase]
DASDNA DAS nucleotide-only sequence [dasdna]
DASSEQUENCE DAS any sequence [das]
FASTQ short read format with phred quality [fastq-sanger]
FASTQ short read format with phred quality [fastq]
FASTQ Illumina 1.3 short read format [fastq-illumina]
FASTQ Solexa/Illumina 1.0 short read format [fastq-solexa]
Sequence alignment/map (SAM) format [sam]
Binary sequence alignment/map (BAM) format [bam]
Debugging trace of full internal data content [debug]

Default value is: EMBL entry format [embl]
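
For readers who also have EMBOSS installed locally, the input and output format values can be passed to the command-line seqret program. The sketch below only builds the argument list and does not run anything. It assumes the standard seqret qualifiers -sequence, -outseq, -sformat1 and -osformat2. The helper name and file names are hypothetical, and how the web form maps its fields onto these qualifiers is not stated on this page.

```python
def build_seqret_command(infile, outfile,
                         input_format="unknown", output_format="embl"):
    """Build an argument list for a local seqret run (nothing is executed)."""
    cmd = ["seqret", "-sequence", infile]
    # With the default "unknown", leave the input format to be detected.
    if input_format != "unknown":
        cmd += ["-sformat1", input_format]
    cmd += ["-outseq", outfile, "-osformat2", output_format]
    return cmd


# Hypothetical file names, for illustration only.
command = build_seqret_command("in.fasta", "out.embl", input_format="fasta")
print(" ".join(command))
# seqret -sequence in.fasta -sformat1 fasta -outseq out.embl -osformat2 embl
```

The resulting list can be handed to subprocess.run on a machine where seqret is on the PATH.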

Feature Selection

Use feature information

Default value is: yes [true]

First Sequence Only

Read one sequence and stop

Default value is: no [false]

Reverse-Complement of Input Sequences

Reverse-complement of input DNA sequences

Default value is: no [false]
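
To make the operation concrete, the following is a minimal pure-Python sketch of reverse-complementing a nucleotide sequence, including the IUPAC ambiguity codes. It illustrates the concept only and is not the EMBOSS implementation. The function name is hypothetical.

```python
# Complement table covering A, C, G, T, U and the IUPAC ambiguity codes.
# S, W and N are their own complements, so they are left unchanged.
_COMPLEMENT = str.maketrans(
    "ACGTURYKMBDHVacgturykmbdhv",
    "TGCAAYRMKVHDBtgcaayrmkvhdb",
)


def reverse_complement(sequence: str) -> str:
    """Complement every base, then reverse the result; case is preserved."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))    # GCAT
print(reverse_complement("AAGCTT"))  # AAGCTT (a palindromic site)
```

Note that this sketch always writes T as the complement of A, so an RNA input comes back in the DNA alphabet.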

Output Case

Change alphabet case for output sequences.

Default value is: No change [none]

Sequence Range

Specify a range or section of the input sequence to use. Example: specifying '34-89' for an input sequence of total length 100 tells EMBOSS seqret to use only residues 34 to 89, inclusive.

Default value is: START-END
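
The range semantics (1-based positions, both ends inclusive) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tool's own parser: the function name is hypothetical, and treating START and END as the first and last residue follows from the default value shown above.

```python
def extract_range(sequence: str, seq_range: str = "START-END") -> str:
    """Return the 1-based, inclusive sub-sequence described by 'from-to'."""
    start_text, end_text = seq_range.split("-", 1)
    # START and END stand for the first and last residue respectively.
    start = 1 if start_text.strip().upper() == "START" else int(start_text)
    end = len(sequence) if end_text.strip().upper() == "END" else int(end_text)
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {seq_range!r} does not fit the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# A sequence of length 100: residues 34 to 89 are the 56 'C' characters.
example = "A" * 33 + "C" * 56 + "G" * 11
print(len(extract_range(example, "34-89")))  # 56
```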

Step 3 - Submission

Job title

It's possible to identify the tool result by giving it a name. This name will be associated with the results and might appear in some of the graphical representations of the results.

Email Notification

Running a tool is usually an interactive process: the results are delivered directly to the browser when they become available. Depending on the tool and its input parameters, this may take quite a long time. It's possible to be notified by email when the job is finished by ticking the box "Be notified by email". An email with a link to the results will be sent to the email address specified in the corresponding text box. Email notifications require valid email addresses.

Email Address

If email notification is requested, then a valid Internet email address in the form joe@example.org must be provided. This is not required when running the tool interactively (the results will be delivered to the browser window when they are ready).
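
The submission rules described on this page (an input sequence is required, input is limited to 2MB, and an email address is needed only when notification is requested) can be sketched as a small pre-submission check. This is a hypothetical illustration and not the form's actual validation code. The function name, the simple email pattern and the reading of 2MB as 2 x 1024 x 1024 bytes are all assumptions.

```python
import re

MAX_INPUT_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB
# Deliberately simple pattern: something@something.something, no spaces.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_submission(sequence, notify_by_email=False, email=""):
    """Return a list of problems; an empty list means the form looks valid."""
    errors = []
    if not sequence.strip():
        errors.append("No input sequence provided.")
    if len(sequence.encode("utf-8")) > MAX_INPUT_BYTES:
        errors.append("Input exceeds the 2MB limit.")
    # The address is only checked when email notification is requested.
    if notify_by_email and not _EMAIL_RE.match(email):
        errors.append("A valid email address is required for notification.")
    return errors
```

Calling validate_submission(">s\nACGT\n", notify_by_email=True, email="joe@example.org") returns an empty list, while omitting the address returns one error message.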

References

EMBOSS: the European Molecular Biology Open Software Suite.
(2000 Jun) Trends in genetics : TIG 16 (6) :276-7
A new bioinformatics analysis tools framework at EMBL-EBI.
(2010 Jul) Nucleic acids research 38 (Web Server issue) :W695-9
Analysis Tool Web Services from the EMBL-EBI.
(2013 Jul) Nucleic acids research 41 (Web Server issue) :W597-600