Intein N-terminal splicing region (IPR006141)

Short name: Intein_N


Inteins, or protein introns, are parts of protein sequences that are post-translationally excised, their flanking regions (exteins) being spliced together to yield an additional protein product [PMID: 7756989, PMID: 8165123]. This process is believed to be self-catalysed, apparently initiating at the C-terminal splice junction, where a conserved asparagine residue mediates the nucleophilic attack on the peptide bond between it and its neighbouring residue. Most inteins consist of two domains: one is involved in autocatalytic splicing, and the other is an endonuclease that is important in the spread of inteins [PMID: 12142479].

Inteins are between 134 and 608 amino acids long and are found in eukaryotes, bacteria and archaea, most frequently in archaea. They occur in proteins with diverse functions, including metabolic enzymes, DNA and RNA polymerases, proteases, ribonucleotide reductases, and the vacuolar-type ATPase. However, enzymes involved in DNA replication and repair appear to dominate. Inteins are found in conserved regions of conserved proteins and can be regarded as parasitic genetic elements [PMID: 12142479].

The splicing of inteins initiates at the C-terminal splice junction. The delta-nitrogen of a conserved asparagine residue makes a nucleophilic attack on the peptide bond that links this asparagine to the next residue. The next residue (a Cys, Ser or Thr) is then free to attack the peptide bond at the N-terminal splice junction by a transpeptidation reaction that releases the intein and creates a new peptide bond. This mechanism is schematised in the following figures.

 1) Primary translation product

      +---------------+  +-------------+  +--------------+
  NH2-| Extein 1      x--y Intein      N--z Extein 2     |-COOH
      +---------------+  +-------------+  +--------------+

 2) Breakage of the peptide bond at the C-terminal splice junction by
    nucleophilic attack of the asparagine.

      +---------------+  +-------------+      +--------------+
  NH2-| Extein 1      x--y Intein      N  NH2-z Extein 2     |-COOH
      +---------------+  +-------------+      +--------------+

 3) Transpeptidation to produce the final products.

      +---------------+  +-------------+           +--------------+
  NH2-| Extein 1      x--z Extein 2    |-COOH  NH2-y Intein       N
      +---------------+  +-------------+           +--------------+
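
The three steps above can be mimicked on a plain amino-acid string. The following is only a minimal sketch: the function name `splice`, the 0-based end-exclusive index convention and the toy sequence in the usage note are assumptions made for illustration, not part of the entry. It treats splicing as string surgery and checks only the two junction residues named in the text (a terminal Asn in the intein, and a Cys, Ser or Thr at the start of extein 2).

```python
from typing import Tuple


def splice(precursor: str, intein_start: int, intein_end: int) -> Tuple[str, str]:
    """Return (spliced_exteins, excised_intein) for a toy precursor.

    intein_start and intein_end are 0-based, end-exclusive indices
    (a convention chosen for this sketch only).
    """
    if not (0 < intein_start < intein_end < len(precursor)):
        raise ValueError("the intein must lie strictly inside the precursor")

    extein1 = precursor[:intein_start]
    intein = precursor[intein_start:intein_end]
    extein2 = precursor[intein_end:]

    # Step 2: the last intein residue must be Asn (N), which attacks
    # the peptide bond at the C-terminal splice junction.
    if intein[-1] != "N":
        raise ValueError("last intein residue is not Asn")

    # Step 3: the first residue of extein 2 must be Cys, Ser or Thr,
    # which attacks the N-terminal splice junction.
    if extein2[0] not in "CST":
        raise ValueError("first residue of extein 2 is not Cys, Ser or Thr")

    # Final products: the ligated exteins and the released intein.
    return extein1 + extein2, intein
```

With a made-up 13-residue precursor, `splice("MKVCLAGDHNSGE", 3, 10)` yields the ligated exteins `MKVSGE` and the excised intein `CLAGDHN`. The sketch says nothing about reaction chemistry or kinetics; it only reproduces the connectivity shown in the figures.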

Most inteins are bifunctional proteins mediating both protein splicing and DNA cleavage. The domain involved in splicing is formed by the two terminal splicing regions, which are separated by a small linker in mini-inteins or a homing endonuclease of 200-250 amino acids in larger inteins [PMID: 11092822, PMID: 10592269]. The N-terminal splicing region spans about 100 N-terminal amino acids and contains the conserved intein blocks A and B, which are similar to the motifs found in the C-terminal autoprocessing domain of the hedgehog protein. The C-terminal splicing region is composed of the two conserved blocks F and G, located in the last 50 or so C-terminal amino acids. Although no single residue is invariant, the Ser and Cys in block A, the His in block B, and the His, Asn and Ser/Cys/Thr in block G are the most conserved residues in the splicing motifs. Protein splicing requires neither cofactors nor auxiliary enzymes and involves a series of four intramolecular reactions in which several of these most conserved residues are implicated [PMID: 11092822, PMID: 9092614].

This entry represents the N-terminal splicing region, which covers the intein blocks A and B. It starts with the first N-terminal amino acid of the intein.

GO terms

Biological Process

GO:0016539 intein-mediated protein splicing

Molecular Function

No terms assigned in this category.

Cellular Component

No terms assigned in this category.

Contributing signatures

Signatures from InterPro member databases are used to construct an entry.
PROSITE profiles