
v.3.1-SNAPSHOT

Curated Datasets

Curated datasets are publications tagged, either computationally or manually by a curator, as being relevant to a specific area of biology. These are actively maintained and grow with every release.
To request a new dataset relevant to your work, email intact-help@ebi.ac.uk.

Manually selected datasets

  • Alzheimers - Interaction dataset based on proteins with an association to Alzheimer's disease

    This dataset was compiled and curated in collaboration with Perreau V.M., University of Melbourne, Australia. Interactions were investigated in the context of Alzheimer's disease, with a particular focus on the APP (A4) protein. The articles to be curated were selected on the basis of protein annotations and literature scanning.

    Publications based on this dataset: PMID: 20391539

  • BioCreative - Critical Assessment of Information Extraction systems in Biology

    The BioCreative dataset is a large set of publications from the Journal of Biological Chemistry (2006) and the Nature publishing house, manually curated by IntAct curators. It was used in the BioCreative II (Critical Assessment of Information Extraction systems in Biology) Protein-Protein Interaction Task, which focused on predicting protein interactions from full-text articles such as those represented in this dataset. IntAct provides the dataset as a resource for text-mining development and testing. A data file (source-text.txt) that maps each IntAct interaction to the sentence(s) of the publication that allowed an IntAct curator to identify it is available here.

    Publications based on this dataset: PMID: 18834496, 18834487, 19208158

  • Cancer - Interactions investigated in the context of cancer

    This dataset consists of interactions of proteins involved in cancer. Publications of interest are identified through an ongoing literature survey, and protein annotations are also considered when choosing the publications to be curated.

  • Chromatin - Epigenetic interactions resulting in chromatin modulation

    Chromatin-relevant protein-protein interaction studies have been curated by IntAct curators from peer-reviewed literature. They comprise interactions involved in modulating, modifying or forming chromatin. The dataset aims to capture the major epigenetic interactions that result in chromatin modulation. Most of the publications were taken from the 'Chromatin Papers ListServe' maintained by Bone J.

  • Complexes - Interaction dataset based on curated protein complexes

    The Complexes dataset contains modelled protein complexes curated by IntAct curators from peer-reviewed literature.

  • Cyanobacteria - Interaction dataset based on Cyanobacteria proteins and related species

    This dataset was obtained in a collaborative effort with Franck Chauvat, Corinne Cassier-Chauvat, Jean-Christophe Aude, Magali Michaut, and Pierre Legrain from DBJC, CEA Saclay, Gif-Sur-Yvette, France. Cyanobacteria such as Synechocystis sp. can be used as model organisms because they carry out both oxidative respiration and photosynthesis. Cyanobacteria share many features with other bacteria, including a lack of compartmentalisation. This dataset gathers articles showing interactions relevant to plant photosynthesis, redox metabolism, and resistance to metal and oxidative stress. Most interactors belong to Cyanobacteria species, with a focus on Synechocystis sp. (strain PCC 6803), TaxID 1148, but some belong to species in which biological events seen in Cyanobacteria may also occur, such as plants for photosynthesis. The dataset also contains a number of hybrid experiments using electron transfer between proteins from different species.

    Publications based on this dataset: PMID: 18508856


Computationally maintained datasets

These datasets are computationally maintained, but a curator may manually add further papers to a set during the curation process. When datasets are computationally added to a publication, large-scale papers (more than 100 interactions per experiment) are excluded.

  • AFCS - Interactions from the Alliance for Cellular Signaling database

    This dataset was obtained from the Alliance for Cellular Signaling database. The Alliance for Cellular Signaling (AfCS) consisted of around 20 institutions engaged in a collaborative effort to investigate and understand cellular signalling networks (http://www.afcs.org/). The AfCS used high-throughput methods to detect protein-protein interactions between signalling molecules expressed in B cells and cardiac myocytes, and arranged a collaboration with Myriad Genetics to perform large-scale yeast two-hybrid screens. IntAct acted as a repository for the protein-protein interaction data generated by the AfCS project.

  • Apoptosis - Interactions involving proteins with a function related to apoptosis

    Apoptosis-relevant protein-protein interaction studies are curated by IntAct curators from peer-reviewed literature. This dataset is a resource for biologists seeking to understand protein interaction networks and cell death. Small-scale interactions involving proteins annotated with the GO term "Apoptosis" are included in this set.

  • Archaea - Interaction dataset based on Archaea proteins

    Archaea are phylogenetically very different from Bacteria and Eukarya and differ from other forms of life in many aspects of their biochemistry. Because of this interest, curated peer-reviewed literature is scanned for interactions involving proteins from this group.

  • PDBe - Data obtained from the Protein Data Bank in Europe

    The Protein Data Bank in Europe (PDBe) is the European project for the collection, management and distribution of data about macromolecular structures, in collaboration with the Worldwide Protein Data Bank (wwPDB). IntAct has incorporated a subset of the data from this database involving heterodimeric protein interactions.

  • NDPK - Interactions involving proteins containing InterPro domain IPR001564, Nucleoside diphosphate kinase, core.

    NDPKs, which play a major role in the synthesis of nucleoside triphosphates other than ATP, also possess other enzymatic activities and are required for cell proliferation, differentiation and development.

    Publications based on this dataset: PMID: 19415463

  • Synapse - Interactions of proteins with an established role in the presynapse.

    This dataset was created for protein-protein interactions involving at least one protein with an established link to the synapse. The list of human, rat and mouse gene names used to maintain this dataset computationally is available here. Interactions made by orthologous proteins have been added manually by IntAct curators.

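
The exclusion rule stated for computationally maintained datasets (large-scale papers with more than 100 interactions per experiment are left out) can be illustrated with a short Python sketch. This is not IntAct's actual implementation: the function names, the dictionary input and the publication identifiers are hypothetical, and reading the rule as "any single experiment in the publication exceeds 100 interactions" is an assumption.

```python
from typing import Dict, List

# Threshold taken from the text: "more than 100 interactions per experiment".
LARGE_SCALE_THRESHOLD = 100


def is_large_scale(interactions_per_experiment: List[int]) -> bool:
    """Return True if any experiment reports more than the threshold.

    Assumption: one oversized experiment is enough to make the whole
    publication count as large scale.
    """
    return any(n > LARGE_SCALE_THRESHOLD for n in interactions_per_experiment)


def eligible_publications(publications: Dict[str, List[int]]) -> List[str]:
    """List the publications that may be tagged computationally.

    `publications` maps a (hypothetical) publication identifier to the
    number of interactions reported by each of its experiments.
    """
    return sorted(
        pub_id
        for pub_id, counts in publications.items()
        if not is_large_scale(counts)
    )


if __name__ == "__main__":
    example = {
        "pubA": [3, 12],   # small scale: kept
        "pubB": [250],     # large scale: excluded
        "pubC": [100],     # exactly 100 is not "more than 100": kept
    }
    print(eligible_publications(example))
```

A curator could still add an excluded paper to the set by hand, as the text notes; the sketch covers only the automatic step.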

Species-based datasets

Species-specific datasets are generated from the protein-protein interaction data curated from peer-reviewed journals and are available here. The data are grouped according to the taxonomy of the proteins taking part in the interaction. An analysis of one such dataset, involving Arabidopsis proteins, is discussed in PMID: 20371643.
