Where do the data come from?

IntAct gathers interaction data mainly through active, in-depth curation of the published literature on molecular interactions (Figure 5). Our curators read scientific articles that contain evidence of molecular interactions and represent those interactions following the IMEx consortium guidelines. This gives a consistent representation across the databases in the IMEx consortium, which aims to improve curation efficiency and to provide users of interaction data with a large, uniform set of interactions. In addition to classical, low-throughput experimental publications, data can be curated from large datasets produced by high-throughput protein–protein interaction projects. Direct user submissions of both high- and low-throughput datasets are also accepted.

Figure 5. Sources of interaction data available for curation.

IntAct data inputs and cross-referencing

The interactions represented in IntAct come from experimental data obtained with a wide range of methodologies (Figure 6A).

IntAct makes heavy use of cross-references to enrich the representation of its molecular interaction datasets (Figure 6B). For a given interaction, services inside and outside the EBI, such as UniProtKB, InterPro or the Gene Ontology, can be used to describe the interacting proteins, the domains that actually take part in the interaction and the functional context in which it takes place, whenever this information is available.
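One way to picture the cross-referencing described above is as an interaction record that carries a list of pointers into other databases. The following Python sketch is purely illustrative: the class names, field names and identifiers are hypothetical placeholders, not IntAct's actual data model or real accessions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class CrossReference:
    """Hypothetical pointer from an interaction into an external resource."""
    database: str    # e.g. "UniProtKB", "InterPro", "Gene Ontology"
    identifier: str  # accession in that resource (placeholders below)
    describes: str   # what aspect of the interaction it annotates


@dataclass
class Interaction:
    """Hypothetical interaction record enriched with cross-references."""
    label: str
    xrefs: list[CrossReference] = field(default_factory=list)

    def by_database(self) -> dict[str, list[str]]:
        """Group cross-reference identifiers by the database they point to."""
        grouped: dict[str, list[str]] = {}
        for xref in self.xrefs:
            grouped.setdefault(xref.database, []).append(xref.identifier)
        return grouped


# All identifiers are placeholders, not real accessions.
example = Interaction(
    label="hypothetical-interaction",
    xrefs=[
        CrossReference("UniProtKB", "PLACEHOLDER_PROTEIN_A", "interacting protein"),
        CrossReference("UniProtKB", "PLACEHOLDER_PROTEIN_B", "interacting protein"),
        CrossReference("InterPro", "PLACEHOLDER_DOMAIN", "interacting domain"),
        CrossReference("Gene Ontology", "PLACEHOLDER_TERM", "functional context"),
    ],
)

print(example.by_database())
```

Grouping by database mirrors the three roles named in the text: protein identity (UniProtKB), interacting domains (InterPro) and functional context (Gene Ontology).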

Figure 6. IntAct data input pipeline: [A] manual curation based on published literature and direct submissions (including high-throughput projects). [B] Cross-references to several relevant databases allow users to access a variety of other resources (e.g. ChEBI, UniProt).