Glossary

Accession

In bioinformatics, an accession number, or simply accession, is a unique identifier given to a DNA or protein sequence record to allow tracking of different versions of that record and the associated sequence over time in a single data repository. Because of their relative stability, accession numbers can be used as foreign keys for referring to a sequence object, but not necessarily to a unique sequence. All sequence information repositories implement the concept of “accession number” but might do so with subtle variations.

Annotated EMDB-SFF file

After annotation, an EMDB-SFF file becomes an annotated EMDB-SFF file, which may or may not include the segmentation geometry. Merging an annotated EMDB-SFF file that has no geometry with the corresponding unannotated EMDB-SFF file that has geometry results in a complete annotated EMDB-SFF file.
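
The merge described above can be sketched in Python using plain dictionaries. This is only an illustration: the keys `segments`, `id`, `annotation` and `geometry`, and the function `merge_annotation`, are hypothetical and are not the actual EMDB-SFF schema or any tool's API. It assumes segments in the two files can be matched by a shared identifier.

```python
import copy


def merge_annotation(annotated, with_geometry):
    """Return a copy of `with_geometry` carrying the annotations from `annotated`.

    Both arguments are hypothetical, simplified stand-ins for EMDB-SFF files.
    Segments are matched on their "id" value.
    """
    merged = copy.deepcopy(with_geometry)  # leave the inputs untouched
    notes_by_id = {seg["id"]: seg.get("annotation") for seg in annotated["segments"]}
    for seg in merged["segments"]:
        if seg["id"] in notes_by_id:
            seg["annotation"] = notes_by_id[seg["id"]]
    return merged


# Annotated file without geometry (hypothetical content)
annotated = {"segments": [{"id": 1, "annotation": {"name": "ribosome"}}]}

# Unannotated file with geometry (hypothetical content)
with_geometry = {
    "segments": [
        {"id": 1, "annotation": None, "geometry": [[0, 0, 0], [1, 0, 0], [0, 1, 0]]}
    ]
}

complete = merge_annotation(annotated, with_geometry)
```

The result keeps the geometry from the unannotated file and takes the annotation from the annotated one, which is what makes it a complete annotated file in the sense used here.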

Annotation

An annotation is extra information associated with a particular point in a document or other piece of information added by way of explanation or commentary.

Archive

An archive is an accumulation of historical records or materials – in any medium – or the physical facility in which they are located.

CLI

A command-line interpreter or command-line processor uses a command-line interface (CLI) to receive commands from a user in the form of lines of text. This provides a means of setting parameters for the environment, invoking executables and providing information to them as to what actions they are to perform.

Data model

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities.

EMDB-SFF

The EMDB Segmentation File Format is an open, interoperable data model to describe both geometrical annotations (segmentations) and textual annotations of electron microscopy images, typically in the EMDB MAP format.

Geometrical annotations

Another name for a segmentation, consisting of one or more geometrical representations of the regions of interest in an image (2D/3D).

HDF5

Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data. The current version, HDF5, differs significantly in design and API from the major legacy version HDF4.

Header

The part of a file that summarises metadata about the file, which can be used to determine various capabilities of the data.

JSON

JSON (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
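
As a small illustration of attribute–value pairs and arrays, the following Python sketch serialises an object to JSON text and reads it back using the standard `json` module. The field names in `record` are made up for the example and do not come from EMDB-SFF.

```python
import json

# A data object with attribute-value pairs; "colour" holds an array
record = {"name": "segment 1", "colour": [0.5, 0.2, 0.8], "visible": True}

text = json.dumps(record)    # human-readable text representation
restored = json.loads(text)  # back to an equivalent Python object
```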

Ontology

In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse.

Open

In the field of information technology, open or more generally, open source, refers to a specification or implementation whose code artefacts are fully available for inspection and modification by third parties.

Project

In SAT, a project is a way of organising a group of related segmentations. It houses zero or more segmentations.

Segmentation

In SAT, a segmentation refers specifically to a JSON EMDB-SFF file containing one or more segments. All geometry must have been removed from the file to make it easier to work with online.

XML

Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
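
As a small illustration of encoding data as XML, the following Python sketch builds a tiny document with the standard `xml.etree.ElementTree` module, writes it out as text and parses it back. The element and attribute names are made up for the example and do not come from EMDB-SFF.

```python
import xml.etree.ElementTree as ET

# Build a small document: one element with an attribute and a child element
root = ET.Element("segment", attrib={"id": "1"})
name = ET.SubElement(root, "name")
name.text = "ribosome"

text = ET.tostring(root, encoding="unicode")  # serialise to a str
parsed = ET.fromstring(text)                  # parse the text back into elements
```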

Most of the definitions in this glossary are taken from Wikipedia, with the corresponding links provided in each definition.