BioModels Database

Development with BioModels Database source code

Guide explaining how to create a custom installation of the repository.

Requirements

Get the code

The source code is available via the Subversion server hosted on SourceForge. You can use the following URL for checking out the code: https://biomodels.svn.sourceforge.net/svnroot/biomodels. For example, to get the code for the repository, you can use the following command:

  svn co https://biomodels.svn.sourceforge.net/svnroot/biomodels/repo/trunk/ biomodels

You can browse the available files at: http://biomodels.svn.sourceforge.net/viewvc/biomodels/

There are various modules:

  • repo: code for the repository and its numerous features
  • database: schemas for the internal databases
  • convert: SBML related converters (so far, the converters maintained by us are available on the download section of the SourceForge project and information about the others is located in this folder)
  • web: sample HTML pages and page fragments, provided to get you started with running your local instance of the repository

Database

BioModels Database relies on two internal databases:

  • biomodels: stores metadata about the models, the annotations and the information needed by the curation and annotation pipeline. For better performance, it also includes subsets of several public resources: Gene Ontology, Taxonomy, Systems Biology Ontology and PubMed. A cron job updates the content from these external resources monthly.
  • web-auth: stores information about persons (authors of publications and users of the resource, including their role, which determines the privileges they have).

Currently these databases are provided by a MySQL server and the schemas are available on SourceForge. Another RDBMS could be used, provided that you change the database connector JAR and that its API does not differ too much from that of MySQL Connector/J.

For the curation and annotation pipeline, access to two additional external databases is required:

For this purpose, we currently connect directly to Oracle instances (not managed by us) running at the EBI; these are probably not available from outside the EBI. We plan to release a curation/annotation pipeline that does not rely on external databases. For example, the Web Services provided by UniProt or Dbfetch could be used instead.

Various samples of content are available at: http://biomodels.svn.sourceforge.net/viewvc/biomodels/database/sample/

Persons and users database

We use a database to store persons (authors of publications as well as users of BioModels Database). This database was designed to be shared by the various applications developed by the BioModels.net initiative. People can therefore use several applications with a single account, and have a different role in each of them.

auth_apps
List of all the applications that need an authentication feature.
auth_persons
Persons related information. Contains data about persons (authors of publications) and users (with an account to access an application).
auth_roles
List of all the roles for all the registered applications.
auth_roles_detail
List of the resources which can be accessed by each of the registered roles.
auth_users
List of the users: persons who can login on the registered applications.
auth_users_privileges
List of the roles users have (which can vary according to the application).

Users logged in as administrator can manage the users of BioModels Database by accessing the page admin-users.do and the persons by using the page admin-persons.do.

Build the war

The building process relies on Ant and the build.xml file available at the root of the repository source code.

Some files are not available via SVN because they contain sensitive information such as passwords. The two files you will need are a context file and a Log4J configuration file. You can get an example of these files below.

Use the following command to generate a war:

  ant [-Dversion=[demo|main|alpha]] [-Dmirror=[true|false]]

The mirror option generates a WAR without the restricted features (curation and annotation pipeline, administration, ...).
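As a concrete illustration of the usage line above, here is a minimal shell sketch that assembles an Ant invocation from the two options. The function name ant_war_command is hypothetical and not part of the build; it only accepts the values listed in the usage line and prints the command instead of running it.

```shell
# Hypothetical helper: print the ant command line for a version and a mirror flag.
# Only the values shown in the usage line above are accepted.
ant_war_command() {
  version=$1
  mirror=$2
  case "$version" in
    demo|main|alpha) ;;
    *) echo "unknown version: $version" >&2; return 1 ;;
  esac
  case "$mirror" in
    true|false) ;;
    *) echo "mirror must be true or false: $mirror" >&2; return 1 ;;
  esac
  echo "ant -Dversion=$version -Dmirror=$mirror"
}
```

For example, ant_war_command demo true prints ant -Dversion=demo -Dmirror=true, which can then be run from the root of the repository source code.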

Sample config files

Configuration

In order to run properly, BioModels Database needs the following specific configuration:

Servlet container configuration

Make sure your servlet container has the following environment variables defined (in the case of Tomcat, those should be declared in catalina.sh):

  export BIOMODELS_HOME="/path/to/additional/files/"
  export BIOMODELS_BASE="/path/to/web/files/"
  export BIOMODELS_WWW="URL of access"
  export BIOMODELS_WWW_INC="URL of some institution's general JS and CSS files"
  export BIOMODELS_WWW_GRAPHICS="URL of graphic files"
  export BIOMODELS_SVN_PATH="file:///path/to/subversion/repository/"

BioModels Database relies on a number of external libraries. Please ensure that your Tomcat has all the JAR files that reside in the project's lib/tomcat folder.

You should configure your HTTP server in order to have the content of BIOMODELS_BASE provided via the BIOMODELS_WWW URL.

You should choose a different URL than BIOMODELS_WWW for your Tomcat application endpoint.

For example, BIOMODELS_WWW can be equal to "http://www.ebi.ac.uk/biomodels/" and your Tomcat endpoint could be "http://www.ebi.ac.uk/biomodels-main/". Please don't forget the last forward slash in the URL!
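Since a missing trailing slash is easy to overlook, a small check can be run before starting the servlet container. The sketch below is only an illustration: the function names are hypothetical, and it assumes that all six variables should end with a forward slash, as the sample values above do.

```shell
# Return 0 if the given value ends with a forward slash, 1 otherwise.
ends_with_slash() {
  case "$1" in
    */) return 0 ;;
    *)  return 1 ;;
  esac
}

# Hypothetical check: report every BIOMODELS_* variable that is unset, empty
# or lacks a trailing slash. Prints one line per problem and returns the
# number of problems found (0 means everything looks fine).
check_biomodels_env() {
  problems=0
  for name in BIOMODELS_HOME BIOMODELS_BASE BIOMODELS_WWW \
              BIOMODELS_WWW_INC BIOMODELS_WWW_GRAPHICS BIOMODELS_SVN_PATH; do
    # Indirect lookup of the variable called $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      problems=$((problems + 1))
    elif ! ends_with_slash "$value"; then
      echo "$name has no trailing slash: $value"
      problems=$((problems + 1))
    fi
  done
  return "$problems"
}
```

Calling check_biomodels_env at the end of catalina.sh (or in a separate start-up script) would print any suspicious variable before Tomcat starts.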

config.properties file

To avoid too many hard-coded values, a config.properties file has been created. It contains elements such as URLs. The JSP and HTML files contain tokens which the Ant build replaces with values from config.properties when generating the WAR.

Token | Usage | Example of value
[REPO_ROOT_URL] | Root URL of the web application | http://www.ebi.ac.uk/biomodels/
[REPO_GRAPHICS_URL] | URL used to access some graphical content (images, icons, ...) | http://www.ebi.ac.uk/compneur-srv/GRAPHICS/
[WWW_ROOT_URL] | Root URL used to access some static files | http://www.ebi.ac.uk/
[WWW_STATIC_URL] | URL used to access some static files | http://www.ebi.ac.uk/compneur-srv/biomodels/

You need to change these values in order to match your configuration.
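To make the token mechanism more concrete, the sketch below reproduces the idea with sed. It is not the Ant build itself, only an illustration of the substitution: the function name replace_tokens is hypothetical, and the sketch assumes a properties file made of simple TOKEN=value lines, which may differ from the actual layout of config.properties.

```shell
# Hypothetical sketch: replace [TOKEN] placeholders read from standard input
# with the values found in a properties file given as first argument.
# Assumption: the file holds plain TOKEN=value lines; blank lines and lines
# starting with '#' are ignored.
replace_tokens() {
  props_file=$1
  script=""
  while IFS='=' read -r key value || [ -n "$key" ]; do
    case "$key" in ''|'#'*) continue ;; esac
    # Escape the characters that are special in a sed replacement (\ and &)
    # as well as the delimiter used below (|).
    escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
    # One substitution command per token, e.g. s|\[REPO_ROOT_URL]|http://...|g
    script="${script}s|\\[${key}]|${escaped}|g
"
  done < "$props_file"
  sed "$script"
}
```

For instance, replace_tokens my.properties < page.html would print page.html with every known token replaced; tokens without a matching entry are left untouched. Both file names are placeholders.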

Constants class

The Constants class contains a lot of variables used in various other Java classes.

You need to customise some of the values stored in this class to match your environment.

JVM options

  export JAVA_OPTS='-Dcom.sun.management.jmxremote -Xmx2048m -Djava.awt.headless=true'

LibSBML

For various features, BioModels Database relies on libSBML. Therefore, you need to have it installed.

  export LD_LIBRARY_PATH="/path/to/libsbml/lib/"
  export CLASSPATH="/path/to/libsbml/lib/libsbmlj.jar:."

Some versions of libSBML (more precisely, of its Java interface) can cause issues when used with Apache Tomcat. Please test the latest stable release of libSBML thoroughly before using it in production! A pure Java library for handling SBML files, jsbml, is under development and should solve these shortcomings.

Compute farm

Online simulation jobs are sent to a compute farm using LSF (currently the EBI cluster).

Filesystem

We currently use two main folders: BIOMODELS_HOME and BIOMODELS_BASE. The first one contains the various scripts and tools (mainly the converters). The second one contains all the web related elements (static pages, models, exports, ...) and is accessible via the URL BIOMODELS_WWW. The JavaScript files and CSS stylesheets are available at: https://biomodels.svn.sourceforge.net/svnroot/biomodels/web/.

In BIOMODELS_HOME, you will need to put the XML export of MIRIAM Resources.

Subversion

The models are stored under Subversion, so you will need to set up your own server. We also set up a web interface (a customised version of WebSVN) to allow curators to easily retrieve, via the curation interface, previous versions of a model.

Note that the latest version of each model is also stored on the filesystem. This is for performance reasons: model retrieval never goes through SVN. When a model is submitted or a curator replaces a model, an SVN commit is performed automatically.

File system structure

Here are the required folders (VERSION can be 'demo', 'main', ...). These folders reflect the curation pipeline in use. For more information, a figure of the curation pipeline can be found on the BioModels Database Wikipedia page.

uplo_dir = BIOMODELS_BASE + "models-" + VERSION + "/uplo/"
Contains the models uploaded to the repository (using the submission form).
cura_dir = BIOMODELS_BASE + "models-" + VERSION + "/cura/"
Contains models (and their export formats) which are in the curation phase of the pipeline.
cura_index = BIOMODELS_BASE + "models-" + VERSION + "/cura_index/"
Contains the Lucene index of the models in the curation phase.
anno_dir = BIOMODELS_BASE + "models-" + VERSION + "/anno/"
Contains models (and their export formats) which are in the annotation phase of the curated branch.
anno_index = BIOMODELS_BASE + "models-" + VERSION + "/anno_index/"
Contains the Lucene index of the models in the annotation phase of the curated branch.
anno_math_dir = BIOMODELS_BASE + "models-" + VERSION + "/anno_math/"
Contains the graphical exports (PNG files) of the MathML contained in the models in the annotation phase of the curated branch.
publ_dir = BIOMODELS_BASE + "models-" + VERSION + "/publ/"
Contains models (and their export formats) which have been published in the curated branch.
publ_index = BIOMODELS_BASE + "models-" + VERSION + "/publ_index/"
Contains the Lucene index of the models which have been published in the curated branch.
publ_math_dir = BIOMODELS_BASE + "models-" + VERSION + "/publ_math/"
Contains the graphical exports (PNG files) of the MathML contained in the models which have been published in the curated branch.
uncura_publ_dir = BIOMODELS_BASE + "models-" + VERSION + "/uncura_publ/"
Contains models (and their export formats) which have been published in the non-curated branch.
uncura_publ_index = BIOMODELS_BASE + "models-" + VERSION + "/uncura_publ_index/"
Contains the Lucene index of the models which have been published in the non-curated branch.
uncura_anno_dir = BIOMODELS_BASE + "models-" + VERSION + "/uncura_anno/"
Contains models (and their export formats) which are in the annotation phase of the non-curated branch.
uncura_anno_index = BIOMODELS_BASE + "models-" + VERSION + "/uncura_anno_index/"
Contains the Lucene index of the models in the annotation phase of the non-curated branch.
simu_dir = BIOMODELS_BASE + "models-" + VERSION + "/simu/"
Contains the results of the simulations performed during curation of the models. These files are mainly screenshots of simulators.
ode_simu_dir = BIOMODELS_BASE + "models-" + VERSION + "/ode_simu/"
Folder containing all the results of the online simulation tool. Older files from this folder are regularly deleted.
release_dir = BIOMODELS_BASE + "models-" + VERSION + "/release/"
Contains the archive of all the models, generated during each release.

The content of all these folders (except the ones containing Lucene indexes) is updated each time a model is modified and/or moved from one phase to another.
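The folder layout listed above can be created in one go. The following shell sketch is only an illustration: the function names are hypothetical, and it assumes that the base folder is given with a trailing slash, as in the sample value of BIOMODELS_BASE.

```shell
# Hypothetical helper: print the full path of every folder listed above,
# for a given base folder (with trailing slash) and version.
biomodels_dirs() {
  base=$1
  version=$2
  for name in uplo cura cura_index anno anno_index anno_math \
              publ publ_index publ_math \
              uncura_publ uncura_publ_index uncura_anno uncura_anno_index \
              simu ode_simu release; do
    printf '%s\n' "${base}models-${version}/${name}/"
  done
}

# Create all the folders; existing folders are left untouched (mkdir -p).
create_biomodels_dirs() {
  biomodels_dirs "$1" "$2" | while IFS= read -r dir; do
    mkdir -p "$dir" || exit 1
  done
}
```

For example, create_biomodels_dirs "$BIOMODELS_BASE" demo would create the sixteen folders under models-demo/. The Lucene index folders are created empty; their content still has to be generated as described in this section.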

The model indexes are updated manually during each release, or when models are published outside releases. There is a page for that purpose, generate-models-index.do, which can be accessed once logged in as a user with administrator privileges.

Converters

BioModels Database relies on various converters. Some are maintained by the BioModels.net team, some are developed by other groups.

Two archives are available for download from the SourceForge project page:

  • sbml-converters: SBML converters developed and maintained by the BioModels.net team.
  • biomodels-scripts: whole set of converters and scripts required for a fully functional instance of the repository.

To set up your local instance of the repository, download the latest release of biomodels-scripts, extract it somewhere and update "BIOMODELS_HOME" in Constants.java accordingly.

Dependencies

MathML to PNG

To display nicely rendered equations from the MathML contained in the models, we convert it into pictures (PNG). Originally, we used a customised version of this XSL to convert Content MathML into Presentation MathML, and then JEuclid to generate the pictures from the Presentation MathML.

JEuclid now includes the XSLT we previously used separately, and therefore supports Content MathML. The corresponding code in BioModels Database has been updated and no longer uses a separate XSL.

External connections

Various external connections are used:

  • CiteXplore: details about a publication are retrieved (during the model submission process) by using CiteXplore REST Web Services.
  • ChEBI: Web Services (via their Java library) used for the annotations

Curation and annotation pipeline

BioModels Database has been developed to provide high quality models. This means that after their submission, models go through a curation and an annotation phase. This ensures that, when published, models are valid, reproduce the results described in the reference publication and are annotated.

Access to the curation and annotation pipeline is limited to logged-in users with sufficient privileges.

An overview of the pipeline is shown below:

[Figure: BioModels Database pipeline]

Some information about the curation and annotation processes is available:

Web Services

BioModels Database provides access to its content via SOAP Web Services.

The server part is embedded in the repository source code. The WSDL can be accessed at the URL: http://path_to_local_install/services/BioModelsWebServices?wsdl.
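A quick way to check that the Web Services are deployed is to build the WSDL URL for your installation and fetch it. The sketch below is only an illustration: the function name wsdl_url is hypothetical, and the base URL used in the usage note is an assumed local Tomcat endpoint, not a value from this guide.

```shell
# Hypothetical helper: build the WSDL URL from the base URL of a local install.
# A trailing slash on the base URL, if present, is dropped to avoid "//".
wsdl_url() {
  base=${1%/}
  printf '%s\n' "${base}/services/BioModelsWebServices?wsdl"
}
```

For example, curl -fsS "$(wsdl_url http://localhost:8080/biomodels-main/)" should print the WSDL document if the application is running at that (assumed) address.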

The source code of the client will soon be available from the SourceForge SVN repository.

Need help?

You can contact biomodels-net-support AT lists.sourceforge.net if you have any questions. If you find a bug, please use the bug tracker on SourceForge.

Contributing

If you wish to contribute to the code base, please feel free to contact us at: biomodels-net-support AT lists.sourceforge.net.

You might also be interested in Jummp. This is a collaborative project that has been started to provide the next generation model repository infrastructure. Ultimately, BioModels Database will use this infrastructure.

