Manchester Centre for Integrative Systems Biology

Systematic integration of experimental data and models in systems biology


Mathematical models are key in studying the behaviour of biological systems. However, multiple sources of data in diverse forms are required in the construction of a model in order to define its components and their biochemical reactions, and corresponding parameters. The integration of these data involves the interoperation of data and analytical resources which can be implemented as workflows. The example set of Taverna workflows below shows how SBML models of yeast metabolic networks can be constructed, parameterised and calibrated for use in simulations.


User guide

The workflows require the use of libsbml version 3 in order to generate models in SBML format. The libsbml library has to be installed on your computer and its location made known to Taverna in order for it to be used during workflow execution.


Installation of libsbml for Taverna

libsbml 3 has been tested with the latest version of Taverna, 2.1.2. To enact SBML workflows, Taverna needs access to the Java binding of libSBML as well as to the underlying native library built from the C and C++ source code for libsbml. Platform-specific versions of libsbml 3 are available as zip files below.
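As a quick check after unpacking a zip file, the short Python sketch below lists which files in a directory look like Java archives and which look like native libraries. It is only an illustration: the file suffixes it tests for are assumptions about typical library naming on each platform, and the function names are hypothetical rather than part of Taverna or libsbml.

```python
import os

# Assumed suffixes for native libraries (Windows, Linux, Mac OS X).
# Adjust these to match the files actually contained in your zip file.
NATIVE_SUFFIXES = (".dll", ".so", ".jnilib", ".dylib")


def classify_libsbml_files(filenames):
    """Split file names into (jars, natives), each sorted alphabetically."""
    jars, natives = [], []
    for name in sorted(filenames):
        lower = name.lower()
        if lower.endswith(".jar"):
            jars.append(name)
        elif lower.endswith(NATIVE_SUFFIXES) or ".so." in lower:
            # ".so." also catches versioned shared objects such as libfoo.so.3
            natives.append(name)
    return jars, natives


def check_lib_dir(lib_dir):
    """Return True if lib_dir holds at least one jar and one native library."""
    jars, natives = classify_libsbml_files(os.listdir(lib_dir))
    return bool(jars) and bool(natives)
```

For example, check_lib_dir(os.path.expanduser("~/.taverna-2.1.2/lib")) could be run on Linux after extraction. On Mac OS X the jar files and native libraries live in different folders, so each folder would need to be inspected separately with classify_libsbml_files.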


Installation of the libsbml libraries and configuration of Taverna vary according to your platform.

Windows XP

Extract the zip file into the lib directory of your taverna.home folder, C:\Documents and Settings\username\Application Data\taverna-2.1.2. Then edit the taverna-debug.bat file in your Taverna installation directory, making the changes shown below (the lines following the "Required for dependencies on libsbml" comment):

@ECHO OFF

REM Taverna startup script

REM go to the distribution directory
pushd "%~dp0"

REM 300 MB memory, 140 MB for classes
set ARGS=-Xmx300m -XX:MaxPermSize=140m

REM Internal system properties
set ARGS=%ARGS% -Draven.profile=file:conf/current-profile.xml
set ARGS=%ARGS% -Djava.system.class.loader=net.sf.taverna.raven.prelauncher.BootstrapClassLoader
set ARGS=%ARGS% -Dsun.swing.enableImprovedDragGesture
set ARGS=%ARGS% -Dtaverna.startup=.

REM Required for dependencies on libsbml
set LIB_PATH=%APPDATA%\taverna-2.1.2\lib
set PATH=%PATH%;%LIB_PATH%
set ARGS=%ARGS% "-Djava.library.path=%LIB_PATH%"

java %ARGS% -jar lib\prelauncher-*.jar
pause

REM restore current directory
popd


Linux

On Linux, extract the zip file into your $HOME/.taverna-2.1.2/lib folder. Then edit your Taverna taverna.sh script, making the changes shown below (the block following the "Required for dependencies on libsbml" comment and the -Djava.library.path option passed to java):

#!/bin/sh

## resolve links - $0 may be a symlink
PRG="$0"
progname=`basename "$0"`
saveddir=`pwd`

# need this to resolve relative symlinks
cd "`dirname "$PRG"`"

while [ -h "$PRG" ] ; do
    ls=`ls -ld "$PRG"`
    link=`expr "$ls" : '.*-> \(.*\)$'`
    if expr "$link" : '.*/.*' > /dev/null; then
        PRG="$link"
    else
        PRG=`dirname "$PRG"`"/$link"
    fi
done

# Required for dependencies on libsbml
LIB_PATH="$HOME/.taverna-2.1.2/lib"
export LIB_PATH
LD_LIBRARY_PATH="$LIB_PATH"
export LD_LIBRARY_PATH

TAVERNA_HOME="`dirname "$PRG"`"
cd $TAVERNA_HOME

# 300 MB memory, 140 MB for classes
exec java -Xmx300m -XX:MaxPermSize=140m \
    -Djava.library.path=$LIB_PATH \
    -Draven.profile=file:conf/current-profile.xml \
    -Dtaverna.startup=. \
    -Djava.system.class.loader=net.sf.taverna.raven.prelauncher.BootstrapClassLoader \
    -Dapple.laf.useScreenMenuBar=true \
    -Dapple.awt.graphics.UseQuartz=false \
    -Dsun.swing.enableImprovedDragGesture \
    -jar lib/prelauncher-*.jar

Mac OS X

On Mac OS X, the libsbml *.jni and *.dylib native libraries should be placed in the Java folder inside your Taverna application bundle. For example, if Taverna was installed in the Applications folder, the jni and dylib libraries should be moved into /Applications/Taverna 2.1.app/Contents/Resources/Java. In contrast, the *.jar files should be placed in Taverna's lib folder, which is located under your home folder at, for example, /Users/peterli/Library/Application Support/taverna-2.1.2/lib.


Automated installation of libsbml for Taverna

Installers for semi-automated installation of the libsbml Java and native libraries for Windows, Linux and Mac OS X are available below. These installers also make the necessary changes to the Taverna startup script where required. To use an installer, download the file for your platform, extract it if required, and run the executable. You will need to tell the installer where Taverna has been installed on your filesystem.


Example systems biology workflows

Four sets of workflows have been developed for the network construction, parameterisation, calibration and simulation of systems biology models described in SBML format. The informatics infrastructure (right) is currently being used by the MCISB to manage the data required for modelling metabolic networks in yeast. These are the data that the workflows integrate into SBML models.

Information about yeast metabolic reactions originates from Herrgard et al. (2008) and is stored in a SQLite database. Measured concentrations of enzymes and metabolites are stored in a key results database. Kinetic parameters for metabolic reactions are stored in the SABIO-RK database. All three databases have been deployed as web services, enabling their data to be queried and accessed programmatically. The COPASI web service provides tools for calibrating SBML models and running simulations with them.



Workflow 1: createSbmlModelByOrfs.t2flow

The first stage in constructing a mathematical model of a metabolic network is to create the network structure of the system. This is represented by a list of components (metabolites and enzymes) in the system and their relationships with one another. This information is obtained from the web service that has been deployed for a SQLite database containing data from a genome-scale consensus network of yeast metabolism (Herrgard et al., 2008).

Given a list of yeast open reading frame (ORF) numbers as input, the workflow obtains information about the metabolites involved in the reactions catalysed by the enzymes that those ORF numbers represent. This information is integrated into an SBML model using classes and methods from the libSBML API, which have been exposed as workflow components by API consumer activities in the workflow (Li et al., 2008).
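To give a feel for the kind of structure this step assembles, here is a minimal Python sketch that writes an SBML Level 2 skeleton from a list of reactions. It is not the workflow's implementation: it uses Python's standard XML library rather than libSBML, the function name and input layout are hypothetical, the single "cell" compartment is an assumption, and the output has not been checked with an SBML validator. The example reaction and ORF identifier are illustrative only.

```python
import xml.etree.ElementTree as ET

# SBML Level 2 Version 4 namespace (the level/version choice is an assumption)
SBML_NS = "http://www.sbml.org/sbml/level2/version4"


def build_sbml(model_id, reactions):
    """Build an SBML skeleton from reaction dicts.

    Each dict has the keys "id", "enzyme", "substrates" and "products".
    """
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "4"})
    model = ET.SubElement(sbml, "model", {"id": model_id})

    # One compartment holding everything (a simplifying assumption)
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell", "size": "1"})

    # Declare each metabolite and enzyme once, in order of first appearance
    species_list = ET.SubElement(model, "listOfSpecies")
    seen = []
    for r in reactions:
        for sid in r["substrates"] + r["products"] + [r["enzyme"]]:
            if sid not in seen:
                seen.append(sid)
                ET.SubElement(species_list, "species",
                              {"id": sid, "compartment": "cell"})

    # Each reaction lists its reactants, products and catalysing enzyme
    reaction_list = ET.SubElement(model, "listOfReactions")
    for r in reactions:
        rxn = ET.SubElement(reaction_list, "reaction", {"id": r["id"]})
        reactants = ET.SubElement(rxn, "listOfReactants")
        for sid in r["substrates"]:
            ET.SubElement(reactants, "speciesReference", {"species": sid})
        products = ET.SubElement(rxn, "listOfProducts")
        for sid in r["products"]:
            ET.SubElement(products, "speciesReference", {"species": sid})
        modifiers = ET.SubElement(rxn, "listOfModifiers")
        ET.SubElement(modifiers, "modifierSpeciesReference",
                      {"species": r["enzyme"]})

    return ET.tostring(sbml, encoding="unicode")


# Illustrative input: one reaction keyed by a yeast ORF identifier
example = [{"id": "R_HXK", "enzyme": "YFR053C",
            "substrates": ["glucose", "ATP"], "products": ["G6P", "ADP"]}]
sbml_text = build_sbml("demo_model", example)
```

Enzymes are recorded as reaction modifiers here, which is one common way to represent catalysts in SBML; the actual workflow may encode them differently.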

The output of this workflow is an SBML file and a GraphViz representation of the model.
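A GraphViz representation of such a network can be as simple as a DOT digraph with metabolites and reactions as nodes. The following Python sketch shows one possible layout under that assumption; the function name, the input format and the styling choices are hypothetical and do not describe the output the workflow actually produces.

```python
def reactions_to_dot(reactions):
    """Render reaction dicts (id, enzyme, substrates, products) as DOT text."""
    lines = ["digraph model {"]
    for r in reactions:
        rid = r["id"]
        # Reactions are drawn as boxes to tell them apart from metabolites
        lines.append('  "%s" [shape=box];' % rid)
        for s in r["substrates"]:
            lines.append('  "%s" -> "%s";' % (s, rid))
        for p in r["products"]:
            lines.append('  "%s" -> "%s";' % (rid, p))
        # The catalysing enzyme is linked with a dashed edge
        lines.append('  "%s" -> "%s" [style=dashed];' % (r["enzyme"], rid))
    lines.append("}")
    return "\n".join(lines)


# Illustrative input only
dot_text = reactions_to_dot([{"id": "R_HXK", "enzyme": "YFR053C",
                              "substrates": ["glucose", "ATP"],
                              "products": ["G6P", "ADP"]}])
```

The resulting text can be saved to a .dot file and rendered with the GraphViz dot tool.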



Workflow 2: parameteriseSbmlModel.t2flow

The SBML model created by the first workflow must be furnished with kinetic parameters before it can be used in simulations. This workflow obtains that quantitative information from two databases in the MCISB informatics infrastructure. Firstly, the SABIO-RK database is used to obtain kinetic parameters pertaining to specific metabolic reactions in the SBML model. Secondly, the key results database is used to obtain measured concentrations of enzymes and metabolites from yeast cultures.

An SBML model containing parameterised reactions is generated by this workflow.
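The effect of parameterisation on the SBML document can be sketched in a few lines of Python. The example below assumes that the database lookups have already been reduced to two plain dictionaries (kinetic parameters per reaction and concentrations per species); the function name, the dictionary layout and all numeric values are made-up placeholders, not data from SABIO-RK or the key results database. For brevity the sketch omits the rate-law math that a complete SBML kineticLaw would also carry.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sbml.org/sbml/level2/version4"  # assumed SBML level/version
ET.register_namespace("", NS)  # keep tags unprefixed when writing back out


def parameterise(sbml_text, kinetic_params, concentrations):
    """Attach initial concentrations to species and parameters to reactions."""
    root = ET.fromstring(sbml_text)

    for species in root.iter("{%s}species" % NS):
        sid = species.get("id")
        if sid in concentrations:
            species.set("initialConcentration", str(concentrations[sid]))

    # list(...) so that adding children does not disturb the iteration
    for rxn in list(root.iter("{%s}reaction" % NS)):
        params = kinetic_params.get(rxn.get("id"))
        if not params:
            continue
        law = ET.SubElement(rxn, "{%s}kineticLaw" % NS)
        plist = ET.SubElement(law, "{%s}listOfParameters" % NS)
        for name, value in params.items():
            ET.SubElement(plist, "{%s}parameter" % NS,
                          {"id": name, "value": str(value)})

    return ET.tostring(root, encoding="unicode")


# Minimal input model and placeholder values, for illustration only
source = (
    '<sbml xmlns="%s" level="2" version="4"><model id="m">'
    '<listOfSpecies><species id="glucose" compartment="cell"/></listOfSpecies>'
    '<listOfReactions><reaction id="R_HXK"/></listOfReactions>'
    '</model></sbml>' % NS
)
parameterised = parameterise(source,
                             {"R_HXK": {"Km": 0.1, "Vmax": 2.0}},
                             {"glucose": 5.0})
```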



Workflow 3: optimiseSbmlModel.t2flow

For simulations to produce results that better represent the real biological system, the SBML model can be optimised against experimental data that provide measurements of the system's variables and parameters. This workflow does so using optimisation algorithms provided by COPASI, which have been deployed as a web service by COPASIWS.

This workflow outputs an SBML model whose reaction parameters have been modified according to the provided experimental data.
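The idea behind this calibration step is to adjust parameters until the simulated values match the measurements as closely as possible. The Python sketch below illustrates this with a deliberately simple stand-in: a one-parameter exponential decay model fitted by a repeatedly refined grid search that minimises the sum of squared errors. This is not one of COPASI's optimisation algorithms, and the model, function names and data are hypothetical.

```python
import math


def simulate(k, s0, times):
    """Toy model: first-order decay S(t) = s0 * exp(-k * t)."""
    return [s0 * math.exp(-k * t) for t in times]


def sum_squared_error(k, s0, times, observed):
    """Objective function: squared distance between model and data."""
    predicted = simulate(k, s0, times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_rate_constant(s0, times, observed, lo=0.0, hi=10.0,
                      rounds=6, points=21):
    """Refine a grid around the best candidate for a fixed number of rounds."""
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        candidates = [lo + i * step for i in range(points)]
        best = min(candidates,
                   key=lambda k: sum_squared_error(k, s0, times, observed))
        # Narrow the search window to the neighbours of the best candidate
        lo, hi = max(best - step, 0.0), best + step
    return best


# Synthetic "measurements" generated from a known rate constant of 0.5
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = simulate(0.5, 10.0, times)
fitted_k = fit_rate_constant(10.0, times, observed)
```

With noise-free synthetic data the search recovers a value close to the rate constant used to generate it; real measurements would leave a non-zero residual error.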



Workflow 4: runSbmlTimeCourseSim.t2flow

The COPASI web service is also used to perform a time course simulation that determines the concentrations of metabolites over a period of time. This workflow uses the calibrated and parameterised SBML model produced by the preceding three workflows. The results of the COPASI time course simulation are represented in SBRML format. The workflow also plots these results as a graph using an R script.
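To illustrate what a time course simulation computes, the Python sketch below integrates a single irreversible Michaelis-Menten reaction (S to P) with a fixed-step fourth-order Runge-Kutta scheme and returns one row of concentrations per time point. It is a toy stand-in, not COPASI's solver; the function names and all parameter values are hypothetical.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def time_course(s0, p0, vmax, km, t_end, steps):
    """Integrate dS/dt = -v, dP/dt = +v; return rows of (time, S, P)."""
    dt = t_end / steps
    s, p = s0, p0
    rows = [(0.0, s, p)]
    for i in range(steps):
        # Classical fourth-order Runge-Kutta step for the substrate
        k1 = -michaelis_menten_rate(s, vmax, km)
        k2 = -michaelis_menten_rate(s + 0.5 * dt * k1, vmax, km)
        k3 = -michaelis_menten_rate(s + 0.5 * dt * k2, vmax, km)
        k4 = -michaelis_menten_rate(s + dt * k3, vmax, km)
        ds = dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += ds
        p -= ds  # every unit of substrate consumed becomes product
        rows.append(((i + 1) * dt, s, p))
    return rows


# Placeholder parameter values, for illustration only
rows = time_course(s0=10.0, p0=0.0, vmax=1.0, km=0.5, t_end=5.0, steps=50)
```

Each row could then be written to a file or passed to a plotting tool, much as the workflow passes its results to an R script.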


The tasks performed by the above workflows may be considered as a series of data transformations. Firstly, a qualitative SBML model is generated by the first workflow using information about a set of metabolic reactions that are catalysed by a list of yeast enzymes (A). This model is then parameterised, resulting in a quantitative SBML model whose reactions have been annotated with kinetic parameters (B). These parameters are optimised against a set of experimental data, resulting in an SBML model (C) which can be used for simulations (D).



Acknowledgement

The installers were created using the InstallBuilder software with an open source licence kindly provided by BitRock.