Guide to Reference Environments

Overview

At the Systems Biology Lab, we produce reference environments for each of our research outputs. A reference environment is a complete software stack which reproduces a scientific result or set of results. Reference environments have a wide range of uses, but their overall goal is to make the processes of replicating and exploring science easier and more reliable.

Reference environments are smaller and simpler than a standard operating system installation, because they only include the tools required to produce a specific result. Most reference environments include basic utilities to get code and data (FTP clients, version control clients) and one or more programming languages (R, Python, Java, MATLAB, C/C++, Fortran). Many reference environments also include a lightweight desktop and some tools to access results once they are produced (a PDF viewer, an image viewer, a web browser).

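You can build reference environments for a range of different platforms: bootable ISO images, Docker containers, VirtualBox virtual machines and Amazon Web Services (AWS) EC2 instances.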

Using the open-source tools Packer and Vagrant you can produce equivalent environments for a single scientific result across all these platforms, using a single set of provisioning scripts. This means that readers and reviewers can access results using the platform that suits them, and the computational methods described in the result are clearly separated from the implementation of those methods.

Figure 1

An example reference environment with a desktop

To learn how to use existing reference environments, see 'How to use reference environments'. For examples of reference environments for published work, see 'Examples of reference environments'. To learn how to create reference environments for your own work, see 'How to create reference environments'. To learn about different types of reproducibility, and how to make sure your own reference environments produce reproducible results, see 'Ensuring reproducibility using reference environments'. To understand how reference environments work with other aspects of reproducible research, see 'How reference environments relate to other reproducible research tools'.

How to use reference environments

Reference environments are generally designed to repeat a computation or set of computations that are the key results for a piece of research. Using a reference environment generally involves running a single command or set of commands to execute code on the environment, generate some results and write those results to a local filesystem. Results may reproduce figures or tables from a publication, or more generally demonstrate the activity of a program.

Formats for reference environments

Because reference environments are produced in different formats, you can choose the format of the reference environment best suited to your specific situation and your goals. The table below summarises differences between the environment formats.

| Format | Requirements | Use | Permanence | Source |
| --- | --- | --- | --- | --- |
| Bootable ISO | Most virtualization software | Replication/recomputation, verifying results | Read-only; changes are erased on reboot | Direct download |
| Docker container | Docker v1.8.1 or greater | Replication/recomputation, verifying results | Read-only unless container is committed | Docker Hub |
| VirtualBox VM | VirtualBox v4.3.10 or greater, Vagrant v1.7.2 or greater, project-specific resources on startup | Exploring results, testing impact of changes, troubleshooting installation | Changes persist until the environment is destroyed/deleted | Hashicorp Atlas (via Vagrant) |
| AWS EC2 instance | Vagrant v1.7.2 or greater, Amazon Web Services EC2 account, project-specific resources on startup | Exploring results, testing impact of changes, troubleshooting installation | Changes persist until the environment is destroyed/deleted | Amazon EC2 (via Vagrant) |

Summary of formats for reference environments

Self-contained environments vs. provisioned environments

Reference environments come in two types: self-contained environments which include all the data and code required to reproduce a result, and provisioned environments which need to download data and code from the Internet the first time they are started. The source of the reference environment will usually tell you whether it is a self-contained or provisioned environment.

  • Self-contained environments generally come in the bootable ISO image or Docker formats. Once you have downloaded them, they do not depend on access to the Internet to work, or on the future availability of data or code resources, meaning that they will generally produce exactly the same results in future as they do now. Also, they are usually not 'permanent', so changes made to the environments will not be retained. Self-contained environments are ideal for replicating or recomputing a result.

  • Provisioned environments generally come in the virtual machine (VirtualBox VM) or cloud (Amazon EC2) formats. Each of these formats builds the reference environment 'on-demand': they start with a 'base box', then install and configure the project software and data. This means that to work they require access to the Internet, and for project-specific resources to be available (for instance, tools and packages must be accessible via online repositories). It also means that if resources change (e.g. a new version of a software tool is released), they may not produce exactly the same results in future as they do now. Once set up they are permanent environments in which changes are retained, and the process of installing and configuring project software and data should be similar to the process a researcher goes through when configuring their physical machine. Provisioned environments are ideal for analyzing or working with a result. Having a worked example of software installed and running on a machine can help with troubleshooting installation problems, and with testing the effect of configuration or software changes.

Using a bootable ISO reference environment

To use a bootable ISO reference environment, you need to have virtualization software installed on your machine. Almost all virtualization tools can boot from an ISO image; common examples are Oracle VirtualBox or VMware for Windows, OS X and Linux, or QEMU for Linux. Here we will describe the steps using VirtualBox v4.3, but the steps should be similar for other tools.

| Step | Description | Action |
| --- | --- | --- |
| 1 | Download the ISO file | ISO images may be available on a data repository like Figshare or Zenodo, or through a journal website. For some examples, see 'Examples of reference environments' below. |
| 2 | Create a virtual machine to boot the ISO | Create a new virtual machine by clicking the ‘New’ button, and choose an operating system type of ‘Linux Ubuntu’ 64- or 32-bit, depending on the ISO. Set the memory size to 2048 MB; this should be enough for most reference environments, although some may specify more. Choose not to add a virtual hard drive, since you will be booting off the ISO image. |
| 3 | Mount the project ISO | Select the new virtual machine and click Machine/Settings on the menu. In the Settings window, click ‘Storage’. Under ‘Attributes’, click the CD-ROM icon and select ‘Choose a virtual CD/DVD file’. Find the project ISO file and click ‘Open’. Click OK to close the Settings window. |
| 4 | Start the machine | Click the Start button on the menu bar. When the boot menu appears, choose ‘Boot the live system’. The machine will start up, automatically log in and show the desktop. |
| 5 | Execute the project instructions | Most reference environments will have instructions directly on the desktop which tell you how to execute the project code; this will normally involve running a script on the desktop by double-clicking it. The script should run and a message should appear telling you where the project outputs have been written. |
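
Steps 2 to 4 above can also be done from the command line with VirtualBox's VBoxManage tool. The following is a minimal sketch rather than part of any project's instructions: the VM name and ISO path are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: command-line equivalent of steps 2-4, using VBoxManage.
# VM_NAME and ISO_PATH are hypothetical placeholders, not project values.
VM_NAME="${VM_NAME:-reference-env}"
ISO_PATH="${ISO_PATH:-$HOME/Downloads/project.iso}"

# Print each command; execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

boot_iso() {
  # Create and register a 64-bit Ubuntu VM (use --ostype Ubuntu for a 32-bit ISO)
  run VBoxManage createvm --name "$VM_NAME" --ostype Ubuntu_64 --register
  # 2048 MB of memory, as suggested in step 2; no hard drive is attached
  run VBoxManage modifyvm "$VM_NAME" --memory 2048
  # Add an IDE controller and mount the ISO as a virtual DVD
  run VBoxManage storagectl "$VM_NAME" --name IDE --add ide
  run VBoxManage storageattach "$VM_NAME" --storagectl IDE --port 0 --device 0 --type dvddrive --medium "$ISO_PATH"
  # Start the machine; the boot menu then appears as in step 4
  run VBoxManage startvm "$VM_NAME"
}

boot_iso
```

Review the printed commands first, then rerun with DRY_RUN=0 to create and start the machine.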

Using a Docker container

To use a Docker container, you need to have Docker v1.8.1 or greater installed on your machine. For installation instructions, see the Docker documentation.

| Step | Description | Action |
| --- | --- | --- |
| 1 | Get the Docker image for the project | If the image is on the Docker Hub, run the command docker pull <organisation-name>/<project-name>. The image will download and be ready to run. For some examples from the Systems Biology Lab, see 'Examples of reference environments' below. |
| 2 | Execute the project instructions | Most projects will have a shell script suitable for use with the docker run command: consult the project documentation for the specific syntax. |

Note that it is often necessary to run Docker with elevated privileges using the sudo command.
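
As a rough illustration of the two steps above, the sketch below pulls an image and opens a shell in a throwaway container. The organisation and project names are placeholders, and the assumption that the image provides /bin/bash is ours; the actual command to run comes from each project's documentation. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-2. ORG and PROJECT are hypothetical placeholders.
ORG="${ORG:-organisation-name}"
PROJECT="${PROJECT:-project-name}"
# Set DOCKER="sudo docker" if Docker needs elevated privileges on your machine.
DOCKER="${DOCKER:-docker}"

# Print each command; execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

use_docker_env() {
  local image="$ORG/$PROJECT"
  # Step 1: download the image from the Docker Hub
  run $DOCKER pull "$image"
  # Step 2: open an interactive shell in a container that is removed on exit
  # (assumes the image contains /bin/bash; replace with the project's own command)
  run $DOCKER run --rm -it "$image" /bin/bash
}

use_docker_env
```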

Using a VirtualBox VM managed by Vagrant

To use a VirtualBox VM managed by Vagrant, you need VirtualBox and Vagrant installed on your machine. For installation instructions, see the VirtualBox and Vagrant documentation.

| Step | Description | Action |
| --- | --- | --- |
| 1 | Get the Vagrant code for the project | If you have git installed, you can clone the project from GitHub: git clone https://github.com/<organisation-name>/<project-name>.git. Otherwise, go to the project on GitHub and choose ‘Download Zip’ from the project repository page. For some examples from the Systems Biology Lab, see 'Examples of reference environments' below. |
| 2 | Go to the Vagrant-managed project directory | If you have cloned the repository, cd to the repository directory. If you have downloaded the zipfile, expand it and cd to the created directory. |
| 3 | Bring up the Vagrant-managed environment | Type vagrant up at the command prompt. The virtual machine will start up and the provisioning scripts will run. |
| 4 | Access the environment | You can access the environment using the desktop if one is provided, or using the command vagrant ssh. The default username and password for reference environments built by the Systems Biology Lab are both ‘sbl’, but other environments may be different: consult the project documentation for details. |
| 5 | Execute the project instructions | If a desktop is provided, most reference environments will have instructions directly on the desktop which tell you how to execute the project code; this will normally involve running a script on the desktop by double-clicking it. The script should run and a message should appear telling you where the project outputs have been written. If there is no desktop and you are using vagrant ssh to access the machine, consult the project documentation for the specific syntax. |
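
Steps 1 to 4 above might look like this at the command prompt. This is a sketch with placeholder organisation and project names; commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-4. ORG and PROJECT are hypothetical placeholders.
ORG="${ORG:-organisation-name}"
PROJECT="${PROJECT:-project-name}"

# Print each command; execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

use_vagrant_vm() {
  # Step 1: get the Vagrant code for the project
  run git clone "https://github.com/$ORG/$PROJECT.git"
  # Step 2: go to the Vagrant-managed project directory
  run cd "$PROJECT"
  # Step 3: start the VM and run the provisioning scripts
  run vagrant up
  # Step 4: open a shell on the VM
  run vagrant ssh
}

use_vagrant_vm
# When finished, running "vagrant destroy" in the project directory deletes the VM.
```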

Using an AWS EC2 instance managed by Vagrant

To use an AWS EC2 instance managed by Vagrant, you need Vagrant installed on your machine; for installation instructions, see the Vagrant documentation. You will also need to sign up for an Amazon Web Services EC2 account.

AWS offers a free tier which should be enough to run most reference environments, but using more than the free tier's allocated storage space incurs charges, so be careful to delete the environments once you have finished using them.

| Step | Description | Action |
| --- | --- | --- |
| 1 | Get the Vagrant code for the project | If you have git installed, you can clone the project from GitHub: git clone https://github.com/<organisation-name>/<project-name>.git. Otherwise, go to the project on GitHub and choose ‘Download Zip’ from the project repository page. |
| 2 | Go to the Vagrant-managed project directory | If you have cloned the repository, cd to the repository directory. If you have downloaded the zipfile, expand it and cd to the created directory. |
| 3 | Create environment variables for the AWS access keys | The variables are called AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SECURITY_GROUP. You will receive the access keys when you sign up for an AWS account, and you will need to create a security group before you launch any AWS instances. The Vagrantfile for the reference environment uses these values to log in to your account and start the AWS instance. Creating environment variables varies by operating system, but generally in Linux and OS X they are added by modifying /etc/environment or /etc/profile. In Windows they are added using the desktop through Control Panel/System/Advanced System Settings. |
| 4 | Bring up the Vagrant-managed environment | Type vagrant up --provider=aws at the command prompt. The instance will start up and the provisioning scripts will run. |
| 5 | Access the environment | You can access the environment using the command vagrant ssh, or vagrant rdp if you have a Remote Desktop Protocol client installed. |
| 6 | Execute the project instructions | If a desktop is provided, most reference environments will have instructions directly on the desktop which tell you how to execute the project code; this will normally involve running a script on the desktop by double-clicking it. The script should run and a message should appear telling you where the project outputs have been written. If there is no desktop and you are using vagrant ssh to access the machine, consult the project documentation for the specific syntax. |
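
Before step 4 it can help to confirm that the three variables from step 3 are visible in the shell that will run Vagrant. The helper below is a hypothetical sketch, not part of the reference environment scripts. It also assumes that the AWS provider comes from the vagrant-aws plugin, which the project documentation should confirm.

```shell
#!/usr/bin/env bash
# Sketch: check the environment variables named in step 3.
# missing_aws_vars is a hypothetical helper, not part of Vagrant or AWS tooling.
missing_aws_vars() {
  local name value missing=""
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SECURITY_GROUP; do
    # Read the variable whose name is stored in $name (empty if unset)
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then missing="$missing $name"; fi
  done
  # Print the missing names separated by spaces, or an empty line if none
  echo "${missing# }"
}

# To set a variable for the current shell session only:
#   export AWS_ACCESS_KEY_ID="<your access key id>"
missing="$(missing_aws_vars)"
if [ -n "$missing" ]; then
  echo "Set these variables before running Vagrant: $missing"
else
  # Assumption: the provider plugin is installed (vagrant plugin install vagrant-aws)
  echo "Ready to run: vagrant up --provider=aws"
fi
```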

Examples of reference environments

Reference environments are used across a range of technologies and languages by the Systems Biology Lab and by other groups. The table below shows some examples of reference environments for different projects and publications, and where you can find them.

| Example | Technologies used | Locations |
| --- | --- | --- |
| Bond graph modelling of biochemical networks | Octave, LaTeX | Vagrant VM, Docker container, Bootable ISO image |
| Network link prediction | MATLAB | Vagrant VM, Docker container, Bootable ISO image |
| Network deconvolution | MATLAB | Vagrant VM, Docker container, Bootable ISO image |
| Machine learning approaches to modelling eukaryotic transcription | R | Vagrant VM, Docker container, Bootable ISO image |
| Hormonal regulation of renal excretion | R, OCaml | Vagrant VM, Bootable ISO image |
| Parallel data mining using WEKA | Java | Vagrant VM, Docker container, Bootable ISO image |
| Cell modelling using CHASTE | C++, Python | Vagrant VM, Bootable ISO image |
| ERK-MAPK signal transduction modelling in human epidermis | Python, MATLAB | Vagrant VM, Docker container, Bootable ISO image |

List of example reference environments

How to create reference environments

Reference environments are put together in stages. Most reference environments start from a common base: a 32- or 64-bit Linux system using Lubuntu, a lightweight variant of Ubuntu Linux. Some reference environments also have the Lubuntu desktop installed, along with basic tools for viewing and manipulating results (text editor, picture viewer, web browser). The common base for each reference environment is built using a single set of scripts under version control; then project-specific code and data are added, and the environments are made available on the Internet. The picture below shows the process of constructing and distributing reference environments.

Figure 2

Schematic of constructing reference environments

The core environment layer (1) uses Packer to create equivalent base environments (‘base boxes’ in Vagrant terminology) across three platforms: VirtualBox VMs, Docker containers and Amazon Web Services AMIs.

The project layer (2) builds on top of these base boxes using Vagrant to add code and data for each specific research output. There is one Vagrant project for each research output and set of environments. Each Vagrant project contains a ‘Vagrantfile’, a text configuration file which references the base box to use for the project, and calls a set of scripts to provision (build) the code and data for the research output.

In the distribution layer (3), each version of the reference environment is pushed to the services which host it. VirtualBox VM and AWS cloud versions of an environment are available through Vagrant scripts, and Docker containers are on the Docker Hub.

Overview of steps for creating a reference environment

To create a reference environment that reproduces your own analysis and results, follow the steps summarized below.

| # | Step | Description |
| --- | --- | --- |
| 1 | Define the scope of the environment | Choose what you want to replicate in the environment. Decide whether to create self-contained environments, provisioned environments or both. |
| 2 | Prepare your code and programs for being run in a reference environment | Make sure all the code and data necessary to reproduce your results are available for download. |
| 3 | Choose a base environment from which to start | Preconfigured reference environments using R and Python are available to use as templates. |
| 4 | Edit the install and configuration scripts to get your code, resources and data into the environment | Make sure that all configuration and installation instructions are in the Vagrantfile shell scripts. |
| 5 | (Optional) Create self-contained ISO and Docker versions of the environments | Creating a read-only ISO or Docker container version of the environment is optional, but useful for archiving and complete reproducibility. |
| 6 | Distribute the environments | Upload read-only ISO versions of environments to any downloadable file store. To promote reproducibility, consider a data repository, like Zenodo or Figshare, that provides a DOI for digital artifacts. Push Docker containers to the Docker Hub. Upload scripts for VM and cloud versions to a public version control repository (e.g. GitHub). |

Overview of steps to create a reference environment

#### 1. Define the scope of the environment ####

To decide how to replicate an analysis or result in a reference environment, think about which parts of the analysis provide the strongest supporting evidence for the claims you are making, and what kind of output presents them in the most convincing manner. Reference environments can do any or all of the following:

  • Output individual values on the command line showing the result of a single computation
  • Output text files to the filesystem
  • Output plots in PDF or bitmap form and open them on the desktop
  • Compile and run programs which open windows on the desktop

Which of these is the best way to demonstrate a result depends on your specific situation. Good reference environments typically produce output which is clear and compelling, like a figure in a publication or talk, but is also backed up with detail, so that readers and reviewers can investigate further if they want to. To do this, consider generating visual output (plots) together with data output (text files), and tell users that this has been done.

You also need to decide how much of an analysis will be replicated in the reference environment. In an ideal situation, every single part of an analysis starting from some form of 'raw data' would be replicated in a reference environment, but for many results this may not be realistic or desirable – it may take too long to run, require large amounts of one resource type or another, or not be possible because of licensing or privacy restrictions. The table below lists the most common issues to consider when defining scope for a reference environment, and suggests responses for each one.

| Issue | Response |
| --- | --- |
| Computational load: Result requires high-performance computing resources to reproduce | Consider limiting scope to post-processing after the step requiring high-performance computing resources. Make intermediate datasets publicly available, and design the reference environment to download and process them. |
| Architecture requirements: Result requires specialized architecture (e.g. parallel processing, low-latency system) to reproduce | Consider simulating the architecture and reproducing a reduced version of the result, or an example result derived from test data, to illustrate the dependence on architecture. |
| Data size: Core data for a result are too large for users to reasonably download | Reproduce a partial result from a subset of the data, or implement batch processing so that results can be delivered in sections. Also consider limiting scope to post-processing if intermediate datasets are small enough to download. |
| License restrictions for software: Result requires commercially-licensed software to reproduce | Identify open-source alternatives for commercial software, or limit scope to parts of a result which can be reproduced without license restrictions. Consider negotiation with license holders: some companies may be willing to produce limited-use licenses for the purpose of reproduction. |
| Data availability: Data are confidential, embargoed or restricted in access | Consider producing an example result derived from test or publicly available data to illustrate the core method. Explore options for anonymizing data if appropriate to allow public distribution. |

Issues affecting scope definition for a reference environment

Finally, consider whether you want to create self-contained versions of your reference environment as well as provisioned versions. Self-contained environments include all data and code required to reproduce a result, while provisioned environments need to download data and code from the Internet the first time they are started.

  • Self-contained environments generally come in the bootable ISO image or Docker formats. Once you have downloaded them, they do not depend on access to the Internet to work, or on the future availability of data or code resources, meaning that they will generally produce exactly the same results in future as they do now. Changes made to the environments will not be retained after restarting the virtual machine. Self-contained environments are ideal for replicating or recomputing a result.
  • Provisioned environments generally come in the virtual machine (VirtualBox VM) or cloud (Amazon EC2) formats. Each of these formats builds the reference environment 'on-demand': they start with a 'base box', then install and configure the project software and data. This means that they require access to the Internet, and for project-specific resources to be available (for instance, tools and packages must be accessible via online repositories). It also means that if resources change, they may not produce exactly the same results in future as they do now. Once set up, they are permanent environments in which changes are retained, and the process of installing and configuring project software and data on them should be similar to the process on a researcher's physical machine. Provisioned environments are ideal for analyzing or working with a result. Having a worked example of the software installed on a machine can help with troubleshooting installation problems, and with testing the effect of configuration or software changes.

Self-contained environments have the advantage that they are a permanent record of your result, but they also require that all the code and data required to produce a result be present in the environment, and this may not be possible or desirable for the reasons described in the table above. When you create self-contained or provisioned environments, we recommend that you clearly explain the difference in the documentation for a reference environment, and in the accompanying publication.

#### 2. Prepare your code and programs for being run in a reference environment ####

To make your analysis suitable to run in a reference environment, you will need to make the entire installation and analysis process scripted and non-interactive. To do this, work through the following steps:

  • Make a list of all the dependencies your analysis requires: all the programs, tools, packages and data needed to run it.
  • Define and list all the commands required to install these dependencies on a basic Linux-based system in a non-interactive fashion (e.g. using apt-get commands). Users will not be able to interact with the system while it is provisioning, so this process needs to be completely non-interactive: look for 'quiet' or 'hands-off' options in each step of the installation process.
  • Structure your code or analysis so that it can be executed non-interactively. This might mean rewriting particular parts, or it might mean creating a 'run_experiments.sh' script which runs a series of analyses from the command-line one after another.
  • Make sure the code and data (if required) are available at a source from which you can script the installation. If you make this available at a public code or data repository, this means you can make a provisioned environment as described above – when the provisioned environment starts up, it will get the code or data from the public repository. This is the most accurate replica of what a user would do if they were trying to replicate your analysis on their own machine.
  • If you do not make your code available at a public repository, you can still create self-contained environments with the code/data in it, but you will not be able to make provisioned environments without providing special access (e.g. by creating a separate provisioning user with limited access to your private repository).
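
A 'run_experiments.sh' script of the kind described above might be structured as follows. This is only a sketch: the two analysis_* functions are hypothetical stand-ins for real project commands (for instance calls to Rscript or python), and the output location is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of a non-interactive run_experiments.sh.
# The analysis_* functions are hypothetical stand-ins for real project commands.
OUTPUT_DIR="${OUTPUT_DIR:-${TMPDIR:-/tmp}/reference_env_output}"

# Stand-in step 1: write a small table of results
analysis_table() { printf 'sample,value\nA,1\nB,2\n' > "$OUTPUT_DIR/table1.csv"; }

# Stand-in step 2: summarise the table (here, just its line count)
analysis_summary() { wc -l < "$OUTPUT_DIR/table1.csv" | tr -d ' ' > "$OUTPUT_DIR/summary.txt"; }

run_experiments() {
  local step
  mkdir -p "$OUTPUT_DIR" || return 1
  # Run each step in order and stop at the first failure, without prompting
  for step in analysis_table analysis_summary; do
    echo "Running $step ..."
    "$step" || { echo "Step $step failed" >&2; return 1; }
  done
  # Tell the user where to find the results
  echo "Results written to $OUTPUT_DIR"
}

run_experiments
```

Keeping each analysis as a separate step makes it easy to see which stage failed, and the final message tells users where the outputs are.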

#### 3. Choose a base environment from which to start ####

The easiest way to develop a reference environment is to start from one of the preconfigured template environments. There are three available:

Blank reference environment template:

https://github.com/uomsystemsbiology/reference_environment_template

R reference environment template: (Comes with R preinstalled)

https://github.com/uomsystemsbiology/r_reference_environment_template

Python reference environment template: (Comes with Python preinstalled)

https://github.com/uomsystemsbiology/python_reference_environment_template

Each of these templates includes sample scripts and suggested text for each part of the environment.

Note that the default user 'sbl' (password 'sbl') has administrator access to all environments produced from the reference environment templates. These environments are therefore not secure, and should not be used to store confidential data. If a secure reference environment is required, edit the installation scripts to remove this user after installation.

If none of these environments is suitable, you can also develop your own 'base box' using Packer. Developing base boxes requires Packer v0.7.5 or greater. To make VirtualBox and Amazon EC2 format base boxes available, a Hashicorp Atlas (https://atlas.hashicorp.com) account is also required to host the base box. To make Docker base images available, a Docker Hub account (https://hub.docker.com) is required.

Once the accounts above are created, clone the Packer templates for creating the Systems Biology Lab boxes from GitHub:

https://github.com/uomsystemsbiology/vre_base

https://github.com/uomsystemsbiology/vre_base64

Scripts for building the base boxes are in the /scripts directory. The default user for environments produced by the Systems Biology Lab is sbl; users can be changed or added by editing the setup_users.sh script.

With a configured Docker account, the Packer post-processors will export the Docker image to the Docker Hub. With a configured Atlas account, they will export the AWS and VirtualBox base boxes to Atlas.

#### 4. Edit the install and configuration scripts to get your code, resources and data into the environment ####

  • If you have a git client installed and you are using one of the template reference environments then you can use the clone command:

git clone https://github.com/uomsystemsbiology/reference_environment_template.git project-name

  • Otherwise, you can use the 'Clone or download' link on the GitHub page to download a ZIP archive of the environment template.
  • If you have developed your own base boxes, then edit the Vagrantfile to replace the values of vm.box and image with the names of the new base boxes.
  • Edit 2_install_core.sh to add the instructions you identified in Step 2 for installing specific packages or code needed for the environment. To list the packages installed by default in the base box, bring up the template reference environment using vagrant up, then vagrant ssh into the machine and use dpkg -l from the command line. 2_install_core.sh also usually populates data/build_info.txt which includes build information for the specific version of the code.
  • Edit 3_install_desktop.sh to add instructions for anything specific to the desktop environment. Remember that the Docker version of a reference environment, if you make one, typically doesn't have a desktop, so graphical or interactive tools and results won't work.
  • Edit 4_configure_core.sh to set up applications and compile code.
  • Edit the existing data/run_experiments.sh script or write your own; most reference environments include one or more shell scripts which execute the commands required to reproduce a particular result.
  • Edit the text files in the /data directory to specify the project name and organisation name, and some instructions to be written to the desktop wallpaper.
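
To illustrate the build information mentioned for 2_install_core.sh, the sketch below records a build date and code revision in a text file. The function name, field names and file layout are hypothetical; the templates' own scripts may record this differently.

```shell
#!/usr/bin/env bash
# Sketch: record build information from an install script.
# write_build_info and its key=value layout are hypothetical,
# not the format used by the reference environment templates.
write_build_info() {
  # $1 = directory holding the project's git checkout, $2 = output file
  local code_dir="$1" out_file="$2" revision
  # Use the short commit hash, or "unknown" if this is not a git checkout
  revision="$(git -C "$code_dir" rev-parse --verify --short HEAD 2>/dev/null || echo unknown)"
  {
    echo "build_date=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "code_revision=$revision"
  } > "$out_file"
}

# Example call (paths are placeholders):
#   write_build_info /home/sbl/project data/build_info.txt
```

Recording the revision alongside the date lets users match a built environment to a specific version of the code.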

#### 5. (Optional) Create self-contained ISO and Docker versions of the environment ####

  • If you want to make a read-only ISO of the reference environment, edit data/remastersys.conf to change the name of the output ISO file (the CUSTOMISO parameter) and the name displayed when the CD is booted (LIVECDLABEL). To create the ISO, uncomment the lines:
#if !(is_docker)
#config.vm.provision "shell", path: "scripts/make_iso.sh", privileged: false
#end

in the project Vagrantfile. When the environment is launched with vagrant up, this script will create an ISO in the output directory.

  • If you want to make a Docker image of the environment, make sure you have Docker v1.8.1 or greater installed, and launch the environment with vagrant up --provider=docker.

When the environment has started, tag the image using docker tag, and push the image to the Docker Hub using docker push. The Docker documentation (https://docs.docker.com/engine/getstarted/step_six/) has detailed instructions on using Docker commands.
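
The tag and push steps could be wrapped as follows. This is a sketch under stated assumptions: publish_image is a hypothetical helper, the image and repository names are placeholders, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: tag a local image and push it to the Docker Hub.
# publish_image and all names passed to it are hypothetical.

# Print each command; execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

publish_image() {
  # $1 = local image ID or name, $2 = organisation, $3 = project, $4 = tag (default: latest)
  local source_image="$1" target="$2/$3:${4:-latest}"
  run docker tag "$source_image" "$target"
  run docker push "$target"
}

# Log in first with "docker login", then for example:
publish_image local-image-id organisation-name project-name
```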

#### 6. Distribute the environment ####

Once reference environment scripts are finished and the created environments have been tested, they can be made available in a range of different ways. The table below lists suggested distribution methods for each type of output.

| Environment type | Recommended distribution method |
| --- | --- |
| Vagrant-managed virtual machine, Amazon EC2 cloud instance | Make the Vagrantfile and the provisioning scripts available on a public version control repository (e.g. GitHub, SourceForge, BitBucket). |
| Docker image | Make the image available on the Docker Hub using docker tag and docker push. |
| Bootable ISO | Upload to any public download service. For accessibility and reproducibility, consider a generalised data repository like Figshare (https://figshare.com/) or Zenodo (https://zenodo.org/), which provide a DOI for digital artifacts, making citation easy. |

Distribution for reference environments by format

Checklist of activities for creating a reference environment

  • All installation and configuration instructions for packages and resources are in shell scripts in the /scripts directory
  • Instructions for downloading code from a public repository (e.g. GitHub) are in shell scripts in the /scripts directory
  • Version and build information in data/build_info.txt is being correctly set from scripts/2_install_core.sh.
  • Configuration for applications is set, and code is being compiled, if appropriate, from scripts/4_configure_core.sh.
  • Scripts that users will run to execute code and generate results are in data/run_experiments.sh
  • Text files in /data have been edited to specify the project and organisation name, and instructions for reproducing results
  • (Optional) Username and password for the reference environment have been changed
  • (Optional) Read-only ISO of the reference environment has been produced
  • (Optional) Docker container version of the reference environment has been produced
  • Environments have been distributed to downloadable repositories

Ensuring reproducibility using reference environments

Creating a reference environment usually involves installing software and packages from the Internet. To make sure that reference environments produce consistent results in future, follow these guidelines:

  • All commands to install packages, resources or make changes to the environment should be in the installation scripts (in the /scripts directory), or in scripts called by these scripts. Avoid using other methods to install packages or resources, such as installation of packages when R/Python project code is run for the first time. Keeping all environment setup and installation centralized in one location allows users to see all the changes and dependencies required to reproduce a result. If changes are made in other places, users may not realise that they are happening or that they are important for reproducing a result.
  • When installing packages or code in a script, specify a version to be installed. For instance, when installing operating system packages in Lubuntu, use apt-get install package=version to install a specific numbered version of the package. When getting code from a version control repository like GitHub, specify a particular release of the code. If you specify the latest release of the code, the environment may change in future and produce different results when run at different times.
  • For self-contained environments (generally ISO images and Docker containers), remind users that these environments are a snapshot, and that changes made in them will not be saved. You can put reminders on the download page for the environment, or edit the Vagrantfile to put different information in the environment for each environment type.
  • For provisioned environments (generally VM and cloud images), remind users that these require access to the internet to set up, and be clear about the version of packages or code that they install. If they install the latest version of packages or code, warn users that this may mean results differ from published ones obtained using a previous version of a resource.
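
The version-pinning guideline above can be sketched as two small helpers for an installation script. The helper names, package versions, repository URL and tag are all hypothetical, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: pinned, non-interactive installs for a provisioning script.
# install_pinned and fetch_release are hypothetical helpers.

# Print each command; execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

install_pinned() {
  local spec
  # Refuse any package that is not given in the form name=version
  for spec in "$@"; do
    case "$spec" in
      ?*=?*) ;;
      *) echo "Refusing unpinned package: $spec" >&2; return 1 ;;
    esac
  done
  # -y answers prompts automatically; DEBIAN_FRONTEND=noninteractive suppresses dialogs.
  # Drop sudo if the script already runs as root.
  run sudo env DEBIAN_FRONTEND=noninteractive apt-get install -y "$@"
}

fetch_release() {
  # $1 = repository URL, $2 = release tag, $3 = destination directory
  run git clone --branch "$2" --depth 1 "$1" "$3"
}

# Placeholder package version and release tag, for illustration only:
install_pinned "example-package=1.2.3-1"
fetch_release "https://github.com/organisation-name/project-name.git" "v1.0.0" "project-name"
```

Rejecting unpinned package names up front means an accidental "latest version" install fails during provisioning rather than silently changing results later.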

How reference environments relate to other reproducible research tools

Because reproducible research is a very general concept, and different types of research have very different demands, it can be hard to make sense of the wide range of tools and approaches supporting it, and the relationships between them. The table below lists some of the best-known tools, platforms and standards relevant to reproducible research, and describes how reference environments can integrate with or make use of them.

| Type | Examples | Relationship to reference environments |
| --- | --- | --- |
| Workflow systems which record, manage and execute a series of steps in a computation | Taverna, Galaxy, geWorkbench, VisTrails, Kepler, Pegasus, Sumatra | Run workflow systems within a reference environment to demonstrate a specific workflow, or as a worked example of installation and configuration |
| Literate programming approaches, which integrate executable code with human-readable documents | IPython/Jupyter, knitr/Sweave | Reference environments can recreate analyses and papers using literate programming tools |
| Software containers, which separate software processes on a single machine for security, stability and resource management purposes | Docker, Rocket | Docker images are produced as part of the reference environment process, and other container tools can be supported in future |
| Standards and formats for reporting and encoding models and experimental details | MIAME, MIABI, MIAPE, MIASE, MIRIAM, CellML, SBML, SBGN, SED-ML | Many research outputs that use reference environments follow specific standards and formats (e.g. SBML translation) |
| Repositories for storing method and protocol information, data and environments | protocols.io, OSF.io, Runmycode.org, Dryad, nucleotid.es, Bioboxes, GigaDB, GigaGalaxy, Figshare, Zenodo, Dataverse, MyExperiment | Repositories allowing automated access to information and data can be used as sources for provisioning reference environments. ISO versions of reference environments can be given DOIs and made into citable research outputs |

Reference environments and other reproducible research tools