Documentation

Conan is an extremely lightweight workflow management application. It provides a user-friendly interface for interacting with the various components of the "script-chaining" type of workflow that is typical in bioinformatics.

Description

Conan was developed to handle the various loading scenarios and workflows involved in the ArrayExpress and the GXA databases. However, it is available as a standalone tool and can be customised to chain processes (for example, Java processes, Perl or shell scripts) together in a reusable way. You can install the Conan web application in your own environment and put together a new workflow with a minimum amount of development.

The following documentation describes how to use Conan to run workflows. If you are interested in installing Conan and developing your own processes, see the developer documentation.

Availability

Download Conan from the FGPT downloads page.

The source code for Conan is available from Conan2 on GitHub.

Using Conan

Getting Started

Whenever you open Conan, you will be prompted to log in with your email address. This should be your official email address (not an alias, as Conan cannot recognize aliases). Once you do, Conan should recognize you as a user and grant you submitter privileges. You should see a window for creating new submissions and three tables (queue, progress, and history), and your name should be displayed at the top-right of the window. If you do not see this, and instead see a window informing you that you are logged in as a guest, you may have entered your email address incorrectly, or you may not currently have an account - contact the administrator to find out why.

Once logged in, you will see the main features of Conan. The topmost widget is for creating new submissions, and there are a few tabs you can switch between for different types of submissions. Underneath this are the two tables for interacting with running tasks - the queue shows anything that is pending execution (either because it is a new submission, or needs user interaction to restart) whereas the progress table shows everything that is currently executing.

Underneath this is the history table: this shows the provenance information of everything that has been run through Conan at any time. This table is split into pages of 10 tasks at a time.
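The paging of the history table can be pictured with a small sketch. This is an illustration only, not Conan's own code: the function names and the task names are made up, and only the page size of 10 comes from the description above.

```python
PAGE_SIZE = 10  # the history table shows 10 tasks per page


def page_count(tasks):
    """Number of pages needed to show every task."""
    return (len(tasks) + PAGE_SIZE - 1) // PAGE_SIZE


def page_of(tasks, page_number):
    """Return the tasks on a given page; pages are numbered from 1."""
    start = (page_number - 1) * PAGE_SIZE
    return tasks[start:start + PAGE_SIZE]


# Hypothetical history of 23 finished tasks.
history = ["task-%d" % i for i in range(1, 24)]
last_page = page_of(history, page_count(history))
```

With 23 tasks in the history, this gives three pages, the last holding the final three tasks.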

Each table can be sorted by any column (start/end time is the usual default) or filtered by typing any text into the search box just above the table. This filter works on all fields, so if you start typing your name into the box, you should only see tasks submitted by you. You could also type in an accession, a start or end date, and so on.
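As a rough picture of what "works on all fields" means, the sketch below keeps a row if the typed text appears in any of its columns. It is an assumption about the behaviour, not Conan's implementation: the column names, the sample rows, and the case-insensitive matching are all hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class TaskRow:
    # Hypothetical columns; the real tables may show different fields.
    name: str
    submitter: str
    pipeline: str
    start_date: str


def filter_rows(rows, query):
    """Keep rows where any field contains the query text (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return list(rows)
    return [
        row for row in rows
        if any(needle in str(value).lower() for value in asdict(row).values())
    ]


# Made-up accessions, submitters and pipeline names for illustration.
rows = [
    TaskRow("E-TEST-1", "Alice Example", "Experiment loading", "2011-03-01"),
    TaskRow("E-TEST-2", "Bob Example", "Experiment loading", "2011-03-02"),
]
mine = filter_rows(rows, "alice")
```

Typing part of a date such as "2011-03" would match both rows here, while "alice" matches only the first.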

The following sections describe in more detail how to work with Conan, including how to submit new tasks and interact with existing ones.

Submitting New Tasks

To submit a new task, you must select a submission type - "single", "multiple" or "batch". In single submission mode, you enter a single accession and that experiment/array is loaded. In multiple mode, you can expand the submission window and enter many accessions into it - one per line - before hitting the submit button to create a new task for each accession. Batch mode works much the same way, but instead of typing or pasting multiple accessions into the window, you can simply upload a text file of accessions (again, one per line) and a new task will be created for each one.
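The "one accession per line" format used by multiple and batch mode can be sketched as follows. This is only an illustration of the input format: the function name is made up, the accessions are invented, and trimming whitespace and skipping blank lines are assumptions rather than documented Conan behaviour.

```python
def parse_accessions(text):
    """Split pasted or uploaded text into accessions, one per line.

    Assumption: surrounding whitespace is trimmed and blank lines are skipped.
    """
    accessions = []
    for line in text.splitlines():
        accession = line.strip()
        if accession:
            accessions.append(accession)
    return accessions


# Made-up accessions, as they might appear in an uploaded text file.
uploaded = "E-TEST-1\n\n  E-TEST-2 \nE-TEST-3\n"
batch = parse_accessions(uploaded)
```

Each entry in the resulting list would correspond to one new task.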

Once you've selected your mode, also select your pipeline and starting process. The pipeline is the set of operations that Conan will run. This is not, strictly, a workflow, as the output of one process isn't fed into the next. Processes are chained together in a defined order, and each subsequent process is only run if the previous one completes successfully. If any process fails, you'll be emailed some details that may explain the reason for the failure, and you can take the appropriate action.
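The chaining behaviour described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not Conan's implementation: the process names, the run_pipeline function and its start_at parameter are hypothetical, and a Python exception stands in for a failed process.

```python
def run_pipeline(processes, accession, start_at=0):
    """Run processes in order from start_at, stopping at the first failure.

    Every process receives the same accession; nothing is passed from one
    process to the next. Returns (completed_names, failed_name_or_None).
    """
    completed = []
    for name, func in processes[start_at:]:
        try:
            func(accession)
        except Exception:
            # Later processes are not run once one has failed.
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical processes; the second one always fails.
def validate(accession):
    pass


def load(accession):
    raise RuntimeError("load failed for " + accession)


def index(accession):
    pass


PIPELINE = [("validate", validate), ("load", load), ("index", index)]
completed, failed = run_pipeline(PIPELINE, "E-TEST-1")
```

Here "validate" completes, "load" fails, and "index" is never run. Passing start_at=2 would model choosing "index" as the starting process.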

Once you've entered your accession (or, where a pipeline accepts them, other parameters) and confirmed that you want to create the submission, the task will appear in the queue table. It will stay here for two minutes after creation to give you a chance to cancel it if you've made a mistake. If you need to kill a pending task, see the section on killing pending tasks below.

After the two-minute "cooling off" period, the task will be moved to the progress table (assuming the maximum number of jobs isn't already running) and execution will begin.
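The two conditions for a task leaving the queue can be written as a single check. This sketch is illustrative only: the function name and the max_running parameter are hypothetical, and the actual job limit depends on how your Conan installation is configured.

```python
from datetime import datetime, timedelta

COOLING_OFF = timedelta(minutes=2)  # the two-minute period described above


def ready_to_start(created_at, now, running_count, max_running):
    """True once the cooling-off period has passed and a job slot is free."""
    cooled_off = now - created_at >= COOLING_OFF
    slot_free = running_count < max_running
    return cooled_off and slot_free


created = datetime(2011, 3, 1, 12, 0, 0)
too_early = ready_to_start(created, created + timedelta(seconds=90), 0, 5)
on_time = ready_to_start(created, created + timedelta(minutes=2), 0, 5)
no_slot = ready_to_start(created, created + timedelta(minutes=3), 5, 5)
```

A task that has cooled off but finds no free slot simply waits in the queue until one becomes available.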

Fixing Failed Tasks

Whenever a task you have submitted fails, it will move back from the progress table into the queue. Failed tasks will be highlighted red, with some extra information about where they failed and, if possible, a brief explanation of why they failed.

At this point, the task requires some intervention. You can click on the task name to see a summary of the task, including reports sorted by most recent first. If you haven't been able to work out what went wrong from the brief explanation, this should provide lots more details. Correct any errors and then restart the task. You do this by clicking on the red box in the queue table, and selecting one of the various options. For any failed task, you can:

  • Retry the last process
  • Resume from the next process
  • Stop the task altogether

Retrying the last process is the option you'll normally select (so, for example, if validation fails and you correct the error in the MAGE-TAB, you would rerun validation to check everything is now OK). Resuming means that you want to ignore the failure and carry on to the next process anyway: this is not normally advisable, but in some circumstances may be unavoidable. Stopping the task means that you could not correct the problem and you don't want to retry later either, so the task will be removed from the queue.
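One way to think about the three options is in terms of which process runs next. The sketch below is a hypothetical model, not Conan's code: the action names and the function are made up for illustration.

```python
def next_process_index(failed_index, action, process_count):
    """Return the index of the process to run next, or None if nothing runs.

    "retry"  reruns the process that failed.
    "resume" skips the failed process and carries on with the one after it.
    "stop"   removes the task, so nothing more runs.
    """
    if action == "retry":
        return failed_index
    if action == "resume":
        following = failed_index + 1
        # If the failed process was the last one, there is nothing left to run.
        return following if following < process_count else None
    if action == "stop":
        return None
    raise ValueError("unknown action: " + action)


# A three-process pipeline whose second process (index 1) has failed.
retry_target = next_process_index(1, "retry", 3)
resume_target = next_process_index(1, "resume", 3)
stop_target = next_process_index(1, "stop", 3)
```

In this example, retrying reruns process 1, resuming moves on to process 2, and stopping runs nothing further.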

Pausing Running Tasks

You can pause any running task in the progress table. As long as you are logged in as a submitter, you will see a little "pause" button next to each running task. Clicking this pops up a dialog informing you that the task will be paused after the current process completes. If you say "OK" to this, the task will complete its current operation (NO running tasks are ever killed or interrupted) and then stop execution until you explicitly restart it. Once the current process is complete, paused tasks move from the progress table back to the queue, and are highlighted yellow (very much like failed tasks).
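The key point, that a pause only takes effect between processes, can be shown with a small model. This is a sketch under assumptions, not how Conan is implemented: the Task class, its state names and the sample processes are all hypothetical.

```python
class Task:
    """Toy task that checks for a pause request only between processes."""

    def __init__(self, processes):
        self.processes = processes
        self.next_index = 0
        self.pause_requested = False
        self.state = "pending"

    def request_pause(self):
        # Only records the request; the running process is never interrupted.
        self.pause_requested = True

    def run(self):
        self.state = "running"
        while self.next_index < len(self.processes):
            self.processes[self.next_index](self)
            self.next_index += 1
            more_to_run = self.next_index < len(self.processes)
            if self.pause_requested and more_to_run:
                self.pause_requested = False
                self.state = "paused"
                return
        self.state = "completed"


log = []


def validate(task):
    task.request_pause()     # simulate the user clicking pause mid-process
    log.append("validate")   # the process still finishes its work


def load(task):
    log.append("load")


task = Task([validate, load])
task.run()                    # stops after "validate" has completed
state_after_pause = task.state
task.run()                    # explicitly resumed; "load" now runs
```

The pause request arrives while "validate" is running, but "validate" still completes before the task stops. Calling run() again models resuming the task from the queue.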

Resuming Paused Tasks

You can interact with paused tasks in exactly the same way as with failed tasks, except that they're highlighted yellow instead of red. Clicking on the yellow box gives you similar options. Normally you'll just resume a paused task, unless there is a specific reason to retry or stop.

Killing Pending Tasks

Every task in the queue has a checkbox next to its name. You'll also see three buttons below the queue table: "Select all", "Deselect all" and "Stop". Selecting tasks by clicking the checkbox marks them to be stopped, and any selected tasks will be stopped when you hit the stop button. The select/deselect buttons check or uncheck all tasks in the queue.

Clicking the checkbox and then stop is, in effect, the same as clicking on the interaction box for failed or paused tasks and selecting the stop option. Rather than having to do this many times over when a batch fails, you can simply select all the tasks you want to stop and remove them in one go.

Also, you can select tasks that have not yet been started (and so do not have a failed or paused interaction box) and remove them. This means, if you spot that you have entered an incorrect accession in the cooling off period, you can clear a task out before it starts.
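The effect of selecting several tasks and pressing stop can be sketched as a simple split of the queue. This is illustrative only: the function name and the task names are made up, and the real queue holds more than just names.

```python
def stop_selected(queue, selected):
    """Split the queue into tasks that remain and tasks that are stopped."""
    chosen = set(selected)
    remaining = [task for task in queue if task not in chosen]
    stopped = [task for task in queue if task in chosen]
    return remaining, stopped


# Hypothetical queue: one failed task, one paused task, one not yet started.
queue = ["E-TEST-1", "E-TEST-2", "E-TEST-3"]
remaining, stopped = stop_selected(queue, ["E-TEST-1", "E-TEST-3"])
```

It makes no difference whether a selected task has failed, is paused, or has not started yet; all selected tasks are removed from the queue together.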

Notes

The precise setup of the Conan installation you are using will depend on which processes have been installed by your local administrator. You may need to get in touch with the person who installed Conan if you experience problems with running specific workflows.

Issues and support


Issues and feature requests

To request a new feature or if you think you've found a bug, please send us an email using one of the links below.

Support

If you need help using this tool, please email Tony Burdett.

Contact

For more information or to get involved please email Tony Burdett.

Acknowledgements

This tool was developed by Tony Burdett.

This work was paid for by EMBL-EBI core funding.


