
make

Introduction

Although make is most commonly associated with application development, it is also widely used to build pipelines and workflows from existing command-line tools, because it tracks the dependencies between individual targets and rebuilds only those that are out of date.
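
As a minimal sketch of how make expresses dependencies, the makefile below chains two standard command-line tools. The file names (input.txt, sorted.txt, counts.txt) are hypothetical and chosen only for illustration; each recipe line must begin with a tab character.

```makefile
# Hypothetical two-step pipeline; all file names are placeholders.

# Default target: the final result of the pipeline.
all: counts.txt

# Step 1: sorted.txt is rebuilt only when input.txt is newer than it.
sorted.txt: input.txt
	sort input.txt > sorted.txt

# Step 2: depends on step 1, so make runs step 1 first when needed.
counts.txt: sorted.txt
	uniq -c sorted.txt > counts.txt
```

Running make a second time without changing input.txt does nothing, since every target is already newer than its prerequisites.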

Some examples in bioinformatics:

Many implementations of make exist, but the most widely used is probably GNU make, which is the make supplied with most Linux systems.

Documentation

GNU make

GNU make provides a number of features beyond those found in many other make implementations. See the GNU make manual for details.
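
To give a flavour of these features, here is a short sketch that uses a pattern rule together with the wildcard and patsubst functions, all of which are described in the GNU make manual and may not be available in other implementations. The .lines suffix and the line-counting step are hypothetical placeholders, not part of any service client.

```makefile
# Hypothetical example: count the lines of every *.fasta file in the
# current directory. The .lines suffix is an illustrative placeholder.

# GNU make function: list the files matching a glob pattern.
INPUTS := $(wildcard *.fasta)
# GNU make function: map each X.fasta to X.lines.
OUTPUTS := $(patsubst %.fasta,%.lines,$(INPUTS))

# 'all' names no file, so declare it phony.
.PHONY: all
all: $(OUTPUTS)

# Pattern rule: how to make any X.lines from X.fasta.
# $< is the first prerequisite and $@ is the target.
%.lines: %.fasta
	wc -l < $< > $@
```

A single pattern rule covers any number of input files, so adding another .fasta file to the directory requires no change to the makefile.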

Chain of Dependencies

A common task in sequence analysis is to obtain a phylogenetic tree of the homologues of a given sequence:

  1. Get query sequence (e.g. using WSDbfetch)
  2. Perform a sequence similarity search to find homologues (e.g. SSEARCH against UniProtKB/SwissProt using WSFasta)
  3. Get the sequences for a selection of hits (e.g. using WSDbfetch)
  4. Align the homologue sequences using multiple sequence alignment (e.g. using WSClustalW2)
  5. Generate a phylogenetic tree from the alignment (e.g. using WSClustalW2)

This can be implemented using make and the sample clients provided for the services. For example, using the SOAP::Lite based Perl clients for WSDbfetch, WSFasta and WSClustalW2, a sample makefile would look like:

# Directory containing Perl clients
WSDIR = ../../perl/soaplite
# Perl installation to use
WSPREFIX = perl
# User e-mail address for jobs
EMAIL = your@email
 
# First target in makefile is run by default
all: clustalw2-tree.ph
# 'all' names no file, so declare it phony
.PHONY: all
 
# Get the entry to find homologues for
query_seq.fasta:
	$(WSPREFIX) $(WSDIR)/dbfetch.pl fetchData 'uniprot:wap_rat' fasta raw > $@
 
# Perform a Smith-Waterman (S&W) search using SSEARCH to find homologues
ssearch.idlist: query_seq.fasta
	$(WSPREFIX) $(WSDIR)/fasta.pl --email $(EMAIL) --program ssearch --database swissprot --sequence query_seq.fasta --eupper 0.001 --quiet --async > $@.jobid
	$(WSPREFIX) $(WSDIR)/fasta.pl --jobid `cat $@.jobid` --poll --outfile ssearch
	$(WSPREFIX) $(WSDIR)/fasta.pl --jobid `cat $@.jobid` --ids --quiet > $@
 
# Get sequences of the 20 most significant homologues
homologue_seqs.fasta: ssearch.idlist
	head -n 20 $^ | $(WSPREFIX) $(WSDIR)/dbfetch.pl fetchBatch uniprot - fasta raw > $@
 
# Align homologue sequences using ClustalW2
clustalw2-align.aln: homologue_seqs.fasta
	$(WSPREFIX) $(WSDIR)/clustalw2.pl --email $(EMAIL) --align $^ --quiet --async > $@.jobid
	$(WSPREFIX) $(WSDIR)/clustalw2.pl --jobid `cat $@.jobid` --poll --outfile clustalw2-align
 
# Create a tree from the alignment using ClustalW2.
clustalw2-tree.ph: clustalw2-align.aln
	$(WSPREFIX) $(WSDIR)/clustalw2.pl --email $(EMAIL) --tree --outputtree phylip --kimura $^ --quiet --async > $@.jobid
	$(WSPREFIX) $(WSDIR)/clustalw2.pl --jobid `cat $@.jobid` --poll --outfile clustalw2-tree
 
# Target to delete working files.
# 'clean' names no file, so declare it phony
.PHONY: clean
clean:
	rm -f *.fasta ssearch.* clustalw2-align.* clustalw2-tree.*
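
Several recipes above write their output with a shell redirection (> $@). If a client fails, the shell may already have created an empty or partial target file, and a later run of make would then treat that file as up to date. One possible safeguard, sketched here assuming GNU make, is the .DELETE_ON_ERROR special target. The result.txt rule is hypothetical and exists only to demonstrate the behaviour.

```makefile
# GNU make special target: if a recipe exits with an error, delete the
# target file it was building, so that a partial file is not mistaken
# for a finished result on the next run.
.DELETE_ON_ERROR:

# Hypothetical rule for illustration: 'false' always fails, so make
# removes the empty result.txt created by the redirection.
result.txt:
	false > result.txt
```

Adding the single .DELETE_ON_ERROR: line anywhere in the sample makefile applies the same protection to all of its targets.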

If the file is named makefile or Makefile, then make will pick it up automatically:

make

Alternatively, if the file has another name, such as clustal.makefile, use the -f option to specify it:

make -f clustal.makefile

Examples

More examples can be found on myExperiment.


 
tutorials/07_workflows/make.txt · Last modified: 2012/07/08 09:44 by hpm