The files exonrank2_sorted.html and exonrank2_unsorted.html in this
directory list the results of our analysis to detect exons whose
sitewise dn/ds (omega) values do not conform to the background
distribution derived from all the proposed protein-coding parts of the
Encode regions.
Note that each row corresponds to one proposed exon.
In file exonrank2_sorted.html, the exons are sorted according to their
'oddness' (most strange exons first).
In file exonrank2_unsorted.html, the exons are ordered according to
target region and exon_id (see below).
In each file, the columns are:
exon_id: gene name + exon's Encode coordinates + reading frame (links
to plot of sitewise dn/ds values)
plot_coords: exon's first and last nucleotide positions in our own
gene position labelling system (corresponds to positions
in the sitewise dn/ds plots)
region: Encode target region
splicing: whether the exon is constitutive (const), alternatively
spliced (alt), or has no splicing information available (nA)
num_codons: no. of codons
pos_sites: no. of sites with significant evidence of positive selection
(dn/ds>1)
weimean_omega: weighted mean dn/ds
best_scenario, worst_scenario, avg_score:
We use two measures of 'oddness' for an exon, one (best_scenario)
based on the optimistic scenario that all its sitewise omegas
coincide with the lower bounds of their confidence intervals and one
(worst_scenario) based on the pessimistic scenario that all its
sitewise omegas coincide with the upper bounds of their confidence
intervals. We use the average of these two values (avg_score) as the
sorting criterion. A more detailed classification is possible using
the rules:
An exon should be considered good if its worst_scenario score is low.
An exon should be considered bad if its best_scenario score is high.
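
The sorting criterion and the two rules above can be sketched in Python as
follows. This is only an illustration, not the code used to produce the
files: the class ExonScore, the functions classify and sort_by_oddness, and
the cut-offs low and high are hypothetical, since this README does not state
what counts as a "low" or "high" score. The sketch also assumes that a larger
score means a stranger exon, which is what the two rules suggest.

```python
from dataclasses import dataclass


@dataclass
class ExonScore:
    """One row of the table (hypothetical container, only three columns)."""
    exon_id: str
    best_scenario: float   # score if all omegas sit at their lower bounds
    worst_scenario: float  # score if all omegas sit at their upper bounds

    @property
    def avg_score(self) -> float:
        # Average of the two scenario scores, used as the sorting criterion.
        return (self.best_scenario + self.worst_scenario) / 2.0


def classify(exon: ExonScore, low: float = 1.0, high: float = 3.0) -> str:
    """Apply the two rules; `low` and `high` are assumed cut-offs."""
    if exon.worst_scenario <= low:
        return "good"       # even the pessimistic score is low
    if exon.best_scenario >= high:
        return "bad"        # even the optimistic score is high
    return "undecided"      # the two scenarios disagree


def sort_by_oddness(exons: list[ExonScore]) -> list[ExonScore]:
    # Most strange exons first, assuming a larger avg_score means odder.
    return sorted(exons, key=lambda e: e.avg_score, reverse=True)


if __name__ == "__main__":
    # Made-up scores for illustration only; not taken from the result files.
    rows = [
        ExonScore("exon_a", 0.5, 1.0),
        ExonScore("exon_b", 3.5, 6.5),
        ExonScore("exon_c", 0.5, 4.5),
    ]
    for row in sort_by_oddness(rows):
        print(row.exon_id, row.avg_score, classify(row))
```

Exons that satisfy neither rule are reported as "undecided" in this sketch;
for those, the confidence intervals are too wide to decide either way.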