

Here is CFTR (target region 1), which looks like a sensible gene.

gene | prank | tba
CFTR.AC000061.1 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data

This is the (alphabetically) first gene of the first target region. It's a bit dodgy (two transcripts, equally poor).

gene | prank | tba
AC106873.3.AC106873.3 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data
004 | alignment browser summary raw data | alignment browser summary raw data

Gene CAV2 in target region 1 has two variants that differ in their first exon but share the rest of the coding sequence. The dn/ds estimates for the first exon of the second transcript (003) are unreasonably high, suggesting that this exon is not conserved in all the species. The alignment for the corresponding region (sites 1-35) looks sensible for primates, but the start codon isn't conserved even in all of them. If the alternative first exon is real, it seems to be either a rather recent invention or lost in most other species.

gene | prank | tba
CAV2.AC006159.1 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data
003 | alignment browser summary raw data | alignment browser summary raw data
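
One way to check the CAV2 start-codon observation programmatically is sketched below. It is only an illustration: the species names and sequences are invented toy data, not the real alignment, and `start_codon_conserved` is a hypothetical helper, not part of the pipeline behind these pages.

```python
def start_codon_conserved(aligned_cds, start=0):
    """Report, per species, whether the aligned codon at `start` is ATG.

    aligned_cds: dict mapping species name -> aligned nucleotide string.
    start: 0-based alignment column where the start codon should begin.
    """
    result = {}
    for species, seq in aligned_cds.items():
        codon = seq[start:start + 3].upper()
        # Gapped ("---") or substituted codons both count as not conserved.
        result[species] = codon == "ATG"
    return result


# Toy alignment, invented for illustration only.
toy_alignment = {
    "human": "ATGGCC",
    "chimp": "ATGGCT",
    "mouse": "---GCC",
    "dog": "ACGGCC",
}
status = start_codon_conserved(toy_alignment)
missing = sorted(species for species, ok in status.items() if not ok)
print(missing)
```

Species listed in `missing` either lack the start codon or have a gap at that position, which is the pattern described for the alternative first exon.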

Gene OSCAR in target region 7. Many transcripts, all pretty 'bad'. Transcript 007 is the strangest, as indicated by the whole-transcript dn/ds values here and here. The exons at positions 258-384 and 387-666 in the slr-plot make up this transcript, and they are high up in the 'dubious exon' identification lists here and here. Notice from the slr-plot that transcript 007 is in a different reading frame. The alignment is pretty reasonable, and there is a start codon (except in monodelphis; the sequence is missing in platypus).

gene | prank | tba
OSCAR.AC012314.3 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data
002 | alignment browser summary raw data | alignment browser summary raw data
003 | alignment browser summary raw data | alignment browser summary raw data
004 | alignment browser summary raw data | alignment browser summary raw data
005 | alignment browser summary raw data | alignment browser summary raw data
006 | alignment browser summary raw data | alignment browser summary raw data
007 | alignment browser summary raw data | alignment browser summary raw data
008 | alignment browser summary raw data | alignment browser summary raw data
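
The 'dubious exon' lists are linked rather than described, so how they are scored is not stated here. As a rough sketch, assuming an exon's score is simply the mean of the sitewise omega values over its slr-plot positions, exons could be ranked as follows. The function name, the scoring rule and the toy omega values are all assumptions made for illustration.

```python
from statistics import mean


def rank_exons_by_mean_omega(sitewise_omega, exons):
    """Rank exons from highest to lowest mean sitewise omega.

    sitewise_omega: dict mapping 1-based site position -> omega estimate.
    exons: list of (start, end) positions, both ends inclusive.
    """
    scored = []
    for start, end in exons:
        values = [sitewise_omega[site]
                  for site in range(start, end + 1)
                  if site in sitewise_omega]
        if values:  # skip exons with no estimates at all
            scored.append(((start, end), mean(values)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy values, invented for illustration only.
toy_omega = {1: 0.1, 2: 0.2, 3: 0.1, 4: 1.5, 5: 2.0, 6: 1.0}
ranked = rank_exons_by_mean_omega(toy_omega, [(1, 3), (4, 6)])
for (start, end), score in ranked:
    print(f"{start}-{end}: mean omega {score:.2f}")
```

With real data the exon ranges would be the slr-plot positions quoted above, such as 258-384 and 387-666.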

Gene AC113617.1 in target region ENr113. All four of its transcripts are picked up by looking at the whole-transcript dn/ds values: they are reasonably near the top of the sorted lists here and here. In addition, the exons at positions 13-130 and 223-283 in the slr-plot score highly on the 'dubious exon' identification lists here and here.

gene | prank | tba
AC113617.1.AC113617.1 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data
002 | alignment browser summary raw data | alignment browser summary raw data
003 | alignment browser summary raw data | alignment browser summary raw data
004 | alignment browser summary raw data | alignment browser summary raw data
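
For readers who want to reproduce a sorted whole-transcript list from the raw data, a minimal sketch follows. The transcript identifiers and dn/ds values are hypothetical placeholders, and `rank_by_dnds` is an illustrative helper, not the script that produced the linked lists.

```python
def rank_by_dnds(dnds_by_transcript):
    """Return transcript IDs ordered from highest to lowest dn/ds."""
    return sorted(dnds_by_transcript,
                  key=dnds_by_transcript.get,
                  reverse=True)


# Hypothetical whole-transcript dn/ds values, for illustration only.
toy_dnds = {
    "GENE_A.001": 0.12,
    "GENE_B.001": 0.95,
    "GENE_B.002": 1.40,
    "GENE_C.001": 0.30,
}
ranking = rank_by_dnds(toy_dnds)
top_two = ranking[:2]
print(top_two)
```

Transcripts of the same gene appearing together near the top of such a list is the pattern noted for AC113617.1.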

Gene CATSPER2 in target region ENr233. It doesn't stand out particularly according to the whole-transcript statistics, but on the per-exon measures here and here the exons at positions 1123-1177 and especially 1180-1396 are high in the list of strange things. It is also interesting to look in the browser (here: pops up a new window) and zoom right in to the region chr15:41711650-41712450: you see a 'good' exon to the left (lots of signal for negative selection; a little for positive selection) and one of the 'bad' ones to the right (too much positive selection signal).

gene | prank | tba
CATSPER2.AC011330.3 | slr-plot browser | slr-plot browser
001 | alignment browser summary raw data | alignment browser summary raw data
002 | alignment browser summary raw data | alignment browser summary raw data
004 | alignment browser summary raw data | alignment browser summary raw data
009 | alignment browser summary raw data | alignment browser summary raw data
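
If the browser region needs to be handled in a script, a position string such as the one quoted for CATSPER2 can be split into its parts. This is a minimal sketch: `parse_region` is a hypothetical helper, and the span calculation assumes both coordinates are inclusive.

```python
import re

# Matches strings of the form chrNAME:START-END.
REGION_RE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")


def parse_region(region):
    """Split 'chr:start-end' into (chromosome, start, end)."""
    match = REGION_RE.match(region.replace(",", ""))
    if match is None:
        raise ValueError(f"not a region string: {region!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is after end in {region!r}")
    return chrom, start, end


chrom, start, end = parse_region("chr15:41711650-41712450")
span = end - start + 1  # assumes both ends are inclusive
print(chrom, start, end, span)
```

Under that assumption the window is 801 bases wide, small enough to see both the 'good' and the 'bad' exon at once.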

NG has spotted something odd in each of the following regions, but has not yet written up full notes:

ENm014/GRM8: tx003 has an extra exon in different frame, leading to an early stop codon. It doesn't look well conserved, judging by sitewise omega.

ENm012/FOXP2: tx004 has an extra, weird exon; the high omega values result from poor alignment in monodelphis+platypus+chicken. Notice also the gaps in xenopus and all fish.

ENm002/PDLIM4: tx002 has an exon in a different reading frame, with reasonable omega values.

ENm003/ZNF259: notice transcripts in all reading frames, some exons with reasonable omega values and others not.
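
Several of these notes concern exons in a different reading frame, and the GRM8 one mentions an early stop codon. A minimal sketch of how a premature stop could be located in a given frame is shown below. The sequence is an invented toy example, not GRM8, and `first_stop_codon` is a hypothetical helper that assumes the standard stop codons TAA, TAG and TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def first_stop_codon(cds, frame=0):
    """Return the 0-based index of the first stop codon in `frame`.

    Returns None if no complete stop codon is found in that frame.
    """
    seq = cds.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequence, invented for illustration only.
toy_cds = "ATGGCCTGAAAATAG"
in_frame_stop = first_stop_codon(toy_cds, frame=0)
shifted_stop = first_stop_codon(toy_cds, frame=1)
print(in_frame_stop, shifted_stop)
```

A stop index well short of the transcript end, as in frame 0 of the toy sequence, is the kind of early stop described for tx003.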