Datasets

What are Datasets?

Datasets are defined file collections, whose access is governed by a Data Access Committee (DAC).

Total number of Datasets: 3997

Dataset Accession Description Technology Samples File Types
EGAD00010001577 RNA from the same tumor sample (n=98) was also processed using the 3' IVT kit (Affymetrix) and hybridized to U133 Plus 2.0 arrays (Affymetrix). Affymetrix GeneChip Scanner 3000 7g 98
EGAD00010001571 Genomic Landscape of Chordoid Glioma Illumina HumanCoreExome-24 array 9
EGAD00010001569 Summary statistics from Stage-1 GWAS for blood pressure phenotypes 5
EGAD00010001564 Primary renal cell carcinoma (RCC) by Affymetrix GeneChip miRNA 4.0 Affymetrix GeneChip miRNA 4.0 56
EGAD00010001562 WG mRNA profiling in FFPE primary melanoma Illumina HT12.4 703
EGAD00010001561 Quantile-normalised and batch corrected Illumina HT12.4 703
EGAD00010001557 503 genotypes obtained from Illumina DNA-arrays. Available as plink formatted files Illumina arrays 503
EGAD00010001551 The Kibbutzim Family Study (KFS) aimed to investigate the environmental and genetic determinants of cardiometabolic traits (phenotype is LDL-C) Illumina HumanCoreExome BeadChip array 901
EGAD00010001546 ATRT expression Illumina HumanHT-12 v4.0 Array 92
EGAD00010001544 Imputed genetic data for INTERVAL proteomics cohort Affymetrix Axiom UK Biobank + imputation to 1000GP3 and UK10K 3,301
EGAD00010001542 Expression data for 42 PMBCL patient samples (32 IL4R WT cases and 10 cases with mutations in IL4R) Illumina DASL Assay 42
EGAD00010001540 Oncoscan CHP files for the Mesothelioma Project Illumina Oncoscan Array 100
EGAD00010001536 kidney cancer tissue sample Illumina CytoSNP 12 bead array 129
EGAD00010001535 mRNA expression profile of kidney cancer nanostring 126
EGAD00010001533 A cohort of 2886 participants of the Japan PBC-GWAS Study Affymetrix Axiom Genome-Wide ASI 1 Array 2,886
EGAD00010001528 DNA for 2979 individuals from Guangzhou was extracted from peripheral blood and genotyped by Sequenom, with digit-number working memory, visuospatial working memory, and recent long-term memory measured. Sequenom 2,979
EGAD00010001527 DNA for 1546 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 1 gene chips, with digit-number working memory, visuospatial working memory, and recent long-term memory measured. Illumina 1,546
EGAD00010001526 DNA for 2482 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 2 gene chips, with digit-number working memory, visuospatial working memory, and recent long-term memory measured. Illumina 2,482
EGAD00010001521 Bisulfite Converted DNA obtained from Whole Blood analysed on IlluminaHumanMethylationEPIC BeadChip microarrays processed with bigmelon R package IlluminaHumanMethylationEPIC 1,175
EGAD00010001519 Raw Array data from the PRAD-CA for ICGC DCC Release27 Affymetrix OncoScan FFPE Express 116
EGAD00010001515 Nanostring PanCancer immune profiling data for The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer Nanostring 120
EGAD00010001513 Copy Number Variation as determined on Illumina Omni Arrays Illumina Beadchip 122
EGAD00010001511 SNP 6.0 arrays of LCNEC samples Affymetrix SNP 6.0 54
EGAD00010001509 A WTCCC2 project - replication study for bacteraemia susceptibility in 2518 individuals from Kenya, genotyped on the Illumina Immunochip chip. Illumina Infinium ImmunoChip 2,518
EGAD00010001506 Methylation array dataset Illumina 450k 38
EGAD00010001501 mRNA profiling of human plucked hair follicle from frontal and occipital scalp Illumina HT12 48
EGAD00010001500 miRNA profiling of human plucked hair follicle from frontal and occipital scalp Affymetrix miRNA 4.0 Array 48
EGAD00010001499 EXOME ARRAY ANALYSIS OF ADVERSE REACTIONS TO FLUOROPYRIMIDINE-BASED THERAPY FOR GASTROINTESTINAL CANCER Illumina HumanExome Array 504
EGAD00010001497 The UK BioBank directly genotyped dataset 2018 Affymetrix 485,000
EGAD00010001495 Intensity files for Immunochip genotypes from blood Illumina Immunochip 314
EGAD00010001491 ADP array data, comprised of 2217 samples of Asian ancestry (excluding the Japanese population from ADP). Samples were genotyped on different Illumina or Affy platforms. Affymetrix/Illumina 3,933
EGAD00010001489 Genotype data for 5,699,237 genotyped and imputed SNPs in the 816 healthy donors of the Milieu Intérieur cohort Illumina HumanOmniExpress 816
EGAD00010001487 252 dengue fever patients and 159 dengue shock syndrome patients Illumina Human 660W Quad BeadChip 411
EGAD00010001486 290 controls Illumina HumanOmniExpress BeadChip 290
EGAD00010001484 Genetic Overlap between Metabolic and Psychiatric disease Illumina HumanCoreExome-12v1-0 2,611
EGAD00010001482 CASE SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA) OpenArray 657
EGAD00010001481 CONTROL SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA) OpenArray 258
EGAD00010001479 SNP data for 991 Irish individuals Illumina 991
EGAD00010001474 UK BioBank Imputed Dataset 1
EGAD00010001473 Himalayan population genetic study raw data (Tibet) Illumina humanomniexpress-24-v1-1 148
EGAD00010001472 Himalayan population genetic study raw data (Himalaya) Illumina HumanOmni1-Quad_v1-0 565
EGAD00010001471 Himalayan population genetic study QC filtered data Illumina HumanOmni1-Quad_v1-0, HumanOmniExpress-12-v1-0, humanomniexpress-24-v1-1, HumanOmni25-8v1-2_A1 738
EGAD00010001470 Himalayan population genetic study raw data (Himalaya) Illumina HumanOmniExpress-12-v1-0 170
EGAD00010001463 Genotype cases using Illumina HumanOmni5 Illumina HumanOmni5 279
EGAD00010001461 illumina 450K 450K 472
EGAD00010001457 These are the log2CPM (log2 counts per million) fragments per gene counts associated with the BAM files in EGAD00001003806, in tab-separated format. Counts for 36 postmortem brain samples from 9 non-demented control subjects and 9 Hereditary cerebral hemorrhage with amyloidosis-Dutch type subjects are included (1 Frontal cortex sample and 1 Occipital cortex sample per subject). RNA samples were depleted for ribosomal RNA with the Ribo Zero Gold Human kit (Illumina) and strand-specific RNA-Seq libraries were generated. Paired-end sequencing was performed on a HiSeq2500 Illumina system (2x50bp reads). Alignments were performed using GSNAP v2014-12-23 with setting "--npaths 1" on the GRCh38 reference genome without the alternative contigs. Fragment per gene counting was performed using HTSeq-count v0.6.1p1 with setting "--stranded reverse". The gene annotations used for quantification were UCSC RefSeq genes for GRCh38 downloaded on 2015-07-13. Illumina HiSeq 2500 36
EGAD00010001455 illumina 450K 450K 1,347
EGAD00010001452 Genome-wide SNP genotyping data for 102 Pakistani individuals by Illumina HumanOmni2.5-8 array, used in the EGAS00001002558 study Illumina HumanOmni2.5-8 102
EGAD00010001450 Methylation JMML samples using 450K Array Illumina_450K 92
EGAD00010001449 Methylation Control samples using 450K Array Illumina_450K 22
EGAD00010001447 Array-based association data Illumina Omni Express/Illumina Core Exome 784
EGAD00010001443 SNP array Affymetrix SNP6.0 100
EGAD00010001433 ATRT methylation Illumina HumanMethylation450 BeadChip 162
EGAD00010001430 Gene expression analysis from primary human JMML samples using Illumina Human HT-12 v4 Illumina_HumanHT-12_V4 15
EGAD00010001428 Cardio-Metabochip genotypes for IHIT cohort Illumina 2,791
EGAD00010001427 Cardio-Metabochip genotypes for B99 cohort Illumina 1,336
EGAD00010001425 Codelink Human Whole Genome from Blood taken at 72 hours after birth Codelink Human Whole Genome Bioarray 9
EGAD00010001424 Codelink Human Whole Genome from Blood taken at 72 hours after birth Codelink Human Whole Genome Bioarray 11
EGAD00010001422 1000G Phase 3 Imputed cases and controls from NSAID-induced PUD study Illumina Omni 2.5 676
EGAD00010001420 Read counts determined using HTSeq-count for the BBMRI BIOS Freeze 2 RNAseq data RNAseq 3,560
EGAD00010001418 HumanOmni25M-8v1-1 Illumina 24
EGAD00010001416 BBMRI - BIOS project - Freeze 2 - methylation Illumina Human Methylation 450k BeadChip 4,386
EGAD00010001414 Raw Array data from the PRAD-CA for ICGC DCC Release26 Affymetrix OncoScan FFPE Express 86
EGAD00010001412 Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC) Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifier 920
EGAD00010001410 Genotyped samples using Illumina Infinium HumanCoreExome Beadchip Illumina Infinium HumanCoreExome Beadchip 502
EGAD00010001408 Illumina Infinium 450K array data Illumina 450K 34
EGAD00010001406 Breast cancer tissue and controls Exiqon 7th generation miRCURY LNA microRNA microarray system 149
EGAD00010001403 Gene expression read counts Illumina HiSeq2000 132
EGAD00010001400 Difference in gene expression values between case and control, log2 values. Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC) Post-genome Cohort taken up to eight years before breast cancer diagnosis. Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifiers. Illumina HumanWG-6 467
EGAD00010001396 A discovery cohort of 856 adult survivors of pediatric ALL Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific 856
EGAD00010001395 A replication cohort consisting of 1428 adult survivors of any non-ALL pediatric cancer Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific 1,428
EGAD00010001392 Genotyping data from Swahili individuals Illumina Human Omni5 Bead Chip 91
EGAD00010001390 HipSci - Battens Disease - Expression Array - July 2017 Illumina 1
EGAD00010001388 HipSci - Retinitis Pigmentosa - Expression Array - July 2017 Illumina 1
EGAD00010001386 HipSci - Macular Dystrophy - Expression Array - July 2017 Illumina 1
EGAD00010001384 HipSci - Bleeding and Platelet Disorders - Expression Array - July 2017 Illumina 1
EGAD00010001382 HipSci - Primary Immune Deficiency - Expression Array - July 2017 Illumina 1
EGAD00010001380 HipSci - Hypertrophic Cardiomyopathy - Expression Array - July 2017 Illumina 1
EGAD00010001378 HipSci - Congenital Hyperinsulinia - Expression Array - July 2017 Illumina 1
EGAD00010001376 HipSci - Alport Syndrome - Expression Array - July 2017 Illumina 1
EGAD00010001374 HipSci - Usher Syndrome - Expression Array - July 2017 Illumina 1
EGAD00010001372 HipSci - Kabuki Syndrome - Expression Array - July 2017 Illumina 1
EGAD00010001370 HipSci - Hereditary Spastic Paraplegia - Expression Array - July 2017 Illumina 1
EGAD00010001368 HipSci - Hereditary Cerebellar Ataxias - Expression Array - July 2017 Illumina 1
EGAD00010001366 HipSci - Battens Disease - Genotyping Array - July 2017 Illumina 1
EGAD00010001364 HipSci - Retinitis Pigmentosa - Genotyping Array - July 2017 Illumina 1
EGAD00010001362 HipSci - Macular Dystrophy - Genotyping Array - July 2017 Illumina 1
EGAD00010001360 HipSci - Bleeding and Platelet Disorders - Genotyping Array - July 2017 Illumina 1
EGAD00010001358 HipSci - Primary Immune Deficiency - Genotyping Array - July 2017 Illumina 1
EGAD00010001356 HipSci - Hypertrophic Cardiomyopathy - Genotyping Array - July 2017 Illumina 1
EGAD00010001354 HipSci - Congenital Hyperinsulinia - Genotyping Array - July 2017 Illumina 1
EGAD00010001352 HipSci - Alport Syndrome - Genotyping Array - July 2017 Illumina 1
EGAD00010001350 HipSci - Usher Syndrome - Genotyping Array - July 2017 Illumina 1
EGAD00010001348 HipSci - Kabuki Syndrome - Genotyping Array - July 2017 Illumina 1
EGAD00010001346 HipSci - Hereditary Spastic Paraplegia - Genotyping Array - July 2017 Illumina 1
EGAD00010001344 HipSci - Hereditary Cerebellar Ataxias - Genotyping Array - July 2017 Illumina 1
EGAD00010001342 HipSci - Monogenic Diabetes - Expression Array - July 2017 Illumina 1
EGAD00010001340 HipSci - Bardet-Biedl Syndrome - Expression Array - July 2017 Illumina 1
EGAD00010001334 HipSci - Monogenic Diabetes - Genotyping Array - July 2017 Illumina 1
EGAD00010001332 HipSci - Bardet-Biedl Syndrome - Genotyping Array - July 2017 Illumina 1
EGAD00010001330 HipSci - Healthy Normals - Expression Array - July 2017 Illumina 1
EGAD00010001328 HipSci - Healthy Normals - Genotyping Array - July 2017 Illumina 0
EGAD00010001326 Papuan Genotyping Illumina Multi-EthnicGlobal_A1 380
EGAD00010001323 Medulloblastoma methylation profiling Illumina Infinium HumanMethylation450 BeadChip 911
EGAD00010001319 Medulloblastoma methylation profiling Illumina Infinium HumanMethylation450 BeadChip 345
EGAD00010001315 Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, which allowed us to use the Demuxlet method that uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Since genotype information is lacking for 2 individuals, the transcriptome of only 45 individuals could be retrieved. Illumina HiSeq4000 45
EGAD00010001310 iOmics lipid data via mass spectrometry (MS) Agilent 1200 LC system 359
EGAD00010001309 iOmics genomic data using 2.5M and Exome array Illumina 2.5M and Illumina Exome array 323
EGAD00010001308 iOmics miRNA data via qPCR quantification patented mSMRT-qPCR miRNA assay (MIRXES) 351
EGAD00010001307 iOmics gene expression data using Expression Array Affymetrix Human Gene 1.0 ST Array 269
EGAD00010001304 Genotyping data from Comorian individuals Illumina Human Omni5 Bead Chip 49
EGAD00010001301 Medulloblastoma expression profiling Affymetrix expression array 246
EGAD00010001300 Medulloblastoma expression profiling Affymetrix expression array 146
EGAD00010001298 primary human ACC and normal samples using 450K Illumina_450K 110
EGAD00010001296 DNA methylation analysis from primary human JMML and normal blood samples using 450K Illumina_450K 0
EGAD00010001294 Methylation data using 450K Illumina 450k 1,128
EGAD00010001292 Genotyping of hip osteoarthritis patients who have undergone total joint replacement Illumina InfiniumCoreExome-24v1-1_A 9
EGAD00010001291 Methylation profiling of hip osteoarthritis patients who have undergone total joint replacement Illumina HumanMethylation450K 27
EGAD00010001289 Resolving the Genetic Architecture of Aseptic Loosening After Total Hip Replacement Illumina InfiniumCoreExome-24v1-1_A 2,880
EGAD00010001287 Array methylation profiling of knee osteoarthritis patients who have undergone total joint replacement Illumina HumanMethylation450K 68
EGAD00010001285 Genotyping of knee osteoarthritis patients who have undergone total joint replacement Illumina InfiniumCoreExome-24v1-1_A 17
EGAD00010001283 Illumina HumanOmni5-Quad BeadChips Illumina 229
EGAD00010001281 SNP array dataset HUMANOMNIEXPRESS 50
EGAD00010001280 Transcriptome array dataset Affymetrix HG_U133_+2 25
EGAD00010001276 Expression profiling by Nanostring cancer pathway Nanostring cancer pathway 30
EGAD00010001275 Affymetrix GeneChip® Human Transcriptome Array 2.0 Affymetrix GeneChip® Human Transcriptome Array 2.0 34
EGAD00010001274 Expression profiling by Nanostring cancer immune Nanostring Cancer Immune 30
EGAD00010001273 Affymetrix GeneChip® Human Transcriptome Array 2.0 Affymetrix GeneChip® Human Transcriptome Array 2.0 34
EGAD00010001262 DNAm Case samples using Illumina Infinium 450K Illumina 450K array 32
EGAD00010001261 DNAm Case samples using Illumina Infinium 450K Illumina 450K array 33
EGAD00010001260 DNAm Case samples using Illumina Infinium 450K Illumina 450K array 33
EGAD00010001258 Human Cardio Metabochip Illumina 973
EGAD00010001255 Autosomal STR genotypes using 15 Identifiler loci Applied Biosystems 990
EGAD00010001251 Epigenome of 36 rainforest hunter-gatherer Baka of Cameroon by Illumina HumanMethylation450 array, used in the EGAS00001002226 study Illumina HumanMethylation450 38
EGAD00010001249 TGCT - GWAS loci Hi-C data Illumina HiSeq 2000 1
EGAD00010001247 UK TGCT case samples using the Infinium OncoArray-500K BeadChip Infinium OncoArray-500K BeadChip 3,206
EGAD00010001246 UK TGCT control samples using the Infinium OncoArray-500K BeadChip Infinium OncoArray-500K BeadChip 7,422
EGAD00010001243 UK TGCT control samples using the Infinium 1.2M array Illumina Infinium 1.2M array 4,946
EGAD00010001241 CN/LOH-profile of Translocation-negative FL_7 Affymetrix SNP 6.0 1
EGAD00010001240 CN/LOH-profile of Translocation-negative FL_1 Affymetrix SNP 6.0 1
EGAD00010001239 CN/LOH-profile of Translocation-negative FL_6 Affymetrix SNP 6.0 1
EGAD00010001238 CN/LOH-profile of Translocation-negative FL_2 Affymetrix SNP 6.0 1
EGAD00010001237 CN/LOH-profile of Translocation-negative FL_10 Affymetrix SNP 6.0 1
EGAD00010001236 CN/LOH-profile of Translocation-negative FL_4 Affymetrix SNP 6.0 1
EGAD00010001235 CN/LOH-profile of Translocation-negative FL_11 Affymetrix SNP 6.0 1
EGAD00010001234 CN/LOH-profile of Translocation-negative FL_9 Affymetrix SNP 6.0 1
EGAD00010001233 CN/LOH-profile of Translocation-negative FL_5 Affymetrix SNP 6.0 1
EGAD00010001232 CN/LOH-profile of Translocation-negative FL_8 Affymetrix SNP 6.0 1
EGAD00010001228 Primary and PDX SqCC samples using Infinium OmniExpress-24 Infinium_OmniExpress-24v1.0 24
EGAD00010001223 Illumina Omni 2.5M SNPchip data (build37) of Egyptian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) Illumina HumanOmni2-5M-8v1-1_B 100
EGAD00010001221 Illumina Omni 2.5M SNPchip data (build37) of Ethiopian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) Illumina HumanOmni2-5_8v1_A 124
EGAD00010001218 Raw Array data from the CPCGene 200PG study Affymetrix OncoScan FFPE Express 248
EGAD00010001216 Melanoma cell lines CNV by SNP6 SNP6 22
EGAD00010001212 Genetic studies of pregnancy-related cardiometabolic disorders in Central Asian, Northern European, and Colombian populations Illumina HumanOmniExpress-12v1_J 1,207
EGAD00010001211 Inverse variance weighted fixed effect meta-analysis of three European GWAS studies of the offspring of Pre-eclampsia affected births (2658 Cases and 308267 Controls). 4
EGAD00010001209 Genome-wide SNP genotyping data for 1,235 western Africans by Illumina HumanOmniExpress-12 array, used in the EGAS00001002078 study Illumina HumanOmniExpress-12 1,235
EGAD00010001202 Human genotyping data for patients infected by hepatitis C virus Affymetrix UKBiobank Array 563
EGAD00010001200 Genotyping data from Indonesian sea nomad and surrounding populations Illumina Omni 5 105
EGAD00010001198 Case control samples using Infinium Omni2.5 Infinium Omni2.5M 274
EGAD00010001196 Raw Array data from the CPCGene BRCA study Affymetrix OncoScan FFPE Express 48
EGAD00010001192 Germline genotype data on 56,479 ovarian cancer cases and controls Illumina OncoArray 56,479
EGAD00010001188 This data set includes the following summary level data files used for the 13k analysis of T2D-GENES data: wes.variants.list: list of variants to keep for any analysis of the exomes data wes.assoc.samples.list: list of samples to keep for association analysis wes.assoc.variants.list: list of variants to keep for association analysis wes.sv.assoc.txt: single variant association analysis results wes.gene.ptv.variants.list.txt: list of protein truncating variants to use in gene-level analysis wes.gene.ptv.assoc.txt: results from gene-level tests of protein truncating variants wes.gene.nsstrict.variants.list.txt: list of NSstrict variants to use in gene-level analysis wes.gene.nsstrict.assoc.txt: results from gene-level tests of NSstrict variants wes.gene.nsbroad.variants.list.txt: list of NSbroad variants to use in gene-level analysis wes.gene.nsbroad.assoc.txt: results from gene-level tests of NSbroad variants wes.gene.ns.variants.list.txt: list of non synonymous variants to use in gene-level analysis wes.gene.ns.assoc.txt: results from gene-level tests of non synonymous variants 0
EGAD00010001187 This data set includes the following summary level data file used for the exome chip analysis: exome_chip.sv.assoc.txt: results from single variant association analysis in exome chip 0
EGAD00010001185 This data set includes the following summary level data files used for the GoT2D WGS analysis: wgs.assoc.samples.list: list of samples to keep for association analysis wgs.assoc.variants.list: list of variants to keep for association analysis wgs.sv.assoc.txt: single variant association results 0
EGAD00010001184 This data set includes the following summary level data file used for the imputation data: imputation.sv.assoc.txt: results from single variant association analysis in imputed samples 0
EGAD00010001179 Tissue samples using Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip 22
EGAD00010001177 This dataset contains 61 tumors SNP-array dataset from 15 EGFR mutant lung adenocarcinoma patients. Illumina 61
EGAD00010001176 This dataset contains 15 control SNP-array dataset from 15 EGFR mutant lung adenocarcinoma patients. Illumina 15
EGAD00010001162 Oncotrack primary tumor samples using 450K. The dataset includes shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt which were applied for both Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets. Illumina 450K 67
EGAD00010001161 Oncotrack metastatic samples using 450K. The shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt which were applied for both Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets are included on Oncotrack_450K_tumor (EGAD00010001162) dataset. Illumina 450K 15
EGAD00010001158 Genotyping of additional Inflammatory Bowel Disease cases - 2014 (all samples) Illumina Human Core Exome 12v1-1_a 11,767
EGAD00010001157 Genotyping of additional Inflammatory Bowel Disease cases - 2014 (QC pass samples) Illumina Human Core Exome 12v1-1_a 9,247
EGAD00010001155 Crohn's disease DNA samples genotyped using UK Biobank Axiom array Axiom UKB 1,676
EGAD00010001153 Family Trios on aCGH 8x60K Agilent 8x60K 138
EGAD00010001149 HipSci - Monogenic Diabetes - Methylation Array - October 2016 Illumina 35
EGAD00010001147 HipSci - Healthy Normals - Genotyping Array - September 2016 Illumina 613
EGAD00010001145 HipSci - Bardet-Biedl Syndrome - Methylation Array - October 2016 Illumina 45
EGAD00010001143 HipSci - Healthy Normals - Expression Array - September 2016 Illumina 613
EGAD00010001141 Summary data from Meta-analysis of Genome-Wide-Association Studies for plasma levels of Coagulation Factor XI (FXI) 0
EGAD00010001139 HipSci - Healthy Normals - Methylation Array - October 2016 Illumina 181
EGAD00010001131 The 100 European-descent (EUB) and 100 African-descent (AFB) Belgians studied were genotyped for a total of 4,301,332 SNPs on the Illumina HumanOmni5-Quad BeadChips. Whole-exome sequencing was carried out for the same 200 individuals with the Nextera Rapid Capture Expanded Exome kit, on the Illumina HiSeq 2000 platform, with 100-bp paired-end reads. This kit delivers 62 Mb of genomic content per individual, including exons, untranslated regions (UTR), and microRNAs. Omni5 and exome datasets were merged, yielding a concordance rate between platforms of 99.93%. Illumina HumanOmni5-Quad and exome sequencing 200
EGAD00010001103 Genotype data from Chad, Lebanon, and Yemen Illumina HumanOmni2.5-8 v1.1 B 238
EGAD00010001102 Genotype data from Chad, Lebanon, and Yemen Illumina HumanOmni2.5-8 v1.2 A 126
EGAD00010001101 Genotype data from Chad, Lebanon, and Yemen Illumina HumanOmni2.5-8 v1.1 B 20
EGAD00010001099 Digital images of ovarian cancer metastases Aperio 127
EGAD00010001081 Summary statistics for Malaria Genomic Epidemiology Network, "A novel locus of resistance to severe malaria in a region of ancient balancing selection", Nature (2015) Illumina Omni 2.5M 11,657
EGAD00010001079 Affymetrix SNP6.0 array breast cancer data Affymetrix SNP6.0 66
EGAD00010001075 Argentine samples using 250K Illumina Exome 250K 391
EGAD00010001074 Rare CNVs from schizophrenia cases and controls Multiple CNV platforms 0
EGAD00010001064 tumor-based gene expression from breast cancer cases IlluminaHuman HT12 173
EGAD00010001063 blood-based gene expression from breast cancer cases IlluminaHuman AWG-6 and HT12 173
EGAD00010001062 blood-based gene expression from breast cancer cases and age-matched controls IlluminaHuman AWG-6 and HT12 455
EGAD00010001058 APCDR AGV Project: Array data from 100 Ga-Adangbe. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 100
EGAD00010001057 APCDR AGV Project: Array data from 88 Mandinka. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 88
EGAD00010001056 APCDR AGV Project: Array data from 100 Zulu. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 100
EGAD00010001055 APCDR AGV Project: Array data from 100 Baganda. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 100
EGAD00010001054 APCDR AGV Project: Array data from 74 Fula Illumina HumanOmni2-5_8v1_A 74
EGAD00010001053 APCDR AGV Project: Array data from 100 Banyarwanda. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 100
EGAD00010001052 APCDR AGV Project: Array data from 100 Kalenjin. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B 100
EGAD00010001051 APCDR AGV Project: Array data from 97 Barundi. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 97
EGAD00010001050 APCDR AGV Project: Array data from 78 Wolof. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 78
EGAD00010001049 APCDR AGV Project: Array data from 99 Kikuyu. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B 99
EGAD00010001048 APCDR AGV Project: Array data from 79 Jola. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 79
EGAD00010001047 APCDR AGV Project: Array data from 107 Ethiopians (Amhara, Oromo, Somali; subset of Ethiopian Genome Project Genotyping). Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 107
EGAD00010001046 APCDR AGV Project: Array data from 86 Sotho. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2-5_8v1_A 86
EGAD00010001045 APCDR AGV Project: Array data from 99 Igbo. Raw data, intensity files and post-QC Plink files. Illumina HumanOmni2.5-4v1_B 99
EGAD00010001043 WTCCC3 Anorexia Nervosa Infinium-HumanCoreExome Illumina HumanCoreExome-12v1-0_A and HumanCoreExome-24v1-0_A 925
EGAD00010001040 Methylation changes in OA patients with chronic exposure to cobalt and chromium Illumina HumanMethylation450 68
EGAD00010001034 WTCCC3 Anorexia Nervosa GWAS Illumina Human670-QuadCustom_v1_A 1,696
EGAD00010001032 RNA Expression using Illumina HT12 v3 Illumina HT12 v3 153
EGAD00010001029 Summary statistics for a multi-cohort epigenome-wide association study. This includes summary statistics (effect-size, standard error, p-value) for 470,000 methylation markers. 0
EGAD00010001025 BLUEPRINT DNA methylation profiles of monocytes, T cells and B cells in type 1 diabetes-discordant monozygotic twins Illumina 450K 302
EGAD00010001012 BLUEPRINT DNA Methylation 450K data of mantle cell lymphoma Illumina HumanMethylation 450K 86
EGAD00010001006 Proteomics LC-MS MS dataset Liquid chromatography–mass spectrometry 8
EGAD00010001005 Illumina HumanCoreExome-12v1-1_A chip typing in a Greek adolescent population Illumina Human Core Exome 12v1.1 120
EGAD00010001004 WTCCC1 project samples from 1958 British Birth Cohort Infinium 550K 1,504
EGAD00010001003 This data set contains two data files. The first data file (file name: PREDO_GA_EGA_methylation_data.csv) includes methylation data from 485512 sites across the human genome from 96 individuals acquired from the Illumina 450K-chip. The other data file (file name: PREDO_GA_EGA_phenotypes.csv) contains the gestation ages and the genders of the 96 samples. Illumina 450K-chip (methylation data) 96
EGAD00010001001 Primary renal cell carcinoma (RCC), RCC metastases and cell lines by Illumina 450K Illumina 450K 62
EGAD00010000983 MeDIP-seq RPM chromosome BED files for Peripheral Blood from EPITWIN Project (Columns 4-4353 represent samples) MeDIP-seq 4,350
EGAD00010000965 Array data from 4778 individuals from general population of rural Uganda Illumina HumanOmni2.5-8 BeadChip 4,778
EGAD00010000963 Healthy volunteers recruited in Samoa HumanCore-24 BeadChip 24
EGAD00010000962 Healthy volunteers and missing phenotype individuals recruited in New Caledonia with higher density genotyping HumanOmniExpressExome-8 BeadChip 30
EGAD00010000961 Rheumatic heart disease cases recruited in Fiji HumanCore-24 BeadChip 535
EGAD00010000960 Definite and borderline rheumatic heart disease cases and patients with mild non-diagnostic valvulopathy recruited in Samoa HumanCore-24 BeadChip 126
EGAD00010000959 Healthy volunteers recruited in Fiji HumanCore-24 BeadChip 854
EGAD00010000958 Healthy volunteers recruited in Fiji with higher density genotyping HumanOmniExpressExome-8 BeadChip 32
EGAD00010000957 Rheumatic heart disease cases recruited in New Caledonia HumanCore-24 BeadChip 465
EGAD00010000956 Rheumatic heart disease cases recruited in New Caledonia with higher density genotyping HumanOmniExpressExome-8 BeadChip 34
EGAD00010000955 Rheumatic heart disease cases recruited in Fiji with higher density genotyping HumanOmniExpressExome-8 BeadChip 32
EGAD00010000954 Healthy volunteers recruited in New Caledonia HumanCore-24 BeadChip 356
EGAD00010000953 Healthy adult volunteers and newborns recruited in various countries across Oceania. HumanCore-24 BeadChip 937
EGAD00010000952 Where Are You From? samples typed at 517K SNP loci Illumina HumanOmniExpress-24 BeadChip 598
EGAD00010000951 SNP array data for 668 cancer cell lines Illumina 2.5M 668
EGAD00010000950 WTCCC2 Bacteraemia Susceptibility (BS) samples using Affymetrix 6.0 Affymetrix 6.0 4,924
EGAD00010000949 Lymphoma samples using HumanOmni Illumina HumanOmni2.5 104
EGAD00010000948 Lymphoma samples using 450k Illumina 450k 95
EGAD00010000947 Lymphoma samples using CytoSNP Illumina CytoSNP 35
EGAD00010000946 Human samples, 450k analysis Illumina 450k 127
EGAD00010000944 Genotyping data from Southeast Borneo individuals Illumina Human Omni Express Bead Chip-24 v1.0 41
EGAD00010000943 Sahel population study using 2.5M Illumina HumanOmni2.5 161
EGAD00010000942 Breast lesions assayed with Affymetrix SNP 6.0 Affymetrix SNP 6.0 125
EGAD00010000941 Gambian specimens without trachomatous scarring Illumina Omni 2.5 1,531
EGAD00010000940 Gambian specimens with trachomatous scarring WHO grade C2/C3 Illumina Omni 2.5 1,531
EGAD00010000939 Illumina 1M SNP Array dataset Illumina 1M SNP Array 2
EGAD00010000938 mRNA Array Agilent 44K dataset Agilent 44K 16
EGAD00010000937 ACGH 180K dataset Agilent 180K 5
EGAD00010000936 Affymetrix Exon Array dataset Affymetrix GeneChip Human Exon 1.0 ST 2
EGAD00010000935 ACGH 244K dataset Agilent 244K 10
EGAD00010000934 Agilent miRNA dataset Agilent SurePrint Human miRNA Microarray 2
EGAD00010000929 WTCCC3_Primary Biliary Cirrhosis Replication Illumina ImmunoChip 2,981
EGAD00010000928 WTCCC3_Primary Biliary Cirrhosis Replication Post-QC Illumina ImmunoChip 2,861
EGAD00010000927 Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with consent for osteoarthritis studies only. Illumina HumanCoreExome-24v1-0 248
EGAD00010000926 Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with broader consent. Illumina HumanCoreExome-12v1-1 3,075
EGAD00010000925 Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with broader consent. Illumina HumanCoreExome-12v1-0 855
EGAD00010000924 Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with consent for osteoarthritis studies only. Illumina HumanCoreExome-12v1-1 991
EGAD00010000923 Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with consent for osteoarthritis studies only. Illumina HumanCoreExome-12v1-0 463
EGAD00010000922 Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with broader consent. Illumina HumanCoreExome-24v1-0 494
EGAD00010000921 samples using Affymetrix CYTOSCANHD CYTOSCANHD 12
EGAD00010000920 samples using Illumina HUMANOMNIEXPRESS HUMANOMNIEXPRESS 50
EGAD00010000919 samples using Illumina HUMANOMNI1QUAD HUMANOMNI1QUAD 2
EGAD00010000918 Understanding Society GWAS, samples that passed quality control, imputed to UK10K + 1000 Genomes combined reference panel Illumina HumanCoreExome-12v1-0 chip, UK10K + 1000 Genomes combined reference panel imputed 9,944
EGAD00010000917 399 tumors profiled using Agilent miRNA microarrays (Product Number G4872A, design ID 046064). The arrays are based on miRBase release 19.0 and 2006 human miRNAs are represented. 150 ng total RNA was used as input. Agilent miRNA microarrays 399
EGAD00010000916 BASIS breast cancer DNA methylation Illumina 450k Illumina 450k 457
EGAD00010000915 Affymetrix SNP6.0 breast cancer genome sequencing data Affymetrix SNP6.0 344
EGAD00010000913 SEA 660K Illumina 660K 3
EGAD00010000912 SEA 610K Illumina 610K 1
EGAD00010000911 HipSci - Embryonic Stem Cells - Genotyping Array - April 2016 Illumina 2
EGAD00010000910 HipSci - Embryonic Stem Cells - Expression Array - April 2016 Illumina 2
EGAD00010000909 HipSci - Embryonic Stem Cells - Methylation Array - April 2016 Illumina 2
EGAD00010000908 Illumina SNP-arrays for matching retinoblastoma-blood pairs and retinoblastoma cell lines. HumanOmni1 Quad BeadChip 132
EGAD00010000904 Genome-wide study of resistance to severe malaria in eleven worldwide populations:Kenya Illumina Omni 2.5M 3,865
EGAD00010000903 Genome-wide study of resistance to severe malaria in eleven worldwide populations:Malawi Illumina Omni 2.5M 3,088
EGAD00010000902 Genome-wide study of resistance to severe malaria in eleven worldwide populations:Gambia Illumina Omni 2.5M 5,594
EGAD00010000901 Russian Tuberculosis samples using Affymetrix 6.0 Affymetrix Genome-Wide Human SNP Array 6.0 Genotypes 11,937
EGAD00010000897 Infinium 450K in Rhabdomyosarcoma Infinium HumanMethylation450 BeadChip 53
EGAD00010000892 Healthy individuals from Italy Illumina 300
EGAD00010000891 Understanding Society GWAS, samples that passed quality control Illumina HumanCoreExome-12v1-0 9,944
EGAD00010000890 Understanding Society GWAS, all samples Illumina HumanCoreExome-12v1-0 10,463
EGAD00010000889 Gencode control samples using SNP6.0 SNP6.0 183
EGAD00010000887 Freeze 1 of the RP3 project Illumina Human Methylation 450k BeadChip 3,898
EGAD00010000886 samples using Affymetrix HG_U133_+2 Affymetrix HG_U133_+2 99
EGAD00010000883 The ARGO-Larissa GWAS. Illumina HumanCoreExome-24v1-0 859
EGAD00010000881 Digital images of ovarian cancer sections Aperio 91
EGAD00010000875 CLL Expression Array Affymetrix U219 1,008
EGAD00010000874 Understanding Society Sequenom genotypes Sequenom 4,295
EGAD00010000872 Genotyped case and control samples using HumanExome Beadchip 1,610
EGAD00010000871 CLL and normal B cell samples using 450K 226
EGAD00010000870 DNA methylation microarray Illumina_Infinium_HumanMethylation450 48
EGAD00010000869 RNA expression microarray Illumina_HumanHT-12v4 62
EGAD00010000868 Targeted bisulfite sequencing Illumina Bisulfite-Sequencing 16
EGAD00010000867 Expression Arrays Illumina beadarray 16
EGAD00010000865 MBDSEQ Illumina MBD-Sequencing 16
EGAD00010000863 H3K27Ac Illumina ChIP-Sequencing 16
EGAD00010000862 H3K27me3 Illumina ChIP-Sequencing 16
EGAD00010000860 Pol2 Illumina ChIP-Sequencing 16
EGAD00010000859 Smad3 Illumina ChIP-Sequencing 16
EGAD00010000858 Achalasia cases & controls 8,151
EGAD00010000854 WTCCC3 UK maternal cases of pre-eclampsia Illumina Human670-QuadCustom_v1 1,990
EGAD00010000853 VeraCode GoldenGate GT Assay technology 147
EGAD00010000850 BLUEPRINT DNA methylation profiles of monocytes, neutrophils and T cells from healthy donors Illumina 450K 525
EGAD00010000847 Genotyping using Affymetrix SNP6.0 49
EGAD00010000831 BLUEPRINT EpiMatch: harnessing epigenetics for hematopoietic stem cell transplantation Illumina Infinium HumanMethylation450 BeadChips 85
EGAD00010000829 Illumina Infinium 450K array data 70
EGAD00010000827 Illumina Infinium 450K array data 1
EGAD00010000823 Results of SNP arrays on synchronous CRC samples 1
EGAD00010000819 Summary statistics from meta-analysis for BP phenotypes 0
EGAD00010000817 HipSci - Monogenic Diabetes - Methylation Array - April 2015 0
EGAD00010000815 ATL tumor samples using Affymetrix 250K SNP array 1
EGAD00010000813 ATL tumor samples using Illumina 450K Methylation array 1
EGAD00010000811 ATL tumor samples using Illumina 610K SNP array 1
EGAD00010000807 Illumina HumanCoreExome genotyping data from the British Society for Surgery of the Hand Genetics of Dupuytren's Disease consortium (BSSH-GODD consortium) collection 4,201
EGAD00010000791 Illumina HumanOmni2.5-8 BeadChip 1
EGAD00010000790 ATRT expression unknown, Illumina Human HT6-v3 Array 41
EGAD00010000789 ATRT expression unknown, Illumina Human HT6-v3 Array 4
EGAD00010000787 Epigen-Brasil samples using HumanOmni2.5 6,487
EGAD00010000785 HipSci - Monogenic Diabetes - Expression Array - November 2014 0
EGAD00010000783 HipSci - Bardet-Biedl Syndrome - Expression Array - November 2014 0
EGAD00010000781 HipSci - Bardet-Biedl Syndrome - Methylation Array - April 2015 0
EGAD00010000779 HipSci - Monogenic Diabetes - Genotyping Array - November 2014 Illumina, unknown 9
EGAD00010000777 HipSci - Bardet-Biedl Syndrome - Genotyping Array - November 2014 0
EGAD00010000775 HipSci - Healthy Normals - Expression Array - November 2014 Illumina 580
EGAD00010000773 HipSci - Healthy Normals - Genotyping Array - November 2014 Illumina 580
EGAD00010000771 HipSci - Healthy Normals - Methylation Array - April 2015 0
EGAD00010000768 Replication data for HipSci normal samples using both HumanCoreExome-12_v1 and HumanOmni2.5-8 BeadChips 0
EGAD00010000766 We have established a mechanism for the collection of postal DNA samples from consenting National Joint Registry for England and Wales (NJR) patients and have carried out genotyping genome-wide in 903 patients with the condition Developmental Dysplasia of the Hip (DDH) on the Illumina CoreExome array 903
EGAD00010000764 Ovarian tumor samples using Illumina 0
EGAD00010000758 French glioma case germline genotypes using Illumina HumanExome-12v1_A array Illumina HumanExome-12v1_A 906
EGAD00010000756 French glioma control germline genotypes using Illumina HumanExome-12v1_A array Illumina HumanExome-12v1_A 699
EGAD00010000754 UK glioma case germline genotypes using Illumina HumanExome-12v1_A array Illumina HumanExome-12v1_A 596
EGAD00010000752 German glioma case germline genotypes using Illumina HumanExome-12v1_A array Illumina HumanExome-12v1_A 899
EGAD00010000750 German glioma control germline genotypes using Illumina HumanExome-12v1_A array Illumina HumanExome-12v1_A 2,391
EGAD00010000748 Genotyping using Illumina Human OmniExpress12v1.0 1
EGAD00010000744 Subset 2 of osteoarthritis cases genotyped on Illumina 610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with consent for osteoarthritis studies only. 2,326
EGAD00010000742 Subset 1 of osteoarthritis cases genotyped on Illumina 610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent. 5,383
EGAD00010000740 Osteoarthritis cases genotyped on Illumina HumanOmniExpress from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent. 674
EGAD00010000738 Generation Scotland APOE data 18,336
EGAD00010000736 AAD case and control samples from UK and Norway 117
EGAD00010000730 WTCCC2 Psychosis Endophenotype samples from UK, Germany, Holland, Spain and Australia using the Affymetrix 6.0 array 1
EGAD00010000724 Pilot experiment on functional genomics in osteoarthritis (methyl) 0
EGAD00010000722 Pilot experiment on functional genomics in osteoarthritis (coreex) 1
EGAD00010000718 BLUEPRINT Gene expression of different B-cell subpopulations 42
EGAD00010000716 BLUEPRINT DNA Methylation of different B-cell subpopulations 35
EGAD00010000714 aplastic anemia samples tumor using 250K Affymetrix 250K Nsp-GTYPE 440
EGAD00010000712 ATRT genotyping 40
EGAD00010000710 ATRT genotyping blood 11
EGAD00010000708 Human samples typed on Illumina Omni 5M 0
EGAD00010000704 610k genotyping imputed on Hapmap 3 and 1000G Phase 1 CEU 714
EGAD00010000702 SNP-chip genotyping data for one proband in the DDD study (Ref : Carvalho AJHG 2015) 0
EGAD00010000698 PCGP INF ALL SNP6 0
EGAD00010000696 PCGP ETP ALL SNP6 0
EGAD00010000694 HCC array for cnv 55
EGAD00010000692 Genome-wide DNA methylation epigenotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanMethylation450 372
EGAD00010000690 Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanOmniExpress 160
EGAD00010000688 glioma normal samples using 250K 0
EGAD00010000686 glioma samples tumor using cytoscan 0
EGAD00010000684 glioma normal samples using cytoscan 0
EGAD00010000682 glioma samples tumor using 250K 0
EGAD00010000680 Tumor sample CGH arrays Agilent CGH array 4
EGAD00010000678 Tumor sample SNP arrays Illumina SNP array 11
EGAD00010000676 ELSA genome-wide genotypes, including estimated related individuals. There are 3 files: .fam, .bim, .bed 7,452
EGAD00010000674 ELSA genome-wide genotypes, excluding estimated related individuals. There are 3 files: .fam, .bim, .bed 7,412
EGAD00010000672 Purified plasma cells from bone marrow of Multiple myeloma patient unknown 1
EGAD00010000670 Purified plasma cells from bone marrow of Pooled healthy donors unknown 1
EGAD00010000668 Purified plasma cells from bone marrow of Monoclonal gammopathy of unknown significance patient unknown 1
EGAD00010000666 Purified plasma cells from tonsil of Healthy donor unknown 1
EGAD00010000664 Finnish population cohort genotyping_B 340
EGAD00010000662 Finnish population cohort genotyping 7,803
EGAD00010000658 DLBCL 148 SNP 6.0 Cohort 0
EGAD00010000656 Case samples using SNP 6.0 Array 0
EGAD00010000654 Control samples using SNP 6.0 Arrays 0
EGAD00010000652 Genotyped samples using Illumina HumanOmni2.5 402
EGAD00010000650 Genotypes from Omni2.5 chip 1,213
EGAD00010000648 nccRCC tumor/normal genotypes 0
EGAD00010000646 DNA methylation analysis of 35 prostate tumor and 6 normal prostate samples 41
EGAD00010000644 Affymetrix SNP6.0 cancer cell line exome sequencing data 1,022
EGAD00010000642 CLL Expression Array 144
EGAD00010000640 WTCCC2 Visceral Leishmaniasis samples from Sudan using Illumina 670k 21
EGAD00010000638 WTCCC2 Visceral Leishmaniasis samples from India using Illumina 670k 97
EGAD00010000636 WTCCC2 Visceral Leishmaniasis samples from Brazil using Illumina 670k 119
EGAD00010000634 WTCCC2 People of the British Isles (POBI) samples using Affymetrix 6.0 array 2,930
EGAD00010000632 WTCCC2 People of the British Isles (POBI) samples using Illumina 1.2M array 2,912
EGAD00010000630 The TEENAGE study target population comprised adolescent students aged 13 to 15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica. 436
EGAD00010000628 The TEENAGE study target population comprised adolescent students aged 13–15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica. 748
EGAD00010000626 A new beta-globin mutation responsible for beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a region of northern France whose chief town is Lille. Five unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs). 37
EGAD00010000624 A new beta-globin mutation responsible for beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a region of northern France whose chief town is Lille. Five unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs). 0
EGAD00010000622 SNP array data for gastric cancer cell lines unknown 30
EGAD00010000620 Controls 3,683
EGAD00010000618 Ischemic stroke cases 3,682
EGAD00010000616 HumanOmni1-Quad genotyping array 230
EGAD00010000614 40 Druze Trios 120
EGAD00010000612 Celiac disease North Indian samples using Immunochip 0
EGAD00010000610 Samples from the Greek island of Crete, MANOLIS cohort 221
EGAD00010000608 SNP6 data for seminoma samples 8
EGAD00010000606 SNP6 data for matched normal samples 8
EGAD00010000604 DNA methylation data using Illumina 450K 2,195
EGAD00010000602 WTCCC2 Reading and Mathematics ability (RM) samples from UK using the Affymetrix 6.0 array 3,665
EGAD00010000600 Prostate Adenocarcinomas samples using 450K Illumina450K 80
EGAD00010000598 PCGP Ph-likeALL SNP6 1,724
EGAD00010000596 PCGP Ph-likeALL GEA 837
EGAD00010000594 SCOOP severe early-onset obesity cases 1,720
EGAD00010000584 WTCCC2 Glaucoma samples using Illumina 670k array 2,765
EGAD00010000580 Gencode control samples using 550K 217
EGAD00010000578 Gencode case samples using 550K 249
EGAD00010000574 Pleuropulmonary blastoma samples using 250K 14
EGAD00010000572 Imputation-based meta-analysis of severe malaria in Gambia. 2,870
EGAD00010000570 Imputation-based meta-analysis of severe malaria in Kenya. 3,343
EGAD00010000568 HipSci - Healthy Normals - Methylation Array - May 2014 0
EGAD00010000566 HipSci - Healthy Normals - Genotyping Array - May 2014 120
EGAD00010000564 HipSci - Healthy Normals - Expression Array - May 2014 120
EGAD00010000562 Medulloblastoma DNA methylation Illumina_HumanMethylation450 115
EGAD00010000560 SNP array of 7 HCCs and matched background liver in children with bile salt export pump deficiency Illumina HumanOmniExpress-12 v1. 14
EGAD00010000558 SNP 6.0 arrays of small cell lung cancer Affymetrix SNP 6.0 54
EGAD00010000556 SNP 6.0 arrays of small cell lung cancer 0
EGAD00010000554 SNP 6.0 arrays of small cell lung cancer 1,032
EGAD00010000552 Neuroblastoma samples 130
EGAD00010000546 SNP 6.0 arrays of carcinoid samples Affymetrix_SNP_6.0- 74
EGAD00010000544 Cushing's syndrome tumor samples using 250K Affymetrix 250K Nsp-GTYPE 16
EGAD00010000542 Cushing's syndrome normal samples using 250K Affymetrix 250K Nsp-GTYPE 16
EGAD00010000538 28 unlinked autosomal microsatellite loci for 20 African and 4 Philippine populations Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics) 1,702
EGAD00010000536 21 unlinked autosomal microsatellite loci for 30 Central Asian populations Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics) 1,702
EGAD00010000534 Illumina HumanMethylation450 BeadChip 0
EGAD00010000532 Illumina Human Omni1-Quad SNP genotyping array 0
EGAD00010000528 Illumina HumanHT-12 v4 array 0
EGAD00010000526 SNP 6.0 arrays of small cell lung cancer Affymetrix_SNP_6.0- 63
EGAD00010000522 Samples from the Greek island of Crete, MANOLIS cohort HumanOmniExpress-12 v1.1 BeadChip-GenCall 1,364
EGAD00010000520 Healthy volunteer collection of European Ancestry Illumina OmniExpress v1.0-Illumina GenomeStudio 144
EGAD00010000518 Samples from the Greek island of Crete, MANOLIS cohort HumanExome_12v1.1_A -GenCall, zCall 1,280
EGAD00010000516 Samples from the Pomak Villages in Greece, Pomak isolate HumanExome_12v1.1_A -GenCall, zCall 1,046
EGAD00010000514 Case samples using SNP 6.0 Array GenomeWideSNP_6-BirdseedV2 12
EGAD00010000512 Case samples using HumanOmni1-Quad GenomeWideSNP_6-BirdseedV2 12
EGAD00010000510 Matched control samples using HumanOmni1-Quad GenomeWideSNP_6-BirdseedV2 12
EGAD00010000508 Matched control samples using SNP 6.0 Array GenomeWideSNP_6-BirdseedV2 12
EGAD00010000506 WTCCC2 BO (Barretts oesophagus) samples Illumina_670k-Illuminus 1,991
EGAD00010000504 Control samples using SNP Array 6.0 Affymetrix_U133plus2- 35
EGAD00010000502 Case samples using SNP Array 6.0 Affymetrix_U133plus2- 35
EGAD00010000500 Case samples using U133 Plus 2.0 Array Affymetrix_U133plus2- 35
EGAD00010000498 Affymetrix SNP6.0 genotype data for prostate cancer patients Affymetrix_SNP6- 18
EGAD00010000496 Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists Illumina HumanOmni1-Quad-Illumina GenomeStudio 260
EGAD00010000494 Controls_Human660W-Quad_v1_A Illumina_Human660W-Quad_v1_A-Not supplied 4
EGAD00010000492 Cases_Human660W-Quad_v1_A Illumina_Human660W-Quad_v1_A-Not supplied 4
EGAD00010000490 Affymetrix Genome-Wide Human SNP Array 6.0 data Affymetrix 6.0- 19
EGAD00010000488 Chondroblastoma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6- 7
EGAD00010000486 ccRCC case samples using expression array Agilent Human Whole Genome 4x44k v2 - Feature Extraction 101
EGAD00010000484 ccRCC control samples using 250K Nsp Affymetrix_250K(Nsp) - gtype 234
EGAD00010000482 ccRCC case samples using methylation array Illumina Infinium HumanMethylation 450K - GenomeStudio 1
EGAD00010000480 ccRCC case samples using 250K Nsp Affymetrix_250K(Nsp) - gtype 240
EGAD00010000478 blood-based gene expression from breast cancer cases and age-matched controls in case-control series 3 (CC3) Illumina 118
EGAD00010000476 blood-based gene expression from breast cancer cases and age-matched controls in case-control series 1 (CC1) Illumina 110
EGAD00010000474 blood-based gene expression from breast cancer cases and age-matched controls in case-control series 2 (CC2) Illumina 98
EGAD00010000472 CLL Expression Array Affymetrix U219 219
EGAD00010000470 CLL Expression Array GPL570 20
EGAD00010000468 Uveal melanoma matched Tumour and blood samples Illumina HumanOmni2.5 24
EGAD00010000466 Down syndrome CNV genotyping data NimbleGen 135K aCGH - NimbleScan 108
EGAD00010000464 Down syndrome SNP genotyping data Illumina 550K - Illumina Genome Studio 338
EGAD00010000462 SJLGG Case samples using Gene Expression Array Affymetrix_U133v2 75
EGAD00010000460 GENCORD2 DNA methylation 294
EGAD00010000458 Controls using 450K DNA methylation 151
EGAD00010000456 Leukemia samples using 450K DNA methylation 800
EGAD00010000452 Chondrosarcoma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6 36
EGAD00010000450 Genome Wide Genotype Data Illumina Human Custom 1,2M and Human 610 Quad Custom arrays 758
EGAD00010000448 Macrophage Gene Expression Illumina Human-Ref-8 v3 beadchip 758
EGAD00010000446 Monocyte Gene Expression Illumina Human-Ref-8 v3 beadchip 758
EGAD00010000444 Agilent ncRNA 60k txt files Agilent ncRNA 60k 1,480
EGAD00010000442 Affymetrix SNP 6.0 CEL files Affymetrix_SNP6_raw 1,302
EGAD00010000440 Segmented copy number data Affymetrix_SNP6_raw 1,302
EGAD00010000438 Normalized miRNA expression data Agilent ncRNA 60k 1,480
EGAD00010000436 Illumina HT 12 IDAT files Illumina HT 12 1,302
EGAD00010000434 Normalised mRNA expression Illumina HT 12 1,302
EGAD00010000429 DNA methylation analysis of 4 primary lymphoma samples HumanMethylation450k Bead Chip - Genome Studio 4
EGAD00010000427 DNA methylation analysis of 4 peripheral blood samples HumanMethylation450k Bead Chip - Genome Studio 4
EGAD00010000425 Han Chinese samples using Immunochip HanChinese_Immunochip 192
EGAD00010000423 Han Chinese samples using Illumina OMNIExpress (controls) Illumina OMNIExpress 213
EGAD00010000421 Han Chinese samples using Affymetrix (controls) Affymetrix_6.0 187
EGAD00010000419 Han Chinese samples using Affymetrix (cases) Affymetrix_6.0 62
EGAD00010000417 Han Chinese samples using Illumina OMNIExpress (cases) Illumina OMNIExpress 62
EGAD00010000395 Myeloma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6 19
EGAD00010000391 Cambridge control samples using a 660K genotyping chip from Illumina Illumina Human 660K Quad BeadChips - Illuminus 232
EGAD00010000389 Cambridge control samples using a 24k expression array from Illumina Illumina Human-Ref 8 v3.0 expression array 395
EGAD00010000387 Cambridge control samples using a 1.2M genotyping chip from Illumina Illumina Human 1.2M Duo custom BeadChips v1 - Genome Studio 188
EGAD00010000385 MRCA sample using 300K Illumina 300K - GenomeStudio 394
EGAD00010000383 MRCA sample using 100K Illumina 100K - GenomeStudio 394
EGAD00010000381 MRCE sample using 300K Illumina 300K - GenomeStudio 543
EGAD00010000379 DNA methylation analysis of 2 peripheral blood samples HumanMethylation450k Bead Chip - Genome Studio 2
EGAD00010000377 DNA methylation analysis of 6 primary lymphoma samples HumanMethylation450k Bead Chip - Genome Studio 6
EGAD00010000371 Case and control samples (Genotypes) Infinium_370k - GenomeStudio 170
EGAD00010000300 Summary statistics from Haemgen RBC GWAS Illumina, Affymetrix, Perlegen 1
EGAD00010000298 All cases and controls (Hap300) 13,761
EGAD00010000296 1958BC control samples only (Hap550) 2,224
EGAD00010000294 1958BC control samples only (Hap300) 2,436
EGAD00010000292 All cases and Finnish, Dutch, Italian control samples (Hap300) 10,339
EGAD00010000290 NBS control samples only (Hap550) 2,276
EGAD00010000288 All cases and Finnish, Dutch, Italian control samples (Hap550) 6,313
EGAD00010000286 All cases and controls (Hap550) Illumina (various) 11,950
EGAD00010000284 NBS control samples only (Hap300) Illumina (Various) 2,500
EGAD00010000282 Pharmacogenomic response to Statins samples (Genotypes/Phenotypes) Affymetrix 6.0 - CHIAMO 4,134
EGAD00010000280 CLL Expression array Affymetrix snp 6.0 4
EGAD00010000278 SCLC matched normal genotypes Illumina_2.5M 51
EGAD00010000276 SCLC tumor genotypes Illumina_2.5M 56
EGAD00010000274 Colon matched tumour samples Illumina_2.5M 74
EGAD00010000272 Colon tumour samples Illumina_2.5M 75
EGAD00010000270 Metabric breast cancer samples (Images) Aperio image - H&E stained tissue_section 564
EGAD00010000268 Metabric breast cancer samples (Expression raw data) Illumina HT 12 543
EGAD00010000266 Metabric breast cancer samples (Genotype raw data) Affymetrix SNP 6.0 543
EGAD00010000264 WTCCC2 project samples from Ischaemic Stroke Cohort Illumina_670k - Illuminus 4,205
EGAD00010000262 WTCCC2 project Schizophrenia (SP) samples Affymetrix 6.0 - CHIAMO 3,019
EGAD00010000260 PNET genotyping Illumina OmniQuad 2.5 - CNVpartition 77
EGAD00010000254 CLL Methylation Arrays Illumina HumanMethylation450 165
EGAD00010000252 CLL Expression Arrays Affymetrix U219 137
EGAD00010000250 NBS control samples Illumina ImmunoBeadChip - Illuminus, GenoSNP 3,030
EGAD00010000248 1958BC control samples Illumina ImmunoBeadChip - Illuminus, GenoSNP 6,812
EGAD00010000246 Coeliac disease cases and control samples. (1958BC samples excluded) Illumina ImmunoBeadChip - Illuminus, GenoSNP 10,758
EGAD00010000238 CLL Expression array Affymetrix GeneChip Human Genome U133 plus 2.0 64
EGAD00010000236 WTCCC2 samples from Coronary Artery Disease Cohort - Illuminus, GenoSNP 3,125
EGAD00010000234 WTCCC2 samples from 1958 British Birth Cohort Illumina HumanExome-12v1_A-GenCall, zCall 12,241
EGAD00010000232 WTCCC2 samples from Type 2 Diabetes Cohort - Illuminus 2,975
EGAD00010000230 WTCCC2 samples from Hypertension Cohort - Illuminus 2,943
EGAD00010000220 Ovarian & matched normal (Genotypes) Complete Genomics - CG Build 1.4.2.8 2
EGAD00010000217 Segmented (HMM) copy number aberrations (CNA); discovery set Affymetrix SNP 6.0 997
EGAD00010000216 Segmented (CBS) copy number variants (CNV); validation set Affymetrix SNP 6.0 995
EGAD00010000215 Segmented (CBS) copy number aberrations (CNA); validation set Affymetrix SNP 6.0 995
EGAD00010000214 Segmented (CBS) copy number variants (CNV); discovery set Affymetrix SNP 6.0 997
EGAD00010000213 Segmented (CBS) copy number aberrations (CNA); discovery set Affymetrix SNP 6.0 997
EGAD00010000212 Normalized expression data; normals Illumina HT 12 144
EGAD00010000211 Normalized expression data; validation set Illumina HT 12 995
EGAD00010000210 Normalized expression data; discovery set Illumina HT 12 997
EGAD00010000202 Case samples (Illumina_660K & Illumina_670K) Illumina_660K/Illumina_670K 1,478
EGAD00010000164 Affymetrix 6.0 CEL files Affymetrix SNP 6.0 1,992
EGAD00010000162 Illumina HT 12 IDATS Illumina HT 12 2,136
EGAD00010000160 Illumina HT 12 IDATS Illumina HT 12 1,001
EGAD00010000158 Affymetrix 6.0 cel files Affymetrix SNP 6.0 1,001
EGAD00010000150 WTCCC2 project samples from Ankylosing spondylitis Cohort Illumina_670k - Illuminus 2,005
EGAD00010000148 tumour samples using Affymetrix Genome-Wide SNP6.0 arrays Affymetrix_GenomeWide_SNP6.34 104
EGAD00010000144 Healthy volunteer collection of European Ancestry Illumina OmniExpress v1.0 - Illumina GenomeStudio 288
EGAD00010000130 Cerebellar ataxia, mental retardation, and disequilibrium syndrome (CAMRQ) samples Illumina 300 Duo V2 - Bead Studio, Illumina 2
EGAD00010000124 Psoriasis cases as part of WTCCC2 phase 2 Illumina_670k - Illuminus 2,622
EGAD00010000096 DBA case samples using 250K Nsp Affymetrix_250K(Nsp) - gtype 27
EGAD00010000052 Monozygotic twins that are discordant for schizophrenia (Genotyping) CompleteGenomics build 1.4.2.8 - CG Build 1.4.2.8 36
EGAD00010000051 Cell line derived from microdissected primary pancreatic ductal adenocarcinoma tissues Affymetrix SNP 6.0 15
EGAD00010000050 Matched tumor-negative pancreas tissues Affymetrix SNP 6.0 15
EGAD00001004160 We compared bacterial communities in breast milk from teen (≤19 yr, n = 26) vs. adult (>19 yr, n = 56) mothers, normal weight (BMI 18.5-24.9, n = 63) vs. overweight (BMI ≥ 25, n = 19) mothers, primiparous (parity = 1, n = 41) vs. multiparous (parity > 1, n = 44), early (5-46d postpartum, n = 39) vs. established lactation (4-6 mo postpartum, n = 45), breastfeeding (EBF: PBF, n = 72) vs. mixed feeding (n = 11) and mothers with (Na/K ratio < 0.6, n = 75) and without SCM (Na/K ratio ≥ 0.6, n = 10). 86
EGAD00001004159 The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. 25
EGAD00001004158 The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. 866
EGAD00001004153 Gastric neuroendocrine tumors (gNETs) occur with an estimated frequency of 2 per 100,000 in the general population. Type I gastric neuroendocrine tumors (NETs) represent 75% of gNETs and arise from gastric enterochromaffin-like (ECL) cells. They have a late age of onset and a usually benign course. Classically, hypergastrinemia in patients who have autoimmune atrophic gastritis causes hyperplasia of gastric ECL cells that progresses into type I gastric NETs and parietal cell (PC) destruction. The genetic bases in families with this disease are unknown. We performed an exome sequencing study of an atypical aggressive familial gNETs case (with early age of onset, nodal infiltrations and gastric adenocarcinomas) that followed a recessive model. We identified a deleterious homozygous mutation in the ATP4A gene, which encodes the proton pump responsible for acid secretion by gastric parietal cells. This mutation leads first to achlorhydria, with hypergastrinemia and gNETs developing as a consequence (Calvete et al. 2014). Recently, two more families with gNETs, classical clinical traits and a recessive model have been studied by WES, but we did not find any mutation in the ATP4A gene. However, putative mutations affecting genes that contribute to the development and the integrity of PC have been found, suggesting that genetic alterations associated with this disorder target a single cell type (parietal cells). To confirm this hypothesis, it is necessary to search for new genes implicated in gNETs, and more familial cases need to be studied. We have identified four new familial gNETs cases. Here, we propose their study by WES. The first family consists of three siblings with gNETs. The other families include two siblings with gNETs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-06-06. 7
EGAD00001004152 Targeted pulldown of approx. 60 FFPE normal samples to use as normal controls. This dataset contains all the data available for this study on 2018-06-06. 80
EGAD00001004151 A case-control series of melanoma cases from Leeds, UK has been sequenced on the Fluidigm platform to identify genetic variants associated with sporadic melanoma development. Samples in which potentially contributing variants have been detected are being sequenced on an orthogonal platform for variant confirmation. This dataset contains all the data available for this study on 2018-06-06. 201
EGAD00001004150 This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-06-06. 1,141
EGAD00001004149 Bulk exome-seq data of the 5 HCC patients. Single-cell RNA-seq data of these patients is under the accession number EGAD00001003337 10
EGAD00001004148 Bulk RNA-seq data of the 5 HCC patients. Single-cell RNA-seq data of these patients is under the accession number EGAD00001003337 5
EGAD00001004147 The dataset contains three BAM files that include SPATC1L variants identified in Italian patients affected by hearing loss (both hereditary and age-related hearing loss). Data have been produced by whole exome sequencing and targeted re-sequencing, using Ion Proton and Ion Torrent PGM platforms respectively. 3
EGAD00001004144 This dataset contains FASTQ files obtained through whole exome sequencing of 25 pairs of glioma and matched blood samples. These files were used to analyze the somatic mutational signatures taking part in gliomagenesis. 57
EGAD00001004139 This dataset consists of 44 compressed paired fastq files, 15 of which are generated from whole exome sequencing, and 29 of which are generated from DNA sequencing using a targeted gene panel capturing the exonic regions of 73 prostate cancer driver genes. Targeted DNA sequencing was performed on an Illumina MiSeq (v3 600 cycle kit), and exome sequencing was done using an Illumina HiSeq 2500 (v4 250 cycle kit) machine. The fastq files are named in accordance with the sample aliases provided, which reflect the pathology of interest to this study (small cell prostatic carcinoma, SCPC), whether the sample was sequenced using an exome or targeted gene panel, whether the FFPE sample was sourced from tumor or benign tissue (labeled T or B, respectively), and whether multiple samples belong to a single patient. Illumina MiSeq;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 44
EGAD00001004137 Illumina HiSeq 2000;ILLUMINA 832
EGAD00001004136 The overall goal of the Identification of recurrent mutations in Cushing’s disease project is to study the impact of whole-exome sequencing (WES) on the clinical care of cancer patients and oncology provider practices. The aims of the project are to implement and establish the feasibility of WES in patients with USP8 wild-type corticotroph adenomas, and to develop a framework for understanding the molecular mechanism of the pathogenesis of corticotroph adenoma. 44
EGAD00001004135 Synovial sarcoma (SS) is defined by a recurrent t(X;18) chromosomal translocation, which produces the hallmark SS18-SSX oncogenic fusion. Incorporation of SS18-SSX into BAF complexes renders BAF complexes aberrant in two distinct manners: the addition of 78aa of SSX onto SS18, and concomitant loss of BAF47 assembly. However, the importance and functional contributions of each of these perturbations on BAF complex targeting and gene expression regulation remain unclear. Here we use an integrative set of genomic approaches in human cancer cell lines and primary tumor samples to define the mechanistic consequences of the SS18-SSX fusion oncoprotein. We find that SS18-SSX hijacks BAF complexes to broad polycomb domains to activate bivalent genes, driving a unique gene expression program distinct from other loss-of-function BAF complex malignancies. Importantly, restoration of BAF47 rescues enhancer activation but is dispensable for proliferative arrest in cell lines. These results demonstrate that gain-of-function SS18-SSX-mediated BAF complex targeting and gene activation is the driving event in SS, and present a mechanism by which distinct functions of BAF complexes can be co-opted to drive oncogenesis. Illumina HiSeq 2000;ILLUMINA, NextSeq 500;ILLUMINA 85
EGAD00001004134 The dataset includes sequencing data generated using the TruSight Cancer Panel (TSCP) a targeted NGS assay for analysis of CPGs and orthogonally generated data supporting at least one pathogenic variant in a CPG for a total of 645 pathogenic CPG variants. The set of pathogenic CPG variants includes strong representation of some of the most challenging types of pathogenic variants, with 339 indels, including 16 complex indels and 24 insertions or deletions with length greater than 5bp, and 74 exon CNVs, including 23 single exon CNVs. There are 502 pathogenic variants in BRCA1 or BRCA2, making this an important first-line validation dataset for laboratories performing NGS testing of BRCA1 and BRCA2. Illumina HiSeq 2500;ILLUMINA 639
EGAD00001004133 Epigenetic profiling of colorectal cancer initiating cells (CC-ICs) to identify bivalently marked genes (H3K4me3 and H3K27me3 ChIP-seq), and investigation of changes in transcriptome following EZH2 inhibition using RNA-seq. Illumina HiSeq 2500;ILLUMINA, NextSeq 500;ILLUMINA 17
EGAD00001004131 Pheno-seq is a new approach that integrates high-throughput imaging and transcriptomic profiling of clonal spheroids/organoids to dissect functional tumor cell heterogeneity in 3D cell culture systems. The method is based on the iCELL8 technology (TakaraBio) that uses barcoded nanowells and a micro-solenoid valve dispenser. The CRC_spheroid dataset contains demultiplexed RNA-sequencing profiles (FASTQ file format, NextSeq 500) of 95 clonal tumor spheroids derived from a patient with colorectal cancer. NextSeq 500;ILLUMINA 1
EGAD00001004129 Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/MuTect to generate somatic variant calls across the SureSelect All Exon V5+UTR target region. Somatic variant calls are in VCF format. In total there are 166 tumour samples, 94 of which have a matched normal. Somatic variants for tumours without a matched normal were called against a panel of normals. Details for the MuTect call can be found in the VCF header. 250
EGAD00001004128 Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with SomaticSniper. Somatic variant calls are in VCF format. In total there are 94 tumour samples, each with a matched normal. 178
EGAD00001004127 Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK Haplotype Caller to generate germline variant calls across the SureSelect All Exon V5+UTR target region. Variant calls are in VCF format. In total there are samples from 173 donors. 101 donors have calls generated from both normal and tumour samples, 94 of which have a matched normal. Details for the call can be found in the VCF headers. 263
EGAD00001004126 Sequence data in FASTQ format was aligned to the GRCh38 reference genome. Aligned sequence was preprocessed with GATK for Indel Realignment and Base Quality Score Recalibration. Duplicates were marked with Picard Mark Duplicates. Aligned sequence is in BAM format. Details of the alignment can be found in the BAM header. In total, data generated from 174 tumour samples and 102 matched blood normal controls was aligned. Tumour samples were classified as anaplastic thyroid, poorly-differentiated or well-differentiated cancers. 264
EGAD00001004125 Normal prostatectomy project analysis and leftovers 71
EGAD00001004124 CRISPR-Cas9 loss-of-function screens are instrumental to systematically identify genes important for cellular fitness in cancer cells. While structural rearrangements are a ubiquitous feature in cancer, their impact on CRISPR-Cas9 response has not yet been systematically assessed. Utilising data for 163 CRISPR-Cas9 screened cancer cell lines, we demonstrate that targeting tandem amplified regions is highly detrimental to cellular fitness, in stark contrast to amplifications arising from chromosomal duplications, which have little to no effect. In addition, high ploidy leads to decreased CRISPR-Cas9 loss of fitness effects in a gene-independent way. Using whole-genome sequencing and fluorescent in situ hybridisation we confirm that clustered Cas9 double-strand DNA cuts in a single chromosome, contrary to multiple chromosomes, are associated with a strong decrease in cell fitness. We propose this as a novel way to exploit collateral vulnerabilities introduced by structural rearrangements in cancer cells, by systematically identifying tissue non-expressed genes that are tandem amplified. 25% of the screened cell lines have at least one putative collateral essentiality, showing that this is a generalizable way to selectively kill tumour cells. Lastly, we present a flexible computational tool, Crispy, to perform association analysis of different types of genomic alterations in CRISPR-Cas9 screens. Our results demonstrate the importance of structural rearrangements in mediating the effect of CRISPR-Cas9-induced DNA damage on cell fitness, and how this could be harnessed to create selective cancer therapies, especially in tumours enriched for tandem duplications. Illumina HiSeq 2000;ILLUMINA 12
EGAD00001004121 A total of 14 samples analyzed with the Spatial Transcriptomics method. H&E stains can be sent on request. NextSeq 500;ILLUMINA 14
EGAD00001004117 Tumor DNA was extracted from 100 bone marrow aspirate samples where CD138+ selection had been performed to enrich plasma cells from patients with multiple myeloma. Patient matched control DNA from either peripheral blood leukocytes or CD34+ stem cell harvests was also isolated. Both tumor and control DNA underwent library preparation using the Hyperplus kit (KAPA Biosystems) and were hybridized to baits for a targeted SeqCap myeloma panel (Nimblegen) encompassing 129 genes, regions for SNPs for copy number determination, and the IGH, IGK, IGL loci, as well as approximately 5 Mb surrounding the MYC locus. Samples were sequenced on a HiSeq2500 using 100 bp paired end reads. Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements. Illumina HiSeq 2500;ILLUMINA 200
EGAD00001004116 RNA sequencing of paediatric high grade gliomas and diffuse intrinsic pontine gliomas. RNA was sequenced from fresh frozen surgical material or from primary cells cultured under stem cell conditions. RNA was subjected to Illumina whole transcriptome paired end sequencing. Data is provided as paired-end FASTQ files Illumina HiSeq 2000;ILLUMINA 16
EGAD00001004115 Whole genome sequencing reads consisting of paired end Fastq and aligned bam files from pediatric medulloblastoma samples. 22
EGAD00001004114 The failure to develop effective therapies for paediatric glioblastoma (pGBM) and diffuse intrinsic pontine glioma (DIPG) is in part due to their intrinsic heterogeneity. Analysis of 142 sequenced cases revealed multiple tumour subclones, spatially and temporally co-existing in a stable manner as observed by multiple sampling strategies. This dataset provides multi region sequencing of high grade gliomas and diffuse intrinsic pontine gliomas from 15 patients. DNA was extracted from FFPE sections in 2-13 regions of each tumour and sequenced with Agilent SureSelect whole exome sequencing. Germline DNA was also sequenced in 14 cases. Data was aligned to hg19 with bwa and is provided as 79 separate BAM files. 79
EGAD00001004113 DNA (n=1281) and RNA (n=767) were extracted from bone marrow aspirates where CD138+ selection had been performed to enrich plasma cells from patients with monoclonal gammopathy of undetermined significance (MGUS), smoldering multiple myeloma (SMM), or multiple myeloma (MM). DNA and/or RNA were sent to Foundation Medicine where targeted sequencing was performed using their Foundation 1 Heme panel. Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements. Illumina HiSeq 4000;ILLUMINA 1,281
EGAD00001004112 This data set consists of genomic information for 10 Chordoid Glioma samples: exome sequencing of 10 tumors, with matched normal DNA for four of them (BAM files); RNA-seq of 10 tumors (FASTQ files); and CNV array data for 9 tumors (IDAT files). 10
EGAD00001004108 The whole blood of six female volunteers and sperm from one male volunteer were used to extract genomic DNA using a DNeasy Blood & Tissue Kit (QIAGEN). 500 ng gDNA was fragmented into 300 bp fragments by Covaris. Then, the libraries were constructed using a KAPA Hyper Prep Kit (Kapa Biosystems). In total we have 7 samples, and the files we uploaded are paired-end FASTQ files. Illumina HiSeq 4000;ILLUMINA 7
EGAD00001004106 2170 faecal samples Illumina MiSeq 16S rRNA sequencing, V4 hypervariable region Illumina MiSeq;ILLUMINA 2,170
EGAD00001004105 Clonally expanded liver adult stem cell clones of healthy liver and cirrhotic liver (due to alcohol abuse, NASH and PSC), as well as biopsies of liver cancers were subjected to whole genome sequencing to determine the mutational impact of precancerous liver disease HiSeq X Ten;ILLUMINA 32
EGAD00001004096 We sequenced the coding exons of core genes involved in telomere maintenance using peripheral blood DNA of 192 CRC patients. The primary sequencing data were generated by using Ion Torrent Personal Genome Machine® (PGM™) platform. Ion Torrent PGM;ION_TORRENT 192
EGAD00001004089 SRNS unknown 4
EGAD00001004088 Multiple primary tumors (MPT) affect a substantial proportion of cancer survivors and may result from various causes including inherited predisposition. Currently, germline genetic testing of MPT cases for cancer predisposition gene (CPG) variants is mostly targeted by tumor type. We ascertained pre-assessed MPT cases from genetics centers (defined as ≥2 primaries by age 60 years or ≥3 by 70) and performed whole genome sequencing (WGS) on 460 individuals from 440 families. Despite previous negative genetic assessment/molecular investigations, pathogenic variants in moderate and high-risk CPGs were detected in 67/440 (15.2%) of probands. WGS detected variants that would not be (or were not) detected by targeted resequencing strategies including structural variants at low frequency (6/440 (1.4%) of probands). In most individuals with a germline variant assessed as pathogenic or likely pathogenic (P/LP), at least one of their tumor types was characteristic of variants in the relevant CPG. However, in 29 probands (42.2% of those with a P/LP variant) the tumor phenotype appeared discordant. The frequency of individuals with truncating or splice site CPG variants and at least one discordant tumor type was significantly higher than a control population (χ2=43.642 P=<0.0001). 2/67 (3%) of probands with P/LP variants had evidence of multiple inherited neoplasia allele syndrome (MINAS) with deleterious variants in two CPGs. Summing together variant detection rates from a similarly ascertained previous MPT case series, the present results suggest that first-line comprehensive CPG analysis in a clinical genetics referral-based MPT cohort would detect a deleterious variant in about a third of cases. Illumina HiSeq 2000;ILLUMINA 453
EGAD00001004087 We took a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and used flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We grew these up into colonies, then whole genome sequenced each colony. Somatic mutations act as a unique barcode for each clone. We have designed a panel for targeted resequencing of the mutations that we find. We are now looking for these mutations in the peripheral blood, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19. Illumina HiSeq 2500;ILLUMINA 48
EGAD00001004086 We will take a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and use flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We will grow these up into colonies, then whole genome sequence each colony. Somatic mutations will act as a unique barcode for each clone. We will then design a panel for targeted resequencing of the mutations that we find. It will then be possible to look for these mutations in the peripheral blood over several years, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19. HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 207
EGAD00001004085 In this study, we examined microbial infection in brain tissue from 9 control samples from healthy patients and 10 samples from patients diagnosed with multiple sclerosis by next-generation sequencing (NGS) using the MiSeq sequencing platform (Illumina). Illumina MiSeq;ILLUMINA 19
EGAD00001004082 In this study, we applied an Illumina HiSeq platform-based high-coverage WES technique, which, in addition to the exons, allows the determination of 5′- and 3′-UTRs, promoters to a certain length, along with off-target sequences, such as introns, intergenic regions and infecting viruses. Brains from suicide victims (n = 23; 15 males and eight females) who had suffered from major depressive disorder and from control participants (n = 21; 14 males and seven females) who had died from other causes were used for whole-exome sequencing. Alignment files in BAM format were uploaded. Illumina HiSeq 2000;ILLUMINA 44
EGAD00001004081 Smart-seq2 protocol was used to perform single cell RNA-sequencing on 465 immune cells. The immune cells analysed include 215 HLA-DQ2: gluten-(DQ2.5-glia-α1, -α2, -ω1, and -ω2) tetramer-sorted T cells, 247 transglutaminase 2 (TG2)-positive plasma cells from intestinal biopsy or peripheral blood from celiac disease patients, and 3 unassigned cells in 3 batches. Illumina HiSeq 4000;ILLUMINA 1
EGAD00001004080 3,301
EGAD00001004079 RNA-seq data from sorted populations from 10 CML samples and 4 normal bone marrow samples. NextSeq 500;ILLUMINA 28
EGAD00001004078 For this tissue dataset, we applied low-pass whole genome sequencing to 96 advanced adenomas. Advanced adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018). Illumina HiSeq 2000;ILLUMINA 96
EGAD00001004077 43 low-coverage genomes derived from fresh-frozen glioblastoma tumor samples. These genomes have been produced for validation purposes and match the corresponding RRBS and RNA-seq profiles in that DNA and RNA were extracted from the same tumor samples. Illumina HiSeq 3000;ILLUMINA 43
EGAD00001004076 37 transcriptomes derived from fresh-frozen glioblastoma tumor samples. These transcriptomes have been produced for validation purposes and match the corresponding RRBS and WGS profiles in that DNA and RNA were extracted from the same tumor samples. Illumina HiSeq 3000;ILLUMINA 37
EGAD00001004075 For this tissue dataset, we applied low-pass whole genome sequencing to 98 non-advanced and advanced adenomas. As a small number of lesions were sequenced multiple times, this dataset consists of 103 fastq files. These adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018). Illumina HiSeq 2000;ILLUMINA 103
EGAD00001004074 Genome-wide profiling of DNA methylation levels by RRBS in 150 glioblastoma tumor samples. Patients were selected to represent the general population of glioblastoma patients based on the Austrian Brain Tumor Registry. These DNA methylation profiles were created for the validation of the glioblastoma progression study (GBMatch) and consist of 106 profiles from FFPE samples and 44 profiles from fresh-frozen samples. For the 44 fresh-frozen samples, WGS data (43 genomes) and RNA-seq data (37 transcriptomes) have also been produced for validation purposes. Illumina HiSeq 3000;ILLUMINA 150
EGAD00001004073 Copy Number Aberration calls generated using TITAN and PhyloWGS, from the CPCGene 200PT Subclonality study 292
EGAD00001004072 SNV calls generated using SomaticSniper and PhyloWGS, from the CPCGene 200PT Subclonality study 293
EGAD00001004071 We integrated genomic (whole-genome sequencing, WGS) and transcriptome (polyA-enriched RNA-Seq) sequencing from 90 NSCLC cases and comprehensively identified the distinct genomic features of Chinese NSCLC patients. 90
EGAD00001004070 RNA sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial. RNA from fresh frozen surgical tissue in 20 cases was subjected to Illumina whole transcriptome paired end sequencing. Data is provided as paired-end FASTQ files Illumina HiSeq 2000;ILLUMINA 20
EGAD00001004068 Whole-genome, whole-exome and transcriptome sequencing of pancreatic ductal adenocarcinomas from young adults reveals recurrent NRG1-fusions in KRAS wild-type tumors. HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, Illumina HiSeq 4000;ILLUMINA 36
EGAD00001004067 Custom panel sequencing data from 1714 clear cell renal cell carcinoma samples 1,714
EGAD00001004066 We generated 42 human whole-exome sequencing data sets from fresh-frozen (FF) and FFPE samples. These samples include normal and tumor tissues from two different organs (liver and colon), that we extracted with three different FFPE extraction kits (QIAamp DNA FFPE Tissue kit and GeneRead DNA FFPE kit from Qiagen, Maxwell™ RSC DNA FFPE Kit from Promega). Variant calling analysis shows a very high rate of concordance between matched FF / FFPE pairs and equivalent performance for the three kits we analyzed. We find a significant variation in the difference of total number of variants called between FF and FFPE samples for the three different FFPE DNA extraction kits. Coverage analysis shows that FFPE samples have less good indicators than FF samples, yet the coverage quality remains above accepted thresholds. We detect limited but significant variations in coverage indicator values between the three FFPE extraction kits. Globally, the GeneRead and QIAamp kits have better variant calling and coverage indicators than the Maxwell kit on the samples used in this study, although this kit performs better on some indicators and has advantages in terms of practical usage. Taken together, our results confirm the potential of FFPE samples analysis for clinical genomic studies, but also indicate that the choice of a FFPE DNA extraction kit should be done with careful testing and analysis beforehand in order to maximize the accuracy of the results. Illumina HiSeq 2000;ILLUMINA 42
EGAD00001004062 This dataset includes whole genome sequencing of 198 epileptic individuals. Library preparation and whole-genome sequencing: gDNA was cleaned up using the ZR-96 DNA Clean & Concentrator™-5 Kit (Zymo) prior to being quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and its integrity assessed on agarose gels. Libraries were generated using the TruSeq DNA PCR-Free Library Preparation Kit (Illumina) according to the manufacturer’s recommendations. Libraries were quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and the Kapa Illumina GA with Revised Primers-SYBR Fast Universal kit (Kapa Biosystems). Average fragment size was determined using a LabChip GX (PerkinElmer) instrument. The libraries were denatured in 0.05N NaOH and diluted to 8pM using HT1 buffer. The clustering was done on an Illumina cBot and the flowcell was run on a HiSeq 2500 for 2x125 cycles (paired-end mode) using v4 chemistry and following the manufacturer's instructions. A phiX library was used as a control and mixed with libraries at 0.01 level. Bioinformatics: The Illumina control software was HCS 2.2.58, the real-time analysis program was RTA v. 1.18.64. Program bcl2fastq v1.8.4 was used to demultiplex samples and generate fastq reads. The filtered reads were aligned to reference Homo_sapiens assembly b37. Each readset was aligned to create a Binary Alignment Map file (.bam). Illumina HiSeq 2500;ILLUMINA 198
EGAD00001004061 200PT : WG Aligned Sequence (bam)/ Aligned WG sequence data in this dataset are from CPCGene Tumour/Normal Pairs used in the 200PT Study 404
EGAD00001004052 Ultra low coverage sequencing results from the project 'Rapid multiplex small DNA sequencing on the MinION nanopore sequencing platform'. Sequencing data of samples NA12877 and NA12878 generated from 3 nanopore sequencing runs are included in this dataset. MinION;OXFORD_NANOPORE 6
EGAD00001004051 fastq of 345 Japanese gastric cancer Illumina HiSeq 2000;ILLUMINA 345
EGAD00001004046 Analysis of the reference epigenomes and regulatory landscape of CLL as a whole and its major clinico-biological subtypes (with mutated and unmutated IGHV) in the light of the normal B-cell differentiation. We have extensively characterized the reference epigenomes of seven primary chronic lymphocytic leukemia samples (CLLs) with mutated (n=5) and unmutated IGHV (n=2) as well as several mature B-cell subpopulations (naive B cells from blood and tonsil, germinal center B cells, memory B cells and plasma cells from tonsil) using genome-wide maps of six histone marks (H3K4me3, H3K4me1, H3K27ac, H3K36me3, H3K9me3 and H3K27me3), DNA accessibility (ATAC-seq), DNA methylation (whole-genome bisulfite sequencing) and gene expression (RNA-seq). Furthermore, we have mapped the regulatory chromatin landscape of 100 additional CLL cases using ChIP-seq of H3K27ac and ATAC-seq and linked these data to additional layers of information (whole-genome and/or whole-exome sequencing (WGS/WES), RNA-seq and DNA methylation microarrays) studied in the context of the International Cancer Genome Consortium (ICGC). Illumina HiSeq 2000;ILLUMINA, NextSeq 500;ILLUMINA 386
EGAD00001004045 Whole genome sequencing was applied to 32 SRCC patients and the raw data were subjected to standard procedures. Files with genomic variant calls were obtained at the last step. HiSeq X Ten;ILLUMINA 64
EGAD00001004044 Files from whole exome sequencing of eight tumors from eight pancreatic cancer patients along with matched PanIN precursor lesion(s) and a matched normal tissue. Illumina HiSeq 2000;ILLUMINA 28
EGAD00001004043 The dataset consists of 64 fastq files from 23 patients with acute promyelocytic leukemia. Exome sequencing was conducted at several stages (Diagnosis, Remission, Relapse) for each patient. For 5 patients, only Diagnosis and Relapse samples are available. Illumina HiSeq 1000;ILLUMINA 64
EGAD00001004042 As a contribution to the International Cancer Genome Consortium, exome sequencing of 102 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357 Illumina HiSeq 2500;ILLUMINA 102
EGAD00001004041 As a contribution to the International Cancer Genome Consortium, exome sequencing of 142 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357 Illumina HiSeq 2500;ILLUMINA 142
EGAD00001004040 Whole Exome Sequencing of trios (proband + parents) or probands only with Neonatal Diabetes Mellitus (NDM) or Congenital Hyperinsulinism of Infancy (CHI) of unknown genetic origin. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14. Illumina HiSeq 2500;ILLUMINA 57
EGAD00001004039 Albinism is a genetically heterogeneous rare genetic condition affecting 1:17000 in the Western world (but more frequent in Africa) whose main feature is a profound visual impairment, characterised by foveal hypoplasia, abnormal chiasmatic connections, nystagmus and photophobia. All these features result in severely altered visual acuity (<0,1), absent depth perception and poor night vision. People with albinism are primarily visually handicapped. In addition, for some types of albinism, the visual phenotype can be presented with partial or total hypopigmentation, hence resulting in a secondary phenotype which can lead to skin cancer if skin is not adequately protected. Recently a new syndrome has been described, FHONDA, with the same visual abnormalities of albinism but without pigment alteration. The traditional classification differentiates Oculocutaneous albinism (OCA), where hypopigmentation involves hair, skin and eyes, versus Ocular Albinism (OA), where hypopigmentation only affects the eyes. These are non-syndromic types of albinism. Some syndromic forms (Hermansky-Pudlak=HPS, Chediak-Higashi=CHS) affect cells beyond pigment cells, present in the lungs, immune system, platelets and intestines, resulting in more severe phenotypes that can be fatal. Mutations in at least 19 genes are associated with the corresponding types of albinism. Most hospitals will only diagnose the most frequent cases using traditional Sanger, MLPA approaches. Some will use CGH arrays. We aim to diagnose all cases of albinism through the Albinochip proposal, which combines a Sequenom first step of known mutations combined with subsequent NGS approaches. In some cases we fail to find a second mutation; these are good candidates for further full exome analyses. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14. Illumina HiSeq 2500;ILLUMINA 48
EGAD00001004038 Identification of genes involved in congenital disorders of glycosylation and 3-methylglutaconic aciduria. There are more than 100 genes known for congenital disorders of glycosylation and new disorders are discovered each year. We included patients with a so far unsolved glycosylation disease. The diagnostic group 3-methylglutaconic aciduria is a heterogeneous group of disorders mostly caused by abnormal phospholipid synthesis or in association with mitochondrial dysfunction. We included patients with a so far unsolved disease and 3-methylglutaconic aciduria. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14. Illumina HiSeq 2500;ILLUMINA 31
EGAD00001004037 The aim is to characterise the cancer gene landscape in CLL, particularly in cases with a mutated POT1 gene. Treatment-naive CLL cases will be interrogated by targeted exome sequencing using a cancer gene panel. This dataset contains all the data available for this study on 2018-03-14. Illumina HiSeq 2500;ILLUMINA 123
EGAD00001004036 Whole exome sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial. DNA from 86 cases was subjected to Illumina paired end whole exome sequencing using a customised SureSelect Human All Exon V6 capture set. Germline DNA from whole blood was sequenced for 83 cases. 26 cases were sequenced from both fresh frozen tissue and FFPE material, 10 were sequenced from only fresh frozen material and 50 from only FFPE. Data is provided as bwa aligned BAM files Illumina HiSeq 2000;ILLUMINA 195
EGAD00001004035 exome sequence of 15 female patients suffering from Ovarian Meiotic Defects Illumina HiSeq 2000;ILLUMINA 15
EGAD00001004034 RNA-seq data (bam files) from the hypothalamus of 4 individuals with Prader-Willi syndrome and 4 age-matched control individuals. Detailed information about the study design, case-control matching and RNA-seq data processing is provided in the accompanying publication [Bochukova et al (2018) Cell Reports]. Illumina HiSeq 2000;ILLUMINA 8
EGAD00001004033 Fastq files of whole-genome bisulfite sequence of tumor tissue of HBV-associated hepatocellular carcinoma Illumina HiSeq 2000;ILLUMINA, Illumina Genome Analyzer IIx;ILLUMINA 5
EGAD00001004032 Fastq files of whole-genome bisulfite sequence of non-cancerous tissue of HBV-associated hepatocellular carcinoma Illumina HiSeq 2000;ILLUMINA, Illumina Genome Analyzer IIx;ILLUMINA 3
EGAD00001004031 AngioPredict CNV and Exome data Illumina HiSeq 2500;ILLUMINA 527
EGAD00001004029 Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - dcc_earmarked_28 Clinical data for these samples will be available following ICGC DCC release 28. Illumina HiSeq 2000;ILLUMINA 86
EGAD00001004028 Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - dcc_earmarked_27 Clinical data for these samples will now be available following ICGC DCC release 28. Illumina HiSeq 2000;ILLUMINA 127
EGAD00001004027 WES genome data from lung cancer patients, aligned to the human reference (human_g1k_v37.fasta). There are 36 paired tumor/normal samples from Samsung Hospital. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 72
EGAD00001004020 Amplicon data of tumor samples generated for validation of WES findings and further sub clonal mapping Ion Torrent PGM;ION_TORRENT 78
EGAD00001004018 The aim of CAGEKID is to carry out comprehensive detection of DNA markers for conventional (clear cell) renal carcinoma. The project includes complete analysis of somatic and constitutional DNA variation, methylation patterns and expression in a large number of constitutional/tumor pairs. CAGEKID is a part of the International Cancer Genome Consortium, ICGC. 708
EGAD00001004014 Whole Exome Sequencing Data from paediatric solid tumors Illumina HiSeq 2500;ILLUMINA 54
EGAD00001004013 Organoids are self-organizing 3D structures grown from stem cells that recapitulate essential aspects of organ structure and function. Here we describe a method to establish long-term culture conditions of human airway epithelial organoids that contain all major cell populations and allow personalized human disease modelling. We collected macroscopically inconspicuous lung tissue from non-small-cell lung cancer (NSCLC) patients undergoing medically indicated surgery and isolated epithelial cells to engineer 3D organoids. We exploit the potential to derive sub-clones from AOs to demonstrate the feasibility of CRISPR gene editing. Finally, we show that AOs readily allow modelling of viral infections such as RSV and for the first time demonstrate the possibility to study neutrophil-epithelium interaction in an organoid model. Taken together, we anticipate that human AOs will find broad applications in the study of adult human airway epithelium in health and disease. HiSeq X Ten;ILLUMINA 4
EGAD00001004012 Additional 2015 AML-ETO patient genome data, aligned to the human reference (human_g1k_v37.fasta). There are 2 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 4
EGAD00001004011 2015 AML-ETO patient genome data, aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 20
EGAD00001004008 This dataset includes sequencing BAM files for NPC blood/tumor pairs: 21 pairs, 42 BAM files. Illumina HiSeq 2000;ILLUMINA 42
EGAD00001004007 Esophageal adenocarcinoma organoid cultures recapitulate human disease heterogeneity and provide a model for clonality studies and precision therapeutics. Illumina HiSeq 2000;ILLUMINA 69
EGAD00001004001 Targeted gene screen of FFPEs, cell lines and primary CRC tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07. Illumina HiSeq 2500;ILLUMINA 92
EGAD00001004000 Targeted gene screen of cell line tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07. Illumina HiSeq 2500;ILLUMINA 53
EGAD00001003995 Fifteen pleomorphic invasive lobular carcinoma samples and their matched normal controls were subjected to targeted exome sequencing using the Beijing Genomics Institute TumorCare gene panel. Genomic DNA samples were randomly fragmented and captured libraries of each exome were sequenced on an Illumina HiSeq 2000 system. CRAM files are provided for each tumor and normal pair. Illumina HiSeq 2000;ILLUMINA 30
EGAD00001003994 The present series corresponds to 20 whole genome sequencing (10 Tumoral/Non-tumoral pairs). Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before. Illumina HiSeq 2000;ILLUMINA 24
EGAD00001003993 The present series corresponds to 161 RNA-seq samples from tumors with matched WES or WGS. Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before. Illumina HiSeq 2000;ILLUMINA 161
EGAD00001003992 Whole human islet paired-end RNA-seq of 64 human pancreatic donors. Illumina HiSeq 2500;ILLUMINA, Illumina Genome Analyzer IIx;ILLUMINA 64
EGAD00001003991 Complete clinical phenotypic description of all patients; the number listed represents all the samples linked to the 557 patients present in the dataset. Please consult the key file to visualise the sample-patient relationship 736
EGAD00001003989 Longitudinal biopsies from a melanoma patient who initially responded to MEK plus CDK4/6 inhibitor therapy were whole exome sequenced to identify potential resistance mutations. The biopsies included normal tissue, pre-treatment, on-treatment, and several post-resistance timepoints. Illumina HiSeq 2500;ILLUMINA 6
EGAD00001003988 Paired-end whole exome sequencing of fine-needle aspirates from 51 multiple myeloma patients. Illumina HiSeq 2000;ILLUMINA 176
EGAD00001003987 This dataset pertains to whole exome sequencing of paired DNA samples from gingivo-buccal oral cancer patients. DNA was isolated from the tumor and blood tissues of 47 patients (94 samples). We performed Nextera exome capture and sequenced the exome libraries on the Illumina HiSeq platform. We have uploaded BWA-ALN aligned BAM files. Illumina HiSeq 2500;ILLUMINA 94
EGAD00001003986 A total of 192 positions per patient were deeply sequenced in each corresponding tumor sample (including 4 experimental controls and SNVs predicted to originate at each node of the sample phylogeny, see Zhang et al. for details). Genomic DNA templates were used as starting material to generate PCR products. PCR was set up using Phusion DNA polymerase according to the manufacturer’s specifications. The standard PCR conditions used were an initial denaturation at 98C for 30 seconds, followed by 35 cycles of 98C for 10 seconds, 60C for 15 seconds and 72C for 8 seconds, and a final extension at 72C for 10 minutes. PCR products were cleaned up using PCRClean DX beads. Amplicons were pooled by template for sequencing sample preparation. Sample preparation involved a second round of amplification using Phusion DNA polymerase with 6 PCR cycles, with primers specified in Zhang et al. DNA quality was assessed using the Caliper LabChip GX HighSensitivity Assay and DNA quantity was measured using a Qubit dsDNA HS assay kit on a Qubit fluorometer. The indexed libraries were pooled together and sequenced on the Illumina NextSeq500 platform with paired-end 150bp reads using v2 chemistry reagents. NextSeq 500;ILLUMINA 180
EGAD00001003985 Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen, and RNA was extracted using the miRNeasy Mini kit. Nanodrop was used to assess quality (260/280) and quantity. Total RNA samples were also QC checked using the Caliper HT RNA HiSens assay. Samples ranging from 60-255ng RNA were re-arrayed into a 96-well plate. 5'-RACE PCR was carried out as described in "The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer" (Zhang et al.). Briefly, this involved first round and nested PCR with TRB (TCR beta chain) and IGH (immunoglobulin heavy chain) gene-specific primers. The indexed libraries were sequenced on the Illumina HiSeq platform with paired-end 250bp reads using v2 chemistry reagents. Illumina HiSeq 2500;ILLUMINA, NextSeq 500;ILLUMINA 442
EGAD00001003984 Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen and later used for WGS. For all tumor and matched normal (peripheral blood) samples, DNA was extracted with the Qiagen AllPrep DNA/RNA kit (tumor samples from patients 25,26,28-32) or the Qiagen Blood and Tissue Extraction Kit (tumor samples from patients 1-4,7,9-17, and all blood samples). For all tumor and normal samples, DNA extraction was followed by library construction and sequencing using Illumina HiSeq2500 whole genome shotgun v4 chemistry with paired-end 125bp reads. Illumina HiSeq 2500;ILLUMINA 89
EGAD00001003981 Illumina HiSeq 2500;ILLUMINA 24
EGAD00001003979 This dataset contains ChIP sequencing data from 24 patients. ChIP of 5–10 mg flash-frozen primary ependymoma tumour was performed using 5 mg H3K27ac antibody per ChIP experiment. The enriched DNA was sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to two lanes per sample were sequenced, resulting in 70 Fastq files. Illumina HiSeq 2000;ILLUMINA 24
EGAD00001003978 WGS genome data from lung cancer patients, aligned to the human reference (human_g1k_v37.fasta). There are 30 paired tumor/normal samples from Samsung Hospital. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 60
EGAD00001003977 RNA was extracted from formalin-fixed and paraffin embedded tumors of a large cohort of bladder cancer patients before treatment with anti-PD-L1. RNA was sequenced using a capture based approach (exome capture, RNA access). Illumina HiSeq 2500;ILLUMINA 348
EGAD00001003976 Well-differentiated (WD) and de-differentiated (DD) liposarcoma, subtypes of adipocytic sarcomas, are pathologically and clinically dissimilar, but are poorly distinguishable at the molecular level. These tumors harbor neochromosomes formed from amplifications and rearrangements of chr12q. Nineteen selected patients with matched WD and DD tumors underwent extensive exomic and transcriptomic profiling to distinguish genomic features between the two subtypes. Shared point mutations suggest a common tumor origin and de-differentiated tumors have higher burdens of deletions. Illumina HiSeq 2000;ILLUMINA 51
EGAD00001003975 Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, unspecified;ILLUMINA 128
EGAD00001003974 Raw data files for the German Epigenome Project (DEEP), IHEC/EpiRR submission of 2017. metadata available at: http://deep.dkfz.de/#/experiments Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, NextSeq 500;ILLUMINA 17
EGAD00001003973 This dataset contains whole exome sequencing data from 24 patients. The Agilent SureSelect Human All Exon 50-Mb target enrichment kit was used to capture all human exons for deep sequencing. For each patient, a tumour and a control sample were sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to three lanes per sample were sequenced, resulting in 118 Fastq files. Illumina HiSeq 2000;ILLUMINA 48
EGAD00001003972 Fastq files for PACA-CA RNA Seq analysis, for DCC release 27 Illumina HiSeq 2500;ILLUMINA 219
EGAD00001003971 ICGC-TCGA DREAM Somatic Mutation Calling - Tumour Heterogeneity Challenge - WGS mapped reads 59
EGAD00001003970 For whole-exome sequencing, 1 µg of DNA from fresh-frozen tumors was fragmented by sonication technology (for DNA from fresh-frozen tumors: Bioruptor, Diagenode, Liège, Belgium; for DNA from FFPE material: Covaris). The fragments were end-repaired and adaptor-ligated, including incorporation of sample index barcodes. After size selection, libraries were subjected to an enrichment process with SureSelect XT (Agilent). The final libraries were sequenced with a paired-end 2×75 bp protocol for an average coverage of 100-120x. Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003969 RNA-seq analyses were performed on cDNA libraries prepared from PolyA+ RNA using the Illumina TruSeq protocol for mRNA. The final libraries were sequenced with a paired-end 2×75 bp protocol aiming at 8.5 Gb per sample for a 30x mean coverage of the annotated transcriptome. All sequencing reactions were conducted on an Illumina HiSeq instrument (Illumina, San Diego, CA, USA). Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003968 The Janus Serum Bank (JSB) is a population-based cancer research biobank. This dataset contains small RNA sequencing (RNA-seq) data of 520 JSB samples from cancer-free individuals. Sequencing libraries were indexed and 12 samples were sequenced per lane on a HiSeq 2500 (Illumina) to an average depth of 18 million reads per sample. The dataset files are raw FASTQ files from the sequencing machine (50bp, single-end sequencing) Illumina HiSeq 2500;ILLUMINA 520
EGAD00001003966 This dataset contains RNA sequencing data from 24 patients. Up to two lanes per tumour sample were sequenced on an Illumina HiSeq 2000 instrument in paired-end mode, resulting in 58 Fastq files. Illumina HiSeq 2000;ILLUMINA 24
EGAD00001003963 March 2018 cumulative data release for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency as part of the International Human Epigenome Consortium HiSeq X Ten;ILLUMINA, Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, NextSeq 500;ILLUMINA 193
EGAD00001003962 January 2018 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 34
EGAD00001003960 2014 lung squamous patient exome data, aligned to the human reference (human_g1k_v37.fasta). There are 51 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 208
EGAD00001003959 Whole genome sequencing data for 25 adenoid cystic carcinoma samples. The samples were used for Illumina TruSeq library construction and were sequenced on an Illumina HiSeq 2000. The PE fastq files are provided. Illumina HiSeq 2000;ILLUMINA 25
EGAD00001003958 Whole exome sequencing data for 18 mucoepidermoid carcinoma samples. The samples were used for Illumina TruSeq library construction and captured using Agilent V4 exome panel. The PE fastq files are provided. Illumina HiSeq 2000;ILLUMINA 18
EGAD00001003957 Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, unspecified;ILLUMINA 23
EGAD00001003956 Illumina platform sequencing data of SureSelect exome libraries prepared from 3 samples from one donor: a normal, primary breast cancer, and cell line derived from metastasis 3
EGAD00001003955 This dataset comprises single-cell RNA sequencing of the human Lin-CD34+38-45RA-90+49f+ phenotype isolated from 2 normal cord donors. Library preparation was performed following a modified CEL-Seq2 protocol. NextSeq 500;ILLUMINA 2
EGAD00001003953 Fastq files for the whole genome sequencing data (Illumina HiSeq 2500; 32.6-fold) for two diffuse gastric cancers revealing the fusion breakpoints. 2102T: CTNND1-ARHGAP26 gene fusion (g.chr11:57,578,103-g.chr5:142,358,707) 354T: ANXA2-MYO9A gene fusion (g.chr15:60,656,550-g.chr15:72,157,966) Illumina HiSeq 2500;ILLUMINA 2
EGAD00001003951 Whole genome sequencing of 4 childhood T-ALL patients, which was further used in single-cell analysis in the paper "Single cell sequencing reveals the origin and the order of mutation acquisition in T-cell acute lymphoblastic leukemia". Illumina HiSeq 2500;ILLUMINA 4
EGAD00001003950 The dataset consists of samples from papillary thyroid cancer patients. A total of 292 DNA samples from blood/normal and cancer tissue are subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset. 290
EGAD00001003948 Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27 39
EGAD00001003947 18 human pancreatic islet preparations derived from 17 were processed for ATAC-seq. The data was generated on an Illumina HiSeq 2500 sequencing machine to generate 50bp paired end read data. The resulting fastq.gz and mapped bam files were deposited. Illumina HiSeq 2500;ILLUMINA 18
EGAD00001003946 DNA from 10 human pancreatic islet samples was processed for Whole-genome Bisulphite Sequencing. The resulting libraries were sequenced on an Illumina HiSeq 2000 to generate 100bp paired-end read data. The resulting fastq.gz and mapped bam files were deposited. Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003945 Bam files for PACA-CA RNA Seq analysis, for DCC release 27 219
EGAD00001003944 Data set of 22 tumor/normal pairs of non-small cell lung cancer (NSCLC) patients. All tissue pairs were screened with MeDIP methylation enrichment sequencing and validations were performed with targeted bisulfite re-sequencing. Illumina HiSeq 2500;ILLUMINA 50
EGAD00001003943 The oral and gut microbiomes of melanoma patients were characterized before the initiation of anti-PD1 immunotherapy, and compared to treatment response. Validation studies were performed in germ-free mice using stool from patients who responded/did not respond to anti-PD1 immunotherapy. All baseline oral (n=86) and gut (n=43) microbiome samples were subject to 16S sequencing - V4 region (merged fastq files have been made available through this portal). Whole genome shotgun sequencing (WGS) was performed on a subset of fecal samples (n=25) - these files are also available (paired end reads). Also available are 16S sequencing results of stool samples from donors (n=2) used in fecal microbiota transplant and murine samples (n=12) from germ-free mice transplanted with stool from responder/non-responder patients. 167
EGAD00001003942 70
EGAD00001003941 Whole-Genome Sequencing of a Healthy Aging Cohort Complete Genomics;COMPLETE_GENOMICS 511
EGAD00001003940 This dataset contains whole genome sequencing data from 24 patients. For each patient, a tumour and a control sample were sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to three lanes per sample were sequenced, resulting in 112 Fastq files. Illumina HiSeq 2000;ILLUMINA 48
EGAD00001003937 BBMRI - BIOS project - Freeze 2 - Bam files - Imprinting analysis Illumina HiSeq 2000;ILLUMINA 131
EGAD00001003936 Sequencing of V4 hypervariable region of 16S gene from microbiota present in intestinal biopsies of IBD patients Illumina MiSeq;ILLUMINA 107
EGAD00001003935 Sequencing of V4 hypervariable region of 16S gene of microbiota present in feces of IBD patients Illumina MiSeq;ILLUMINA 315
EGAD00001003934 70
EGAD00001003932 2014 AML patient exome data, aligned to the human reference (human_g1k_v37.fasta). There are 51 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 102
EGAD00001003931 Sequencing data from 1,005 cancer patients and 812 healthy controls. All samples were prepared using Safe-SeqS technology and sequenced on an Illumina MiSeq and/or HiSeq instrument. Paired FASTQ files correspond to read 1 and the index read (R and I respectively). Illumina HiSeq 4000;ILLUMINA 201
EGAD00001003930 Whole Exome Sequencing Illumina HiSeq 2500;ILLUMINA 6
EGAD00001003928 2016 AML prospective_v1 patient genome data, aligned to the human reference (human_g1k_v37.fasta). There are 5 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. HiSeq X Ten;ILLUMINA 10
EGAD00001003927 Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27 246
EGAD00001003926 Patient-derived organoids model treatment response of metastatic gastrointestinal cancers (80 targeted exome capture samples and 2 whole-genome sequencing samples) HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 82
EGAD00001003925 2014 AML-WGS patient genome data, aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. HiSeq X Ten;ILLUMINA 20
EGAD00001003924 The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. HiSeq X Ten;ILLUMINA 3
EGAD00001003923 The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. Illumina HiSeq 2500;ILLUMINA 15
EGAD00001003920 WGS sequence data from cell lines BT-54/BT-88/BT-92/BT-142 7
EGAD00001003919 We performed whole genome, whole or targeted exome sequencing for 289 individuals from India. This included 152 clinically diagnosed MODY and 137 control samples. Whole genome libraries were constructed using TruSeqNano DNA Library Preparation Kit (Illumina, CA) and sequenced on Illumina HiSeq2500 (Illumina, CA). The whole exome analysis was performed using Agilent SureSelect (Santa Clara, CA) Human All Exome kit v5 (50 Mb). Exome capture libraries were sequenced on HiSeq 2500 (Illumina, CA). Targeted exome sequencing was performed using custom probes corresponding to 1965 genes implicated in pancreatic cell biology and/or diabetes. Illumina HiSeq 2500;ILLUMINA 289
EGAD00001003918 Cancer RNA-seq consisting of FASTQ paired-end reads from ovary samples Illumina HiSeq 2500;ILLUMINA 16
EGAD00001003917 Germline exomes consisting of FASTQ paired-end reads from blood samples Illumina HiSeq 2500;ILLUMINA 19
EGAD00001003916 Cancer exomes consisting of FASTQ paired-end reads from ovary samples Illumina HiSeq 2500;ILLUMINA 19
EGAD00001003915 The human cerebral cortex has undergone rapid expansion and increased complexity during recent evolution. Hominid-specific gene duplications represent a major driving force of evolution, but their impact on human brain evolution remains unclear. Using tailored RNA sequencing (RNAseq), we profiled the spatial and temporal expression of Hominid-specific duplicated (HS) genes in the human fetal cortex, leading to the identification of a repertoire of 36 HS genes displaying robust and dynamic patterns during cortical neurogenesis. Among these we focused on NOTCH2NL, previously uncharacterized HS paralogs of NOTCH2. NOTCH2NL promote the clonal expansion of human cortical progenitors by increasing self-renewal, ultimately leading to higher neuronal output. NOTCH2NL function by activating the Notch pathway, through inhibition of Delta/Notch interactions. Our study uncovers a large repertoire of recently evolved genes linking genomic evolution to human brain development, and reveals how hominin-specific NOTCH paralogs may have contributed to the expansion of the human cortex. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 9
EGAD00001003912 2018 AML-ETO patient genome data, aligned to the human reference (human_g1k_v37.fasta). There are 12 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. Illumina HiSeq 2000;ILLUMINA 24
EGAD00001003911 We generated human induced pluripotent stem cell (iPSC) lines with a GFP reporter inserted in the endogenous NKX6.1 locus. Characterisation of the reporter lines demonstrated faithful GFP labelling of NKX6.1 expression during pancreas and motor neuron differentiation. We performed three independent in vitro differentiations towards the pancreatic endocrine lineage. We FACS-purified GFP positive and negative cells from stage 7 cultures, and generated Smart-Seq2 RNA-sequencing libraries for the pre-sorted cells, as well as the two GFP-sorted cell populations. Gene expression profiling by RNA-sequencing reveals that the NKX6.1-positive population closely resembles mature human beta cells and the functional evaluation of purified populations shows that the glucose-responsive beta-like cells are enriched within the NKX6.1-positive population. These reporter lines provide a valuable resource to the scientific community for the derivation of functional relevant pancreas and neuronal cell subtypes. Illumina HiSeq 4000;ILLUMINA 15
EGAD00001003909 Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls Illumina HiSeq 2000;ILLUMINA, HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, unspecified;ILLUMINA 610
EGAD00001003908 - Six samples from the DEV cell line: 2 controls, 2 transduced with IL4R WT and 2 transduced with IL4R mutant (I242N) - This DEV cell line is not commercially available and was acquired from a colleague in the Netherlands 6
EGAD00001003907 While the preponderance of morbidity and mortality in medulloblastoma patients are due to metastatic disease, most research focuses on the primary tumor due to a dearth of metastatic tissue samples and model systems. Medulloblastoma metastases are found almost exclusively on the leptomeningeal surface of the brain and spinal cord; dissemination is therefore thought to occur through shedding of primary tumor cells into the cerebrospinal fluid followed by distal re-implantation on the leptomeninges. We present evidence for medulloblastoma circulating tumor cells (CTCs) in therapy naïve patients, and demonstrate in vivo through flank xenografting and parabiosis that medulloblastoma CTCs can spread through the blood to the leptomeningeal space to form leptomeningeal metastases. Medulloblastoma leptomeningeal metastases express high levels of the chemokine CCL2, and expression of CCL2 in medulloblastoma in vivo is sufficient to drive leptomeningeal dissemination. Hematogenous dissemination of medulloblastoma offers a new opportunity to diagnose and treat lethal disseminated medulloblastoma. 79
EGAD00001003906 October 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. Illumina HiSeq 2500;ILLUMINA 28
EGAD00001003905 RNA-Seq files accompanying the paper titled "Somatic Histone H3 Mutations in Diffuse Intrinsic Pontine Gliomas and Non-Brainstem Paediatric Glioblastomas". Illumina HiSeq 2000;ILLUMINA 66
EGAD00001003904 Comprehensive transcriptional characterization of bone marrow endothelial cells by RNA sequencing was performed to determine the molecular properties/signatures of endothelium during bone marrow recovery and niche formation. Regenerative bone marrow endothelium was FACS-isolated from bone marrow aspirates of Acute Myeloid Leukemia patients 17 days after receiving chemotherapy (n=3). Niche-forming endothelial cells were FACS-isolated from fetal bones (gestational age 15-20 weeks) (n=3). Healthy adult bone marrow endothelial cells (n=7) were used as steady-state controls. cDNA was prepared using the SMARTer procedure (SMARTer Ultra Low RNA Kit, Clonetech). The provided file type is FASTQ. Illumina HiSeq 2500;ILLUMINA 13
EGAD00001003903 Targeted sequencing of 284 patients with AV nodal reentry tachycardia (AVNRT). Sixty-seven genes, plausibly involved in AVNRT pathophysiology, were targeted using the HaloPlex target enrichment system. Raw paired-end fastq files are provided in this dataset. Illumina MiSeq;ILLUMINA 284
EGAD00001003898 This dataset provides whole genome sequencing data of normal/tumors pairs from 4 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 10 samples (4 normals, 4 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components. Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003895 The dataset contains RNA bam files from Renal Cell Carcinoma patients, accompanying "An Empirical Approach Leveraging Tumorgrafts to Dissect the Tumor Microenvironment in Renal Cell Carcinoma Identifies Missing Link to Prognostic Inflammatory Factors" 59
EGAD00001003894 The dataset (vcf files) consists of rare germline variants of 68 Finnish acute myeloid leukemia patients. We performed exome sequencing and filtered the germline variants against ExAC total MAF<0.01 in two gene panels. The 35 genes in the panels studied here have previously been associated with hematological malignancies and/or solid tumors. The dataset contains only variants of the two gene panels. 68
EGAD00001003892 Hepatocellular carcinoma specimens, intrahepatic cholangiocarcinoma specimens and normal liver tissues collected from 7 samples, including 44 FASTQ files from whole exome sequencing. HiSeq X Ten;ILLUMINA 21
EGAD00001003891 Transcriptome sequencing was performed on 214 patients with myelodysplasia in this study. RNA was obtained from bone marrow CD34+ cells (n=100) and/or bone marrow mononuclear cells (n=165). Transcriptome sequencing was performed for both cell fractions in 51 patients. A total of 211 patients were genotyped by targeted deep sequencing. We also studied bone marrow CD34+ cells and bone marrow mononuclear cells obtained from three healthy adults each. Illumina HiSeq 2500;ILLUMINA 266
EGAD00001003890 HiSeq X Ten;ILLUMINA 72
EGAD00001003889 A SMC04_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003888 A SMC03_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003887 Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom designed bait set consisting of the coding regions of 504 genes previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 8 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples were sequenced. BAM files are available for download. Illumina HiSeq 2500;ILLUMINA 8
EGAD00001003886 In the present study, we have examined fungal and bacterial infection in brain tissue from 10 AD patients and 16 control subjects by next-generation sequencing (NGS) using the MiSeq sequencing platform (Illumina). Illumina MiSeq;ILLUMINA 41
EGAD00001003885 The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. Illumina HiSeq 2500;ILLUMINA 19
EGAD00001003884 The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. HiSeq X Ten;ILLUMINA 37
EGAD00001003882 HiSeq X Five;ILLUMINA 70
EGAD00001003881 A ADMSC04_WGBS paired end data for adipose-derived mesenchymal stem cells HiSeq X Ten;ILLUMINA 1
EGAD00001003880 A ADMSC03_WGBS paired end data for adipose-derived mesenchymal stem cells HiSeq X Ten;ILLUMINA 1
EGAD00001003879 A ADMSC02_WGBS paired end data for adipose-derived mesenchymal stem cells HiSeq X Ten;ILLUMINA 1
EGAD00001003878 A ADMSC01_WGBS paired end data for adipose-derived mesenchymal stem cells HiSeq X Ten;ILLUMINA 1
EGAD00001003877 A SMC09_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003876 A SMC08_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003875 A SMC07_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003874 A SMC06_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003873 A SMC05_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003872 A SMC02_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003871 A SMC01_WGBS paired end data for skeletal muscle cells HiSeq X Ten;ILLUMINA 1
EGAD00001003870 A ADMSC04_smRNA-Seq single end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003869 A ADMSC03_smRNA-Seq single end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003868 A ADMSC02_smRNA-Seq single end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003867 A ADMSC01_smRNA-Seq single end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003866 A SMC09_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003865 A SMC08_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003864 A SMC07_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003863 A SMC06_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003862 A SMC05_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003861 A SMC04_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003860 A SMC03_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003859 A SMC02_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003858 A SMC01_smRNA-Seq single end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003857 A ADMSC04_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003856 A ADMSC03_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003855 A ADMSC02_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003854 A ADMSC01_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stem cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003853 A SMC09_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003852 A SMC08_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003851 A SMC07_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003850 A SMC06_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003849 A SMC05_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003848 A SMC04_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003847 A SMC03_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003846 A SMC02_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003845 A SMC01_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003841 One sample of human genomic DNA. DNA extracted from whole blood. Reads obtained using an exome enrichment kit (Truseq, Illumina) and sequencing of 100bp paired-end reads on a HiSeq 2500 sequencing system (Illumina). Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003837 This dataset, named Stockholm tumor progression cohort, contains exome-sequencing samples of matched primary and metastasis samples from 20 metastatic breast cancer patients. All patients have one or more sequenced normal samples as well. The total number of samples is 125. The dataset has been used, apart from other studies, to explore tumor evolution patterns in metastatic breast cancer at Karolinska Institute Stockholm. Illumina HiSeq 2500 (ILLUMINA) 125
EGAD00001003835 Whole genome sequencing data of 25 prostate tumor and corresponding normal samples, aligned with the CGP BWA-mem workflow. Illumina HiSeq 2000;ILLUMINA 50
EGAD00001003834 This dataset contains whole genome sequencing FASTQ data for 12 cholangiocarcinoma tumor samples, and their matched normal samples. These 12 samples are in addition to 59 samples available in dataset EGAD00001001988, and consist of patients from Thailand, Romania, and Singapore. Paired-end sequencing data was generated by Illumina Hiseq 2000 and 2500, with insert sizes of 170 and 350. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 24
EGAD00001003832 Patient information SSc patients were recruited at the Department of Rheumatology of the Leiden University Medical Center (Leiden, The Netherlands). All patients met the American Rheumatism Association classification criteria for SSc (Subcommittee for scleroderma criteria 1980), and were classified according to LeRoy and Medsger criteria as either limited or diffuse cutaneous disease (LeRoy EC, Black C, Fleischmajer R, Jablonska S, Krieg T, Medsger TA Jr, Rowell N 1988). Institutional review board approval and written informed consent was obtained before patients entered this study. Two 4 mm skin biopsies were taken from a standardized location on the most proximal part of the lower arm, distal from the elbow. In 10 patients the skin biopsy came from a clinically affected area and in 4 patients the skin was locally unaffected. One sample was used for RNA sequencing and one sample was used for immunohistochemistry. Skin biopsies from healthy individuals were commercially sourced (Tissue Solutions, UK) and collected from donors undergoing skin resection surgery and after informed consent. To match the healthy skin with patients as much as possible, skin biopsies from healthy controls were also taken from a similar position (the under-arm (for 4 controls) and leg (for 2 controls)). Healthy skin donors were selected to match the age and sex of the SSc patient cohort. Biopsies from patients and controls were equally treated and were both stored at -80°C until RNA isolation was performed. RNA from frozen skin biopsies was isolated using RNeasy kit from fibrous tissue (Qiagen, the Netherlands). RNA quantity was determined by using SimplyNano 2000 and quality was assessed on Tapestation (Agilent, the Netherlands). All samples included in the study had a RIN score above 7.0. 
Transcriptome characterisation and analysis RNA sequencing was performed using polyA selection and a stranded protocol using Ion Torrent next generation sequencing technology (Service XS, The Netherlands). The Ion PI Template OT2 200 Kit v3 and Ion PI Sequencing 200 Kit v3 were used according to the manufacturer’s instructions. 20 samples were run on 11 PI chips. PI chip analyses, base calling and quality checks were performed using the Torrent Server Suite. An average of 42 million 100 bp reads was generated per sample. Following quality control, reads were aligned to the human genome (Homo sapiens GRCh38.78) using Bowtie2 and STAR (Dobin et al. 2013; Langmead and Salzberg 2012). Reads were first aligned with STAR. For the unmapped reads from STAR, a second alignment step was performed using Bowtie2 (local, very sensitive options). Ion Torrent Proton;ION_TORRENT 20
EGAD00001003831 NextSeq 500;ILLUMINA 6
EGAD00001003829 The data set contains paired end fastq files for whole exome sequencing data for Leiomyosarcoma tumor and control samples Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 96
EGAD00001003828 This dataset contains paired fastq files for LMS tumor samples Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 37
EGAD00001003827 The data set contains bam files aligned using bwa-0.7.8 mem -t 8 -R. HiSeq X Ten;ILLUMINA 4
EGAD00001003825 Illumina MiSeq;ILLUMINA, Illumina HiSeq 2000;ILLUMINA 134
EGAD00001003824 Whole genome sequencing data on 10 human cancer cell lines Complete Genomics;COMPLETE_GENOMICS, Illumina Genome Analyzer IIx (ILLUMINA) 14
EGAD00001003823 Somatic mutations were called using whole exome Sequencing (WES) data from colorectal cancer samples (dataset EGAD00001003821) using MuTect2, with matched constitutional WES-data obtained from leukocytes samples as reference. 37
EGAD00001003822 The dataset comprises 8 breast cancer, 11 ovarian cancer, 1 benign tumour, 18 normal tissue, 2 endometrium, and 23 white blood cell samples. Genome wide methylation analysis was performed by Reduced Representation Bisulfite Sequencing (RRBS) on Illumina HiSeq 2500. Data is provided as FASTQ files Illumina HiSeq 2500;ILLUMINA 63
EGAD00001003821 WES was performed using the KAPA Hyper Prep kit for Illumina (Roche, Basel, Switzerland) for library construction, followed by exome capture using NimbleGen SeqCap EZ Human Exome Library v3.0 (Roche). Reads were mapped using BWA-MEM against the human reference genome HG19. NextSeq 500;ILLUMINA 42
EGAD00001003820 Whole transcriptome, strand-specific RNA-seq libraries were prepared from total RNA purified using the RNeasy mini kit (Qiagen), using Ribo-Zero technology (Epicentre, an Illumina company) for depletion of rRNA followed by library preparation using the ScriptSeq RNA-Seq Library Preparation Kit from Illumina. The paired raw sequence reads were processed using TopHat2 and mapped to the human reference genome HG19. NextSeq 500;ILLUMINA 16
EGAD00001003819 The dataset includes a subset of 762 individuals that were found to be closely related (≤3rd degree), including 263 Chinese and 499 Malays from the Singapore Living Biobank. These samples were whole-exome sequenced on the Illumina HiSeq2000 platform (125bp paired end) with the exonic regions being captured using the Nimblegen SeqCap EZ Exome v3 kits. All the files are in the BAM format. Illumina HiSeq 2000;ILLUMINA 762
EGAD00001003818 BAM files of targeted next-generation DNA sequencing data of 13 chordoid gliomas of the third ventricle (2 paired tumor-normal samples and 11 tumor-only samples). Genomic DNA was extracted from formalin-fixed, paraffin-embedded blocks of tumor tissue from 13 patients with chordoid glioma of the third ventricle using the QIAamp DNA FFPE Tissue Kit (Qiagen). Genomic DNA was also extracted from leukocytes in a peripheral blood sample from one of the patients and a non-neoplastic gastric biopsy specimen from one of the patients. Capture-based next-generation DNA sequencing was performed at the University of California, San Francisco Clinical Cancer Genomics Laboratory, using an assay that targets all coding exons of approximately 500 cancer-related genes, select introns of 47 genes, and TERT promoter with a total sequencing footprint of 2.8 Mb (UCSF500 Cancer Panel). Sequencing libraries were prepared from genomic DNA, and target enrichment was performed by hybrid capture using a custom oligonucleotide library (Nimblegen SeqCap EZ Choice). Captured libraries were sequenced as paired-end 100 bp reads on an Illumina HiSeq 2500 instrument. Duplicate sequencing reads were removed computationally to allow for accurate allele frequency determination and copy number calling. Illumina HiSeq 2500;ILLUMINA 15
EGAD00001003816 HALT AML mRNA - RNASeq mapped reads 22
EGAD00001003815 Whole exome sequencing Illumina HiSeq 2000;ILLUMINA 48
EGAD00001003814 The data contain whole deep RNA sequencing of leukocytes from 17 Greenlanders. RNA was purified from peripheral blood with the PAXGene Blood miRNA Kit (Qiagen). The RNA sequencing library was prepared following the instructions of the TruSeq RNA Sample Prep Kit v2 (Illumina). For mRNA isolation and fragmentation 200 ng of total RNA was purified by oligo-dT beads. The qualified libraries were amplified on cBot to generate the cluster on the flowcell (TruSeq PE Cluster Kit V3–cBot–HS, Illumina). The amplified flow cell was sequenced paired-end on the HiSeq 4000 System (TruSeq SBS KIT-HS V3, Illumina). Illumina HiSeq 4000;ILLUMINA 17
EGAD00001003813 The data contain whole exome sequencing of 27 Greenlanders in nine trios. Data were produced by Agilent SureSelect capture followed by paired-end Illumina HiSeq 2000 sequencing to a depth of 90.1X. More details on processing and analysis can be found in Moltke et al, Nature 2014 (PMID 25043022). Illumina HiSeq 2000;ILLUMINA 27
EGAD00001003812 Whole genome sequencing of samples from isolated populations from Croatia. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2017-11-22. HiSeq X Ten;ILLUMINA 20
EGAD00001003811 Our project will examine the role of PIK3CA mutations and their sensitivity to endocrine therapies, including with the addition of complete ovarian suppression. We plan to test our hypotheses using tumour samples collected from patients enrolled in the SOFT/IBCSG24-02 clinical study (Suppression of Ovarian Function Trial, NCT00066690). SOFT is a phase III trial that randomised 3066 premenopausal women to evaluate if adding ovarian suppression to adjuvant endocrine therapy will improve clinical outcomes. This dataset contains all the data available for this study on 2017-11-22. Illumina HiSeq 2500;ILLUMINA 81
EGAD00001003810 An RNA Seq study of the effects of HDAC inhibitor Quisinostat on six different synovial sarcoma cell lines NextSeq 500;ILLUMINA 12
EGAD00001003808 Illumina HiSeq 2500;ILLUMINA 47
EGAD00001003807 Whole transcriptome RNA sequencing (RNA-seq) of human induced pluripotent stem cell lines from three independent donors at seven islet developmental stages: definitive endoderm (DE), primitive gut tube (GT), posterior foregut (PF), pancreatic endoderm (PE), endocrine progenitors (EP), endocrine-like cells (EN), and beta-like cells (BLC). 24
EGAD00001003806 cDNA depleted RNA (500ng total RNA input) was fragmented to 150-200 nucleotides in first strand buffer for 3 minutes at 94°C. Random hexamer primed first strand was generated in presence of dATP, dGTP, dCTP and dTTP. Second strand was generated using dUTP instead of dTTP to tag the second strand. Subsequent steps to generate the sequencing libraries were performed with the KAPA HTP Library Preparation Kit for Illumina sequencing with minor modifications, i.e., after indexed adapter ligation to the dsDNA fragments, the library was treated with USER enzyme (NEB_M5505L) in order to digest the second strand derived fragments. After amplification of the libraries, samples with unique sample indexes were pooled and sequenced paired-end 2x50bp on a HiSeq2500 system following standard Illumina guidelines. Illumina HiSeq 2500;ILLUMINA 36
EGAD00001003805 A whole genome mutation analysis of cortical kidney tissue, an early passage kidney organoid culture derived from the kidney tissue sample, and a late passage of the same organoid culture. HiSeq X Ten (ILLUMINA) 3
EGAD00001003804 Exome fastq files of 98 hepatocellular carcinoma and matched normal (BCM, HCC-JP) Illumina HiSeq 2000;ILLUMINA 196
EGAD00001003803 This dataset contains VCF files from a variant calling analysis of 19 neuroblastoma patients. WES or WGS data of the primary tumor were compared to WES cfDNA analysis at the time of diagnosis and at a 2nd timepoint (complete remission, partial remission, disease progression or relapse). For 4 patients, WGS of germline, tumor at diagnosis and tumor at relapse DNA was performed on Illumina HiSeq2500, with 100-bp paired-end reads. For the other patients, WES was performed using either an Agilent SureSelect Human All Exon v5 or a Roche Nimblegen SeqCap EZ Exome V3 kit on Illumina HiSeq2000, with 100-bp paired-end reads. SNVs observed in any of the primary tumors or cfDNA samples studied by WES were targeted using a capture sequencing panel at all intermediate time points. 146
EGAD00001003802 106 FFPE tumor samples from small bowel were sequenced with Illumina HiSeq 4000. Exome capture was performed with the NimbleGen SeqCap EZ Exome Library v3 Kit. Reads were aligned with BWA-MEM v.0.7.12 to the GRCh37 reference genome. Variant calls were produced with GATK HaplotypeCaller. Variant calls were filtered against all data from the gnomAD database using an allele frequency threshold of 0.0001 in order to remove germline variation. 106
EGAD00001003801 RNAseq Data set Illumina HiSeq 2000;ILLUMINA 40
EGAD00001003800 Whole Exome Sequencing was performed in a dilution series containing known amounts of human and mouse DNA, 3x 100% human 0% mouse, 2x 90/10, 3x 50/50, 2x 25/75 and 3x 0/100. A set of breast cancer clinical samples, matched normal tissue and matched PDTXs (total number = 14) were also analysed. Paired-end 75bp sequences for the dilution series and paired-end 125bp for the clinical samples were obtained on Illumina HiSeq2500; fastq files are provided. A triplicate analysis of the transcriptome using RNA-seq was also performed for the Universal Human RNA Reference and the Universal Mouse RNA Reference samples. Paired-end 150bp fastq files obtained on Illumina HiSeq4000 are provided. Illumina HiSeq 2500;ILLUMINA, Illumina HiSeq 4000;ILLUMINA 12
EGAD00001003799 We performed whole-exome sequencing and whole epigenome sequencing (RRBS) of samples collected at different time points during radiotherapy from thirty-four ESCC patients. We compared the genetic and epigenetic features of the biopsy samples from different time points to reveal the changes in ESCC following radiotherapy. Illumina HiSeq 2500;ILLUMINA 180
EGAD00001003797 This dataset contains WES data (.bam files) and associated phenotype information from 10 patients included in our microbiome study who went on to anti PD-1 immunotherapy for the treatment of metastatic melanoma at the University of Texas MD Anderson Cancer Center. Both tumor and matching germ line normal were sequenced on each patient using Illumina HiSeq 2500. The average coverage was 283X in tumors and 135X in germline (tumor+germline overall:209, Range: 0-1552). Illumina HiSeq 2500;ILLUMINA 20
EGAD00001003795 This dataset includes Nimblegen SeqCap EZ Exome v3 data for each lesion of three patients with multicentric glioma. For two patients, each lesion was sequenced along with whole blood. For a third patient, 3 pieces from the right lesion and 4 pieces from the left were sequenced along with whole blood. In each case BAM files that have been aligned with BWA mem alignment are available. 15
EGAD00001003794 8
EGAD00001003793 By differential gene expression analysis followed by protein expression and functional studies, we define that the naive T cells having divided the least since thymic emigration express complement receptors (CR1 and CR2) known to bind complement C3b- and C3d-decorated microbial products and, following activation, produce IL-8 (CXCL8), a major chemoattractant for neutrophils in bacterial defense. We also observed an IL-8–producing memory T cell subpopulation coexpressing CR1 and CR2 and with a gene expression signature resembling that of RTEs. JCI Insight. 2017;2(16):e93739. https://doi.org/10.1172/jci.insight.93739 Illumina HiSeq 2500;ILLUMINA 24
EGAD00001003792 The dataset for High Grade Serous Ovarian Carcinomas Originate in the Fallopian Tube includes 46 bam files from next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include multiple lesions from nine patients, five with high grade serous ovarian carcinoma and four who are BRCA-carriers. Illumina HiSeq 2500;ILLUMINA 46
EGAD00001003791 The SAHGP characterises the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole genome sequencing (WGS). 24
EGAD00001003790 RNA-seq data consisting of FASTQ paired-end reads from 5 FHD/FHDL patients Illumina HiSeq 2000;ILLUMINA 13
EGAD00001003789 Exome data consisting of FASTQ paired-end reads from 5 FHD/FHDL patients Illumina HiSeq 2000;ILLUMINA 16
EGAD00001003788 Whole Exome Sequencing of 9 Colorectal Cancer (CRC) samples performed on Illumina HiSeq4000 consisting of aligned paired reads. RNAseq data sequenced on Illumina NextSeq500 consisting of FASTQ single reads from 3 CRC colon samples. A total of 12 samples from five patients (matched normal tissue or PBMC and tumors) were sequenced on Illumina NextSeq500. NextSeq 500;ILLUMINA, Illumina HiSeq 4000;ILLUMINA 24
EGAD00001003787 BBMRI - BIOS project - Freeze 2 - Fastq files - GoNL samples Illumina HiSeq 2000;ILLUMINA 420
EGAD00001003786 BBMRI - BIOS project - Freeze 2 - Bam files - GoNL samples Illumina HiSeq 2000;ILLUMINA 420
EGAD00001003785 BBMRI - BIOS project - Freeze 2 - Fastq files - unrelated samples Illumina HiSeq 2000;ILLUMINA 3,559
EGAD00001003784 BBMRI - BIOS project - Freeze 2 - Bam files - unrelated samples Illumina HiSeq 2000;ILLUMINA 3,559
EGAD00001003783 Recent studies using next-generation sequencing strategies have described the landscape of genetic alterations in diffuse large B-cell lymphoma (DLBCL). However, little is known about the clinical relevance of recurrent mutations and copy number alterations and their transcriptional footprints. This study examines the frequency, interaction and clinical impact of recurrent genetic aberrations in DLBCL using high-resolution technologies in a large population-based cohort. 324
EGAD00001003782 When available (25 primary MDS, 12 MDS/MPN, and 6 AML-MRC cases), high quality RNA (stranded-total) was submitted for RNA-seq. RNA was extracted from bulk myeloid cells which was used as the tumor population. Files uploaded are mapped BAM files. Illumina HiSeq 2000;ILLUMINA 43
EGAD00001003781 Paired whole exome sequencing for 32 primary MDS, 14 MDS/MPN, and 8 AML-MRC cases (total = 54). Normal comparator genomic DNA was extracted from lymphocytes purified by flow cytometry. Bulk myeloid cells were used as a source of tumor gDNA. Files uploaded are mapped BAM files. Illumina HiSeq 2000;ILLUMINA 94
EGAD00001003780 RNA-seq data obtained from directed differentiation of a subset of FiPSCs and BiPSCs cell lines towards islet-like cells. RNA was collected at two key developmental stages: definitive endoderm (DE) and pancreatic progenitors (PP). Illumina HiSeq 2500;ILLUMINA 16
EGAD00001003778 Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003776 186 tumor/normal matched samples from whole exome sequencing and 178 samples (168 tumors, 10 normals) from whole transcriptome sequencing Illumina HiSeq 2500 (ILLUMINA) 550
EGAD00001003769 This dataset is a time-series of EGFR-mutant NSCLC clinical specimens from an individual patient profiled using tumor-based whole exome sequencing and the data is in BAM format. DNA was extracted from FFPE for primary tumor and frozen tumor tissue samples and matched non-tumor tissue using the Qiagen Allprep DNA/RNA Mini Kit. The library preparation protocol was based on the Agilent SureSelect Library Prep and Capture System. DNA was resuspended in a low TE buffer and sheared (Duty Cycle 5%; Intensity 175; Cycles/Burst: 200; Time: 300s, Covaris S2 Ultrasonicator). Bar-coded exome libraries were prepared using the Agilent SureSelect V5 library kit per manufacturer’s specifications. The libraries were run on the HiSeq2500. Raw paired end reads (100bp) in FastQ format generated by the Illumina pipeline were aligned to the full hg19 genomic assembly obtained from UCSC, gencode 14, using bwa version 0.7.12. Picard tools version 1.117 was used to sort, remove duplicate reads and generate QC statistics. Tumor DNA was sequenced to median depth of 303X (range 114.39-383.41) and the matched germline DNA to average depth of 231.65. Illumina HiSeq 2500;ILLUMINA 8
EGAD00001003765 Whole-exome sequencing of 20 samples of actinic keratosis (10) and cutaneous squamous cell carcinoma (10) was performed to investigate a potential relationship between DNA methylation-based subtypes and genetic mutation patterns. 7 samples were shown to belong to the stem cell-like subclass (4 AK and 2 SCC), 12 to the keratinocyte-like subtype (6 AK and 6 SCC), and one SCC sample is unclassified (it was not included in the methylation analysis). Exome regions were captured using the Agilent Low Input Exome-Seq Human v5 kit and sequenced on Illumina HiSeq4000 with paired-end 100-nucleotide reads. Illumina HiSeq 4000;ILLUMINA 20
EGAD00001003764 Four RNA-sequencing datasets from two patients with initial low-grade glioma and copy number alteration at IDH1 upon recurrence. Data is provided as bam files. 4
EGAD00001003763 15 whole exome sequencing datasets from five patients. Data is provided as bam files. Libraries were generated using the SeqCap EZ Exome v3.0 kit and sequenced on an Illumina sequencer 15
EGAD00001003762 Whole Exome sequencing of paediatric High Grade Gliomas Illumina HiSeq 2000;ILLUMINA 100
EGAD00001003761 This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, unspecified;ILLUMINA 617
EGAD00001003760 There are 88 paired samples from HCC patients, including tumors and matched adjacent normal tissues, which were sequenced on the Illumina HiSeq 2000 platform. Illumina HiSeq 2000;ILLUMINA 176
EGAD00001003759 ATAC-seq data for 5 non-diabetic human pancreatic islet samples Illumina HiSeq 2500;ILLUMINA 5
EGAD00001003758 BBMRI - BIOS project - Freeze 2 - Bam files Illumina HiSeq 2000;ILLUMINA 3,686
EGAD00001003757 BBMRI - BIOS project - Freeze 2 - Fastq files Illumina HiSeq 2000;ILLUMINA 3,686
EGAD00001003755 This dataset provides whole genome sequencing data of normal/tumors pairs from 9 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 27 samples (9 normals, 16 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components. Illumina HiSeq 2000;ILLUMINA 27
EGAD00001003754 Structural variant calls from Delly, VCF format 37
EGAD00001003753 Single nucleotide variant calls from SomaticSniper, VCF format. Input for subclonal reconstruction 20
EGAD00001003752 Single nucleotide variant calls from SomaticSniper, VCF format 34
EGAD00001003751 Whole genome sequencing data for primary tumors, matching control material from blood and their corresponding organoid. Whole transcriptome data for organoids. HiSeq X Ten;ILLUMINA, NextSeq 500;ILLUMINA 102
EGAD00001003750 This is the first whole exome sequencing analysis of a primary meningeal melanocytic tumour (MMT) alongside the patient's germline. Here we report the CRAM files from the tumour and germline. Illumina HiSeq 2500;ILLUMINA 2
EGAD00001003749 Isotype-resolved sequencing of B cell receptor in measles virus infection 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13. Illumina MiSeq;ILLUMINA 182
EGAD00001003748 Sequencing of B-cell receptor repertoires in healthy individuals and patients with chronic lymphocytic leukemia. 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13. Illumina MiSeq;ILLUMINA 387
EGAD00001003747 Optimisation of ex vivo Memory B cell Expansion/Differentiation for Interrogation of Rare Peripheral Memory B Cell Subset Responses 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13. Illumina MiSeq;ILLUMINA 38
EGAD00001003746 Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom designed bait set consisting of the coding regions of 504 genes previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 14 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples and 8 non-matched normal skeletal muscle samples were sequenced. BAM files are available for download. Illumina HiSeq 2500;ILLUMINA 22
EGAD00001003745 Exome sequencing fastq files from 6 mutation carriers and 5 non-carriers from 2 families. One µg DNA was used for library preparation using the TruSeq DNA LT Sample Prep Kit v2 according to the manufacturer’s instructions (Illumina). Hybridization was performed using Nimblegen SeqCap EZ Exome v3 (Roche) and Paired-end Sequencing (2x100 bp) on the Illumina HiSeq 2000 with TruSeq v3 chemistry (Illumina). Illumina HiSeq 2000;ILLUMINA 11
EGAD00001003744 Genome and transcriptome sequence data from a pleomorphic xanthoastrocytoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003743 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003742 Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003741 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003740 Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003739 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003738 Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003737 Genome and transcriptome sequence data from a sinus adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003736 Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003735 Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003734 Genome and transcriptome sequence data from a spindle cell carcinoma of the left parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003733 Genome and transcriptome sequence data from a metastatic uterine leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003732 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003731 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003730 Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003729 Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003728 Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003727 Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003726 Genome and transcriptome sequence data from a large-cell neuroendocrine lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003725 Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectosigmoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003724 Genome and transcriptome sequence data from a T-cell rich B cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003723 Genome and transcriptome sequence data from a squamous cell carcinoma of the anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003722 Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003721 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003720 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003719 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003718 Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003717 Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003716 Genome and transcriptome sequence data from a melanoma of the right buccal mucosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003715 Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003714 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003713 Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003712 Genome and transcriptome sequence data from a primary of unknown origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003711 Genome and transcriptome sequence data from a bilateral breast lobular cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003710 Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003709 Genome and transcriptome sequence data from a high-grade serous fallopian tube carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003708 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003706 This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA, unspecified;ILLUMINA 616
EGAD00001003705 10 single-cell placental RNA libraries were generated using the Chromium Single Cell 3′ Reagent Kit (10X Genomics). All single-cell libraries were sequenced with a customized paired-end, dual-indexing (98/14/8/10-bp) format according to the recommendation by 10X Genomics. The data were aligned using the Cell Ranger Single-Cell Software Suite (version 1.0). Moreover, plasma RNA from 22 samples was extracted using the RNeasy Mini Kit (Qiagen). cDNA reverse transcription, second-strand synthesis, and RNA-sequencing (RNA-seq) library construction were performed using the Ovation RNA-seq System V2 (NuGEN) kit according to the manufacturer’s protocol. For alignment of the plasma RNA library, adaptor sequences and low-quality bases on the fragment ends (i.e., quality score < 5) were trimmed, and reads were aligned to the human reference genome (hg19) using the TopHat (v2.0.4) software. All aligned reads were deposited in bam file format. Illumina HiSeq 2000;ILLUMINA, NextSeq 500;ILLUMINA 32
EGAD00001003703 The incidence of acute myeloid leukemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 60. Only 10-15% of cases evolve from a pre-existing myeloproliferative or myelodysplastic disorder; the remaining cases arise de novo without a detectable prodrome and are diagnosed upon development of bone marrow failure. Analysis of diagnostic blood samples has demonstrated that de novo AML is preceded by the accumulation of somatic mutations in pre-leukemic hematopoietic stem and progenitor cells (preL-HSPCs) that subsequently undergo clonal expansion. If individuals in this pre-leukemic phase could be identified, methods for determination of risk and monitoring for progression to overt AML could be developed. However, recurrent AML mutations also accumulate during aging in healthy individuals who never develop AML, referred to as age-related clonal hematopoiesis (ARCH). To distinguish individuals with preL-HSPCs at high risk of developing AML from those with ARCH, we undertook deep targeted sequencing of genes recurrently mutated in AML in blood samples from 133 individuals in the European Prospective Investigation into Cancer and Nutrition (EPIC) study taken on average 6 years before they developed AML (pre-AML group), together with 683 matched healthy individuals (Control group). Pre-AML cases displayed accelerated age-correlated accumulation of somatic mutations. The identity, number and variant allele frequency (VAF) of mutations differed between the two groups, and were incorporated into a computational model of AML risk prediction that accurately distinguished pre-AML cases from controls on average 7 years prior to AML development. Our findings provide proof of concept that early prediction of AML development is feasible in high-risk populations, paving the way for early disease detection, monitoring, and potentially prevention. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 628
EGAD00001003702 Genome and transcriptome sequence data from a high grade serous carcinoma of the fallopian tube/ovary/peritoneum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003701 Genome and transcriptome sequence data from a metastatic myoepithelial carcinoma of parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003700 Genome and transcriptome sequence data from a thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003699 Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003698 Genome and transcriptome sequence data from a locally advanced right breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003697 Genome and transcriptome sequence data from a metastatic meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003696 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003695 Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003694 Genome and transcriptome sequence data from a pleomorphic sarcomatoid epithelioid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003693 Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003692 Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003691 Genome sequence data from a metastatic squamous cell carcinoma of the oropharynx patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003690 Transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 1
EGAD00001003689 Genome and transcriptome sequence data from a metastatic epithelioid angiomyolipoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003688 Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003687 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003686 Genome and transcriptome sequence data from a metastatic neuroendocrine tumor arising from small bowel patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003685 Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003684 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003683 Genome and transcriptome sequence data from a metastatic high grade sarcomatous neoplasm nos patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003682 Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003681 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003680 Genome and transcriptome sequence data from a low grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003679 Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003678 Genome and transcriptome sequence data from a thymoma carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003677 Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003676 Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003675 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003674 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003673 Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003672 Genome and transcriptome sequence data from a metastatic clear cell ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003671 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 3
EGAD00001003670 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003669 Genome and transcriptome sequence data from a metastatic mucinous adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003668 Genome and transcriptome sequence data from a metastatic rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003667 Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003666 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003665 Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003664 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003663 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003662 Genome and transcriptome sequence data from a left cavernous sinus invasive skull meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003661 Genome and transcriptome sequence data from an advanced adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003660 Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003659 Genome and transcriptome sequence data from an ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003658 Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003657 Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003656 Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003655 Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003654 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003653 Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003652 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003651 Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003650 Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003649 Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003648 Genome and transcriptome sequence data from a glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003647 Genome and transcriptome sequence data from an anal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003646 Genome and transcriptome sequence data from a squamous cell carcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003645 Genome and transcriptome sequence data from an anaplastic ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003644 Genome and transcriptome sequence data from a metastatic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003643 Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003642 Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003641 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003640 Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003639 Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003638 Genome and transcriptome sequence data from a metastatic prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003637 Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003636 Genome and transcriptome sequence data from a metastatic paraganglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003635 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003634 Genome and transcriptome sequence data from a solitary fibrous tumor (sarcoma) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003633 Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003632 Genome and transcriptome sequence data from a chronic lymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003631 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003630 Genome and transcriptome sequence data from a radiation-induced pleomorphic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003629 Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003628 Genome and transcriptome sequence data from a metastatic adenocarcinoma of appendiceal origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003627 Genome and transcriptome sequence data from a salivary duct carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003626 Genome and transcriptome sequence data from a retroperitoneal mucinous cystic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003625 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003624 Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003623 Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003622 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003621 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003620 Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003619 Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003618 Genome and transcriptome sequence data from a mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003617 Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003616 Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003615 Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003614 Genome and transcriptome sequence data from a metastatic non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003613 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003612 Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003611 Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003610 Genome and transcriptome sequence data from a mullerian mixed tumor with carcinosarcoma of the ovaries patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003609 Genome and transcriptome sequence data from a metastatic serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003608 Genome and transcriptome sequence data from a metastatic small cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003607 Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003606 Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003605 Genome and transcriptome sequence data from a metastatic colonic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003604 Genome and transcriptome sequence data from a metastatic gallbladder cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003603 Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 2
EGAD00001003602 Dataset consisting of: (1) N=234 genome-wide chromatin accessibility (ATAC-seq) profiles for distinct N=21 healthy old and N=28 healthy young subjects. ATAC-seq biological samples provided for the following tissues: PBMC (N=24), CD14+ monocytes (N=18), CD8+ memory T cells (N=7), CD8+ naive T cells (N=7), CD4+ memory T cells (N=7), CD4+ naive T cells (N=7), and naive B cells (N=7). (2) N=39 genome-wide transcription (RNA-seq) data for distinct N=15 healthy old and N=24 healthy young subjects' PBMCs. Illumina HiSeq 2500;ILLUMINA 273
EGAD00001003601 The dataset for Direct Detection of Early-Stage Cancers using Circulating Tumor DNA includes 602 bam files from next-generation sequencing on the Illumina HiSeq2500 or MiSeq. The samples analyzed include cancer cell lines as well as plasma and tissue specimens from healthy individuals and patients with cancer. Illumina MiSeq;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 550
EGAD00001003600 Exome sequencing data for 1001 DLBCL patients and RNA sequencing data for 775 DLBCL patients Illumina HiSeq 2500;ILLUMINA 1,776
EGAD00001003599 These data belong to the 2017 AML genome data, aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. HiSeq X Ten;ILLUMINA 20
EGAD00001003598 These data belong to the 2017 AML prospective data, aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference. HiSeq X Ten;ILLUMINA 20
EGAD00001003597 Promoter capture HiC on KMS11 (multiple myeloma) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003596 The MITOEXME project aims to improve protocols for molecular diagnosis of patients with OXPHOS disorders, with a focus on next-generation sequencing methods, and to increase the knowledge of pathophysiological mechanisms by identification of new targets and cellular studies. In this project we will sequence the exomes of 120 patients. This dataset contains all the data available for this study on 2017-08-29. Illumina HiSeq 2000;ILLUMINA 125
EGAD00001003595 This dataset consists of TLA data in the parents of 9 healthy families and 11 β-thalassemia risk families during pregnancy, cell-free DNA sequencing data, and fetal DNA sequencing data where available. TLA data was collected for the CFTR region in all healthy families and the CYP21A2 region in two of the healthy families. TLA data was collected for the HBB region in the risk families. In each pregnant mother, cell-free DNA was collected, enriched for the region of interest using SureSelect pulldown and sequenced. Samples are labeled Mother_X, Father_X and CVS_X for the healthy families and HBB_Mother_X, HBB_Father_X and HBB_CVS_X for the risk families. cfDNA files can be found under the maternal sample, and each consists of three indices used to increase the maximum number of unique molecules per SNP. Both raw and processed cfDNA data are provided. Raw data is mapped using BWA MEM, sorted using samtools and restricted to the region of interest for the sake of patient privacy. Processed data is mapped using BWA MEM, sorted using samtools, duplicate filtered using samtools rmdup, overlap-clipped using Picard tools and restricted to the region of interest. NextSeq 500;ILLUMINA 55
EGAD00001003594 This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-08-29. Illumina HiSeq 2500;ILLUMINA, Illumina HiSeq 4000;ILLUMINA 391
EGAD00001003593 Complete Genomics;COMPLETE_GENOMICS 24
EGAD00001003592 Merged bam files for PACA-CA Whole Exome Sequencing, for DCC release 25 216
EGAD00001003591 Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 25 211
EGAD00001003590 Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS unaligned reads Illumina HiSeq 2500;ILLUMINA 27
EGAD00001003589 Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS mapped reads 27
EGAD00001003587 This dataset belongs to the 2015 whole-exome-sequenced AML data and is aligned to the human reference (human_g1k_v37.fasta). There are 40 paired NR samples from Chunnam University. All samples passed the QC and recalibration steps during alignment to the reference. HiSeq X Ten;ILLUMINA 80
EGAD00001003586 Whole Genomes Define Concordance in Matched Primary, Xenograft, and Organoid Models of Pancreas Cancer - WGS mapped reads 54
EGAD00001003585 Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - WGS mapped reads 106
EGAD00001003584 Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq mapped reads 50
EGAD00001003583 516 DNA samples were collected from individuals upon enrollment into the European Prospective Investigation into Cancer and Nutrition study between 1993 and 1998 across 17 different centers. 126 bp paired-end sequencing reads from the Illumina platform were converted to fastq format; the 2 bp molecular barcode at each read of the pair was trimmed and written into the read name. The thymine nucleotide required for ligation was removed from the sequences. The Burrows-Wheeler Aligner (BWA-MEM) was used to align the processed fastq files to the hg19 reference genome, followed by indel realignment using GATK. An in-house algorithm was written to collapse read families that share the same molecular barcode sequence. 516
EGAD00001003582 Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq unmapped reads Illumina HiSeq 2500;ILLUMINA 50
EGAD00001003581 Using low input SMART-seq protocol, the whole transcriptome of human small intestine macrophage subtypes is characterized. NextSeq 500;ILLUMINA 33
EGAD00001003580 WGS for 310 tumor-normal pairs from the ICGC ESAD-UK project. Tumors 50x, normals 30x. HiSeq X bam files. These samples are all available in ICGC release 26. Illumina HiSeq 2000 (ILLUMINA) 620
EGAD00001003579 Samples were prepared using Safe-SeqS technology. All samples were run on an Illumina MiSeq instrument. Fastq files for read 1 and the index read are present (R and I respectively). Illumina MiSeq;ILLUMINA 49
EGAD00001003574 Clonal evolution study of Intrahepatic cholangiocarcinoma: 69 PDPCs and 6 tissues. Illumina HiSeq 4000;ILLUMINA 81
EGAD00001003571 The data consist of 678189 genome-wide polymorphic variants for 3658 individuals from the ERF/GRIP region in a variant call format (vcf) file. ERF has been genotyped with different genotyping platforms: Illumina 318 k, 350 k, 610 k and Affymetrix 200 k. 3,658
EGAD00001003565 The project is focused on the axonal forms of Charcot-Marie-Tooth (CMT) disease. We have selected 13 families (7 from Spain and 6 from the Czech Republic) that have been clinically assessed in depth and previously tested for mutations in known CMT genes without any causal variants being identified. In these patients we expect to discover several CMT2 genes. Thus, we requested exome sequencing of 45 DNAs: 27 exomes in the families from Spain and 18 exomes in the families from the Czech Republic. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-08-16. Illumina HiSeq 2500;ILLUMINA 45
EGAD00001003564 The aim of the project is the definition of the molecular defect in a cohort of Rett-like patients negative for mutations in known disease genes. To this aim, a number of unrelated trios (patients plus parents) will be analysed by exome sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-08-16. Illumina HiSeq 2500;ILLUMINA 46
EGAD00001003563 Whole exome sequencing of diffuse intrinsic pontine glioma (DIPG) cells isolated from the pons and from a sub-ventricular zone site of spread within the frontal lobe from the same individual (SU-DIPG-XIII) Illumina HiSeq 2000;ILLUMINA 3
EGAD00001003562 This dataset includes bam files from 120 samples. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. All samples were processed with TruSeq DNA PCR-free sample preparation. HiSeq X Ten;ILLUMINA 118
EGAD00001003561 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: MALY-DE. 99
EGAD00001003560 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: MALY-DE. 99
EGAD00001003559 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: RECA-EU. 100
EGAD00001003558 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: RECA-EU. 100
EGAD00001003557 This dataset belongs to the 2014 whole-genome-sequenced AML data and is aligned to the human reference (human_g1k_v37.fasta). There are 67 paired CR samples from Chunnam University. All samples passed the QC and recalibration steps during alignment to the reference. 134
EGAD00001003556 We will perform RNAseq to evaluate the effects of the loss of a list of TSGs on the transcriptome. This dataset contains all the data available for this study on 2017-08-10. Illumina HiSeq 2500;ILLUMINA 25
EGAD00001003555 40 paired normal and tumour whole-exome sequencing samples were used to investigate the genomic landscape of cutaneous squamous cell carcinoma Illumina HiSeq 2500;ILLUMINA 80
EGAD00001003551 The samples include paired tumor and normal tissues from 106 patients. High-coverage WES or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system Illumina HiSeq 2000;ILLUMINA 212
EGAD00001003550 Cell line exome sequencing Illumina HiSeq 2500;ILLUMINA 176
EGAD00001003549 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: CLLE-ES. 74
EGAD00001003548 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: CLLE-ES. 74
EGAD00001003547 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: LIRI-JP. 130
EGAD00001003546 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: LIRI-JP. 130
EGAD00001003543 HipSci - Usher Syndrome - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 27
EGAD00001003542 HipSci - Retinitis Pigmentosa - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 2
EGAD00001003541 HipSci - Macular Dystrophy - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 3
EGAD00001003540 HipSci - Hypertrophic Cardiomyopathy - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 18
EGAD00001003539 HipSci - Bleeding and Platelet Disorders - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 7
EGAD00001003538 HipSci - Hereditary Cerebellar Ataxias - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 11
EGAD00001003537 HipSci - Hereditary Spastic Paraplegia - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 6
EGAD00001003536 HipSci - Primary Immune Deficiency - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 8
EGAD00001003535 HipSci - Kabuki Syndrome - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 6
EGAD00001003534 HipSci - Congenital Hyperinsulinia - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 5
EGAD00001003532 HipSci - Alport Syndrome - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 7
EGAD00001003531 HipSci - Bardet-Biedl Syndrome - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 59
EGAD00001003530 HipSci - Monogenic Diabetes - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 43
EGAD00001003529 HipSci - Healthy Normals - RNA Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 193
EGAD00001003528 HipSci - Usher Syndrome - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA), Illumina HiSeq 2500;ILLUMINA 27
EGAD00001003527 HipSci - Retinitis Pigmentosa - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 2
EGAD00001003526 HipSci - Primary Immune Deficiency - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 8
EGAD00001003525 HipSci - Macular Dystrophy - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 3
EGAD00001003524 HipSci - Kabuki Syndrome - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 6
EGAD00001003523 HipSci - Hypertrophic Cardiomyopathy - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA), Illumina HiSeq 2500;ILLUMINA 18
EGAD00001003522 HipSci - Hereditary Spastic Paraplegia - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 6
EGAD00001003521 HipSci - Hereditary Cerebellar Ataxias - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 11
EGAD00001003520 HipSci - Congenital Hyperinsulinia - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 5
EGAD00001003519 HipSci - Bleeding and Platelet Disorders - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 7
EGAD00001003518 HipSci - Battens Disease - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 4
EGAD00001003517 HipSci - Alport Syndrome - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 7
EGAD00001003516 HipSci - Monogenic Diabetes - Exome Sequencing - July 2017 Illumina HiSeq 2500 (ILLUMINA) 43
EGAD00001003513 This dataset includes bam files from 3,001 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these bam files also contain reads that are aligned to any of 29 pre-determined off target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. HiSeq X Ten;ILLUMINA, Illumina HiSeq 2000;ILLUMINA 3,001
EGAD00001003512 This dataset includes bam files from 58 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the HTT repeat expansion. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using bwa. Twelve of the samples used TruSeq Nano library preparation and 46 samples used TruSeq DNA PCR-free sample preparation. HiSeq X Ten;ILLUMINA 58
EGAD00001003511 BAM files with sequencing reads derived from Oxford Nanopore MinION whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease. Samples were prepared using 1D and 2D library preps. MinION;OXFORD_NANOPORE 2
EGAD00001003510 BAM files with sequencing reads derived from Illumina whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease. Whole genome sequencing was performed using Illumina HiSeq X Ten and samples were prepared using TruSeq library prep. HiSeq X Ten;ILLUMINA 2
EGAD00001003509 Whole Exome Sequencing reads consisting of BAM paired end reads from Follicular Lymphoma samples. 11
EGAD00001003506 A IPS06_X_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003505 A IPS05_X_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003504 A IPS04_X_Fibroblast_smRNA-Seq single end data for iPSC(Oct4) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003503 A IPS03_N_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003502 A IPS02_N_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003501 A IPS01_N_Fibroblast_smRNA-Seq single end data for iPSC(Oct4) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003500 A CKD27_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003499 A CKD25_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003498 A CKD24_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003497 A CKD23_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003496 A OB57_D_PreA_smRNA-Seq single end data for Preadipocyte(fat) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003495 A OB56_N_PreA_smRNA-Seq single end data for Preadipocytes(fat) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003494 A DB31_N_Alpha_smRNA-Seq single end data for alpha cells(PSA-NCAM(-), pancreas) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003493 A IPS06_X_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003492 A IPS05_X_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003491 A IPS04_X_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003490 A IPS03_N_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003489 A IPS02_N_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003488 A IPS01_N_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003487 A OB57_D_PreA_mRNA-Seq paired end data for Preadipocyte(fat) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003486 A OB56_N_PreA_mRNA-Seq paired end data for Preadipocytes(fat) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003485 A DB31_N_Alpha_mRNA-Seq paired end data for alpha cells(PSA-NCAM(-), pancreas) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003484 A CKD27_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003483 A CKD25_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003482 A CKD24_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003481 A CKD23_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney) Illumina HiSeq 2500;ILLUMINA 1
EGAD00001003480 A OB57_D_PreA_WGBS paired end data for Preadipocyte(fat) HiSeq X Ten;ILLUMINA 1
EGAD00001003479 A OB56_N_PreA_WGBS paired end data for Preadipocytes(fat) HiSeq X Ten;ILLUMINA 1
EGAD00001003478 A IPS06_X_ENeuron_WGBS paired end data for Early neuron cells(Tuj1) HiSeq X Ten;ILLUMINA 1
EGAD00001003477 A IPS05_X_NPC_WGBS paired end data for Neural progenitor cells(Nestin) HiSeq X Ten;ILLUMINA 1
EGAD00001003476 A IPS04_X_Fibroblast_WGBS paired end data for iPSC(Oct4) HiSeq X Ten;ILLUMINA 1
EGAD00001003475 A IPS03_N_ENeuron_WGBS paired end data for Early neuron cells(Tuj1) HiSeq X Ten;ILLUMINA 1
EGAD00001003474 A IPS02_N_NPC_WGBS paired end data for Neural progenitor cells(Nestin) HiSeq X Ten;ILLUMINA 1
EGAD00001003473 A IPS01_N_Fibroblast_WGBS paired end data for iPSC(Oct4) HiSeq X Ten;ILLUMINA 1
EGAD00001003472 A DB31_N_Alpha_WGBS paired end data for alpha cells(PSA-NCAM(-), pancreas) HiSeq X Ten;ILLUMINA 1
EGAD00001003471 A CKD27_C_Mesan_WGBS paired end data for Mesangial cells(kidney) HiSeq X Ten;ILLUMINA 1
EGAD00001003470 A CKD25_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) HiSeq X Ten;ILLUMINA 1
EGAD00001003469 A CKD24_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) HiSeq X Ten;ILLUMINA 1
EGAD00001003468 A CKD23_C_Mesan_WGBS paired end data for Mesangial cells(kidney) HiSeq X Ten;ILLUMINA 1
EGAD00001003467 This dataset contains exome sequencing data for 77 tumor-normal pairs from HCC patients at National Taiwan University, Taiwan. Illumina HiSeq 2500;ILLUMINA 154
EGAD00001003466 This dataset contains exome sequencing data for 21 tumor-normal pairs from HCC patients at Chang Gung Memorial Hospital, Taiwan. Illumina HiSeq 2500;ILLUMINA 42
EGAD00001003464 For RNA-Seq, total RNA was isolated following LDC67 or JQ1 treatment. 3’RNAseq libraries were prepared with the QUANT SEQ FWD 3´mRNA-Seq Kit (Lexogen, Austria) and sequenced on an Illumina HiSeq 4000. Illumina HiSeq 2000;ILLUMINA 3
EGAD00001003463 These are the vcf files from exome sequencing of the two probands who were found to harbor mutations in KLB. Sample EGAN00001564799 is proband 1 and sample EGAN00001564800 is proband 11 in the KLB paper. Exome capture was performed using the SureSelect All Exon capture (Agilent Technologies, Santa Clara, CA USA) and sequencing was performed on the HiSeq2500 (Illumina, San Diego CA USA). 2
EGAD00001003461 H3K27ac ChIP-seq and input genome sequencing was performed in 19 primary prostate tumours classified as intermediate risk. Sequencing of ChIP DNA was performed on an Illumina HiSeq 2000 as either single end 50 bp reads (for 7 samples) or paired end 100 bp reads (for 12 samples). Input DNA from all samples was sequenced using single-end 50 bp reads. The files provided are in fastq format. Illumina HiSeq 2000;ILLUMINA 38
EGAD00001003460 Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 13
EGAD00001003459 Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, which allowed us to use the Demuxlet method, which uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Since genotype information is lacking for 2 individuals, the transcriptomes of only 45 individuals could be retrieved. Illumina HiSeq 4000;ILLUMINA 8
EGAD00001003458 Fastq data on the genomic heterogeneity of multiple synchronous lung cancers (MSLCs). Whole-genome sequencing (WGS) was performed on 3 tumour samples, one regional lymph node metastasis sample and a peripheral blood sample from the same patient with MSLCs. Illumina HiSeq 2000;ILLUMINA 6
EGAD00001003456 There are 5 WGS and 35 WES sample pairs from the First Affiliated Hospital of Kunming Medical University, which belong to the ICGC project COCA-CN. Illumina HiSeq 2000;ILLUMINA 80
EGAD00001003455 The MHC vcf call set was generated using a modified AsmVar and BayesTyper pipeline. In contrast to the original pipeline, where variant calling is performed using alignment of collapsed assemblies to a reference genome, the MHC call set was produced using alignment of phased MHC haplotypes. Two iterations of BayesTyper were run: a first iteration for each haplotype separately and a second iteration performing joint variant calling on all haplotypes. The sample IDs for the fathers and mothers are TrioID-01 and TrioID-02, respectively, and the IDs for the children are TrioID-0x, where x is a number between 3 and 7. 25
EGAD00001003454 Validation of HLA variation of 8 individuals from the GenomeDenmark Phase 2 study. Validation was performed by Sanger sequencing of selected amplicons (5-10 amplicons per sample). AB 3730xL Genetic Analyzer;CAPILLARY 8
EGAD00001003453 16S sequencing of stool samples of LifeLines-DEEP, domain V4 Illumina MiSeq;ILLUMINA 1,010
EGAD00001003452 The samples include paired tumor and normal tissues from 205 patients (201 for normal and primary tumor tissues; 4 for normal, primary tumor and liver metastatic tissues). High-coverage WES or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system Illumina HiSeq 2000;ILLUMINA 30
EGAD00001003448 Strand-specific RNA-seq data from 19 gastric tumors and their adjacent normal tissues, plus 16 gastric cancer cell lines, one normal gastric cell line, and 3 normal stomach RNAs Illumina HiSeq 2500;ILLUMINA 58
EGAD00001003446 This dataset includes deep coverage (>60x) whole exomes of 15 human embryonic stem cell lines. Genomic DNA was purified and fragmented using the Illumina Nextera system for library preparation and sequenced using 150bp paired-end reads. Sequencing reads were aligned to the hg19 reference genome using the BWA MEM alignment program. HiSeq X Ten;ILLUMINA 15
EGAD00001003445 Clear cell renal cancer is characterized by near-universal loss of the short arm of chromosome 3 (3p). This event arises through unknown mechanisms, but critically results in the loss of several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cancer (ccRCC) recruited into the Renal TRACERx study. We find novel hotspots of point mutations in the 5'-UTR of TERT, targeting a MYC-MAX repressor, that result in telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. Using molecular clocks, we estimate this occurs in childhood or adolescence, generally preceding emergence of the most recent common ancestor by years to decades. Similar genomic changes are seen in inherited kidney cancers. Modeling differences in age-incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Targeting essential genes in deleted regions of chromosome 3p could represent a potential preventative strategy for renal cancer. HiSeq X Ten;ILLUMINA 164
EGAD00001003444 This dataset contains both standard RNA-Seq and small RNA-Seq of TSC-related cortical tubers and age-matched cortical controls. For the standard RNA-Seq, paired-end sequencing was carried out. Each sample was split across multiple lanes. For the files available here, the multiple lanes have been merged together, resulting in one forward and one reverse .fastq file for each sample. Small RNA-Seq was carried out on the same samples that underwent standard RNA-Seq. Again, paired-end sequencing was carried out. The files here are raw and will need to undergo quality control and trimming. Illumina HiSeq 2500;ILLUMINA 44
EGAD00001003443 Massively parallel nanowell-based single-cell gene expression profiling Illumina HiSeq 2500;ILLUMINA 14
EGAD00001003441 A total of 584 tumor specimens and/or patient-derived cells across 14 cancer types were subjected to whole-exome/targeted-exome and/or whole-transcriptome sequencing. Illumina HiSeq 2500;ILLUMINA 584
EGAD00001003440 One file of patient 16 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only. HiSeq X Ten;ILLUMINA 1
EGAD00001003439 Three files of patients 10, 11 and 13 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only. HiSeq X Ten;ILLUMINA 3
EGAD00001003438 Three files of patients 20, 23 and 25 with WGS done on Illumina HiSeq 2000. For research purposes and authorised users only. Illumina HiSeq 2000;ILLUMINA 3
EGAD00001003437 Fourteen files of patients 1, 2, 4, 6, 7, 8, 9, 12, 14, 16, 17, 18, 19 and 27 with WGS done on Illumina MiSeq with low coverage. For research purposes and authorised users only. Illumina MiSeq;ILLUMINA 14
EGAD00001003436 Seven files of patients 3, 21, 29, 30, 31, 32 and 33 with WGS done on Illumina MiSeq with high coverage. For research purposes and authorised users only. Illumina MiSeq;ILLUMINA 7
EGAD00001003435 Whole Genome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" Illumina HiSeq 2000;ILLUMINA 150
EGAD00001003434 Whole Exome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" Illumina HiSeq 2000;ILLUMINA 149
EGAD00001003433 RNA-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" Illumina HiSeq 2000;ILLUMINA 95
EGAD00001003432 ChIP-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" Illumina HiSeq 2000;ILLUMINA 20
EGAD00001003431 High-coverage WGS of DNA samples from 45 pairs of GCs was performed on the Illumina HiSeq X Ten System. Illumina HiSeq 2000 (ILLUMINA) 88
EGAD00001003430 RNA analysis of six patients (34, 35, 36, 37, 38 and 39) with WGS done on Illumina HiSeq2500. For research purposes and authorised users only. Illumina HiSeq 2500;ILLUMINA 6
EGAD00001003429 RNA analysis of two patients (11 and 15) with WGS done on Illumina HiSeq2000. For research purposes and authorised users only. Illumina HiSeq 2000;ILLUMINA 2
EGAD00001003428 RNAseq data from the study: "Widespread DNA hypomethylation and differential gene expression in Turner syndrome". Illumina HiSeq 2000;ILLUMINA, NextSeq 500;ILLUMINA 37
EGAD00001003427 Genome-wide profiling of DNA methylation levels by RRBS in 349 samples, derived from 112 glioblastoma (IDH wildtype) patients, 13 IDH-mutated brain tumor patients, and 5 normal brain controls. For each patient, samples from at least two and up to six tumor resections are available. For 6 patients, multiple regions of each tumor were sampled. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 3000;ILLUMINA, Illumina HiSeq 4000;ILLUMINA 349
EGAD00001003425 An EGFR-mutant NSCLC cell line that is sensitive to AZD9291 inhibition was mutagenised with the chemical mutagen ENU and then drug-selected using AZD9291. Single-cell-derived colonies were then manually picked and expanded in drug. Resistance was confirmed in a 14-day assay and DNA was collected. These samples then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors hypothesised from currently available literature. This dataset contains all the data available for this study on 2017-07-05. Illumina MiSeq;ILLUMINA 177
EGAD00001003422 WXS from barcoded cells that are FACS sorted from GBM-719 xenografts, and the germline reference from patient GBM-719. The 4 xenografts are named according to passage (secondary or tertiary) and treatment (vehicle control or temozolomide). Illumina HiSeq 2500;ILLUMINA 5
EGAD00001003421 Sequence data of 28 samples (19 chronic lymphocytic leukemia, 9 control), including RNA-Seq and ChIP-Seq of the following histone modifications: H3, H3K4me1, H3K4me3, H3K9ac, H3K9me3, H3K27ac, H3K27me3, H3K36me3. For the project, see: http://www.cancerepisys.org/ 28
EGAD00001003419 Illumina HiSeq 2000;ILLUMINA 50
EGAD00001003416 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: OV-AU. 93
EGAD00001003415 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: OV-AU. 93
EGAD00001003414 June 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. Illumina HiSeq 2500;ILLUMINA, NextSeq 500;ILLUMINA 40
EGAD00001003412 Illumina HiSeq 2000;ILLUMINA 152
EGAD00001003411 ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: PACA-AU. 81
EGAD00001003410 ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: PACA-AU. 81
EGAD00001003409 Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are part of a clinical, pathological and genetic continuum. The purpose of the present study was to assess the mutation burden present in known ALS and/or FTD disease-causing genes in 54 patients (16 with available postmortem neuropathological diagnosis) with concurrent ALS and FTD (ALS/FTD) not carrying the C9orf72 hexanucleotide repeat expansion, the most important genetic cause in both diseases. Illumina HiSeq 2500;ILLUMINA 54
EGAD00001003407 Whole-genome sequencing and phasing of admixed Aboriginal Australian genomes and Papua New Guinean genomes using 10x Genomics Chromium technology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-06-27. HiSeq X Ten;ILLUMINA 4
EGAD00001003406 DDD DATAFREEZE 2016-10-03: 7831 trios - exome sequence CRAM files 24,697
EGAD00001003405 High-coverage WGS of DNA samples from 23 pairs of GCs was performed on the Illumina HiSeq X Ten System. Illumina HiSeq 2000;ILLUMINA 46
EGAD00001003404 Background: Epigenetic heterogeneity within a tumour can play an important role in tumour evolution and the emergence of resistance to treatment. It is increasingly recognised that the study of DNA methylation (DNAm) patterns along the genome - so-called `epialleles' - offers greater insight into epigenetic dynamics than conventional analyses which examine DNAm marks individually. Results: We have developed a Bayesian model to infer which epialleles are present in multiple regions of the same tumour. We apply our method to reduced representation bisulfite sequencing (RRBS) data from multiple regions of one lung cancer tumour and a matched normal sample. The model borrows information from all tumour regions to leverage greater statistical power. The total number of epialleles, the epiallele DNAm patterns, and a noise hyperparameter are all automatically inferred from the data. Uncertainty as to which epiallele an observed sequencing read originated from is explicitly incorporated by marginalising over the appropriate posterior densities. The degree to which tumour samples are contaminated with normal tissue can be estimated and corrected for. By tracing the distribution of epialleles throughout the tumour we can infer the phylogenetic history of the tumour, identify epialleles that differ between normal and cancer tissue, and define a measure of global epigenetic disorder. Conclusions: Detection and comparison of epialleles within multiple tumour regions enables phylogenetic analyses, identification of differentially expressed epialleles, and provides a measure of epigenetic heterogeneity. Illumina HiSeq 2500;ILLUMINA 8
EGAD00001003400 We present targeted NGS panel data from 170 samples that were processed using the TruSight™ Cancer (TSC) panel (Illumina, San Diego, CA, USA), which targets 94 genes and 284 SNPs associated with a predisposition towards cancer. The samples are enriched for CNVs in the genes of interest. All CNVs have previously been assessed with MLPA and can therefore be considered as confirmed. Illumina MiSeq;ILLUMINA 170
EGAD00001003399 RNAseq dataset of 34 samples (6 normals, 7 stroma-enriched, 21 malignant cells-enriched) from patients with resected pancreatic ductal carcinoma. Illumina HiSeq 4000;ILLUMINA 34
EGAD00001003395 This dataset consists of the exome sequencing data for 30 tumour and germline DNA pairs derived from relapsed/refractory DLBCL. 60
EGAD00001003394 This dataset contains bam files for ChIP-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft). It includes the bam files for the H3K27ac mark as well as the bam files of the corresponding input DNA for each sample. Illumina HiSeq 2500 (ILLUMINA) 6
EGAD00001003393 This dataset contains bam files for RNA-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft) and 3 pairs of neuroblastoma tumors at diagnosis and at relapse. NextSeq 500 (ILLUMINA), Illumina HiSeq 2500 (ILLUMINA), Illumina HiSeq 4000 (ILLUMINA) 12
EGAD00001003392 High-coverage WGS sequencing of DNA samples from 51 pairs of GCs was performed on the Illumina HiSeq X Ten System. Illumina HiSeq 2000;ILLUMINA 102
EGAD00001003391 DCM-controls (113 human non-DCM samples): human heart biopsies from 113 non-diseased controls were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample-dataset contains the output from tophat-1.4.1 (one *.bam file with the aligned reads and two *.fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences of gene expression and splicing between dilated cardiomyopathy patients and controls. Illumina HiSeq 2000;ILLUMINA 113
EGAD00001003390 DCM-cases (149 human DCM samples): human heart biopsies from 149 patients with dilated cardiomyopathy (DCM) were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample-dataset contains the output from tophat-1.4.1 (one *.bam file with the aligned reads and two *.fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences of gene expression and splicing between dilated cardiomyopathy patients and controls. Illumina HiSeq 2000;ILLUMINA 149
EGAD00001003388 Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 366 samples: matched tumour-normal pairs from 183 melanoma patients. 366
EGAD00001003387 MinION;OXFORD_NANOPORE 19
EGAD00001003386 Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer primary tumor sample (OT2_cohort) AB 5500 Genetic Analyzer;ABI_SOLID 1
EGAD00001003385 Whole-exome sequencing on AB 5500xl Genetic Analyzer of Blood EDTA (OT2_cohort) AB 5500 Genetic Analyzer;ABI_SOLID 1
EGAD00001003384 Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 14
EGAD00001003383 Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003382 MinION;OXFORD_NANOPORE 26
EGAD00001003381 Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003380 Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003379 Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 4
EGAD00001003378 Whole-exome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003377 Whole-exome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003376 Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 12
EGAD00001003375 Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003374 Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 8
EGAD00001003373 Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003372 Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 4
EGAD00001003371 Whole-genome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003370 Whole-genome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003369 RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 13
EGAD00001003368 RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003367 RNAseq on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003366 RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003365 RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003364 RNAseq on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) Illumina HiSeq 2000;ILLUMINA 4
EGAD00001003363 Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort) Illumina HiSeq 2000;ILLUMINA 114
EGAD00001003362 RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort) Illumina HiSeq 2000;ILLUMINA 49
EGAD00001003361 VCF files containing mitochondrial variant calls using MToolbox 432
EGAD00001003360 Bam files containing mitochondrial alignments, extracted from CPCGene Whole Genome Alignments 432
EGAD00001003359 In this study, we present the results of a custom “pan-cardiomyopathy panel” in a molecular screening of 38 unrelated patients, 16 affected by DCM, 14 by HCM, and 8 by ARVC. The panel was designed using the Design Studio Tool (Illumina, San Diego, CA, USA). Coding regions and intron–exon boundaries of 115 genes, known to be associated with DCM, HCM, and ARVC as well as channelopathies, were selected for targeted gene enrichment. For genes with multiple transcripts, all exons included in transcripts expressed in cardiac muscle were considered in the gene panel design. Total DNA was extracted from peripheral blood samples using the Wizard Genomic DNA Purification Kit (Promega, Mannheim, Germany) according to the manufacturer’s instructions, quantified, and qualitatively checked using NanoDrop 2000c (Thermo Fisher Scientific, Waltham, MA, USA). Custom targeted gene enrichment and DNA library preparation were performed using the Nextera Capture Custom Enrichment kit (Illumina) according to the manufacturer’s instructions. Targeted regions were sequenced using the Illumina MiSeq platform, generating approximately two million 150-bp paired-end reads for each sample (Q30 ≥90%). Illumina MiSeq;ILLUMINA 38
EGAD00001003358 Whole-exome sequencing of papillary thyroid carcinoma from Saudi Arabia 189
EGAD00001003357 Aligned, merged and deduplicated BAM files from HiSeq whole exome sequencing of 106 samples: matched tumour-normal pairs from 53 melanoma patients. 106
EGAD00001003355 From 17 patients undergoing knee joint replacement surgery for osteoarthritis, we collected 4 samples each: intact cartilage, degraded cartilage, synovium, and meniscus. We also collected blood for DNA analysis. Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample. This dataset contains all the data available for this study on 2017-06-09. Illumina HiSeq 2500;ILLUMINA 72
EGAD00001003354 From 9 patients undergoing hip joint replacement surgery for osteoarthritis, we collected 3 cartilage samples each: a low-grade sample (no obvious evidence of damage or fibrillation); a high-grade sample (damaged and fibrillated cartilage); an osteophytic sample (overlaid bony protrusions mainly around the margins of the articular surface). Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample. This dataset contains all the data available for this study on 2017-06-09. Illumina HiSeq 2500;ILLUMINA 27
EGAD00001003353 BAM outputs from STAR (https://github.com/alexdobin/STAR) analysis of RNASeq sequencing on HiSeq platform of 56 tumour samples from 46 melanoma cases. Gene model = Ensembl version 70 56
EGAD00001003351 In order to comprehensively investigate the genetic relationship between PTC tumors and benign nodules, we collected a total of 127 fresh-frozen biopsy samples from 28 patients with concurrent thyroid benign nodule and PTC (n=20) or simple benign nodule (n=8). We carried out whole-exome sequencing on all 127 biopsy samples and RNA sequencing on a total of 40 samples. Illumina HiSeq 2500;ILLUMINA 127
EGAD00001003350 DDD DATAFREEZE 2016-10-03: 7831 trios - phenotypic and family descriptions 24,674
EGAD00001003349 ChIP-seq data (H3K4Me1, H3K4Me3, H3K27Ac histone modifications) in experimental triplicates on multiple myeloma cell line KMS11 and plasma cell leukaemia cell lines L363 and JJN3. ChIP reactions were performed on a Diagenode SX-8G IP-Star Compact using Diagenode automated Ideal Kit. ChIP libraries were generated using HTP Illumina library preparation kit, and sequenced using Illumina HiSeq 2000 with 100 bp single-ended reads. ChIP-seq files are in BED format. Illumina HiSeq 2000;ILLUMINA 9
EGAD00001003348 The differentiation of distinct forms of multifocal hepatocellular carcinoma (HCC), multicentric disease vs. intrahepatic metastases, in which the management and prognosis vary substantively, remains problematic. We aim to stratify multifocal HCC and identify novel diagnostic and prognostic biomarkers by performing whole genome and transcriptome sequencing, as part of a multi-omics strategy. Illumina HiSeq 2000;ILLUMINA 8
EGAD00001003347 This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-05-24. Illumina HiSeq 2000;ILLUMINA 76
EGAD00001003345 Exome sequence data for 57 HIV elite long-term non-progressors and rapid progressors. Complete dataset of improved BAMs mapped to hs37d5 and including phenotype information. 57
EGAD00001003344 Transcriptome profiling of 25 prostate tumor samples by RNA-Seq Illumina HiSeq 2000;ILLUMINA 25
EGAD00001003342 Identification of fusion transcripts by RNA-sequencing and Whole genome sequencing of a breast cancer patient sample (METABRIC ID MB-0152) Illumina HiSeq 2000;ILLUMINA 3
EGAD00001003341 Sequence data from fungal infection isolated from neural tissue in ALS patients. Illumina MiSeq;ILLUMINA 34
EGAD00001003340 DDD DATAFREEZE 2016-10-03: 7831 trios - VCF files 24,664
EGAD00001003339 Whole exome library making will be performed on genomic DNA derived from radiotherapy-induced sarcoma samples and matched normal DNA from the same patients. Next Generation sequencing will be performed on the resulting libraries and mapped to build 37 of the human reference genome to facilitate the identification of mutations. This dataset contains all the data available for this study on 2017-05-17. Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003337 T cells isolated from peripheral blood, tumors and adjacent normal tissues from six hepatocellular carcinoma patients. The SmartSeq2 and Tang2009 protocols were used to amplify RNA from single T cells. High depth enables simultaneous expression profiling and TCR assembly. Illumina HiSeq 2500 (ILLUMINA), Illumina HiSeq 4000 (ILLUMINA) 5,063
EGAD00001003336 BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 29 pancreatic neuroendocrine cases. 29
EGAD00001003335 As a resource for assessing exon CNV calling methods in targeted NGS data, we present the ICR96 exon CNV validation series. The dataset includes high-quality sequencing data from a targeted NGS assay (the TruSight Cancer Panel) together with Multiplex Ligation-dependent Probe Amplification (MLPA) results for 96 independent samples. 66 samples contain at least one validated exon CNV and 30 samples have validated negative results for exon CNVs in 26 genes. The dataset includes 46 exon CNVs in BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, EPCAM and PTEN, giving excellent representation of the cancer predisposition genes most frequently tested in clinical practice. Moreover, the validated exon CNVs include 25 single-exon CNVs, the most difficult type of exon CNV to detect. Illumina HiSeq 2500;ILLUMINA 96
EGAD00001003334 Targeted exome sequencing of patient derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000;ILLUMINA 573
EGAD00001003332 PCR and MiSeq validation for early embryonic substitution candidates from 400 breast cancer patients. This dataset contains all the data available for this study on 2017-05-11. Illumina MiSeq;ILLUMINA 4
EGAD00001003331 Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 78
EGAD00001003330 The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000;ILLUMINA 416
EGAD00001003329 The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing cohort samples from the Born In Bradford study will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consist of low coverage whole exome sequencing on these samples. This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 3,188
EGAD00001003328 Clinical and genetic information of an individual with RVOT-VT and a KCNK2 (TREK1) gene mutation obtained after whole exome sequencing. 1
EGAD00001003326 Azoospermia, characterized by the absence of spermatozoa in the ejaculate, is a common cause of male infertility with a poorly characterized etiology. Exome sequencing analysis of two azoospermic brothers allowed the identification of a homozygous splice mutation in SPINK2, encoding a serine protease inhibitor believed to target acrosin, the main sperm acrosomal protease. In accord with these findings, we observed that homozygous Spink2 KO male mice had azoospermia. Moreover, despite normal fertility, heterozygous male mice had a high rate of morphologically abnormal spermatozoa and reduced sperm motility. Further analysis demonstrated that in the absence of Spink2, protease-induced stress initiates Golgi fragmentation and prevents acrosome biogenesis, leading to spermatid differentiation arrest. We also observed a deleterious effect of acrosin overexpression in HEK cells, an effect that was alleviated by SPINK2 coexpression, confirming its role as an acrosin inhibitor. These results demonstrate that SPINK2 is necessary to neutralize proteases during their cellular transit towards the acrosome and that its deficiency induces a pathological continuum ranging from oligoasthenoteratozoospermia in heterozygotes to azoospermia in homozygotes. Illumina HiSeq 2000;ILLUMINA 2
EGAD00001003325 Exome from EGAS00001002441 Illumina HiSeq 2500;ILLUMINA 2
EGAD00001003323 Runs that contain data for the sensitivity and specificity experiments for BiSeqS. Illumina MiSeq;ILLUMINA 2
EGAD00001003321 Systematic next generation sequencing efforts are beginning to define the genomic landscape across a range of primary tumours, but we know very little of the mutational evolution that contributes to disease progression. We therefore propose to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in a cohort of matched primary and metastatic colorectal cancers, and additionally to explore the extent to which those mutations identified as recurrent in the metastatic setting are able to subvert normal biological processes using both genetically engineered mouse models and established cancer cell lines. This study will enable us to define to what extent primary tumour profiling can capture the biological processes operative in matched metastases as well as the significance of intratumoural heterogeneity. This dataset contains all the data available for this study on 2017-05-04. Illumina HiSeq 2000;ILLUMINA 523
EGAD00001003320 Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. This dataset contains all the data available for this study on 2017-05-04. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 106
EGAD00001003318 RNA-sequencing alignment for SYSCOL colorectal adenoma-carcinoma samples 314
EGAD00001003317 There are 22 pairs of LAML cases in this project, which belongs to LAML-CN. The library is constructed by the Illumina protocol. Illumina HiSeq 2000;ILLUMINA 63
EGAD00001003316 RNAseq of LC2AD with AD80 or DMSO. Plenker et al., Mechanistic insight into RET kinase inhibitors targeting the DFG-out conformation in RET-rearranged cancer Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003315 This dataset includes the high-throughput sequencing data from a study entitled "Clonal History and Genetic Predictors of Transformation into Small Cell Carcinomas from Lung Adenocarcinomas". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was run on HiSeq X or HiSeq 2500 machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset. HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 16
EGAD00001003311 Dataset contains one sample derived from gDNA of human fibroblasts. Files are in FASTQ format and were generated using the Agilent SureSelect Human All Exon 50Mb Kit followed by Next Generation Sequencing on a HiSeq 2000 instrument (Illumina). Illumina HiSeq 2000;ILLUMINA 1
EGAD00001003310 There are 66 pairs of LAML cases (Complete Genomics) in this project, which belongs to LAML-CN. The library is constructed by the Complete Genomics protocol. Complete Genomics;COMPLETE_GENOMICS 66
EGAD00001003309 The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2017-04-27. HiSeq X Ten;ILLUMINA 139
EGAD00001003308 This is an in vitro genome-wide CRISPR/cas9 screen in human glioblastoma stem cells, screening for genes essential for survival of these cells. These cells express cas9 and have been transfected with a guide RNA library causing gene knockouts. We will analyse the sequencing data for depletion of guide RNAs. This dataset contains all the data available for this study on 2017-04-27. Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003307 In this project we will use exome sequencing to identify somatic mutations in lesions from a patient with a germline mutation in the protection of telomeres 1 gene (POT1). This dataset contains all the data available for this study on 2017-04-27. Illumina MiSeq;ILLUMINA, Illumina HiSeq 2000;ILLUMINA 40
EGAD00001003306 Exome sequencing data of 15 French Caucasian and 10 African-Caribbean men with prostate cancer. Illumina HiSeq 2000;ILLUMINA 50
EGAD00001003305 Diffuse Intrinsic Pontine Glioma (DIPG) is a fatal brain cancer that arises in the brainstem of children with no effective treatment. To understand what drives DIPGs we integrated whole-genome-sequencing with methylation, expression and copy-number profiling. AB SOLiD System (ABI_SOLID), Illumina HiSeq 2500 (ILLUMINA) 23
EGAD00001003304 We collected tumor samples and adjacent normal mucosae from 46 patients with colorectal cancer during surgical operations from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. Complete Genomics;COMPLETE_GENOMICS 38
EGAD00001003303 The evolution of four breast cancers was analyzed using longitudinal samples collected over 2-15 years. Whole-genome sequencing and single-cell RNA-Seq were used to analyze evolution. We have deposited VCF files for SNV, indel, and structural variant calls from WGS data, and a text file showing transcripts per million (TPM) expression for the single-cell RNA-Seq data. 16
EGAD00001003302 Illumina HiSeq 3000;ILLUMINA 21
EGAD00001003301 Whole exome sequencing of 10 metastatic biopsies from four TRACERx100 patients (see EGA dataset EGAS00001002247), collected either after relapse or death. The data from these samples are initially published with Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early stage lung cancer evolution. Nature, http://dx.doi.org/10.1038/nature22364 (2017). Abstract: Earlier detection of relapse following primary surgery for non-small cell lung cancer and the characterization of emerging subclones seeding metastatic sites might offer new therapeutic approaches to limit tumor recurrence. The potential to non-invasively track tumor evolutionary dynamics in ctDNA of early-stage lung cancer is not established. Here we conduct a patient-specific approach to ctDNA profiling in the first 100 lung TRACERx (TRAcking Cancer Evolution through therapy (Rx)) study participants, including one patient co-recruited to the PEACE (Posthumous Evaluation of Advanced Cancer Environment) post-mortem study. We identify independent predictors of ctDNA release in early-stage non-small cell lung cancer and perform tumor volume limit of detection analyses. Through blinded profiling of post-operative plasma, we observe evidence of adjuvant chemotherapy resistance and identify patients destined to experience recurrence of their lung cancer. Finally, we show that phylogenetic ctDNA profiling tracks the subclonal nature of lung cancer relapse and metastases, providing a new approach for ctDNA driven therapeutic studies. 10
EGAD00001003300 Paired-end reads were aligned to human reference genome build hg19 by BWA. Single nucleotide variants and small insertions/deletions were called by GATK resulting in a single VCF file including all 176 samples. 175
EGAD00001003298 BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 95 pancreatic adenocarcinoma cases. 96
EGAD00001003297 9
EGAD00001003296 Integrated callset of low coverage Ethiopian and Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 220
EGAD00001003295 Integrated callset of high coverage Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 3
EGAD00001003294 Integrated callset of high coverage Ethiopian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 5
EGAD00001003293 RNA-Seq and WXS from 6 glioblastoma patients Illumina HiSeq 2500;ILLUMINA 11
EGAD00001003292 Illumina HiSeq 2000;ILLUMINA 520
EGAD00001003291 This dataset represents RNA-sequencing data from 278 primary colon cancers obtained from fresh-frozen tumor sections. RNA-sequencing was performed using TruSeq library preparation and samples were sequenced on Illumina NextSeq and HiSeq. The data are available as Illumina NextSeq and HiSeq fastq files (_R1.fastq and _R2.fastq for each tumor sample, 556 files in total). NextSeq 500 (ILLUMINA), Illumina HiSeq 2500 (ILLUMINA) 278
EGAD00001003290 Whole genome sequencing for 12 late onset prostate cancer tumor/control pairs (ICGC) 24
EGAD00001003286 Whole genome sequencing data for MMML (7 tumors and 8 controls) 15
EGAD00001003285 RNA sequencing data for MMML (3 tumor samples and 1 gcbcell) 5
EGAD00001003284 Whole exome sequencing of enteropathy-associated T cell lymphoma (EATL) tumors and paired normals, as well as RNA-sequencing of EATL tumors: including (1) 69 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples, (2) 36 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL paired normal samples, and (3) 32 RNAseq, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples. Illumina HiSeq 2500;ILLUMINA 137
EGAD00001003283 Whole genome sequencing data for MMML (healthy cell_line) 24
EGAD00001003282 Analysis scripts and output 37
EGAD00001003281 Genomic alterations driving tumorigenesis result from the interaction of environmental exposures and endogenous cellular processes. With a diversity of risk factors including viral infection, carcinogenic exposures and metabolic diseases, liver cancer is an ideal model to study these interactions. Whole genome sequencing of liver tumors identified 10 mutational signatures showing distinct relationships with environmental exposures, replication and transcription. Transcription-coupled damage was specifically associated with the liver-specific signature 16 and alcohol intake. A flood of indels was identified in very highly expressed hepato-specific genes, likely resulting from replication-transcription collisions. Reconstruction of sub-clonal architecture revealed mutational signature evolution during tumor development, exemplified by the vanishing of the aflatoxin-B1 signature in African migrants. These findings shed new light on the natural history of liver cancers. Illumina HiSeq 2000;ILLUMINA 52
EGAD00001003280 NextSeq 550;ILLUMINA 16
EGAD00001003279 RNA sequencing data for 170 medulloblastoma tumor samples Illumina HiSeq 2000;ILLUMINA 171
EGAD00001003278 Whole Exome and Target Sequencing Data in 75 Samples from 5 Hepatocellular Carcinoma Patients. The sequencing was performed on an Illumina HiSeq 4000. Background and aims: Intratumoral heterogeneity (ITH) challenges identifying mutations with target therapy potential whereas circulating cell-free DNAs (cfDNAs) could reflect nearly the entire mutation spectrum in given tumors. We investigated how to minimize the limit of ITH for profiling hepatocellular carcinoma (HCC). Methods: Thirty-two multi-regional HCC samples from five patients were subjected to whole exome sequencing (WES) and targeted deep sequencing (TDS). ITH extent was measured by the average percentage of non-ubiquitous mutations (present in parts of tumor regions). Matched cfDNAs were also analyzed by WES and TDS. Profiling efficiencies of single tumor specimen and cfDNA were compared and the one that better depicted the mutational landscape was selected to screen therapeutic targets. Results: We found variable extents of ITH in HCCs and observed branched and parallel evolution patterns. ITH level decreased at higher sequencing depth of TDS than that measured by WES (28.1% vs 34.9%, P < 0.01) but it remained unchanged when additional samples were analyzed. TDS of a single tumor specimen detected an average of 70% of the total mutations in HCC. Although more mutations were detected in cfDNA under TDS than WES, an average of 47.2% of total HCC mutations uncovered by cfDNA suggested that tissue outperforms cfDNA and that the latter may serve as an alternative in profiling the HCC genome. Consequently, TDS of single tumor tissue in 66 patients and cfDNAs in four unresectable HCCs identified 38.6% (26/66 and 1/4) patients bearing therapeutic targets. Conclusions: TDS of a single tumor specimen could largely circumvent ITH to uncover mutations indicative of target therapy in HCC. Illumina HiSeq 4000;ILLUMINA 124
EGAD00001003276 Whole genome sequencing data for MMML (24 tumor/control pairs), fastq-files Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 49
EGAD00001003275 Targeted resequencing of samples was done with TruSeq custom amplicon low input kit (TSCA-LI, Illumina). The oligo capture probes were designed to include a prefix of 8 random nucleotides at the 5' end of each probe. The assay is designed such that each targeted locus is annealed with two probes, resulting in amplicons tagged with unique molecular identifiers (UMI) (22) of 16 bases. Raw FASTQ sequencing files were processed as follows: (a) The first 8 bases were trimmed from each read and recorded with the corresponding base quality scores (BQ) in the attribute field. (b) Reads were aligned with BWA. (c) A first round of PCR duplicate cleaning was performed with picard tools markDuplicates using the parameters BARCODE_TAG=BC TAGGING_POLICY=All REMOVE_DUPLICATES=true. (d) Since in the previous step only duplicate reads with identical UMIs were removed, a second pass of filtering was done. Reads with identical mapping were considered unique only if their corresponding UMIs were different in at least 3 positions (i.e., UMI edit distance > 2). (e) Paired-end read pairs overlapping genomic positions were clipped to avoid overestimation of the sequencing coverage using bamUtils clipOverlap. NextSeq 550;ILLUMINA 74
EGAD00001003274 Whole genome sequencing data for MMML (tumor/control pairs and one cell_line) 315
EGAD00001003273 Low-coverage whole genome sequencing for the establishment of genome-wide copy number alterations in pleural effusions and respective primary tumors Illumina MiSeq;ILLUMINA 20
EGAD00001003272 March 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. Illumina HiSeq 2500;ILLUMINA 8
EGAD00001003271 WGS of 23 patients diagnosed with NKTL. The tumor samples were sequenced with the Illumina HiSeq 2500 platform and the resulting FASTQ files have been uploaded. Illumina HiSeq 2000 (ILLUMINA) 102
EGAD00001003270 ICGC DCC Release 24, PACA-CA Whole Genome sequence merged alignments 95
EGAD00001003269 High-coverage WGS sequencing of DNA samples from 90pairs GCs was performed on the Illumina HiSeq X Ten System. Illumina HiSeq 2000;ILLUMINA 1,332
EGAD00001003268 HGSC cases in the OvCaRe and CRCHUM Tumour Banks were selected according to the following criteria: (i) were administered platinum taxane based therapy; (ii) relapsed within 12 months (365 days) or had at least longer than 4.5 years (1642.5 days) follow-up data; (iii) had at least 50% tumour content by H&E staining and expert pathology review. All cases were re-reviewed by expert pathologists to confirm the diagnosis of HGSC. Germline BRCA1 and BRCA2 was determined for all patients through hereditary cancer screening programs. The design of cases selection as a discovery cohort was engineered to amplify biological differences by selecting cases from the extremes of the outcome distribution. All HGSC tumours are primary tumour samples. Library construction and sequencing Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) were performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qbit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. Illumina HiSeq 2000;ILLUMINA 118
EGAD00001003267 For GCT cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists prior to inclusion in the sequencing cohort who provided the confirmation on final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. All GCT tumours are primary tumour samples. Library construction and sequencing Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) were performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qbit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. Illumina HiSeq 2000;ILLUMINA 20
EGAD00001003266 For ENOC cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists prior to inclusion in the sequencing cohort who provided the confirmation on final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. For ENOC, DAH985 and DG1288 are recurrent and both were treated with chemotherapy after their first surgery. DAH123 is an untreated sample, a metastasis from a primary endometrial tumour. All HGSC, GCT, CCOC and the remaining ENOC tumours are primary tumour samples. Library construction and sequencing Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) were performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qbit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. Illumina HiSeq 2000;ILLUMINA 58
EGAD00001003265 For CCOC cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists prior to inclusion in the sequencing cohort who provided the confirmation on final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. All CCOC tumours are primary tumour samples. Library construction and sequencing Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) were performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qbit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. Illumina HiSeq 2000;ILLUMINA, Illumina Genome Analyzer II;ILLUMINA 70
EGAD00001003264 ICGC DCC Release 24, PACA-CA Exome sequence 190
EGAD00001003263 ICGC DCC Release 24, PACA-CA Deep KRAS sequencing 82
EGAD00001003262 High-coverage WES sequencing of DNA samples from 50 PTCs was performed on the Illumina HiSeq 2500 or 4000 System Illumina HiSeq 2000;ILLUMINA 100
EGAD00001003261 These are seven sequencing files from whole exome and whole genome of five tissue samples collected from one pancreatic cancer patient HiSeq X Ten;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 5
EGAD00001003260 The cell lines in this study are a combination of internally sequenced (cosmic) and externally sequenced cell lines known to be “double-wild-type” (lacking BRAF and NRAS somatic mutations). These sequences were realigned in this data set for consistency. 22
EGAD00001003259 Regions of common inter-individual DNA methylation differences in human monocytes – potential function and genetic basis WGBS Data of Samples: 43_Hm03_BlMo_Ct, 43_Hm02_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm01_BlMo_Ct For details about sequencing or sample metadata check http://deep.dkfz.de/ Illumina HiSeq 2000;ILLUMINA 4
EGAD00001003256 Whole genome sequencing for 131 early onset prostate tumor/control pairs (ICGC) 262
EGAD00001003255 Transcriptome of anaplastic meningiomas Illumina HiSeq 2500;ILLUMINA 34
EGAD00001003254 R&D project to develop low input library construction methods. Illumina HiSeq 2500;ILLUMINA 12
EGAD00001003253 Targeted gene screen of cell line tumour samples for testing the new V2 Colorectal gene panel. Illumina HiSeq 2000;ILLUMINA 57
EGAD00001003252 Sequencing of drug resistant organoids Illumina HiSeq 2000;ILLUMINA 36
EGAD00001003250 1 cm biopsies from patients undergoing bladder cystectomy will be collected. The underlying muscle and stroma will be removed and the remaining epithelia dissected into small sequential areas which will be sent for ultra-deep exome sequencing using a panel of known cancer and viral genes. Sequence analysis using similar methods to Martincorena I et al (Science 2015, 348:880) will provide an idea of the somatic mutational landscape in these patient samples. Individual patient muscle samples will also be sequenced as a reference. Illumina HiSeq 2000;ILLUMINA 55
EGAD00001003248 A BRAF V600E colorectal organoid which is sensitive to MAP kinase inhibition was mutagenised with the chemical mutagen ENU and then drug selected using a combination of Trametinib, Dabrafenib and Cetuximab. Single cell derived organoids were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors from a screen in 2 2D BRAF V600E colorectal cell lines. Pools of resistant clones were also sequenced. Illumina MiSeq;ILLUMINA 36
EGAD00001003247 Liberal variant calls generated with VarScan 37
EGAD00001003246 Whole exome sequencing of hepatosplenic T cell lymphoma (HSTL) tumors, paired normals, and cell lines, including (1) 68 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL tumor samples, (2) 20 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL paired normal samples, and (3) 2 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL cell lines. Illumina HiSeq 2500;ILLUMINA 90
EGAD00001003245 We aim to sequence the small RNAs of 22 human melanoma cell lines in biological triplicate in order to define the microRNA expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ Illumina HiSeq 2500;ILLUMINA 66
EGAD00001003244 We aim to sequence the mRNA transcriptome of 22 human melanoma cell lines in biological triplicate in order to define the gene expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ Illumina HiSeq 2000;ILLUMINA 66
EGAD00001003242 This study comprises three different datasets: 1) 57 samples from the 1243 canapps cell line study, 2) 91 FFPE normal samples and 3) 87 samples from the SCORT WS2 dataset. The aim is to sequence these 235 samples in order to test the new V2 Colorectal bait design. Illumina HiSeq 2000;ILLUMINA 92
EGAD00001003241 Toxoplasmosis is a zoonotic disease caused by a ubiquitous protozoan parasite called Toxoplasma gondii, which can infect all mammal and bird species throughout the world. Seroprevalence varies widely between countries. Studies have estimated that between 7-34% of people in the UK have been infected with T. gondii. The vast majority of these people will not have noticed any symptoms; however, about 10% of people develop a mild to moderate self-limiting flu-like illness. Following the acute active stage of the infection the parasite persists in the body in the form of cysts, particularly in heart and skeletal muscle and nervous system tissues, for many years, and usually for life. In immunocompetent persons these cysts do not pose a health risk. We will use RNA-seq to quantify the transcriptional response of macrophages to T. gondii infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ Illumina HiSeq 2000;ILLUMINA 18
EGAD00001003240 Study of cell lineage and embryogenesis using biopsy samples from sites across the whole body (post mortem). Sample donors are recruited sensitively through the Phoenix study and consent to samples being taken after their death for both the Phoenix study and this WTSI study. HiSeq X Ten;ILLUMINA 33
EGAD00001003239 This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment. Illumina HiSeq 2000;ILLUMINA 80
EGAD00001003237 Primary mucosal melanomas (MMs) arise from melanocytes located in mucosal membranes lining the respiratory, gastrointestinal and urogenital tracts. MMs frequently present late and have a poor prognosis; the 5-year survival rate is only 14%. MM makes up only ~1.4% of all melanomas and it is this rarity that makes knowledge of the genetic changes that contribute to its pathogenesis limited to a small number of exome/genome studies and other targeted studies. Thus to investigate the somatic alterations and mutation spectra in MM genomes, we have extracted genomic DNA from formalin-fixed, paraffin-embedded (FFPE) human MMs, and subjected them to whole exome sequencing. Given the propensity of MM to metastasize, we will also be sequencing metastatic MM lesions; primary and metastatic lesions from the same individual represent an excellent opportunity to identify potential drivers of metastasis in MM. Finally we will sequence 'normal' DNA from the same individual, where possible, to exclude germline variations. Illumina HiSeq 2000;ILLUMINA 141
EGAD00001003236 Raw whole genome sequence data (fastq) for the GATCI project HiSeq X Ten;ILLUMINA 10
EGAD00001003235 Raw exome sequence data (fastq) for the GATCI project unspecified;ILLUMINA 172
EGAD00001003234 Aligned whole genome sequence from AML relapse project 33
EGAD00001003231 Poly A transcriptome sequence of multifocal hepatocellular carcinoma Illumina HiSeq 2000;ILLUMINA 7
EGAD00001003230 Small RNA expression profiles of the blood plasma-derived exosomes from B-cell chronic lymphocytic leukemia patients Illumina HiSeq 2000;ILLUMINA 3
EGAD00001003227 ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: OV-AU. 146
EGAD00001003225 ICGC prostate UK study batches 4-6 prostatectomy analysis. Whole genome sequenced normal (blood) and malignant tissue pair of 111 patients. Illumina HiSeq 2000;ILLUMINA 221
EGAD00001003224 We collected tumor samples and adjacent normal mucosae from 17 patients with colorectal cancer during surgical operations from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. Illumina HiSeq 2000;ILLUMINA 34
EGAD00001003223 We collected tumor samples and adjacent normal mucosae from 5 patients with colorectal cancer during surgical operations from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. Illumina HiSeq 2000;ILLUMINA 10
EGAD00001003222 Aligned, merged and deduplicated BAM files from HiSeqXTen sequencing of six samples: matched tumour-normal pairs from three melanoma patients. 6
EGAD00001003221 Aligned, merged and deduplicated BAM files from BGISeq-500 sequencing of six samples: matched tumour-normal pairs from three melanoma patients. 6
EGAD00001003220 Whole genome, whole exome, and custom panel sequencing of high-grade meningioma cohort 188
EGAD00001003218 This study contains 80 brain cancer cases (160 samples) belonging to the GBM-CN project. Illumina HiSeq 2000;ILLUMINA 80
EGAD00001003217 Targeted resequencing at high depth (21 genes, 9 chromosomal regions): at least 4 FFPE samples per case and matched germline DNA: * 100 cases with detailed outcome data, including 15 cases with tumour relapse (515 samples) * 40 cases with matched pre-chemotherapy biopsies (240 samples) * 50 nephrogenic rests matched to above cases (50 samples) We expect a proportion (possibly 10%) of cases to be mutationally silent on the above studies, and propose to subsequently carry out integrated whole-genome, methylome and transcriptome studies on matched frozen tissue from these cases Illumina HiSeq 2500;ILLUMINA 35
EGAD00001003216 Whole genome sequencing of tumour normal pairs of human undifferentiated sarcomas. HiSeq X Ten;ILLUMINA 98
EGAD00001003215 This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes. Illumina HiSeq 2000;ILLUMINA, Illumina HiSeq 2500;ILLUMINA 1,702
EGAD00001003213 The olfactory gene repertoire is largely species-specific, shaped by the nature and necessity of chemosensory information for survival in each species' niche. We are intrigued by this interspecific variation and started to investigate the olfactory transcriptome in primates for evidence of selection at the level of receptor gene choice. Having collected this data from two primates, we now wish to extend the analysis to humans. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ Illumina HiSeq 2500;ILLUMINA