Replication Data for: "ramr: an R package for detection of rare aberrantly methylated regions" (doi:10.18710/ED8HSD)

Document Description

Citation

Title:

Replication Data for: "ramr: an R package for detection of rare aberrantly methylated regions"

Identification Number:

doi:10.18710/ED8HSD

Distributor:

DataverseNO

Date of Distribution:

2020-11-30

Version:

2

Bibliographic Citation:

Nikolaienko, Oleksii, 2020, "Replication Data for: "ramr: an R package for detection of rare aberrantly methylated regions"", https://doi.org/10.18710/ED8HSD, DataverseNO, V2, UNF:6:mHk2VYyjhtEzz5muuhFrqw== [fileUNF]

Study Description

Citation

Title:

Replication Data for: "ramr: an R package for detection of rare aberrantly methylated regions"

Identification Number:

doi:10.18710/ED8HSD

Authoring Entity:

Nikolaienko, Oleksii (University of Bergen)

Producer:

University of Bergen

Software used in Production:

R

Software used in Production:

comb-p

Software used in Production:

ramr

Distributor:

DataverseNO

Distributor:

University of Bergen

Access Authority:

Nikolaienko, Oleksii

Depositor:

Nikolaienko, Oleksii

Date of Deposit:

2020-11-24

Holdings Information:

https://doi.org/10.18710/ED8HSD

Study Scope

Keywords:

Computer and Information Science, Medicine, Health and Life Sciences, ramr, DNA methylation, Computational Biology, Epigenomics

Abstract:

<p>This data set contains all data (biologically relevant simulated data sets and preprocessed public data sets) used to evaluate the performance of, and obtain results with, <i>ramr</i> (<a href="https://github.com/BBCG/ramr">https://github.com/BBCG/ramr</a>, <a href="http://www.bioconductor.org/packages/ramr/">http://www.bioconductor.org/packages/ramr/</a>), a new method for identification of aberrantly methylated regions (AMRs). All R scripts used for preparation, testing and analysis of the data sets are also provided. For additional information, please check the <i>ramr</i> package README.md file, vignettes or reference citation.</p> <p></p><p><b>Please use TREE VIEW to browse files efficiently</b></p>

<p></p><p><b>Abstract</b></p> <p>With recent advances in the field of epigenetics, the focus is widening from large and frequent disease- or phenotype-related methylation signatures to rare alterations transmitted mitotically or transgenerationally (constitutional epimutations). Emerging evidence indicates that such constitutional alterations, albeit occurring at a low mosaic level, may confer risk of disease later in life. Given their inherently low incidence rate and mosaic nature, there is a need for bioinformatic tools specifically designed to analyse such events.</p> <p>We have developed a method (<i>ramr</i>) to identify aberrantly methylated DNA regions (AMRs). <i>ramr</i> can be applied to methylation data obtained by array or next-generation sequencing techniques to discover AMRs associated with elevated risk of cancer as well as other diseases. We assessed accuracy and performance metrics of <i>ramr</i> and confirmed its applicability for analysis of large public data sets. Using <i>ramr</i>, we identified aberrantly methylated regions that are known or may potentially be associated with development of colorectal cancer and provided functional annotation of AMRs that arise at early developmental stages.</p>

Kind of Data:

program source code

Kind of Data:

machine-readable data

Kind of Data:

machine-readable text

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Materials

<a href="http://www.bioconductor.org/packages/devel/bioc/src/contrib/ramr_1.1.2.tar.gz"><i>ramr</i> package code, version 1.1.2</a>

Related Studies

<a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE105018">GSE105018</a>: Whole blood DNA methylation profiles in participants of the Environmental Risk (E-Risk) Longitudinal Twin Study at age 18.

<a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51032">GSE51032</a>: This Series contains data from 845 participants (188 men and 657 women) in the EPIC-Italy cohort that was produced at the Human Genetics Foundation (HuGeF) in Turin, Italy.

<a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98149">GSE98149</a>: Reprogramming of H3K9me3-dependent heterochromatin during mammalian early embryo development [ChIP-seq].

<a href="https://portal.gdc.cancer.gov/projects/TCGA-COAD">TCGA-COAD</a>: The Cancer Genome Atlas Colon Adenocarcinoma data.

Related Publications

Citation

Title:

<p>Oleksii Nikolaienko, Per Eystein Lønning, Stian Knappskog, <i>ramr</i>: an R/Bioconductor package for detection of rare aberrantly methylated regions, Bioinformatics, 2021, btab586</p>

Identification Number:

10.1093/bioinformatics/btab586

Bibliographic Citation:

<p>Oleksii Nikolaienko, Per Eystein Lønning, Stian Knappskog, <i>ramr</i>: an R/Bioconductor package for detection of rare aberrantly methylated regions, Bioinformatics, 2021, btab586</p>

Citation

Title:

<p><i>ramr</i>: an R package for detection of rare aberrantly methylated regions. Oleksii Nikolaienko, Per Eystein Lønning, Stian Knappskog. bioRxiv 2020.12.01.403501</p>

Identification Number:

10.1101/2020.12.01.403501

Bibliographic Citation:

<p><i>ramr</i>: an R package for detection of rare aberrantly methylated regions. Oleksii Nikolaienko, Per Eystein Lønning, Stian Knappskog. bioRxiv 2020.12.01.403501</p>

Other Reference Note(s)

https://github.com/BBCG/ramr

http://www.bioconductor.org/packages/ramr/

File Description--f86831

File: results-5-1000.results.tab

  • Number of cases: 280

  • No. of variables per record: 27

  • Type of File: text/tab-separated-values

Notes:

UNF:6:Jb8LIc18+DBR9OX2xWFcXQ==

File Description--f101253

File: REVISION-SIMULATED-results-5-1000.results.tab

  • Number of cases: 450

  • No. of variables per record: 34

  • Type of File: text/tab-separated-values

Notes:

UNF:6:v552E9gD4TbKA1rHJX8rRg==
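
The two tabular files described above are tab-separated text, so their stated dimensions can be checked after download. The sketch below is only an illustration and is not part of the deposited scripts: it assumes that the first row of each file holds the variable names, and the helper name `table_shape` is hypothetical.

```python
import csv
import io


def table_shape(handle):
    """Return (number of data rows, number of columns) of a tab-separated table.

    Assumes the first row is a header holding the variable names.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    n_rows = sum(1 for row in reader if row)  # blank lines are skipped
    return n_rows, len(header)


# Tiny in-memory stand-in for a results table (values are made up).
demo = io.StringIO("lib\tdelta\tcutoff\nramr\t0.025\t0.01\nramr\t0.05\t0.01\n")
print(table_shape(demo))
```

For a downloaded file, the same helper could be called on `open("results-5-1000.results.tab", newline="")`; if the header assumption holds, the result should match the 280 cases and 27 variables listed above.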

Variable Description

List of Variables:

Variables

lib

f86831 Location:

Variable Format: character

Notes: UNF:6:uiRFFYIIUdsrpY9Z/GRhDw==

delta

f86831 Location:

Summary Statistics: Min. 0.025; Max. 0.5; Mean 0.185; StDev 0.1760987015516356; Valid 280.0

Variable Format: numeric

Notes: UNF:6:jCsTJ1OreqxZ2Qybt2KA4g==

cutoff

f86831 Location:

Summary Statistics: Max. 0.01; StDev 0.0032765212830313346; Mean 0.0013888888749999996; Valid 280.0; Min. 1.0E-9

Variable Format: numeric

Notes: UNF:6:1ua5V14fH0GueMJmoUaIhQ==

user.self

f86831 Location:

Summary Statistics: Mean 2.9658321428571526; Max. 6.52399999999989; Min. 0.0529999999998836; StDev 2.574685161607194; Valid 280.0

Variable Format: numeric

Notes: UNF:6:ep0CzdoZNbud7y4SkYZg9Q==

sys.self

f86831 Location:

Summary Statistics: Mean 1.4170964285714374; StDev 0.8539971049039488; Valid 280.0; Min. 0.369000000000028; Max. 3.40300000000002;

Variable Format: numeric

Notes: UNF:6:cORP8De9TiRxeakO9FOv2g==

elapsed

f86831 Location:

Summary Statistics: StDev 1047.432738657964; Min. 11.1909999999916; Mean 530.1005142857147; Valid 280.0; Max. 3106.84100000001

Variable Format: numeric

Notes: UNF:6:RrHCvfFa5ULxyPF+2LrSQg==

user.child

f86831 Location:

Summary Statistics: Valid 280.0; Min. 17.2370000000228; StDev 6075.886427847371; Max. 18043.958; Mean 2984.9033607142856

Variable Format: numeric

Notes: UNF:6:rzzpo0sv564vtTnwcq6uZg==

sys.child

f86831 Location:

Summary Statistics: Mean 65.79111428571433; Valid 280.0; Max. 153.468000000001; StDev 57.696316697888335; Min. 5.42699999999968;

Variable Format: numeric

Notes: UNF:6:67DFlYOT03guezew4CTf0Q==

uGTP

f86831 Location:

Summary Statistics: Max. 1000.0; Min. 1000.0; Mean 1000.0; Valid 280.0; StDev 0.0

Variable Format: numeric

Notes: UNF:6:FNFTaAbN43a+2K/BnFPrkA==

uTP

f86831 Location:

Summary Statistics: StDev 430.14905963498774; Mean 480.88928571428585; Min. 0.0; Max. 1000.0; Valid 280.0

Variable Format: numeric

Notes: UNF:6:7iRuPjQzSr9v3Bfuw2nnRA==

nGTP

f86831 Location:

Summary Statistics: Mean 3000.0; StDev 0.0; Max. 3000.0; Min. 3000.0; Valid 280.0

Variable Format: numeric

Notes: UNF:6:VqFpuPR7lPeFEi7f5E4CsQ==

nTP

f86831 Location:

Summary Statistics: StDev 1283.7860811085825; Min. 0.0; Max. 3000.0; Mean 1089.3785714285725; Valid 280.0

Variable Format: numeric

Notes: UNF:6:fdTDlKjYx796M3yiuV1d0g==

tFP

f86831 Location:

Summary Statistics: Mean 5431.710714285702; Max. 278361.0; StDev 34430.581335007344; Valid 280.0; Min. 0.0;

Variable Format: numeric

Notes: UNF:6:78xyDi6zTd8uRjUMlt3ifw==

tGTN

f86831 Location:

Summary Statistics: Max. 2803900.0; Min. 2803900.0; Valid 280.0; StDev 0.0; Mean 2803900.0

Variable Format: numeric

Notes: UNF:6:KtZ5S20Tm/Xgy2PS7Xav/w==

uFN

f86831 Location:

Summary Statistics: StDev 430.14905963498774; Min. 0.0; Max. 1000.0; Valid 280.0; Mean 519.1107142857142;

Variable Format: numeric

Notes: UNF:6:5u2a7aAqSqm6ByE8nZvLqw==

nFN

f86831 Location:

Summary Statistics: Valid 280.0; Min. 0.0; StDev 1283.7860811085825; Mean 1910.6214285714275; Max. 3000.0

Variable Format: numeric

Notes: UNF:6:AkMM9EK+3FWcVkIgVEeTEA==

tTN

f86831 Location:

Summary Statistics: StDev 34430.581335007344; Min. 2525539.0; Mean 2798468.2892857143; Max. 2803900.0; Valid 280.0

Variable Format: numeric

Notes: UNF:6:7dsHgI7Ihd7h8iT2tjrUqQ==

uPrecision

f86831 Location:

Summary Statistics: StDev 0.29176290693033896; Mean 0.8762258065965057; Valid 206.0; Max. 1.0; Min. 0.00230462645696836;

Variable Format: numeric

Notes: UNF:6:4d343l2fbtqHjYYE+PUf7Q==

uRecall

f86831 Location:

Summary Statistics: Mean 0.48088928571428574; Min. 0.0; Valid 280.0; StDev 0.43014905963498784; Max. 1.0;

Variable Format: numeric

Notes: UNF:6:A0ipQD/XwoESLXNOtjeQYg==

uMCC

f86831 Location:

Summary Statistics: Min. 0.0316171446756817; StDev 0.3440529055838176; Max. 1.0; Mean 0.6675345244490059; Valid 206.0

Variable Format: numeric

Notes: UNF:6:sQXPtSTXQ2ui2bDvNBZh8g==

uF1

f86831 Location:

Summary Statistics: Min. 0.001998001998002; Valid 206.0; StDev 0.38176610531446126; Mean 0.6291200747058197; Max. 1.0

Variable Format: numeric

Notes: UNF:6:xZvQberW2BzuCR2Mu4nYcQ==

uAuPR

f86831 Location:

Summary Statistics: Min. 0.306852819440055; StDev 0.20698472463768908; Valid 216.0; Mean 0.858489891044851; Max. 1.0

Variable Format: numeric

Notes: UNF:6:7nxwuPhDw80Exua09VZfeA==

nPrecision

f86831 Location:

Summary Statistics: Valid 193.0; StDev 0.2677379902580225; Max. 1.0; Mean 0.8931108352547239; Min. 0.00680065794403194;

Variable Format: numeric

Notes: UNF:6:36jzLiS17SYr83r9338s7g==

nRecall

f86831 Location:

Summary Statistics: Min. 0.0; Mean 0.3631261904761904; Max. 1.0; StDev 0.4279286937028607; Valid 280.0;

Variable Format: numeric

Notes: UNF:6:6ct80NKBBFKQR3wH21XwEQ==

nMCC

f86831 Location:

Summary Statistics: Max. 0.999666254565577; StDev 0.3887618186733151; Min. 0.0182476625090128; Mean 0.5454749370737823; Valid 193.0

Variable Format: numeric

Notes: UNF:6:bWPHnk6GDc7PzD7rIEa1sw==

nF1

f86831 Location:

Summary Statistics: Max. 0.999666555518506; Mean 0.5037172322214419; Min. 6.66444518493835E-4; Valid 193.0; StDev 0.41780202819100243

Variable Format: numeric

Notes: UNF:6:/At7hbq/QTBfGod+prBm1g==

nAuPR

f86831 Location:

Summary Statistics: Mean 0.9080552621673479; StDev 0.16892816465037097; Min. 0.307513762754091; Valid 232.0; Max. 1.0

Variable Format: numeric

Notes: UNF:6:VqtyknMbaKxychU+nnvSbA==

lib

f101253 Location:

Variable Format: character

Notes: UNF:6:hjY5jxt1uv7sys6V0Q3TGA==

delta

f101253 Location:

Summary Statistics: Mean 0.18500000000000003; Min. 0.025; Max. 0.5; Valid 450.0; StDev 0.1759795999515545

Variable Format: numeric

Notes: UNF:6:TjfPMCD0J/2fvI1pqXuYTg==

cutoff

f101253 Location:

Summary Statistics: Valid 450.0; StDev 0.014941601166722848; Max. 0.05; Min. 1.0E-10; Mean 0.006111111109999996;

Variable Format: numeric

Notes: UNF:6:0b4oBfezYJIwQhbDmvWhHA==

user.self

f101253 Location:

Summary Statistics: Valid 450.0; Max. 2.02100000000002; Mean 0.3533800000000004; Min. 0.0; StDev 0.4742481237657386;

Variable Format: numeric

Notes: UNF:6:uAwbpTkoGXTEaOd0uJVoBQ==

sys.self

f101253 Location:

Summary Statistics: Max. 1.32400000000001; Valid 450.0; Mean 0.4825511111111114; Min. 0.077; StDev 0.45849287171936626

Variable Format: numeric

Notes: UNF:6:dc/LkaCX/w34gVvnD29tYg==

elapsed

f101253 Location:

Summary Statistics: Max. 4670.268; Mean 999.7191622222223; Valid 450.0; StDev 1555.294549330697; Min. 13.2889999999898

Variable Format: numeric

Notes: UNF:6:aHAmde5+VtvUhvKPXiWGjg==

user.child

f101253 Location:

Summary Statistics: Valid 450.0; Max. 15862.336; Mean 3622.7533133333322; Min. 40.1560000000172; StDev 5670.450848415524;

Variable Format: numeric

Notes: UNF:6:1JUQJtG65dT7N7LVt3hTkw==

sys.child

f101253 Location:

Summary Statistics: Min. 4.31700000000274; StDev 254.64214734337807; Valid 450.0; Max. 1004.346; Mean 146.92694000000023;

Variable Format: numeric

Notes: UNF:6:8N+EwDHnp5lXe8PLBbTcww==

uGTP

f101253 Location:

Summary Statistics: Mean 1000.0; Min. 1000.0; Valid 450.0; StDev 0.0; Max. 1000.0;

Variable Format: numeric

Notes: UNF:6:YcdsP1naXPjw5NXxnxZrGw==

uTP

f101253 Location:

Summary Statistics: Min. 0.0; StDev 423.20643390851114; Valid 450.0; Mean 520.1066666666668; Max. 1000.0

Variable Format: numeric

Notes: UNF:6:BJpDBl8WWM0lKsOgXKa39g==

nGTP

f101253 Location:

Summary Statistics: Min. 3000.0; Max. 3000.0; Valid 450.0; Mean 3000.0; StDev 0.0

Variable Format: numeric

Notes: UNF:6:EMHmILdv87Iwx/XDBPZDZA==

nTP

f101253 Location:

Summary Statistics: Valid 450.0; Max. 3000.0; StDev 1297.3795023738842; Min. 0.0; Mean 1270.962222222221;

Variable Format: numeric

Notes: UNF:6:oxMUa2wBbvGOdo28WQiKuQ==

tFP

f101253 Location:

Summary Statistics: Valid 450.0; Min. 0.0; Max. 1278490.0; StDev 131775.8901566385; Mean 20049.884444444448

Variable Format: numeric

Notes: UNF:6:QgAz2FzlCdlEOJ/thysNDQ==

utpCor

f101253 Location:

Summary Statistics: Valid 450.0; Mean 0.6304923738179067; Min. 0.0; Max. 0.942761762432433; StDev 0.4061453599480377

Variable Format: numeric

Notes: UNF:6:QZmfCOddEsq305JXy389ig==

ntpCor

f101253 Location:

Summary Statistics: Min. 0.0; Valid 450.0; Mean 0.5537685089531444; Max. 0.951390286187402; StDev 0.427732046412866

Variable Format: numeric

Notes: UNF:6:bIe+ut7rndNfZcBR2POxQw==

ufnCor

f101253 Location:

Summary Statistics: Valid 450.0; Mean 0.6147878885422986; Max. 0.915056039708211; StDev 0.3671852906366386; Min. 0.0;

Variable Format: numeric

Notes: UNF:6:P0EtDA0Lzu86SP+TQe+LDg==

nfnCor

f101253 Location:

Summary Statistics: StDev 0.32844821944307856; Min. 0.0; Max. 0.913474025766923; Valid 450.0; Mean 0.7177199472249159

Variable Format: numeric

Notes: UNF:6:YPxPCRzM6UFsIx+XMycVgQ==

tGTN

f101253 Location:

Summary Statistics: Min. 2803900.0; Valid 450.0; Mean 2803900.0; Max. 2803900.0; StDev 0.0

Variable Format: numeric

Notes: UNF:6:9aBDtg6u896vZML0rgOh2g==

uFN

f101253 Location:

Summary Statistics: Valid 450.0; Min. 0.0; StDev 423.20643390851114; Max. 1000.0; Mean 479.89333333333326

Variable Format: numeric

Notes: UNF:6:qwbRbxp6Wmk6/tO2FoB7mQ==

nFN

f101253 Location:

Summary Statistics: Min. 0.0; Valid 450.0; Max. 3000.0; Mean 1729.037777777779; StDev 1297.3795023738842

Variable Format: numeric

Notes: UNF:6:gfMjRuf0w30hkhryAN1qCQ==

tTN

f101253 Location:

Summary Statistics: Min. 1525410.0; Max. 2803900.0; StDev 131775.8901566385; Mean 2783850.1155555556; Valid 450.0

Variable Format: numeric

Notes: UNF:6:tbgYzEY5TJmbosE5cL25Bw==

tFPR

f101253 Location:

Summary Statistics: Min. 0.0; StDev 0.04699735730826295; Mean 0.0071507130940634284; Valid 450.0; Max. 0.455968472484753

Variable Format: numeric

Notes: UNF:6:Cy9BsaP3nUiPEsY37h4qtA==

uPrecision

f101253 Location:

Summary Statistics: Valid 362.0; Max. 1.0; Mean 0.6889471799370782; StDev 0.39216653115202366; Min. 0.0

Variable Format: numeric

Notes: UNF:6:7mUx1Uq/5FVzrCqOF1n2og==

uRecall

f101253 Location:

Summary Statistics: StDev 0.4232064339085109; Min. 0.0; Max. 1.0; Valid 450.0; Mean 0.5201066666666666

Variable Format: numeric

Notes: UNF:6:f/oEmaz7ONjMnRiW+3Bm0A==

uMCC

f101253 Location:

Summary Statistics: Min. -1.17728534002087E-4; StDev 0.3544690257948995; Valid 362.0; Mean 0.5759139060399365; Max. 0.999500196453802

Variable Format: numeric

Notes: UNF:6:UiXwDhCMQXKNQGtceTSfTw==

uF1

f101253 Location:

Summary Statistics: Mean 0.541073440772531; Valid 361.0; StDev 0.3779287554440327; Min. 0.00103452735031682; Max. 0.999500249875063

Variable Format: numeric

Notes: UNF:6:DsOZTKDoF9QuwMQRY8/Sqw==

uAuROC

f101253 Location:

Summary Statistics: Max. 0.999999999821677; StDev 0.3720074487923377; Valid 450.0; Min. 0.0; Mean 0.6890130296372909;

Variable Format: numeric

Notes: UNF:6:cK5tkjJUhlSyV6h4lNgtkA==

uAuPR

f101253 Location:

Summary Statistics: Valid 420.0; Min. 2.37988993009073E-6; Mean 0.5956017539091044; Max. 0.9999995004995; StDev 0.3719997502870138;

Variable Format: numeric

Notes: UNF:6:n2Q3rgJvLwzeit56YnDKZw==

nPrecision

f101253 Location:

Summary Statistics: Min. 0.00178015131286159; StDev 0.3693234078379187; Max. 1.0; Mean 0.7258924894350665; Valid 332.0;

Variable Format: numeric

Notes: UNF:6:trhX2yWDGjN4ih7xtp2djA==

nRecall

f101253 Location:

Summary Statistics: Min. 0.0; Max. 1.0; StDev 0.4324598341246284; Mean 0.4236540740740738; Valid 450.0;

Variable Format: numeric

Notes: UNF:6:/98z1CQoV+XQoCftrmWvDw==

nMCC

f101253 Location:

Summary Statistics: Mean 0.5350309033528038; Min. 0.00106817696753959; StDev 0.37234719708420705; Max. 0.999833196695076; Valid 332.0

Variable Format: numeric

Notes: UNF:6:sMFgzEiYfxg7DQC+xkuiuw==

nF1

f101253 Location:

Summary Statistics: Mean 0.5008784837923426; Max. 0.999833361106482; StDev 0.3943974095935564; Valid 332.0; Min. 6.66444518493835E-4

Variable Format: numeric

Notes: UNF:6:85iaXJq8352jEbStEeEKAw==

nAuROC

f101253 Location:

Summary Statistics: Valid 450.0; Mean 0.6650502678200866; StDev 0.38509225645045886; Min. 0.0; Max. 0.999999999881118;

Variable Format: numeric

Notes: UNF:6:utfNdZ5dI8qRSnd0WCgIsg==

nAuPR

f101253 Location:

Summary Statistics: Max. 0.999999888925914; Min. 7.13118407368891E-6; Valid 390.0; Mean 0.6562494350127613; StDev 0.3498694711260333;

Variable Format: numeric

Notes: UNF:6:Kti2J4bPCHv5kEDe+yXR5w==
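
The count and metric variables above appear to follow the usual confusion-matrix naming (TP, FP, FN, TN, with GTP and GTN as ground-truth totals). The summary statistics are consistent with this reading: for example, the means of uTP and uFN sum to uGTP (1000), and the means of tFP and tTN sum to tGTN (2803900). The sketch below uses textbook definitions only; the deposited R scripts remain the authority on how the values were actually computed. Returning `None` for undefined ratios is an assumption that would explain why the precision, F1 and MCC variables have fewer valid values than the recall variables.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Textbook precision, recall, F1 and MCC from confusion-matrix counts.

    Ratios with a zero denominator are returned as None (treated as missing).
    """
    def ratio(num, den):
        return num / den if den else None

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)

    # F1 is the harmonic mean of precision and recall; missing if either
    # input is missing or if both are zero.
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Made-up counts, chosen only to exercise the function.
print(classification_metrics(tp=900, fp=100, fn=100, tn=2803800))
```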

Other Study-Related Materials

Label:

00-README.txt

Text:

The README file

Notes:

text/plain

Other Study-Related Materials

Label:

GSE105018.data.Rdata

Text:

An R data file containing the following objects: 1. geo.ranges: a GRanges object containing beta values for 430802 CpGs across 1658 samples from GSE105018 Illumina HumanMethylation 450 data set (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE105018). 2. sample.info: a list containing sample metadata for all samples in the GSE105018 data set. 3. sample.ids: a character vector of sample IDs.

Notes:

application/gzip

Other Study-Related Materials

Label:

GSE51032.data.Rdata

Text:

An R data file containing the following objects: 1. geo.ranges: a GRanges object containing beta values for 485512 CpGs across 845 samples from GSE51032 Illumina HumanMethylation 450 data set (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51032). This data set was used as a template to create simulated test data sets. 2. sample.info: a list containing sample metadata for all samples in the GSE51032 data set. 3. sample.ids: a character vector of sample IDs.

Notes:

application/gzip

Other Study-Related Materials

Label:

GRCh37.p13.2019.12.12.Rdata

Text:

An R data file containing a GRanges object with all the gene coordinates from the GRCh37 annotation.

Notes:

application/gzip

Other Study-Related Materials

Label:

GRCh38.p13.2020.08.10.Rdata

Text:

An R data file containing a GRanges object with all the gene coordinates from the GRCh38 annotation.

Notes:

application/gzip

Other Study-Related Materials

Label:

to.remove.RData

Text:

An R data file containing a character vector with all polymorphic and non-specific Illumina HumanMethylation 450 probe IDs.

Notes:

application/gzip

Other Study-Related Materials

Label:

hg19ToMm9.over.chain

Text:

human hg19 to mouse mm9 liftover chain. Required for R/PoC.results.GSE105018.R.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

mouse.embryonic.marks.tar.gz

Text:

LOLA database containing information on mouse embryonic chromatin marks. Required for R/PoC.results.GSE105018.R. Extract the archive before use.

Notes:

application/gzip

Other Study-Related Materials

Label:

results-5-1000.Rdata

Text:

An R data file with all the performance metrics obtained by R/PoC.results.SIMULATED.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

results-plots-5-1000.pdf

Text:

plots for performance metrics obtained by R/PoC.results.SIMULATED.R.

Notes:

application/pdf

Other Study-Related Materials

Label:

SIMULATED-5-1000-0.025.data.Rdata

Text:

An R data file containing the following objects: 1. simulated.ranges: a GRanges object containing simulated beta values for 485512 CpGs across 100 samples with aberrant methylation events introduced in some of the regions. 2. modified.regions.unique: a GRanges object containing genomic ranges and sample ID for 1000 true positive unique aberrantly methylated regions. 3. modified.regions.nonunique: a GRanges object containing genomic ranges and sample ID for 3000 true positive non-unique aberrantly methylated regions. 4. delta: delta value by which CpG beta values corresponding to particular region/sample pair in "simulated.ranges" were increased or decreased. 5. merge.window: "merge.window" parameter value to use during testing. 6. min.cpgs: "min.cpgs" parameter value to use during testing. 7. sample.ids: a character vector of sample IDs.

Notes:

application/gzip
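
Item 4 of the description above defines delta as the amount by which the beta values of a region/sample pair were increased or decreased. The sketch below only illustrates that idea and is not the code from PoC.writer.SIMULATED.R: the function name, the `direction` argument and the clipping of results to the [0, 1] range are all assumptions.

```python
def shift_betas(betas, delta, direction=1):
    """Shift methylation beta values by delta, keeping them within [0, 1].

    direction: +1 to increase, -1 to decrease (hypothetical parameter).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return [min(1.0, max(0.0, b + direction * delta)) for b in betas]


region = [0.10, 0.12, 0.95]           # made-up beta values for one sample
print(shift_betas(region, 0.25))      # shift upwards, capped at 1
print(shift_betas(region, 0.25, -1))  # shift downwards, floored at 0
```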

Other Study-Related Materials

Label:

SIMULATED-5-1000-0.050.data.Rdata

Text:

An R data file which is similar to "data/SIMULATED/SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.050"

Notes:

application/gzip

Other Study-Related Materials

Label:

SIMULATED-5-1000-0.100.data.Rdata

Text:

An R data file which is similar to "data/SIMULATED/SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.100"

Notes:

application/gzip

Other Study-Related Materials

Label:

SIMULATED-5-1000-0.250.data.Rdata

Text:

An R data file which is similar to "data/SIMULATED/SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.250"

Notes:

application/gzip

Other Study-Related Materials

Label:

SIMULATED-5-1000-0.500.data.Rdata

Text:

An R data file which is similar to "data/SIMULATED/SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.500"

Notes:

application/gzip

Other Study-Related Materials

Label:

writer-plots-5-1000.pdf

Text:

Methylation profiles of eight random genomic regions from the GSE51032 template data set (“GSE51032”) together with the methylation profiles of the same regions in each of five simulated test data sets obtained by changing beta values within those regions for a subset of samples (highlighted) by a particular delta (“delta=0.025”, “0.050”, “0.100”, “0.250” or “0.500”).

Notes:

application/pdf

Other Study-Related Materials

Label:

files.2019-10-10.json

Text:

JSON file containing information on all Illumina Human Methylation 450 assay results in TCGA.

Notes:

application/json

Other Study-Related Materials

Label:

TCGA-COAD.tcga.data.Rdata

Text:

An R data file containing the following objects: 1. tcga.data: an object containing beta values for 487192 CpGs across 192 samples from TCGA-COAD Illumina HumanMethylation 450 data set (https://portal.gdc.cancer.gov/). This data set was used to find epimutations undergoing positive selection during carcinogenesis.

Notes:

application/gzip

Other Study-Related Materials

Label:

tcga.coad.AMRs.Rdata

Text:

An R data file containing the following objects: 1. tcga.coad.ranges: GRanges object with TCGA-COAD subset of methylation beta values for adjacent and tumour tissue pairs. 2. tcga.coad.ramr.hg19: GRanges object with aberrantly methylated regions found in adjacent TCGA-COAD tissue samples.

Notes:

application/gzip

Other Study-Related Materials

Label:

PoC.results.GSE105018.R

Text:

an R script to reproduce all findings from GSE105018 data set.

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

PoC.results.GSE51032.R

Text:

an R script to reproduce all findings from GSE51032 data set.

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

PoC.results.SIMULATED.R

Text:

an R script that evaluates method performance for ramr and other methods across all test scenarios. Note that some methods (e.g. comb-p) are slow, so modern hardware and parallel execution are advised. Because comb-p uses at least four threads by default, it was forced to run in single-threaded mode by limiting the "processes" parameter of the multiprocessing.pool.Pool call to "1" in the comb-p source code (cpv/_common.py).

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

PoC.results.TCGA-COAD.R

Text:

an R script that finds all aberrantly methylated regions in a subset of adjacent tissue samples from TCGA-COAD.

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

PoC.writer.SIMULATED.R

Text:

an R script that generates simulated test data sets using GSE51032 as a template.

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

REVISION-SIMULATED-5-1000-0.025.data.Rdata

Text:

An R data file generated by revision/R/REVISION-performance-metrics.R and containing the following objects: 1. simulated.ranges: a GRanges object containing simulated beta values for 485512 CpGs across 100 samples with aberrant methylation events introduced in some of the regions. 2. modified.regions.unique: a GRanges object containing genomic ranges and sample ID for 1000 true positive unique aberrantly methylated regions. 3. modified.regions.nonunique: a GRanges object containing genomic ranges and sample ID for 3000 true positive non-unique aberrantly methylated regions. 4. modified.regions.noise: a GRanges object containing genomic ranges and sample ID for 1000 single-base aberrations. 5. delta: delta value by which CpG beta values corresponding to particular region/sample pair in "simulated.ranges" were increased or decreased. 6. merge.window: "merge.window" parameter value to use during testing. 7. min.cpgs: "min.cpgs" parameter value to use during testing. 8. sample.ids: a character vector of sample IDs.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-5-1000-0.050.data.Rdata

Text:

An R data file which is similar to "revision/data/REVISION-SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.050". File was generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-5-1000-0.100.data.Rdata

Text:

An R data file which is similar to "revision/data/REVISION-SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.100". File was generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-5-1000-0.250.data.Rdata

Text:

An R data file which is similar to "revision/data/REVISION-SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.250". File was generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-5-1000-0.500.data.Rdata

Text:

An R data file which is similar to "revision/data/REVISION-SIMULATED-5-1000-0.025.data.Rdata", but was generated using "delta=0.500". File was generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-plots-5-1000.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

REVISION-SIMULATED-results-5-1000.Rdata

Text:

An R data file with all the performance metrics obtained by revision/R/REVISION-performance-metrics.R.

Notes:

application/gzip

Other Study-Related Materials

Label:

REVISION-SIMULATED-suppl-plots-5-1000.pdf

Text:

plots for performance metrics generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/pdf

Other Study-Related Materials

Label:

REVISION-SIMULATED-suppl-table-5-1000.pdf

Text:

top results generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/pdf

Other Study-Related Materials

Label:

REVISION-SIMULATED-suppl-time-5-1000.pdf

Text:

time plot generated by revision/R/REVISION-performance-metrics.R.

Notes:

application/pdf

Other Study-Related Materials

Label:

REVISION-algorithm-diagram.R

Text:

an R script that builds the ramr flowchart

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

REVISION-dataset-similarity.R

Text:

an R script that compares template and simulated data sets

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

REVISION-LOLA-bootstrap.R

Text:

an R script that checks LOLA analysis specificity

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

REVISION-performance-metrics.R

Text:

an R script that generates simulated test data sets using GSE51032 as a template and evaluates method performance for ramr and other methods across all test scenarios. Note that some methods (e.g. comb-p) are slow, so modern hardware and parallel execution are advised. Because comb-p uses at least four threads by default, it was forced to run in single-threaded mode by limiting the "processes" parameter of the multiprocessing.pool.Pool call to "1" in the comb-p source code (cpv/_common.py).

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

REVISION-ramr-dmrcate.R

Text:

an R script that compares ramr and DMRcate across various random seeds

Notes:

type/x-r-syntax