Reads HOMER heatmap output for regions surrounding peaks or TSSs and reformats it for further analysis and visualization. The number of bins and samples is deduced automatically from the column names that HOMER produces.

ReadHeatmaps(heatmaps.raw, sample.names = NULL, select.cols = NA,
  test = FALSE, raw = FALSE)

Arguments

heatmaps.raw

output from HOMER of heatmap analysis over tag directories

sample.names

names of the samples analyzed (tag directories); these must be in the same order as the samples appear in the data file!

select.cols

the colClasses argument passed to read.table; use this to read only specific columns (samples) from the data, since loading a large file in full can crash R. Provide a vector of classes, with "NULL" (in quotes) for every column that you want to exclude. This requires knowing the column composition of the data; see the parameter "test".

test

if TRUE, reads in only the first 5 rows so that the size and column composition of the data can be inspected

raw

if TRUE, returns the raw, unprocessed form of the data (essentially the read.table output)
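
The select.cols argument follows the colClasses convention of read.table. The sketch below shows that convention using base R only, on a small made-up file: the column names and layout are assumptions for illustration and are not HOMER's actual output format. It peeks at the first rows (similar in spirit to test = TRUE), then builds a class vector that skips one sample.

```r
# Toy tab-delimited file standing in for real data.
# The layout and column names are invented for this sketch.
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(Gene   = c("g1", "g2", "g3"),
                  A_bin1 = c(0.1, 0.2, 0.3),
                  B_bin1 = c(1.1, 1.2, 1.3),
                  C_bin1 = c(2.1, 2.2, 2.3))
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Read at most 5 rows to learn which columns the file contains.
peek <- read.table(tmp, header = TRUE, sep = "\t", nrows = 5)

# NA lets read.table guess the class; "NULL" (a string) skips the column.
cols <- rep(NA_character_, ncol(peek))
cols[grepl("^B_", names(peek))] <- "NULL"

# Re-read the file, dropping every column of sample B.
subset_dat <- read.table(tmp, header = TRUE, sep = "\t", colClasses = cols)
names(subset_dat)
```

A vector built this way is the kind of value that select.cols expects, assuming it is handed to read.table unchanged.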

Value

List of data.frames comprising positions (rows) vs. genes/regions

Details

It is highly recommended to register a parallel backend with doParallel so that the computations can run in parallel.
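
A minimal sketch of registering a backend before calling the function, assuming the doParallel package is installed. The worker count of 2 is an arbitrary choice for illustration; pick a value that suits your machine.

```r
library(doParallel)

# Register a parallel backend with 2 workers (arbitrary number for this sketch).
registerDoParallel(cores = 2)

# Confirm how many workers foreach will use.
n_workers <- foreach::getDoParWorkers()
n_workers
```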

Warning: heatmap files take up a lot of disk space, and reading them uses about 8x the file size in RAM, which makes very large analyses infeasible. The data is produced by the following HOMER command:

annotatePeaks.pl tss mm10 -ann $GTF -size 5000 -hist 25 -ghist -d $TAGDIRS
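
The 8x rule of thumb from the warning above can be turned into a quick check before loading a file. The helper below is hypothetical (it is not part of this package) and only multiplies the size on disk by that factor; treat the result as a rough estimate.

```r
# Hypothetical helper: rough RAM estimate in GiB from the file size on disk,
# using the approximate 8x factor mentioned in the warning.
estimate_heatmap_ram_gb <- function(path, factor = 8) {
  file.size(path) * factor / 1024^3
}

# Demonstration on a 1 KiB dummy file (stands in for a real heatmap file).
dummy <- tempfile()
writeBin(raw(1024), dummy)
est <- estimate_heatmap_ram_gb(dummy)
est
```

If the estimate approaches the available memory, consider reading a subset of samples with select.cols.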

Examples

ReadHeatmaps("hm.dat.txt", sample.names = c("A", "B", "C"))