R/create_hic_table.R
create.hic.table.Rd
Create hic.table object from a sparse upper triangular Hi-C matrix
create.hic.table(
  sparse.mat1,
  sparse.mat2,
  chr = NA,
  scale = TRUE,
  include.zeros = FALSE,
  subset.dist = NA,
  subset.index = NA,
  exclude.regions = NA,
  exclude.overlap = 0.2
)
sparse.mat1 | Required. The first of the two datasets you wish to jointly normalize, given as a sparse upper triangular Hi-C matrix, as the upper triangle of the matrix in 7-column BEDPE format, or as an InteractionSet object holding the genomic ranges of the interacting regions for the upper triangle of the Hi-C matrix and a single metadata column with the interaction frequency of each interacting pair. |
---|---|
sparse.mat2 | Required. The second of the two datasets you wish to jointly normalize, given as a sparse upper triangular Hi-C matrix, as the upper triangle of the matrix in 7-column BEDPE format, or as an InteractionSet object holding the genomic ranges of the interacting regions for the upper triangle of the Hi-C matrix and a single metadata column with the interaction frequency of each interacting pair. |
chr | The chromosome name for the matrices being entered, e.g. 'chr1' or 'chrX'. Only needed when using the sparse upper triangular matrix format. When using BEDPE format, leave set to NA. |
scale | Logical. Should scaling be applied to the matrices to adjust for total read counts? If TRUE, the IFs of the second sparse matrix are adjusted as follows: IF2_scaled = IF2 / (sum(IF2)/sum(IF1)). |
include.zeros | Logical. If TRUE, the function will include pairwise interactions where one of the interaction frequencies is 0. |
subset.dist | Should the matrix be subset to only include interactions up to a user-specified matrix unit distance? For example, to include only the cells of the matrix at a unit distance less than or equal to 100, set subset.dist = 100. |
subset.index | Should the matrix be subset by user-specified row and column indices? Input as a vector of 4 numbers (i.start, i.end, j.start, j.end). For example, to include only the part of the matrix with row numbers 20 <= i <= 40 and column numbers 30 <= j <= 50, set subset.index = c(20, 40, 30, 50). |
exclude.regions | A data.frame or genomic ranges object with the columns chr, start, end. Regions contained in the object will be removed from the hic.table object. This can be useful for excluding regions with a known CNV, blacklisted regions, or some other difference known a priori. |
exclude.overlap | The proportion of overlap required to exclude a region. Defaults to 0.2, meaning that an overlap of 20% or more is enough for exclusion. To exclude on any amount of overlap, set to 0. If set to 1, only a 100% overlap with an excluded region will result in exclusion. |
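
The scaling formula given for the scale argument can be illustrated with a minimal base R sketch. The interaction frequencies below are made-up values, and the variable names are chosen for illustration only; this is not the package's internal code.

```r
# Hypothetical interaction frequencies for the same five pairs in two datasets
IF1 <- c(5, 2, 297, 5, 92)
IF2 <- c(10, 4, 600, 12, 180)

# Scaling factor: ratio of the total counts of dataset 2 to dataset 1
scale_factor <- sum(IF2) / sum(IF1)

# IF2_scaled = IF2 / (sum(IF2) / sum(IF1))
IF2_scaled <- IF2 / scale_factor

# After scaling, both datasets have the same total interaction frequency
sum(IF1)
sum(IF2_scaled)
```

Dividing by the ratio of totals leaves the first dataset untouched and rescales only the second, so relative differences between pairs within each dataset are preserved.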
A hic.table object.
This function is used to transform two sparse upper triangular Hi-C matrices
into an object usable in the hic_loess function.
Sparse upper triangular Hi-C matrix format is typical of the Hi-C data available
from the Aiden Lab (http://www.aidenlab.org/). If you have a full
Hi-C contact matrix, first transform it to sparse upper triangular format using
the full2sparse function. Sparse matrices should have 3 columns
in the following order: Start location of region 1, Start location of region 2,
Interaction Frequency. Matrices in 7-column BEDPE format should
have 7 columns in the following order: Chromosome name of the first region,
Start location of the first region, End location of the first region,
Chromosome name of the second region, Start location of the second region,
End location of the second region, Interaction Frequency. Enter either
two sparse matrices, two matrices in 7-column BEDPE format, or two
InteractionSet objects; do not mix and match.
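
To make the two tabular layouts described above concrete, here is a minimal base R sketch that builds both from a small full contact matrix. The matrix values, the 500 kb bin size, the chromosome name, and all variable names are assumptions made for illustration. The package's full2sparse function is the recommended route for real data; this sketch only shows the shape of the result.

```r
# Hypothetical 3 x 3 symmetric contact matrix; bins start at these positions
starts <- c(16000000, 16500000, 17000000)
resolution <- 500000  # assumed bin size
full_mat <- matrix(c(5,   2,   5,
                     2, 297,  92,
                     5,  92, 150),
                   nrow = 3, byrow = TRUE)

# Select upper-triangle cells (diagonal included) with nonzero counts
keep <- upper.tri(full_mat, diag = TRUE) & full_mat != 0
idx <- which(keep, arr.ind = TRUE)

# 3-column sparse format: start of region 1, start of region 2, IF
sparse_mat <- data.frame(
  region1 = starts[idx[, "row"]],
  region2 = starts[idx[, "col"]],
  IF = full_mat[keep]
)

# 7-column BEDPE format for the same interactions
bedpe <- data.frame(
  chr1 = "chr22",
  start1 = sparse_mat$region1,
  end1 = sparse_mat$region1 + resolution,
  chr2 = "chr22",
  start2 = sparse_mat$region2,
  end2 = sparse_mat$region2 + resolution,
  IF = sparse_mat$IF
)
```

Both data frames describe the same six interactions. The BEDPE form carries the chromosome name and region ends explicitly, which is why the chr argument is only needed for the 3-column form.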
# Create hic.table object using included Hi-C data in sparse upper
# triangular matrix format
data('HMEC.chr22')
data('NHEK.chr22')
hic.table <- create.hic.table(HMEC.chr22, NHEK.chr22, chr = 'chr22')
# View result
hic.table
#>        chr1   start1     end1  chr2   start2     end2  IF1         IF2 D
#>    1: chr22 16000000 16500000 chr22 16000000 16500000    5    5.203323 0
#>    2: chr22 16000000 16500000 chr22 16500000 17000000    2    8.672206 1
#>    3: chr22 16500000 17000000 chr22 16500000 17000000  297  480.440193 0
#>    4: chr22 16000000 16500000 chr22 17000000 17500000    5   14.742750 2
#>    5: chr22 16500000 17000000 chr22 17000000 17500000   92  277.510581 1
#>   ---
#> 2479: chr22 49000000 49500000 chr22 51000000 51500000   21   54.634896 4
#> 2480: chr22 49500000 50000000 chr22 51000000 51500000   35   71.112086 3
#> 2481: chr22 50000000 50500000 chr22 51000000 51500000  394  339.083241 2
#> 2482: chr22 50500000 51000000 chr22 51000000 51500000 4066 2741.284207 1
#> 2483: chr22 51000000 51500000 chr22 51000000 51500000 9916 7132.021930 0
#>                 M
#>    1:  0.05750528
#>    2:  2.11639897
#>    3:  0.69389392
#>    4:  1.56000562
#>    5:  1.59283701
#>   ---
#> 2479:  1.37943338
#> 2480:  1.02273986
#> 2481: -0.21655615
#> 2482: -0.56875831
#> 2483: -0.47544713
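
The D and M columns in the printed output appear consistent with D being the distance between the two regions in bin units and M being log2(IF2/IF1). The base R sketch below checks this reading against the first five printed rows. The 500 kb resolution is inferred from end1 - start1 in the output, and the interpretation of D and M is an assumption drawn from the printed values rather than a statement of the package's internals.

```r
# First five rows copied from the printed output above
start1 <- c(16000000, 16000000, 16500000, 16000000, 16500000)
start2 <- c(16000000, 16500000, 16500000, 17000000, 17000000)
IF1 <- c(5, 2, 297, 5, 92)
IF2 <- c(5.203323, 8.672206, 480.440193, 14.742750, 277.510581)

# Bin size inferred from end1 - start1 in the output (assumed 500 kb)
resolution <- 500000

# Unit distance between the two interacting regions
D <- (start2 - start1) / resolution

# Log2 fold difference between the two datasets
M <- log2(IF2 / IF1)

data.frame(D = D, M = round(M, 6))
```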