Perform filtering on a Hi-C experiment
hic_filter(hicexp, zero.p = 0.8, A.min = 5, remove.regions = hg19_cyto)
Argument | Description |
---|---|
hicexp | A hicexp object. |
zero.p | The maximum proportion of zeros allowed in a row. If the proportion of zeros in a row is greater than zero.p the row will be filtered out, i.e. zero.p = 1 means nothing is filtered based on zeros and zero.p = 0 will filter rows that have any zeros. |
A.min | The minimum average expression value (row mean) for an interaction pair. If the interaction pair has an average expression value less than A.min the row will be filtered out. |
remove.regions | A GenomicRanges object indicating specific regions to be filtered out. By default this is hg19_cyto, the hg19 centromeric, gvar, and stalk regions. The package also includes hg38_cyto. If your data is not hg19 you will need to substitute the appropriate object. To choose not to filter any regions set remove.regions = NULL. |
Returns a filtered hicexp object.
This function filters out interactions that have low average interaction frequencies (IFs) or a large proportion of zero IF values. Such interactions are rarely informative and are commonly false positives during difference detection, so it is better to remove them from the dataset. Filtering also reduces the run time of multiHiCcompare. It can be performed before or after normalization, but the largest speed gain comes from filtering before normalization. If you already performed filtering when making your hicexp object, do not run this function again.
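
The two numeric criteria can be illustrated on a toy matrix in base R. This is only a sketch of the rule as described in the arguments above, not the package's actual implementation. The matrix `IF_mat` and the helper names `zero_prop`, `row_mean`, and `keep` are hypothetical, and the boundary handling (a row is kept when its zero proportion equals zero.p or its mean equals A.min) is an assumption inferred from the argument descriptions.

```r
# Toy IF matrix: each row is an interaction pair, each column a sample.
# (Hypothetical data for illustration only.)
IF_mat <- matrix(
  c(10, 12,  9, 11, 10,  8,   # no zeros, mean 10        -> kept
     0,  0,  0,  0,  0, 60,   # 5/6 zeros, mean 10       -> removed by zero.p
     1,  2,  1,  3,  2,  1,   # no zeros, mean about 1.67 -> removed by A.min
     0,  0,  0, 20, 25, 30),  # 3/6 zeros, mean 12.5     -> kept
  nrow = 4, byrow = TRUE
)

zero.p <- 0.8  # maximum tolerated proportion of zeros per row
A.min  <- 5    # minimum row mean

# Proportion of zero entries and mean IF for each row.
zero_prop <- rowMeans(IF_mat == 0)
row_mean  <- rowMeans(IF_mat)

# Assumed rule: drop rows with too many zeros or too low an average.
keep <- zero_prop <= zero.p & row_mean >= A.min
filtered <- IF_mat[keep, , drop = FALSE]

print(keep)
print(filtered)
```

On a real hicexp object the equivalent step is a single call following the usage shown at the top, for example `hic_filter(hicexp, zero.p = 0.8, A.min = 5, remove.regions = hg38_cyto)` for hg38 data.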