Time-varying TAD boundary analysis

TimeCompare(
  cont_mats,
  resolution,
  z_thresh = 2,
  window_size = 15,
  gap_thresh = 0.2,
  groupings = NULL
)

Arguments

cont_mats

List of contact matrices in either sparse 3 column, n x n or n x (n+3) form where the first three columns are coordinates in BED format. See "Input_Data" vignette for more information. If an n x n matrix is used, the column names must correspond to the start point of the corresponding bin. Required.

resolution

Resolution of the data. Used to assign TAD boundaries to genomic regions. If not provided, resolution will be estimated from column names of the first matrix. Default is "auto".
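As a rough illustration of how a resolution could be read off the column names of an n x n matrix, the sketch below builds a toy matrix whose column names are bin start coordinates and takes the typical spacing between them. The matrix values and the median-spacing rule are assumptions made for illustration; the estimate TimeCompare computes internally may differ.

```r
# Toy 4 x 4 contact matrix; the counts are made up for illustration.
mat <- matrix(c(10,  4, 1,  0,
                 4, 12, 5,  2,
                 1,  5, 9,  3,
                 0,  2, 3, 11), nrow = 4, byrow = TRUE)

# Column names hold the start coordinate of each bin (50 kb bins here).
colnames(mat) <- c("0", "50000", "100000", "150000")

# Assumed estimate: the median gap between consecutive bin starts.
starts <- as.numeric(colnames(mat))
est_resolution <- median(diff(starts))
print(est_resolution)
```

Passing resolution explicitly, as in the Examples section, avoids relying on any estimate.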

z_thresh

Threshold for boundary score. Higher values result in more stringent detection of differential TADs. Default is 2.

window_size

Size of the sliding window used for TAD detection, measured in bins. Results should be consistent across window sizes. Default is 15.

gap_thresh

Minimum proportion of non-zero entries a region must have; regions below this threshold are considered non-informative and excluded. Default is 0.2.

groupings

Variable for defining groups of replicates at a given time point. Each group will be combined using consensus boundary scores. It should be a vector of equal length to cont_mats where each entry is a label corresponding to the group membership of the corresponding matrix. Default is NULL, implying one matrix per time point.
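A minimal sketch of a groupings vector, assuming a hypothetical design of four matrices with two replicates at each of two time points. The labels 1 and 2 are arbitrary placeholders. The call to TimeCompare is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical design: matrices 1-2 are replicates of the first time point,
# matrices 3-4 are replicates of the second.
n_mats <- 4
groupings <- c(1, 1, 2, 2)

# The vector must have one label per contact matrix.
stopifnot(length(groupings) == n_mats)

if (requireNamespace("TADCompare", quietly = TRUE)) {
  library(TADCompare)
  data("time_mats")  # example data from the Examples section (four matrices)
  grouped_list <- TimeCompare(time_mats, resolution = 50000,
                              groupings = groupings)
}
```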

Value

A list containing consensus TAD boundaries and overall scores:

  • TAD_Bounds - Data frame containing all regions with a TAD boundary at one or more time points. Coordinate corresponds to the genomic region, sample columns correspond to individual boundary scores for each sample, Consensus_Score is the consensus boundary score across all samples, and Category is the differential boundary type.

  • All_Bounds - Data frame containing consensus scores for all regions.

  • Count_Plot - Plot containing the prevalence of each boundary type.
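The sketch below shows one way to summarise the Category column of TAD_Bounds with base R. It uses a made-up stand-in data frame so that it runs without the package; the coordinates, scores, and category labels are illustrative assumptions and may not match the exact strings TimeCompare returns.

```r
# Made-up stand-in for the TAD_Bounds data frame; values are illustrative only.
tad_bounds <- data.frame(
  Coordinate      = c(16900000, 17350000, 18750000, 19800000),
  Consensus_Score = c(3.1, 2.4, 4.0, 2.2),
  Category        = c("Common", "Dynamic", "Common", "Late Appearing")
)

# Count boundaries per category.
category_counts <- table(tad_bounds$Category)
print(category_counts)

# Keep only the boundaries of one category.
common_bounds <- tad_bounds[tad_bounds$Category == "Common", ]
```

With a real result, the same steps would apply to the TAD_Bounds element of the returned list, for example diff_list$TAD_Bounds.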

Details

Given a list of sparse 3 column, n x n, or n x (n+3) contact matrices representing different time points, TimeCompare identifies all TAD boundaries. Each TAD boundary is classified into one of six categories (Common, Dynamic, Early Appearing, Late Appearing, Early Disappearing and Late Disappearing), based on how it changes over time.

Examples

# Read in data
data("time_mats")
# Find time varying TAD boundaries
diff_list <- TimeCompare(time_mats, resolution = 50000)
#> Converting to n x n matrix
#> Matrix dimensions: 704x704
#> Matrix dimensions: 704x704
#> Matrix dimensions: 704x704
#> Matrix dimensions: 704x704