Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
The benchmark function calculates the area under the precision-recall curve (AUPRC) for a bigwig file of predictions compared against a gold standard, also in bigwig format. The user must provide the predictions in bigwig format and specify the resolution of the evaluation (e.g., 200 bp).
maxatac benchmark --prediction GM12878_CTCF_chr1.bw --gold_standard GM12878_CTCF_ENCODE_IDR.bw --chromosomes chr1 --bin_size 200
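To make the metric concrete, here is a minimal pure-Python sketch of how an AUPRC could be computed from per-bin scores and binary gold standard labels. It uses the step-wise (average precision) summation, which is one common convention and may differ from what maxATAC computes internally. The function name `auprc` and the toy inputs are assumptions made for illustration.

```python
def auprc(scores, labels):
    """Area under the precision-recall curve via step-wise summation.

    scores: per-bin prediction scores (higher = more likely bound)
    labels: per-bin gold standard values (1 = TFBS, 0 = no TFBS)
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("gold standard contains no positive bins")

    # Rank bins from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    area = 0.0
    true_pos = 0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Treat all bins that share a score as a single threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        precision = true_pos / j          # j bins have been called positive
        recall = true_pos / total_pos
        area += precision * (recall - prev_recall)
        prev_recall = recall
        i = j
    return area


# Toy example (hypothetical values): four bins, two of them bound.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
example_auprc = auprc(example_scores, example_labels)
```

In this toy case the bound bins are ranked first and third, so the score falls below the value of 1.0 that a perfect ranking would give.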
The input bigwig file of transcription factor binding predictions. This file can also be any bigwig signal track that you want to compare against a gold standard.
The input gold standard bigwig file. This file needs to be a binary signal track with a value of 1 at positions with a TFBS (e.g., from ChIP-seq) and 0 at positions with no TFBS.
The output filename prefix to use. Default:
The chromosomes to benchmark the predictions for. Default: chr1, the held-out test chromosome.
The size of the bin to use for aggregating the single base-pair predictions. Default: 200, the size used by the ENCODE-DREAM in vivo TFBS Prediction Challenge.
The method to use for aggregating the single base-pair predictions into larger bins. Options include
max score found in the window.
See the pyBigWig documentation for more details.
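The bin-level aggregation described above can be sketched in plain Python as follows. The sketch assumes the per-base scores are already available as a list, whereas maxATAC reads bigwig files through pyBigWig. The helper name `aggregate_bins` is hypothetical, and `mean` is included only as a second example statistic.

```python
def aggregate_bins(values, bin_size=200, method="max"):
    """Collapse per-base scores into fixed-width bins.

    values:   per-base prediction scores for one chromosome
    bin_size: number of base pairs per bin
    method:   "max" or "mean" (other statistics are omitted in this sketch)
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    reducers = {
        "max": max,
        "mean": lambda chunk: sum(chunk) / len(chunk),
    }
    if method not in reducers:
        raise ValueError(f"unsupported method: {method}")
    reduce_chunk = reducers[method]

    binned = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]  # the last bin may be shorter
        binned.append(reduce_chunk(chunk))
    return binned


# Toy example with a bin size of 4 instead of 200 (values are made up).
per_base = [0.1, 0.7, 0.2, 0.0, 0.5, 0.5, 0.9, 0.1]
max_bins = aggregate_bins(per_base, bin_size=4, method="max")
mean_bins = aggregate_bins(per_base, bin_size=4, method="mean")
```

Taking the maximum keeps a bin's score high if any single base inside it is predicted as bound, while the mean dilutes narrow peaks.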
This flag sets the precision of the prediction signal track. Provide an integer giving the number of decimal places to keep when rounding. Currently, the predictions go from 0 - .0000000001. Default: 9, the limit of precision from TensorFlow.
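As a small illustration of what a rounding precision does to a score track, the snippet below rounds a few made-up scores to a given number of decimal places with Python's built-in `round`. The helper name `round_scores` and the values are assumptions for illustration, not the maxATAC implementation.

```python
def round_scores(scores, precision=9):
    """Round each score to `precision` decimal places."""
    return [round(score, precision) for score in scores]


# Hypothetical raw scores, including one smaller than 1e-9.
raw = [0.123456789123, 0.5, 0.00000000049]
rounded_9 = round_scores(raw, precision=9)
rounded_2 = round_scores(raw, precision=2)
```

With a lower precision, more bins end up sharing the same score, which produces more ties when the bins are ranked.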
The output directory to write the results to. Default:
The path to the blacklist bigwig signal track of regions that should be excluded. Default: hg38_maxatac_blacklist.bed, which contains regions that are specific to ATAC-seq.
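One way to picture blacklist exclusion is to mark every bin that overlaps a blacklisted interval and drop those bins before scoring. The sketch below does this in plain Python. The helper names, the half-open interval convention, and the toy coordinates are assumptions for illustration, and maxATAC may handle the blacklist differently.

```python
def blacklisted_bins(regions, chrom_length, bin_size=200):
    """Return the set of bin indices that overlap any blacklist region.

    regions: list of (start, end) half-open intervals on one chromosome
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    masked = set()
    for start, end in regions:
        if end <= start:
            continue  # skip empty or malformed intervals
        first = max(start // bin_size, 0)
        last = min((end - 1) // bin_size, n_bins - 1)
        masked.update(range(first, last + 1))
    return masked


def drop_blacklisted(values, masked):
    """Keep only the values whose bin index is not masked."""
    return [v for i, v in enumerate(values) if i not in masked]


# Toy example: a 1,000 bp chromosome with one blacklisted interval.
masked = blacklisted_bins([(150, 410)], chrom_length=1000, bin_size=200)
kept = drop_blacklisted([0.1, 0.2, 0.3, 0.4, 0.5], masked)
```

The same mask would need to be applied to both the prediction bins and the gold standard bins so that the two stay aligned.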
This argument is used to set the logging level. Currently, the only working logging level is