Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks



The benchmark function calculates the area under the precision-recall curve (AUPRC) for a bigwig file of predictions compared against a gold standard, also in bigwig format. The user must provide the predictions in bigwig format and specify the resolution of the evaluation (e.g., 200 bp).


maxatac benchmark --prediction <prediction.bw> --gold_standard <gold_standard.bw> --chromosomes chr1 --bin_size 200
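To make the metric concrete, here is a minimal sketch of how an AUPRC can be computed from binned scores and binary labels. It assumes the step-wise (average precision) summation and plain Python lists; the function name `auprc` is hypothetical, and maxATAC's own implementation may differ (for example, by calling a library routine).

```python
def auprc(scores, labels):
    """Area under the precision-recall curve (step-wise sum).

    scores: predicted signal per bin; labels: 1 for TFBS bins, 0 otherwise.
    This is an illustrative sketch, not maxATAC's implementation.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("gold standard contains no positive bins")

    # Rank bins from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    tp = fp = 0
    area = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Bins that share a score cross the threshold together.
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        # Each threshold adds a rectangle of width (recall gain) and height precision.
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    print(auprc([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0]))
```

A perfect ranking, where every TFBS bin scores above every non-TFBS bin, gives an area of 1.0; false positives ranked above true positives pull the value down.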

Required Arguments


The input bigwig file of transcription factor binding predictions. This file can also be any bigwig signal track that you want to compare against a gold standard.


The input gold standard bigwig file. This file must be a binary signal track with a value of 1 at transcription factor binding sites (TFBS; e.g., from ChIP-seq) and 0 at positions with no TFBS.
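As an illustration of what a binary gold standard track looks like, the sketch below turns a list of peak intervals into a per-base 0/1 signal. The function name `binary_track` and the half-open `(start, end)` interval convention are assumptions made for this example; writing the result to a bigwig file is not shown.

```python
def binary_track(chrom_length, peaks):
    """Return a per-base list with 1 inside peaks and 0 elsewhere.

    peaks: iterable of half-open (start, end) intervals on one chromosome.
    Hypothetical helper for illustration only.
    """
    track = [0] * chrom_length
    for start, end in peaks:
        # Clip each interval to the chromosome bounds before marking it.
        for pos in range(max(start, 0), min(end, chrom_length)):
            track[pos] = 1
    return track


if __name__ == "__main__":
    print(binary_track(10, [(2, 5), (8, 12)]))
```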


The output filename prefix to use. Default: maxatac_benchmark

Optional Arguments


The chromosomes to benchmark the predictions for. Default: chr1, the held-out test chromosome.


The size of the bin to use for aggregating the single base-pair predictions. Default: 200, the bin size used by the ENCODE-DREAM in vivo TFBS Prediction Challenge.


The method to use for aggregating the single base-pair predictions into larger bins. Options include max, min, and mean. Default: max, the maximum score found in the window.

See the pyBigWig documentation for more details.
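The binning and aggregation described above can be sketched in plain Python. The function name `bin_scores` and the handling of a shorter final bin are assumptions for this example; in practice the per-base values would be read from the bigwig file rather than passed in as a list.

```python
from statistics import mean

# Supported aggregation methods, mirroring the max, min, and mean options.
AGGREGATORS = {"max": max, "min": min, "mean": mean}


def bin_scores(per_base, bin_size=200, method="max"):
    """Aggregate single base-pair scores into consecutive fixed-size bins.

    A trailing bin shorter than bin_size is aggregated over the bases it has.
    Hypothetical helper for illustration only.
    """
    if method not in AGGREGATORS:
        raise ValueError(f"unknown aggregation method: {method}")
    aggregate = AGGREGATORS[method]
    return [
        aggregate(per_base[start:start + bin_size])
        for start in range(0, len(per_base), bin_size)
    ]


if __name__ == "__main__":
    signal = [0.1, 0.5, 0.2, 0.9, 0.3, 0.4]
    print(bin_scores(signal, bin_size=3, method="max"))
```

With max aggregation, a single high-scoring base is enough to give the whole bin a high score, whereas mean dilutes narrow signal across the window.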


This flag sets the precision of the prediction signal track. Provide an integer giving the number of decimal places to keep when rounding. Currently, the predictions go from 0 - .0000000001. Default: 9, the limit of precision from TensorFlow.
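The effect of rounding can be seen in a small sketch. It assumes the integer is interpreted as a number of decimal places; the function name `round_predictions` is hypothetical and is not taken from the maxATAC code.

```python
def round_predictions(scores, decimals=9):
    """Round each prediction score to a fixed number of decimal places.

    Hypothetical helper; assumes the flag value means decimal places.
    """
    return [round(score, decimals) for score in scores]


if __name__ == "__main__":
    scores = [0.1234, 0.1236]
    # Fewer decimal places merge nearby scores into the same value.
    print(round_predictions(scores, 3))
    print(round_predictions(scores, 2))
```

Coarser rounding reduces the number of distinct score thresholds, which in turn reduces the number of points on the precision-recall curve.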


The output directory to write the results to. Default: ./prediction_results


The path to the blacklist bigwig signal track of regions that should be excluded. Default: hg38_maxatac_blacklist.bed, which contains regions that are specific to ATAC-seq.
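One way to picture blacklist exclusion is as a mask over the evaluation bins. The sketch below assumes that a bin is dropped when any of its bases overlaps a blacklisted interval; that rule, the function name `bins_to_keep`, and the half-open `(start, end)` convention are assumptions for illustration and may not match what maxATAC does.

```python
def bins_to_keep(n_bins, bin_size, blacklist):
    """Return one boolean per bin: True if the bin is kept for evaluation.

    blacklist: iterable of half-open (start, end) intervals on one chromosome.
    Hypothetical helper; assumes any overlap excludes the whole bin.
    """
    keep = [True] * n_bins
    for start, end in blacklist:
        if end <= start:
            continue  # skip empty or malformed intervals
        first_bin = max(start // bin_size, 0)
        last_bin = min((end - 1) // bin_size, n_bins - 1)
        for index in range(first_bin, last_bin + 1):
            keep[index] = False
    return keep


if __name__ == "__main__":
    mask = bins_to_keep(5, 200, [(150, 250), (800, 900)])
    print(mask)
```

The resulting mask would be applied to both the prediction bins and the gold standard bins before computing the AUPRC, so that excluded regions contribute to neither precision nor recall.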


This argument is used to set the logging level. Currently, the only working logging level is ERROR.