Benchmark

The benchmark function calculates the area under the precision-recall curve (AUPRC) for a bigwig file of predictions compared against a gold standard, also in bigwig format. The user must provide the predictions in bigwig format and specify the resolution of the evaluation (e.g., 200 bp).
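
To make the metric concrete, here is a minimal pure-Python sketch of AUPRC computed by step-wise summation over ranked bins (often called average precision). The function name `auprc` and the list-based inputs are illustrative assumptions, not maxATAC's internal code, which may use a different library or interpolation scheme.

```python
def auprc(scores, labels):
    """Illustrative AUPRC via step-wise summation (hypothetical helper).

    scores: one prediction score per bin
    labels: one gold standard value per bin (1 = TFBS, 0 = no TFBS)
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("gold standard contains no positive bins")

    # Rank bins from the highest to the lowest prediction score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    area = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(order):
        # Bins with tied scores share one threshold, so count them together.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if labels[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        # Add the rectangle between the previous and the current recall.
        area += precision * (recall - prev_recall)
        prev_recall = recall
        i = j
    return area
```

For example, `auprc([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0])` ranks both positive bins first and therefore returns 1.0.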

Example

maxatac benchmark --prediction GM12878_CTCF_chr1.bw --gold_standard GM12878_CTCF_ENCODE_IDR.bw --chromosomes chr1 --bin_size 200

Required Arguments

`--prediction`

The input bigwig file of transcription factor binding predictions. This file can also be any bigwig signal track that you want to compare against a gold standard.

`--gold_standard`

The input gold standard bigwig file. This file needs to be a binary signal track that has 1 corresponding to TFBS (e.g., from ChIP-seq) and 0 in positions with no TFBS.

`--prefix`

The output filename prefix to use. Default: maxatac_benchmark

Optional Arguments

`--chromosomes`

The chromosomes to benchmark the predictions for. Default: chr1, the held-out test chromosome.

`--bin_size`

The size of the bin, in base pairs, to use for aggregating the single base-pair predictions. Default: 200, the size used by the ENCODE-DREAM in vivo TFBS Prediction Challenge.

`--agg`

The method to use for aggregating the single base-pair predictions into larger bins. Options include max, min, and mean. Default: max, the highest score found in the window.

See the pyBigWig documentation for more details.
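
As a rough illustration of what the aggregation options mean, the sketch below collapses a list of per-base scores into fixed-size bins. The helper `bin_scores` and the `AGGREGATORS` table are hypothetical names chosen for this example; the tool itself works on bigwig files rather than Python lists.

```python
from statistics import mean

# Map each --agg option to a plain Python function (illustrative only).
AGGREGATORS = {"max": max, "min": min, "mean": mean}


def bin_scores(values, bin_size=200, agg="max"):
    """Summarize per-base scores into consecutive bins of bin_size positions.

    The final bin may be shorter when len(values) is not a multiple of bin_size.
    """
    func = AGGREGATORS[agg]
    return [
        func(values[start:start + bin_size])
        for start in range(0, len(values), bin_size)
    ]
```

With six positions and a bin size of 3, `bin_scores([0.1, 0.5, 0.2, 0.4, 0.3, 0.9], bin_size=3, agg="max")` keeps the highest score of each window and returns `[0.5, 0.9]`.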

`--round_predictions`

This flag sets the precision of the prediction signal track. Provide an integer giving the number of decimal places to keep when rounding. Currently, the predictions go from 0 to 0.0000000001. Default: 9, the limit of precision from TensorFlow.
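
The effect of rounding can be sketched with Python's built-in `round`. The helper name `round_predictions` is an assumption made for this example and is not taken from the maxATAC code base.

```python
def round_predictions(scores, decimals=9):
    """Round each prediction score to a fixed number of decimal places."""
    return [round(score, decimals) for score in scores]
```

With 9 decimal places, a score of `1e-10` rounds to `0.0`, so very small scores become tied with bins that have no signal at all.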

`--output_directory`

The output directory to write the results to. Default: ./prediction_results

`--blacklist_bw`

The path to the blacklist bigwig signal track of regions that should be excluded from the evaluation. Default: hg38_maxatac_blacklist.bed, which contains regions that are specific to ATAC-seq.
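
One way to picture the exclusion step is to drop every bin that overlaps a blacklisted interval before scoring. The sketch below assumes half-open `(start, end)` intervals on a single chromosome; the helper names are hypothetical, and the tool may handle blacklisted regions differently (for example, by masking rather than dropping bins).

```python
def blacklisted_bins(intervals, bin_size, n_bins):
    """Return the indices of bins that overlap any half-open [start, end) interval."""
    masked = set()
    for start, end in intervals:
        if end <= start:
            continue  # skip empty or malformed intervals
        first = start // bin_size
        last = (end - 1) // bin_size
        # Clamp to the last bin so intervals past the chromosome end are ignored.
        masked.update(range(first, min(last, n_bins - 1) + 1))
    return masked


def drop_blacklisted(scores, labels, masked):
    """Keep only the bins whose index is not in the masked set."""
    kept = [i for i in range(len(scores)) if i not in masked]
    return [scores[i] for i in kept], [labels[i] for i in kept]
```

For instance, with 200 bp bins, an interval from 150 to 450 touches bins 0, 1, and 2, so those three bins would be left out of the comparison.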

`--loglevel`

This argument is used to set the logging level. Currently, the only working logging level is ERROR.