Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
predict function will use a maxATAC model to predict TF binding in a new condition. The user must provide a model and a bigwig file that corresponds to an ATAC-seq signal track.
maxatac predict --models CTCF.h5 --signal GM12878.bigwig
maxatac predict --tf CTCF --signal GM12878.bigwig
The user must provide either the TF name that they want to make predictions for or the h5 model file they desire. If the user provides a TF name, the best model will be used and the correct threshold file will be provided for peak calling.
-s, --signal, -i
The ATAC-seq signal bigwig track that will be used to make predictions of TF binding.
-n, --name, --prefix
Output filename prefix to use. Default
This argument specifies the path to the 2bit DNA sequence for the genome of interest. maxATAC models are trained with hg38 so you will need the correct
The cutoff type (i.e.
log2FC). (F1 = F1-score, and log2FC = Log2( Precision : Random Precision)). Default: F1.
The cutoff value for the cutoff type provided. Note precision, recall, and F1-scores range 0-1, while better-than-random log2FC scores range from 0 to infinity. Example: .7
The cutoff file provided in /data/models that corresponds to the average validation performance metrics for the TF model.
Output directory path. Default:
The path to a bigwig file that has regions to exclude. Default: maxATAC-defined blacklist.
--bed, --peaks, --regions, , --roi, -roi
The path to a bed file that contains the genomic regions to focus TF predictions on. These peaks will be used to refine the prediction windows.
The number of regions to predict on per batch. Default
10000. Decrease this value if you are having memory issues.
The step size to use for building the prediction intervals. Overlapping prediction bins will be averaged together. Default:
INPUT_LENGTH/4, where INPUT_LENGTH is the maxATAC model input size of 1,024 bp.
-cs, --chrom_sizes, -chrom_sizes, --chromosome_sizes
The path to the chromosome sizes file. This is used to generate the bigwig signal tracks.
-c, -chroms, --chromosomes
The chromosomes to make predictions on. Our models do not currently considered chromosomes X or Y. This means that most of the files will not contain this information. You should not predict in chrX or chrY unless you know your bigwig contains these chromosomes. Default: Autosomal chromosomes 1-22.
This argument is used to set the logging level. Currently, the only working logging level is
The windows to use for prediction. These windows must be 1,024 bp wide and have a consistent step size.
This will skip calling peaks at the end of predictions.