TY - JOUR
T1 - R-CRISPR
T2 - A deep learning network to predict off-target activities with mismatch, insertion and deletion in CRISPR-Cas9 system
AU - Niu, Rui
AU - Peng, Jiajie
AU - Zhang, Zhipeng
AU - Shang, Xuequn
N1 - Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/12
Y1 - 2021/12
N2 - The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)—associated protein 9 (Cas9) system is a groundbreaking gene-editing tool, which has been widely adopted in biomedical research. However, the guide RNAs in CRISPR-Cas9 system may induce unwanted off-target activities and further affect the practical application of the technique. Most existing in silico prediction methods that focused on off-target activities possess limited predictive precision and remain to be improved. Hence, it is necessary to propose a new in silico prediction method to address this problem. In this work, a deep learning framework named R-CRISPR is presented, which devises an encoding scheme to encode gRNA-target sequences into binary matrices, a convolutional neural network as feature extractor, and a recurrent neural network to predict off-target activities with mismatch, insertion, or deletion. It is demonstrated that R-CRISPR surpasses six mainstream prediction methods with a significant improvement on mismatch-only datasets verified by GUIDE-seq. Compared with the state-of-art prediction methods, R-CRISPR also achieves competitive performance on datasets with mismatch, insertion, and deletion. Furthermore, experiments show that data concatenate could influence the quality of training data, and investigate the optimal combination of datasets.
AB - The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)—associated protein 9 (Cas9) system is a groundbreaking gene-editing tool, which has been widely adopted in biomedical research. However, the guide RNAs in CRISPR-Cas9 system may induce unwanted off-target activities and further affect the practical application of the technique. Most existing in silico prediction methods that focused on off-target activities possess limited predictive precision and remain to be improved. Hence, it is necessary to propose a new in silico prediction method to address this problem. In this work, a deep learning framework named R-CRISPR is presented, which devises an encoding scheme to encode gRNA-target sequences into binary matrices, a convolutional neural network as feature extractor, and a recurrent neural network to predict off-target activities with mismatch, insertion, or deletion. It is demonstrated that R-CRISPR surpasses six mainstream prediction methods with a significant improvement on mismatch-only datasets verified by GUIDE-seq. Compared with the state-of-art prediction methods, R-CRISPR also achieves competitive performance on datasets with mismatch, insertion, and deletion. Furthermore, experiments show that data concatenate could influence the quality of training data, and investigate the optimal combination of datasets.
KW - CRISPR/Cas9
KW - Deep learning
KW - Off-target prediction
UR - http://www.scopus.com/inward/record.url?scp=85120049930&partnerID=8YFLogxK
U2 - 10.3390/genes12121878
DO - 10.3390/genes12121878
M3 - 文章
C2 - 34946828
AN - SCOPUS:85120049930
SN - 2073-4425
VL - 12
JO - Genes
JF - Genes
IS - 12
M1 - 1878
ER -