Beijing Institute of Genomics develops long non-coding RNA database

LncBook, a database of human long non-coding RNAs (lncRNAs) developed by the Beijing Institute of Genomics, Chinese Academy of Sciences, was recently officially launched. The results of this research are based on

In recent years, lncRNAs have been a focus of international research. Studies have shown that lncRNAs play important roles in a variety of biological processes and are closely associated with disease, yet the amount and quality of lncRNA annotation still lag far behind those of protein-coding genes. The LncBook database not only provides a rich, high-quality human lncRNA dataset, but also performs large-scale multi-omics data analysis and systematic functional and disease annotation, supplying a wealth of usable information and data for functional experiments and bioinformatics analysis.

Applying rigorous curation criteria, LncBook integrated existing lncRNA data and identified new lncRNAs, yielding a total of 270,044 lncRNA transcripts. On this basis, LncBook performs large-scale, in-depth analysis at four multi-omics levels: expression, methylation, genome variation, and miRNA-lncRNA interaction. At the expression level, it profiles lncRNA expression across 32 or 53 normal human tissues and identifies 49,115 highly tissue-specific and 819 housekeeping lncRNAs. At the methylation level, it constructs methylation profiles of lncRNA promoter and gene-body regions in normal and cancer states for nine cancers. At the variation level, it annotates 92,725,757 SNPs in lncRNA regions, drawn from dbSNP, with minor allele frequencies (based on 1000 Genomes Project data) and with disease-association information from ClinVar and COSMIC. At the interaction level, it predicts 128,392,451 lncRNA-miRNA interactions. These results are presented in the LncBook database as graphs or tables, and the underlying data are freely available for download. Drawing on these data, LncBook also predicted 97,998 potentially disease-associated lncRNAs. In addition, building on LncRNAWiki, LncBook annotated 1,867 lncRNAs reported in the literature with systematic functional and disease information.

LncBook, an important lncRNA repository, provides the largest collection of human lncRNA data to date. As a complement to the LncRNAWiki database, LncBook offers user-friendly query, browsing, and visualization features. Users can search for lncRNAs by ID or symbol, function, disease name, and other fields; browse multi-omics information for a given lncRNA; and download all related annotation and analysis results via FTP. In addition, LncBook provides online tools for lncRNA sequence alignment, classification, and coding-potential prediction, among other analyses.

The research was conducted in collaboration with Vladimir Bajic, a professor at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia. It was funded by the Strategic Priority Research Program of CAS, the International Partnership Program of CAS, and the "13th Five-Year Plan" Informatization Program of CAS.

