Publication

2023-08-06
Finding Needles in the Haystack: Strategies for Uncovering Noncoding Regulatory Variants
Despite accumulating evidence implicating noncoding variants in human diseases, unraveling their functionality remains a significant challenge. Systematic annotations of the regulatory landscape and the growth of sequence variant data sets have fueled the development of tools and methods to identify causal noncoding variants and evaluate their regulatory effects. Here, we review the latest advances in the field and discuss potential future research avenues to gain a more in-depth understanding of noncoding regulatory variants. Continue reading...
2022-08-26
Functional genomic assays to annotate enhancer-promoter interactions genome-wide
Enhancers are pivotal for regulating gene transcription that occurs at promoters. Identification of the interacting enhancer–promoter pairs and understanding the mechanisms behind how they interact and how enhancers modulate transcription can provide fundamental insight into gene regulatory networks. Recent advances in high-throughput methods in three major areas have furthered our knowledge about transcription: chromosome conformation capture assays, such as Hi-C, to study basic chromatin architecture; ectopic reporter experiments, such as self-transcribing active regulatory region sequencing (STARR-seq), to quantify promoter and enhancer activity; and endogenous perturbations, such as clustered regularly interspaced short palindromic repeat interference (CRISPRi), to identify enhancer–promoter compatibility. In this review, we will discuss the major method developments and key findings from these assays. Continue reading...
2022-08-16
Survey of the binding preferences of RNA-binding proteins to RNA editing events
Background: Adenosine-to-inosine (A-to-I) editing is an important RNA posttranscriptional process related to a multitude of cellular and molecular activities. However, systematic characterizations of whether and how the events of RNA editing are associated with the binding preferences of RNA sequences to RNA-binding proteins (RBPs) are still lacking. Results: With the RNA-seq and RBP eCLIP-seq datasets from the ENCODE project, we quantitatively survey the binding preferences of 150 RBPs to RNA editing events, followed by experimental validations. Such analyses of the RBP-associated RNA editing at nucleotide resolution and genome-wide scale shed light on the involvement of RBPs specifically in RNA editing-related processes, such as RNA splicing, RNA secondary structures, RNA decay, and other posttranscriptional processes. Conclusions: These results highlight the relevance of RNA editing in the functions of many RBPs and therefore serve as a resource for further characterization of the functional associations between various RNA editing events and RBPs. Continue reading...
A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers
Mounting evidence supports the idea that transcriptional patterns serve as more specific identifiers of active enhancers than histone marks; however, the optimal strategy to identify active enhancers both experimentally and computationally has not been determined. Here, we compared 13 genome-wide RNA sequencing (RNA-seq) assays in K562 cells and show that nuclear run-on followed by cap-selection assay (GRO/PRO-cap) has advantages in enhancer RNA detection and active enhancer identification. We also introduce a tool, peak identifier for nascent transcript starts (PINTS), to identify active promoters and enhancers genome wide and pinpoint the precise location of 5′ transcription start sites. Finally, we compiled a comprehensive enhancer candidate compendium based on the detected enhancer RNA (eRNA) transcription start sites (TSSs) available in 120 cell and tissue types, which can be accessed at https://pints.yulab.org. With knowledge of the best available assays and pipelines, this large-scale annotation of candidate enhancers will pave the way for selection and characterization of their functions in a time- and labor-efficient manner. Continue reading...
2019-12-05
Inhibition of DCLK1 down-regulates PD-L1 expression through Hippo pathway in human pancreatic cancer
Immunotherapy is one of the most promising strategies for cancer, compared with traditional treatments. As one of the key emerging immunotherapies, anti-PD-1/PD-L1 treatment has brought survival benefits to many advanced cancer patients. However, in pancreatic cancer, immunotherapy-based approaches have not achieved a favorable clinical effect because of mismatch repair deficiencies. Therefore, the majority of pancreatic tumors are regarded as immune-quiescent tumors and non-responsive to single-checkpoint blockade therapies. Many preclinical and clinical studies suggest that it is still important to clarify the regulatory mechanism of the PD-1/PD-L1 pathway in pancreatic cancer. As a marker of cancer stem cells, DCLK1 has been found to play an important role in the occurrence and development of a plethora of human cancers. Recent research has revealed that DCLK1 is closely related to the EMT process of tumor cells; it could also be used as a biomarker in gastrointestinal tumors to predict the prognoses of patients. However, the role that DCLK1 plays in the immune regulation of tumor microenvironments remains unknown. Therefore, we sought to understand if DCLK1 could positively regulate the expression of PD-L1 in pancreatic cancer cells. Furthermore, we examined if DCLK1 highly correlated with the Hippo pathway through TCGA database analysis. We found that DCLK1 helped regulate the level of PD-L1 expression by affecting the corresponding expression level of yes-associated protein in the Hippo pathway. Collectively, our study identifies DCLK1 as an important regulator of PD-L1 expression in pancreatic tumors and highlights a central role of DCLK1 in the regulation of tumor immunity. Continue reading...
2019-08-06
Extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations
Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at > 1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual’s genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations. Continue reading...
2018-08-14
bioSyntax: syntax highlighting for computational biology
Background: Computational biology requires the reading and comprehension of biological data files. Plain-text formats such as SAM, VCF, GTF, PDB and FASTA often contain critical information that is obscured by the complexity of the data structure. Results: bioSyntax is a freely available suite of biological syntax highlighting packages for vim, gedit, Sublime, VSCode, and less. bioSyntax improves the legibility of low-level biological data in the bioinformatics workspace. Conclusion: bioSyntax supports computational scientists in parsing and comprehending their data efficiently and thus can accelerate research output. Continue reading...
Large-scale prediction of ADAR-mediated effective human A-to-I RNA editing
Adenosine-to-inosine (A-to-I) editing by adenosine deaminase acting on the RNA (ADAR) proteins is one of the most frequent modifications during post- and co-transcription. To facilitate the assignment of biological functions to specific editing sites, we designed an automatic online platform to annotate A-to-I RNA editing sites in pre-mRNA splicing signals, microRNAs (miRNAs) and miRNA target untranslated regions (3′ UTRs) from human (Homo sapiens) high-throughput sequencing data and predict their effects based on large-scale bioinformatic analysis. After analysing plenty of previously reported RNA editing events and human normal tissues RNA high-seq data, >60000 potentially effective RNA editing events on functional genes were found. The RNA Editing Plus platform is available for free at https://www.rnaeditplus.org/, and we believe our platform governing multiple optimized methods will improve further studies of A-to-I-induced editing post-transcriptional regulation. Continue reading...
BioQueue: a novel pipeline framework to accelerate bioinformatics analysis
Motivation: With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users’ experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Results: Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. Continue reading...
Circulating microRNAs: Promising Biomarkers Involved in Several Cancers and Other Diseases
Recently, many studies have indicated that microRNAs (miRNAs) exist stably in various body fluids, including serum, plasma, saliva, and urine. Such miRNAs that exist in mammalian body fluids are known as circulating miRNAs, and they can transmit signals between cells and regulate intracellular gene expression. Currently, we barely understand the characteristics, sources, secretion, uptake, and functions of newly generated miRNAs. Notably, it has been shown that certain types of circulating miRNAs can provide effective clinical data, suggesting their roles as novel biomarkers for the early detection of diseases such as cancers, cardiovascular diseases, and diabetes. Therefore, miRNAs have attracted much attention in academia for their promising applications in fundamental research and clinical diagnosis. This review summarizes some of the functional studies that have been conducted as well as the promising applications of circulating miRNAs, and we hope it will benefit other researchers in this field. Continue reading...