Losing part of a secondary structure element may destabilize the neighboring parts of that element. Detailed analysis is given of four biomedically relevant proteins (Beta-site Amyloid Precursor Protein Cleaving Enzyme (BACE), Interleukin-4, Frataxin, and Hereditary hemochromatosis protein) and their associated splice variant models.
The visualization of these possible structures provides new insights into their functionality and the possible etiology of associated diseases. However, similar to the three mentioned tools, our application builds SpliceGraphs. In addition to splice site support, we used features such as exon length, and we prioritized multiple exons over a single continuous exon that spans them with identical start and end coordinates. These choices improve the SpliceGraph structure, better define the differences between transcript variants, and recognize all possible exons.
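As a rough illustration of the SpliceGraph idea described above, the following Python sketch builds a graph whose nodes are exons and whose edges are the exon-to-exon connections seen in known transcripts. The function name `build_splice_graph`, the tuple-based exon representation, the forward-strand ordering, and the toy coordinates are all assumptions made for illustration. It is not the SpliceDetector implementation and it omits the exon-length and exon-prioritization rules mentioned in the text.

```python
from collections import defaultdict


def build_splice_graph(transcripts):
    """Build a simple splice graph from known transcripts.

    transcripts: dict mapping a transcript ID to a list of exons, each
    exon given as a (start, end) tuple in genomic coordinates
    (forward strand assumed).
    Returns (nodes, edges): nodes is a sorted list of unique exons and
    edges maps each exon to the set of exons that directly follow it in
    at least one transcript.
    """
    nodes = set()
    edges = defaultdict(set)
    for exons in transcripts.values():
        ordered = sorted(exons)  # order exons along the genome
        nodes.update(ordered)
        for upstream, downstream in zip(ordered, ordered[1:]):
            edges[upstream].add(downstream)
    return sorted(nodes), dict(edges)


# Hypothetical toy gene with two transcripts; T2 skips the middle exon.
known = {
    "T1": [(100, 200), (300, 400), (500, 600)],
    "T2": [(100, 200), (500, 600)],
}
nodes, edges = build_splice_graph(known)
```

In this toy graph the first exon has two outgoing edges, one to the middle exon and one directly to the last exon, which is the pattern an exon-skipping event leaves in a splice graph.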
This software is as simple to use as the Vials tool, which works with gene names, but it also accepts a set of transcripts as input, which we consider an advantage. Our tool also presents a clear view of the alternative splicing events of the query transcript relative to the SpliceGraph and determines the exonic and genomic regions of the events.
We provide for the investigation of AS patterns in both single and multiple forms: the single form for investigating a specific transcript, and the multiple form for a set of transcripts. In addition, an image showing the query transcript alongside the SpliceGraph constructed from the known transcripts of the corresponding gene gives a clear view of the alternative splicing region and illustrates how the AS events occur. Furthermore, when unique transcript read counts are supplied along with the transcript IDs, the application can perform a Chi-square Goodness of Fit test to determine the significance of alteration rates between the Experimental Group and the Control Group.
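To make the Chi-square Goodness of Fit step concrete, here is a minimal Python sketch using only the standard library. It assumes that the Control Group read counts define the expected proportions and that the Experimental Group read counts are the observed values; the text does not spell out this arrangement, so treat it as one plausible reading. The function names `chi2_sf` and `goodness_of_fit` and the example counts are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite sum.
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def goodness_of_fit(observed, control):
    """Chi-square goodness of fit of observed counts to control proportions.

    observed: read counts per transcript in the experimental group.
    control:  read counts for the same transcripts in the control group
              (assumed to be strictly positive).
    Returns (statistic, p_value) with len(observed) - 1 degrees of freedom.
    """
    if len(observed) != len(control) or len(observed) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    if min(control) <= 0:
        raise ValueError("control counts must be positive")
    n_obs, n_ctrl = sum(observed), sum(control)
    expected = [n_obs * c / n_ctrl for c in control]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_sf(stat, len(observed) - 1)


# Hypothetical read counts for two transcripts of one gene.
stat, p_value = goodness_of_fit(observed=[30, 70], control=[50, 50])
```

A small p-value here would suggest that the relative usage of the transcripts differs between the two groups, under the assumptions stated above.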
Results can be exported in text and Microsoft Excel formats. Methods of application are shown in the practical guide.
Data for testing is supplied in supplemental files S6-9. We developed a practical SpliceGraph-based application for detecting alternative splicing events from transcripts in all model organisms.
To make it easier to extract AS events directly from transcripts rather than from RNA-seq data, we eliminated the complicated steps of downloading reference data and using strict command-line arguments. Using this software, researchers can investigate AS events as a significant factor in the alteration of protein function, with the SpliceGraph updated on each use.
The SpliceDetector software is compatible with Windows and needs.
By incorporating these gained and ghost domains directly into their PPI context, we can consider MGGs as potential interaction changes caused by domain exclusion or inclusion. Genes without multiple splice variants were removed from further analysis.
We downloaded the 11, domain-domain interactions (DDIs) from 3did [34], where inclusion requires a known protein-protein structure to support the interaction. After removal of interactions without DDI support, interactions remained. Of these, only had gained or ghost domains. There were 50 genes with an isoform significant for survival for these interactions.
We note that this tool may be used independently of NEEP. Mutation profiles for each patient were constructed from the 6 substitution types together with their neighboring bases, for a total of 96 mutation types, using only mutations found by all four variant callers. We used MutationalPatterns [40] in R to find the linear contributions of each of the 30 signatures to a patient profile. To check for possible confounding factors, we considered three smoking variables reported by TCGA and checked whether they differed between the low and high RAD51C expression groups: cigarettes per day was tested using the Welch two-sample t-test; years smoked was tested using the Wilcoxon rank sum test; and the binary smoker variable was tested using the two-sample test for equal proportions.
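The count of 96 mutation types follows from 6 substitution classes times 4 possible 5' neighbors times 4 possible 3' neighbors. The Python sketch below enumerates them and tallies a per-patient profile. The label format (`A[C>A]A`), the function names, and the example mutations are assumptions for illustration; the pyrimidine-referenced substitution classes follow the common convention for mutational signatures, and the sketch is not taken from the MutationalPatterns package.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written with the pyrimidine as the reference base.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def mutation_types():
    """Enumerate the 96 trinucleotide mutation types, e.g. 'A[C>A]A'."""
    return [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS
            for five, three in product(BASES, repeat=2)]


def build_profile(mutations):
    """Count mutations per type; `mutations` is an iterable of type labels."""
    counts = dict.fromkeys(mutation_types(), 0)
    for label in mutations:
        counts[label] += 1
    return counts


# Hypothetical mutations for one patient.
profile = build_profile(["A[C>A]A", "A[C>A]A", "T[T>C]G"])
```

A profile of this shape is the kind of 96-element count vector that signature-fitting tools decompose into contributions from known signatures.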
In addition, we checked if survival confounded the relationship between RAD51C and signature 3. Because the causality between mutations and survival must be directional, we conducted Cox-PH survival analysis of RAD51C with and without the contribution of Signature 3 as a confounding variable. The exact binomial test was used to determine if the proportion of splice variants significant for Signature 3 was greater than expected by chance.
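For reference, a one-sided exact binomial test of the kind mentioned above can be written in a few lines of standard-library Python. The function name `binomial_test_greater` and the example numbers are hypothetical, and the null proportion would have to come from the analysis itself; this sketch only shows the calculation of P(X >= k).

```python
from math import comb


def binomial_test_greater(successes, trials, p_null):
    """One-sided exact binomial test.

    Returns P(X >= successes) for X ~ Binomial(trials, p_null), i.e. the
    probability of seeing at least this many successes by chance.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    if not 0.0 <= p_null <= 1.0:
        raise ValueError("p_null must be a probability")
    return sum(comb(trials, k) * p_null ** k * (1.0 - p_null) ** (trials - k)
               for k in range(successes, trials + 1))


# Hypothetical example: 8 significant variants out of 10 with a null rate of 0.5.
p_value = binomial_test_greater(8, 10, 0.5)
```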
Counts of genes belonging to each enrichment cluster for the gene, isoform (splice variant), and MGG granularities are displayed as bar lengths. A missing bar does not indicate no membership, only insignificant enrichment of every term in the cluster. The file does not contain the exact paths, only the members of each component of the MGG.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. PLoS Comput Biol. Published online Oct Surinder K. Rachel Karchin, Editor. Received May 3; Accepted Oct 8. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
S1 File: Enrichment analysis methods. S3 File: List of multi-granularity graphs.
Abstract
Splice variants have been shown to play an important role in tumor initiation and progression and can serve as novel cancer biomarkers.
Author summary
In spite of many recent breakthroughs, there is still a pressing need for better ways to diagnose and treat cancer in ways that are specific to the unique biology of the disease.
Introduction
Large-scale cancer sequencing initiatives have opened up a window into the genome of individual cancers, offering unprecedented opportunities for studying the functional consequences of molecular alterations in human cancers [1].
Fig 1. Workflow for generating multi-granularity graphs (MGGs).
Results
NEEP identifies splice variants significantly associated with patient survival
One of the contributions of this work is the development of a statistically robust and computationally efficient method to identify optimal expression thresholds that yield minimum p-values when performing survival analysis over a large number of transcripts.
Fig 2. NEEP yields uniformly distributed p-values.
Case study of multi-granular graphs linked to DNA repair
After identifying splice variants significantly associated with survival, we followed the procedure described in Materials and Methods and summarized in Fig 1 to generate multi-granular graphs (MGGs).
Fig 3. Multi-granularity graphs.
RAD51C expression is linked to a characteristic mutational signature and lower patient survival
The biological models discussed above suggest a role played by these splice variants in DNA repair.
Fig 4. Mutation signature 3 association with RAD51C.
Fig 5. Robustness of single threshold methods and NEEP.
Discussion
The goal of this work was to identify splice variants significantly associated with patient survival and provide possible mechanisms underpinning the associations.
Minimum p-value approach
Choosing the threshold which maximizes a test statistic avoids the choice of an arbitrary threshold, while increasing reliability.
Empirical estimation of p-values
To overcome the limitations of choosing a single threshold (an arbitrary threshold) and of minimum p-value methods (non-uniform p-values under the null hypothesis), we developed a new approach resulting in null empirically estimated p-values (NEEP).
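The general idea of calibrating a minimum p-value against an empirical null can be sketched as follows. This is a generic illustration, not the NEEP algorithm, whose details are not given in this section. It assumes the null distribution is a list of minimum p-values obtained from data in which the expression-survival link has been broken, for example by permutation. The function names and the numbers are hypothetical.

```python
def min_p_over_thresholds(p_values):
    """Return the smallest p-value obtained across candidate thresholds."""
    if not p_values:
        raise ValueError("need at least one p-value")
    return min(p_values)


def empirical_p_value(observed_min_p, null_min_ps):
    """Calibrate an observed minimum p-value against an empirical null.

    null_min_ps: minimum p-values computed the same way on null data.
    Uses the (1 + count) / (1 + N) estimator so the result is never zero.
    """
    as_extreme = sum(1 for p in null_min_ps if p <= observed_min_p)
    return (1 + as_extreme) / (1 + len(null_min_ps))


# Hypothetical p-values from three candidate thresholds for one transcript.
observed = min_p_over_thresholds([0.20, 0.01, 0.08])
# Hypothetical null minimum p-values from four permuted data sets.
calibrated = empirical_p_value(observed, [0.5, 0.02, 0.005, 0.2])
```

The raw minimum p-value is biased toward small values because it is the best of several tries; comparing it with minima computed the same way under the null is one standard way to correct for that.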
Robustness
We measured robustness of the NEEP method by examining sensitivity of significant splice variants to changes in the set of patients.
Multi-granular graphs
We utilized data across multiple granularities to construct plausible graphs for the association between splice variant expression and lung cancer survival.
Supporting information
S1 Fig. Enrichment term cluster membership across granularities. (TIF)
S1 File. Enrichment analysis methods. (PDF)
S3 File. List of multi-granularity graphs. (TSV)
Data Availability
All relevant data are within the manuscript and its Supporting Information files.
References
1. Cancer genome landscapes. Cancer statistics, CA: a cancer journal for clinicians.
Molecular targeted therapy: Treating cancer with specificity. European journal of pharmacology. Splicing programs and cancer. Journal of Nucleic Acids. Oltean S, Bates DO. Hallmarks of alternative splicing in cancer. Therapeutic targeting of splicing in cancer. Nature Medicine. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.