Vishwakarma, R. RNAnalysis for Codeless RNAnalysing Sequencing of Data. Encyclopedia. Available online: https://encyclopedia.pub/entry/52543 (accessed on 01 September 2024).
RNAlysis for Codeless Analysis of RNA Sequencing Data

In next-generation sequencing experiments, significant obstacles include conducting exploratory data analysis, deciphering trends, pinpointing potential targets/candidates, and presenting results in a clear and intuitive manner. These challenges become more pronounced for researchers lacking expertise in coding, as most analysis tools demand programming skills. Even for adept computational biologists, there is a need for an efficient and reproducible system to produce standardized results.

RNA sequencing; Gene-set enrichment analysis; Python; Bioinformatics

1. Introduction

Exploratory data analysis, identifying patterns, finding candidate targets, and presenting the results intuitively are among the most difficult aspects of next-generation sequencing experiments. These challenges are harder still for researchers who are not accustomed to writing code, because most of the available analysis tools require programming knowledge. RNAlysis is a customisable, Python-based analysis program for RNA sequencing data [1]. It lets users build analysis pipelines tailored to their own research questions, starting from raw FASTQ files and progressing through exploratory data analysis, data visualisation, cluster analysis, and gene-set enrichment analysis. Through its graphical user interface, researchers can carry out these analyses without writing any code. The utility of RNAlysis has been demonstrated by analysing RNA sequencing data from several studies of C. elegans [2], but the application can process data from any organism.

RNA sequencing continues to grow in popularity as a research tool for biologists. Using a wide range of RNA-sequencing analysis techniques, researchers can compare gene expression levels across biological samples or experimental conditions, group genes by their expression patterns, and characterise expression changes in genes involved in particular biological functions and pathways.

Because most analysis tools handle only a subset of these tasks, any non-standard research question requires a custom analysis script, and such scripts can be difficult to share or reproduce. Furthermore, many current tools require users to read and write code, so they are usable only by researchers with programming experience.

RNAlysis addresses these concerns by (1) using a modular design that lets users either explore their data step by step or build reproducible analysis pipelines from individual functions; (2) offering a user-friendly, flexible graphical user interface (GUI) that lets users examine their data interactively and answer a wide range of biological questions, however general or specialised, without writing a single line of code; and (3) providing in-depth documentation and step-by-step guided analyses that help novice users quickly learn good data analysis practices.

2. Framework of RNAlysis

RNAlysis was designed to accomplish three main objectives:

(1) pre-processing and data exploration;

(2) identifying relevant gene sets using filtering, clustering, and set operations;

(3) visualising gene-set intersections and applying enrichment analysis to those sets.

Through its graphical user interface, RNAlysis lets users run basic adapter trimming, RNA sequencing quantification, and differential expression analysis by interfacing with tools such as CutAdapt, kallisto, and DESeq2. Users can therefore begin their analysis at any stage of the workflow, from raw sequencing data onwards, and can also import data tables generated elsewhere. The tabbed interface lets users examine and analyse several data tables at once and switch between them. RNAlysis can analyse gene expression matrices (raw or normalised), differential expression tables, and user-defined gene sets of interest, and it supports annotations for user-defined gene attributes. Because RNAlysis works on tabular data, it can be used with any kind of data table.

3. Data Validation and Pre-processing

First, RNAlysis lets users validate their data by summarising and visualising its distribution and trends. For instance, users can examine overall patterns in the data, compare the distribution of gene expression between samples using scatter plots and pair plots, and look for potential batch effects using clustergram plots and PCA projections.

Additionally, RNAlysis lets users pre-process their data by normalising it with one of many methods (including Median of Ratios, Relative Log Ratio, Trimmed Mean of M-values, and more), filtering out genes with low expression levels, and removing rows with missing data from their tables.
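The normalisation and low-expression filtering steps described above can be sketched in a few lines of plain Python. This is only an illustration of the Median of Ratios idea under stated assumptions, not the RNAlysis implementation: the function names, the dictionary-based table layout, and the choice to skip genes with a zero count when estimating size factors are all assumptions made for the example.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample.

    `counts` maps a gene name to a list of raw counts, one per sample
    (hypothetical layout chosen for this sketch).
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        # Assumption: genes with a zero count are skipped, because their
        # geometric mean across samples is zero and the ratio is undefined.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor of a sample is the median ratio of its counts
    # to the per-gene geometric means.
    return [math.exp(median(r)) for r in log_ratios]


def normalise(counts):
    """Divide every count by the size factor of its sample."""
    factors = median_of_ratios_size_factors(counts)
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}


def drop_low_expression(counts, threshold):
    """Keep genes whose count reaches `threshold` in at least one sample."""
    return {g: row for g, row in counts.items() if max(row) >= threshold}


# Toy table: sample 2 was sequenced about twice as deeply as sample 1.
raw = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [0, 4],
    "geneD": [50, 100],
}
normalised = normalise(raw)
expressed = drop_low_expression(normalised, threshold=5)
```

In the toy table every expressed gene has exactly twice the count in sample 2, so the two estimated size factors differ by a factor of two and the normalised counts of those genes agree across samples.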

4. Sorting and Grouping Data

After pre-processing, users can further filter their data tables by a wide range of criteria, depending on the nature of their data and the biological questions they are trying to answer. These filtering functions can be applied in any order and combination to suit the user's needs. Among many others, they include filtering by statistical significance or by the direction and magnitude of fold change, filtering genomic features by type, and performing set operations between data tables and gene sets (for example, intersections, differences, and majority-vote intersections).
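Filtering by statistical significance and by the direction and magnitude of fold change can be illustrated as follows. The sketch assumes a differential expression table held as a list of dictionaries with DESeq2-style column names (`log2FoldChange`, `padj`); the function name, its parameters, and the default thresholds are hypothetical and chosen only for illustration.

```python
def filter_differential_expression(rows, alpha=0.05, min_abs_log2fc=1.0, direction="any"):
    """Return the rows that pass significance and fold-change filters.

    direction: "up" keeps positive fold changes, "down" keeps negative
    ones, and "any" keeps both.
    """
    if direction not in ("any", "up", "down"):
        raise ValueError("direction must be 'any', 'up' or 'down'")
    kept = []
    for row in rows:
        padj = row["padj"]
        lfc = row["log2FoldChange"]
        # Rows with a missing adjusted p-value cannot be called significant.
        if padj is None or padj >= alpha:
            continue
        if abs(lfc) < min_abs_log2fc:
            continue
        if direction == "up" and lfc <= 0:
            continue
        if direction == "down" and lfc >= 0:
            continue
        kept.append(row)
    return kept


# Toy differential expression table (made-up values).
table = [
    {"gene": "geneA", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "geneB", "log2FoldChange": -1.8, "padj": 0.01},
    {"gene": "geneC", "log2FoldChange": 0.4, "padj": 0.0005},
    {"gene": "geneD", "log2FoldChange": 3.1, "padj": 0.2},
    {"gene": "geneE", "log2FoldChange": 1.5, "padj": None},
]
upregulated = filter_differential_expression(table, direction="up")
```

Because each filter takes a table and returns a table, filters of this kind can be chained in whatever order a given question calls for.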

One of the strengths of RNAlysis is the ability to quickly derive gene lists from set operations on the user's tables and gene sets and to use these lists in downstream analyses. This can be done either by applying a pre-defined set operation (such as intersection or difference) or by manually selecting subsets of interest in an interactive graphical interface.
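The set operations mentioned above map directly onto Python's built-in `set` type. The following sketch uses made-up gene names. The `majority_vote_intersection` helper and its `majority_threshold` parameter are hypothetical, and the rule used here (keep a gene if it appears in at least the given fraction of the sets) is one plausible reading of a majority-vote intersection rather than a documented definition.

```python
from collections import Counter


def majority_vote_intersection(gene_sets, majority_threshold=0.5):
    """Return genes present in at least `majority_threshold` of the sets."""
    membership = Counter(gene for s in gene_sets for gene in set(s))
    needed = majority_threshold * len(gene_sets)
    return {gene for gene, n in membership.items() if n >= needed}


# Three toy gene sets, e.g. significant genes from three experiments.
set_a = {"gene1", "gene2", "gene3"}
set_b = {"gene2", "gene3", "gene4"}
set_c = {"gene3", "gene4", "gene5"}

in_all = set_a & set_b & set_c          # intersection
only_in_a = set_a - set_b               # difference
in_most = majority_vote_intersection([set_a, set_b, set_c])
```

With three sets and a threshold of 0.5, a gene has to appear in at least two of them, so the majority-vote result is larger than the strict intersection but smaller than the union.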

Finally, RNAlysis lets users cluster genes by their expression patterns. It offers a wide range of clustering techniques, including distance-based clustering (K-Means, K-Medoids, hierarchical clustering), density-based clustering, and ensemble-based clustering (a modified version of the CLICOM algorithm). RNAlysis also offers a large selection of distance metrics for clustering analysis, including metrics developed specifically for biological applications such as time-course gene expression data, and metrics that have been shown to be the most appropriate for transcriptomics analysis.
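The choice of distance metric matters because it decides which expression profiles count as similar. As a minimal sketch, the code below compares Euclidean distance with a Pearson correlation distance (one minus the correlation coefficient) on toy profiles. These two metrics are used only as familiar examples; the text above does not say which metrics RNAlysis implements, and the function names are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """1 - Pearson correlation: 0 for identical shape, 2 for opposite shape."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1 - cov / (sd_x * sd_y)


# Toy time-course profiles (made-up values).
rising_low = [1, 2, 3, 4]
rising_high = [10, 20, 30, 40]   # same shape, ten times the magnitude
falling_low = [4, 3, 2, 1]       # opposite shape, similar magnitude

same_shape = pearson_distance(rising_low, rising_high)
opposite_shape = pearson_distance(rising_low, falling_low)
```

Under Euclidean distance the rising and falling low-magnitude profiles look closest, whereas the correlation distance groups the two rising profiles together, which is often the behaviour wanted when clustering by expression pattern rather than by expression level.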

5. Implementing Customizable Pipelines and Modularity

Filtered data tables can be saved or loaded at any point in the analysis. The names of the output files automatically reflect the operations applied to the data and their order. Additionally, RNAlysis displays the history of operations applied to each table in the order they were performed, and any operation can be undone with a single click.

As noted above, users can bundle any of the functions that RNAlysis provides into Pipelines. A Pipeline can then be applied, in the same order and with the same parameters, to any number of similar data tables. This saves time and helps avoid errors and inconsistencies when analysing a large number of datasets. Pipelines can also be exported and shared with other researchers, who can use them on any computer that has RNAlysis installed. By making analysis pipelines easy to report and share, this feature improves the reproducibility and objectivity of bioinformatic results [3].
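The pipeline idea, an ordered list of functions stored together with their parameters and replayed on any number of tables, can be sketched generically. The `Pipeline` class below is a hypothetical illustration written for this entry and does not reproduce the RNAlysis API; the two step functions and the dictionary-based table layout are likewise assumptions.

```python
class Pipeline:
    """An ordered list of (function, keyword arguments) steps."""

    def __init__(self):
        self.steps = []

    def add(self, func, **params):
        """Append a step; returns self so that calls can be chained."""
        self.steps.append((func, params))
        return self

    def apply(self, table):
        """Run every step in order, feeding each result into the next."""
        for func, params in self.steps:
            table = func(table, **params)
        return table

    def describe(self):
        """A plain-data summary of the steps, suitable for saving or sharing."""
        return [(func.__name__, dict(params)) for func, params in self.steps]


def drop_missing(table):
    """Remove genes that have a missing value in any sample."""
    return {g: row for g, row in table.items() if None not in row}


def keep_min_expression(table, threshold):
    """Keep genes whose value reaches `threshold` in at least one sample."""
    return {g: row for g, row in table.items() if max(row) >= threshold}


pipeline = Pipeline().add(drop_missing).add(keep_min_expression, threshold=10)

# The same steps, with the same settings, applied to two toy tables.
experiment_1 = {"geneA": [1, None], "geneB": [5, 20], "geneC": [2, 3]}
experiment_2 = {"geneX": [50, 60], "geneY": [1, 1]}
results = [pipeline.apply(t) for t in (experiment_1, experiment_2)]
```

Storing the parameters alongside the functions is what makes the analysis repeatable: the output of `describe()` records exactly what was done and in which order.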

6. Enrichment Analysis

Users can do enrichment analysis for specific gene sets using the Enrichment window after using the aforementioned techniques to narrow down data tables to gene sets of interest. A group of techniques known as "gene set enrichment analysis" can be used to find gene classes, biological processes, or pathways that are over- or underrepresented in a gene set of interest.

For enrichment analysis, RNAlysis offers a variety of approaches and statistical methods, such as classical gene-set enrichment analysis, permutation tests, background-free enrichment analysis, and enrichment for ordinal or continuous variables [4]. RNAlysis can automatically retrieve enrichment annotations for all major model organisms from well-known resources such as Gene Ontology and KEGG pathways [5].

In contrast to most other analysis pipelines, however, RNAlysis also accepts annotations for user-defined attributes and categories. This lets users tailor their analyses to their own needs and biological questions.
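To make the notion of over-representation concrete, the sketch below computes a one-sided hypergeometric p-value, a test commonly used for this kind of enrichment analysis. The hypergeometric test is chosen here as an assumption for illustration, since the text above does not name the specific tests RNAlysis uses. The function and parameter names are hypothetical, and the numbers are made up.

```python
from math import comb


def hypergeometric_enrichment_pvalue(background_size, annotated_in_background,
                                     set_size, annotated_in_set):
    """P(X >= annotated_in_set) when drawing `set_size` genes at random.

    background_size:         number of genes in the background
    annotated_in_background: background genes carrying the attribute
    set_size:                number of genes in the gene set of interest
    annotated_in_set:        genes in the set carrying the attribute
    """
    total = comb(background_size, set_size)
    not_annotated = background_size - annotated_in_background
    upper = min(annotated_in_background, set_size)
    # Sum the probability of every outcome at least as extreme as observed.
    favourable = sum(
        comb(annotated_in_background, i) * comb(not_annotated, set_size - i)
        for i in range(annotated_in_set, upper + 1)
    )
    return favourable / total


# Toy example: 40 of 1000 background genes carry an attribute,
# and 8 of the 50 genes in the set of interest carry it as well.
p_value = hypergeometric_enrichment_pvalue(1000, 40, 50, 8)
```

By chance alone, about 2 of the 50 genes would be expected to carry the attribute (50 × 40 / 1000), so observing 8 yields a small p-value. In practice many attributes are tested at once, and the resulting p-values would need to be corrected for multiple testing.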

7. Epilogue

RNAlysis can be used entirely within its graphical interface. In addition, all of its features and functions can be imported and used in ordinary Python scripts, allowing users with coding skills to further automate and customise their bioinformatic analyses. RNAlysis comes with extensive documentation to help both new and experienced users. A User Guide gives a bird's-eye view of the modules and capabilities of RNAlysis, along with video demonstrations, usage examples, and recommended practices.

In the rest of the documentation, users can look up specific functions and features to learn more about their theoretical background, use cases, and optional parameters. The project is hosted in a public, open-source GitHub repository. The package also has a large suite of tests that run automatically every time the source code is changed, to help ensure that data analysis with RNAlysis is consistent and reliable.

References

  1. Guy Teichman, Dror Cohen, Or Ganon, Netta Dunsky, Shachar Shani, Hila Gingold, Oded Rechavi. RNAlysis: analyze your RNA sequencing data without writing a single line of code. bioRxiv. 2022, 10, 14.
  2. Francis R. G. Amrit, Arjumand Ghazi. Transcriptomic Analysis of C. elegans RNA Sequencing Data Through the Tuxedo Suite on the Galaxy Project. Journal of Visualized Experiments. 2017, 122, 55.
  3. Carlos Prieto, David Barrios. RaNA-Seq: Interactive RNA-Seq analysis from FASTQ files to functional analysis. Bioinformatics. 2019, 19, 93.
  4. Guy Teichman, Dror Cohen, Or Ganon, Netta Dunsky, Shachar Shani, Hila Gingold, Oded Rechavi. RNAlysis: analyze your RNA sequencing data without writing a single line of code. BMC Biology. 2023, 21, 34.
  5. Minoru Kanehisa, Susumu Goto. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000, 10, 93.
Update Date: 12 Dec 2023