IDOM/DRC FGC & PSOM NGSC Cores: Analysis - Tools

Introduction

We routinely run the RUM-MultipleComparisons pipeline to assess RNA-Seq data. Although the tool includes the word 'RUM' in its name, it can work with gene expression values from a variety of RNA-Seq tools.

We are still expanding the analyses that RUM-MultipleComparisons performs, but at the moment it includes these basic steps:

Assembles a table of the raw data
Filters the table to consider just transcripts
Performs quantile normalization of the values
Runs a series of k-means clusterings of the data and displays the results as heatmaps
Generates MvA plots of averages for all conditions
Generates MvA plots of replicates within a condition
Tabulates fold-changes between average values for all conditions
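
As a rough illustration of the normalization step, here is a minimal Python sketch of a log2 transform followed by quantile normalization. It is not the pipeline's own code: the function names, the pseudocount of 1, and the simple handling of ties are assumptions made for this example, and the log2 step is only inferred from the Details-Lg2-Qn.tab file name.

```python
import math

def log2_counts(counts, pseudocount=1.0):
    """Log2-transform raw counts; the pseudocount (an assumption) avoids log2(0)."""
    return [math.log2(c + pseudocount) for c in counts]

def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists, one list per sample.

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n_rows = len(columns[0])
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns)
                  for i in range(n_rows)]
    normalized = []
    for col in columns:
        # Row indices of this sample ordered from lowest to highest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

# Two toy samples with three transcripts each (made-up numbers).
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(samples)
```

After normalization every sample contains the same set of values, only in a different order, which is what makes the samples directly comparable.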

What Files Should I Look At?

First, take a look at the plot files Replicates-mva.pdf and Kmeans-heatmap.pdf to see whether the samples have good intra-condition consistency. The heatmap file will also help you see whether the changes between conditions are consistent across samples, and roughly how many distinct expression patterns there are in the data set.

Once you can see that the data are OK, turn to the Averages.tab file or the appropriate Kmeans-*-clusters.tab file to see gene IDs. All of the tab files can be opened in Excel, which can be used to further filter the genes. Gene lists can also be created for use with functional analysis.
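
If you would rather script the filtering than use Excel, something like the following Python sketch could pull a gene list out of a tab-delimited file. The column names ("id", "log2fc") and the threshold are hypothetical, so check the header row of the actual file before adapting it.

```python
import csv
import io

def select_ids(tab_file, value_column, min_abs_value, id_column="id"):
    """Return IDs whose value in value_column has magnitude >= min_abs_value."""
    reader = csv.DictReader(tab_file, delimiter="\t")
    return [row[id_column] for row in reader
            if abs(float(row[value_column])) >= min_abs_value]

# Made-up rows standing in for a real file; for a file on disk, pass
# open("Averages.tab", newline="") instead (column names are assumptions).
example = io.StringIO(
    "id\tlog2fc\n"
    "NM_0001\t2.5\n"
    "NM_0002\t-0.2\n"
    "NR_0003\t-1.7\n"
)
gene_list = select_ids(example, "log2fc", 1.0)
print("\n".join(gene_list))
```

The printed IDs can be pasted into a functional analysis tool as a gene list.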

How Do We Usually Run It?

We usually focus on well-characterized RefSeqs, i.e., those with IDs like NM_* or NR_*.
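
For example, a filter on these ID prefixes could look like the Python sketch below. The function name and the sample IDs are made up for illustration, and the pipeline's actual filter may be implemented differently.

```python
import re

# NM_ = curated mRNA, NR_ = curated non-coding RNA in RefSeq.
# Predicted models (XM_, XR_) and non-RefSeq IDs are excluded.
REFSEQ_CURATED = re.compile(r"^(NM|NR)_\d+")

def is_curated_refseq(transcript_id):
    """True for IDs such as NM_* or NR_*, with or without a version suffix."""
    return REFSEQ_CURATED.match(transcript_id) is not None

# Illustrative IDs only, not taken from a real data set.
ids = ["NM_000001", "NR_000002.1", "XM_000003", "ENST00000000004"]
kept = [t for t in ids if is_curated_refseq(t)]
```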

What Does the Output Look Like?

Plots

AllPairs-mva.png - a comparison of all samples in the data set.
Kmeans-heatmap.pdf - series of heatmaps using different numbers of clusters. Yellow/white is high expression, red is low.
Pairs.pdf - MvA plots of all condition comparisons
Replicates-mva.pdf - MvA plots of replicates within a condition
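
For readers new to MvA plots: each point is one transcript, with M (the log ratio of the two samples) on the vertical axis and A (the average log expression) on the horizontal axis. The Python sketch below shows the usual definition; the function name and the pseudocount of 1 are assumptions, and the pipeline may compute its values somewhat differently.

```python
import math

def mva_point(x, y, pseudocount=1.0):
    """Return (M, A) for one transcript measured in two samples.

    M = log2(x) - log2(y)        difference between the samples
    A = (log2(x) + log2(y)) / 2  average expression level
    The pseudocount (an assumption) avoids taking log2 of zero.
    """
    log_x = math.log2(x + pseudocount)
    log_y = math.log2(y + pseudocount)
    return log_x - log_y, (log_x + log_y) / 2

# Made-up expression values for a single transcript in two samples.
m, a = mva_point(7, 1)
```

Good replicates give points that scatter tightly around M = 0 across the whole range of A.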

Tables of Data

AllTranscriptReadCounts-sql.tab - initial raw data
AllTranscriptReadCounts.tab - data filtered to just transcripts
Averages.tab - averages over conditions with fold-changes for all comparisons
Details-Lg2-Qn.tab - quantile normalized values for individual samples
Kmeans-04-clusters.tab - details of genes in each cluster.
Kmeans-05-clusters.tab
Kmeans-06-clusters.tab
...
Kmeans-28-clusters.tab
Kmeans-29-clusters.tab
Kmeans-30-clusters.tab
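
To make the contents of Averages.tab more concrete, here is a minimal Python sketch of averaging replicates per condition and then computing a fold-change for every pair of conditions. It assumes the inputs are log2 values and that averaging is done in log space; neither is confirmed for the pipeline, and the function and condition names are hypothetical.

```python
from itertools import combinations
from statistics import mean

def pairwise_fold_changes(log2_values_by_condition):
    """Average replicates per condition, then compare every pair of conditions.

    Input: {condition name: [log2 value for each replicate]} for one transcript.
    Returns (averages, fold_changes), where fold_changes maps (a, b) to the
    linear fold-change of condition b relative to condition a.
    """
    averages = {cond: mean(vals)
                for cond, vals in log2_values_by_condition.items()}
    fold_changes = {}
    for a, b in combinations(sorted(averages), 2):
        # A difference of log2 averages is a log2 fold-change.
        log2_fc = averages[b] - averages[a]
        fold_changes[(a, b)] = 2 ** log2_fc
    return averages, fold_changes

# Made-up log2 values for one transcript in two conditions.
averages, fold_changes = pairwise_fold_changes(
    {"control": [3.0, 5.0], "treated": [6.0, 6.0]}
)
```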

Saturday, September 1, 2012

Analysis - Tools - RUM-MultipleComparisons
