IDOM/DRC FGC & PSOM NGSC Cores: September 2012

The Functional Genomics Core (FGC) and the Next-Generation Sequencing Core (NGSC) provide similar services, but with some important differences.
Here is a summary to help you decide which core is right for your project.

FGC

high-throughput sequencing for IDOM/DRC members
downstream data analysis for IDOM/DRC members as capacity allows
Agilent microarrays for IDOM, UPenn, and academic clients
limited RNA-Seq library prep for IDOM/DRC members

NGSC

high-throughput sequencing for UPenn and academic clients
standardized basic preliminary data analysis for UPenn and academic clients
limited RNA-Seq library prep for UPenn and academic clients

For the NGSC and FGC, prices are higher for external clients.

The good news is that you talk to the same people no matter which core you use.

Introduction

We routinely run the RUM-MultipleComparisons pipeline to assess RNA-Seq data. Although the tool includes the word 'RUM' in its name, it can work with gene expression values from a variety of RNA-Seq tools.

We are still expanding the analyses that RUM-MultipleComparisons performs, but at the moment it includes these basic steps.

Assembles a table of the raw data
Filters the table to consider just transcripts
Performs quantile normalization of the values
Runs a series of k-means clusterings of the data and displays the results as heatmaps
Generates MvA plots of averages for all conditions
Generates MvA plots of replicates within a condition
Tabulates fold-changes between average values for all conditions
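
To make the quantile normalization step concrete, here is a minimal sketch in plain Python. It is not the pipeline's own code: the function name, the list-per-sample layout, and the tie handling (ties keep their row order) are assumptions made for illustration only.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (one list per sample).

    Illustrative sketch only; not taken from RUM-MultipleComparisons.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all samples must have the same number of values")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns)
        for k in range(n_rows)
    ]

    normalized = []
    for col in columns:
        # Row indices ordered from smallest to largest value in this sample.
        # sorted() is stable, so tied values keep their original row order.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, row in enumerate(order):
            out[row] = reference[rank]
        normalized.append(out)
    return normalized


# Two made-up samples with three transcripts each.
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(result)
```

After normalization every sample has the same set of values, and only the ranking of transcripts within each sample differs.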

What Files Should I Look At?

First, take a look at the Replicates-mva.pdf and Kmeans-heatmap.pdf plot files to check that the samples are consistent within each condition. The heatmap file will also help you see whether the changes between conditions are consistent across samples, and roughly how many distinct expression patterns the data set contains.

Once you have confirmed that the data look OK, turn to the Averages.tab file or the appropriate Kmeans-*-clusters.tab file to see gene IDs. All of the tab files can be opened in Excel, which can be used to filter the genes further. Gene lists can also be created for use with functional analysis.
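
If you would rather filter a tab file with a script than in Excel, something along these lines may work. This is only a sketch: the column names (`id`, `log2fc_B_vs_A`), the example IDs, and the assumption that the column holds log2 fold-changes are all hypothetical, so check the header of your own Averages.tab before adapting it.

```python
import csv
import io


def select_genes(handle, value_column, threshold, id_column):
    """Return IDs of rows whose |value_column| is at least threshold."""
    reader = csv.DictReader(handle, delimiter="\t")
    selected = []
    for row in reader:
        try:
            value = float(row[value_column])
        except (TypeError, ValueError):
            continue  # skip blank or non-numeric cells
        if abs(value) >= threshold:
            selected.append(row[id_column])
    return selected


# Made-up rows standing in for a tab file; the real header will differ.
example = io.StringIO(
    "id\tlog2fc_B_vs_A\n"
    "NM_000001\t2.3\n"
    "NM_000002\t-0.4\n"
    "NR_000003\t-1.7\n"
)
genes = select_genes(example, "log2fc_B_vs_A", 1.0, "id")
print(genes)
```

For a real file, replace the `io.StringIO` object with `open("Averages.tab", newline="")`. The resulting list of IDs can be saved as a gene list for functional analysis.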

How Do We Usually Run It?

We usually focus on well-characterized RefSeqs, i.e., those with IDs like NM_* or NR_*.
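
As an illustration of what this filter amounts to, the sketch below keeps only IDs with the NM_ (mRNA) or NR_ (non-coding RNA) prefix. The regular expression, the function name, and the example IDs are assumptions for illustration; the pipeline's actual filter may be written differently.

```python
import re

# NM_ and NR_ accessions, optionally followed by a version suffix such as ".5".
CURATED_REFSEQ = re.compile(r"(NM|NR)_\d+(\.\d+)?")


def is_curated_refseq(transcript_id):
    """True if the whole ID looks like an NM_* or NR_* RefSeq accession."""
    return CURATED_REFSEQ.fullmatch(transcript_id) is not None


# Made-up example IDs: two curated RefSeqs, one predicted model, one non-RefSeq.
ids = ["NM_001101.5", "NR_046018", "XM_005249", "ENST00000367770"]
kept = [i for i in ids if is_curated_refseq(i)]
print(kept)
```

Predicted RefSeq models (XM_*, XR_*) and IDs from other annotation sources are dropped by this filter.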

What Does the Output Look Like?

Plots

AllPairs-mva.png - a comparison of all samples in the data set.
Kmeans-heatmap.pdf - series of heatmaps using different numbers of clusters. Yellow/white is high expression, red is low.
Pairs.pdf - MvA plots of all condition comparisons
Replicates-mva.pdf - MvA plots of replicates within a condition
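
For readers new to MvA plots: each point is one transcript, with M (the log2 ratio of the two samples) on the vertical axis and A (the average log2 expression) on the horizontal axis. The sketch below computes these values using the standard definition. The pseudocount of 1 that guards against log2(0) is an assumption, and the pipeline may compute its values differently.

```python
import math


def mva_points(x_values, y_values, pseudocount=1.0):
    """Return a list of (M, A) pairs for two samples of raw expression values.

    M = log2(x) - log2(y)        (log ratio)
    A = (log2(x) + log2(y)) / 2  (mean log expression)
    """
    points = []
    for x, y in zip(x_values, y_values):
        lx = math.log2(x + pseudocount)
        ly = math.log2(y + pseudocount)
        points.append((lx - ly, (lx + ly) / 2.0))
    return points


# Made-up values for three transcripts in two replicates.
points = mva_points([7, 3, 0], [1, 3, 3])
print(points)
```

For good replicates, the points should scatter tightly around M = 0 across the whole range of A. If the input values are already log2-transformed, M is simply the difference and A the mean, with no pseudocount needed.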

Tables of Data

AllTranscriptReadCounts-sql.tab - initial raw data
AllTranscriptReadCounts.tab - data filtered to just transcripts
Averages.tab - averages over conditions with fold-changes for all comparisons
Details-Lg2-Qn.tab - quantile normalized values for individual samples
Kmeans-04-clusters.tab - details of genes in each cluster.
Kmeans-05-clusters.tab
Kmeans-06-clusters.tab
...
Kmeans-28-clusters.tab
Kmeans-29-clusters.tab
Kmeans-30-clusters.tab

Friday, September 7, 2012

FGC or NGSC - which core to use?

Saturday, September 1, 2012

Analysis - Tools - RUM-MultipleComparisons