Monday, December 17, 2012

Analysis - Tools - Comparison

We run this tool to do basic differential analysis.  It is best used for RNA-Seq data, but can be used for other data as well.

Files

By default the files are created in Analysis/DiffExp.  In this directory, you may find multiple analyses which use different data and/or parameters.  Looking inside one of these directories, you will see 3 to 4 files called Compare.*, the most useful of which is Compare.tab.xls.

Compare.tab.xls

This file contains the comparison data.  The contents are somewhat flexible, but will follow this outline.

Each row is a transcript.  The first few columns contain the gene, transcript, and 'Best' (an indicator that guides you to the best transcript for each gene).

The next set of columns holds the comparisons.  Which comparisons are done depends on the experiment.  For each comparison there are 6 columns:
  1. MVA:M:Test:Control - log2 test/control fold change 
  2. MVA:A:Test:Control - log2 average expression
  3. EDGE:A:Test:Control - log2 average expression
  4. EDGE:M:Test:Control - log2 test/control fold change
  5. EDGE:pv:Test:Control  - 0-1 p-value
  6. EDGE:FDR:Test:Control - 0-1 FDR from p-value using Benjamini-Hochberg correction
The first field in each column title (the part before the first colon) indicates the tool used to produce the data in that column.
  1. MVA is a simple MvA comparison with no test of statistical significance.
  2. EDGE is the edgeR package, which performs differential gene expression analysis on RNA-Seq data.
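If you want to work with these columns in a script, the titles can be split on the colons.  The following is a minimal sketch, not part of the tool itself.  It assumes every comparison title has exactly four colon-separated fields and that sample names contain no colons; the function names and the sample names 'KO' and 'WT' are hypothetical.

```python
from collections import namedtuple

# Fields of a comparison column title such as 'EDGE:FDR:Test:Control'.
ComparisonColumn = namedtuple("ComparisonColumn", "tool stat test control")


def parse_comparison_column(title):
    """Split a comparison column title into its four fields.

    Returns None for titles that do not match the four-field layout,
    such as the gene, transcript, and per-sample columns.
    """
    parts = title.strip().split(":")
    if len(parts) != 4 or parts[0] not in ("MVA", "EDGE"):
        return None
    return ComparisonColumn(*parts)


def group_by_comparison(titles):
    """Map (test, control) to a dict of {(tool, stat): column title}."""
    groups = {}
    for title in titles:
        col = parse_comparison_column(title)
        if col is not None:
            key = (col.test, col.control)
            groups.setdefault(key, {})[(col.tool, col.stat)] = title
    return groups


# Hypothetical header row; 'KO' and 'WT' are made-up sample names.
header = ["Gene", "MVA:M:KO:WT", "EDGE:pv:KO:WT", "EDGE:FDR:KO:WT"]
columns = group_by_comparison(header)
```

Grouping by (test, control) makes it easy to pull the M, p-value, and FDR columns for one comparison without hard-coding their positions in the file.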
The data that is passed to the analysis programs has been quantile normalized.
M values are log2(Test/Control), so M=1 indicates a 2-fold increase in expression.
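To make the log2 scale concrete, here is a small sketch of converting between M values and plain Test/Control ratios.  The function names are hypothetical and are not part of the tool.

```python
import math


def fold_change_from_m(m):
    """Convert M = log2(Test/Control) back to a linear Test/Control ratio."""
    return 2.0 ** m


def m_from_expression(test, control):
    """Compute M = log2(Test/Control) from two positive expression values."""
    return math.log2(test / control)


# M = 1 is a 2-fold increase, M = -1 is a 2-fold decrease, M = 0 is no change.
up = fold_change_from_m(1)
down = fold_change_from_m(-1)
same = fold_change_from_m(0)
```

Because the scale is symmetric around 0, a 2-fold decrease (M=-1) is as far from 'no change' as a 2-fold increase (M=1).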
A values are the log2 of the average expression across the two conditions.  MvA and edgeR use different units: MVA is usually in reads, whereas edgeR values have been normalized to counts per million.
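A common next step is to pull out the transcripts that pass an FDR and fold change cutoff.  The sketch below assumes Compare.tab.xls is tab-delimited text with a single header row, which the file name suggests but which you should check.  The thresholds (FDR of 0.05 or less, absolute M of 1 or more), the 'Gene' column title, and the sample names 'KO' and 'WT' are illustrative assumptions.

```python
import csv
import io


def significant_rows(handle, test, control, max_fdr=0.05, min_abs_m=1.0):
    """Yield rows whose EDGE FDR and |M| pass the given thresholds.

    `handle` is an open text file (or any iterable of lines) holding
    tab-delimited data with a header row.
    """
    fdr_col = "EDGE:FDR:%s:%s" % (test, control)
    m_col = "EDGE:M:%s:%s" % (test, control)
    reader = csv.DictReader(handle, delimiter="\t")
    for name in (fdr_col, m_col):
        if name not in (reader.fieldnames or []):
            raise KeyError("column not found: " + name)
    for row in reader:
        try:
            fdr = float(row[fdr_col])
            m = float(row[m_col])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if fdr <= max_fdr and abs(m) >= min_abs_m:
            yield row


# Made-up example data standing in for a real Compare.tab.xls.
example = (
    "Gene\tEDGE:M:KO:WT\tEDGE:FDR:KO:WT\n"
    "A\t2.0\t0.01\n"
    "B\t0.2\t0.001\n"
    "C\t-1.5\t0.2\n"
    "D\t-3.0\t0.04\n"
    "E\tNA\tNA\n"
)
hits = [row["Gene"] for row in significant_rows(io.StringIO(example), "KO", "WT")]
```

With a real file you would pass `open(path, newline="")` instead of the `io.StringIO` object.  Filtering on FDR rather than the raw p-value accounts for the large number of transcripts being tested at once.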

The next set of columns of the file are quantile normalized log2 versions of the 'raw' data for the individual samples.
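For readers unfamiliar with quantile normalization, here is a minimal sketch of the general technique.  It is not the tool's own code: this post does not say how the tool breaks ties or where the log2 step happens, so the sketch covers only the normalization and breaks ties by position.

```python
def quantile_normalize(samples):
    """Quantile normalize equal-length lists of values, one list per sample.

    Each value is replaced by the mean, across samples, of the values
    that share its rank.  Ties are broken by position, a simplification.
    """
    if not samples:
        return []
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of values")
    # For each sample, the row indices ordered by increasing value.
    orders = [sorted(range(n), key=lambda i, s=s: s[i]) for s in samples]
    # Mean of the k-th smallest value across all samples.
    rank_means = [
        sum(s[order[k]] for s, order in zip(samples, orders)) / len(samples)
        for k in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        for k, i in enumerate(order):
            out[i] = rank_means[k]  # row i held the k-th smallest value
        normalized.append(out)
    return normalized


# Two made-up samples with three transcripts each.
result = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization every sample has the same set of values, only in a different order, which is what makes the samples directly comparable.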

The last set of columns contains the 'raw' data, which is usually reads.

Looking Deeper
Within each comparison folder is another folder called 'Heatmap'.  See http://fgc-ngsc-cores.blogspot.com/2012/09/analysis-tools-rum-multiplecomparisons.html for details about the files in this folder.
