--- title: "R Notebook" output: html_notebook --- DESeq2 tutorial website: (https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-to-get-help-for-deseq2) Harvard's workshop for *DESeq2* analysis: (https://hbctraining.github.io/DGE_workshop/lessons/04_DGE_DESeq2_analysis.html) We first need to load all the libraries we need to use in the script ```{r} library(DESeq2) library(airway) ``` For this example we will use the dataset that comes from the *airway* package: ```{r} data("airway") ``` # You may see the following type of instructions: # Prepare the data # Convert the data to a DESeqDataSet object ```{r} dds <- DESeqDataSetFromMatrix(countData = assay(airway), colData = colData(airway), design = ~ cell + dex) ``` Hmmmm... What is `assay(airway)` or `colData(airway)` or what does `~cell+dex` mean? Let's try to figure out. In R, anything that has round parenthesis is a function, so we can explore each function alone and see what objects it produce. Be careful though... Bioinformatics data might be very large! Explore the object before you try to print it! ```{r} x <- assay(airway) ``` First see what class of an object it is: ```{r} class(x) ``` Great! This is a matrix, so we should be able to find its dimensions: ```{r} dim(x) ``` Since there are many rows but only 8 columns we can take a look at the beginning of this matrix. `head()` function works for data frames, matrices and many other R objects. ```{r} head(x) ``` Great! By looking at the first few rows, we now know that this matrix has row names and row columns. Let's now move to the next function - `colData(airway)`. Let's see what it does: ```{r} x <- colData(airway) ``` Again, let's see the "class" of this object: ```{r} class(x) ``` The output says "DFrame". It is not exactly a *data frame* object we saw before, but maybe something similar? Let's try `head()` function: ```{r} head(x) ``` Ah! It does look like a data frame! How many rows and how many columns does it have? ```{r} dim(x) ``` The last argument that was passed to the `` function was `design = ~ cell + dex`. The expression that follows the tilde `~` sign is a *formula*. Usually it is a list of column names from a data frame that we want to include in our analysis as independent variables. So now, let's execute that function: ```{r} dds <- DESeqDataSetFromMatrix(countData = assay(airway), colData = colData(airway), design = ~ cell + dex) ``` The output is stored in a variable `dds`. What is in it? What can be done with it. Again, try using functions we used before: ```{r} class(dds) ``` Well, it is some "DESeq" Data Set. What can I do with it. Usually you will be given directions, but you also can look for help: ```{r} help(DESeqDataSet) ``` Let's use other funtions to explore what is hidden in this object: ```{r} head(dds) ``` Function `str()` is very helpful, but for very complex large objects, the output could be a little too big.... ```{r} str(dds) ``` Now we are ready to perform differential expression analysis ```{r} dds_a <- DESeq(dds) ``` And again, it might be a good idea to understand what did we get as a result. So we use the same functions - `class`, `head`, `str` to "see" what is hidden in this object ```{r} class(dds_a) ``` ```{r} head(dds_a) ``` And the tutorial (or lecture directions) tells us to use `results()` function to "view" the results. ```{r} # Extract results res <- results(dds_a) ``` ```{r} # View the results head(res) ``` ```{r} # Plot results plotMA(res, main="DESeq2", ylim=c(-2,2)) ``` ```{r} # Save results to a CSV file write.csv(as.data.frame(res), file="DESeq2_results.csv") ```