---
title: "R Notebook"
output: html_notebook
---
DESeq2 tutorial website:
(https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-to-get-help-for-deseq2)
Harvard's workshop for *DESeq2* analysis: (https://hbctraining.github.io/DGE_workshop/lessons/04_DGE_DESeq2_analysis.html)
We first need to load all the libraries we need to use in the script
```{r}
library(DESeq2)
library(airway)
```
For this example we will use the dataset that comes from the *airway* package:
```{r}
data("airway")
```
# You may see the following type of instructions:
# Prepare the data
# Convert the data to a DESeqDataSet object
```{r}
dds <- DESeqDataSetFromMatrix(countData = assay(airway),
colData = colData(airway),
design = ~ cell + dex)
```
Hmmmm... What is `assay(airway)` or `colData(airway)` or what does `~cell+dex` mean?
Let's try to figure out.
In R, anything that has round parenthesis is a function, so we can explore each
function alone and see what objects it produce.
Be careful though... Bioinformatics data might be very large!
Explore the object before you try to print it!
```{r}
x <- assay(airway)
```
First see what class of an object it is:
```{r}
class(x)
```
Great! This is a matrix, so we should be able to find its dimensions:
```{r}
dim(x)
```
Since there are many rows but only 8 columns we can take a look at the beginning of this matrix.
`head()` function works for data frames, matrices and many other R objects.
```{r}
head(x)
```
Great! By looking at the first few rows, we now know that this matrix has row names and row columns.
Let's now move to the next function - `colData(airway)`. Let's see what it does:
```{r}
x <- colData(airway)
```
Again, let's see the "class" of this object:
```{r}
class(x)
```
The output says "DFrame".
It is not exactly a *data frame* object we saw before, but maybe something similar?
Let's try `head()` function:
```{r}
head(x)
```
Ah! It does look like a data frame! How many rows and how many columns does it have?
```{r}
dim(x)
```
The last argument that was passed to the `` function was `design = ~ cell + dex`.
The expression that follows the tilde `~` sign is a *formula*. Usually it is a list of column names from a data frame that we want to include in our analysis as independent variables.
So now, let's execute that function:
```{r}
dds <- DESeqDataSetFromMatrix(countData = assay(airway),
colData = colData(airway),
design = ~ cell + dex)
```
The output is stored in a variable `dds`. What is in it? What can be done with it. Again, try using functions we used before:
```{r}
class(dds)
```
Well, it is some "DESeq" Data Set. What can I do with it. Usually you will be given directions, but you also can look for help:
```{r}
help(DESeqDataSet)
```
Let's use other funtions to explore what is hidden in this object:
```{r}
head(dds)
```
Function `str()` is very helpful, but for very complex large objects, the output could be a little too big....
```{r}
str(dds)
```
Now we are ready to perform differential expression analysis
```{r}
dds_a <- DESeq(dds)
```
And again, it might be a good idea to understand what did we get as a result. So we use the same functions -
`class`, `head`, `str` to "see" what is hidden in this object
```{r}
class(dds_a)
```
```{r}
head(dds_a)
```
And the tutorial (or lecture directions) tells us to use `results()` function to "view" the results.
```{r}
# Extract results
res <- results(dds_a)
```
```{r}
# View the results
head(res)
```
```{r}
# Plot results
plotMA(res, main="DESeq2", ylim=c(-2,2))
```
```{r}
# Save results to a CSV file
write.csv(as.data.frame(res), file="DESeq2_results.csv")
```