04 Data-Analysis

.title[
# 04 Data-Analysis
]
.subtitle[
## Elements of R and R Markdown
]
.author[
### Claudio Zandonella
]

---

### Environment

- `ls()`, list all objects in the environment 
- `rm(<name_object>)`, remove object from the environment 
- `rm(list = ls())`, remove all objects from the environment

> Use RStudio Projects!

### Libraries

- `install.packages("<package-name>")`, install package
- `library("<package-name>")`, load package
- `<package-name>::<function>()`, use function without loading package

]
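
A minimal sketch of the environment and library commands above. It runs inside a scratch environment (the `env` object and the `envir` argument are assumptions added here so that the demo does not delete anything from your own workspace); in an interactive session you would normally call `ls()` and `rm()` without `envir`.

```r
# Scratch environment, so nothing in the global workspace is touched
env <- new.env()
assign("x", 1:10, envir = env)
assign("y", letters, envir = env)

all_objects <- ls(envir = env)            # list objects: "x" "y"
rm("x", envir = env)                      # remove a single object
remaining <- ls(envir = env)              # only "y" is left
rm(list = ls(envir = env), envir = env)   # remove everything

# Use a function without attaching its package via library()
middle <- stats::median(c(1, 3, 10))
```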

### Working Directory

- `getwd()`, get the absolute path of the current working directory
- `setwd("<path>")`, set the working directory for the session

### Paths

- Favour **relative paths** over **absolute paths**.
- Use `"/"` (forward slash) to indicate paths
- `"./"` current working directory
- `"../"` parent directory

### Help!....I need somebody, Help!

- `?<name_function>`, get function documentation
- `??<name>`, search for matches in documentation

]
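
A short sketch of how the path conventions above behave. The folder and file names (`data`, `raw`, `scores.csv`) are hypothetical and only illustrate how a relative path is built; nothing is read from disk.

```r
wd <- getwd()                        # absolute path of the working directory

# Build a relative path with forward slashes (hypothetical file name)
rel <- file.path("data", "raw", "scores.csv")

here   <- normalizePath(".")         # "./" resolves to the working directory
parent <- normalizePath("..")        # "../" resolves to its parent directory

# Documentation lookups (run these interactively):
# ?mean          # help page of a function
# ??"quantile"   # search the documentation for a keyword
```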

---

<br>

<table class="table table-striped table-hover" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Format </th>
   <th style="text-align:left;"> Read </th>
   <th style="text-align:left;"> Write </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.rda</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">load("&lt;data&gt;.rda")</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">save(&lt;object&gt;, file = "&lt;data&gt;.rda")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.rds</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">readRDS("&lt;data&gt;.rds")</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">saveRDS(&lt;object&gt;, file = "&lt;data&gt;.rds")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.csv</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.csv("&lt;data&gt;.csv")</code> for <code class="remark-inline-code">,</code> as separator and <code class="remark-inline-code">.</code> as decimal separator </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.csv(&lt;object&gt;, file = "&lt;data&gt;.csv")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.csv2()</code> for <code class="remark-inline-code">;</code> as separator and <code class="remark-inline-code">,</code> as decimal separator </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.csv2(&lt;object&gt;, file = "&lt;data&gt;.csv")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.txt</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.table("&lt;data&gt;.txt", sep = )</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.table(&lt;object&gt;, file = "&lt;data&gt;.txt")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.xlsx</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">readxl::read_xlsx("&lt;data&gt;.xlsx")</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">really??</code> </td>
  </tr>
</tbody>
</table>
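
The read/write pairs in the table can be checked with a small round trip. This is only a sketch: the data frame `df` and the file names are made up for illustration, and everything is written to a temporary directory.

```r
df  <- data.frame(id = 1:3, score = c(2.5, 3.75, 4))
tmp <- tempdir()

# .rds: stores a single object, which you assign to a name on reading
rds_file <- file.path(tmp, "scores.rds")
saveRDS(df, file = rds_file)
df_rds <- readRDS(rds_file)

# .rda: stores named objects; load() restores them under their original names
rda_file <- file.path(tmp, "scores.rda")
save(df, file = rda_file)
loaded <- load(rda_file)      # returns the names of the restored objects

# .csv with "," as separator and "." as decimal mark
csv_file <- file.path(tmp, "scores.csv")
write.csv(df, file = csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file)

# .csv with ";" as separator and "," as decimal mark
csv2_file <- file.path(tmp, "scores2.csv")
write.csv2(df, file = csv2_file, row.names = FALSE)
df_csv2 <- read.csv2(csv2_file)
```

Setting `row.names = FALSE` keeps `write.csv()` from adding an extra column of row names that would otherwise come back as a variable when the file is read again.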

---
class: size-small

# Descriptive Analysis

<br>

.pull-left-50[
- `summary()`, summary information
- `mean()`, arithmetic mean 
- `median()`, median value
- `quantile()`, sample quantiles
- `max()`, maximum value
- `min()`, minimum value
- `range()`, range of values
- `sd()`, standard deviation 
- `var()`, variance
- `cor()`, correlation
- `cov()`, covariance
- `table()`, frequency table
]

### Utility Functions
- `paste()`, concatenate strings (separated by a space by default)
- `paste0()`, concatenate strings without a separator
- `head()`, return first rows
- `cumsum()`, cumulative sum
- `cumprod()`, cumulative product
- `round()`, rounding numbers
- `scale()`, scaling and centering values
- `unique()`, get unique elements
- `duplicated()`, determine duplicated elements
- `complete.cases()`, find complete cases
]
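
A few of the descriptive and utility functions applied to a toy vector. The values in `x` and the small data frame `d` are invented for illustration.

```r
x <- c(2, 4, 4, 6, 9)

m   <- mean(x)          # 5
med <- median(x)        # 4
v   <- var(x)           # 7
s   <- round(sd(x), 2)  # square root of the variance, rounded to 2 decimals
rng <- range(x)         # minimum and maximum: 2 9
freq <- table(x)        # frequency of each distinct value

cs  <- cumsum(x)        # 2 6 10 16 25
u   <- unique(x)        # 2 4 6 9
dup <- duplicated(x)    # TRUE only for the second 4

lab_space <- paste("item", 1)    # "item 1"
lab_tight <- paste0("item", 1)   # "item1"

# complete.cases() flags the rows of a data frame without missing values
d  <- data.frame(a = c(1, NA, 3), b = c("x", "y", "z"))
ok <- complete.cases(d)          # TRUE FALSE TRUE
```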

---
class: size-small
# Plots

.pull-left-50[
- `plot(x, y, ...)`, scatter plot
- `plot(x, y, type = "l",...)`, line plot
- `barplot(height, ...)`, barplot
- `hist(x, ...)`, histogram
- `boxplot(formula, ...)`, boxplot
- `plot(density(x, ...))`, kernel density
- `pairs()`, scatter plot matrix

### Plot Options

- `col`, color of lines and points 
- `main`, main plot title
- `xlab` / `ylab`, label for x/y axis
- `xlim` / `ylim`, limits for x/y axis
- `legend()`, add legend
- `par(mfrow = c(<n-rows>, <n-cols>))`, <br>define plots grid

]

.pull-right-50[
<img src="04-R-data-analysis_files/figure-html/example-plot-1.png" style="display: block; margin: auto;" />

]
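
A sketch combining some of the plot functions and options listed on this slide. The simulated data and the choice to draw into a temporary PDF file are assumptions made so that the example runs in any session; interactively you would skip `pdf()` and `dev.off()` and plot straight to the screen.

```r
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)

out <- tempfile(fileext = ".pdf")
pdf(out)                            # open a file-based graphics device
old_par <- par(mfrow = c(1, 2))     # grid with 1 row and 2 columns

plot(x, y, col = "blue", main = "Scatter plot", xlab = "x", ylab = "y")
legend("topleft", legend = "observations", col = "blue", pch = 1)

hist(x, main = "Histogram", xlab = "x")

par(old_par)                        # restore the previous settings
dev.off()                           # close the device and write the file
```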

---
class: size-small
# Statistics

.pull-left-50[
### Tests
- `cor.test()`, test correlation
- `t.test()`, Student's t-test
- `wilcox.test()`, Wilcoxon rank sum and signed rank tests
- `fisher.test()`, Fisher's exact test
- `chisq.test()`, Pearson's chi-squared test 
- `shapiro.test()`, Shapiro-Wilk normality test

### Models

- `lm()`, fit linear models
- `glm()`, fit generalized linear models
- `anova()`, compute analysis of variance tables for fitted models
]
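
The test and model functions above can be tried on the built-in `mtcars` data. The choice of dataset and variables is an assumption made for illustration only, not part of the original slides.

```r
# Correlation test between fuel consumption and weight
ct <- cor.test(mtcars$mpg, mtcars$wt)

# t-test comparing mpg between automatic and manual cars
# (t.test() applies the Welch correction by default)
tt <- t.test(mpg ~ am, data = mtcars)

# Linear models, then a comparison of the two nested fits
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
cmp  <- anova(fit1, fit2)

# Generalized linear model: logistic regression for a binary outcome
gfit <- glm(am ~ wt, data = mtcars, family = binomial)

summary(fit1)   # coefficients, standard errors, R-squared
```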

`<outcome> ~ <predictors>`

- `1` intercept term (included by default)
- `a + b` additive effects
- `a:b` interaction effect
- `a*b` interaction and lower-order terms (`a + b + a:b`)
- `(a + b + ...)^n` include interactions up to order `n`
- `- a` remove term

### Contrasts

- `contr.treatment()`, dummy coding 
- `contr.sum()`, sum-to-zero contrasts
- `contr.poly()`, orthogonal polynomial contrasts
- `contr.helmert()`, Helmert contrasts
]
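
One way to see what a formula expands to is to inspect its terms. The sketch below uses `terms()` on formulas with placeholder variable names (`y`, `a`, `b`, `c` are hypothetical and need not exist), followed by two of the contrast matrices for a factor with three levels.

```r
# Term labels produced by the formula operators
star  <- attr(terms(y ~ a * b), "term.labels")          # "a" "b" "a:b"
power <- attr(terms(y ~ (a + b + c)^2), "term.labels")  # main effects + 2-way
minus <- attr(terms(y ~ a * b - a:b), "term.labels")    # "a" "b"

# Removing the intercept with "- 1"
no_int <- attr(terms(y ~ a - 1), "intercept")           # 0

# Contrast matrices for a 3-level factor
ctr_treat <- contr.treatment(3)   # dummy coding, first level as reference
ctr_sum   <- contr.sum(3)         # each column sums to zero
```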

---
class: size-small
# Top Packages

- `lme4`, fit linear and generalized linear mixed-effects models
- `brms`, fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference.
- `metafor`, a comprehensive collection of functions for conducting meta-analyses in R.
- `lavaan`, fit a variety of latent variable models, including confirmatory factor analysis, structural equation modeling and latent growth curve models
]

- `car`, variety of tests and diagnostics for regression analysis
- `performance`, utilities for computing measures to assess model quality.
- `emmeans`, obtain estimated marginal means (EMMs) and contrasts for many linear, generalized linear, and mixed models.

]
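
As an illustration of one of these packages, here is a minimal mixed-effects model with `lme4`. It is only a sketch: it assumes `lme4` may or may not be installed (hence the `requireNamespace()` guard) and uses the `sleepstudy` data that ships with the package.

```r
fit <- NULL
if (requireNamespace("lme4", quietly = TRUE)) {
  # Random intercept and random slope of Days for each Subject
  fit <- lme4::lmer(Reaction ~ Days + (Days | Subject),
                    data = lme4::sleepstudy)
  print(summary(fit))
} else {
  message('lme4 is not installed; run install.packages("lme4") first')
}
```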
---
class: end, middle, center

# Thanks!