R is a scripting language^{1} whose main purpose is conducting tasks related to statistics, written mainly by academics for academics. These days, though, things have gotten slightly out of hand and R has become one of the most popular languages, especially in the field of “data science”. The biggest advantage of R is its huge collection of packages (the R equivalent of “There is an app for that”) and its developer support.

Other main points to know about R are as follows.

- R is mainly based on vector operations.^{2}
- R inherently does not support parallel processing. (There are solutions, but it is still frustrating.)
- R does all computations in memory. This is a bit problematic for bigger data (>10M rows) applications, but there are solutions as well.
- Even though R packages have magnificent reporting tools (e.g. this document is prepared using R Markdown), R is not as suitable for general-purpose use (especially web development) as Python.
- Package management is mainly done through the CRAN repository, though these days there are many packages in other sources as well. For instance, putting R packages on GitHub before CRAN for testing is quite popular.
- There is a Microsoft version of R with additional abilities (it supports Mac and Linux, too).

A list of resources with links and explanations will be given at the end of this document.

- Download R from https://cran.r-project.org/ (the latest release is 3.4.2 as of Sep 29, 2017) and install it. Make sure it is working.
- It is recommended to choose an editor (see R 101 below for alternatives).

This part lays out the very basics of R. The content is mainly about data types (numeric, character and logical), object types (vectors, matrices, data frames and lists) and basic operations. Before starting, check the following tips.

- Commenting the code is done with `#` for each line. There is no block comment like the `/* */` used in C, but there are block comment keyboard shortcuts in most editors.
- If you need information about any object, just put `?` before the object. For example, try `?mean` to get information about the function `mean`.
- You will need a good code editor or an IDE (Integrated Development Environment). The most popular choice of IDE for R is RStudio (remember to install R first). Recommended alternatives are Atom and Sublime Text with proper add-ons.

Values can be assigned to variables with the assignment operator `<-` or `=`.^{3} For example, let’s assign a numeric value to the variable `x`.^{4} You don’t need to declare a variable beforehand; assigning a value is enough.

```
x <- 522
x
```

`## [1] 522`

You can also assign character strings,

```
x <- "BDA503"
x
```

`## [1] "BDA503"`

and logical values. (There is also a factor type, but it is skipped for now.)

```
x <- FALSE
x
```

`## [1] FALSE`
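If you are not sure which data type a value has, `class()` is a quick way to check. A minimal sketch (the variable names below are made up for illustration):

```r
num_val <- 522       # a numeric value
chr_val <- "BDA503"  # a character string
lgl_val <- FALSE     # a logical value

class(num_val)  # "numeric"
class(chr_val)  # "character"
class(lgl_val)  # "logical"
```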

In this part, object types such as `vector`, `matrix`, `data.frame` and `list` are explained. Although this is not a complete list (e.g. `array` is another object type) and object is a more general concept, these object types are mostly sufficient at beginner and intermediate levels.

The most basic data structure is a vector. You can create a simple vector with `c()` (**c**ombine).

```
x <- c(5,2,2)
x
```

`## [1] 5 2 2`

You can change any value in a vector by referring to its index. Indexing starts at 1.

```
x[2] <- 7
x
```

`## [1] 5 7 2`

You can exclude a value by using a negative index. Here, every value except the second one is set to 0.

```
x[-2] <- 0
x
```

`## [1] 0 7 0`

R handles out of bounds index values. If you assign a value beyond the end of a vector, R extends the vector and fills the gap with `NA`.

```
x[5] <- 10
x
```

`## [1] 0 7 0 NA 10`
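Reading and assigning behave differently for out of bounds indices. The small sketch below uses a separate vector `y` (a name chosen only for this example) to show both cases.

```r
y <- c(5, 7, 2)
y[5]         # reading beyond the end returns NA
length(y)    # still 3, reading does not change the vector
y[5] <- 10   # assigning beyond the end extends the vector
y            # 5 7 2 NA 10
length(y)    # now 5
```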

You can select multiple index values at once, or define rules (logical conditions) to choose the index.

```
x2 <- 10:19 #This is a special representation that generates a vector from a (10) to b (19).
x2[c(1,3,7)] #Return 1st, 3rd and 7th values.
```

`## [1] 10 12 16`

`x2[(1:3)] #Return 1st to 3rd values.`

`## [1] 10 11 12`

`x2[x2>15] #Return the values where x2 > 15`

`## [1] 16 17 18 19`

`x2>15`

`## [1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE`

You can give names instead of index values.

```
x3<-c(1,2,3)
names(x3)<-c("a1","b2","c3")
x3
```

```
## a1 b2 c3
## 1 2 3
```

`x3["b2"]`

```
## b2
## 2
```

If you try to combine different data types, R will convert (coerce) them to a common type, either numeric or character.

`c(5,FALSE)`

`## [1] 5 0`

`c(5,FALSE,"BDA503")`

`## [1] "5" "FALSE" "BDA503"`
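The conversion always goes towards the more general type: logical, then integer, then double, then character. A short sketch with `typeof()` makes the order visible.

```r
typeof(c(TRUE, FALSE))         # "logical"
typeof(c(5L, FALSE))           # "integer" (5L is an integer literal)
typeof(c(5, FALSE))            # "double"
typeof(c(5, FALSE, "BDA503"))  # "character"
```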

Mathematical operations can be easily done with vectors.

```
vec1 <- 1:5 # This is a special representation of consecutive numbers.
vec1
```

`## [1] 1 2 3 4 5`

```
vec2 <- vec1 * 2
vec2
```

`## [1] 2 4 6 8 10`

`vec1 + vec2`

`## [1] 3 6 9 12 15`

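Vectors do not need to be of equal length (though it is recommended). The shorter vector is recycled, i.e. repeated until it matches the length of the longer one.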

```
vec1 <- 1:6
vec2 <- 3:5
vec1 + vec2
```

`## [1] 4 6 8 7 9 11`
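Recycling is silent only when the longer length is a multiple of the shorter one. Otherwise R still returns a result but raises a warning. The sketch below (with made-up vector names) catches that warning with `tryCatch()` just to show that it exists.

```r
short_vec <- 1:2
long_vec <- 1:6
long_vec + short_vec   # 2 4 4 6 6 8, short_vec is repeated three times

# 5 is not a multiple of 2, so R warns about the partial recycling.
res <- tryCatch(1:5 + 1:2, warning = function(w) conditionMessage(w))
res                    # the text of the warning message
```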

A matrix is more like a stylized vector in a rectangular (matrix) format with some special functions.

```
mat1<-matrix(1:9, ncol=3, nrow=3)
mat1
```

```
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
```
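Note that the values fill the matrix column by column. If you would rather fill row by row, `matrix()` has a `byrow` argument. A small sketch with hypothetical object names:

```r
mat_by_col <- matrix(1:6, nrow = 2)                # filled column by column (default)
mat_by_row <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row

mat_by_col[1, ]  # first row: 1 3 5
mat_by_row[1, ]  # first row: 1 2 3
mat_by_col[, 2]  # second column: 3 4
```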

You can manipulate a value of a matrix by giving its index value.

```
mat1[2,2] <- -10
mat1
```

```
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 -10 8
## [3,] 3 6 9
```

Here are some basic matrix operations.

```
mat2 <- matrix(c(0,4,1,2,0,0,0,0,1),ncol=3)
mat2
```

```
## [,1] [,2] [,3]
## [1,] 0 2 0
## [2,] 4 0 0
## [3,] 1 0 1
```

`t(mat2) # Transpose of a matrix`

```
## [,1] [,2] [,3]
## [1,] 0 4 1
## [2,] 2 0 0
## [3,] 0 0 1
```

`solve(mat2) # Inverse of a matrix`

```
## [,1] [,2] [,3]
## [1,] 0.0 0.25 0
## [2,] 0.5 0.00 0
## [3,] 0.0 -0.25 1
```

`det(mat2) # Determinant value of a matrix`

`## [1] -8`

`dim(mat2) # Dimensions of a matrix`

`## [1] 3 3`

`nrow(mat2) # Number of rows of a matrix`

`## [1] 3`

`ncol(mat2) # Number of columns of a matrix`

`## [1] 3`

`diag(mat2) # Diagonal values of a matrix`

`## [1] 0 0 1`

`eigen(mat2) # Eigenvalues and eigenvectors of a matrix`

```
## eigen() decomposition
## $values
## [1] 2.828427 -2.828427 1.000000
##
## $vectors
## [,1] [,2] [,3]
## [1,] 0.5505553 0.5708950 0
## [2,] 0.7786028 -0.8073674 0
## [3,] 0.3011087 -0.1491200 1
```

`mat1 %*% mat2 # Matrix multiplication`

```
## [,1] [,2] [,3]
## [1,] 23 2 7
## [2,] -32 4 8
## [3,] 33 6 9
```

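You can also do element by element (vector) operations with matrices. Note that `*` multiplies element by element, unlike the matrix multiplication `%*%` above.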

`mat1 + mat2`

```
## [,1] [,2] [,3]
## [1,] 1 6 7
## [2,] 6 -10 8
## [3,] 4 6 10
```

`mat1 - mat2`

```
## [,1] [,2] [,3]
## [1,] 1 2 7
## [2,] -2 -10 8
## [3,] 2 6 8
```

`mat1 / mat2`

```
## [,1] [,2] [,3]
## [1,] Inf 2 Inf
## [2,] 0.5 -Inf Inf
## [3,] 3.0 Inf 9
```

`mat1 * mat2`

```
## [,1] [,2] [,3]
## [1,] 0 8 0
## [2,] 8 0 0
## [3,] 3 0 9
```
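The difference between `*` and `%*%` is easy to check with the inverse: a matrix times its inverse should give the identity matrix only under matrix multiplication. A minimal sketch, using a fresh matrix `A` with the same values as `mat2`:

```r
A <- matrix(c(0, 4, 1, 2, 0, 0, 0, 0, 1), ncol = 3)  # same values as mat2
A_inv <- solve(A)

A %*% A_inv                            # matrix product: identity matrix (up to rounding)
all.equal(A %*% A_inv, diag(3))        # TRUE
isTRUE(all.equal(A * A_inv, diag(3)))  # FALSE, the element-wise product is not the identity
```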

You can do operations with matrices and vectors together. The matrix is then treated like a vector in column order (i.e. values run from top to bottom, then continue in the next column), and the vector is recycled along it.

```
mat3 <- matrix(1:9,ncol=3)
mat3
```

```
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
```

```
vec <- c(0,1,0)
mat3 + vec
```

```
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 3 6 9
## [3,] 3 6 9
```

`mat3 * vec`

```
## [,1] [,2] [,3]
## [1,] 0 0 0
## [2,] 2 5 8
## [3,] 0 0 0
```
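Because of the column order, a vector of length 3 lines up with the rows of a 3x3 matrix, not with its columns. If you want to apply the vector across columns instead, one option is `sweep()`; transposing twice is another. The names `m` and `col_add` below are only for this sketch.

```r
m <- matrix(1:9, ncol = 3)
col_add <- c(0, 1, 0)

m + col_add                # recycled down the columns: 1 is added to the second row
sweep(m, 2, col_add, "+")  # 1 is added to the second column instead
t(t(m) + col_add)          # same result as sweep(), using two transposes
```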

You can name rows and columns of a matrix.

```
rownames(mat3) <- c("a","b","c")
colnames(mat3) <- c("y1","y2","y3")
mat3
```

```
## y1 y2 y3
## a 1 4 7
## b 2 5 8
## c 3 6 9
```

The data frame is the most useful object type. Unlike a matrix or a vector, it can hold a different data type in each column.

```
df1 <- data.frame(some_numbers=1:3,some_names=c("Blood","Sweat","Tears"),some_logical=c(TRUE,FALSE,TRUE))
df1
```

```
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 2 Sweat FALSE
## 3 3 Tears TRUE
```

You can see the details of an object (in this case the data frame) using the `str()` function.

`str(df1)`

```
## 'data.frame': 3 obs. of 3 variables:
## $ some_numbers: int 1 2 3
## $ some_names : Factor w/ 3 levels "Blood","Sweat",..: 1 2 3
## $ some_logical: logi TRUE FALSE TRUE
```

You can easily do operations on a single column using `$`.

`df1$some_numbers`

`## [1] 1 2 3`

`df1$some_names`

```
## [1] Blood Sweat Tears
## Levels: Blood Sweat Tears
```

`df1$some_logical`

`## [1] TRUE FALSE TRUE`

```
df1$some_numbers <- df1$some_numbers^2
df1
```

```
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 4 Sweat FALSE
## 3 9 Tears TRUE
```
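Data frames can also be indexed with `[row, column]`, similar to matrices, and rows can be filtered with a logical vector. A small sketch on a separate, made-up data frame called `people`:

```r
people <- data.frame(id = 1:3,
                     name = c("Blood", "Sweat", "Tears"),
                     flag = c(TRUE, FALSE, TRUE))

people[2, ]                  # second row, all columns
people[, "id"]               # the id column as a vector: 1 2 3
people[people$flag, ]        # only the rows where flag is TRUE (rows 1 and 3)
nrow(people[people$flag, ])  # 2
```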

There are many example data sets in base R and in packages, often in `data.frame` format. For instance, `EuStockMarkets` (which is stored as a time series matrix rather than a data frame) contains the closing prices of the DAX (Germany), SMI (Switzerland), CAC (France) and FTSE (UK) stock market indices.

`head(EuStockMarkets) #head() function shows the first rows of an object such as a data frame or a matrix.`

```
## DAX SMI CAC FTSE
## [1,] 1628.75 1678.1 1772.8 2443.6
## [2,] 1613.63 1688.5 1750.5 2460.2
## [3,] 1606.51 1678.6 1718.0 2448.2
## [4,] 1621.04 1684.1 1708.1 2470.4
## [5,] 1618.16 1686.6 1723.1 2484.7
## [6,] 1610.61 1671.6 1714.3 2466.8
```
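The `[1,]` style row labels in the output hint that `EuStockMarkets` is a matrix-like time series object. If you need a real data frame, you can convert it. A short sketch (`eu_stocks_df` is just a name picked for this example):

```r
class(EuStockMarkets)          # a multivariate time series class, not "data.frame"
is.data.frame(EuStockMarkets)  # FALSE

eu_stocks_df <- as.data.frame(EuStockMarkets)  # convert when a data frame is needed
is.data.frame(eu_stocks_df)    # TRUE
dim(eu_stocks_df)              # 1860 rows and 4 columns
```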

Lists can hold many objects (including lists).

```
list1 <- list(df1,mat3,vec2)
list1
```

```
## [[1]]
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 4 Sweat FALSE
## 3 9 Tears TRUE
##
## [[2]]
## y1 y2 y3
## a 1 4 7
## b 2 5 8
## c 3 6 9
##
## [[3]]
## [1] 3 4 5
```

`list1[[1]]`

```
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 4 Sweat FALSE
## 3 9 Tears TRUE
```

You can name the objects and call them with the names if you like.

```
list1 <- list(some_df=df1,some_mat=mat3,vec2)
list1
```

```
## $some_df
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 4 Sweat FALSE
## 3 9 Tears TRUE
##
## $some_mat
## y1 y2 y3
## a 1 4 7
## b 2 5 8
## c 3 6 9
##
## [[3]]
## [1] 3 4 5
```

`list1$some_df`

```
## some_numbers some_names some_logical
## 1 1 Blood TRUE
## 2 4 Sweat FALSE
## 3 9 Tears TRUE
```
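One detail worth knowing is the difference between single and double brackets on lists: `[ ]` returns a smaller list, while `[[ ]]` (or `$`) returns the element itself. A minimal sketch with a made-up list:

```r
my_list <- list(nums = 1:3, label = "BDA503")

class(my_list["nums"])    # "list", single brackets keep the list wrapper
class(my_list[["nums"]])  # "integer", double brackets return the element itself
my_list$label             # "BDA503", same as my_list[["label"]]
names(my_list)            # "nums" "label"
```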

Lists are frequently used in functions as parameter set holders and for other purposes.

Remember, you can always look for help for a function using `?function_name` or `help(function_name)`. This is not an exhaustive list; there are many other fantastic functions in base R.

`rep(x=5,times=10) #Repeat a value or a vector`

`## [1] 5 5 5 5 5 5 5 5 5 5`

`seq(from=5,to=10,length.out=11) #Create a sequence with the given number of equidistant elements`

`## [1] 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0`

`seq(from=5,to=10,by=0.25) #Create a sequence with the given increment value`

```
## [1] 5.00 5.25 5.50 5.75 6.00 6.25 6.50 6.75 7.00 7.25 7.50
## [12] 7.75 8.00 8.25 8.50 8.75 9.00 9.25 9.50 9.75 10.00
```

```
vec1 <- sample(x=1:10,size=10,replace=FALSE) #Pick 10 numbers randomly without replacement (Note: Your results might differ from this document due to randomness.)
vec1
```

`## [1] 1 9 7 5 8 4 2 10 3 6`

`print(vec1/2) #Print the outputs of an object. Useful for later.`

`## [1] 0.5 4.5 3.5 2.5 4.0 2.0 1.0 5.0 1.5 3.0`

`rev(vec1) #Reverse of a vector`

`## [1] 6 3 10 2 4 8 5 7 9 1`

`length(vec1) #Number of elements of a vector`

`## [1] 10`

`vec1 %% 2 #Mod 2 of the elements in the vector`

`## [1] 1 1 1 1 0 0 0 0 1 0`

`min(vec1) #Minimum value of the vector`

`## [1] 1`

`max(vec1) #Maximum value of the vector`

`## [1] 10`

`factorial(vec1) #Factorial value of all elements of a vector (You can use a single value as well)`

```
## [1] 1 362880 5040 120 40320 24 2 3628800
## [9] 6 720
```

`sum(vec1) #Sum of all the values in the vector`

`## [1] 55`

`cumsum(vec1) #Cumulative sum of all the values in the vector`

`## [1] 1 10 17 22 30 34 36 46 49 55`

`prod(vec1) #Product (multiplication) of all the values in the vector`

`## [1] 3628800`

`cumprod(vec1) #Cumulative product of all the values in the vector`

```
## [1] 1 9 63 315 2520 10080 20160 201600
## [9] 604800 3628800
```

`log(vec1) #Natural logarithm of the values in the vector`

```
## [1] 0.0000000 2.1972246 1.9459101 1.6094379 2.0794415 1.3862944 0.6931472
## [8] 2.3025851 1.0986123 1.7917595
```

`log(vec1,base=2) #Logarithm of base 2.`

```
## [1] 0.000000 3.169925 2.807355 2.321928 3.000000 2.000000 1.000000
## [8] 3.321928 1.584963 2.584963
```

`exp(vec1) #Exponential values of a vector (e=2.71...)`

```
## [1] 2.718282 8103.083928 1096.633158 148.413159 2980.957987
## [6] 54.598150 7.389056 22026.465795 20.085537 403.428793
```

`vec1^2 #Power of 2`

`## [1] 1 81 49 25 64 16 4 100 9 36`

`sqrt(vec1) #Square root`

```
## [1] 1.000000 3.000000 2.645751 2.236068 2.828427 2.000000 1.414214
## [8] 3.162278 1.732051 2.449490
```

```
vecx <- c(1,3,5,7) #Define another vector
vecy <- c(8,6,4,2) #Define another vector
pmax(vecx,vecy) #Maximum of each corresponding element of two (or more) vectors
```

`## [1] 8 6 5 7`

`pmin(vecx,vecy) #Minimum of each corresponding element of two (or more) vectors`

`## [1] 1 3 4 2`

`max(vecx,vecy) #Difference between max and pmax`

`## [1] 8`

```
vec1 <- c(-1,0.5,-1.2,4/3)
vec1
```

`## [1] -1.000000 0.500000 -1.200000 1.333333`

`abs(vec1) #Absolute value`

`## [1] 1.000000 0.500000 1.200000 1.333333`

`round(vec1,digits = 1) #Round a value to a number of digits`

`## [1] -1.0 0.5 -1.2 1.3`

`floor(vec1) #Round down value of vector`

`## [1] -1 0 -2 1`

`ceiling(vec1) #Round up value of vector`

`## [1] -1 1 -1 2`

`round(0.5) #Interesting case about rounding. Compare with below.`

`## [1] 0`

`round(1.5) #Interesting case about rounding. Compare with above.`

`## [1] 2`
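The reason behind the two results above is that `round()` sends a value exactly halfway between two integers to the even one. If you prefer the "round half up" rule from school, `floor(x + 0.5)` is a simple workaround for non-negative numbers (a sketch, not a general purpose rounding function).

```r
round(c(0.5, 1.5, 2.5, 3.5))        # 0 2 2 4, halves go to the nearest even number
floor(c(0.5, 1.5, 2.5, 3.5) + 0.5)  # 1 2 3 4, halves always go up
```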

```
vec_table<-sample(letters[1:5],20,replace=TRUE) #Another vector for frequency tables. letters is a predefined object in R.
vec_table
```

```
## [1] "d" "a" "e" "e" "e" "e" "d" "a" "b" "b" "a" "d" "b" "b" "c" "e" "e"
## [18] "b" "b" "c"
```

`table(vec_table) #Easily do a frequency table.`

```
## vec_table
## a b c d e
## 3 6 2 3 6
```

```
vec2 <- sample(x=11:20,size=10,replace=FALSE)
vec2
```

`## [1] 18 17 14 11 16 12 20 13 19 15`

`sort(vec2) #Sort the values in the vector`

`## [1] 11 12 13 14 15 16 17 18 19 20`

`rank(vec2) #Rank of the values in the vector`

`## [1] 8 7 4 1 6 2 10 3 9 5`

`order(vec2) #Returns the index values (ascending) of the sorted vector.`

`## [1] 4 6 8 3 10 5 2 1 9 7`

`order(vec2,decreasing=TRUE) #Returns the index values (descending) of the sorted vector.`

`## [1] 7 9 1 2 5 10 3 8 6 4`
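`order()` is most useful when you want to sort one vector according to another. The sketch below uses two small hypothetical vectors, `scores` and `students`.

```r
scores <- c(18, 11, 20, 13)
students <- c("Ali", "Berk", "Can", "Deniz")  # hypothetical labels

idx <- order(scores)  # 2 4 1 3
scores[idx]           # 11 13 18 20, same as sort(scores)
students[idx]         # "Berk" "Deniz" "Ali" "Can"
```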

These operators return TRUE or FALSE values. They are especially useful for selecting the elements of a vector that satisfy a condition.

```
vec1 <- 1:10
vec1
```

`## [1] 1 2 3 4 5 6 7 8 9 10`

`vec1 > 5 #Logical (TRUE/FALSE) result of elements greater than 5.`

`## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE`

`vec1[vec1 > 5]`

`## [1] 6 7 8 9 10`

`vec1 >= 5 #Logical result of elements greater than or equal to 5.`

`## [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE`

`vec1[vec1 >= 5]`

`## [1] 5 6 7 8 9 10`

`vec1 < 5 #Logical result of elements less than 5.`

`## [1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE`

`vec1 <= 5 #Logical result of elements less than or equal to 5.`

`## [1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE`

`vec1 > 5 & vec1 < 9 #and (&) operator`

`## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE`

`vec1[vec1 > 5 & vec1 < 9]`

`## [1] 6 7 8`

`vec1 < 5 | vec1 > 9 #or (|) operator`

`## [1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE TRUE`

`vec1[vec1 < 5 | vec1 > 9]`

`## [1] 1 2 3 4 10`
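Logical vectors also work nicely with a few helper functions. Since `TRUE` counts as 1 and `FALSE` as 0, you can count or average them directly. A short sketch on a separate vector `v`:

```r
v <- 1:10
which(v > 5)  # positions of the TRUE values: 6 7 8 9 10
sum(v > 5)    # number of elements greater than 5: 5
mean(v > 5)   # share of elements greater than 5: 0.5
any(v > 9)    # TRUE, at least one element is greater than 9
all(v > 0)    # TRUE, every element is greater than 0
```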

You can also do element by element comparisons of two vectors.

```
eu_df<- data.frame(EuStockMarkets[1:20,]) #Take the first 20 rows of the stock market index data
eu_df_returns <- data.frame(DAX=100*(round(eu_df$DAX[-1]/eu_df$DAX[-20],4)-1),
CAC=100*(round(eu_df$CAC[-1]/eu_df$CAC[-20],4)-1)) #Calculate the index percentage returns
eu_df_returns$DAX_or_CAC <- eu_df_returns$DAX >= eu_df_returns$CAC #If the return of DAX is larger than or equal to CAC return TRUE
eu_df_returns
```

```
## DAX CAC DAX_or_CAC
## 1 -0.93 -1.26 TRUE
## 2 -0.44 -1.86 TRUE
## 3 0.90 -0.58 TRUE
## 4 -0.18 0.88 FALSE
## 5 -0.47 -0.51 TRUE
## 6 1.25 1.18 TRUE
## 7 0.58 1.32 FALSE
## 8 -0.29 -0.19 FALSE
## 9 0.64 0.02 TRUE
## 10 0.12 0.31 FALSE
## 11 -0.58 -0.24 FALSE
## 12 -0.51 0.15 FALSE
## 13 -0.52 -0.03 FALSE
## 14 0.20 0.34 FALSE
## 15 0.18 -0.04 TRUE
## 16 0.27 0.35 FALSE
## 17 -0.66 0.52 FALSE
## 18 -0.48 0.11 FALSE
## 19 -0.52 -0.70 TRUE
```

Some functions are predefined to facilitate statistics calculations.

```
vec1 <- sample(1:20,50,replace=TRUE) #Sample 50 numbers from values between 1 to 20
vec1
```

```
## [1] 9 1 20 20 20 13 1 2 1 18 8 16 9 12 12 4 4 16 5 1 9 7 14
## [24] 2 9 11 1 8 7 1 17 10 17 18 17 13 17 5 5 12 15 20 7 4 8 10
## [47] 12 7 17 17
```

`mean(vec1) #Mean`

`## [1] 10.18`

`median(vec1) #Median`

`## [1] 9.5`

`var(vec1) #Variance`

`## [1] 37.08939`

`sd(vec1) #Standard deviation`

`## [1] 6.090106`

`quantile(vec1) #Quantile values`

```
## 0% 25% 50% 75% 100%
## 1.0 5.0 9.5 16.0 20.0
```

`quantile(vec1,0.65) #Quantile value of a specific percentage`

```
## 65%
## 12.85
```

`summary(vec1) #An aggregate summary`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 5.00 9.50 10.18 16.00 20.00
```

There are also random number generators and functions related to the densities and CDFs of different distributions. Here are the functions for the normal distribution.

`rnorm(5,mean=0,sd=1) #Generate 5 normally distributed random numbers with mean 0 and sd 1`

`## [1] 0.02469966 0.52781272 0.15529557 0.33042475 -0.74232844`

`dnorm(x=0,mean=0,sd=1) #Density value of a point in a normal distribution with mean 0 and sd 1`

`## [1] 0.3989423`

`pnorm(q=1.96,mean=0,sd=1) #Cumulative distribution value of a point in a normal distribution with mean 0 and sd 1`

`## [1] 0.9750021`

`qnorm(p=0.975,mean=0,sd=1) #Quantile value of a point in a normal distribution with mean 0 and sd 1`

`## [1] 1.959964`
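`pnorm` and `qnorm` are inverses of each other, which is a handy sanity check. The sketch below also computes the familiar "about 95% within 1.96 standard deviations" figure.

```r
p <- pnorm(1.96)            # probability of a standard normal value below 1.96
qnorm(p)                    # back to 1.96
pnorm(1.96) - pnorm(-1.96)  # about 0.95
```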

Other distributions include `dpois` (Poisson), `dbinom` (binomial), `dgeom` (geometric), `dunif` (uniform), `dgamma` (gamma), `dexp` (exponential), `dchisq` (chi-squared), `dt` (t distribution), `df` (F distribution), `dcauchy` (Cauchy), `dnbinom` (negative binomial), `dhyper` (hypergeometric), `dlnorm` (lognormal), `dbeta` (beta), `dlogis` (logistic) and `dweibull` (Weibull) with the same format (e.g. `rpois` generates random Poisson numbers).

**Tip:** For reproducibility use `set.seed`. It sets the randomness seed to a value, so random number generation will be the same for (almost) everyone.

```
set.seed(522)
rnorm(10)
```

```
## [1] 0.52028245 0.75354770 -0.80932517 -0.42112173 0.08458416
## [6] 1.80153605 1.25071091 -0.31097287 1.16377544 -0.67728655
```

Let’s run it a second time by resetting the seed. The output will be the same.

```
set.seed(522)
rnorm(10)
```

```
## [1] 0.52028245 0.75354770 -0.80932517 -0.42112173 0.08458416
## [6] 1.80153605 1.25071091 -0.31097287 1.16377544 -0.67728655
```

See, the same output is produced when the randomness seed is reset to the same value.
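Instead of comparing the numbers by eye, you can let R do the check with `identical()`. In this sketch the third draw is taken without resetting the seed, so it differs.

```r
set.seed(522)
first_draw <- rnorm(10)
set.seed(522)
second_draw <- rnorm(10)
identical(first_draw, second_draw)  # TRUE, same seed gives the same numbers

third_draw <- rnorm(10)             # seed is not reset this time
identical(first_draw, third_draw)   # FALSE
```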

You can convert between types (e.g. numeric to character, logical to numeric) using functions starting with `as.` and check the type of an object with functions starting with `is.` or with `typeof()`.

```
vec1<-c(1,2,3,4)
is.numeric(vec1) #Is the vector numeric?
```

`## [1] TRUE`

`as.character(vec1) #Convert the vector to character`

`## [1] "1" "2" "3" "4"`

`typeof(vec1) #What is the type?`

`## [1] "double"`

```
vec2<-c("a","b","c","d")
typeof(vec2)
```

`## [1] "character"`

`as.numeric(vec2) # oops`

`## Warning: NAs introduced by coercion`

`## [1] NA NA NA NA`

```
vec3<-c(TRUE,FALSE,TRUE,FALSE)
is.logical(vec3)
```

`## [1] TRUE`

`as.numeric(vec3)`

`## [1] 1 0 1 0`

`as.character(vec3)`

`## [1] "TRUE" "FALSE" "TRUE" "FALSE"`

`vec3*1 #Convert to numeric with multiplication`

`## [1] 1 0 1 0`

```
df1<-data.frame(a=c(1,2,3),b=c(4,5,6),c=c(7,8,9))
as.matrix(df1) #Convert to matrix
```

```
## a b c
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
```

```
mat1 <- matrix(1:9,ncol=3)
as.data.frame(mat1)
```

```
## V1 V2 V3
## 1 1 4 7
## 2 2 5 8
## 3 3 6 9
```

```
strvec1<-c("BDA503","BDA507","IE422")
grep("BDA",strvec1) #Index values of character strings including BDA
```

`## [1] 1 2`

`grepl("BDA",strvec1) #TRUE FALSE statements of character strings including BDA`

`## [1] TRUE TRUE FALSE`

`gsub("BDA","IE",strvec1) #Replacing strings`

`## [1] "IE503" "IE507" "IE422"`

`nchar(strvec1) #Return number of characters in string`

`## [1] 6 6 5`

`substr(strvec1,start=1,stop=2) #Extract the part of the string from start to stop`

`## [1] "BD" "BD" "IE"`

`paste("BDA","503",sep="-") #Concatenate two strings with a separator.`

`## [1] "BDA-503"`

`paste0("BDA","503") #Concatenate two strings without a separator, equivalent of paste(.,sep="").`

`## [1] "BDA503"`

`paste(strvec1,collapse="+") #Concatenate elements of a vector with a collapse character.`

`## [1] "BDA503+BDA507+IE422"`

Conditionals are straightforward. If a statement returns `TRUE`, then the code chunk enclosed by the curly braces is executed.

```
course_name <- "BDA503" #Define the course name.
if(course_name=="BDA503"){ #If the course name is BDA503.
print("Correct course.")
}
```

`## [1] "Correct course."`

It is possible to execute some other code chunk if the statement is `FALSE` with `else`, and to add other conditionals using `else if`.

```
course_name <- "BDA507" #Define the course name.
if(course_name=="BDA503"){ #If the course name is BDA503.
print("Correct course.")
}else if(grepl("BDA",course_name)){ #If the course name includes BDA but it is not BDA503.
print("Wrong course but close.")
}else{ #If none of the above
print("Wrong course.")
}
```

`## [1] "Wrong course but close."`

`if` conditional statements accept only a single value. If you want to check all elements of a vector, use `ifelse()`.

```
course_name<-c("BDA503","BDA511","IE422")
ifelse(course_name=="BDA503","Correct Course","Wrong Course")
```

`## [1] "Correct Course" "Wrong Course" "Wrong Course"`
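`ifelse()` calls can be nested to imitate an `else if` chain for whole vectors. The sketch below repeats the earlier course name logic on a vector called `courses` (a name used only here).

```r
courses <- c("BDA503", "BDA511", "IE422")
course_check <- ifelse(courses == "BDA503", "Correct course",
                       ifelse(grepl("BDA", courses),
                              "Wrong course but close",
                              "Wrong course"))
course_check  # "Correct course" "Wrong course but close" "Wrong course"
```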

Although you are often warned that R works slowly with loops (especially loops within loops), using loops is usually inevitable.

For loops consist of a loop variable and a scope.

```
val<-2
for(i in 1:3){ #Define the loop variable and scope
print(val^i)
}
```

```
## [1] 2
## [1] 4
## [1] 8
```

The scope does not need to consist of numbers. `for` goes through whatever is in the scope, in index order.

```
for(i in c("BDA503","BDA511","IE422")){
print(i)
}
```

```
## [1] "BDA503"
## [1] "BDA511"
## [1] "IE422"
```
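When you need the position as well as the value, looping over `seq_along()` is a common pattern. The sketch below also stores the results in a vector created beforehand and compares them with the vectorized version, which is usually the better choice in R when it is available.

```r
vals <- c(3, 1, 4)
squares <- numeric(length(vals))  # create the result vector first
for (i in seq_along(vals)) {      # i takes the values 1, 2, 3
  squares[i] <- vals[i]^2
}
squares                     # 9 1 16
identical(squares, vals^2)  # TRUE, the vectorized version gives the same result
```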

`while` is a less frequently used loop type. It repeats the code as long as a condition is met. It checks the condition first; when the condition is not satisfied, it skips the code chunk.

```
x <- 0
while(x < 3){
x <- x+1
print(paste0("x is ",x," x is not at the desired level. Desired level is above 3."))
}
```

```
## [1] "x is 1 x is not at the desired level. Desired level is above 3."
## [1] "x is 2 x is not at the desired level. Desired level is above 3."
## [1] "x is 3 x is not at the desired level. Desired level is above 3."
```

R lets you define functions easily, with a flexible format. Here are some examples.

```
fun1<-function(par1="This is a default value"){
print(par1)
}
```

If a default value is defined for a parameter, you do not need to enter a value for it as long as you are comfortable with the default.

`fun1()`

`## [1] "This is a default value"`

You can change the parameters when you call the function.

`fun1(par1="Congratulations, you changed the parameter.")`

`## [1] "Congratulations, you changed the parameter."`

If you are careful about the order of your entered parameters, you do not need to write the parameter name.

`fun1("Wow you do it like a pro without parameter names!")`

`## [1] "Wow you do it like a pro without parameter names!"`

Here is another simple example. Let’s calculate the future value of an initial investment with compound interest.

```
calc_future_value<-function(present_value,interest_rate,years){
return(present_value*(1+interest_rate)^years)
}
calc_future_value(100,0.05,5)
```

`## [1] 127.6282`
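Since R is based on vector operations, such a function usually works on vectors without any extra effort. The variant below, `future_value`, is a hypothetical rewrite with default values added, and it is called with a vector of years.

```r
future_value <- function(present_value, interest_rate = 0.05, years = 1) {
  present_value * (1 + interest_rate)^years  # the last expression is returned
}

future_value(100)               # 105, uses both default values
future_value(100, years = 1:3)  # 105 110.25 115.7625, one value per year
```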


Reading from and writing to data files will be unavoidable at some point. While it is useful to know the fundamental functions, I/O operations usually require experience. In other words, you will face many challenges when reading a table from an Excel file or writing outputs to txt files. It gets easier with practice, though.

Check the help pages of these functions frequently to understand their inner workings. For `xlsx` files and other data types (e.g. JSON, SQL) there are packages.

```
setwd("~/some_path") #Set working directory path.
getwd() #Get the working directory path.
scan(file="some_data_file.txt") #Read data from file.
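read.table(file="some_data_file.csv",sep=",") #Read delimited text files such as csv (see also read.csv()). For Excel files (xls, xlsx) you will need a package.
source("path_to_some_r_file/some_r_file.r") #Run the code in another R file.
write("writing_something",file="some_document_file.txt") #Write text to a file.
write.table(some_data_frame,file="some_output_file.csv",sep=",") #Write a table to a delimited text file such as csv. Similar logic to read.table, in the opposite direction.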
file.choose() #Manually choosing a file from computer. You can use it like read.table(file.choose())
dir(path="some_path") #Files in the path directory.
```
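To try these functions without touching your own files, you can write to a temporary file and read it back. This is a minimal sketch; `out_df`, `in_df` and `tmp_file` are names made up for the example.

```r
tmp_file <- tempfile(fileext = ".csv")  # a throwaway file path
out_df <- data.frame(id = 1:3, name = c("Blood", "Sweat", "Tears"))

write.csv(out_df, file = tmp_file, row.names = FALSE)  # write without row numbers
in_df <- read.csv(tmp_file, stringsAsFactors = FALSE)  # read it back
in_df
identical(out_df$id, in_df$id)  # TRUE

unlink(tmp_file)  # delete the temporary file
```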

Important: Defining paths in R can be different in Windows and Mac. See this link for more detail.

```
dir("C:/Desktop/") #Windows style 1
dir("C:\\Desktop\\") #Windows style 2
dir("~/Documents/") #Mac and Linux style. Might work for Windows too.
```

Tip: Sometimes, R reads columns containing characters as the `factor` data type. It is not covered in this tutorial and it is tough to handle and convert. Using the following code will prevent R from reading character strings as factors.

`options(stringsAsFactors=FALSE)`

If your character vector is read as a factor, use the `as.character()` function. If your numeric vector is read as a factor, use `as.numeric(as.character())`. Examples are given below.

```
factvec<-factor(c("a","b","c","a")) #Factor data vector
factvec
as.character(factvec) #Convert to character
factvec2<-factor(c(10,20,30,40,10)) #Factor data vector with numbers only
factvec2
as.numeric(factvec2) #Converting directly to numeric returns the internal level codes, not the original numbers.
as.numeric(as.character(factvec2))
```

RData is a special data file type used by R. It is quite useful and efficient for storage (better than csv). One disadvantage is that it is not as common as csv, so reading RData outside R is a challenge.

```
load(file="some_file.RData")
save(some_data_frame,file="some_file.RData")
```
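A full round trip makes the behaviour clearer: `load()` restores objects under the names they were saved with. The sketch below uses a temporary file and a made-up object called `some_numbers`.

```r
rdata_file <- tempfile(fileext = ".RData")  # a throwaway file path
some_numbers <- 1:5
save(some_numbers, file = rdata_file)

rm(some_numbers)                         # remove the object from the session
loaded_names <- load(file = rdata_file)  # load() returns the names of the restored objects
loaded_names                             # "some_numbers"
some_numbers                             # 1 2 3 4 5, the object is back

unlink(rdata_file)  # delete the temporary file
```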

Packages are the most important asset of R. Recent years have seen a rapid expansion of R packages for almost any topic of interest that needs computation. There are two steps to using a package: installing it and loading it.

```
install.packages("package_name") #Install command
library(package_name) #Load the package. require() also works. Quotes are not needed.
```

**Remember:** You need to install a package only once. After that, it is ready to use whenever you load it with `library()`. Packages are updated from time to time. To update your installed packages, use the `update.packages()` command.
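If you share your code, a common pattern is to install a package only when it is missing. The sketch below uses `stats`, which ships with R, purely so that the example runs anywhere; replace it with the package you actually need.

```r
pkg <- "stats"  # placeholder package name, already installed with R
if (!requireNamespace(pkg, quietly = TRUE)) {  # TRUE if the package is available
  install.packages(pkg)                        # runs only when the package is missing
}
library(pkg, character.only = TRUE)  # character.only is needed when the name is in a variable
```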

Below is an example of using a package from the start. You will see how it is done in base R and how it can be enhanced with packages.

Plotting in R can be a bit problematic and hard. Let’s plot the returns of the stock indices from the previous `EuStockMarkets` data.

```
#Let's redo what we did previously.
eu_df<- data.frame(EuStockMarkets[1:20,]) #Take the first 20 rows of the stock market index data
eu_df_returns <- data.frame(DAX=100*(round(eu_df$DAX[-1]/eu_df$DAX[-20],4)-1),
CAC=100*(round(eu_df$CAC[-1]/eu_df$CAC[-20],4)-1)) #Calculate the index percentage returns
eu_df_returns
```

```
## DAX CAC
## 1 -0.93 -1.26
## 2 -0.44 -1.86
## 3 0.90 -0.58
## 4 -0.18 0.88
## 5 -0.47 -0.51
## 6 1.25 1.18
## 7 0.58 1.32
## 8 -0.29 -0.19
## 9 0.64 0.02
## 10 0.12 0.31
## 11 -0.58 -0.24
## 12 -0.51 0.15
## 13 -0.52 -0.03
## 14 0.20 0.34
## 15 0.18 -0.04
## 16 0.27 0.35
## 17 -0.66 0.52
## 18 -0.48 0.11
## 19 -0.52 -0.70
```

Base R plotting is as follows.

```
plot(x=1:nrow(eu_df_returns),
y=eu_df_returns$DAX,
type="l",col="red",
ylim=c(min(unlist(eu_df_returns)),max(unlist(eu_df_returns))),
ylab="Returns (%)",
xlab="Time Index")
lines(eu_df_returns$CAC) #Add the CAC returns to the same plot (default color is black).
```

You can probably do better with the `ggplot2` package. It has more beautiful aesthetics, more readable code and better options. Even with the default values your plots will look better. Here is a simple implementation of the previous example.

```
if(!("ggplot2" %in% rownames(installed.packages()))){
install.packages("ggplot2") #Install the package (you can skip it if it is already installed)
}
library(ggplot2)
ggplot(data=eu_df_returns,aes(x=1:nrow(eu_df_returns))) +
geom_line(aes(y=DAX,color="DAX")) +
geom_line(aes(y=CAC,color="CAC")) +
labs(x="Time Index",y="Returns (%)")
```