In week 2 we studied the tidyverse package and some of its functions, and this homework puts them to use. After downloading the data from the official ODD website, we renamed the file to ODD_Retail_Sales_201701.xlsx. Below we work through an example that goes from the raw data to a final analysis.
Our raw Excel file is in our repository. We can download that file automatically into a temporary file, read the Excel document into R, and then remove the temp file.
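# Create a temporary file (tmp) to hold the download
tmp <- tempfile(fileext = ".xlsx")
# Download file from repository to the temp file (the URL is blank in the source)
download.file("", destfile = tmp, mode = "wb")
# Read the Excel file into raw_data at this point, then remove the temp file
file.remove(tmp)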
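The download, read, and clean-up steps can be sketched end to end as follows. This is only an illustration under stated assumptions: because the repository URL is not shown here, the sketch copies the example workbook that ships with the readxl package into the temp file instead of calling download.file, and the object name raw_example is hypothetical. For the real ODD file you would probably need to adjust read_excel arguments such as skip and col_names to match the sheet layout.

```r
library(readxl)

# Create a temporary file with an .xlsx extension
tmp <- tempfile(fileext = ".xlsx")

# Stand-in for download.file(): copy readxl's bundled example workbook
# (assumption: the real homework downloads the ODD file from the repository)
file.copy(readxl_example("datasets.xlsx"), tmp)

# Read the Excel document into R; col_names = FALSE treats the first row as data
raw_example <- read_excel(tmp, col_names = FALSE)

# Remove the temp file once the data is in memory
file.remove(tmp)
```

Keeping the download in a temp file means nothing is left behind on disk after the data has been read.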
The raw data loads fine, but it needs some work before we can analyze it.
To make the data standardized and workable, we need to define column names and remove NA values for this example. Please use the same column names in your own examples as well.
# First, load the library we will use.
library(tidyverse)
# Use the same column names in your data.
colnames(raw_data) <- c("brand_name","auto_dom","auto_imp","auto_total","comm_dom","comm_imp","comm_total","total_dom","total_imp","total_total")
# Now we replace NA values with 0 and label the time period with year and month, so that rows stay distinguishable when we merge the data.
car_data_jan_17 <- raw_data %>% mutate_if(is.numeric, ~ ifelse(is.na(.), 0, .)) %>% mutate(year = 2017, month = 1)
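As a hedged alternative, the same clean-up can be written with across(), which newer dplyr versions (1.0.0 and later) offer in place of mutate_if(); funs() is deprecated in recent dplyr releases. The toy tibble below is hypothetical and is not taken from the ODD file; only the column names follow the ones defined above.

```r
library(dplyr)

# Hypothetical toy rows, not taken from the ODD file
toy <- tibble(
  brand_name = c("BRAND_A", "BRAND_B"),
  auto_dom   = c(10, NA),
  auto_imp   = c(NA, 5)
)

toy_clean <- toy %>%
  # Replace NA with 0 in numeric columns only; brand_name is left untouched
  mutate(across(where(is.numeric), ~ coalesce(.x, 0))) %>%
  # Tag every row with its reporting period
  mutate(year = 2017, month = 1)

toy_clean
```

Restricting the replacement to numeric columns matters here, since writing 0 into a character column such as brand_name would either fail or change its type.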
One of the best ways to keep your cleaned data is to save it to an RDS or RData file. The difference is that an RDS file holds exactly one object, while an RData file can hold many. Since we have only one data frame here, we will go with RDS.
# You can read that file back with readRDS and assign the result to an object
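A minimal round trip with saveRDS and readRDS might look like the sketch below. The data frame toy and the path rds_path are hypothetical stand-ins; in practice you would save car_data_jan_17 to a location of your own choosing.

```r
# Hypothetical stand-in for car_data_jan_17
toy <- data.frame(
  brand_name  = c("BRAND_A", "BRAND_B"),
  total_total = c(15, 5)
)

# Hypothetical location; replace with a path of your own
rds_path <- tempfile(fileext = ".rds")

# Write the single object to disk
saveRDS(toy, file = rds_path)

# readRDS returns the object, so it can be assigned to any name
restored <- readRDS(rds_path)
```

Because readRDS returns the object instead of restoring it under its old name, you decide what to call it when loading, which is convenient when merging several monthly files.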
I take a quick look at the percentage of commercial vehicle sales within total sales for each brand, and order the data frame by total commercial sales.
# A new column named perc_comm is added: the percentage of commercial sales in total sales.
# Select the columns to display.
## # A tibble: 17 x 6
##     year month brand_name    comm_total total_total perc_comm
##    <dbl> <dbl> <chr>              <dbl>       <dbl>     <dbl>
##  1  2017     1 FORD                2978        4511      66.0
##  2  2017     1 FIAT                2245        3866      58.1
##  3  2017     1 VOLKSWAGEN          1255        4314      29.1
##  4  2017     1 RENAULT              519        4874      10.6
##  5  2017     1 CITROEN              441         956      46.1
##  6  2017     1 TOYOTA               356        2283      15.6
##  7  2017     1 MERCEDES-BENZ        343         842      40.7
##  8  2017     1 DACIA                235        1706      13.8
##  9  2017     1 KIA                  193         619      31.2
## 10  2017     1 MITSUBISHI           190         218      87.2
## 11  2017     1 ISUZU                173         173       100
## 12  2017     1 PEUGEOT              168         805      20.9
## 13  2017     1 NISSAN               167        1321      12.6
## 14  2017     1 IVECO                146         146       100
## 15  2017     1 HYUNDAI              134        2357      5.69
## 16  2017     1 KARSAN                82          82       100
## 17  2017     1 SSANGYONG              9          19      47.4
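A summary like the one printed above could be produced along the following lines. This is a sketch under assumptions, not necessarily the exact code behind the output: the object names sample_sales and comm_share are hypothetical, and the input is just three rows copied from the printed result so that the block runs on its own.

```r
library(dplyr)

# Three rows copied from the printed result, used as stand-in input
sample_sales <- tibble(
  brand_name  = c("ISUZU", "FIAT", "FORD"),
  comm_total  = c(173, 2245, 2978),
  total_total = c(173, 3866, 4511),
  year        = 2017,
  month       = 1
)

comm_share <- sample_sales %>%
  # perc_comm: commercial sales as a percentage of total sales
  mutate(perc_comm = 100 * comm_total / total_total) %>%
  # Keep only the columns shown in the summary, in display order
  select(year, month, brand_name, comm_total, total_total, perc_comm) %>%
  # Largest commercial sellers first
  arrange(desc(comm_total))

comm_share
```

Sorting by comm_total and not by perc_comm keeps high-volume brands at the top, so a brand that sells only commercial vehicles in small numbers does not dominate the list.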