library(plyr)
library(scales)
library(tidyverse)
library(ggplot2)
library(ggcorrplot)
library(ggthemes)
library(formattable)
library(htmlwidgets)
library(ggalt)
library(party)
library(rpart)
library(rpart.plot)
library(pROC)
December 19, 2017
Human Resources Analytics data from Kaggle, with 14,999 rows and 10 columns.
## Observations: 14,999
## Variables: 10
## $ satisfaction_level    <dbl> 0.38, 0.80, 0.11, 0.72, 0.37, 0.41, 0.10...
## $ last_evaluation       <dbl> 0.53, 0.86, 0.88, 0.87, 0.52, 0.50, 0.77...
## $ number_project        <int> 2, 5, 7, 5, 2, 2, 6, 5, 5, 2, 2, 6, 4, 2...
## $ average_montly_hours  <int> 157, 262, 272, 223, 159, 153, 247, 259, ...
## $ time_spend_company    <int> 3, 6, 4, 5, 3, 3, 4, 5, 5, 3, 3, 4, 5, 3...
## $ Work_accident         <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ left                  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ promotion_last_5years <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ departments           <fctr> sales, sales, sales, sales, sales, sale...
## $ salary                <fctr> low, medium, medium, low, low, low, low...
##  satisfaction_level last_evaluation  number_project  average_montly_hours
##  Min.   :0.0900     Min.   :0.3600   Min.   :2.000   Min.   : 96.0
##  1st Qu.:0.4400     1st Qu.:0.5600   1st Qu.:3.000   1st Qu.:156.0
##  Median :0.6400     Median :0.7200   Median :4.000   Median :200.0
##  Mean   :0.6128     Mean   :0.7161   Mean   :3.803   Mean   :201.1
##  3rd Qu.:0.8200     3rd Qu.:0.8700   3rd Qu.:5.000   3rd Qu.:245.0
##  Max.   :1.0000     Max.   :1.0000   Max.   :7.000   Max.   :310.0
##  time_spend_company Work_accident         left
##  Min.   : 2.000     Min.   :0.0000   Min.   :0.0000
##  1st Qu.: 3.000     1st Qu.:0.0000   1st Qu.:0.0000
##  Median : 3.000     Median :0.0000   Median :0.0000
##  Mean   : 3.498     Mean   :0.1446   Mean   :0.2381
##  3rd Qu.: 4.000     3rd Qu.:0.0000   3rd Qu.:0.0000
##  Max.   :10.000     Max.   :1.0000   Max.   :1.0000
##  promotion_last_5years
##  Min.   :0.00000
##  1st Qu.:0.00000
##  Median :0.00000
##  Mean   :0.02127
##  3rd Qu.:0.00000
##  Max.   :1.00000
##   satisfaction_level last_evaluation number_project average_montly_hours
## 1          0.6100000       0.8891911       4.936889             243.6409
## 2          0.4159149       0.5306140       2.178723             144.8322
## 3          0.2511361       0.8628964       5.780275             285.0799
##   time_spend_company Work_accident promotion_last_5years
## 1           4.778667    0.05333333          0.0008888889
## 2           3.071733    0.04741641          0.0091185410
## 3           4.262172    0.03870162          0.0037453184
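The three-row table above reads like per-cluster means of the numeric variables. This section does not say which clustering method produced it, so the following is only a minimal sketch, assuming k-means with three centers. The `toy` data frame is synthetic and made up for illustration (it is not the Kaggle data), and scaling before clustering is a choice made here, not something the source confirms.

```r
set.seed(42)  # make the synthetic data and the clustering reproducible

# Synthetic stand-in for three numeric columns (ASSUMPTION: values are invented,
# only the column names follow the dataset shown above).
toy <- data.frame(
  satisfaction_level   = c(rnorm(50, 0.80, 0.03), rnorm(50, 0.40, 0.03), rnorm(50, 0.10, 0.02)),
  last_evaluation      = c(rnorm(50, 0.90, 0.03), rnorm(50, 0.50, 0.03), rnorm(50, 0.85, 0.03)),
  average_montly_hours = c(rnorm(50, 245, 8),     rnorm(50, 145, 8),     rnorm(50, 285, 8))
)

# Standardize the columns so monthly hours do not dominate the distances
# (ASSUMPTION: the original analysis may have clustered unscaled data).
fit <- kmeans(scale(toy), centers = 3, nstart = 25)

# Per-cluster means on the original scale, one row per cluster.
centers <- aggregate(toy, by = list(cluster = fit$cluster), FUN = mean)
print(centers)
```

Cluster labels from `kmeans()` are arbitrary, so the row order of `centers` carries no meaning by itself; clusters are best identified by their mean profiles.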
The secondary factor behind satisfied employees' decision to quit is time spent at the company. Employees who have been with the same company for between 4.5 and 6.5 years leave their comfort zone and quit because they work long hours (more than 216 hours per month), even though their bosses are very happy with their work (evaluation score > 0.80). We believe that this category corresponds to the patient, hardworking, motivated employees of Cluster 1 (i.e. Group 3).
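The rule described above can be expressed as a simple row filter. This is a minimal sketch in base R: the thresholds are the ones quoted in the text, while `hr_toy` and its four rows are hypothetical values invented for illustration. Because `time_spend_company` is stored as an integer, the 4.5 to 6.5 range amounts to 5 or 6 years.

```r
# Hypothetical toy rows (ASSUMPTION: not taken from the Kaggle data);
# column names follow the dataset.
hr_toy <- data.frame(
  last_evaluation      = c(0.90, 0.60, 0.85, 0.95),
  average_montly_hours = c(250L, 240L, 180L, 260L),
  time_spend_company   = c(5L, 5L, 6L, 3L)
)

# Thresholds as stated in the text: tenure between 4.5 and 6.5 years,
# more than 216 hours per month, evaluation score above 0.80.
matches_rule <- with(
  hr_toy,
  time_spend_company > 4.5 & time_spend_company < 6.5 &
    average_montly_hours > 216 &
    last_evaluation > 0.80
)

# Keep only the employees that satisfy all three conditions.
hr_toy[matches_rule, ]
```

Only the first toy row satisfies all three conditions; the others fail on evaluation score, monthly hours, and tenure respectively. The satisfaction cutoff that defines "satisfied" employees is not given in this section, so it is left out of the filter.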