EGM Data

First we import the egm_example_data to R with readr package as a tibble. And check with glimpse function for its structure.

egm_df <- read_excel("egm_example_data.xlsx")
glimpse(egm_df)
## Observations: 1,368
## Variables: 15
## $ date                                <chr> "06.01.2017", "06.01.2017"...
## $ pension_fund_company                <chr> "Aegon Emeklilik ve Hayat"...
## $ n_of_participants                   <dbl> 42826, 118867, 726684, 111...
## $ fund_size_participants              <dbl> 139193651, 1840701596, 738...
## $ gov_contribution                    <dbl> 14409322, 231156574, 81535...
## $ contribution                        <dbl> 109589132, 1466908275, 579...
## $ n_of_pensioners                     <chr> "269", "2300", "5640", "11...
## $ n_of_ind_contracts                  <dbl> 34003, 113222, 627972, 938...
## $ n_of_group_ind_contracts            <chr> "14127", "18813", "148377"...
## $ n_of_employer_group_certificates    <chr> "884", "4488", "98772", "2...
## $ n_total                             <dbl> 49014, 136523, 875121, 128...
## $ size_of_ind_contracts               <dbl> 87071132, 1165167454, 4001...
## $ size_of_group_ind_contracts         <dbl> 17725697, 244504362, 13248...
## $ size_of_employer_group_certificates <dbl> 1808842, 42707063, 3775014...
## $ size_total                          <dbl> 106605671, 1452378880, 570...

Here some date and numeric columns are imported as character. So we need to tidy the data. We convert columns 7,9 and 10 to integer.

cols = c(7, 9, 10)    
egm_df[,cols] = apply(egm_df[,cols], 2, function(x) as.integer(as.character(x)))

Then convert the date column to the date type with lubridate package

# converting the date column to the date type
egm_df <- egm_df %>% mutate(date=dmy(date))

# creating a year and amonth column
egm_df <- egm_df %>% mutate(month=month(date))
egm_df <- egm_df %>% mutate(year=year(date))

Analysis

First we check the number of records for each company

##number of records for companies
egm_df %>% group_by(pension_fund_company) %>% count()
## # A tibble: 19 x 2
## # Groups:   pension_fund_company [19]
##    pension_fund_company                n
##    <chr>                           <int>
##  1 Aegon Emeklilik ve Hayat           76
##  2 Allianz Hayat ve Emeklilik         76
##  3 Allianz Yasam ve Emeklilik         76
##  4 Anadolu Hayat Emeklilik            76
##  5 Asya Emeklilik ve Hayat             7
##  6 Avivasa Emeklilik ve Hayat         76
##  7 Axa Hayat ve Emeklilik             76
##  8 Bereket Emeklilik ve Hayat         69
##  9 BNP Paribas Cardif Emeklilik       76
## 10 Cigna Finans Emeklilik ve Hayat    76
## 11 Fiba Emeklilik ve Hayat            76
## 12 Garanti Emeklilik ve Hayat         76
## 13 Groupama Emeklilik                 76
## 14 Halk Hayat ve Emeklilik            76
## 15 Katilim Emeklilik ve Hayat         76
## 16 Metlife Emeklilik ve Hayat         76
## 17 NN Hayat ve Emeklilik              76
## 18 Vakif Emeklilik ve Hayat           76
## 19 Ziraat Hayat ve Emeklilik          76

Then we will remove the Asya Emeklilik ve Hayat and Bereket Emeklilik ve Hayat due to the missing months. Then we group for those companies that have an average number of pensioners over 5000 for the year 2017 and 2018

egm_df %>% filter(pension_fund_company != c("Asya Emeklilik ve Hayat", "Bereket Emeklilik ve Hayat")) %>% group_by(pension_fund_company, year) %>% summarise(number_of_pensioners = mean(n_of_pensioners)) %>% filter(number_of_pensioners > 5000)
## # A tibble: 9 x 3
## # Groups:   pension_fund_company [5]
##   pension_fund_company        year number_of_pensioners
##   <chr>                      <dbl>                <dbl>
## 1 Allianz Yasam ve Emeklilik  2017                6861.
## 2 Allianz Yasam ve Emeklilik  2018                8349.
## 3 Anadolu Hayat Emeklilik     2017               13922.
## 4 Anadolu Hayat Emeklilik     2018               16974.
## 5 Avivasa Emeklilik ve Hayat  2017               11003.
## 6 Avivasa Emeklilik ve Hayat  2018               13260.
## 7 Garanti Emeklilik ve Hayat  2017                7306.
## 8 Garanti Emeklilik ve Hayat  2018                9223.
## 9 Vakif Emeklilik ve Hayat    2018                5764.

Then we plot the change of the number of pensioners over time for the resulting companies