ERROR Team Members

  1. Şule Kütükde
  2. Özge Genç
  3. Hakan Yiğit
  4. Haki Bozkurt

Introduction

In this project, we examine the International Dataset together with GDP per capita, education expenditures (% of GDP) and health expenditures (% of GDP).

The United States Census Bureau’s International Dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the data set includes midyear population figures broken down by age and gender assignment at birth. Additionally, they provide time-series data for attributes including fertility rates, birth rates, death rates, and migration rates. The U.S. Census Bureau provides estimates and projections for countries and areas that are recognized by the U.S. Department of State that have a population of at least 5,000. Kaggle Reference

From the International Dataset, the mid-year population and the mortality and life expectancy data sets are selected for analysis. The analysis proceeds as follows:

Data Analysis

1. Population Density on the World Map

First we load the required packages: purrr, leaflet, maps, tidyverse, dplyr, sp, maptools, tidyr, plyr and stringr.

#required packages
library(purrr)
library(leaflet)
library(maps)
library(tidyverse)
library(dplyr)
library(sp)
library(maptools)
library(tidyr)
library(plyr)
library(stringr)

Then the midyear_population and country_names_area data files are imported, the population density is calculated, and the world maps for 1950, 2017 and 2050 are created.

#loading data files
pop<-read.csv(file="midyear_population.csv", header=TRUE)
area<-read.csv(file="country_names_area.csv", header=TRUE)

#joining two tables and calculating population density
pop_area<-left_join(pop,area,by="country_name")%>%
  mutate(pop_dens=midyear_population/country_area)

#checking whether the joined table contains any NA values
any(is.na(pop_area))
## [1] FALSE
#renaming countries to match the map's region names, so the later merge with the map does not produce NA values
pop_area$country_name<-str_replace_all(pop_area$country_name,"Czechia","Czech Republic")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Burma","Myanmar")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Korea North","North Korea")
pop_area$country_name<-str_replace_all(pop_area$country_name,"United States","USA")
#fixed() matches the parentheses literally instead of treating them as a regex group
pop_area$country_name<-str_replace_all(pop_area$country_name,fixed("Congo (Kinshasa)"),"Democratic Republic of the Congo")
pop_area$country_name<-str_replace_all(pop_area$country_name,fixed("Congo (Brazzaville)"),"Republic of Congo")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Cote d'Ivoire","Ivory Coast")
pop_area$country_name<-str_replace_all(pop_area$country_name,"United Kingdom","UK")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Gaza Strip","Palestine")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Korea South","South Korea")
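
One detail in the replacements above deserves a check: `str_replace_all()` interprets its pattern as a regular expression by default, so the parentheses in a name like `Congo (Kinshasa)` act as a group rather than as literal characters. The sketch below illustrates the difference with base R's `gsub()`, which follows the same convention. The sample vector `nm` is made up for illustration and is not read from the data files.

```r
# Hypothetical sample of country names (not read from the data files)
nm <- c("Congo (Kinshasa)", "Congo (Brazzaville)", "Turkey")

# Regex interpretation: "(Kinshasa)" is a group, so the pattern looks for
# the text "Congo Kinshasa" and leaves all names unchanged
regex_result <- gsub("Congo (Kinshasa)", "Democratic Republic of the Congo", nm)

# Literal interpretation: the parentheses are matched as written
fixed_result <- gsub("Congo (Kinshasa)", "Democratic Republic of the Congo", nm,
                     fixed = TRUE)

print(regex_result)
print(fixed_result)
```

In stringr, wrapping the pattern in `fixed()` plays the same role as `fixed = TRUE` does here.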

#filtering data
data1950<-pop_area%>%
  filter(year=="1950")%>%
  group_by(country_name,pop_dens)

data2017<-pop_area%>%
  filter(year=="2017")%>%
  group_by(country_name,pop_dens)

data2050<-pop_area%>%
  filter(year=="2050")%>%
  group_by(country_name,pop_dens)

#creating the world map and preparing it to be joined with the data
m=map("world", fill = TRUE, plot = FALSE)
m_nms <- sapply( strsplit( m$names, ':' ), function(x) x[1] )
m_poli <- map2SpatialPolygons(m, IDs=m_nms, proj4string=CRS("+proj=longlat +datum=WGS84"))
m_df <- data.frame(ID = names(m_poli))
rownames(m_df) <- names(m_poli)
world_spdf <- SpatialPolygonsDataFrame(m_poli, m_df)

#synchronizing map and the data
joined1950<- merge(world_spdf, data1950, by.x = 'ID', by.y = 'country_name')
joined2017<- merge(world_spdf, data2017, by.x = 'ID', by.y = 'country_name')
joined2050<- merge(world_spdf, data2050, by.x = 'ID', by.y = 'country_name')

#NA values of map regions without population data are assumed to be zero
missing <- is.na(joined1950$pop_dens)
joined1950$pop_dens[missing]<-0

missing <- is.na(joined2017$pop_dens)
joined2017$pop_dens[missing]<-0

missing <- is.na(joined2050$pop_dens)
joined2050$pop_dens[missing]<-0
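
Before filling missing densities with zero, it can help to see which map regions found no partner in the data. The following is a minimal sketch with two small made-up data frames (`map_ids` and `dens` are illustrative, not the real objects). It uses base R's `merge()` with `all.x = TRUE`, which keeps every map region and leaves `NA` where the names do not match.

```r
# Hypothetical map region names and density table (illustrative values only)
map_ids <- data.frame(ID = c("Turkey", "USA", "Antarctica"),
                      stringsAsFactors = FALSE)
dens <- data.frame(country_name = c("Turkey", "USA"),
                   pop_dens = c(105, 35),
                   stringsAsFactors = FALSE)

# Keep all map regions; unmatched ones get NA in pop_dens
joined <- merge(map_ids, dens, by.x = "ID", by.y = "country_name", all.x = TRUE)

# Regions that the NA replacement would silently set to zero
unmatched <- joined$ID[is.na(joined$pop_dens)]
print(unmatched)

# Same kind of replacement as in the analysis above
joined$pop_dens[is.na(joined$pop_dens)] <- 0
print(joined)
```

Listing the unmatched names first makes it easier to decide whether a region is truly without data or only spelled differently in the two sources.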

#creating the color palettes
bins <- c(0, 10, 20, 30,40,50, 100, 500,  Inf)
pal1<- colorBin("YlOrRd", domain = joined1950$pop_dens, bins = bins)
pal2<- colorBin("YlOrRd", domain = joined2017$pop_dens, bins = bins)
pal3<- colorBin("YlOrRd", domain = joined2050$pop_dens, bins = bins)


#alternative: quantile-based coloring
#qpal <- colorQuantile("Reds", data2017$pop_dens, n = 9)


#creating labels
labels1<- sprintf(
  "<strong>%s</strong><br/>%g people / km<sup>2</sup>",
  joined1950$ID, round(joined1950$pop_dens)
) %>% lapply(htmltools::HTML)


labels2<- sprintf(
  "<strong>%s</strong><br/>%g people / km<sup>2</sup>",
  joined2017$ID, round(joined2017$pop_dens)
) %>% lapply(htmltools::HTML)


labels3<- sprintf(
  "<strong>%s</strong><br/>%g people / km<sup>2</sup>",
  joined2050$ID, round(joined2050$pop_dens)
) %>% lapply(htmltools::HTML)



map1<-leaflet(data = joined1950 )%>% addTiles() %>%
  setView(lat = 39, lng = 35, zoom = 1) %>%
  addPolygons(color = "#444444", 
              weight = 1, 
              smoothFactor = 0.5,
              opacity = 1.0, 
              fillOpacity = 0.5,
              fillColor = ~pal1(joined1950$pop_dens),
              highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
              label=labels1,
              labelOptions=labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto"))%>%
  addLegend("bottomright", pal=pal1, values = ~joined1950$pop_dens,
            title = "<strong>people / km<sup>2</sup></strong>",
            labFormat = labelFormat(digits=5),
            opacity = 1)

map2<-leaflet(data = joined2017) %>% addTiles() %>%
  setView(lat = 39, lng = 35, zoom = 1) %>%
  addPolygons(color = "#444444", 
              weight = 1, 
              smoothFactor = 0.5,
              opacity = 1.0, 
              fillOpacity = 0.5,
              fillColor = ~pal2(joined2017$pop_dens),
              highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
              label=labels2,
              labelOptions=labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto"))%>%
  addLegend("bottomright", pal=pal2, values = ~joined2017$pop_dens,
            title = "<strong>people / km<sup>2</sup></strong>",
            labFormat = labelFormat(digits=5),
            opacity = 1)

map3<-leaflet(data = joined2050) %>% addTiles() %>%
  setView(lat = 39, lng = 35, zoom = 1) %>%
  addPolygons(color = "#444444", 
              weight = 1, 
              smoothFactor = 0.5,
              opacity = 1.0, 
              fillOpacity = 0.5,
              fillColor = ~pal3(joined2050$pop_dens),
              highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
              label=labels3,
              labelOptions=labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto"))%>%
  addLegend("bottomright", pal=pal3, values = ~joined2050$pop_dens,
            title = "<strong>people / km<sup>2</sup></strong>",
            labFormat = labelFormat(digits=5),
            opacity = 1)
  • Population Density World Map, 1950 (people/km^2)
map1
  • Population Density World Map, 2017 (people/km^2)
map2
  • Population Density World Map, 2050 (people/km^2)
map3
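
The palettes created with `colorBin()` assign each density to one of the intervals defined by `bins`. As a rough illustration of that binning step only (not of leaflet's internals), base R's `cut()` can show which interval a few example densities fall into. The density values below are invented.

```r
# Same break points as used for the density maps
bins <- c(0, 10, 20, 30, 40, 50, 100, 500, Inf)

# Invented example densities in people / km^2
dens <- c(0, 3.2, 45, 99.9, 100, 1200)

# right = FALSE gives intervals closed on the left, e.g. [0,10), [10,20)
bin_label <- cut(dens, breaks = bins, right = FALSE)
print(data.frame(dens, bin_label))
```

Because the last break is `Inf`, even very dense city states land in the final interval instead of falling off the scale.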

2. Analysis of Life Expectancy and Infant Mortality

The mortality data set has 15 variables. We will use the following: infant_mortality, infant_mortality_male, infant_mortality_female, life_expectancy, life_expectancy_male, life_expectancy_female, country_name and year.

  • infant_mortality: Both sexes infant mortality rate (infant deaths per 1,000 population)
  • infant_mortality_male: Male infant mortality rate (infant deaths per 1,000 population)
  • infant_mortality_female: Female infant mortality rate (infant deaths per 1,000 population)
  • life_expectancy: Both sexes life expectancy at birth (years)
  • life_expectancy_male: Male life expectancy at birth (years)
  • life_expectancy_female: Female life expectancy at birth (years)

  • Infant Mortality
mle<-read.csv(file="mortality_life_expectancy.csv",header=TRUE)

glimpse(mle)
## Observations: 15,106
## Variables: 15
## $ ï..country_code              <fctr> SI, SI, SI, SI, SI, SI, SI, SI, ...
## $ country_name                 <fctr> Slovenia, Slovenia, Slovenia, Sl...
## $ year                         <int> 2036, 2022, 2023, 2024, 2025, 202...
## $ infant_mortality             <dbl> 3.39, 3.76, 3.73, 3.70, 3.67, 3.6...
## $ infant_mortality_male        <dbl> 3.76, 4.22, 4.18, 4.14, 4.10, 4.0...
## $ infant_mortality_female      <dbl> 3.00, 3.27, 3.25, 3.22, 3.20, 3.1...
## $ life_expectancy              <dbl> 80.90, 79.11, 79.26, 79.40, 79.55...
## $ life_expectancy_male         <dbl> 77.51, 75.58, 75.73, 75.89, 76.04...
## $ life_expectancy_female       <dbl> 84.52, 82.89, 83.02, 83.15, 83.29...
## $ mortality_rate_under5        <dbl> 3.93, 4.43, 4.39, 4.35, 4.31, 4.2...
## $ mortality_rate_under5_male   <dbl> 4.39, 5.02, 4.97, 4.91, 4.86, 4.8...
## $ mortality_rate_under5_female <dbl> 3.44, 3.81, 3.78, 3.74, 3.71, 3.6...
## $ mortality_rate_1to4          <dbl> 0.54, 0.68, 0.67, 0.65, 0.64, 0.6...
## $ mortality_rate_1to4_male     <dbl> 0.63, 0.80, 0.79, 0.77, 0.76, 0.7...
## $ mortality_rate_1to4_female   <dbl> 0.44, 0.54, 0.53, 0.52, 0.51, 0.5...
  • Average Infant Mortality by Years (Male, Female, Both)
detach("package:plyr",unload = TRUE)
## Warning: 'plyr' namespace cannot be unloaded:
##   namespace 'plyr' is imported by 'ggplot2', 'scales', 'broom', 'reshape2' so cannot be unloaded
library(dygraphs)
library(tidyverse)

meanim<-mle%>%
  group_by(year)%>%
  summarise(le=sum(infant_mortality),lem=sum(infant_mortality_male),lef=sum(infant_mortality_female),n=n())%>%
  mutate(mean_infant_mort=le/n,mean_male_infant_mort=lem/n,mean_female_infant_mort=lef/n)%>%
  arrange(year)%>%
  select(year,mean_infant_mort,mean_male_infant_mort,mean_female_infant_mort)

meanim$year<-as.numeric(meanim$year)

infantm<-dygraph(meanim, main = "Average Infant Mortality by Years", ylab = "Infant Mortality (per 1,000)") %>%
  dyRangeSelector()
infantm
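
The yearly averages above are computed as a sum divided by the row count, which equals the arithmetic mean as long as no value is missing. A small sketch on invented numbers (the data frame `toy` is hypothetical) shows the same result with base R's `aggregate()`:

```r
# Invented infant mortality values for two years
toy <- data.frame(year = c(1990, 1990, 1991, 1991, 1991),
                  infant_mortality = c(60, 40, 55, 35, 30))

# Sum divided by count, per year
sums    <- tapply(toy$infant_mortality, toy$year, sum)
counts  <- tapply(toy$infant_mortality, toy$year, length)
by_hand <- sums / counts

# Direct mean per year
by_mean <- aggregate(infant_mortality ~ year, data = toy, FUN = mean)

print(by_hand)
print(by_mean)
```

If a column contained missing values, `sum()` would return `NA` for that year, so a call such as `mean(x, na.rm = TRUE)` may be the safer choice in that case.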
  • Infant Mortality, Male vs. Female
qplot(x = infant_mortality_male, y = infant_mortality_female, data = mle)

  • Infant Mortality Rate in 1970/2000/2017/2050
target <- c("1970", "2000", "2017", "2050")

years_chosen <- filter(mle, year %in% target)

qplot(x = infant_mortality, y = year, data = years_chosen)

  • Infant Mortality Rate - Turkey

Infant mortality in Turkey through time

mortality_turkey <- filter(mle, country_name == "Turkey")

ggplot(aes(x = infant_mortality, y = year), data = mortality_turkey)+
  geom_point()

  • Average Life Expectancy by Years (Male/Female/Both)
library(dplyr)
meanle<-mle%>%
  group_by(year)%>%
  summarise(le=sum(life_expectancy),lem=sum(life_expectancy_male),lef=sum(life_expectancy_female),n=n())%>%
  mutate(mean_life_ex=le/n,mean_male_life_ex=lem/n,mean_female_life_ex=lef/n)%>%
  arrange(year)%>%
  select(year,mean_life_ex,mean_male_life_ex,mean_female_life_ex)

meanle$year<-as.numeric(meanle$year)

lifem <- dygraph(meanle, main = "Average Life Expectancy by Years", ylab = "Life Expectancy (year)") %>%
  dyRangeSelector()
lifem
  • Life Expectancy from 1950 to 2050

The histogram shows the life expectancy values of all countries over the 100-year range (1950 to 2050). The 77.5-82.5 bin contains the largest number of observations.

#Life expectancy
g<- ggplot(mle, aes(x=life_expectancy))
g + geom_histogram(binwidth=5, fill='darkblue', color='black')+
  labs(x= 'Life Expectancy', y='Count', title='Life Expectancy')
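
With `binwidth=5` and no explicit `boundary` or `center`, the histogram bins are expected to be centered on multiples of 5, which is why the tallest bar corresponds to the 77.5-82.5 range. The sketch below reproduces that kind of binning in base R on a handful of invented values. It illustrates the bin edges only and does not use the real data.

```r
# Invented life expectancy values (years)
le <- c(52.1, 68.4, 77.6, 78.9, 80.2, 81.7, 82.4, 84.0)

# Bin edges at 37.5, 42.5, ..., 97.5 so that bins are centered on multiples of 5
edges <- seq(37.5, 97.5, by = 5)
bin <- cut(le, breaks = edges)

# Count per bin and pick the fullest one
counts <- table(bin)
fullest <- names(counts)[which.max(counts)]
print(counts[counts > 0])
print(fullest)
```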

  • Life Expectancy by Country Group (below/above the average infant mortality)
#GGplot - boxplot

mle$infant_category <- ifelse(mle$infant_mortality > mean(mle$infant_mortality),"more than av", "less than av")


  
ggplot(data=mle, aes(x=infant_category, y=life_expectancy, fill=infant_category)) + geom_boxplot() +
     stat_summary(fun.y=mean, colour="darkgreen", geom="point", 
                           shape=15, size=2,show.legend = FALSE)

  • Life Expectancy Male/Female
#Life expectancy male/female
ggplot(aes(x = life_expectancy_male, y = life_expectancy_female), data = mle)+
  geom_point()

  • Life Expectancy in years: 1970, 2000, 2017, 2050
#Life expectancy 1970/2000/2017/2050
target2 <- c("1970", "2000", "2017", "2050")

years_chosen2 <- filter(mle, year %in% target2)

qplot(x = life_expectancy, y = year, data = years_chosen2, color="life expectancy")

Analysis on Turkey

  • Summary of Life Expectancy in Turkey
#Life expectancy Turkey
summary(mle %>% 
          
  filter(country_name == "Turkey"))
##  ï..country_code         country_name      year      infant_mortality
##  TU     :71      Turkey        :71    Min.   :1980   Min.   : 6.63   
##  AA     : 0      Afghanistan   : 0    1st Qu.:1998   1st Qu.:10.47   
##  AC     : 0      Albania       : 0    Median :2015   Median :18.87   
##  AE     : 0      Algeria       : 0    Mean   :2015   Mean   :27.00   
##  AF     : 0      American Samoa: 0    3rd Qu.:2032   3rd Qu.:41.30   
##  AG     : 0      Andorra       : 0    Max.   :2050   Max.   :72.58   
##  (Other): 0      (Other)       : 0                                   
##  infant_mortality_male infant_mortality_female life_expectancy
##  Min.   : 7.18         Min.   : 6.050          Min.   :62.61  
##  1st Qu.:11.31         1st Qu.: 9.585          1st Qu.:68.73  
##  Median :20.13         Median :17.550          Median :74.57  
##  Mean   :28.48         Mean   :25.455          Mean   :73.34  
##  3rd Qu.:42.60         3rd Qu.:39.950          3rd Qu.:78.09  
##  Max.   :77.81         Max.   :67.090          Max.   :80.60  
##                                                               
##  life_expectancy_male life_expectancy_female mortality_rate_under5
##  Min.   :60.86        Min.   :64.45          Min.   : 7.47        
##  1st Qu.:67.02        1st Qu.:70.52          1st Qu.:11.85        
##  Median :72.26        Median :77.00          Median :21.51        
##  Mean   :71.14        Mean   :75.64          Mean   :31.74        
##  3rd Qu.:75.54        3rd Qu.:80.78          3rd Qu.:48.09        
##  Max.   :77.89        Max.   :83.44          Max.   :90.23        
##                                                                   
##  mortality_rate_under5_male mortality_rate_under5_female
##  Min.   : 8.05              Min.   : 6.85               
##  1st Qu.:12.70              1st Qu.:10.96               
##  Median :22.61              Median :20.36               
##  Mean   :32.99              Mean   :30.42               
##  3rd Qu.:48.81              3rd Qu.:47.34               
##  Max.   :95.08              Max.   :85.14               
##                                                         
##  mortality_rate_1to4 mortality_rate_1to4_male mortality_rate_1to4_female
##  Min.   : 0.840      Min.   : 0.880           Min.   : 0.800            
##  1st Qu.: 1.395      1st Qu.: 1.400           1st Qu.: 1.390            
##  Median : 2.690      Median : 2.530           Median : 2.860            
##  Mean   : 4.963      Mean   : 4.745           Mean   : 5.191            
##  3rd Qu.: 7.075      3rd Qu.: 6.480           3rd Qu.: 7.695            
##  Max.   :19.030      Max.   :18.730           Max.   :19.350            
##                                                                         
##  infant_category   
##  Length:71         
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
  • Infant Mortality Rates in Turkey (Male/Female)
library(dygraphs)
library(dplyr)

mtur<-mortality_turkey%>%
  select(year,infant_mortality_female,infant_mortality_male)%>%
  arrange(year)

mtur$year<-as.numeric(mtur$year)


dygraph(mtur, main = "Infant Mortality Rates in Turkey", ylab = "Infant Mortality") %>%
  dyRangeSelector()
  • Life Expectancy in Turkey (Male/Female)
mtur<-mortality_turkey%>%
  select(year,life_expectancy_male,life_expectancy_female)%>%
  arrange(year)

mtur$year<-as.numeric(mtur$year)


dygraph(mtur, main = "Life Expectancy in Turkey", ylab = "Life Expectancy") %>%
  dyRangeSelector()

Life Expectancy in the World in 2050

#renaming countries to match the map's region names before merging

mle$country_name<-str_replace_all(mle$country_name,"Czechia","Czech Republic")
mle$country_name<-str_replace_all(mle$country_name,"Burma","Myanmar")
mle$country_name<-str_replace_all(mle$country_name,"Korea North","North Korea")
mle$country_name<-str_replace_all(mle$country_name,"United States","USA")
#fixed() matches the parentheses literally instead of treating them as a regex group
mle$country_name<-str_replace_all(mle$country_name,fixed("Congo (Kinshasa)"),"Democratic Republic of the Congo")
mle$country_name<-str_replace_all(mle$country_name,fixed("Congo (Brazzaville)"),"Republic of Congo")
mle$country_name<-str_replace_all(mle$country_name,"Cote d'Ivoire","Ivory Coast")
mle$country_name<-str_replace_all(mle$country_name,"United Kingdom","UK")
mle$country_name<-str_replace_all(mle$country_name,"Gaza Strip","Palestine")
mle$country_name<-str_replace_all(mle$country_name,"Korea South","South Korea")




datamor2050<-mle%>%
  filter(year=="2050")%>%
  group_by(country_name,life_expectancy)


#creating the world map and preparing it to be joined with the data
mm=map("world", fill = TRUE, plot = FALSE)
mm_nms <- sapply( strsplit( mm$names, ':' ), function(x) x[1] )
mm_poli <- map2SpatialPolygons(mm, IDs=mm_nms, proj4string=CRS("+proj=longlat +datum=WGS84"))
mm_df <- data.frame(ID = names(mm_poli))
rownames(mm_df) <- names(mm_poli)
worldm_spdf <- SpatialPolygonsDataFrame(mm_poli, mm_df)

#synchronizing map and the data

joinedmor2050<- merge(worldm_spdf, datamor2050, by.x = 'ID', by.y = 'country_name')

#NA values of map regions without life expectancy data are assumed to be zero

missingmor <- is.na(joinedmor2050$life_expectancy)
joinedmor2050$life_expectancy[missingmor]<-0

#creating the color palettes
bins <- c(40, 50, 60, 70, 80, 90, 100, Inf)

pal3<- colorBin("YlOrRd", domain = joinedmor2050$life_expectancy, bins = bins)


#alternative for continious coloring


#creating labels


labels3<- sprintf(
  "<strong>%s</strong><br/>%g expected years",
  joinedmor2050$ID, round(joinedmor2050$life_expectancy)
) %>% lapply(htmltools::HTML)



leaflet(data = joinedmor2050) %>% addTiles() %>%
  setView(lat = 39, lng = 35, zoom = 1) %>%
  addPolygons(color = "#444444", 
              weight = 1, 
              smoothFactor = 0.5,
              opacity = 1.0, 
              fillOpacity = 0.5,
              fillColor = ~pal3(joinedmor2050$life_expectancy),
              highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
              label=labels3,
              labelOptions=labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto"))%>%
  addLegend("bottomright", pal=pal3, values = ~joinedmor2050$life_expectancy,
            title = "<strong>expected years</strong>",
            labFormat = labelFormat(digits=5),
            opacity = 1)
## Warning in pal3(joinedmor2050$life_expectancy): Some values were outside
## the color scale and will be treated as NA
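
The warning above is consistent with the zero-filling step: regions without data receive a life expectancy of 0, which lies below the first break at 40, so they fall outside every bin. A small base R sketch with invented values shows the effect. Here `cut()` stands in for the binning that the palette performs.

```r
# Same break points as used for the life expectancy palette
bins <- c(40, 50, 60, 70, 80, 90, 100, Inf)

# Invented values: two plausible expectancies and one zero-filled region
le <- c(78.3, 84.1, 0)

# Values below the lowest break cannot be assigned to any interval
bin_label <- cut(le, breaks = bins, right = FALSE)
print(bin_label)

# Number of values that would be drawn with the NA color
outside <- sum(is.na(bin_label))
print(outside)
```

Leaving such regions as `NA`, or starting the breaks at 0, would be two possible ways to avoid the warning.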

3. Life Expectancy vs. GDP, Health & Education Expenditures

#load required packages
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(tidyr)
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:plotly':
## 
##     arrange, mutate, rename, summarise
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:maps':
## 
##     ozone
## The following object is masked from 'package:purrr':
## 
##     compact
library(tidyverse)
library(readxl)

#load main data file and continent data
mle<-read.csv(file="mortality_life_expectancy.csv", header=TRUE)%>%
  filter(year==2013)
cont<-read_excel("continent.xlsx")
names(cont)<-c("country_name","continent")

#load gdp data and reorganize it for merging
gdp<-read.csv(file="gdpp.csv",skip=4, header = TRUE,as.is = TRUE)
gdp1<-gather(gdp,"year","gdp",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
gdp2<-separate(gdp1,5,sep='X',into = c("x","year"))
gdp2$gdp<-round(as.numeric(gdp2$gdp))
names(gdp2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","gdp")
gdp3<-gdp2%>%filter(year=="2013")%>%
  select(country_name,gdp)

#load education data and reorganize it for merging
edu<-read.csv(file="education.csv",skip=4, header = TRUE,as.is = TRUE)
edu1<-gather(edu,"year","edu_ex",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
edu2<-separate(edu1,5,sep='X',into = c("x","year"))
edu2$edu_ex<-round(as.numeric(edu2$edu_ex),digits = 1)
names(edu2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","edu_ex")
edu3<-edu2%>%filter(year=="2013")%>%
  select(country_name,edu_ex)

#load health data and reorganize it for merging
health<-read.csv(file="health.csv",skip=4, header = TRUE,as.is = TRUE)
health1<-gather(health,"year","h_ex",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
health2<-separate(health1,5,sep='X',into = c("x","year"))
health2$h_ex<-round(as.numeric(health2$h_ex),digits = 1)
names(health2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","h_ex")
health3<-health2%>%filter(year=="2013")%>%
  select(country_name,h_ex)
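
Each of the three indicator files is reshaped the same way: year columns such as `X2013` are gathered into a `year` column and the `X` prefix is removed. The following base R sketch mirrors that idea on a tiny invented table. The column names follow the pattern seen above, the values are hypothetical, and `reshape()` is used here in place of `gather()` and `separate()`.

```r
# Invented wide table in the layout read.csv() produces for year columns
wide <- data.frame(Country.Name = c("Turkey", "Slovenia"),
                   X2012 = c(11700, 22500),
                   X2013 = c(12500, 23400),
                   stringsAsFactors = FALSE)

# Wide to long: one row per country and year
long <- reshape(wide,
                direction = "long",
                varying = c("X2012", "X2013"),
                v.names = "gdp",
                timevar = "year",
                times = c("X2012", "X2013"),
                idvar = "Country.Name")

# Drop the "X" prefix and convert the year to a number
long$year <- as.numeric(sub("^X", "", long$year))
rownames(long) <- NULL

# Keep a single year, as in the analysis
gdp_2013 <- long[long$year == 2013, c("Country.Name", "gdp")]
print(gdp_2013)
```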

#joining all data
alldata<-join(mle,gdp3,by="country_name")
alldata<-join(alldata,edu3,by="country_name")
alldata<-join(alldata,health3,by="country_name")
alldata<-join(alldata,cont,by="country_name")

#omitting rows with missing values
alldata1<-na.omit(alldata)


#creating a color vector for continents
colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')

#Life Expectancy v. Per Capita GDP, 2013 Plot
p1 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$gdp, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
             type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
             text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>GDP:', alldata1$gdp)) %>%
  layout(title = 'Life Expectancy v. Per Capita GDP, 2013',
         xaxis = list(title =  'Life Expectancy (years)',
                      gridcolor = 'rgb(255, 255, 255)',
                      type = 'log',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         yaxis = list(title ='GDP per capita',
                      gridcolor = 'rgb(255, 255, 255)',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         paper_bgcolor = 'rgb(243, 243, 243)',
         plot_bgcolor = 'rgb(243, 243, 243)')

#Life Expectancy v. Health Expenditures of %GDP, 2013 Plot
p2 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$h_ex, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
              type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
              text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>Health Expenditure of %GDP:', alldata1$h_ex)) %>%
  layout(title = 'Life Expectancy v. Health Expenditures of %GDP, 2013',
         xaxis = list(title =  'Life Expectancy (years)',
                      gridcolor = 'rgb(255, 255, 255)',
                      type = 'log',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         yaxis = list(title ='Health Expenditures of %GDP',
                      gridcolor = 'rgb(255, 255, 255)',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         paper_bgcolor = 'rgb(243, 243, 243)',
         plot_bgcolor = 'rgb(243, 243, 243)')

#Life Expectancy v. Education Expenditures of %GDP, 2013 Plot
p3 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$edu_ex, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
              type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
              text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>Education Expenditure of %GDP:', alldata1$edu_ex)) %>%
  layout(title = 'Life Expectancy v. Education Expenditures of %GDP, 2013',
         xaxis = list(title =  'Life Expectancy (years)',
                      gridcolor = 'rgb(255, 255, 255)',
                      type = 'log',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         yaxis = list(title ='Education Expenditures of %GDP',
                      gridcolor = 'rgb(255, 255, 255)',
                      zerolinewidth = 1,
                      ticklen = 5,
                      gridwidth = 2),
         paper_bgcolor = 'rgb(243, 243, 243)',
         plot_bgcolor = 'rgb(243, 243, 243)')
  • Life Expectancies according to GDP of the countries (grouped by continents, bubble sizes = infant mortality)
p1
  • Life Expectancies according to Health Expenditures of the countries (grouped by continents, bubble sizes = infant mortality)
p2
  • Life Expectancies according to Education Expenditures of the countries (grouped by continents, bubble sizes = infant mortality)
p3

4. Correlation Matrix (Life Expectancy, Infant Mortality, GDP, Health Expenditures, Education Expenditures)

library(GGally)
## 
## Attaching package: 'GGally'
## The following object is masked from 'package:dplyr':
## 
##     nasa
#selecting alldatas' columns for correlation
alldata2<-alldata1%>%
  select(life_expectancy,infant_mortality,gdp,h_ex,edu_ex)

cor(alldata2)
##                  life_expectancy infant_mortality        gdp       h_ex
## life_expectancy        1.0000000       -0.9232561  0.6667292  0.3860973
## infant_mortality      -0.9232561        1.0000000 -0.5716238 -0.3300420
## gdp                    0.6667292       -0.5716238  1.0000000  0.4922450
## h_ex                   0.3860973       -0.3300420  0.4922450  1.0000000
## edu_ex                 0.2545437       -0.3408828  0.3371695  0.4223333
##                      edu_ex
## life_expectancy   0.2545437
## infant_mortality -0.3408828
## gdp               0.3371695
## h_ex              0.4223333
## edu_ex            1.0000000
ggpairs(alldata2,title = "Correlation Matrix")+theme_bw()

ggcorr(alldata2 , method = c("everything", "pearson"),name="Correlation",label=TRUE)
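
The entries of the matrix returned by `cor()` are Pearson correlation coefficients by default. As a brief check of what that number means, the sketch below computes one coefficient by hand on invented data and compares it with `cor()`. The vectors `x` and `y` are hypothetical.

```r
# Invented paired observations
x <- c(55, 62, 70, 74, 81)    # e.g. life expectancy
y <- c(90, 60, 35, 20, 5)     # e.g. infant mortality

# Pearson correlation: co-variation divided by the product of the spreads
r_by_hand <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in equivalent (method = "pearson" is the default)
r_builtin <- cor(x, y)

print(r_by_hand)
print(r_builtin)
```

A value close to -1, as in this toy example, means the two variables move in opposite directions almost linearly. It says nothing by itself about which one drives the other.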

5. Conclusion

  • Life Expectancy and Infant Mortality have a correlation of -0.9: when the infant mortality rate decreases, life expectancy increases.

  • Life Expectancy has a correlation of 0.7 with GDP, 0.4 with Health Expenditures and 0.3 with Education Expenditures. So GDP appears to be more strongly related to Life Expectancy than Health and Education Expenditures are.

  • Infant Mortality has a correlation of -0.6 with GDP and -0.3 with both Health Expenditures and Education Expenditures, so GDP appears to be more strongly related to lower infant mortality.

  • The population density world maps show that the countries’ population density (people/km^2) increases over time (1950, 2017, 2050).

  • Average infant mortality decreases over time, with a drastic decrease between 1980 and 1990. Male infant mortality is higher than female infant mortality.

  • Average life expectancy increases over the years, mostly between 1980 and 1990. Female life expectancy is higher than male life expectancy.

  • In Turkey, the infant mortality rate decreases from 72.58 to 6.63 over the years (up to the 2050 projection). Female life expectancy is higher than male life expectancy and increases over time.