This project examines the International Dataset together with GDP per capita, education expenditure as a percentage of GDP and health expenditure as a percentage of GDP.
The United States Census Bureau’s International Dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the data set includes midyear population figures broken down by age and gender assignment at birth. Additionally, it provides time-series data for attributes including fertility rates, birth rates, death rates, and migration rates. The U.S. Census Bureau provides estimates and projections for countries and areas that are recognized by the U.S. Department of State and that have a population of at least 5,000. Kaggle Reference
From the International Dataset, the mid-year population and mortality/life expectancy data sets are selected for analysis. The analysis proceeds as follows:
Population Density: The leaflet and maps libraries are used to show population densities on a world map for 1950, 2017 and 2050.
Analysis of Infant Mortalities: The tidyverse and dygraphs libraries are used to compare male and female infant mortality, to look at infant mortality rates in 1970, 2000, 2017 and 2050, and to examine infant mortality rates in Turkey.
Analysis of Life Expectancies: The tidyverse and dygraphs libraries are used to examine male and female life expectancy, life expectancy over the 100-year range of the data, life expectancy in countries below and above the average infant mortality, life expectancy in 1970, 2000, 2017 and 2050, life expectancy in Turkey over time, and finally life expectancy on the world map.
Analysis of Life Expectancy according to GDP, Health & Education Expenditures: The plotly package is used to examine the relationship between life expectancy and GDP, health expenditure and education expenditure, with countries grouped by continent.
Correlation Matrix: The GGally library is used to create a correlation matrix of Life Expectancy, Infant Mortality, GDP, Health Expenditures and Education Expenditures.
First we load the required packages: purrr, leaflet, maps, tidyverse, dplyr, sp, maptools, tidyr, plyr and stringr.
#required packages
library(purrr)
library(leaflet)
library(maps)
library(tidyverse)
library(dplyr)
library(sp)
library(maptools)
library(tidyr)
library(plyr)
library(stringr)
Then the midyear_population and country_names_area data files are imported, population density is computed, and world maps for 1950, 2017 and 2050 are created.
#loading data files
pop<-read.csv(file="midyear_population.csv", header=TRUE)
area<-read.csv(file="country_names_area.csv", header=TRUE)
#joining two tables and calculating population density
pop_area<-left_join(pop,area,by="country_name")%>%
mutate(pop_dens=midyear_population/country_area)
#checking if there is any na
any(is.na(pop_area))
## [1] FALSE
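The join and density calculation above can be illustrated on a toy table. The following is a minimal sketch in base R, with made-up country names and numbers and `merge()` standing in for `left_join()`. It also lists names that found no matching area; such rows get an `NA` density, which is what the `any(is.na(...))` check looks for.

```r
# Toy stand-ins for the population and area tables (all values are invented).
pop_demo <- data.frame(
  country_name = c("Aland", "Borduria", "Cascara"),
  year = c(2017, 2017, 2017),
  midyear_population = c(1000000, 250000, 40000)
)
area_demo <- data.frame(
  country_name = c("Aland", "Borduria"),
  country_area = c(5000, 1000)
)

# Left join: keep every population row, even without a matching area.
joined_demo <- merge(pop_demo, area_demo, by = "country_name", all.x = TRUE)
joined_demo$pop_dens <- joined_demo$midyear_population / joined_demo$country_area

# Names that found no area and therefore have no density.
unmatched <- joined_demo$country_name[is.na(joined_demo$pop_dens)]
print(joined_demo)
print(unmatched)
```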
#renaming countries to the names used by the maps package, so that the join with the map does not produce NA for them
pop_area$country_name<-str_replace_all(pop_area$country_name,"Czechia","Czech Republic")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Burma","Myanmar")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Korea North","North Korea")
pop_area$country_name<-str_replace_all(pop_area$country_name,"United States","USA")
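#fixed() matches the parentheses literally; as a regular expression they would form a group and the names would never match
pop_area$country_name<-str_replace_all(pop_area$country_name,fixed("Congo (Kinshasa)"),"Democratic Republic of the Congo")
pop_area$country_name<-str_replace_all(pop_area$country_name,fixed("Congo (Brazzaville)"),"Republic of Congo")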
pop_area$country_name<-str_replace_all(pop_area$country_name,"Cote d'Ivoire","Ivory Coast")
pop_area$country_name<-str_replace_all(pop_area$country_name,"United Kingdom","UK")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Gaza Strip","Palestine")
pop_area$country_name<-str_replace_all(pop_area$country_name,"Korea South","South Korea")
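One detail worth checking in replacements like these: `str_replace_all()` treats its pattern as a regular expression, in which parentheses form a group instead of matching a literal `(` and `)`. The sketch below uses base R's `gsub()` (an illustrative substitute for the stringr call) to show the difference between a regex pattern and a literal one. In stringr, wrapping the pattern in `fixed()` gives the literal behaviour.

```r
name <- "Congo (Kinshasa)"

# As a regex, "(Kinshasa)" is a group, so the pattern looks for the text
# "Congo Kinshasa" and the literal parentheses in `name` never match.
as_regex <- gsub("Congo (Kinshasa)", "Democratic Republic of the Congo", name)

# With fixed = TRUE the pattern is taken literally and the name is replaced.
as_literal <- gsub("Congo (Kinshasa)", "Democratic Republic of the Congo",
                   name, fixed = TRUE)

print(as_regex)
print(as_literal)
```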
#filtering data
data1950<-pop_area%>%
filter(year=="1950")%>%
group_by(country_name,pop_dens)
data2017<-pop_area%>%
filter(year=="2017")%>%
group_by(country_name,pop_dens)
data2050<-pop_area%>%
filter(year=="2050")%>%
group_by(country_name,pop_dens)
#creating the world map and converting it to a spatial data frame that can be joined with the data
m=map("world", fill = TRUE, plot = FALSE)
m_nms <- sapply( strsplit( m$names, ':' ), function(x) x[1] )
m_poli <- map2SpatialPolygons(m, IDs=m_nms, proj4string=CRS("+proj=longlat +datum=WGS84"))
m_df <- data.frame(ID = names(m_poli))
rownames(m_df) <- names(m_poli)
world_spdf <- SpatialPolygonsDataFrame(m_poli, m_df)
#synchronizing map and the data
joined1950<- merge(world_spdf, data1950, by.x = 'ID', by.y = 'country_name')
joined2017<- merge(world_spdf, data2017, by.x = 'ID', by.y = 'country_name')
joined2050<- merge(world_spdf, data2050, by.x = 'ID', by.y = 'country_name')
#map polygons with no matching data (NA) are assumed to have zero density
missing <- is.na(joined1950$pop_dens)
joined1950$pop_dens[missing]<-0
missing <- is.na(joined2017$pop_dens)
joined2017$pop_dens[missing]<-0
missing <- is.na(joined2050$pop_dens)
joined2050$pop_dens[missing]<-0
#creating the color palettes
bins <- c(0, 10, 20, 30,40,50, 100, 500, Inf)
pal1<- colorBin("YlOrRd", domain = joined1950$pop_dens, bins = bins)
pal2<- colorBin("YlOrRd", domain = joined2017$pop_dens, bins = bins)
pal3<- colorBin("YlOrRd", domain = joined2050$pop_dens, bins = bins)
#alternative for continuous coloring
#qpal <- colorQuantile("Reds", data2017$pop_dens, n = 9)
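To see which bin a given density lands in, the bin edges can be applied by hand. This sketch uses base R's `findInterval()` and assumes left-closed intervals, which is the default in `colorBin()` (`right = FALSE`). The density values are invented.

```r
bins <- c(0, 10, 20, 30, 40, 50, 100, 500, Inf)
dens_demo <- c(0, 9.9, 10, 120, 7000)

# Index of the bin [bins[i], bins[i + 1]) that contains each value.
bin_index <- findInterval(dens_demo, bins)

# Human-readable label for each bin.
bin_label <- paste0(bins[bin_index], " to ", bins[bin_index + 1])
print(data.frame(dens_demo, bin_index, bin_label))
```

A density of exactly 10 falls in the second bin, and anything from 500 upward shares the last one.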
#creating labels
labels1<- sprintf(
"<strong>%s</strong><br/>%g people / km<sup>2</sup>",
joined1950$ID, round(joined1950$pop_dens)
) %>% lapply(htmltools::HTML)
labels2<- sprintf(
"<strong>%s</strong><br/>%g people / km<sup>2</sup>",
joined2017$ID, round(joined2017$pop_dens)
) %>% lapply(htmltools::HTML)
labels3<- sprintf(
"<strong>%s</strong><br/>%g people / km<sup>2</sup>",
joined2050$ID, round(joined2050$pop_dens)
) %>% lapply(htmltools::HTML)
map1<-leaflet(data = joined1950 )%>% addTiles() %>%
setView(lat = 39, lng = 35, zoom = 1) %>%
addPolygons(color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~pal1(joined1950$pop_dens),
highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
label=labels1,
labelOptions=labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"))%>%
addLegend("bottomright", pal=pal1, values = ~joined1950$pop_dens,
title = "<strong>people / km<sup>2</sup></strong>",
labFormat = labelFormat(digits=5),
opacity = 1)
map2<-leaflet(data = joined2017) %>% addTiles() %>%
setView(lat = 39, lng = 35, zoom = 1) %>%
addPolygons(color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~pal2(joined2017$pop_dens),
highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
label=labels2,
labelOptions=labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"))%>%
addLegend("bottomright", pal=pal2, values = ~joined2017$pop_dens,
title = "<strong>people / km<sup>2</sup></strong>",
labFormat = labelFormat(digits=5),
opacity = 1)
map3<-leaflet(data = joined2050) %>% addTiles() %>%
setView(lat = 39, lng = 35, zoom = 1) %>%
addPolygons(color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~pal3(joined2050$pop_dens),
highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
label=labels3,
labelOptions=labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"))%>%
addLegend("bottomright", pal=pal3, values = ~joined2050$pop_dens,
title = "<strong>people / km<sup>2</sup></strong>",
labFormat = labelFormat(digits=5),
opacity = 1)
map1
map2
map3
The mortality data set has 15 variables. We use the following: infant_mortality, infant_mortality_male, infant_mortality_female, life_expectancy, life_expectancy_male, life_expectancy_female, country_name and year.
infant_mortality: Both sexes infant mortality rate (infant deaths per 1,000 population)
infant_mortality_male: Male infant mortality rate (infant deaths per 1,000 population)
infant_mortality_female: Female infant mortality rate (infant deaths per 1,000 population)
life_expectancy: Both sexes life expectancy at birth (years)
life_expectancy_male: Male life expectancy at birth (years)
life_expectancy_female: Female life expectancy at birth (years)
mle<-read.csv(file="mortality_life_expectancy.csv",header=TRUE)
glimpse(mle)
## Observations: 15,106
## Variables: 15
## $ ï..country_code <fctr> SI, SI, SI, SI, SI, SI, SI, SI, ...
## $ country_name <fctr> Slovenia, Slovenia, Slovenia, Sl...
## $ year <int> 2036, 2022, 2023, 2024, 2025, 202...
## $ infant_mortality <dbl> 3.39, 3.76, 3.73, 3.70, 3.67, 3.6...
## $ infant_mortality_male <dbl> 3.76, 4.22, 4.18, 4.14, 4.10, 4.0...
## $ infant_mortality_female <dbl> 3.00, 3.27, 3.25, 3.22, 3.20, 3.1...
## $ life_expectancy <dbl> 80.90, 79.11, 79.26, 79.40, 79.55...
## $ life_expectancy_male <dbl> 77.51, 75.58, 75.73, 75.89, 76.04...
## $ life_expectancy_female <dbl> 84.52, 82.89, 83.02, 83.15, 83.29...
## $ mortality_rate_under5 <dbl> 3.93, 4.43, 4.39, 4.35, 4.31, 4.2...
## $ mortality_rate_under5_male <dbl> 4.39, 5.02, 4.97, 4.91, 4.86, 4.8...
## $ mortality_rate_under5_female <dbl> 3.44, 3.81, 3.78, 3.74, 3.71, 3.6...
## $ mortality_rate_1to4 <dbl> 0.54, 0.68, 0.67, 0.65, 0.64, 0.6...
## $ mortality_rate_1to4_male <dbl> 0.63, 0.80, 0.79, 0.77, 0.76, 0.7...
## $ mortality_rate_1to4_female <dbl> 0.44, 0.54, 0.53, 0.52, 0.51, 0.5...
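The stray characters in front of `country_code` in the output above are typical of a UTF-8 byte-order mark (BOM) at the start of the CSV being read as part of the first header. A minimal sketch, assuming that is the cause here: passing `fileEncoding = "UTF-8-BOM"` to `read.csv()` drops the mark. The temporary file and its contents are made up for the demonstration.

```r
# Write a small CSV that starts with the three BOM bytes EF BB BF.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("country_code,country_name\nSI,Slovenia\n"), con)
close(con)

# Declaring the encoding lets R strip the BOM from the first column name.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
print(names(demo))
unlink(tmp)
```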
detach("package:plyr",unload = TRUE)
## Warning: 'plyr' namespace cannot be unloaded:
## namespace 'plyr' is imported by 'ggplot2', 'scales', 'broom', 'reshape2' so cannot be unloaded
library(dygraphs)
library(tidyverse)
meanim<-mle%>%
group_by(year)%>%
summarise(le=sum(infant_mortality),lem=sum(infant_mortality_male),lef=sum(infant_mortality_female),n=n())%>%
mutate(mean_infant_mort=le/n,mean_male_infant_mort=lem/n,mean_female_infant_mort=lef/n)%>%
arrange(year)%>%
select(year,mean_infant_mort,mean_male_infant_mort,mean_female_infant_mort)
meanim$year<-as.numeric(meanim$year)
infantm<-dygraph(meanim, main = "Average Infant Mortality by Years", ylab = "Infant Mortality (per 1,000)") %>%
dyRangeSelector()
infantm
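Dividing each yearly sum by the row count, as above, is the same as taking the mean within each year. A small sketch with invented numbers, using base R's `aggregate()` in place of the dplyr pipeline, shows the equivalence.

```r
im_demo <- data.frame(
  year = c(1990, 1990, 2000, 2000),
  infant_mortality = c(60, 40, 30, 20)
)

# Sum divided by count, per year (mirrors the sum / n() approach).
sums <- aggregate(infant_mortality ~ year, data = im_demo, FUN = sum)
counts <- aggregate(infant_mortality ~ year, data = im_demo, FUN = length)
by_sum <- sums$infant_mortality / counts$infant_mortality

# Direct mean per year.
by_mean <- aggregate(infant_mortality ~ year, data = im_demo, FUN = mean)
print(by_mean)
```

In the dplyr version, `sum()` and `mean()` both return `NA` for a year if any value in it is missing, unless `na.rm = TRUE` is set.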
qplot(x = infant_mortality_male, y = infant_mortality_female, data = mle)
target <- c("1970", "2000", "2017", "2050")
years_chosen <- filter(mle, year %in% target)
qplot(x = infant_mortality, y = year, data = years_chosen)
Infant mortality in Turkey through time
mortality_turkey <- filter(mle, country_name == "Turkey")
ggplot(aes(x = infant_mortality, y = year), data = mortality_turkey)+
geom_point()
library(dplyr)
meanle<-mle%>%
group_by(year)%>%
summarise(le=sum(life_expectancy),lem=sum(life_expectancy_male),lef=sum(life_expectancy_female),n=n())%>%
mutate(mean_life_ex=le/n,mean_male_life_ex=lem/n,mean_female_life_ex=lef/n)%>%
arrange(year)%>%
select(year,mean_life_ex,mean_male_life_ex,mean_female_life_ex)
meanle$year<-as.numeric(meanle$year)
lifem <- dygraph(meanle, main = "Average Life Expectancy by Years", ylab = "Life Expectancy (year)") %>%
dyRangeSelector()
lifem
The histogram below shows the distribution of life expectancy across all countries over the 100-year range of the data. The 77.5-82.5 bin contains the most observations.
#Life expectancy
g<- ggplot(mle, aes(x=life_expectancy))
g + geom_histogram(binwidth=5, fill='darkblue', color='black')+
labs(x= 'Life Expectancy', y='Count', title='Life Expectancy')
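The modal bin read off the histogram can also be checked numerically. The sketch below assumes the bins are 5 years wide with edges at 2.5, 7.5 and so on, which is consistent with the 77.5-82.5 bin mentioned above, and uses invented values in place of `mle$life_expectancy`.

```r
le_demo <- c(55.2, 68.4, 71.9, 78.1, 79.3, 80.0, 81.7, 83.5, 84.2)

# Bin edges every 5 years, offset by 2.5.
edges <- seq(2.5, 102.5, by = 5)
bin <- cut(le_demo, breaks = edges)

# Count observations per bin and report the fullest one.
counts <- table(bin)
modal_bin <- names(counts)[which.max(counts)]
print(modal_bin)
print(counts[[modal_bin]])
```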
#GGplot - boxplot
mle$infant_category <- ifelse(mle$infant_mortality > mean(mle$infant_mortality),"more than av", "less than av")
ggplot(data=mle, aes(x=infant_category, y=life_expectancy, fill=infant_category)) + geom_boxplot() +
#fun replaces the fun.y argument, which is deprecated since ggplot2 3.3.0
stat_summary(fun=mean, colour="darkgreen", geom="point",
shape=15, size=2,show.legend = FALSE)
#Life expectancy male/female
ggplot(aes(x = life_expectancy_male, y = life_expectancy_female), data = mle)+
geom_point()
#Life expectancy 1970/2000/2017/2050
target2 <- c("1970", "2000", "2017", "2050")
years_chosen2 <- filter(mle, year %in% target2)
qplot(x = life_expectancy, y = year, data = years_chosen2, color="life expectancy")
#Life expectancy Turkey
summary(mle %>%
filter(country_name == "Turkey"))
## ï..country_code country_name year infant_mortality
## TU :71 Turkey :71 Min. :1980 Min. : 6.63
## AA : 0 Afghanistan : 0 1st Qu.:1998 1st Qu.:10.47
## AC : 0 Albania : 0 Median :2015 Median :18.87
## AE : 0 Algeria : 0 Mean :2015 Mean :27.00
## AF : 0 American Samoa: 0 3rd Qu.:2032 3rd Qu.:41.30
## AG : 0 Andorra : 0 Max. :2050 Max. :72.58
## (Other): 0 (Other) : 0
## infant_mortality_male infant_mortality_female life_expectancy
## Min. : 7.18 Min. : 6.050 Min. :62.61
## 1st Qu.:11.31 1st Qu.: 9.585 1st Qu.:68.73
## Median :20.13 Median :17.550 Median :74.57
## Mean :28.48 Mean :25.455 Mean :73.34
## 3rd Qu.:42.60 3rd Qu.:39.950 3rd Qu.:78.09
## Max. :77.81 Max. :67.090 Max. :80.60
##
## life_expectancy_male life_expectancy_female mortality_rate_under5
## Min. :60.86 Min. :64.45 Min. : 7.47
## 1st Qu.:67.02 1st Qu.:70.52 1st Qu.:11.85
## Median :72.26 Median :77.00 Median :21.51
## Mean :71.14 Mean :75.64 Mean :31.74
## 3rd Qu.:75.54 3rd Qu.:80.78 3rd Qu.:48.09
## Max. :77.89 Max. :83.44 Max. :90.23
##
## mortality_rate_under5_male mortality_rate_under5_female
## Min. : 8.05 Min. : 6.85
## 1st Qu.:12.70 1st Qu.:10.96
## Median :22.61 Median :20.36
## Mean :32.99 Mean :30.42
## 3rd Qu.:48.81 3rd Qu.:47.34
## Max. :95.08 Max. :85.14
##
## mortality_rate_1to4 mortality_rate_1to4_male mortality_rate_1to4_female
## Min. : 0.840 Min. : 0.880 Min. : 0.800
## 1st Qu.: 1.395 1st Qu.: 1.400 1st Qu.: 1.390
## Median : 2.690 Median : 2.530 Median : 2.860
## Mean : 4.963 Mean : 4.745 Mean : 5.191
## 3rd Qu.: 7.075 3rd Qu.: 6.480 3rd Qu.: 7.695
## Max. :19.030 Max. :18.730 Max. :19.350
##
## infant_category
## Length:71
## Class :character
## Mode :character
##
##
##
##
library(dygraphs)
library(dplyr)
mtur<-mortality_turkey%>%
select(year,infant_mortality_female,infant_mortality_male)%>%
arrange(year)
mtur$year<-as.numeric(mtur$year)
dygraph(mtur, main = "Infant Mortality Rates in Turkey", ylab = "Infant Mortality") %>%
dyRangeSelector()
mtur<-mortality_turkey%>%
select(year,life_expectancy_male,life_expectancy_female)%>%
arrange(year)
mtur$year<-as.numeric(mtur$year)
dygraph(mtur, main = "Life Expectancy in Turkey", ylab = "Life Expectancy") %>%
dyRangeSelector()
#renaming countries to the names used by the maps package, so that the join with the map matches them
mle$country_name<-str_replace_all(mle$country_name,"Czechia","Czech Republic")
mle$country_name<-str_replace_all(mle$country_name,"Burma","Myanmar")
mle$country_name<-str_replace_all(mle$country_name,"Korea North","North Korea")
mle$country_name<-str_replace_all(mle$country_name,"United States","USA")
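#fixed() matches the parentheses literally; as a regular expression they would form a group and the names would never match
mle$country_name<-str_replace_all(mle$country_name,fixed("Congo (Kinshasa)"),"Democratic Republic of the Congo")
mle$country_name<-str_replace_all(mle$country_name,fixed("Congo (Brazzaville)"),"Republic of Congo")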
mle$country_name<-str_replace_all(mle$country_name,"Cote d'Ivoire","Ivory Coast")
mle$country_name<-str_replace_all(mle$country_name,"United Kingdom","UK")
mle$country_name<-str_replace_all(mle$country_name,"Gaza Strip","Palestine")
mle$country_name<-str_replace_all(mle$country_name,"Korea South","South Korea")
datamor2050<-mle%>%
filter(year=="2050")%>%
group_by(country_name,life_expectancy)
#creating the world map and converting it to a spatial data frame that can be joined with the data
mm=map("world", fill = TRUE, plot = FALSE)
mm_nms <- sapply( strsplit( mm$names, ':' ), function(x) x[1] )
mm_poli <- map2SpatialPolygons(mm, IDs=mm_nms, proj4string=CRS("+proj=longlat +datum=WGS84"))
mm_df <- data.frame(ID = names(mm_poli))
rownames(mm_df) <- names(mm_poli)
worldm_spdf <- SpatialPolygonsDataFrame(mm_poli, mm_df)
#synchronizing map and the data
joinedmor2050<- merge(worldm_spdf, datamor2050, by.x = 'ID', by.y = 'country_name')
#map polygons with no matching data (NA) are set to zero; note that zero lies below the first bin (40) of the palette
missingmor <- is.na(joinedmor2050$life_expectancy)
joinedmor2050$life_expectancy[missingmor]<-0
#creating the color palettes
bins <- c(40, 50, 60, 70, 80, 90, 100, Inf)
pal3<- colorBin("YlOrRd", domain = joinedmor2050$life_expectancy, bins = bins)
#alternative for continuous coloring
#creating labels
labels3<- sprintf(
"<strong>%s</strong><br/>%g expected years",
joinedmor2050$ID, round(joinedmor2050$life_expectancy)
) %>% lapply(htmltools::HTML)
leaflet(data = joinedmor2050) %>% addTiles() %>%
setView(lat = 39, lng = 35, zoom = 1) %>%
addPolygons(color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~pal3(joinedmor2050$life_expectancy),
highlightOptions = highlightOptions(color = "white", weight = 2,bringToFront = TRUE),
label=labels3,
labelOptions=labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"))%>%
addLegend("bottomright", pal=pal3, values = ~joinedmor2050$life_expectancy,
title = "<strong>expected years</strong>",
labFormat = labelFormat(digits=5),
opacity = 1)
## Warning in pal3(joinedmor2050$life_expectancy): Some values were outside
## the color scale and will be treated as NA
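The warning above is probably a consequence of the zero fill: the bins start at 40, so a life expectancy of 0 lies outside every bin. The sketch below reproduces the binning with base R's `cut()`, assuming left-closed intervals as in the `colorBin()` default, and uses invented values.

```r
bins <- c(40, 50, 60, 70, 80, 90, 100, Inf)
le_demo <- c(0, 45, 83.4)   # 0 stands for a polygon with no matching data

# Values below the first edge get no bin and come back as NA.
binned <- cut(le_demo, breaks = bins, right = FALSE, include.lowest = TRUE)
outside <- is.na(binned)
print(data.frame(le_demo, binned, outside))
```

Leaving unmatched polygons as `NA` instead of zero would let the palette's `na.color` argument handle them.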
#load required
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(tidyr)
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:plotly':
##
## arrange, mutate, rename, summarise
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following object is masked from 'package:maps':
##
## ozone
## The following object is masked from 'package:purrr':
##
## compact
library(tidyverse)
library(readxl)
#load main data file and continent data
mle<-read.csv(file="mortality_life_expectancy.csv", header=TRUE)%>%
filter(year==2013)
cont<-read_excel("continent.xlsx")
names(cont)<-c("country_name","continent")
#load gdp data and reorganize to merge
gdp<-read.csv(file="gdpp.csv",skip=4, header = TRUE,as.is = TRUE)
gdp1<-gather(gdp,"year","gdp",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
gdp2<-separate(gdp1,5,sep='X',into = c("x","year"))
gdp2$gdp<-round(as.numeric(gdp2$gdp))
names(gdp2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","gdp")
gdp3<-gdp2%>%filter(year=="2013")%>%
select(country_name,gdp)
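The reshaping step relies on `read.csv()` prefixing the year column names with `X` (for example `X2013`), which `gather()` and `separate()` then turn into a `year` column. A minimal base R sketch of the same idea, on an invented two-country table with hypothetical values:

```r
# Wide layout: one row per country, one column per year (values are invented).
wide <- data.frame(
  Country.Name = c("A", "B"),
  X2012 = c(1, 2),
  X2013 = c(3, 4)
)

# Pick the year columns and stack them into a long table.
year_cols <- grep("^X[0-9]{4}$", names(wide), value = TRUE)
long <- data.frame(
  country_name = rep(wide$Country.Name, times = length(year_cols)),
  year = rep(as.integer(sub("^X", "", year_cols)), each = nrow(wide)),
  gdp = unlist(wide[year_cols], use.names = FALSE)
)

# Keep a single year, as done for 2013 above.
gdp_2013 <- subset(long, year == 2013, select = c(country_name, gdp))
print(gdp_2013)
```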
#load education data and reorganize to merge
edu<-read.csv(file="education.csv",skip=4, header = TRUE,as.is = TRUE)
edu1<-gather(edu,"year","edu_ex",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
edu2<-separate(edu1,5,sep='X',into = c("x","year"))
edu2$edu_ex<-round(as.numeric(edu2$edu_ex),digits = 1)
names(edu2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","edu_ex")
edu3<-edu2%>%filter(year=="2013")%>%
select(country_name,edu_ex)
#load health data and reorganize to merge
health<-read.csv(file="health.csv",skip=4, header = TRUE,as.is = TRUE)
health1<-gather(health,"year","h_ex",-Country.Name,-Country.Code,-Indicator.Name,-Indicator.Code)
health2<-separate(health1,5,sep='X',into = c("x","year"))
health2$h_ex<-round(as.numeric(health2$h_ex),digits = 1)
names(health2)<-c("country_name","country_code","indicator_name","indicator_code","x","year","h_ex")
health3<-health2%>%filter(year=="2013")%>%
select(country_name,h_ex)
#joining all data
alldata<-join(mle,gdp3,by="country_name")
alldata<-join(alldata,edu3,by="country_name")
alldata<-join(alldata,health3,by="country_name")
alldata<-join(alldata,cont,by="country_name")
#omitting rows with missing values
alldata1<-na.omit(alldata)
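`na.omit()` drops every country that is missing any one of the joined indicators, so it is worth checking how many rows survive. A sketch with invented data, using base R's `merge()` as a stand-in for `join()`:

```r
le_demo <- data.frame(country_name = c("A", "B", "C"),
                      life_expectancy = c(70, 75, 80))
gdp_demo <- data.frame(country_name = c("A", "B"),
                       gdp = c(1000, NA))

# Left join keeps all three countries; C has no GDP row, B has a missing value.
all_demo <- merge(le_demo, gdp_demo, by = "country_name", all.x = TRUE)

# Rows with every column present, and the countries that would be dropped.
complete_demo <- na.omit(all_demo)
dropped <- setdiff(all_demo$country_name, complete_demo$country_name)
print(nrow(complete_demo))
print(dropped)
```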
#creating a color vector for continents
colors <- c('#4AC6B7', '#1972A4', '#965F8A', '#FF7070', '#C61951')
#Life Expectancy v. Per Capita GDP, 2013 Plot
p1 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$gdp, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>GDP:', alldata1$gdp)) %>%
layout(title = 'Life Expectancy v. Per Capita GDP, 2013',
xaxis = list(title = 'Life Expectancy (years)',
gridcolor = 'rgb(255, 255, 255)',
type = 'log',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
yaxis = list(title ='GDP per capita',
gridcolor = 'rgb(255, 255, 255)',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
paper_bgcolor = 'rgb(243, 243, 243)',
plot_bgcolor = 'rgb(243, 243, 243)')
#Life Expectancy v. Health Expenditures of %GDP, 2013 Plot
p2 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$h_ex, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>Health Expenditure of %GDP:', alldata1$h_ex)) %>%
layout(title = 'Life Expectancy v. Health Expenditures of %GDP, 2013',
xaxis = list(title = 'Life Expectancy (years)',
gridcolor = 'rgb(255, 255, 255)',
type = 'log',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
yaxis = list(title ='Health Expenditures of %GDP',
gridcolor = 'rgb(255, 255, 255)',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
paper_bgcolor = 'rgb(243, 243, 243)',
plot_bgcolor = 'rgb(243, 243, 243)')
#Life Expectancy v. Education Expenditures of %GDP, 2013 Plot
p3 <- plot_ly(alldata1, x = alldata1$life_expectancy, y = alldata1$edu_ex, color = alldata1$continent, size = alldata1$infant_mortality, colors = colors,
type = 'scatter', mode = 'markers',marker = list(symbol = 'circle', sizemode = 'diameter',line = list(width = 2, color = '#FFFFFF')),
text = ~paste('Country:', alldata1$country_name, '<br>Life Expectancy:', round(alldata1$life_expectancy), '<br>Education Expenditure of %GDP:', alldata1$edu_ex)) %>%
layout(title = 'Life Expectancy v. Education Expenditures of %GDP, 2013',
xaxis = list(title = 'Life Expectancy (years)',
gridcolor = 'rgb(255, 255, 255)',
type = 'log',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
yaxis = list(title ='Education Expenditures of %GDP',
gridcolor = 'rgb(255, 255, 255)',
zerolinewidth = 1,
ticklen = 5,
gridwidth = 2),
paper_bgcolor = 'rgb(243, 243, 243)',
plot_bgcolor = 'rgb(243, 243, 243)')
p1
p2
p3
library(GGally)
##
## Attaching package: 'GGally'
## The following object is masked from 'package:dplyr':
##
## nasa
#selecting alldatas' columns for correlation
alldata2<-alldata1%>%
select(life_expectancy,infant_mortality,gdp,h_ex,edu_ex)
cor(alldata2)
## life_expectancy infant_mortality gdp h_ex
## life_expectancy 1.0000000 -0.9232561 0.6667292 0.3860973
## infant_mortality -0.9232561 1.0000000 -0.5716238 -0.3300420
## gdp 0.6667292 -0.5716238 1.0000000 0.4922450
## h_ex 0.3860973 -0.3300420 0.4922450 1.0000000
## edu_ex 0.2545437 -0.3408828 0.3371695 0.4223333
## edu_ex
## life_expectancy 0.2545437
## infant_mortality -0.3408828
## gdp 0.3371695
## h_ex 0.4223333
## edu_ex 1.0000000
ggpairs(alldata2,title = "Correlation Matrix")+theme_bw()
ggcorr(alldata2 , method = c("everything", "pearson"),name="Correlation",label=TRUE)
Life Expectancy and Infant Mortality have a correlation of -0.92: where the infant mortality rate is lower, life expectancy is higher.
Life Expectancy has a correlation of 0.67 with GDP, 0.39 with Health Expenditures and 0.25 with Education Expenditures. So we can conclude that Life Expectancy is more strongly related to GDP than to Health and Education Expenditures.
Infant Mortality has a correlation of -0.57 with GDP, -0.33 with Health Expenditures and -0.34 with Education Expenditures, so we can conclude that lower infant mortality is more strongly related to GDP than to either expenditure share.
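Pearson correlation measures linear association, and per-capita GDP is typically right-skewed, so a rank-based coefficient can be a useful cross-check. The following sketch uses simulated data (not the project's data) and base R's `cor()` to compare the two methods. The variable names and the simulated relationship are illustrative assumptions.

```r
set.seed(1)
n <- 100

# Simulated, skewed "GDP" and a life expectancy that rises with log(GDP).
gdp_sim <- exp(rnorm(n, mean = 9, sd = 1))
le_sim <- 40 + 4 * log(gdp_sim) + rnorm(n, sd = 2)

# Linear (Pearson) versus rank-based (Spearman) correlation.
pearson <- cor(gdp_sim, le_sim, method = "pearson")
spearman <- cor(gdp_sim, le_sim, method = "spearman")
print(c(pearson = pearson, spearman = spearman))
```

On the project's data the same comparison would be `cor(alldata2, method = "spearman")` next to the Pearson matrix shown above.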
The population density world maps show countries’ population density (people/km^2) increasing over time (1950, 2017, 2050).
Average infant mortality decreases over time, with a drastic decrease between 1980 and 1990. Male infant mortality is higher than female infant mortality.
Average life expectancy increases over the years, mostly between 1980 and 1990. Female life expectancy is higher than male life expectancy.
In Turkey, the infant mortality rate decreases from 72.58 to 6.63 over the years (through the 2050 projection). Female life expectancy is higher than male life expectancy and increases over time.
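As a worked figure, using the maximum and minimum infant mortality reported for Turkey in the summary output (72.58 and 6.63), the relative decline over the period works out as follows:

```r
im_max <- 72.58  # highest value in the Turkey summary
im_min <- 6.63   # lowest value in the Turkey summary

# Relative decline, in percent.
decline_pct <- round(100 * (im_max - im_min) / im_max, 1)
print(decline_pct)  # 90.9
```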