2- Source Files For The Analysis

We use TUIK data to map domestic migration in Turkey. The original Excel file can be found at this link.

2.1- Description Of The Data Set

The data set we obtained from the TUIK website covers four years of population movement within Turkey. The original Excel file consists of 329 rows and 86 columns. Each row represents a destination city that people migrated to, and each column gives the distribution of the migrated population by place of birth. There are 81 distinct Turkish cities as destinations and 83 distinct place-of-birth columns: the 81 cities plus two more for people born abroad and for unknown locations.

3- Objectives

  • The primary goal of this analysis is to find and address the specific patterns that shape migration dynamics in Turkey.
  • First we analyze how the migration figures evolved over four years, then we visualize the common patterns observed during this period.
  • Finally we link additional demographic statistics from TUIK and show what the population density across cities would be if people lived where they were born instead of where they actually migrated to.

4- Abstract

Human migration is the movement of people from one place to another with the intention of settling, permanently or temporarily, in a new location. The movement is often over long distances and from one country to another, but migration within a country is also possible; indeed, it is the dominant form globally. Migration within the borders of a country is called internal migration; migration across a country's border is called external or international migration. Migration can occur for social, economic, political, cultural, and ethnic reasons.

In Turkey, internal migration began after the Second World War with the modernization of agriculture and with industrialization. Migration first occurred from village to city, then from small and medium-sized cities to large cities. In the 1990s, a new form of migration emerged: from cities to villages. Internal migration has caused social changes in both city and village settlements. Along with these changes a number of problems emerged, especially in cities, and these problems persist to this day. This study analyzes internal migration from one city to another in Turkey between 2014 and 2017.

5- Data Cleaning & Pre-processing

We will go through the steps of reading, reshaping, organizing and finally saving an RDS file in order to have a clean, multi-purpose data set.

5.1- Reading The Data

We start by reading the raw data from the Excel file and dropping some unnecessary lines that contain footnotes.

First we check the structure and the head of the data to get a feel for the data set.

5.2- Reshaping The Data

Since the file is structured in a wide (horizontal) layout, it is better to transform it to a long (vertical) format with the melt function from the reshape2 library.
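As a minimal sketch of this step, the toy example below melts a small wide table into long format. The table and its values are invented for illustration; the column names Year, Destination, Birth_Place and People mirror the ones used later in this report, and the real file has many more birth-place columns.

```r
library(reshape2)

# Toy wide table (hypothetical values): one row per destination,
# one column per place of birth.
wide <- data.frame(
  Year        = c(2017, 2017),
  Destination = c("Istanbul", "Ankara"),
  Adana       = c(120, 80),
  Bursa       = c(200, 50),
  stringsAsFactors = FALSE
)

# melt() stacks the birth-place columns into name/value pairs,
# keeping Year and Destination as identifiers.
long <- melt(
  wide,
  id.vars       = c("Year", "Destination"),
  variable.name = "Birth_Place",
  value.name    = "People"
)

long
```

Each of the two wide rows becomes one long row per birth-place column, so the result has four rows.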

5.3- Organizing & Preprocessing Data

The current data frame contains some additional information that we will not use in our analysis. We therefore remove it, and also do some renaming and formatting.

The Year column is currently in integer format, as can be seen above. To make further analysis with ggplot2 and other packages easier, it is better to convert it to date format with lubridate.
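A minimal sketch of the conversion, assuming the year is stored as a plain integer: lubridate's make_date() maps each year to 1 January of that year, which matches the "2017-01-01" value used in the filters later in this report. The vector below is illustrative only.

```r
library(lubridate)

# Illustrative integer years, as they might appear after reshaping
years <- c(2014L, 2015L, 2016L, 2017L)

# month and day default to 1, so each year becomes <year>-01-01
year_dates <- make_date(year = years)

class(year_dates)
format(year_dates)
```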

5.4- Translation from TR to EN

As the last data cleaning step, we convert Turkish (TR) characters to their English (EN) equivalents using the function below.
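The helper itself is not reproduced in this text, so the following is only a sketch of what such a function could look like. The name tr_to_en is hypothetical, and the mapping covers just the Turkish-specific letters; base R's gsub() does the replacement.

```r
# Hypothetical helper: replaces Turkish-specific letters with ASCII look-alikes.
tr_to_en <- function(x) {
  from <- c("\u00e7", "\u011f", "\u0131", "\u00f6", "\u015f", "\u00fc",
            "\u00c7", "\u011e", "\u0130", "\u00d6", "\u015e", "\u00dc")
  to   <- c("c", "g", "i", "o", "s", "u",
            "C", "G", "I", "O", "S", "U")
  # Replace one letter at a time; fixed = TRUE treats patterns literally
  for (k in seq_along(from)) {
    x <- gsub(from[k], to[k], x, fixed = TRUE)
  }
  x
}

# The inputs are Istanbul, Sanliurfa and Diyarbakir in Turkish spelling
city_names <- tr_to_en(c("\u0130stanbul", "\u015eanl\u0131urfa", "Diyarbak\u0131r"))
city_names
```

Unicode escapes are used instead of literal characters so the snippet does not depend on the file encoding.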

5.5- Saving as RDS format

An RDS file lets us work with the same data later without repeating the steps above.
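A minimal round-trip sketch with base R's saveRDS() and readRDS(). The one-row data frame and the temporary file path are placeholders, not the actual cleaned data set or its location.

```r
# Placeholder for the cleaned data set
clean_toy <- data.frame(
  Year        = as.Date("2017-01-01"),
  Destination = "Istanbul",
  Birth_Place = "Adana",
  People      = 120,
  stringsAsFactors = FALSE
)

# Write to a temporary .rds file and read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(clean_toy, rds_path)
restored <- readRDS(rds_path)

# Column types, including the Date column, survive the round trip
isTRUE(all.equal(clean_toy, restored))
```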

6- Exploratory Data Analysis (as of 2017)

We first list the top 6 cities chosen as a “Province of residence” and the top 6 cities that migrants have as a “place of birth”, so that we can compare the two rankings.
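Such rankings can be sketched with dplyr as follows. The flows table is toy data with invented numbers; only the column names follow the cleaned data set used in this report.

```r
library(dplyr)

# Toy flows (hypothetical values): people born in Birth_Place now living in Destination
flows <- tibble(
  Destination = c("Istanbul", "Istanbul", "Ankara", "Izmir", "Ankara", "Bursa"),
  Birth_Place = c("Adana", "Ankara", "Adana", "Istanbul", "Istanbul", "Adana"),
  People      = c(300, 200, 150, 120, 100, 90)
)

# Top 6 destinations by number of incoming migrants
top_destinations <- flows %>%
  group_by(Destination) %>%
  summarise(People = sum(People)) %>%
  arrange(desc(People)) %>%
  head(6)

# Top 6 places of birth by number of people who moved elsewhere
top_birth_places <- flows %>%
  group_by(Birth_Place) %>%
  summarise(People = sum(People)) %>%
  arrange(desc(People)) %>%
  head(6)

top_destinations
top_birth_places
```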

Top 6 Cities Preferred for Migration

Top 6 Cities That Migrants Have as a “Place of Birth”

As the two tables above show, the top 6 rankings differ except for Istanbul, Ankara and Izmir. We can deduce that Kocaeli, Antalya and Bursa are more likely to be chosen as a destination, whereas people born in Adana, Sanliurfa and Diyarbakir tend to migrate to other cities rather than stay in their hometowns. The reasons may be economic (job opportunities, density of industrial activity, compulsory state service), social (terrorism) or environmental (climate).

We inferred that the three biggest cities of Turkey (Istanbul, Ankara and Izmir) play a major role both as destinations and as places of origin of migrants. But what is the real impact of migration on their population? If migration into these cities exceeds migration out of them, we should expect their population to increase, and vice versa.
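The net effect can be sketched as inflow minus outflow per city and year. The example below uses toy numbers and assumes, as the rest of this analysis does, that the place of birth stands in for the city people migrated from.

```r
library(dplyr)

# Toy flows for one year (hypothetical values)
flows <- tibble(
  Year        = 2017,
  Destination = c("Istanbul", "Ankara", "Adana", "Istanbul"),
  Birth_Place = c("Adana", "Istanbul", "Istanbul", "Ankara"),
  People      = c(300, 100, 40, 200)
)

# People arriving in each city
inflow <- flows %>%
  group_by(Year, City = Destination) %>%
  summarise(In = sum(People)) %>%
  ungroup()

# People born in each city who now live elsewhere
outflow <- flows %>%
  group_by(Year, City = Birth_Place) %>%
  summarise(Out = sum(People)) %>%
  ungroup()

# Net migration: positive means the city gained population
net <- full_join(inflow, outflow, by = c("Year", "City")) %>%
  mutate(In = coalesce(In, 0), Out = coalesce(Out, 0), Net = In - Out) %>%
  arrange(desc(Net))

net
```

Because every migrant leaves one city and arrives in another, the Net column sums to zero across all cities.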

7- Findings

7.1- Three largest cities population growth

The graph below shows the net annual impact of migration on the three biggest Turkish cities. As shown, net migration increased the population of these cities between 2014 and 2017.

2015 was the peak of migration-induced population growth for all three cities, and 2016 was the lowest point.

7.2- Net Migration Impact as of 2017 by All Cities

As shown below, 26 of the 81 cities are net receivers. The population of the remaining 55 cities declined due to migration. From this perspective, we can say that in 2017 migration concentrated the population in 32% of Turkish cities. The graph also shows that net receivers tend to cluster in the western, more industrialized part of the country. Without any other information, we can assume that the majority of migration in 2017 is explained by economic motives.
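As a small sketch of the classification, a city counts as a net receiver when its net migration is positive. The three net values below are invented; the final line checks the 32% share quoted above from the 26 and 81 given in the text.

```r
# Toy net migration figures (hypothetical values)
net <- c(Istanbul = 360, Ankara = -100, Adana = -260)

# Positive net migration marks a net receiver
status <- ifelse(net > 0, "net receiver", "net sender")
table(status)

# Share of net receivers reported in the text: 26 of 81 cities
receiver_share <- round(100 * 26 / 81)
receiver_share
```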

8- Geographical Distribution of Migration to Three Biggest Cities

We are going to visualize migration to the three biggest cities on a map of Turkey.

To do that, we need to download a political map of Turkey from this web page. It is for non-commercial use only. First select Turkey, then select R (SpatialPolygonsDataFrame) with level 1 information, which includes only cities (level 2 would also include towns).

8.1- Geographical Distribution of Migration to Istanbul and Ankara as of 2017

The two maps below illustrate the source cities of migration to Istanbul and Ankara as of 2017.

library(dplyr)      # filter(), left_join(), tibble(), %>%
library(ggplot2)    # ggplot(), geom_polygon(), scale_fill_distiller()
library(gridExtra)  # grid.arrange()

# Istanbul 2017: flows into Istanbul, joined to the map by place of birth
ist17 <- clean_data %>%
  filter(Year == "2017-01-01", Destination == "Istanbul")
mig_ist <- tibble(id = rownames(TRmap@data), Birth_Place = as.character(TRmap@data$NAME_1)) %>%
  left_join(ist17, by = "Birth_Place")
mig_ist_map <- left_join(TRcity, mig_ist, by = "id")

# Ankara 2017: same steps for Ankara
ank17 <- clean_data %>%
  filter(Year == "2017-01-01", Destination == "Ankara")
mig_ank <- tibble(id = rownames(TRmap@data), Birth_Place = as.character(TRmap@data$NAME_1)) %>%
  left_join(ank17, by = "Birth_Place")
mig_ank_map <- left_join(TRcity, mig_ank, by = "id")

# Shared pieces: colour scale (cities without a value are drawn in black) and centred titles
fill_scale <- scale_fill_distiller(name = "Number of People", palette = "Spectral", limits = c(0, 20000), na.value = "black")
centred_titles <- theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))

p_ist <- ggplot(mig_ist_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = People), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Migration to Istanbul in 2017", subtitle = paste0("Total Number of People Migrated to Istanbul: ", sum(ist17$People))) +
  fill_scale + centred_titles

p_ank <- ggplot(mig_ank_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = People), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Migration to Ankara in 2017", subtitle = paste0("Total Number of People Migrated to Ankara: ", sum(ank17$People)), caption = "Source: TUIK") +
  fill_scale + centred_titles

# Stack the two maps vertically
grid.arrange(p_ist, p_ank, nrow = 2)

As seen above, migration to Istanbul is more diversified than migration to Ankara. Ankara is mostly chosen by people from nearby cities, whereas Istanbul attracts people from all over Turkey, mostly from rural areas of the Black Sea region and Eastern Anatolia. These results support the earlier conclusion about the economic motivation of migration.

8.2- Geographical Distribution of Cumulative Migration to Istanbul and Ankara Between 2014 - 2017

The geographical distribution for the four-year period 2014-2017 for Istanbul and Ankara is shown below. The results are very similar to the 2017-only distribution, from which we can conclude that the motives have been consistent over the last four years.

# Istanbul, cumulative: sum the four years per place of birth so that
# each city has a single value on the map
istcum <- clean_data %>%
  filter(Destination == "Istanbul") %>%
  group_by(Birth_Place) %>%
  summarise(People = sum(People))
mig_ist_cum <- tibble(id = rownames(TRmap@data), Birth_Place = as.character(TRmap@data$NAME_1)) %>%
  left_join(istcum, by = "Birth_Place")
mig_ist_cum_map <- left_join(TRcity, mig_ist_cum, by = "id")

# Ankara, cumulative: same steps for Ankara
ankcum <- clean_data %>%
  filter(Destination == "Ankara") %>%
  group_by(Birth_Place) %>%
  summarise(People = sum(People))
mig_ank_cum <- tibble(id = rownames(TRmap@data), Birth_Place = as.character(TRmap@data$NAME_1)) %>%
  left_join(ankcum, by = "Birth_Place")
mig_ank_cum_map <- left_join(TRcity, mig_ank_cum, by = "id")

# Limits are kept from the 2017 maps; values outside this range are drawn with na.value
fill_scale_cum <- scale_fill_distiller(name = "Number of People", palette = "Spectral", limits = c(0, 20000), na.value = "black")
centred_titles_cum <- theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))

p_ist_cum <- ggplot(mig_ist_cum_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = People), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Migration to Istanbul between 2014 - 2017", subtitle = paste0("Total Number of People Migrated to Istanbul: ", sum(istcum$People))) +
  fill_scale_cum + centred_titles_cum

p_ank_cum <- ggplot(mig_ank_cum_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = People), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Migration to Ankara between 2014 - 2017", subtitle = paste0("Total Number of People Migrated to Ankara: ", sum(ankcum$People)), caption = "Source: TUIK") +
  fill_scale_cum + centred_titles_cum

grid.arrange(p_ist_cum, p_ank_cum, nrow = 2)

8.3- Regional Impact of Migration

Total migration inflows and outflows are hard to analyze at a glance for the 81 cities of Turkey; essentially we would need to check every combination of 81 cities taken 2 at a time. Instead, we can analyze the 7 geographical regions of Turkey, which leaves only the combinations of 7 regions taken 2 at a time.
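The size of the two comparison problems can be checked directly with choose(). Counting each direction separately (A to B and B to A) doubles the figures.

```r
n_cities  <- 81
n_regions <- 7

# Unordered pairs: combinations taken 2 at a time
city_pairs   <- choose(n_cities, 2)
region_pairs <- choose(n_regions, 2)

# Directed flows: each pair counted in both directions
city_flows   <- n_cities * (n_cities - 1)
region_flows <- n_regions * (n_regions - 1)

c(city_pairs = city_pairs, region_pairs = region_pairs,
  city_flows = city_flows, region_flows = region_flows)
```

That is 3,240 city pairs against 21 region pairs, which is why the regional view is readable in a single graph.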

library(dplyr)      # joins, group_by(), summarise()
library(tibble)     # rowid_to_column()
library(igraph)     # graph_from_data_frame()
library(tidygraph)  # as_tbl_graph()
library(ggraph)     # ggraph(), geom_edge_arc(), geom_node_*()

# Region lookup file, created manually
tmp3 <- tempfile(fileext = ".xlsx")
download.file("https://github.com/MEF-BDA503/gpj18-r_boys/blob/master/source_files/region_tr.xlsx?raw=true", mode = "wb", destfile = tmp3)
reg <- readxl::read_excel(tmp3)
# Assumption: each row of reg names one city (in both Birth_Place and Destination) with its Region
# Attach the region of the birth place
breg <- reg %>%
  select(Birth_Place, Birth_Region = Region) %>%
  inner_join(clean_data, by = "Birth_Place")
# Attach the region of the destination
dreg <- reg %>%
  select(Destination, Destination_Region = Region) %>%
  inner_join(breg, by = "Destination") %>%
  select(Year, Birth_Place, Birth_Region, Destination, Destination_Region, People)
# Migration between regions, summed over 2014-2017
regions <- dreg %>%
  select(Year, Birth_Region, Destination_Region, People) %>%
  group_by(Birth_Region, Destination_Region) %>%
  summarise(R_People = sum(People)) %>%
  ungroup()
# Graph nodes: one per region, with a numeric id
nodes <- regions %>%
  distinct(Birth_Region) %>%
  rename(label = Birth_Region) %>%
  rowid_to_column("id")
# Edge weights in thousands of people
per_region <- regions %>%
  group_by(Birth_Region, Destination_Region) %>%
  summarise(R_People = round(sum(R_People) / 1000, 0)) %>%
  arrange(desc(R_People)) %>%
  ungroup()
# Replace region names with node ids: from = birth region, to = destination region
edges <- per_region %>%
  left_join(nodes, by = c("Birth_Region" = "label")) %>%
  rename(from = id)
edges <- edges %>%
  left_join(nodes, by = c("Destination_Region" = "label")) %>%
  rename(to = id)
edges <- select(edges, from, to, R_People)
regions_igraph <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)
regions_igraph_tidy <- as_tbl_graph(regions_igraph)

ggraph(regions_igraph, layout = "linear") + 
  geom_edge_arc(aes(width = R_People), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 3), breaks = c(100,300,500,700,1000)) +
  geom_node_text(aes(label = label)) +
  geom_node_label(aes(label = label), label.size = 0.5) +
  labs(title = "Migration Between Regions for 2014-2017", edge_width = "Ths People") +
  theme_graph() + theme(legend.position = "bottom")

The graph above represents migration between the 7 geographical regions of Turkey. Flows from left to right follow the arcs below the x axis; flows from right to left follow the arcs above it.

As in the previous analysis, migration concentrates on the Marmara region. The thickest lines, below the x axis, run from the Black Sea region to Marmara and from Eastern Anatolia to Marmara. This means that the most intense migration flows went from these two regions to Marmara.

8.4- Current Breakdown of Population for Istanbul as of 2017

The first map shows the distribution of people living in Istanbul by place of birth in 2016; the second map shows the total number of people who migrated to Istanbul in 2017. Comparing the two maps suggests that residents who were not born in Istanbul bring their relatives to the city, or that migrating to Istanbul is easier for people who already have relatives there. Families from the Black Sea region and Central Anatolia in particular play an active role in this migration process.


8.5- Comparison Between Population With and Without Migration

The first map shows the current distribution of population in Turkey. The most crowded city is Istanbul, with a population of over 15 million, followed by Ankara and Izmir.

What would the distribution of population in Turkey be without migration?

The second map answers this question. The ranking of the most crowded cities (Istanbul, Ankara and Izmir) would not change, but the distribution of population across cities would. The most affected city would be Istanbul: if the migration effect is disregarded, its population would be 7.8 million.

# Population by place of birth (the "without migration" case), in thousands
org_pop <- pop_data %>%
  select(Birth_Place, People) %>%
  group_by(Birth_Place) %>%
  summarise(pop = sum(People) / 1000)
# Population by province of residence (the current case), in thousands
cur_pop <- pop_data %>%
  select(Province, People) %>%
  group_by(Province) %>%
  summarise(popp = sum(People) / 1000)
# Join both tables to the map polygons
org_pop_m <- tibble(id = rownames(TRmap@data), Birth_Place = as.character(TRmap@data$NAME_1)) %>%
  left_join(org_pop, by = "Birth_Place")
org_pop_map <- left_join(TRcity, org_pop_m, by = "id")
cur_pop_m <- tibble(id = rownames(TRmap@data), Province = as.character(TRmap@data$NAME_1)) %>%
  left_join(cur_pop, by = "Province")
cur_pop_map <- left_join(TRcity, cur_pop_m, by = "id")
# Shared colour scale and centred titles
pop_scale <- scale_fill_distiller(name = "Number of People (ths)", palette = "Spectral", limits = c(0, 15000))
pop_titles <- theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
# Plot: current population on top, population without migration below
p_cur <- ggplot(cur_pop_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = popp), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Population as of 2016", subtitle = paste0("Total Population (ths): ", round(sum(cur_pop$popp), 0))) +
  pop_scale + pop_titles
p_org <- ggplot(org_pop_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = pop), color = "grey") +
  coord_map() + theme_void() +
  labs(title = "Population without Migration", subtitle = paste0("Total Population (ths): ", round(sum(org_pop$pop), 0)), caption = "Source: TUIK") +
  pop_scale + pop_titles
grid.arrange(p_cur, p_org, nrow = 2)