Explanatory Data Analysis of Trade Statistics of Turkey

Abstract

We worked on trade statistics data of Turkey against the G7 countries and 3 more countries (Russia, India, China) both for export and import.Our aim is to analyze Turkey’s trade characteristics by product groups. Within this aim, we take trade values and quantities by product groups in MS Excel file format from trademap.org website for each trade partner country for the period of 2013 - 2017. In this study, We applied data manipulation to make it proper for analysis and explanatory data analysis with visulations and conclude with overall statistical overview of Turkey’s trade.

Explanation of Raw Data

Following the extraction from source (Trademap.org), our raw data consist of 10 countries’ trade data against Turkey. This data splits to 2 with trade direction as “Export” and “Import”. The other dimension is stat for trade, one is “Quantity” of Export/Import, the other is “Value” in billion USD of Export/Import. Therefore we have four main data frame for each country (total of 40 excel files).

Data Frame DF Definition
ExportValues Export from Turkey to target countries and their values in billion USD
ExportQuantity Export from Turkey to target countries and their quantities in units(tons,gallons etc)
ImportValues Import to Turkey from target counties and their values in billion USD
ImportQuantity Import to Turkey from target counties and their quantities in units(tons,gallons etc)

Table structures for “Quantity” and “Value” files are as follows:

Country Product code Product label Quantity in 2013 Unit Quantity in 2014 Unit Quantity in 2015 Unit Quantity in 2016 Unit Quantity in 2017 Unit
France ’0306 Crustaceans, whether in shell or not, live, fresh, chilled, frozen, dried, salted or in brine, … 121 Tons 139 Tons 22 Tons 24 Tons

Table 1: Quantity excel table structure

Country Product code Product label Value in 2013 Value in 2014 Value in 2015 Value in 2016 Value in 2017
France ’0306 Crustaceans, whether in shell or not, live, fresh, chilled, frozen, dried, salted or in brine, … 0 1155 1182 270 0

Table 2: Values excel table structure

In addition to main raw data, we also have look-up table for product codes; This lookup table registers product labels in 2 digit code that is less categorized than main data.

Product code Product label
’01 Live animals
’02 Meat and edible meat offal
’03 Fish and crustaceans, molluscs and other aquatic invertebrates
’04 Dairy produce; birds’ eggs; natural honey; edible products of animal origin, not elsewhere …

Table 3: Product look-up excel table

Regarding the variables included in these tables; definitions are as follows:

ExportQuantity & ImportQuantity
Country Trade Partner of Turkey
Product_Code Traded Product Code in 4 digit
Product_Label Product Definition
Unit Metric of Quantity
Quantity_2013 Export or Import quantity in 2013
Quantity_2014 Export or Import quantity in 2014
Quantity_2015 Export or Import quantity in 2015
Quantity_2016 Export or Import quantity in 2016
Quantity_2017 Export or Import quantity in 2017
ExportValues & ImportValues
Country Trade Partner of Turkey
Product_Code Traded Product Code in 4 digit
Product_Label Product Definition
Value_2013 Export or Import value in 2013
Value_2014 Export or Import value in 2014
Value_2015 Export or Import value in 2015
Value_2016 Export or Import value in 2016
Value_2017 Export or Import value in 2017

Data Preparation, cleaning and Manipulation

In order to read these excel files, firstly we create a function named “getDataSet”:

PATH <-"C:/Users/emrek/Google Drive/BDA/503-EssentialsOfDataAnalytics/GitHub/gpj18-group_four/Raw Data"
setwd(PATH)

getDataSet <- function(pPattern){
  nm <- list.files(path = PATH, pattern = pPattern)  
  if (exists("resultDataset")){
    rm("resultDataset")
  }
  for (file in nm){
    if (!exists("resultDataset")){
      resultDataset <- read_excel(file, col_names=FALSE, skip=1)
    }
    else{
      temp_dataset <-read_excel(file, col_names=FALSE, skip=1)
      resultDataset<-rbind(resultDataset, temp_dataset)
      rm(temp_dataset)
    }
    
  }
  return(resultDataset)
  
}


import_quantity_dataset <- getDataSet('*.mport.*uant.*')
import_value_dataset <- getDataSet('*.mport.*alu.*')
export_quantity_dataset <- getDataSet('*.xport.*uant.*')
export_value_dataset <- getDataSet('*.xport.*alu.*')

colnames(import_quantity_dataset) <- c("Country","Product_Code","Product_Label","Quantity_2013","Unit1","Quantity_2014","Unit2","Quantity_2015","Unit3","Quantity_2016","Unit4","Quantity_2017","Unit5")
colnames(import_value_dataset) <- c("Country","Product_Code","Product_Label","Value_2013","Value_2014","Value_2015","Value_2016","Value_2017")
colnames(export_quantity_dataset) <- c("Country","Product_Code","Product_Label","Quantity_2013","Unit1","Quantity_2014","Unit2","Quantity_2015","Unit3","Quantity_2016","Unit4","Quantity_2017","Unit5")
colnames(export_value_dataset) <- c("Country","Product_Code","Product_Label","Value_2013","Value_2014","Value_2015","Value_2016","Value_2017")

import_quantity_dataset$Product_Code<- gsub("'", "", import_quantity_dataset$Product_Code)
import_value_dataset$Product_Code<- gsub("'", "",import_value_dataset$Product_Code)
export_quantity_dataset$Product_Code<- gsub("'", "",export_quantity_dataset$Product_Code)
export_value_dataset$Product_Code<- gsub("'", "",export_value_dataset$Product_Code)

And by using the getDataSet function, we merged the Turkey’s export-quantity, export-value, import-quantity, import-value tables for 10 selected countries.

glimpse(export_quantity_dataset)
## Observations: 12,223
## Variables: 13
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "TOTAL", "7214", "6907", "2523", "7208", "6802",...
## $ Product_Label <chr> "All products", "Bars and rods, of iron or non-a...
## $ Quantity_2013 <dbl> NA, 131711, 1795, 22000, 220, 46950, 19782, 2821...
## $ Unit1         <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
## $ Quantity_2014 <dbl> NA, 133195, 2560, 54210, 157114, 49668, 38396, 1...
## $ Unit2         <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
## $ Quantity_2015 <dbl> NA, 1085, 2291, 78195, 10756, 38561, 29677, 4219...
## $ Unit3         <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
## $ Quantity_2016 <dbl> NA, 1759, 1935, 62500, 38354, 33166, 19759, 2239...
## $ Unit4         <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
## $ Quantity_2017 <dbl> 0, 220831, 140214, 129930, 80459, 35413, 34646, ...
## $ Unit5         <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
glimpse(export_value_dataset)
## Observations: 12,222
## Variables: 8
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "7214", "8703", "8901", "0802", "8902", "7208", ...
## $ Product_Label <chr> "Bars and rods, of iron or non-alloy steel, not ...
## $ Value_2013    <dbl> 73607, 13976, 0, 57640, 0, 211, 895, 40420, 1052...
## $ Value_2014    <dbl> 73814, 2485, 0, 65460, 0, 96558, 1227, 46362, 11...
## $ Value_2015    <dbl> 616, 2477, 62, 105122, 0, 5033, 1052, 37154, 129...
## $ Value_2016    <dbl> 863, 94, 66747, 94623, 0, 18777, 832, 30918, 171...
## $ Value_2017    <dbl> 106068, 87274, 84922, 75731, 45504, 44261, 40668...
glimpse(import_quantity_dataset)
## Observations: 12,222
## Variables: 13
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "2601", "7204", "2701", "4401", "0713", "1001", ...
## $ Product_Label <chr> "Iron ores and concentrates, incl. roasted iron ...
## $ Quantity_2013 <dbl> 465332, 373154, 338707, 313714, 194731, 230375, ...
## $ Unit1         <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity_2014 <dbl> 162004, 277774, 491793, 126757, 298731, 89640, N...
## $ Unit2         <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity_2015 <dbl> 494577, 317117, 504464, 297655, 304681, 161211, ...
## $ Unit3         <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity_2016 <dbl> 504892, 372166, 1344338, 308932, 322942, 196538,...
## $ Unit4         <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity_2017 <dbl> 675057, 526312, 485643, 241778, 235982, 145438, ...
## $ Unit5         <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
glimpse(import_value_dataset)
## Observations: 12,223
## Variables: 8
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "7108", "0713", "7204", "2601", "2701", "2710", ...
## $ Product_Label <chr> "Gold, incl. gold plated with platinum, unwrough...
## $ Value_2013    <dbl> 230689, 132022, 147798, 77901, 62688, 6565, 1065...
## $ Value_2014    <dbl> 8259, 204972, 104914, 25982, 70544, 6244, 169636...
## $ Value_2015    <dbl> 760, 232255, 77607, 43341, 55067, 6183, 47389, 2...
## $ Value_2016    <dbl> 48488, 265707, 84380, 38810, 135545, 7990, 24360...
## $ Value_2017    <dbl> 1030428, 164565, 157861, 78121, 73275, 70084, 63...

We have 13 and 8 variables in quantity and value tables respectively both for export and import.

Now, it is time to join quantity and value data for both export and import files. While joining the tables, we should also clear the repetitive variable “Unit” for each year.

Also, we need to join product labels in 2 digit product codes to 4 digit product codes and labels.

#read the ProductCodes excel
file = "C:/Users/emrek/Google Drive/BDA/503-EssentialsOfDataAnalytics/GitHub/gpj18-group_four/2digitproductlist.xlsx"
prdCode <- read_excel(file, col_names=FALSE, skip=1)
prdCode$X__1 <- gsub("'", "", prdCode$X__1)

# join the import values and quantities
import_jTable <- import_quantity_dataset %>% 
  mutate( Unit= coalesce(Unit1,Unit2 ,Unit3 , Unit4,Unit5)) %>%
  select( Country, Product_Code , Unit, Quantity_2013, Quantity_2014, Quantity_2015, Quantity_2016, Quantity_2017) %>% 
  full_join(import_value_dataset,by = c("Country","Product_Code"))  
# seperate the first 2 digits of 4 digit product code and join the 2digit product code excel 
import_jTable <- import_jTable %>% mutate(PcodeTwo=substr(Product_Code,1,2))
import_jTable <- left_join(import_jTable,prdCode,by =c("PcodeTwo"="X__1") )
  
# join the export values and quantities
export_jTable <- export_quantity_dataset %>% 
  mutate( Unit= coalesce(Unit1,Unit2 ,Unit3 , Unit4,Unit5)) %>%
  select( Country, Product_Code , Unit, Quantity_2013, Quantity_2014,Quantity_2015,Quantity_2016,Quantity_2017) %>% 
  full_join(export_value_dataset,by = c("Country","Product_Code"))  
# seperate the first 2 digits of 4 digit product code and join the 2digit product code excel
export_jTable <- export_jTable %>% mutate(PcodeTwo=substr(Product_Code,1,2))
export_jTable <- left_join(export_jTable,prdCode,by =c("PcodeTwo"="X__1") )

Now we have joined tables for import and export stats of Turkey.

glimpse(import_jTable)
## Observations: 12,223
## Variables: 16
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "2601", "7204", "2701", "4401", "0713", "1001", ...
## $ Unit          <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity_2013 <dbl> 465332, 373154, 338707, 313714, 194731, 230375, ...
## $ Quantity_2014 <dbl> 162004, 277774, 491793, 126757, 298731, 89640, N...
## $ Quantity_2015 <dbl> 494577, 317117, 504464, 297655, 304681, 161211, ...
## $ Quantity_2016 <dbl> 504892, 372166, 1344338, 308932, 322942, 196538,...
## $ Quantity_2017 <dbl> 675057, 526312, 485643, 241778, 235982, 145438, ...
## $ Product_Label <chr> "Iron ores and concentrates, incl. roasted iron ...
## $ Value_2013    <dbl> 77901, 147798, 62688, 33992, 132022, 93218, 0, 6...
## $ Value_2014    <dbl> 25982, 104914, 70544, 15532, 204972, 31688, 0, 6...
## $ Value_2015    <dbl> 43341, 77607, 55067, 32940, 232255, 47770, 0, 61...
## $ Value_2016    <dbl> 38810, 84380, 135545, 31159, 265707, 54832, 0, 7...
## $ Value_2017    <dbl> 78121, 157861, 73275, 23684, 164565, 34618, 2364...
## $ PcodeTwo      <chr> "26", "72", "27", "44", "07", "10", "27", "27", ...
## $ X__2          <chr> "Ores, slag and ash", "Iron and steel", "Mineral...
glimpse(export_jTable)
## Observations: 12,223
## Variables: 16
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ Product_Code  <chr> "TOTAL", "7214", "6907", "2523", "7208", "6802",...
## $ Unit          <chr> "No quantity", "Tons", "Tons", "Tons", "Tons", "...
## $ Quantity_2013 <dbl> NA, 131711, 1795, 22000, 220, 46950, 19782, 2821...
## $ Quantity_2014 <dbl> NA, 133195, 2560, 54210, 157114, 49668, 38396, 1...
## $ Quantity_2015 <dbl> NA, 1085, 2291, 78195, 10756, 38561, 29677, 4219...
## $ Quantity_2016 <dbl> NA, 1759, 1935, 62500, 38354, 33166, 19759, 2239...
## $ Quantity_2017 <dbl> 0, 220831, 140214, 129930, 80459, 35413, 34646, ...
## $ Product_Label <chr> NA, "Bars and rods, of iron or non-alloy steel, ...
## $ Value_2013    <dbl> NA, 73607, 895, 1531, 211, 40420, 16625, 37956, ...
## $ Value_2014    <dbl> NA, 73814, 1227, 3961, 96558, 46362, 32160, 1484...
## $ Value_2015    <dbl> NA, 616, 1052, 5810, 5033, 37154, 23569, 6140, 0...
## $ Value_2016    <dbl> NA, 863, 832, 4590, 18777, 30918, 12319, 24849, ...
## $ Value_2017    <dbl> NA, 106068, 40668, 8801, 44261, 30766, 27136, 27...
## $ PcodeTwo      <chr> "TO", "72", "69", "25", "72", "68", "73", "73", ...
## $ X__2          <chr> NA, "Iron and steel", "Ceramic products", "Salt;...

Both tables now have 16 variables and 12,223 rows.

But there is a row in Product_Code column as “TOTAL”, we must clear it.

export_jTable <- export_jTable %>% filter(!grepl("TOTAL", Product_Code))
import_jTable <- import_jTable %>% filter(!grepl("TOTAL", Product_Code))

We have 16 variables in these tables however quantity and value variables repeat for each year. So actually, it is better to organize the table that it will have variables “Year”, “Quantity”, “Value” instead of 10 columns.

#create new objects for each years' quantity
q2013<-import_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2013) %>% mutate(year="2013") %>% rename("Quantity"=Quantity_2013)
q2014<-import_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2014) %>% mutate(year="2014") %>% rename("Quantity"=Quantity_2014)
q2015<-import_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2015) %>% mutate(year="2015") %>% rename("Quantity"=Quantity_2015)
q2016<-import_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2016) %>% mutate(year="2016") %>% rename("Quantity"=Quantity_2016) 
q2017<-import_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2017) %>% mutate(year="2017") %>% rename("Quantity"=Quantity_2017)

#bind the objects
importQuant <- bind_rows(q2013,q2014,q2015,q2016,q2017)

#create new objects for each years' values
v2013<-import_jTable %>% select(Country,Product_Code, Value_2013) %>% rename("Values"=Value_2013) %>% mutate(year="2013")
v2014<-import_jTable %>% select(Country,Product_Code, Value_2014) %>% rename("Values"=Value_2014) %>% mutate(year="2014")
v2015<-import_jTable %>% select(Country,Product_Code, Value_2015) %>% rename("Values"=Value_2015) %>% mutate(year="2015")
v2016<-import_jTable %>% select(Country,Product_Code, Value_2016) %>% rename("Values"=Value_2016) %>% mutate(year="2016") 
v2017<-import_jTable %>% select(Country,Product_Code, Value_2017) %>% rename("Values"=Value_2017) %>% mutate(year="2017")

#bind the objects
importValues <- bind_rows(v2013,v2014,v2015,v2016,v2017)

#join binded quantity and value
import_jTableMELTED <- importQuant %>% left_join(importValues, by=c("Country","Product_Code","year"))


#Repeat with export table

q2013<-export_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2013) %>% mutate(year="2013") %>% rename("Quantity"=Quantity_2013)
q2014<-export_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2014) %>% mutate(year="2014") %>% rename("Quantity"=Quantity_2014)
q2015<-export_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2015) %>% mutate(year="2015") %>% rename("Quantity"=Quantity_2015)
q2016<-export_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2016) %>% mutate(year="2016") %>% rename("Quantity"=Quantity_2016) 
q2017<-export_jTable %>% select(Country, PcodeTwo, X__2 ,Product_Code, Product_Label, Unit, Quantity_2017) %>% mutate(year="2017") %>% rename("Quantity"=Quantity_2017)

exportQuant <- bind_rows(q2013,q2014,q2015,q2016,q2017)

v2013<-export_jTable %>% select(Country,Product_Code, Value_2013) %>% rename("Values"=Value_2013) %>% mutate(year="2013")
v2014<-export_jTable %>% select(Country,Product_Code, Value_2014) %>% rename("Values"=Value_2014) %>% mutate(year="2014")
v2015<-export_jTable %>% select(Country,Product_Code, Value_2015) %>% rename("Values"=Value_2015) %>% mutate(year="2015")
v2016<-export_jTable %>% select(Country,Product_Code, Value_2016) %>% rename("Values"=Value_2016) %>% mutate(year="2016") 
v2017<-export_jTable %>% select(Country,Product_Code, Value_2017) %>% rename("Values"=Value_2017) %>% mutate(year="2017")

exportValues <- bind_rows(v2013,v2014,v2015,v2016,v2017)

export_jTableMELTED <- exportQuant %>% left_join(exportValues, by=c("Country","Product_Code","year"))

Now, we have more useful tables, with 9 variables and 61100 rows.

glimpse(export_jTableMELTED)
## Observations: 61,100
## Variables: 9
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ PcodeTwo      <chr> "72", "69", "25", "72", "68", "73", "73", "26", ...
## $ X__2          <chr> "Iron and steel", "Ceramic products", "Salt; sul...
## $ Product_Code  <chr> "7214", "6907", "2523", "7208", "6802", "7306", ...
## $ Product_Label <chr> "Bars and rods, of iron or non-alloy steel, not ...
## $ Unit          <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity      <dbl> 131711, 1795, 22000, 220, 46950, 19782, 28212, N...
## $ year          <chr> "2013", "2013", "2013", "2013", "2013", "2013", ...
## $ Values        <dbl> 73607, 895, 1531, 211, 40420, 16625, 37956, 0, 0...
glimpse(import_jTableMELTED)
## Observations: 61,100
## Variables: 9
## $ Country       <chr> "Canada", "Canada", "Canada", "Canada", "Canada"...
## $ PcodeTwo      <chr> "26", "72", "27", "44", "07", "10", "27", "27", ...
## $ X__2          <chr> "Ores, slag and ash", "Iron and steel", "Mineral...
## $ Product_Code  <chr> "2601", "7204", "2701", "4401", "0713", "1001", ...
## $ Product_Label <chr> "Iron ores and concentrates, incl. roasted iron ...
## $ Unit          <chr> "Tons", "Tons", "Tons", "Tons", "Tons", "Tons", ...
## $ Quantity      <dbl> 465332, 373154, 338707, 313714, 194731, 230375, ...
## $ year          <chr> "2013", "2013", "2013", "2013", "2013", "2013", ...
## $ Values        <dbl> 77901, 147798, 62688, 33992, 132022, 93218, 0, 6...