First, I set my working directory.
setwd('C:/Users/Duygu/Documents/GitHub/pj-cand/files')
Then I read the CSV file:
data <- read.csv('SchoolLifeExpectancy.csv')
The dimensions of my data are:
dim(data)
## [1] 1827 7
Let’s have a quick look at the data:
head(data, 5)
##   Country.or.Area Subgroup Year                      Source  Unit Value
## 1     Afghanistan   Female 2004 UNESCO_UIS Database_Sep2007 Years     4
## 2     Afghanistan   Female 2003 UNESCO_UIS Database_Sep2007 Years     4
## 3     Afghanistan     Male 2004 UNESCO_UIS Database_Sep2007 Years     9
## 4     Afghanistan     Male 2003 UNESCO_UIS Database_Sep2007 Years     8
## 5         Albania   Female 2004 UNESCO_UIS Database_Sep2007 Years    12
##   Value.Footnotes
## 1               1
## 2               1
## 3               1
## 4               1
## 5               1
And the file ends like this:
tail(data, 5)
##      Country.or.Area             Subgroup Year                      Source
## 1823        Zimbabwe                 Male 2001 UNESCO_UIS Database_Sep2007
## 1824        Zimbabwe                 Male 2000 UNESCO_UIS Database_Sep2007
## 1825         fnSeqID             Footnote   NA
## 1826               1      UIS estimation.   NA
## 1827               2 National Estimation.   NA
##       Unit Value Value.Footnotes
## 1823 Years    10               1
## 1824 Years    10               1
## 1825          NA              NA
## 1826          NA              NA
## 1827          NA              NA
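Rows 1825 to 1827 in the output above hold the footnote legend (`fnSeqID`, `Footnote` and the two footnote texts) rather than observations. A minimal sketch of one way to drop such rows, assuming that every real observation has `Subgroup` equal to `Female` or `Male`. The data frame `toy` is a hypothetical stand-in built from the values printed above, not the real file:

```r
# Toy data frame mimicking the tail of the file (values copied from the output above).
toy <- data.frame(
  Country.or.Area = c("Zimbabwe", "Zimbabwe", "fnSeqID", "1", "2"),
  Subgroup = c("Male", "Male", "Footnote", "UIS estimation.", "National Estimation."),
  Year = c(2001L, 2000L, NA, NA, NA),
  Value = c(10L, 10L, NA, NA, NA),
  stringsAsFactors = TRUE
)

# Keep only rows with a real subgroup (assumption: Female or Male),
# then drop the factor levels that are no longer used.
clean <- droplevels(toy[toy$Subgroup %in% c("Female", "Male"), ])

nrow(clean)
levels(clean$Subgroup)
```

Calling `droplevels()` after the filter matters here because the columns are factors: without it, the footnote strings would remain as empty levels.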
Here is the structure of the data:
str(data)
## 'data.frame':    1827 obs. of  7 variables:
##  $ Country.or.Area: Factor w/ 187 levels "1","2","Afghanistan",..: 3 3 3 3 4 4 4 4 4 4 ...
##  $ Subgroup       : Factor w/ 5 levels "Female","Footnote",..: 1 1 3 3 1 1 1 1 1 1 ...
##  $ Year           : int  2004 2003 2004 2003 2004 2003 2002 2001 2000 1999 ...
##  $ Source         : Factor w/ 2 levels "","UNESCO_UIS Database_Sep2007": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Unit           : Factor w/ 2 levels "","Years": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Value          : int  4 4 9 8 12 11 11 11 11 11 ...
##  $ Value.Footnotes: int  1 1 1 1 1 NA 1 1 1 1 ...
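The structure shows that the text columns were read as factors, with stray levels such as "1", "2" and "Footnote" mixed in among the country and subgroup names. A minimal sketch of reading text columns as plain character vectors instead, using `stringsAsFactors = FALSE` (already the default in R 4.0.0 and later). The inline `csv_text` is a hypothetical three-row stand-in for the real file:

```r
# Hypothetical three-row CSV standing in for SchoolLifeExpectancy.csv.
csv_text <- "Country.or.Area,Subgroup,Year,Value
Afghanistan,Female,2004,4
Afghanistan,Male,2004,9
fnSeqID,Footnote,,"

# Read text columns as character vectors rather than factors.
toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

str(toy)
```

With character columns, filtering out unwanted rows later does not leave unused factor levels behind.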
Here is a summary of the data:
summary(data)
##    Country.or.Area                 Subgroup        Year
##  Aruba     :  14   Female              :912   Min.   :1999
##  Australia :  14   Footnote            :  1   1st Qu.:2000
##  Austria   :  14   Male                :912   Median :2002
##  Azerbaijan:  14   National Estimation.:  1   Mean   :2002
##  Bahamas   :  14   UIS estimation.     :  1   3rd Qu.:2004
##  Belarus   :  14                              Max.   :2005
##  (Other)   :1743                              NA's   :3
##                          Source        Unit          Value
##                             :   3        :   3   Min.   : 2.0
##  UNESCO_UIS Database_Sep2007:1824   Years:1824   1st Qu.:11.0
##                                                  Median :12.5
##                                                  Mean   :12.3
##                                                  3rd Qu.:14.0
##                                                  Max.   :21.0
##                                                  NA's   :3
##  Value.Footnotes
##  Min.   :1.000
##  1st Qu.:1.000
##  Median :1.000
##  Mean   :1.068
##  3rd Qu.:1.000
##  Max.   :2.000
##  NA's   :503
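The summary reports 3 missing values in `Year` and `Value`, and 503 in `Value.Footnotes`. A minimal sketch of getting such counts directly, one number per column, with `colSums(is.na(...))`. The data frame `toy` is a hypothetical example, not the real data:

```r
# Toy data frame with a few missing values (hypothetical values).
toy <- data.frame(
  Year = c(2004L, 2003L, NA),
  Value = c(4L, 4L, NA),
  Value.Footnotes = c(1L, NA, NA)
)

# is.na() gives a logical matrix; summing each column counts the NAs.
na_counts <- colSums(is.na(toy))
print(na_counts)
```

The same call on the full `data` object should reproduce the `NA's` lines of the summary for the numeric columns.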
That’s all for now.