Essentials
This course benefits from the DataCamp for the Classroom program. See details here.
- Syllabus (download)
- Introduction (download)
- Project Guidelines (html | pdf)
- Homework Tutorial (pdf)
- Progress Journals
Conclusions (Jan 10-15, 2018)
Final (Jan 6-9, 2018)
Final! Read the instructions inside carefully and submit on time (html | pdf).
Week 7 (Dec 19, 2017)
Presentations! See the presentation guidelines (html | pdf).
Week 6 (Dec 5, 2017)
This is the last week of lectures. Aside from the remaining CART part, you are going to create your first R packages. R packages are very useful for bringing a project's code together. You can also upload your package to GitHub and share your work with others (or simply reach your project from basically anywhere with an internet connection). We are going to follow these tutorials.
- Writing an R package from scratch by Hillary Parker
- R Package Primer by Karl Broman
- R Packages by Hadley Wickham
You are going to see only the essentials. I am not going to cover the aspects of crafting an R package in detail. We are going to write a simple function inside a package, write its documentation with roxygen2, and call it from the package. You can also upload it to a GitHub repo and try to call it from there (optional).
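As a minimal sketch of what such a documented function might look like: the function name `add_numbers` and its arguments are hypothetical and not taken from the tutorials. The `#'` lines are roxygen2 comments, and the commented-out devtools calls at the end assume devtools is installed and that the file lives in the `R/` folder of a package.

```r
#' Add two numbers
#'
#' A hypothetical example function used only to illustrate roxygen2 comments.
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#' @return The element-wise sum of `x` and `y`.
#' @examples
#' add_numbers(1, 2)
#' @export
add_numbers <- function(x, y) {
  # Fail early if the inputs are not numeric
  stopifnot(is.numeric(x), is.numeric(y))
  x + y
}

# From the package root, the documentation files could then be generated
# and the package installed with (assuming devtools is available):
# devtools::document()
# devtools::install()
```

Once the package is installed and loaded with `library()`, the function can be called like any other, and `?add_numbers` shows the generated help page.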
Week 5 (Nov 21, 2017)
Your R links from the first assignment are put together into a single file by Özgur Hoca. You can find it here or download the Excel file from here.
Assignments (Due Date Nov. 30)
You have 3 individual assignments. You may do all of them, but choose one to report. Add the assignment to your individual Progress Journal. If you add more than one assignment to your PJ, state the one you want to be graded. (P.S. These data sets are popular on the internet. If you draw inspiration from existing work, please cite it in a references section with links.)
- Assignment 1: Esoph and Youth Survey (html | pdf)
- Assignment 2: Spam Data (html | pdf)
- Assignment 3: Diamonds Data (html | pdf)
Lecture Notes
Please check the Machine Learning tutorials and install the necessary packages. You are advised to bring your own laptop to the next lecture. We are not going to wait for installation problems to be resolved, as some packages have many dependencies that can be problematic (meaning: it took me 3 hours to find a way to install rattle on Mac, and we don’t have that luxury during lecture hours).
Also download the data folder in data.zip (given below) for the necessary data files.
- Introduction to Machine Learning Part I (html | pdf)
- Introduction to Machine Learning Part II (html | pdf)
- Necessary data folder for ML codes (zip)
- A recap for dplyr is added to the files. (html | pdf)
Note: Users of Mac OS 10.11 (El Capitan) and above might have trouble installing the rattle package. Refer to this tutorial.
Also, if rmarkdown is giving you any trouble, update it from GitHub.
Week 4 (Nov 7, 2017)
- Shiny examples. See the tutorials. (Example 1 - Initial Example, also the default project in RStudio | Example 2 - Movies)
- Unfinished OSYM Case: The data parsers at the management made a mistake in parsing the scores table! It appears that some important programs were missing and they forgot to double-check their work. You need to repeat your analysis with the updated data, which can be found here.
Week 3 (Oct 24, 2017)
- First Data Show is ready! See Berkay’s Data Show here.
- Your first Case Study is “Welcome to University”. See the details here.
Project Examples from Last Year’s Course
Check out these projects as examples for your own. Better work is expected from you this year, though :) (MEF trivia fact: one of last year's MEF BDA program students is now your instructor.)
- Data Crunchers (html analysis | ppt presentation)
- Paranormal Distribution (html analysis | ppt presentation)
- haha (html analysis | ppt presentation)
Week 2 (Oct 10, 2017)
- Introduction to Tidyverse (html)
- Alternate Story of Big Data (html)
- secimler package installation instructions (Click)
- Further reading and self-study exercises (1 2 3)
- secimler data (for those who could not install the package) June 6 Nov 1
Extra Materials
For audiovisual learners, some webinars are available here.
dplyr
- Official dplyr tutorial
- dplyr join functions
- dplyr join functions official tutorial
- dplyr Cheat Sheet
ggplot2
RMarkdown
- Introduction to RMarkdown - Official
- R4DS Book - Communication
- DataCamp - Authoring R Markdown Reports Free Part
- RMarkdown Cheat Sheet
Shiny
RStudio
Week 1 (Sep 26, 2017)
- Introduction to R (html | pdf)
- Some base R exercises (html) (Solutions)
- For further reading, R’a Hızlı Giriş and Learn X in Y Minutes - R are recommended. You can find the links below. For visual learning, you can refer to your Udacity course and the free DataCamp courses.
Supplementary Documents
External Good Resources About R and Data Science
- Introduction to Statistical Learning
- R for Data Science
- R’a Hızlı Giriş (in Turkish)
- The Elements of Statistical Learning
- Advanced R
- Bookdown Compilation
- Akademik Bilişim 2017 - R ile Veri Analizi Dersi
- BOUN-FE 522
- Learn X in Y Minutes - R
- dplyr vignettes
- ggplot2 workshop
- RStudio Cheat Sheets (Base R, dplyr, ggplot2, RMarkdown etc.)
- R Reference Cards
- data.table Cheat Sheet
Data Sets for Prospective Projects
- YÖK
https://yokatlas.yok.gov.tr/ https://istatistik.yok.gov.tr/
- ÖSYS
http://www.osym.gov.tr/TR,6552/sureli-yayinlar.html
- EPİAŞ
https://seffaflik.epias.com.tr/transparency/
- SPK
- Merkez Bankası - CBRT
- Emeklilik Gözetim Merkezi
http://www.egm.org.tr/?pid=351
- TURKSTAT - TUIK