This is a good EDA example. The script compares US-wide housing costs as a percentage of household income and examines potential relationships between the housing budget and other factors. The visualizations include polygon plots, bar charts, and percentages shown on state maps. They cover the housing cost percentage, median housing costs as a percentage of income by family type, median rent as a percentage of household income, and median home-ownership costs as a percentage of household income.
The San Francisco crime dataset is analyzed. The dataset contains a wide range of crimes, and the analyst visualized all of them. It is interesting because the analyst found crime patterns on the city map and identified the times of day at which crimes occur.
The case is about a company that wants to understand the reasons behind employee churn. In addition, the company wants to forecast future employee resignations. The analyst found that the valuable employees who leave are not satisfied, work on many projects, spend many hours at the company each month, and are not promoted. The predictive power of the different models is very similar and seems robust. The case is interesting because it uses cross-validation, tree learning, naive Bayes, and logistic regression in the same analysis.
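To make the cross-validation idea concrete, here is a minimal sketch in plain Python. It is not the analyst's code: the data are synthetic (a hypothetical satisfaction score per employee, with label 1 meaning the employee left), and the two models are simple stand-ins, a majority-class baseline and a one-split decision stump, rather than the tree, naive Bayes, and logistic regression models used in the case. All function names are illustrative assumptions.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_majority(train):
    """Baseline: always predict the most common label in the training set."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def fit_stump(train):
    """One-split tree: pick the threshold and direction with the fewest training errors."""
    best = None
    for t in sorted({x for x, _ in train}):
        for above in (0, 1):
            errors = sum((above if x >= t else 1 - above) != y for x, y in train)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x >= t else 1 - above

def cross_validate(data, fit, k=5):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [row for i, row in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        model = fit(train)
        scores.append(sum(model(x) == y for x, y in test) / len(test))
    return sum(scores) / len(scores)

# Synthetic, hypothetical data: (satisfaction score, left the company?)
data = [(i / 20, 1 if i / 20 < 0.4 else 0) for i in range(20)]

baseline_acc = cross_validate(data, fit_majority)
stump_acc = cross_validate(data, fit_stump)
print("baseline:", baseline_acc)
print("stump:   ", stump_acc)
```

Because every model is scored only on rows it did not see during fitting, the averaged accuracies can be compared fairly, which is the sense in which the case calls similar scores across models robust.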
The kernel empirically checks whether a US citizen or resident should pursue higher levels of education, that is, whether doing so is worth it. It also looks for a correlation or pattern between higher income and higher degree level. The analyst concludes that pursuing education is worthwhile, including a PhD, since it pays more than the other degree levels. It is important to note that states with a high unemployment rate are not the smartest choice at any degree level, so the state can be considered a limiting factor for employment.
The playlist covers R programming and its application in finance. Its 15 videos span a wide range of topics, from getting financial data to modeling and time series analysis.
This is an extra example for this homework. From its website: "The quantmod package for R is designed to assist the quantitative trader in the development, testing, and deployment of statistically based trading models." quantmod is mostly used as a quantitative financial modelling and trading framework in R.
This is another extra example for this homework. I chose this document because I work in a financial institution. The page contains a list of packages useful for empirical work in finance, grouped by topic.