Who am I and what is my aim?

My name is Muharrem Cakir. I work as a Data Management Consultant at Obase. I take part in our customers’ projects in different roles, such as developer, architect, or team leader. Our customers are in various sectors such as retail, banking, airlines, and telco. In our projects, I am responsible for delivering optimal solutions to customers who have DWH, BI, or analytics projects.

My aim is to develop myself and to help our customers, especially with big data and analytics.

Tree-Based Machine Learning for Insurance Pricing

Here is a well-documented summary of how machine learning techniques can be applied in the insurance sector. Roel Henchaerts explains that the current models used to evaluate risk are GLM and GAM. The machine learning techniques that were tried are the regression tree, random forest, and gradient boosting machine (GBM).

Tree-Based Machine Learning for Insurance Pricing

Below, I have added three examples and one project about R and how we can use it in data science projects in various sectors.

Segmentation: How to perform k-means clustering in R

Customer segmentation is very important. If you do it successfully, you can manage your campaigns or build other predictive models, such as churn or propensity models, successfully. In this video, you can find out how to perform k-means clustering in R. The video uses the Iris dataset, which is popular in the analytics world. There are 5 features in this dataset (sepal length, sepal width, petal length, petal width, and class). We try to assign each plant to one of the groups in the class feature. After running the k-means method, we see that the 150 plants are grouped into 3 clusters of sizes 62, 50, and 38.

Segmentation with k-means Clustering
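
To make the idea behind k-means more concrete, here is a minimal sketch in plain Python. It is not the code from the video, which uses R and the Iris dataset. The toy points, the function name kmeans, and the choice of the first k points as starting centroids are all my own assumptions for illustration.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means (Lloyd's algorithm).

    Assumption for this sketch: the first k points are used as the
    starting centroids so that the result is deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: give each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Hypothetical toy data: three loose groups of 2D points (not the Iris data).
points = [
    (1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
    (1.2, 0.8), (0.8, 1.1),
    (5.1, 4.9), (4.9, 5.2),
    (9.2, 1.1), (8.9, 0.9),
]

assignments, centroids = kmeans(points, k=3)
sizes = [assignments.count(c) for c in range(3)]
print("cluster sizes:", sizes)
```

The printed cluster sizes play the same role as the sizes reported for the Iris clustering above. In a real project the starting centroids are usually chosen randomly and the algorithm is run several times, so results can differ between runs.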

Customer churn analysis

Since acquiring a new customer is considered to be 10 times more expensive than keeping an existing customer, predicting who will leave is very important. We try to predict this with a churn model. An example of churn modeling in R is below:

Data Science Demo - Customer Churn Analysis

Data manipulation

Data manipulation is the most important step in data science projects. In many projects, the data manipulation step is observed to take 60-65% of the total project time. It is also critical: if you don’t do the right things in this step, your model’s accuracy might be low.

Here is a good source about data manipulation:

Hands-on dplyr tutorial for faster data manipulation in R

Classifications in R: Response Modeling/Credit Scoring/Credit Rating using Machine Learning Techniques

A project authored by Ariful Mondal in 2016 is about response modeling. I am interested in this kind of project. Mr. Mondal explains step by step how a response model is built. He uses some R functions to understand and manipulate the data. This document is very useful for anyone who wants to become a data scientist.

Classifications in R: Response Modeling/Credit Scoring/Credit Rating using Machine Learning Techniques