Diamonds

Diamond is a form of carbon in which the atoms are arranged in a cubic crystal structure. Beyond that, diamonds are regarded as a symbol of love, romance and commitment. Because of these sentimental associations, people have long been willing to pay a lot of money for diamonds. Several features determine the value of a diamond; the main ones are carat and clarity. In this work, I will try to estimate the price of a diamond with specified features.

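# Fix the random seed so the split is reproducible
set.seed(503)
library(tidyverse)

# Test set: a stratified 20% sample within each cut/color/clarity group
diamonds_test <- diamonds %>% mutate(diamond_id = row_number()) %>%
    group_by(cut, color, clarity) %>% sample_frac(0.2) %>% ungroup()

# Training set: every diamond that was not drawn into the test set
diamonds_train <- anti_join(diamonds %>% mutate(diamond_id = row_number()),
    diamonds_test, by = "diamond_id")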
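
Before going further, it can help to confirm that the split behaves as intended. The sketch below rebuilds the split the same way so that it runs on its own, then checks that the two sets do not overlap and that the test set holds roughly 20% of the rows. The names `diamonds_id` and `split_check` are introduced here for illustration only and are not part of the original analysis.

```r
library(tidyverse)

set.seed(503)

# Rebuild the split as above so this block is self-contained.
# diamonds_id is a helper name assumed for this sketch.
diamonds_id <- diamonds %>% mutate(diamond_id = row_number())

diamonds_test <- diamonds_id %>%
  group_by(cut, color, clarity) %>%
  sample_frac(0.2) %>%
  ungroup()

diamonds_train <- anti_join(diamonds_id, diamonds_test, by = "diamond_id")

# The two sets should be disjoint and together cover every row.
split_check <- tibble(
  n_total    = nrow(diamonds),
  n_train    = nrow(diamonds_train),
  n_test     = nrow(diamonds_test),
  n_overlap  = length(intersect(diamonds_train$diamond_id,
                                diamonds_test$diamond_id)),
  test_share = n_test / n_total
)
split_check
```

Because the sample is drawn within each cut/color/clarity group and group sizes are rounded, the test share will be close to, but not necessarily exactly, 0.2.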

Analysis of Data

As the graph shows, diamonds are mostly sold at specific sizes such as 1, 1.5 and 2 carats, and most of them are smaller than 2.5 carats. Between 1.5 and 2 carats the smoothed lines for the different clarity grades cross each other, so the apparent effect of clarity on price could mislead us in that size range.

ggplot(diamonds_train, aes(x=carat, y=price, colour=clarity))+
  geom_point(alpha=0.2)+
  geom_smooth()
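
The clustering at particular carat sizes and the 2.5 carat cutoff described above can also be checked numerically rather than read off the plot. The following is a minimal sketch, not part of the original analysis; `carat_counts` and `share_below_2_5` are names assumed here, and the full `diamonds` data set from ggplot2 is used so that the block runs on its own.

```r
library(tidyverse)

# Uses the full ggplot2 diamonds data so the block is self-contained;
# swap in diamonds_train to match the split used in this report.
carat_counts <- diamonds %>%
  count(carat, sort = TRUE)

# Most frequent carat values, to see where sizes cluster
head(carat_counts, 10)

# Share of diamonds smaller than 2.5 carats
share_below_2_5 <- mean(diamonds$carat < 2.5)
share_below_2_5
```

Comparing the count at a round size with the counts just below it (for example 1.00 against 0.99) is one simple way to judge how strongly sizes cluster.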

color<-diamonds_train%>% group_by(color)%>% summarise(avg_Price=mean(price))
color
## # A tibble: 7 x 2
##   color avg_Price
##   <ord>     <dbl>
## 1     D  3177.579
## 2     E  3069.929
## 3     F  3745.654
## 4     G  3991.399
## 5     H  4454.128
## 6     I  5123.605
## 7     J  5335.504