Diamonds are a form of carbon in which the atoms are arranged in a cubic crystal structure. Beyond that, diamonds are regarded as a symbol of love, romance, and commitment. Because of this symbolic value, people have been willing to pay large sums for them. Several features determine the value of a diamond; the main ones are carat and clarity. In this work, I will try to estimate the price of a diamond from its features.
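# Fix the random seed so the train/test split is reproducible
set.seed(503)
library(tidyverse)
# Test set: a 20% sample drawn within each cut/color/clarity group
diamonds_test <- diamonds %>% mutate(diamond_id = row_number()) %>%
  group_by(cut, color, clarity) %>% sample_frac(0.2) %>% ungroup()
# Training set: every diamond that was not selected for the test set
diamonds_train <- anti_join(diamonds %>% mutate(diamond_id = row_number()),
                            diamonds_test, by = "diamond_id")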
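The split above can be sanity-checked with a short sketch like the one below. It repeats the same split under hypothetical names (`diamonds_id`, `test_check`, `train_check`) so that it runs on its own, and then checks that no diamond lands in both sets and that the test set holds roughly 20% of the rows. Because the sampling is done per group, the share is only approximately 0.2.

```r
library(tidyverse)

set.seed(503)

# Hypothetical names, used only for this check; the logic mirrors the split above
diamonds_id <- diamonds %>% mutate(diamond_id = row_number())

test_check <- diamonds_id %>%
  group_by(cut, color, clarity) %>%
  sample_frac(0.2) %>%
  ungroup()

train_check <- anti_join(diamonds_id, test_check, by = "diamond_id")

# Number of diamonds that appear in both sets (should be zero)
overlap <- length(intersect(train_check$diamond_id, test_check$diamond_id))

# The two sets together should account for every row of the original data
total_rows <- nrow(train_check) + nrow(test_check)

# Share of rows in the test set (approximately 0.2 because of per-group rounding)
test_share <- nrow(test_check) / nrow(diamonds)

overlap
total_rows == nrow(diamonds)
test_share
```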
As the graph below shows, diamonds cluster at specific sizes such as 1, 1.5, and 2 carats, and most of them are smaller than 2.5 carats. Between 1.5 and 2 carats, the smoothed lines for the different clarity grades cross each other, so the apparent effect of clarity on price could mislead us in that size range.
ggplot(diamonds_train, aes(x = carat, y = price, colour = clarity)) +
  geom_point(alpha = 0.2) +
  geom_smooth()
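The clustering of carat sizes seen in the plot can also be inspected numerically. The sketch below is only an illustration: it uses the full `diamonds` data set rather than `diamonds_train` so that it runs on its own, and the names `carat_counts` and `share_below_2_5` are hypothetical.

```r
library(tidyverse)

# Count how many diamonds there are at each carat value, most frequent first
carat_counts <- diamonds %>%
  count(carat, sort = TRUE)

# The ten most common carat values
head(carat_counts, 10)

# Proportion of diamonds lighter than 2.5 carats
share_below_2_5 <- mean(diamonds$carat < 2.5)
share_below_2_5
```

Replacing `diamonds` with `diamonds_train` would restrict the same check to the training data used in the plot.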
# Average price for each color grade in the training set
color <- diamonds_train %>% group_by(color) %>% summarise(avg_Price = mean(price))
color
## # A tibble: 7 x 2
## color avg_Price
## <ord> <dbl>
## 1 D 3177.579
## 2 E 3069.929
## 3 F 3745.654
## 4 G 3991.399
## 5 H 4454.128
## 6 I 5123.605
## 7 J 5335.504
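In the table above, the average price rises toward color J. One possible reading, stated here as an assumption rather than a finding of this work, is that diamonds of those colors tend to be larger. The sketch below puts average carat and average price per carat next to average price so that this can be checked. It uses the full `diamonds` data set so that it runs on its own, and `color_summary` and its column names are hypothetical.

```r
library(tidyverse)

# Average price, average size, and average price per carat for each color grade
color_summary <- diamonds %>%
  group_by(color) %>%
  summarise(
    avg_price = mean(price),
    avg_carat = mean(carat),
    avg_price_per_carat = mean(price / carat)
  )

color_summary
```

If average carat grows along with average price across the color grades, size rather than color alone may be driving the pattern, which is a reason to look at color together with carat when estimating price.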