Module # 10 assignment

Review the reading resources and post on your blog a new entry with your work with ggplot2 and time series (try yourself) and discuss the input of visualization on time series analysis.



The dataset I will be using is titled "Video Game Sales": https://www.kaggle.com/datasets/anandshaw2001/video-game-sales




The first plot I created shows total video game sales over the years. Using ggplot, I summed global sales for each year and plotted the totals as a line covering roughly forty years.


library(ggplot2)
# Year may be stored as text or a factor, so go through character before numeric
ggplot(videoGameSales, aes(x = as.numeric(as.character(Year)), y = Global_Sales)) +
  geom_line(stat = "summary", fun = sum) +
  labs(title = "Global Video Game Sales Over the Years", x = "Year", y = "Total Sales (millions)")
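
To make the yearly summing explicit, here is a minimal sketch that computes the totals first with aggregate and then plots them. It is not the code behind the plot in this post. The small data frame is a made-up stand-in so the example runs on its own, and the name yearlyTotals is my own; with the real data you would skip the toy data frame and use videoGameSales as loaded from the CSV.

```r
library(ggplot2)

# Toy stand-in for the Kaggle data; these values are made up for illustration.
videoGameSales <- data.frame(
  Year         = c("2005", "2005", "2006", "N/A"),
  Global_Sales = c(1.5, 2.5, 4.0, 0.3),
  stringsAsFactors = FALSE
)

# Convert Year to numeric; entries that are not numbers (such as "N/A") become NA.
videoGameSales$Year <- suppressWarnings(as.numeric(videoGameSales$Year))

# Sum Global_Sales within each year. The formula interface drops rows with a missing Year.
yearlyTotals <- aggregate(Global_Sales ~ Year, data = videoGameSales, FUN = sum)

# Plot the pre-computed totals as a line.
ggplot(yearlyTotals, aes(x = Year, y = Global_Sales)) +
  geom_line() +
  labs(title = "Global Video Game Sales Over the Years",
       x = "Year", y = "Total Sales (millions)")
```

Computing the totals up front also gives a small table that can be inspected before plotting, which helps when checking for odd years or missing values.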





The next plot was created with base R graphics. This one was a little trickier without ggplot because it took more steps, including a call to the aggregate function. I divided the code into sections to make it easy to follow. First, Year was converted to numeric. Next, the regional sales were aggregated by year. The result was then converted to a matrix, column names were assigned, and the matrix was passed to the barplot function. Doing the same thing with ggplot would have taken fewer steps.

The graph shows that NA (North America) dominated sales in certain years, with EU (Europe) trailing close behind and sometimes neck and neck. JP (Japan) sometimes came close to EU during the late 1990s, and Other regions roughly matched JP in the mid-2000s.
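
The steps described above can be sketched roughly as follows. This is a hedged reconstruction, not my exact code. I am assuming the regional columns are named NA_Sales, EU_Sales, JP_Sales and Other_Sales, the grouped layout (beside = TRUE) is a guess, and the small data frame is a made-up stand-in so the example runs on its own.

```r
# Toy stand-in for the Kaggle data; these values are made up for illustration.
videoGameSales <- data.frame(
  Year        = c("2005", "2005", "2006", "2006", "N/A"),
  NA_Sales    = c(1.0, 2.0, 3.0, 0.5, 0.1),
  EU_Sales    = c(0.5, 1.0, 2.0, 0.5, 0.1),
  JP_Sales    = c(0.2, 0.3, 0.4, 0.1, 0.0),
  Other_Sales = c(0.1, 0.1, 0.2, 0.1, 0.0),
  stringsAsFactors = FALSE
)

# Step 1: convert Year to numeric; non-numeric entries such as "N/A" become NA.
videoGameSales$Year <- suppressWarnings(as.numeric(as.character(videoGameSales$Year)))

# Step 2: sum each region's sales within each year (rows with a missing Year are dropped).
salesByYear <- aggregate(
  cbind(NA_Sales, EU_Sales, JP_Sales, Other_Sales) ~ Year,
  data = videoGameSales, FUN = sum
)

# Step 3: convert to a matrix with one row per region and one column per year.
salesMatrix <- t(as.matrix(salesByYear[, -1]))

# Step 4: label the columns with the years.
colnames(salesMatrix) <- salesByYear$Year

# Step 5: draw the bar plot, one bar per region within each year.
barplot(salesMatrix, beside = TRUE,
        legend.text = rownames(salesMatrix),
        main = "Regional Video Game Sales by Year",
        xlab = "Year", ylab = "Sales (millions)")
```

The transpose in step 3 matters because barplot treats each column of the matrix as one group of bars, so years need to be the columns and regions the rows.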

