Week 2 RStudio

Hello! 
My school district is suffering from a cyber attack, and the entire computer system is shut down. I can’t access my blog from my computer since it’s connected to my school email, but I seem to still have access from my phone. My blog won’t be pretty this time, but I’ll make do!

The next chapter in “Data Visualization with R” by Rob Kabacoff is about ggplot2, which is one of the graphing packages in R. 
I didn’t really experiment with my own data for this one because the book provides examples to use. 

ggplot2 is a package used to build a graph in layers. You can start out with a simple graph, and make it gradually more complex by adding new elements. The book uses data from the 1985 Current Population Survey that shows the relationship between work experience and wages.
I’m going to keep this blog short this week since I’m on my phone, but here are some pictures and descriptions of what I did.

This is the code used to load the data and then set up the graph. The ggplot function takes the data and the mapping of variables to the visual properties of the graph. As you can see, the graph is still blank. This is because we haven’t added the actual elements to it yet.
https://drive.google.com/uc?export=view&id=1hYxpQfmU-v6E6K47ZE4f_He6e3oqAuzV
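Since the code above is only a screenshot, here is a rough typed-out sketch of that step. It assumes the CPS85 data comes from the mosaicData package and that experience and wage are stored in columns named exper and wage; the exact code in the picture may differ.

```r
library(ggplot2)

# Assumption: the 1985 Current Population Survey data ships as CPS85
# in the mosaicData package
data(CPS85, package = "mosaicData")

# Only the data and the variable-to-axis mapping are given here.
# No geoms have been added, so the plot draws as an empty panel.
p_blank <- ggplot(data = CPS85, mapping = aes(x = exper, y = wage))
print(p_blank)
```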

Next, I added the elements. The elements of the graph are called geoms, because they are drawn as geometric objects such as dots, lines, and bars. They are added with functions whose names start with geom_. For example, if you want points, you type geom_point(). After adding the points, there was one dot that was an outlier from the rest. After removing the outlier and setting up the graph again, this is the result:
https://drive.google.com/uc?export=view&id=1MVsGBLS_Yg1g22enpc5CjT8FmQWxYbHB
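A minimal sketch of this step might look like the following. The cutoff of 40 for the outlier is an assumption on my part (the post only says one point was removed), and plotdata is just an illustrative name.

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package
data(CPS85, package = "mosaicData")

# Drop the extreme wage value; the threshold of 40 is assumed, not confirmed
plotdata <- subset(CPS85, wage < 40)

# geom_point() adds one dot per row of the data
p_points <- ggplot(plotdata, aes(x = exper, y = wage)) +
  geom_point()
print(p_points)
```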

Next was changing the color, transparency, and size of the dots. To specify color you just type color = “ ” with the name of an available color inside the quotation marks. Transparency is set with alpha = and must be between 0 (fully transparent) and 1 (fully opaque). After that, a line was added to help visualize the line of best fit. This was done using the geom_smooth function. This gets complicated to explain, so I’m going to keep it vague and say you can change the type of line, color, and thickness.
https://drive.google.com/uc?export=view&id=1Bi7ufUxSQw9xAjzcTzgmk_jLky-GSHBw
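Here is a hedged sketch of what those options can look like in code. The specific color names, alpha value, and sizes are placeholders I picked, not necessarily the ones in the screenshot. A straight best-fit line is requested with method = "lm".

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package; cutoff of 40 is assumed
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p_styled <- ggplot(plotdata, aes(x = exper, y = wage)) +
  # Fixed (not data-driven) appearance for every point
  geom_point(color = "cornflowerblue", alpha = 0.7, size = 3) +
  # Linear best-fit line; linewidth needs ggplot2 3.4 or later
  # (older versions use size for line thickness instead)
  geom_smooth(method = "lm", formula = y ~ x,
              color = "darkred", linetype = "solid", linewidth = 1)
print(p_styled)
```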

After that, sex was added as a variable to compare wages and experience between men and women, and the color of the points was changed to show which is which. In addition, the book showed how to create several graphs, each with its own points, lines, and labels, to compare wage and experience in different fields. The scale_ functions let you modify how the variables are mapped on the graphs, such as the x and y axis scaling, as well as the labels, such as adding a $ to the numbers for the wage. A facet_ function was used to split the data into several graphs, one for each job sector. More informative labels were also added, such as a title, “hourly wage,” and “years of experience.” The result:
https://drive.google.com/uc?export=view&id=1IFp7cJ-GYkaRSr2bCEDZAAHY3g8KE0Od
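Roughly, the pieces described above could be combined like this. The axis breaks and the title text are placeholders of mine, and the column names sex and sector are assumed from the CPS85 data in mosaicData.

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package; cutoff of 40 is assumed
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p_facets <- ggplot(plotdata, aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # scale_ functions control axis breaks and how tick labels are printed
  scale_x_continuous(breaks = seq(0, 60, 10)) +
  scale_y_continuous(breaks = seq(0, 30, 5), labels = scales::dollar) +
  # One small graph per job sector
  facet_wrap(~sector) +
  # Placeholder title; axis labels follow the post
  labs(title = "Wages and experience by sector",
       x = "Years of experience",
       y = "Hourly wage",
       color = "Sex")
print(p_facets)
```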
You can even pick a theme using a theme_ function, which controls things like the background color, font, grid lines, and where the legend goes, to make the graphs more appealing to look at. The theme used was theme_minimal():
https://drive.google.com/uc?export=view&id=1YoCOmpy6ll9RwCHBLdhapieoyMjS-9Hr
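As a small sketch, a theme is just one more layer added with +. The legend placement line is my own illustration of the kind of tweak the theme() function allows. It is not something the screenshot necessarily includes.

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package; cutoff of 40 is assumed
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p_themed <- ggplot(plotdata, aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = 0.7) +
  # Complete built-in theme: light background, minimal decoration
  theme_minimal() +
  # Illustrative extra: move the legend below the plot
  theme(legend.position = "bottom")
print(p_themed)
```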

You can also place the data and mapping options directly inside a geom, so that these options apply only to that single geom. The previous examples put the data and mapping options inside the ggplot function, meaning they apply to every geom_ function.
https://drive.google.com/uc?export=view&id=1RhFKwCzXaepdHBQMn2-A89b3By1kgfZG

Below is how the graph looks when the sex-to-color mapping is only in the geom_point function and not in the geom_smooth function, which creates the line. You can see there is a single, opaque line that has its own color, and the points still have their own colors and transparency:
https://drive.google.com/uc?export=view&id=1tGHz_ENhKWkOTF4TglByC9AoqouaHRBu
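To make the difference concrete, here is a side-by-side sketch under the same assumptions as before (CPS85 from mosaicData, an assumed outlier cutoff of 40). The object names p_global and p_local are only for illustration.

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package; cutoff of 40 is assumed
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

# Mapping in ggplot(): color = sex applies to every geom,
# so the points are colored by sex AND there is one line per sex
p_global <- ggplot(plotdata, aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Mapping in geom_point() only: points are colored by sex,
# but geom_smooth() never sees that mapping and draws a single line
p_local <- ggplot(plotdata, aes(x = exper, y = wage)) +
  geom_point(aes(color = sex), alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black")

print(p_global)
print(p_local)
```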

The ggplot2 graph can also be saved as a named object in R, so it can be pulled out and used again, manipulated, or saved and printed. 
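A short sketch of that idea: store the plot in a variable, add to it later, and write it to a file with ggsave(). The object names, the file name, and the image dimensions are all placeholders.

```r
library(ggplot2)

# Assumption: CPS85 comes from the mosaicData package; cutoff of 40 is assumed
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

# Save the basic plot as a named object
myplot <- ggplot(plotdata, aes(x = exper, y = wage)) +
  geom_point()

# Reuse it later by adding another layer to the stored object
myplot_with_line <- myplot +
  geom_smooth(method = "lm", formula = y ~ x)
print(myplot_with_line)

# Write it to an image file (a temporary folder is used here as an example)
out_file <- file.path(tempdir(), "wage_plot.png")
ggsave(out_file, plot = myplot_with_line, width = 6, height = 4)
```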

WELP so much for a “shorter” blog post, huh? 
