Week 2 RStudio
Hello!







My school district is suffering from a cyber attack, and the entire computer system is shut down. I can’t access my blog from my computer since it’s connected to my school email, but I seem to still have access from my phone. My blog won’t be pretty this time, but I’ll make do!
The next chapter in “Data Visualization with R” by Rob Kabacoff is about ggplot2, which is one of the graphing packages in R.
I didn’t really experiment with my own data for this one because the book provides examples to use.
ggplot2 is a package used to build a graph in layers. You can start out with a simple graph and gradually make it more complex by adding new elements. The book uses data from the 1985 Current Population Survey to look at the relationship between work experience and wages.
I’m going to keep this blog short this week since I’m on my phone, but here are some pictures and descriptions of what I did.
This is the code used to load the data, and then create the graph and map the variables onto it. The ggplot() function takes the data and the mapping of variables to the visual properties of the graph, such as which variable goes on the x axis and which goes on the y axis. As you can see, the graph is still blank. This is because we haven’t added the actual elements to it yet.
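For reference, that first step looks roughly like this. It’s a sketch that assumes the book’s data is the CPS85 data frame from the mosaicData package (with exper and wage columns), and that both ggplot2 and mosaicData are installed.

```r
library(ggplot2)

# Assumption: the 1985 Current Population Survey data is the CPS85
# data frame that ships with the mosaicData package.
data(CPS85, package = "mosaicData")

# Map experience to the x axis and wage to the y axis.
# No geoms have been added yet, so only empty axes are drawn.
p <- ggplot(data = CPS85,
            mapping = aes(x = exper, y = wage))
p
```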
Next, I added the elements. The elements of the graph are called geoms, because they are drawn as geometric objects such as dots, lines, and bars. They are added by using the geom_ functions. For example, if you want points, you type geom_point(). After adding the points, there was one dot that was an outlier compared to the rest. After deleting the outlier and setting up the graph again, this is the result:
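Here’s a rough sketch of that step. I’m assuming the CPS85 data from mosaicData again, and the cutoff of 40 for dropping the outlier is an assumption for illustration (the outlier is a single very high wage).

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package.
data(CPS85, package = "mosaicData")

# Drop the single very high wage. The cutoff of 40 is an assumed value.
plotdata <- subset(CPS85, wage < 40)

# Add a layer of points on top of the empty plot.
p <- ggplot(data = plotdata,
            mapping = aes(x = exper, y = wage)) +
  geom_point()
p
```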
Next was changing the color, transparency, and size of the dots. To specify a color, you type color = “ ” with the name of an available color inside the quotation marks. Transparency is set with alpha = and must be between 0 (fully transparent) and 1 (fully opaque). Size is set with size = . After that, a line was added to help visualize the best fit. This was done using the geom_smooth() function. This gets complicated to explain, so I’m going to keep it vague and say you can change the type of line, its color, and its thickness.
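Something like the following covers both changes. The specific colors, the alpha and size values, and method = "lm" (a straight line fitted by linear regression) are my assumptions for the sketch, not necessarily what the book settles on.

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package, outlier dropped at 40.
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p <- ggplot(data = plotdata,
            mapping = aes(x = exper, y = wage)) +
  # Fixed (not data-driven) color, transparency, and size for every point.
  geom_point(color = "cornflowerblue", alpha = 0.7, size = 3) +
  # Add a fitted line. method = "lm" asks for a linear fit.
  geom_smooth(method = "lm", formula = y ~ x, color = "steelblue")
p
```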
After that, sex was mapped to the color of the points to compare wages and experience between men and women, so the color shows which is which. In addition, the book showed how to create several graphs with their own points, lines, and labels to show the comparison of wage and experience in different fields. The scale_ functions allow you to modify how the variables are mapped onto the graph, such as the x and y axis scaling and the axis labels, for example adding a $ to the numbers for the wage. The facet_ functions were used to split the graph into several smaller graphs, one for each job sector. More informative labels were also added, such as a title, “hourly wage” and “years of experience.” The result:
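Put together, those pieces might look like this. It’s a sketch under the same assumptions as before (CPS85 from mosaicData, with sex and sector columns); the break values, the two colors, and the title text are made up for illustration. The dollar labels come from the scales package, which is installed along with ggplot2.

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package, outlier dropped at 40.
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p <- ggplot(data = plotdata,
            mapping = aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Scales control how values appear on the axes and in the legend.
  scale_x_continuous(breaks = seq(0, 60, 10)) +
  scale_y_continuous(labels = scales::label_dollar()) +
  scale_color_manual(values = c("indianred3", "cornflowerblue")) +
  # One small graph per job sector.
  facet_wrap(~sector) +
  # Hypothetical title; axis labels follow the post.
  labs(title = "Wages and experience by sector",
       x = "Years of experience",
       y = "Hourly wage",
       color = "Sex")
p
```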
You can even pick a theme using the theme_ functions, which control the background color, font, grid lines, where the legend goes, etc., to make the graphs more appealing to look at. The theme used was theme_minimal():
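As a small sketch of how a theme gets added (same CPS85 assumption as above): theme_minimal() is the one from the book, while moving the legend to the bottom with theme() is just my own example of a tweak.

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package, outlier dropped at 40.
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

p <- ggplot(data = plotdata,
            mapping = aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = 0.7) +
  # A complete built-in theme: light background, minimal grid lines.
  theme_minimal() +
  # Individual settings can be overridden afterwards with theme().
  theme(legend.position = "bottom")
p
```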
You can also place the data and mapping options directly inside a geom, which makes these options apply only to that single geom. The previous examples put the data and mapping options inside the ggplot() function, meaning they applied to every geom_ function.
The graph below shows what happens when the sex-to-color mapping is only in the geom_point() function, and not in the geom_smooth() function, which creates the line. You can see there is a single, opaque line that has its own color, while the points are still colored by sex and keep their own transparency:
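In code, the difference is just where aes(color = sex) goes. This is a sketch with the same CPS85 assumption; the black line color and the alpha and size values are my own picks.

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package, outlier dropped at 40.
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

# x and y are mapped in ggplot(), so every geom inherits them.
p <- ggplot(data = plotdata,
            mapping = aes(x = exper, y = wage)) +
  # The color mapping lives only here, so only the points are split by sex.
  geom_point(aes(color = sex), alpha = 0.7, size = 3) +
  # No color mapping here, so one line is fitted to all the data.
  geom_smooth(method = "lm", formula = y ~ x, color = "black")
p
```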
The ggplot2 graph can also be saved as a named object in R, so it can be pulled out and used again, manipulated, or saved and printed.
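A quick sketch of that idea, again assuming CPS85 from mosaicData. The object name, the file name, and the width and height are hypothetical, and I’m writing to a temporary folder so nothing gets left behind.

```r
library(ggplot2)

# Assumption: CPS85 from the mosaicData package, outlier dropped at 40.
data(CPS85, package = "mosaicData")
plotdata <- subset(CPS85, wage < 40)

# Store the graph in a named object instead of printing it right away.
myplot <- ggplot(data = plotdata,
                 mapping = aes(x = exper, y = wage)) +
  geom_point()

# The saved object can be reused and extended later.
myplot2 <- myplot + geom_smooth(method = "lm", formula = y ~ x)

# Save it to an image file (hypothetical file name, size in inches).
out_file <- file.path(tempdir(), "wage_plot.png")
ggsave(filename = out_file, plot = myplot2, width = 6, height = 4)
```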
WELP so much for a “shorter” blog post, huh?