Hello!
I got a little behind on my blog for this semester, but here I am!
This semester we are learning to use R, a programming language for statistics and graphics. It seems to be used mostly to visualize data with graphs and tables. I don't know a whole lot about R other than that, but I am excited to learn a new language since my major is actually Computer Science. This will be a fun opportunity to enjoy both computer science and biology together.
So, I started reading chapter 1 in Data Visualization with R by Rob Kabacoff. The first chapter is about preparing your data for visualization. This can be done by importing your data from Excel, text files, and even databases using different packages.
Text files are imported using the readr package, Excel spreadsheets using the readxl package, and data files from other statistical packages using the haven package. Importing data from databases is apparently much more complicated, and is not covered in the book I am referencing.
Using RStudio, you can type your commands into the console window. The book uses employee salaries in its examples, so I'll mirror those.
To import data from a comma-delimited text file, you would type:
library(readr)
Salaries <- read_csv("salaries.csv")
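If you don't have a salaries.csv file handy, here is a little sketch you can run to see what read_csv() gives you back. The file contents and column names below are made up by me for illustration (they are not the book's data), and I'm assuming a fairly recent version of readr for the show_col_types argument.

```r
library(readr)

# Made-up sample data standing in for the book's salaries.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "rank,discipline,salary",
  "Prof,B,139750",
  "AsstProf,A,79750"
), csv_path)

# show_col_types = FALSE only silences the column-type message
Salaries <- read_csv(csv_path, show_col_types = FALSE)

# Quick checks that the import did what we expected
print(dim(Salaries))    # number of rows and columns
print(names(Salaries))  # column names taken from the header row
str(Salaries)           # the type readr guessed for each column
```

The first line of the file becomes the column names, and readr guesses whether each column is text or numbers.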
For Excel, you type something similar, but you specify the sheet you want:
library(readxl)
Salaries <- read_excel("salaries.xlsx", sheet=1)
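The sheet can also be picked by name instead of by number. Here is a small sketch using one of the example workbooks that ships with readxl (so it is not the book's salaries file), assuming the package is installed:

```r
library(readxl)

# Path to an example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

# List the sheet names so you know what you can ask for
print(excel_sheets(path))

# Read one sheet by name rather than by position
iris_data <- read_excel(path, sheet = "iris")
print(head(iris_data))
```

Listing the sheets first is handy when you didn't make the spreadsheet yourself and don't know which sheet holds the data.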
If you wanted to import data from a statistical package such as Stata (Note: I'm not sure specifically what Stata is), then you would type:
library(haven)
Salaries <- read_dta("salaries.dta")
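In haven, read_dta() is the reader for Stata's .dta files, while read_sav() is meant for SPSS files. Since I don't have a real Stata file, here is a rough round-trip sketch that writes a tiny made-up table with write_dta() and reads it back. The data values are invented purely for illustration.

```r
library(haven)

# Made-up data, written out in Stata format to a temporary file
example_df <- data.frame(
  rank   = c("Prof", "AsstProf"),
  salary = c(139750, 79750)
)
dta_path <- tempfile(fileext = ".dta")
write_dta(example_df, dta_path)

# Read the Stata file back into R
Salaries <- read_dta(dta_path)
print(Salaries)
```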
I decided to make a simple little table in Excel to try importing it into R and see what happens. I'm a huge Lord of the Rings nerd, so I decided to make a very incomplete table sorting out the different races and their groups in Middle Earth. FOR SCIENCE.
This still gave me a much better idea about how R sorts and displays your data, and how I should type my data in Excel to make it look cleaner in R. I decided it would be easier and require less research on my part if I just put in character names from the Lord of the Rings universe, so here is take two:
It looks MUCH more tidy in R now!
So moving on....
After importing data, you must begin sorting it. The book shows examples of this using Star Wars. To sort data, there are two packages you can use: dplyr and tidyr. The book has a table that shows both packages' uses and functions.
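To get a feel for the dplyr side of this, here is a minimal sketch using the starwars data set that comes with dplyr. The particular steps I chose (keeping only humans, adding a height in meters) are my own picks for illustration and not necessarily the exact examples from the book.

```r
library(dplyr)

# starwars is a sample data set that ships with dplyr
humans <- starwars %>%
  filter(species == "Human", !is.na(height)) %>%  # keep only some rows
  select(name, height, mass) %>%                  # keep only some columns
  mutate(height_m = height / 100) %>%             # add a new column (cm to m)
  arrange(desc(height))                           # sort rows, tallest first

print(head(humans))
```

Each function does one small job, and the %>% pipe passes the result of one step on to the next, so the code reads top to bottom like a recipe.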