In my spreadsheet modeling class this semester, I gave an assignment that involved doing some basic pivot tables and histograms for a dataset containing (fake) patient records from a post-anesthesia care unit (PACU). It's the place you go after having surgery until you recover sufficiently to either go home (for outpatient surgery) or head back to your hospital room.
You can find the data, the assignment and the R Markdown file in my <a href="https://github.com/misken/hselab-tutorials">hselab-tutorials</a> GitHub repo. Clone it or download a zip.
You'll see that one of the questions involves having students reflect on why certain kinds of analytical tasks are difficult to do in Excel. I have them read one of my previous posts on using R for a similar analysis task.
So, I thought it would be fun to do some of the things asked for in this Excel assignment but to use R instead. It is a very useful exercise and I think those somewhat new to R (especially coming from an Excel-centric world like a business school) will pick up some good tips and continue to add to their R knowledge base.
Some of the things that this exercise will touch on include:
- reading a CSV file and controlling the data types as they come in to an R dataframe
- converting Excel date/times to R datetimes (actually to POSIXct)
- doing typical datetime math
- working with R factors, levels and some string parsing
- using the plyr package for split-apply-combine analysis (aka “group by” analysis for SQL folks)
- avoiding an evil gotcha involving POSIXlt vs POSIXct datetime classes when using plyr
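To give a flavor of the first few items, here is a minimal sketch in base R. The column names, the two sample rows and the `m/d/Y H:M` text format are all made up for illustration and are not taken from the actual PACU file. The group-by step uses base R's `aggregate()` as a stand-in for plyr so the snippet runs without extra packages.

```r
# Hypothetical rows standing in for the PACU data; column names are assumptions.
csv_text <- "PatID,RecoveryIn,RecoveryOut,Severity
1,5/13/2010 8:15,5/13/2010 9:45,1
2,5/13/2010 10:00,5/13/2010 12:30,2"

# colClasses controls the data type of each column as it is read in.
pacu <- read.csv(text = csv_text,
                 colClasses = c("integer", "character", "character", "factor"))

# Parse the text into POSIXct. POSIXct is a single number under the hood,
# whereas POSIXlt is a list of date parts, which is what tends to cause
# trouble when datetimes live in data frame columns.
fmt <- "%m/%d/%Y %H:%M"
pacu$RecoveryIn  <- as.POSIXct(pacu$RecoveryIn,  format = fmt, tz = "UTC")
pacu$RecoveryOut <- as.POSIXct(pacu$RecoveryOut, format = fmt, tz = "UTC")

# Datetime math: time spent in recovery, in minutes.
pacu$RecoveryMins <- as.numeric(difftime(pacu$RecoveryOut, pacu$RecoveryIn,
                                         units = "mins"))

# "Group by" style summary (base R stand-in for a plyr call).
mean_by_sev <- aggregate(RecoveryMins ~ Severity, data = pacu, FUN = mean)
print(mean_by_sev)
```

Reading the datetimes as character first and converting them explicitly keeps the parsing format and time zone visible, rather than leaving R to guess.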
I thought I'd give RPubs a try for hosting this document. Check out the PACU analysis at RPubs.