Due, in stat405 mailbox, Thursday Oct 11
Perform a team debriefing. What worked well? What didn’t work well? Are you happy with the final project? What would you change next time? Write one page per team. (10 points)
Use regular expressions and the stringr package to solve the following challenges related to variables from the mpg2 data set from project 1. Unless otherwise noted, each challenge should be answered by creating a vector (or vectors) containing the answer.
Cleaning up transmissions: (5 points)
Is a car an automatic?
Convert Auto to Automatic
Eliminate all transmission codes with fewer than 50 cars
Convert (e.g.) “Automatic (S4)” to “Automatic 4-spd”. (Hint: either do this in two steps, or read the help for str_match)
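For anyone who has not used str_match before, here is a minimal sketch of what it returns. The strings and the pattern are made up for illustration and are not taken from mpg2; the variable names (x, m, wing, number) are likewise hypothetical.

```r
library(stringr)

# Toy strings, invented for this illustration (not from mpg2)
x <- c("Room (B12)", "Room (A7)", "Lobby")

# Two capture groups inside literal parentheses:
# one capital letter, then one or more digits
m <- str_match(x, "\\(([A-Z])(\\d+)\\)")

# m is a character matrix with one row per input string.
# Column 1 holds the full match, columns 2 and 3 hold the capture groups.
# Strings that do not match produce a row of NA.
wing <- m[, 2]
number <- as.integer(m[, 3])
```

Each capture group becomes its own column, so a single call can pull several pieces out of a string at once. Rows of NA mark the inputs that did not match, which makes it easy to check how much of a variable a pattern actually covers.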
Model names: (5 points)
For all Porsche models, extract the name of the model from its options (e.g. the model name of “Panamera S Hybrid” is “Panamera”).
Most Mercedes-Benz models have names like XYZ123. Extract the two pieces (alphabetic and numeric) and plot them.
How does the Mercedes naming scheme compare to the Infiniti naming scheme?
Understanding engine descriptions (10 points)
Make sense of the eng_desc field by breaking it down into multiple variables. What are the most common components? What components are so rare that it’s not worthwhile to make a special variable for them?
You will be graded on the quality of your code, focussing on correctness and concision.