Homework 4

Due, in stat405 mailbox, Thursday 20 Sep

Code cleanup fixes

04-code.r contains some data cleaning code that students used for this project last year. Your aim is to rewrite it to be as clean and understandable as possible. Follow the style guide and add minimal comments that explain the why, not the how.

Working with data

Load the motor insurance data into R, and encode the kilometers, zone and make variables as factors or ordered factors with the correct levels. Include in comments the output of str(yourdataframe).
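
As a rough illustration of the difference between a factor and an ordered factor, here is a minimal sketch on toy data. The data frame, column names, codes and labels are all invented for the example and are not the motor insurance coding; look up the real levels in the data description.

```r
# Toy data standing in for a real data set. The codes and labels below are
# invented for illustration only.
survey <- data.frame(
  size = c(2, 1, 3, 2),
  colour = c(1, 2, 2, 1)
)

# Size has a natural order, so it becomes an ordered factor.
survey$size <- factor(survey$size, levels = 1:3,
  labels = c("small", "medium", "large"), ordered = TRUE)

# Colour has no natural order, so a plain factor is enough.
survey$colour <- factor(survey$colour, levels = 1:2,
  labels = c("red", "blue"))

str(survey)
```

Giving levels and labels explicitly keeps the order under your control instead of leaving it to the default sort order.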

Hint: you can load the data directly from the appropriate URL - you don’t need to download it first.
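
The sketch below shows the general idea of reading from a URL, under some assumptions: the file is comma-separated, and a temporary local file exposed through a file:// URL stands in for the real address so that the example runs offline. The toy columns are invented, and the file:// construction assumes a Unix-like path.

```r
# Write a tiny CSV to a temporary file so the example runs offline.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("zone,claims", "1,10", "2,4"), csv_path)

# Turn the path into a file:// URL (this form assumes a Unix-like path).
csv_url <- paste0("file://", normalizePath(csv_path))

# read.csv() takes the URL directly, so there is no separate download step.
# An http:// address is passed in the same way.
toy <- read.csv(csv_url)
str(toy)
```

If the real file uses a different separator, read.table() or read.delim() accept a URL in the same position.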


Turn in a printed copy of your R file. Use comments to make a heading between the two sections.

The code cleanup is worth 10 points, and working with data is worth 15 points. You will lose points for style violations such as:

  • Incorrect spacing around commas, parentheses, operators or comments.

  • Lines that are wider than 80 columns so that they are either wrapped inappropriately, or cut off by the side of the page.

  • Failure to include descriptive comments which tell me what you are trying to achieve (not how you are achieving it).

  • Names for new variables that do not accurately express the content of the variable.

  • Incorrect indenting and/or placement of curly braces.
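
For a sense of what these points look like in practice, here is a short sketch. The function and variable names are hypothetical, and the conventions shown (spaces around operators and after commas, two-space indents, opening brace at the end of the line, lines under 80 columns) are assumptions about the style guide; where they differ, the style guide wins.

```r
# Claims per policy year are only meaningful when there was some exposure,
# so policies with no insured time are reported as missing rather than Inf.
claim_rate <- function(claims, insured_years) {
  if (length(claims) != length(insured_years)) {
    stop("claims and insured_years must have the same length")
  }

  rate <- claims / insured_years
  rate[insured_years == 0] <- NA
  rate
}

claim_rate(c(2, 0, 1), c(4, 1, 0))
```

The comment says why zero exposure is treated as missing; the code itself already shows how.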

Note that you can lose many points for style violations. Be careful! I emphasise these points because they are so important for the long-term maintainability of your code, and because the more easily I can understand your code, the better the grade you will get.