
Getting and Cleaning Data

Course page: https://class.coursera.org/getdata-034

By Jeff Leek, PhD, Roger D. Peng, PhD, Brian Caffo, PhD

  • Basic concepts:
    • Find and extract raw data
    • Tidy data principles
    • Practical R packages
  • Components of tidy data
    • Raw data: can exist at multiple levels (what counts as raw depends on where you start)
    • Tidy data
    • Should produce a code book (metadata):
      • could be in markdown
      • should have a section called “Study design” (e.g., how the raw data was collected)
      • must have a section called “Code book”: a description of each variable and its units
    • Explicit and exact recipe to go from raw to tidy (instruction list)
      • R script
      • input = raw data, output = processed data
      • the script takes no parameters
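
The components above can be sketched as a single R script. This is a minimal illustration, not the course's own script: the file names, the column names, and the cleaning steps are all assumptions chosen for the example, and the raw file is generated inline only so the sketch runs on its own. All paths are fixed inside the script, so it takes no parameters.

```r
# run_analysis.R (hypothetical name): raw data in, tidy data + code book out.
# No parameters: every path is fixed here so a rerun reproduces the same output.

raw_path      <- file.path(tempdir(), "raw_measurements.csv")   # assumed path
tidy_path     <- file.path(tempdir(), "tidy_measurements.csv")  # assumed path
codebook_path <- file.path(tempdir(), "CodeBook.md")            # assumed path

# Stand-in for the real raw file (made-up values, for illustration only).
writeLines(c(
  "Subject ID,Height (cm),Weight (kg)",
  "1,170,65",
  "2,,80",
  "2,,80",
  "3,182,77"
), raw_path)

# Step 1: read the raw data exactly as delivered (keep original headers).
raw <- read.csv(raw_path, check.names = FALSE, stringsAsFactors = FALSE)

# Step 2: give each variable a descriptive, machine-friendly name.
tidy <- raw
names(tidy) <- c("subject_id", "height_cm", "weight_kg")

# Step 3: drop exact duplicate rows; missing values stay as NA.
tidy <- unique(tidy)

# Step 4: write the processed data set.
write.csv(tidy, tidy_path, row.names = FALSE)

# Step 5: write a minimal code book in markdown with the two sections.
writeLines(c(
  "# Study design",
  "Placeholder: describe how the raw data was collected.",
  "",
  "# Code book",
  "- subject_id: participant identifier (no units)",
  "- height_cm: height in centimetres",
  "- weight_kg: weight in kilograms"
), codebook_path)
```

Running the script with `source("run_analysis.R")` (or `Rscript run_analysis.R`) regenerates both the tidy file and the code book from the raw input, which is the point of keeping the recipe explicit and parameter-free.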
  • Last modified: 2020/07/10 12:11