Data Science Toolbox
Course page: https://www.coursera.org/learn/data-scientists-tools
By Jeff Leek, PhD; Roger D. Peng, PhD; Brian Caffo, PhD
- Get help from R:
    ?rnorm
    help.search("rnorm")
    # Get the function arguments:
    args("rnorm")
    # See the code:
    rnorm
- R reference card downloaded.
- To ask for help on a problem with R, state:
  - What steps reproduce the problem?
  - What did we expect to see?
  - What did we see instead?
  - Which versions of R and of the packages are in use?
  - Which operating system?
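Most of these answers can be gathered from inside R itself. The following is a minimal sketch: `describe_environment` is a hypothetical helper name (not part of base R), and the `mean()` call with a missing value is an invented stand-in for "the problem".

```r
# Hypothetical helper: collect the version details a help request should include.
describe_environment <- function() {
  info <- sessionInfo()   # R version, platform, and attached packages
  sys  <- Sys.info()      # operating system details
  list(
    r_version = R.version.string,
    os        = paste(sys[["sysname"]], sys[["release"]]),
    # Versions of attached non-base packages (empty if none are attached)
    packages  = vapply(info$otherPkgs, function(p) p$Version, character(1))
  )
}

env <- describe_environment()
str(env)

# A tiny reproducible example: steps, observed result, expected result.
x <- c(1, 2, NA)
observed <- mean(x)                # what we saw: NA
expected <- mean(x, na.rm = TRUE)  # what we expected: 1.5
print(observed)
print(expected)
```

Pasting the output of `str(env)` together with a script this small usually lets someone else reproduce the problem on their own machine.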
- To ask for help on a data analysis question, state:
  - What is the question we are trying to answer?
  - What steps were used to answer the question?
  - What did we expect to see?
  - What did we see instead?
  - What other solutions did we think about?
- Places to get info on data science questions:
  - Stack Overflow
  - R mailing list
  - Cross Validated
  - Google: “[data type] data analysis” or “[data type] R package”
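- To check out a local copy of a GitHub repo:
    cd my_folder
    git init
    git remote add origin https://github.com/roche-emmanue/my_repo.git
    # These steps only link the folder to the remote; run `git pull origin <branch>` to download the files.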
- Help on Git and GitHub from:
  - Google / Stack Overflow
- Basic Markdown:
  - Headings with:
      ## This is a secondary heading
      ### This is a tertiary heading
  - Unordered lists:
      * Item 1
      * Item 2
      * Item 3
- Types of data science questions:
  - Descriptive:
    - Just describe the data.
  - Exploratory:
    - Look for relationships in the data, without really trying to confirm them.
  - Inferential:
    - Take a small dataset and try to generalize from it to a larger population.
  - Predictive:
    - Use X to predict Y; X predicting Y does not mean that X causes Y.
  - Causal:
    - What happens if we change the value of one variable…
  - Mechanistic
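As a rough illustration of how three of these question types differ in practice, the sketch below uses R's built-in `mtcars` data set. The choice of data set and of the `mpg` and `wt` columns is an assumption made for illustration, not something taken from the course.

```r
# Built-in example data: 32 cars, with fuel economy (mpg) and weight (wt).
data(mtcars)

# Descriptive: summarize the data we have, nothing more.
mean_mpg <- mean(mtcars$mpg)

# Inferential: use this sample to say something about a larger population,
# here a 95% confidence interval for the population mean of mpg.
ci_mpg <- t.test(mtcars$mpg)$conf.int

# Predictive: use weight to predict fuel economy for a new car.
# A good prediction does not show that weight causes the change in mpg.
fit <- lm(mpg ~ wt, data = mtcars)
pred_mpg <- predict(fit, newdata = data.frame(wt = 3))

print(mean_mpg)
print(ci_mpg)
print(pred_mpg)
```

The same column (`mpg`) appears in all three steps; what changes is the claim being made about it.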
- What is data:
  - The “set of items” = the population.
  - Variables = measurements of characteristics of an item.
  - Variables are qualitative (discrete scale) or quantitative (continuous scale).
  - Data can come as raw files, APIs, video, or audio.
  - Data is the second most important thing (the question is the most important thing).
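In R, the qualitative/quantitative distinction usually shows up as factor versus numeric columns. A minimal sketch, with a made-up `patients` data frame whose names and values are invented for illustration:

```r
# Each row is one item from the population; each column is a variable.
patients <- data.frame(
  blood_type = factor(c("A", "O", "B", "O")),  # qualitative variable
  height_cm  = c(172.5, 160.2, 181.0, 168.4)   # quantitative variable
)

levels(patients$blood_type)   # the discrete categories
summary(patients$height_cm)   # numeric summary of the continuous variable
```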
- To share large amounts of data: http://figshare.com
- To avoid confounding, we can hold the confounding variable fixed, stratify by it, or randomize when it cannot be fixed.
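These strategies can be sketched with a small simulation. This assumes a made-up binary confounder `z` that drives both `x` and `y`, while `x` has no effect on `y` at all. The variable names, effect sizes, sample size, and seed are all invented for illustration.

```r
set.seed(42)  # arbitrary seed so the simulation is repeatable
n <- 5000

# z is the confounder: it influences both x and y.
z <- rbinom(n, size = 1, prob = 0.5)
x <- 2 * z + rnorm(n)
y <- 3 * z + rnorm(n)   # note: y does not depend on x

# Naive analysis: x looks related to y, but only because of z.
naive_slope <- coef(lm(y ~ x))[["x"]]

# Fix / stratify: estimate the slope separately within each level of z.
stratum_slopes <- sapply(
  split(data.frame(x, y), z),
  function(d) coef(lm(y ~ x, data = d))[["x"]]
)

# Randomize: assign x independently of z, so z can no longer link x and y.
x_rand <- rnorm(n)
random_slope <- coef(lm(y ~ x_rand))[["x_rand"]]

print(naive_slope)      # clearly non-zero
print(stratum_slopes)   # both close to zero
print(random_slope)     # close to zero
```

Looking at a single stratum corresponds to holding the confounder fixed; comparing both strata corresponds to stratifying.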