
# Data Science Toolbox

Course page: https://www.coursera.org/learn/data-scientists-tools (by Jeff Leek, PhD, Roger D. Peng, PhD, and Brian Caffo, PhD)

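• Get help from within R:
# Open the help page of a function:
?rnorm
# Search the help system for a keyword:
help.search("rnorm")

# Get the function arguments:
args("rnorm")

# See the code (type the function name without parentheses):
rnorm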
• To ask for help on a problem with R, state:
1. What steps reproduce the problem?
2. What did we expect to see?
3. What did we see instead?
4. What versions of R and of the packages are we using?
5. What operating system are we on?
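The version and operating system details from the list above can be collected directly in R. This is a minimal sketch using base functions only; the `stats` package is an arbitrary example standing in for whichever package is involved in the problem.

```r
# Gather the environment details that belong in a help request.
r_version <- R.version.string                       # R version as one string
os_name   <- Sys.info()[["sysname"]]                # operating system name
pkg_ver   <- as.character(packageVersion("stats"))  # "stats" is only an example package

report <- c(R = r_version, OS = os_name, stats = pkg_ver)
print(report)

# sessionInfo() prints the same kind of information, plus the attached
# packages, in one call; its output can be pasted into the question.
print(sessionInfo())
```

Pasting the output of `sessionInfo()` into a question usually covers items 4 and 5 at once.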
• To ask for help on a data analysis question, state:
1. What question are we trying to answer?
2. What steps were used to answer the question?
3. What did we expect to see?
4. What did we see instead?
5. What other solutions did we think about?
• Places to get info on data science questions:
• Stack Overflow
• R mailing lists
• Cross Validated
• Google: “[data type] data analysis” or “[data type] R package”
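• To check out a local copy of a GitHub repo:
cd my_folder
git init
git remote add origin https://github.com/roche-emmanue/my_repo.git
# The steps above only register the remote; download its content with:
git fetch origin
# then check out the branch you need (git clone <url> does all of this in one step)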
• Help on Git and GitHub from:
• Basic Markdown:
## This is a secondary heading
### This is a tertiary heading
• Unordered lists:
* Item 1
* Item 2
* Item 3
• Types of data science questions:
1. Descriptive:
• Only describe the data at hand.
2. Exploratory:
• Look for relationships, without really trying to confirm them.
3. Inferential:
• Take a small sample of data and try to generalize the findings to a larger population.
4. Predictive:
• Use X to predict Y. That X predicts Y does not mean that X causes Y.
5. Causal:
• What happens to one variable if we change the value of another variable?
6. Mechanistic
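The first four question types can be illustrated with a short R sketch. It is only an illustration under assumptions: the built-in `mtcars` data set and the columns `mpg` and `wt` are arbitrary choices, not part of the course notes.

```r
data(mtcars)  # built-in example data set, used here only for illustration

# Descriptive: summarize the data at hand.
desc <- summary(mtcars$mpg)
print(desc)

# Exploratory: look for a relationship without trying to confirm it.
r <- cor(mtcars$mpg, mtcars$wt)
print(r)

# Inferential: generalize from the sample to a population
# (95% confidence interval for the mean fuel consumption).
ci <- t.test(mtcars$mpg)$conf.int
print(ci)

# Predictive: use weight to predict fuel consumption.
# A good prediction is not a claim that weight causes the consumption.
fit  <- lm(mpg ~ wt, data = mtcars)
pred <- predict(fit, newdata = data.frame(wt = 3))
print(pred)
```

Causal and mechanistic questions are left out on purpose: answering them generally needs a designed experiment or a model of the underlying process rather than a one-line call.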
• What is data:
• Data belong to a “set of items” = the population
• Variables = measurements of characteristics of an item.
• Variables are qualitative (discrete scale) or quantitative (continuous scale)
• Data can be a raw file, an API, video, or audio
• Data is the second most important thing (the question is the most important thing)
• To avoid confounding, we can fix the confounding variable, stratify on it, or randomize if it cannot be fixed.
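Randomizing and stratifying can be sketched in a few lines of R. Everything here is simulated and hypothetical: `age` plays the role of a confounding variable, and the group labels, sample size, and age bands are arbitrary.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
n   <- 1000
age <- rnorm(n, mean = 40, sd = 10)  # hypothetical confounding variable

# Randomize: assign the treatment independently of age.
treatment <- sample(rep(c("control", "treated"), each = n / 2))

# With random assignment, the mean age of both groups should be close,
# so age is unlikely to explain a difference between the groups.
group_means <- tapply(age, treatment, mean)
print(group_means)

# Stratify: compare the groups within age bands rather than overall.
age_band <- cut(age, breaks = c(-Inf, 30, 50, Inf),
                labels = c("young", "middle", "older"))
strata <- table(age_band, treatment)
print(strata)
```

Fixing the variable would correspond to keeping only one age band, at the cost of conclusions that hold for that band alone.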