====== Data Science Toolbox ======

Course page: https://www.coursera.org/learn/data-scientists-tools

By Jeff Leek, PhD, Roger D. Peng, PhD, Brian Caffo, PhD

  * Get help from R: <code>
# Open the help page for a function:
?rnorm
# Search the help system for a term:
help.search("rnorm")
# Get the function arguments:
args("rnorm")
# See the code (type the function name without parentheses):
rnorm
</code>
  * R reference card downloaded.
  * To ask for help on a problem with R:
    - What steps reproduce the problem?
    - What did we expect to see?
    - What did we see instead?
    - What version of R and of the packages?
    - What operating system?
  * To ask for help on a data analysis question:
    - What is the question we are trying to answer?
    - What steps were used to answer the question?
    - What did we expect to see?
    - What did we see instead?
    - What other solutions did we think about?
  * Places to get info on data science questions:
    * stackoverflow
    * R mailing list
    * CrossValidated
    * Google: "[data type] data analysis" or "[data type] R package"
  * To get a local copy of a github repo into an existing folder: <code>
cd my_folder
git init
# Link the local repository to the GitHub remote
git remote add origin https://github.com/roche-emmanue/my_repo.git
# Fetch the remote content (assumes the default branch is "master")
git pull origin master
</code>
  * Help on git and github from:
    * http://git-scm.com/doc
    * https://help.github.com/
    * Google/stackoverflow
  * Basic Markdown:
    * Headings with: <code>
## This is a secondary heading
### This is a tertiary heading
</code>
    * Unordered lists: <code>
* Item 1
* Item 2
* Item 3
</code>
  * Types of data science questions:
    - Descriptive:
      * Just trying to describe the data.
    - Exploratory:
      * Trying to find relationships (but not really trying to confirm them).
    - Inferential:
      * Take a small dataset and try to generalize from it to a larger population.
    - Predictive:
      * X **predicts** Y, which doesn't mean that X **causes** Y.
    - Causal:
      * What happens if we change the value of one variable...
    - Mechanistic
  * What is Data:
    * The "set of items" = **population**
    * **Variables** = measurements of characteristics of an item.
      * **qualitative** (discrete scale) or **quantitative** (continuous scale)
    * Data can come as a raw file / API / video / audio.
    * Data is the second most important thing (the question is the most important thing).
  * To share large amounts of data: http://figshare.com
  * To deal with a confounder, we can fix the confounding variable, stratify on it, or randomize if it cannot be fixed.

public/courses/data_science/data_science_toolbox/intro.txt Last modified: 2020/07/10 12:11 by 127.0.0.1
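
Side note on the confounder point in the notes above: a minimal sketch in R of what stratifying on a confounder can look like. Everything here is simulated and hypothetical (the variables ''x'', ''y'', ''z'', the effect sizes, and the seed are assumptions made for illustration, not material from the course). ''z'' drives both ''x'' and ''y'', while ''x'' has no effect on ''y''.

```r
# Simulated data: z is a hypothetical binary confounder.
set.seed(42)                      # arbitrary seed, only for reproducibility
n <- 2000
z <- rbinom(n, 1, 0.5)            # confounder (e.g. a group membership)
x <- rnorm(n, mean = 2 * z)       # z shifts x
y <- rnorm(n, mean = 3 * z)       # z shifts y; x itself has no effect on y

# Crude analysis: ignoring z, x appears to predict y.
crude <- coef(lm(y ~ x))["x"]

# Stratified analysis: fit the same model separately within each level of z.
stratified <- sapply(split(data.frame(x, y), z),
                     function(d) coef(lm(y ~ x, data = d))["x"])

print(crude)       # slope of y on x when z is ignored
print(stratified)  # slopes of y on x within z = 0 and within z = 1
```

Under these assumptions the crude slope should come out clearly positive, while the slopes within each stratum should stay close to zero: the apparent link between ''x'' and ''y'' comes from ''z''. Randomization would attack the same problem from the design side, by assigning ''x'' independently of ''z''.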