Difference between revisions of "DataScience"
| (7 intermediate revisions by the same user not shown) | |||
| Line 2: | Line 2: | ||
| '''What would a course on Data Science look like?''' | '''What would a course on Data Science look like?''' | ||
| + | |||
| + | [[Media:intro-to-data-science-nov15.pdf|Intro to Data Science]] | ||
| + | |||
| + | <!-- | ||
| =Introduction= | =Introduction= | ||
| − | [[Image: | + | [[Image:Data_Science_VD.png|400px|thumbnail|center|Drew Conway's Venn diagram of data science]] | 
| + | |||
| + | =Topics would include= | ||
| + | |||
| + | * What is relevant for the UoB? | ||
| + | * y=f(x) relationships: classifiers & regression | ||
| + | ** Examples: linear & logistic regression, k-nearest neighbours, decision trees, neural networks, etc. | ||
| + | * Data topics: | ||
| + | ** Training, test & validation data. | ||
| + | ** Sources of data, e.g. web scraping. | ||
| + | ** Exploratory Data Analysis (EDA). | ||
| + | ** Cleaning & munging data (90% of your effort?).  Useful Linux tools. | ||
| + | ** Feature selection. | ||
| + | * Model selection & training topics: | ||
| + | ** Algorithms that scale.  | ||
| + | ** Supervised vs. unsupervised learning. | ||
| + | ** Overfitting. | ||
| + | ** The curse of dimensionality. | ||
| + | * Programming skills: | ||
| + | ** "Clean code shows clarity of mind." | ||
| + | ** Languages: R? Python? Others? | ||
| + | ** Version control. | ||
| + | ** Build systems. | ||
| + | ** Testing. | ||
| + | ** Scripting and automation. | ||
| + | --> | ||