Syllabus
Fall 2018, Prof. Heather Mayes, MW 10-11:20 am, 3150 Dow
See also the [syllabus on Google Drive](https://docs.google.com/document/d/1dSnMrw5chVI4gSzG52Y3_wh5XpOMpfvovxjnbbBYcR8/edit?usp=sharing)
Course description
Engineers are encountering and generating an ever-growing body of data and recognizing the utility of applying data science (DataSci) approaches to extract knowledge from that data. A common barrier to learning DataSci is the stack of prerequisite courses that cannot fit into the typical engineering student’s schedule. This class will remove this barrier by covering, in one semester, essential foundational concepts that are not part of many engineering disciplines’ core curricula. These include good programming practices, data structures, linear algebra, numerical methods, algorithms, probability, and statistics. The class will focus on how these topics relate to data science and will provide context for further self-study.
Topics
- Intro to Bash/Python/vim/IDE (main ideas, then practice the rest of the semester)
- Get students comfortable with PyCharm and Jupyter notebooks (they can proceed with either)
- Enough of a foundation to be able to immediately use in the next week
- Introduce libraries, including NumPy and NumPy arrays
- Installing and updating libraries (conda, pip)
- Chapter 1 of http://www.crcnetbase.com/isbn/978-1-4987-4506-2
- Python intro to quickly get started: https://jakevdp.github.io/WhirlwindTourOfPython/ Getting it done
- Python intro with more comp sci content: http://interactivepython.org/courselib/static/pythonds/index.html
- Intro to Jupyter Notebooks: http://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html
- Programming practices
- Versioning (git)
- Testing
- Modular programming
- Documentation
- Data structures (continuing intro python)
- Basic data types (integers, floats, lists, tuples, arrays, dictionaries)
- Trees, graphs (NetworkX)
- Tie together how data structures are composed of other data structures (trees for sorted lists, keys on dicts, etc.)
- Debugging
- Timing (libraries versus writing own functions)
- Basics of linear algebra/numerical methods
- Show how to solve problems by hand and with NumPy
- Matrix algebra
- Linear determinants
- Basic eigenvalue/eigenvector problems
- Algorithms (e.g. key topics from http://algs4.cs.princeton.edu/home/)
- Terminology: Big O notation, NP-hard, etc.
- Sorting (as example)
- Searching/matching
- Graph processing
- String searching and manipulation
- Optimization methods
- Probability and statistics
- Basics (terminology, notation, basic laws)
- Conditional probabilities and Bayesian statistics
- Discrete and continuous probability distributions
- Hypothesis testing (t-tests, p-values and their controversial use, chi-squared tests)
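
As a small illustration of the "Timing (libraries versus writing own functions)" topic above, the following sketch compares a hand-written insertion sort with Python's built-in `sorted`. It is an assumed example rather than course material: the function name `insertion_sort`, the input size, and the repeat count are hypothetical choices made for illustration only.

```python
import random
import timeit


def insertion_sort(values):
    """Return a new sorted list using a hand-written insertion sort (O(n^2))."""
    result = list(values)  # copy so the caller's list is not modified
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


random.seed(0)  # fixed seed so the input data is repeatable
data = [random.randint(0, 10_000) for _ in range(500)]  # hypothetical test input

# Both approaches should agree before it makes sense to time them.
assert insertion_sort(data) == sorted(data)

# Time 20 calls of each approach; absolute numbers depend on the machine.
own_time = timeit.timeit(lambda: insertion_sort(data), number=20)
builtin_time = timeit.timeit(lambda: sorted(data), number=20)

print(f"hand-written insertion sort: {own_time:.4f} s")
print(f"built-in sorted():           {builtin_time:.4f} s")
```

The exact timings vary by machine, so none are quoted here. The sketch checks that both approaches give the same answer before measuring either one.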
Texts
Free online resources
- Whirlwind Tour of Python by Jake VanderPlas, https://jakevdp.github.io/WhirlwindTourOfPython/
- Python Data Science Handbook by Jake VanderPlas, https://jakevdp.github.io/PythonDataScienceHandbook/
Free through the University of Michigan: Download while on campus or VPN
Willmore, F. T.; Jankowski, E.; Colina, C., Eds. Introduction to Scientific and Technical Computing; Taylor & Francis Group: Boca Raton, FL, 2017. http://www.crcnetbase.com/isbn/978-1-4987-4506-2