Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Math 152: Intro to Mathematical Software
2017-02-13
Kiran Kedlaya; University of California, San Diego
adapted from lectures by William Stein, University of Washington
**Lecture 15: Pandas (part 1)**
Usual schedule this week:
Sections Monday
Office hours Monday and Tuesday
HW due Tuesday 8pm
Peer evaluations due Thursday 8pm
However, next Monday is a holiday (Presidents Day), so no lecture, sections or office hours that day.
Pandas Overview
"pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language."
Problem pandas solves: data analysis and modeling. pandas enables you to carry out your entire data analysis workflow in Python without having to switch to a more domain-specific language like R.
Pandas does not implement significant modeling functionality outside of linear and panel regression. Instead, one uses statsmodels ("estimate statistical models, and perform statistical tests") and scikit-learn ("Machine Learning in Python"), both of which we will look at later.
Look at the overview of functionality at the bottom here: http://pandas.pydata.org/#library-highlights
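To make "easy-to-use data structures and data analysis tools" a bit more concrete, here is a minimal sketch of a tiny workflow: build a table, filter rows, and summarize by group. The data, the column names (`student`, `section`, `score`) and the cutoff of 85 are all made up for illustration; they do not come from the lecture.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame({
    "student": ["Ann", "Bob", "Cat", "Dan"],
    "section": ["A", "A", "B", "B"],
    "score": [90, 70, 85, 95],
})

# Filter rows with a boolean mask: keep scores of at least 85.
high = df[df["score"] >= 85]

# Split the rows by section and average the scores within each group.
mean_by_section = df.groupby("section")["score"].mean()

print(high)
print(mean_by_section)
```

The two objects used here are a `DataFrame` (a table with labeled columns) and a `Series` (a single labeled column, which is what the `groupby(...).mean()` step returns).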