I recently did some Pandas-based analysis of Uber trip data for trip prediction. Along the way, I found Pandas really helpful for data processing and visualization, hence this post.
For obvious reasons, I am not supposed to publish internal business data online, so I stripped down the original code and changed the data source to stock prices.
Unfortunately, Uber trip data and stock price data are fundamentally different. For example, Uber trip data exhibits a strong weekly pattern while stock prices do not. As a side effect, some of the analysis in this post is not practically useful, and some of it might not even make sense. I still consider it valuable as Pandas practice.
Pandas is a great tool for data transformation, analysis and visualization, as long as the data set fits in memory.
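To check whether a data set is likely to fit in memory, one option is to measure the footprint of a loaded DataFrame. The snippet below is a minimal sketch on a made-up frame; the column names `price` and `volume` are placeholders, not the columns of the original analysis.

```python
import numpy as np
import pandas as pd

# Placeholder frame: 1000 rows, one float64 column and one int64 column.
df = pd.DataFrame({
    "price": np.zeros(1000, dtype="float64"),
    "volume": np.zeros(1000, dtype="int64"),
})

# Bytes used by each column, excluding the index.
# deep=True also counts the contents of object (e.g. string) columns.
bytes_per_column = df.memory_usage(index=False, deep=True)
total_bytes = int(bytes_per_column.sum())

print(bytes_per_column)
print(total_bytes)
```

Measuring a small sample this way and scaling up by the row count gives a rough estimate for the full data set.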
To better understand Pandas, you first need a grasp of the basic concepts. I found the best starting point was to read through Introduction to Data Structures. Once you understand the basic data structures (mainly Series and DataFrame), the rest of the library should be easier to follow.
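As a quick taste of those data structures, here is a small sketch of a Series and a DataFrame. The dates and numbers are invented for illustration and are not real stock prices.

```python
import pandas as pd

dates = ["2015-01-05", "2015-01-06", "2015-01-07"]

# A Series is a one-dimensional array of values with a labeled index.
close = pd.Series([10.0, 10.5, 10.2], index=dates, name="close")
volume = pd.Series([1000, 1500, 1200], index=dates, name="volume")

# A DataFrame is a table whose columns are Series sharing one index.
prices = pd.DataFrame({"close": close, "volume": volume})

# Selecting a column gives back a Series.
close_column = prices["close"]

# .loc selects by row label and column label.
one_value = prices.loc["2015-01-06", "close"]

print(prices)
```

Most operations in Pandas either take or return one of these two types, which is why they are worth learning first.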
If you need to deal with time series, you should also read through Time Series / Date functionality.
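To give a flavor of the time series support, the sketch below builds a synthetic daily series on business days and applies a few common operations. The values are made up, and the choice of weekly resampling and day-of-week grouping is only an illustration, not the analysis from the original project.

```python
import numpy as np
import pandas as pd

# Ten business days starting Monday 2015-01-05, with synthetic values 100..109.
idx = pd.date_range("2015-01-05", periods=10, freq="B")
prices = pd.Series(np.arange(10, dtype=float) + 100.0, index=idx, name="close")

# Date strings can be used to slice a DatetimeIndex by label.
first_week = prices["2015-01-05":"2015-01-09"]

# Resample to weekly frequency, keeping the last value of each week.
weekly_last = prices.resample("W").last()

# Group by day of week (0 = Monday) to look for a weekly pattern.
by_weekday = prices.groupby(prices.index.dayofweek).mean()

print(weekly_last)
print(by_weekday)
```

Grouping by day of week is the kind of operation that reveals a weekly pattern in trip data, even though it says little about stock prices.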