Monthly Archives: July 2017

Understanding the Python world

There are two classes of Python users: those who use Python for management scripting and web development, and those who use it for scientific computing. According to many, Python is winning the war against R in the field of machine learning. As machine learning becomes ever more important and popular, so does Python.

This article is for those who have just gotten into the Python world. Experienced Python developers are excused to leave now. Python is extremely simple to start with: the traditional Hello World program takes only one line. In no time, you will be writing classes, methods, and system calls, and really enjoying the convenience and power of the language. But you will also quickly realize that there is a lot of strange code floating around, and you will hear a lot of unfamiliar jargon. You may feel a little puzzled and a little confused about the big picture of the Python world.
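
As a rough illustration of that first stretch of the learning curve, here is a minimal sketch: the one-line Hello World, then a small class with a method, then a call into the operating system. The `Greeter` class and its names are made up for this example.

```python
import os

# The traditional Hello World really is one line.
print("Hello, World!")


class Greeter:
    """A tiny class with a single method (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}!"


greeting = Greeter("Python").greet()

# A simple call into the operating system: ask for the current working directory.
cwd = os.getcwd()
```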

That is where this article comes in. I intend to draw a broad picture of the Python world, with pointers for you to explore in depth. By the end of the article, you won’t have become an expert on anything in Python. But, hopefully, you will understand the big picture and know where to go for further information and to gain expertise.

Using Dropwizard to build micro-services in Java

Micro-services are the norm nowadays when building backend internet applications or business applications. Of course, we would not want to build everything from scratch. Which micro-service framework should we use, then? If you work in Java, your choices basically boil down to either Dropwizard or Spring. If you work in Scala, you have more choices, such as Twitter’s Finagle, the Play framework, and others.

Between Dropwizard and Spring, I personally prefer Dropwizard because it seems simpler and has a more elegant design. So what does Dropwizard do for us as developers? Let’s quickly run through a virtual exercise in building a micro-service.


Stock price analysis and visualization with Pandas


I recently did some Pandas-based analysis of Uber trip data for trip prediction purposes. During the process, I found Pandas really helpful for data processing and visualization, hence this post.

However, for obvious reasons, I am not supposed to publish internal business data online. So here I have stripped down the original code and changed the data source to stock prices.

Unfortunately, Uber trip data and stock price data are fundamentally different. For example, Uber trip data exhibit a strong weekly pattern while stock prices do not. As a side effect, some of the analysis in this post is not practically useful, and some of it might not even make sense. But I still consider it valuable for practicing Pandas.
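
One way to check whether a daily series has a weekly pattern is to average it by day of week. The sketch below assumes a made-up series in which weekends are busier than weekdays; the numbers are synthetic and are not taken from any real trip or stock data.

```python
import pandas as pd

# Synthetic daily series, invented for illustration: four full weeks.
days = pd.date_range("2017-01-02", periods=28, freq="D")  # 2017-01-02 is a Monday
trips = pd.Series(100.0, index=days)
trips[days.dayofweek >= 5] = 150.0  # Saturdays and Sundays are busier

# Average by day of week (0 = Monday ... 6 = Sunday).
by_weekday = trips.groupby(trips.index.dayofweek).mean()

# Gap between the busiest and quietest day of the week.
# A value near zero would suggest there is no weekly pattern.
weekly_spread = by_weekday.max() - by_weekday.min()
```

Running the same grouping on daily stock closes would be expected to give a much flatter profile, which is the difference described above.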

Pandas is a really great tool for data transformation, analysis, and visualization, as long as the data set fits in memory.

To better understand Pandas, you first need a grasp of the basic concepts. I found the best starting point was to read through Introduction to Data Structures. Once you understand the basic data structures, the rest should be easier to follow.
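
As a quick taste of those data structures, here is a minimal sketch of the two main ones, Series and DataFrame. The prices and labels are invented for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
close = pd.Series([10.0, 10.5, 10.2], index=["mon", "tue", "wed"], name="close")

# A DataFrame is a table whose columns share one index.
prices = pd.DataFrame(
    {"open": [9.8, 10.1, 10.4], "close": close.values},
    index=close.index,
)

# Selecting a column gives back a Series; arithmetic aligns on the index labels.
daily_change = prices["close"] - prices["open"]
```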

If you need to deal with time series, you should also read through Time Series / Date functionality.
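
To give a flavor of what that functionality covers, the sketch below shows three common operations on a date-indexed Series: resampling to weekly values, a rolling mean, and selecting a date range by label. The data are synthetic, and the window size is an arbitrary choice for the example.

```python
import pandas as pd

# Ten synthetic daily values, invented for illustration.
days = pd.date_range("2017-07-03", periods=10, freq="D")  # 2017-07-03 is a Monday
close = pd.Series([float(i) for i in range(10)], index=days)

# Downsample to weekly means; "W" bins end on Sunday by default.
weekly_mean = close.resample("W").mean()

# 3-day rolling mean; the first two entries are NaN because the window is not full yet.
rolling_mean = close.rolling(window=3).mean()

# Select a date range by label; both endpoints are included.
first_week = close.loc["2017-07-03":"2017-07-09"]
```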


How to create a site similar to this

This site is based on WordPress, a free and open-source content management system (CMS) built on PHP and MySQL. Most of the steps can be found in this well-written article: Setting Up WordPress on Amazon EC2 in 5 minutes.

It is always a good idea to set up an SSL certificate for your site. An SSL certificate is an online identification document issued by a CA (Certificate Authority). There are many CA vendors to choose from. I bought mine at an annual cost of around $27. You should stay away from StartCom and WoSign because they are blacklisted by major browsers (read here).
