Visualize the Tennis World with Docker and Pandas

Namaste everyone. Today I am going to do a small experiment on tennis game history. I asked myself, “How can I know the facts of tennis without asking others? What if I generate them myself?” So I tried to visualize which countries produce the largest numbers of tennis players. But I don’t want to go straight to the solution here. Rather, we will first discuss a few things which are useful in constructing a universal visualization lab. I also want to use this article to introduce Plotly, a plotting library in Python.

What I finally visualized in the experiment

I wanted to find out which countries have the largest number of players in professional ATP tennis.


Western countries occupy the top of the list in producing ATP tennis players.

We can answer many other queries like:

“How well do players perform at their respective ages?”

“Which country produces the most quality players?”

and more. But here I am going to show you how we can visualize data and arrive at a solution like the one above.

For downloading the IPython notebook visit this link.

Building a Python data visualization lab in Docker

Folks, you may be wondering why I brought Docker into the picture. I am discussing Docker because it is an advantage for a data analyst or a developer to isolate their work from everything else. Otherwise I would need to write 100 articles showing the setup procedure on 100 operating systems. Docker allows us to create an identical container on whatever operating system we are working with. I will now show how to build a complete scientific Python stack from scratch in a Docker container. You can store it as a package which you can also push to the cloud via Docker Hub. So let us begin.

I hope you know something about Docker. If not, just read my previous article: Docker up and running.

Step 1

$ docker run -i -t -p 8000:8000 -p 8001:8001 -v /home/naren/pylab:/home/pylab ubuntu:14.04

This creates an Ubuntu 14.04 container with two ports open, 8000 and 8001. We can use these ports to forward the IPython notebook to the host browser in our visualization procedure later. It also mounts the pylab folder in my host /home directory to /home/pylab in the container. When you run this, you will automatically enter the bash shell of the container.

Step 2

Now install required packages as below.

root@ffrt76yu:/# apt-get update && apt-get upgrade
root@ffrt76yu:/# apt-get install build-essential
root@ffrt76yu:/# apt-get install python python-pip python-dev
root@ffrt76yu:/# pip install pandas ipython jupyter plotly

That’s it. Pandas will pull in numpy as a dependency (matplotlib can be installed the same way if you need it). We are now ready with our development environment for visualizing anything. We can launch an IPython notebook using this command.

$ ipython notebook --ip= --port=8000

So now we have a running IPython notebook forwarded to port 8000 of our local machine. Fire up your browser and you will find the notebook software running there. Select a new “Python 2” notebook from the menu at the top right.

If you don’t want all the pain, just pull my plotting environment from Docker Hub.

$ docker run -i -t -p 8000:8000 -p 8001:8001 -v /home/naren/pylab:/home/pylab narenarya/plotlab
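If you would rather bake the same setup into an image yourself, the steps above translate into a Dockerfile. A minimal sketch (the image tag you build is your own choice):

FROM ubuntu:14.04

# Scientific Python stack, same packages as the manual steps above
RUN apt-get update && apt-get install -y \
    build-essential python python-pip python-dev
RUN pip install pandas ipython jupyter plotly

EXPOSE 8000 8001

Build it once with docker build -t pylab-image . and the identical environment runs on any machine with Docker.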

At the heart of the visualization is Plotly, a library which allows us to create complex graphs and charts using numpy and pandas. We load a dataset into a dataframe using pandas, then plot the cleaned data using Plotly. Full documentation of Plotly can be found at

For my work I used Jeff Sackmann’s ATP tennis dataset from GitHub.

Extract all the dataset files to your pylab folder so they are visible to your notebook. We are interested here in atp_players.csv. We first clean the data to find out how many players belong to each country, then map them on a scatter plot. The code looks like this.

from random import shuffle
import colorsys
import pandas as pd
from plotly.offline import init_notebook_mode, iplot
from plotly.graph_objs import *


# Load players into players dataframe 
players = pd.read_csv('atp_players.csv')

# Find the top 20 countries by player frequency
countries = players.groupby(['Country']).size()
selected_countries = countries.sort_values(ascending=False)[:20]

# Generating 20 random color palettes for plotting each country.
N = 20
HSV_tuples = [(x*1.0/N, 0.5, 0.5) for x in range(N)]
RGB_tuples = map(lambda x: colorsys.hsv_to_rgb(*x), HSV_tuples)

""" plotting code. A iplot needs data and a layout 
    So now we prepare data and then layout. Here data is a scatter plot
trace0 = Scatter(
    x = list(selected_countries.index),
    y = list(selected_countries.values),
    mode = 'markers',
    marker = {'color' : plot_colors, 'size' : [30] * N}

# Data can be a list of plot types. You can have more than one scatter plots on figure 
data = [trace0]

# layout has properties like x-axis label, y-axis label, background-color etc
layout = Layout(
    xaxis = {'title':"Country"}, # x-axis label
    yaxis = {'title':" No of ATP players produced"}, # y-axis label
    height=600, # height & width of plot
    plot_bgcolor='rgb(233,233,233)', # background color of plot layout

# Build figure from data, layout and plot it.
fig = Figure(data=data, layout=layout)

There is nothing fancy in the code. We just did the following things:

  • Loaded the ATP players dataset into a Pandas DataFrame
  • Assigned a different random color to each country by generating random RGB values
  • Created a Scatter plot with markers mode
  • Created a layout with axis details
  • Plotted the data and layout using the iplot method of the plotly library

When I run this code in the IPython notebook (Shift + Enter), I see the scatter plot given at the beginning of the article.

For full documentation on all kinds of plots visit this link.

This is only one visualization from the dataset. You can draw many more analytics from all the datasets provided in the git repo. One obvious advantage here is that you are doing this entire thing in a Docker container: it is faster, and it is easy to recover from a broken environment. You can also commit your container to a Docker image.
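For example, a small sketch of committing and pushing (the container ID here is the one from the shell prompts above; the tag is illustrative):

$ docker commit ffrt76yu narenarya/plotlab
$ docker push narenarya/plotlab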

For downloading my IPython notebook visit this link.

If you have any queries, reach me by mail. Thanks to all.


Lessons I learnt in the quest of writing beautiful Python code

Hello everyone. I always wondered what the good practices are for developing software in Python. I was young and inexperienced a few years back, but the people around me and the situations I faced over the past few years have taught me many things: about coding style, good development patterns, and more. Here I am going to discuss a few things which are important to turn your normal coding style into an elegant one. These are collected from my own code reviews and others’.

If you keep all these points in mind, from tomorrow you will see a different aspect of coding. Thanks to my inspiration, Chandra, Software Architect @ Knowlarity Communications, for reviewing my code and giving valuable tips from his vast software development experience. Let us see how not to write code.

* Your code is a Baby. Protect it with Exception Handling

A software program fails when it accepts wrong input. A good developer always guards his piece of code; no one can guess all the possible bugs that creep in. In statically typed languages like C and C++, the type system enforces the kind of information passed to a variable. But in dynamic languages like Python and Ruby there are many chances of a program failing due to an incorrect type. Duck typing is a comfort, but it comes at the expense of more careful error handling. So I always wrap my code in try/except blocks. If you know what type of error you might encounter, it is easy to make your code behave properly; at the very least, it won’t break your program. Let us see the first illustration, handling JSON data.

import json

def handle_json(data_string):
    parsed_data = json.loads(data_string)
    return parsed_data

A Python newbie just leaves the above code and thinks the job is finished. But the code may break if ill-formed JSON is passed to the handle_json function. So it is better to handle the error.

import json

def handle_json(data_string):
        parsed_data = json.loads(data_string)
        return {}
    return parsed_data

This is basic error handling. It turns into good practice if we also log a message when the error occurs, and handling the specific error will do even more good.

import json
import logging

def handle_json(data_string):
        parsed_data = json.loads(data_string)
    except ValueError as e:
        logging.error("Error occurred: %s" % e.message)
        return {}
    return parsed_data

So never think of error handling as an add-on. It is compulsory when writing software for reliable systems.

* Never put magic numbers in the code

It is common for us to use constants in programs. We define a few things as mappings to sequences of numbers; an enumerated data type is an example. It gives us a range of named constants. So use the name of the constant instead of the constant itself.

fruit = int(raw_input("1.Apple\n2.Mango\n3.Guava\n4.Grape\n5.Orange\nEnter your favorite fruit: "))
if fruit == 1:
    print "Fruit is Apple"
elif fruit == 2:
    print "Fruit is Mango"
elif fruit == 3:
    print "Fruit is Guava"
    print "Fruit is not available"

It is just a simple program which takes a number as input and uses it to select a fruit type. But when someone reads the code, they will wonder what those 1, 2, 3 mean. English names convey messages better than mere numbers. So the good practice is not to hard-code anything; instead, use your own Enum type to map meaningful names to numbers.

class Fruit(object):
    APPLE, MANGO, GUAVA, GRAPE, ORANGE = range(1, 6)

    def tostring(cls, val):
        """String representation of a Fruit type."""
        for k, v in vars(cls).iteritems():
            if v == val:
                return k

fruit = int(raw_input("1.Apple\n2.Mango\n3.Guava\n4.Grape\n5.Orange\nEnter your favorite fruit: "))
print "The fruit is: %s" % (Fruit.tostring(fruit)).capitalize()

See, by building our own enumeration we are able to transform a hard-coded program into a beautiful, meaningful one. Here we defined a class to store named constants, and we reverse-look up a key from its value using our tostring method. Never put magic numbers in the code, because in larger systems they create ambiguity. Code is for humans first and computers next.

* Best ways of working with a dictionary

Many of us work with dictionaries in Python as frequently as we take a sip of coffee. Observed carefully, beginner developers usually have a habit of accessing a dictionary value using the bracket method.

students =  {1: "Naren", 2: "Sriman", 3:"Habeeb"}
print students[1]

Everybody does that, you might wonder. Yes, it is the most trivial way of accessing a value from a dictionary. But as we shouted in our first tip, you should handle the error when you query a dictionary for a non-existing key. You could say:

students = {1: "Naren", 2: "Sriman", 3: "Habeeb", 4: "Ashwin"}
    print students[1]
except KeyError:
    print None

Instead of doing all this we can use one straightforward operation, get, on the dictionary. Python returns the value if the key exists, else it returns None.

students = {1: "Naren", 2: "Sriman", 3: "Habeeb", 4: "Ashwin"}
print students.get(1)
# This prints None
print students.get(101)

In the beginning of my development career, I used to mix both ways in one program, which looks pretty awkward. So my advice is to use the get function or the bracket [] method according to your personal taste, but remember two things.

  • Using get gives you automatic error handling
  • Keep your program uniform.
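Note that get also accepts an optional default as its second argument, returned instead of None when the key is missing; a small sketch:

students = {1: "Naren", 2: "Sriman", 3: "Habeeb"}

# Prints "Unknown" instead of None for a missing key
print students.get(101, "Unknown")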

One more useful case is when you are processing a dictionary and want to update an existing dictionary with a new one. Many people do this:

students =  {1: "Naren", 2: "Sriman", 3:"Habeeb", 4:"Ashwin "}
new_students = {5: "Tony", 6:"Srikanth", 7:"Rajesh"}
# A trivial way to add new students to students map
students[5] = new_students[5]
students[6] = new_students[6]
students[7] = new_students[7]
But there is a handy method called update on any Python dictionary. It allows us to merge the second dictionary into the first.

students = {1: "Naren", 2: "Sriman", 3: "Habeeb", 4: "Ashwin"}
new_students = {5: "Tony", 6: "Srikanth", 7: "Rajesh"}
students.update(new_students)

This method is crisp because it avoids a lot of typing, and it also makes the program look cleaner.

* Always do Validation of the data first and then pre-processing

My point here is: many fellow programmers return empty (None) from a function when they find the data is invalid to proceed, but they do a lot of pre-processing before checking for validity. That computation is wasted. It is illogical for a program to spend time on work before checking whether the result will be used or ignored. It may seem a small thing to many people, but handling this pattern cleverly can have a huge impact on code performance.


valid_data = [1,2,3,4,5]
def process(value):
    new_value = preprocess(value)
    if value not in valid_data:
        return None
    return new_value

def preprocess(value):
    # Do a heavy computation task
    return value

print process(23)

In small amounts of code we easily spot the inefficiency of checking the condition last. I always feel I have lost my common sense if I find the above mistake in my code later. Check conditions in the first lines of a function, and only then do anything with the data. So the process function should look like this:
def process(value):
    # Make a habit of filtering in the first lines
    if value not in valid_data:
        return None
    # Now do whatever you want
    new_value = preprocess(value)
    return new_value

I write code for a telephony company where the product is built upon thousands of lines of legacy Python code. There, performance is critical. If I design one procedure with the above mistake, it has a huge business impact on the product; even a few seconds’ delay is not bearable. So keep this in mind: always reject invalid cases first, and only then do the pre-processing.

* Avoid trivial conditionals in code

This is not actually a mistake, but it is a very good practice to avoid lots of trivial IF and ELSE blocks in the code.

def is_even(value):
    if value % 2 == 0:
        return True
        return False

print is_even(4)

But observing carefully, we can remove the else here, because it is trivial that if the condition is True, control won’t stay in the function any longer. So we can modify the code to:

def is_even(value):
    if value % 2 == 0:
        return True
    return False

print is_even(4)

So remember this as a thumb rule: “When there is a truth check, try to use a single conditional and treat the other branch as the trivial fall-through.”
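In fact, when a function only returns True or False, even the remaining conditional is unnecessary; the comparison itself already evaluates to a boolean:

def is_even(value):
    # The expression is already True or False
    return value % 2 == 0

print is_even(4)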

* Other notable points

In addition to the above points, there are a few other important things.

  • Reach the maximum level of abstraction by placing common logic at the top level of the code and specific implementations at the bottom.
  • Follow PEP-8 and PEP-257. It will make the code more readable. I hated it at first, but now I love the structure it gives the code.
  • Make sure the docstrings of classes and methods convey the right message in a Python program.
  • In an ORM like Django’s or SQLAlchemy, use filter rather than get, because the former is safer. filter always returns a list (possibly empty), while get raises errors, such as when duplicates exist, which you must handle explicitly.
  • Make a habit of removing print statements and debuggers before committing the code to git.
  • When you add a new feature, please do write a unit test case. It will help a new developer understand the functionality of the class or procedure you defined.
  • Never push code without developer testing.

Once again, thanks to my inspirations, Chandra, Software Architect @ Knowlarity Communications, and Mohammed Habeeb, for reviewing my code and giving valuable tips from their vast software development experience.


Building your own URL shortening service with python and flask

Have you ever wondered how people create URL shortening websites? They just do it using common sense. You heard it right. I too thought it was a very big task. But after thinking a bit, I came to know that simple mathematical concepts can be used to write beautiful applications. What is the link between mathematics and URL shortening? That is what we are going to unveil in this article.

In a single statement: a URL shortening service is built upon two things.

  1.  A string mapping algorithm to map long strings to short strings (Base62)
  2.  A simple web framework (Flask, Tornado) that redirects a short URL to the original URL

There are two obvious advantages of URL shortening.

  1. You can remember the URL. Easy to maintain.
  2. You can use the links where there are restrictions on text length, e.g. Twitter.

Technique of URL shortening

There is no such thing as a URL shortening algorithm. Under the hood, every record stored in the database is allocated a primary key (PK). That PK is passed into an algorithm which in turn generates a short string. We indirectly map that short string to the URL the customer registers with us.

I visited a URL shortening website and passed my blog link to it. I got back this short link.

[Screenshot: the short link returned by the shortening service]

Here one question comes to mind: how do they reduce a lengthy string to a short one? They are not actually reducing the size of the original link; they just add a layer of abstraction. The steps everyone follows are:

  • Insert a record with the URL into the database
  • Use the record ID returned to generate the short string
  • Pass it back to the customer
  • Whenever you receive a request, extract the short string from the URL and regenerate the database record ID -> fetch the URL -> simply redirect to the website


That’s it. It is very simple to generate a short string from a given large number using the Base62 algorithm. Whenever a request comes to our website, we get the number back by decoding the short string from the URL, then use that numeric ID to fetch the record from the database and redirect to that URL.

Let us build one such URL shortener in Python

The code for this project is available in my git repo.

As I told you before, there are three ingredients in preparing a URL shortening service.

  • Base62 Encoder and Decoder
  • Flask for handling requests and redirects
  • SQLite3 for serving the purpose of database

Now, if you know about converting Base10 to Base62 (or any base) then you can proceed with me. Otherwise, first read up on base conversions here.

Here I am interested only in Base62 because I need to generate strings which are combinations of [a-z][A-Z][0-9]. The encoder maps an integer to a string; the decoder recovers the integer from a given string. They are like a function and its inverse. This is the Base62 code for the encoder and decoder in Python.

from math import floor
import string

def toBase62(num, b = 62):
    if b <= 0 or b > 62:
        return 0
    base = string.digits + string.lowercase + string.uppercase
    r = num % b
    res = base[r]
    q = floor(num / b)
    while q:
        r = q % b
        q = floor(q / b)
        res = base[int(r)] + res
    return res

def toBase10(num, b = 62):
    base = string.digits + string.lowercase + string.uppercase
    limit = len(num)
    res = 0
    for i in xrange(limit):
        res = b * res + base.find(num[i])
    return res
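
A quick round trip shows the two functions are inverses (the outputs assume the digits + lowercase + uppercase alphabet defined above):

print toBase62(125)              # '21'
print toBase10('21')             # 125
print toBase10(toBase62(99999))  # 99999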
Now let me create a database called urls.db using the following command.

$ sqlite3 urls.db

Now I am creating the Flask app file and a template file.


from flask import Flask, request, render_template, redirect
from math import floor
from sqlite3 import OperationalError
import string, sqlite3
from urlparse import urlparse

host = 'http://localhost:5000/'

#Assuming urls.db is in your app root folder
def table_check():
    # Create the table on first run; skip if it already exists
    create_table = """
        CREATE TABLE WEB_URL(
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        URL  TEXT    NOT NULL
    with sqlite3.connect('urls.db') as conn:
        cursor = conn.cursor()
        except OperationalError:
            pass  # Table already exists
# Base62 Encoder and Decoder
def toBase62(num, b = 62):
    if b <= 0 or b > 62:
        return 0
    base = string.digits + string.lowercase + string.uppercase
    r = num % b
    res = base[r]
    q = floor(num / b)
    while q:
        r = q % b
        q = floor(q / b)
        res = base[int(r)] + res
    return res

def toBase10(num, b = 62):
    base = string.digits + string.lowercase + string.uppercase
    limit = len(num)
    res = 0
    for i in xrange(limit):
        res = b * res + base.find(num[i])
    return res

app = Flask(__name__)

# Home page where the user enters the URL to shorten
@app.route('/', methods=['GET', 'POST'])
def home():
    if request.method == 'POST':
        original_url = request.form.get('url')
        if urlparse(original_url).scheme == '':
            original_url = 'http://' + original_url
        with sqlite3.connect('urls.db') as conn:
            cursor = conn.cursor()
            insert_row = """
                INSERT INTO WEB_URL (URL)
                    VALUES ('%s')
            result_cursor = cursor.execute(insert_row)
            encoded_string = toBase62(result_cursor.lastrowid)
        return render_template('home.html',short_url= host + encoded_string)
    return render_template('home.html')

@app.route('/<short_url>')
def redirect_short_url(short_url):
    decoded_string = toBase10(short_url)
    redirect_url = 'http://localhost:5000'
    with sqlite3.connect('urls.db') as conn:
        cursor = conn.cursor()
        select_row = """
                SELECT URL FROM WEB_URL
                    WHERE ID=%s
        result_cursor = cursor.execute(select_row)
            redirect_url = result_cursor.fetchone()[0]
        except Exception as e:
            print e
    return redirect(redirect_url)

if __name__ == '__main__':
    # This code checks whether the database table is created or not
Let me explain what is going on here.
  • We have the Base62 encoder and decoder
  • We have two view functions: home and redirect_short_url
  • The home function ('/') returns the home page and also posts the original URL into the database
  • redirect_short_url ('/<short_url>') receives the request for a short link and redirects it to the original URL. If you read the code carefully, you can easily grasp the flow.

We can also take a look at the template here.

Project structure looks this way.

[Screenshot: the project structure]

Run the flask app on port 5000.

$ python
 * Running on (Press CTRL+C to quit)
 * Restarting with stat

If you visit http://localhost:5000 in your browser you will see

Screenshot from 2015-11-01 01:30:49

Now enter a URL to shorten and click submit. It posts the data to the database and generates a short string, like in the image below. In my case it is http://localhost:5000/f. The string seems very short, but as the number of registered URLs increases, the string grows gradually, e.g. 11Qxd.

[Screenshot: the generated short URL]

Now, if we click that link, it takes us to the original page.

[Screenshot: the original page after redirect]

So this is how URL shortening works. For the entire code, just clone my repo and give it a try.

I hope you enjoyed the article. Please do comment if you have any query; you can even reach me by mail.

A primer on Database Transactions and Asynchronous Requests in Django

Hello, Namaste. Today we are going to look at a few Django web framework cookies that make our life sweeter. Let us learn a few things which help us implement this functionality when the situation demands. The topics are the following:

  1. Implementing database transactions in Django
  2. Making asynchronous HTTP requests from Django code

1) Django DB Transactions

I am creating a REST API. I want to insert POST data into the database, but a list is received in the POST. I want to validate each element in the list and only then insert the data. There are two rules that make this insertion operation atomic.

  • Insert the data only if all elements pass the validation criteria.
  • While inserting, if there is duplicate data, abort the transaction and return an integrity error.

A demo project is available on my GitHub.

Let me create a sample Django project to illustrate the things we are going to discuss. I am doing this on an Ubuntu 14.04 machine with Python 2.7, Django 1.8 and MySQL.

$ virtualenv cookie
$ source cookie/bin/activate
$ pip install django==1.8.5 MySQL-python

Now let us create a sample project called cookie.

$ django-admin startproject cookie

In cookie I am going to create a view which takes a list of numbers and, if all numbers are prime, stores them in the db. On an invalid prime or any duplicate entry, it aborts the operation.

$ django-admin startapp primer

Now do the following to create a model called Prime in primer app.

# primer/

from django.db import models
import re

class Prime(models.Model):
    number = models.IntegerField(unique=True)
    def __str__(self):
        return str(self.number)
    def prime_check(self):
        if re.match(r'^1?$|^(11+?)\1+$', '1' * self.number):
            raise Exception('Number is not prime')

prime_check is the function we defined to validate the data before inserting it into the db. Always validate your data using a model class method. Now go and modify to change the database to MySQL, add the primer app to INSTALLED_APPS, and run the migrations.

# cookie/

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'cookie',
        'USER': 'root',
        'PASSWORD': 'passme',
        'HOST': 'localhost',
        'PORT': '3306',

Now run the migrations:

$ python makemigrations
$ python syncdb

Now the MySQL tables, including Prime, will be created. Let us create a URL and a view that takes a list of numbers via POST and inserts them into the db if all are primes. Modify primer/ and primer/ as below:

# primer/
from django.conf.urls import include, url
from django.contrib import admin
from primer import views

urlpatterns = [
   url(r'^admin/', include(,
   url(r'^supply_primes/$', views.supply_primes, name="prime"),
# primer/
from django.shortcuts import render
from django.http import HttpResponse, JsonResponse
from django.views.decorators.csrf import csrf_exempt
from primer.models import Prime
import json 

# Create your views here.
@csrf_exempt
def supply_primes(request):
    if request.method == 'GET':
        return JsonResponse({'response': 'prime numbers insert API'})
    if request.method == 'POST':
        primes = json.loads(request.body)['primes']
        #Validating data before inserting
        valid_prime = Prime()
        for number in primes:
            valid_prime.number = number
            except Exception:
                message = {'error': {
                      'prime_number': 'The Prime number : %s \
                       is invalid.' % number}}
                return JsonResponse(message)
        #All numbers are valid primes, now insert them one by one
        for number in primes:
        return JsonResponse({"response": "data successfully stored"})

We filter the data before inserting anything, but an IntegrityError surfaces only when we actually insert duplicate data into the db.

If we insert [11, 13, 17] and next try to insert [19, 23, 13], then in the second case an error is returned while inserting 13 because it is a duplicate. But 19 and 23 are already inserted by then. This is where transactions come in handy. Now we can modify the code to:

from django.shortcuts import render
from django.http import HttpResponse, JsonResponse
from django.views.decorators.csrf import csrf_exempt
from primer.models import Prime
from django.db import transaction,IntegrityError
import json 

# Create your views here.

@csrf_exempt
def supply_primes(request):
    if request.method == 'GET':
        return JsonResponse({'response': 'prime numbers insert API'})
    if request.method == 'POST':
        primes = json.loads(request.body)['primes']
        #Validating data before inserting
        valid_prime = Prime()
        for number in primes:
            valid_prime.number = number
            except Exception:
                message = {'error': {
                      'prime_number': 'The Prime number : %s \
                       is invalid.' % number}}
                return JsonResponse(message)
        #Carefully look for exceptions in real time while inserting
        for number in primes:
            except IntegrityError:
                #We got an error. Undo all previous insertions
                message = {'error': {'prime_number': 'This prime number(%s) is already registered.' % number}}
                return JsonResponse(message)
        #If everything is fine, commit the changes and flush to db
        return JsonResponse({"response": "data successfully stored"})


The three statements I used for transactions

  • transaction.set_autocommit(False)
  • transaction.rollback()
  • transaction.commit()

The first statement turns off the autopilot and makes it manual: let me choose whether to save something or not.

The second statement rolls back whatever changes I made since the last commit.

The third statement commits and flushes the changes to the db.

These statements give us full control over the storage pattern in the database. Without transactions you have the single statement Prime(number=number).save(), which directly pushes changes to the database. If you need to put something into the DB through your own logic, use the transaction library in Django.
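As a side note, Django also ships a higher-level shorthand, transaction.atomic, which wraps a block in a transaction and rolls back automatically when an exception escapes the block. A sketch of the same insertion using it (insert_primes is a hypothetical helper, not part of the demo project):

from django.db import transaction, IntegrityError
from primer.models import Prime

def insert_primes(primes):
    try:
        with transaction.atomic():
            # Either every number is saved, or none are
            for number in primes:
                Prime(number=number).save()
    except IntegrityError:
        return False
    return True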

Let us see it in action

Run the Django web server as below.

  $ python runserver 8200

It runs our Django project on localhost with port 8200.

Let us fire up Postman to make a POST request to http://localhost:8200/supply_primes. You can also use curl.

[Screenshot: Postman POST with valid primes]

It shows that the data is successfully stored, because all the numbers are primes. Let us see the data.

[Screenshot: the primes stored in the database]

Now let me try to insert [26, 13, 17]. Because 26 is not prime, it returns the following response.

[Screenshot: error response for a non-prime]

Cool. Then try to insert [29, 13, 67]. If you observe, we are trying to insert a duplicate (13).

[Screenshot: error response for the duplicate prime]

and the database looks like

[Screenshot: database contents after the rollback]

Here 29 is not present. Actually it was inserted, but rolled back when 13 generated the IntegrityError. This is how transactions work.

2) Asynchronous Requests from the Django Code

Suppose your Django code base is too large and slow, and someone asks you to insert a hook in the code which posts some data to an external URL. Then your Django behaves even slower. If you are making 100 sequential requests, the last hook executes after a long time. Critical code should not be blocked because of side players.

The solution to overcome this problem is to make asynchronous non-blocking requests from the Django code.

* Synchronous code

import requests

counter = 0
res = requests.get('http://localhost:8200/supply_primes')
# some other django task
counter += 1

Here the counter is incremented only after the request made by the second statement succeeds or fails. The blocking request makes Django pause until the request is processed.

* Asynchronous code

If you are using Python 3, there is a wonderful library called asyncio for making parallel HTTP requests; look into it if that is your setup. If you are running your Django projects on Python 2.7.x, then carry on.

$ pip install requests-futures

This is a library which makes parallel requests on top of the requests library of Python.

from requests_futures.sessions import FuturesSession

counter = 0
session = FuturesSession()
res = session.get('http://localhost:8200/supply_primes')
# Some other django task
counter += 1

Here session.get won’t block the increment of the counter, so your Django code speeds up. Use this library wherever you want HTTP requests made in the background (it runs them on a worker pool) without blocking the main flow.
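session.get returns a Future from Python's concurrent.futures; when you eventually need the response, call result(), which blocks only at that point. A small sketch:

from requests_futures.sessions import FuturesSession

session = FuturesSession()
future = session.get('http://localhost:8200/supply_primes')

# ... do other Django work while the request runs in the background ...

response = future.result()  # Blocks here only if the request is still running
print response.status_code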

For more details, see the demo project on my GitHub.

Five trivial things every python programmer should work with


Namaste everyone. This time I came up with a few sensible suggestions which can affect our coding style. A good habit leads to a good output. If you are already working with the things I am going to mention, then you are on the right track. Otherwise, you will surely gain something useful in a few minutes.

1) Virtualenv

Yeah, the first important thing we should know about is working with virtual environments in Python. I have observed that a lot of people install packages into their default Python interpreter. Separating the interpreting environment always keeps things clean; we can work on different projects on the same machine without conflicts. For installing virtualenv on an Ubuntu 14.04 machine just do:

$ sudo apt-get install python-pip
$ pip install virtualenv

Suppose I am working on Flask project, I create a virtual environment for that and install all dependencies for Flask. A virtual environment is created with command “virtualenv env_name”

# This creates a virtual environment called flask_env
$ virtualenv ~/flask_env

Now tell the machine to drop the default Python interpreter and load this flask_env interpreter using:

$ source ~/flask_env/bin/activate

Now you are in a separate world. Install packages using pip.

(flask_env)$ pip install flask requests

Now if you want to drop out of the virtual environment, run deactivate.

# This command deactivates virtual environment's interpreter and loads default

(flask_env)$ deactivate

Hint: Always use Virtualenv to separate project environments.

2) IPython

Have you ever faced the problem of hitting the up-arrow key several times to recall the nth previous command in the Python shell? Or needing to rush to the Python API docs to learn the properties and methods available in a package or module? Then you should use IPython. It is an interactive shell with tons of options: you can see the method names and properties of any module on the fly. It is a tool every programmer should have. For installing IPython just use this command.

$ pip install ipython

There is another variation of IPython called Notebook, where we can save our scripts as notebooks on a web-based interpreter. We can share them and reuse them.

You can launch the IPython shell using the ipython command. To see the suggestion lookup for method names, press TAB after entering the dot (.).

[Screenshot: IPython TAB completion]

Generally IPython is used for creating shorter scripts and testing language features. My favorite command is %cpaste. Using it, I can copy code directly into the terminal without losing the indentation; in the conventional Python shell, pasting and formatting is painful. For more details visit this link.

3) Anaconda sublime plugin

[Screenshot: the Anaconda Sublime plugin]

If you are writing shorter scripts and testing them, IPython is sufficient. But if you want a full-fledged Python editor with the following features:

  • Automatic code completion
  • PEP-8 and PEP-257 checking and reporting

Then you should use the Anaconda plugin with Sublime Text. Sublime Text 3 is a great editor for Python development: it is fluid, takes few resources and can handle any kind of file without pain. Combining [Anaconda plugin + Sublime Text 3] = Python IDE. You can see how to set up the plugin using Package Control here.

4) IPdb

One more common thing I observe in Python beginners is not using any debugger while testing their code. Python is an interpreted language; it executes line by line. But still, in big projects with various function calls, we do not see the actual code flow. We all know the classic debugger in Python called pdb. IPdb is a combination of IPython + pdb (an interactive Python debugger).

Using IPdb we can set break points anywhere in our code using one single statement.

import ipdb; ipdb.set_trace()

Insert the above statement into your Python code. When the program executes, control stops at that line. From there you can go line by line and inspect variables to debug the code. I am listing the primary keys used for debugging here; a small usage sketch follows the list.

  • n – execute next line
  • c – execute remaining program
  • r – continue execution until current function returns
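
A minimal sketch of how a break point looks in practice (the function here is made up for illustration):

import ipdb

def total_price(items):
    subtotal = sum(items)
    ipdb.set_trace()  # Execution pauses here; inspect subtotal, then step with 'n'
    tax = subtotal * 0.15
    return subtotal + tax

print total_price([100, 250, 40])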

For more details about IPdb and debugging visit this link.

5)  Logging

I have seen people putting print statements in many places to debug their code and to write information to the console. Logging information on the console this way is a very bad practice. Python provides an excellent built-in logging library which is sadly neglected by most Python developers. Logging your program's activities is a very good habit that helps you diagnose failures. Here is a jump start on logging a Python function's activity to a file.

import logging

logger = logging.getLogger(__name__)

fh = logging.FileHandler('add.log')

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')


def add(x, y):"Just now received parameters %d and %d" % (x, y))
    addition = x + y"Returning the computed addition %d" % addition)
    return addition

if __name__ == '__main__':
    add(3, 5)
Here we are not doing anything fancy; we are just logging the activity of an add function to a file called add.log. The script does these things:

  • Create a logger object with the current file name as handle
  • Set the level to DEBUG. It can also be INFO or ERROR according to the context of the log
  • Create a file handler, which redirects logs to a physical file
  • Create a formatter and set it on the file handler. This defines the custom message format, with time, date etc., that appears in the log file
  • Add the file handler to the logger object we created
  • Sprinkle INFO or DEBUG messages wherever you want to note down activity. They are recorded in the file, and you can review the log file in case of failure

[Screenshot: contents of add.log]

See how simple logging is. But very few developers show interest in doing it while building software. Make logging in your programs a habit.

So these are the five notable minimum things every Python developer should use and care about to improve their productivity. If you have any queries, just comment below. Thanks.



Build massively scalable RESTFul API with Falcon and PyPy

Namaste everyone. If you build a RESTful API for some purpose, which technology stack do you use in Python, and why? I may receive the following answers from you.

1)  I use Flask with Flask-RESTFul

2)  I use (Django + Tastypie) or (Django + REST Framework)

Neither option is suitable for me, because there is a very good lightweight API framework available in Python called Falcon. I always keep my project and my REST API loosely coupled: my REST API knows little about the Django or Flask project it serves. Creating cloud APIs with a low-level web framework rather than a bulky, wrapped one always speeds up my API.

What is Falcon?

As per the official Falcon website:

“Falcon is a minimalist WSGI library for building speedy web APIs and app backends. We like to think of Falcon as the Dieter Rams of web frameworks.”

“When it comes to building HTTP APIs, other frameworks weigh you down with tons of dependencies and unnecessary abstractions. Falcon cuts to the chase with a clean design that embraces HTTP and the REST architectural style.”

If you want to hit bare metal when creating an API, use Falcon. You can build an easy-to-develop, easy-to-serve and easy-to-scale API with Falcon. Just use it for speed.

What is PyPy?

“If you want your code to run faster, you should probably just use PyPy.” — Guido van Rossum

PyPy is a fast, compliant alternative implementation of the Python language

So PyPy is a JIT-compiling implementation of Python. It is a separate interpreter that can be used like a normal interpreter in a virtual environment to power our projects. In most cases, there are no issues with PyPy.

Let’s start building a simple todo REST API

Note: The project source is available on my GitHub.

Falcon and PyPy are our ingredients to build a scalable, faster REST API. We start with a virtual environment that runs PyPy, with Falcon installed using pip. Then we use RethinkDB as the resource provider for our API. Our todo app does these main things:

  1. Create a note (POST)
  2. Fetch a note by ID (GET)
  3. Fetch all notes (GET)
  4. PUT & DELETE are left as obvious extensions

Install RethinkDB on Ubuntu 14.04 in this way.

$ source /etc/lsb-release && echo "deb $DISTRIB_CODENAME main" | sudo tee /etc/apt/sources.list.d/rethinkdb.list
$ wget -qO- | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install rethinkdb
$ sudo cp /etc/rethinkdb/default.conf.sample /etc/rethinkdb/instances.d/instance1.conf
$ sudo /etc/init.d/rethinkdb restart

Create a virtualenv for the project and install the required libraries. Download PyPy from the official PyPy download page. After downloading, extract the files and install pip if required.

$ sudo apt-get install python-pip
$ virtualenv -p pypy-2.6.1-linux64/bin/pypy falconenv
$ source falconenv/bin/activate
$ pip install rethinkdb falcon gunicorn

Now we are ready with our stack: PyPy as the Python interpreter, Falcon as the web framework to build the RESTful API, and Gunicorn as the WSGI server that serves our API. Now let us prepare our RethinkDB client for fetching and inserting resources. Let me name the file
import os
import rethinkdb as r
from rethinkdb.errors import RqlRuntimeError, RqlDriverError

RDB_HOST = 'localhost'
RDB_PORT = 28015

# Database is todo and table is notes
PROJECT_DB = 'todo'
PROJECT_TABLE = 'notes'

# Set up db connection client
db_connection = r.connect(RDB_HOST,RDB_PORT)

# Function is for cross-checking that the database and table exist
def dbSetup():
        r.db_create(PROJECT_DB).run(db_connection)
        r.db(PROJECT_DB).table_create(PROJECT_TABLE).run(db_connection)
        print 'Database setup completed.'
    except RqlRuntimeError:
            r.db(PROJECT_DB).table_create(PROJECT_TABLE).run(db_connection)
            print 'Table creation completed'
        except RqlRuntimeError:
            print 'Table already exists.Nothing to do'


Don’t worry if you do not know about RethinkDB; just go to this link and see the quickstart: RethinkDB Python. We just prepared a db connection client and created the database and table. Now the actual thing comes. Falcon allows us to define a resource class which we can route to a URL. In that resource class we can have four REST methods:

  1. on_get
  2. on_post
  3. on_put
  4. on_delete

We are going to implement the first two methods in this article. Create a file called
import falcon
import json

from db_client import *

class NoteResource:
    def on_get(self, req, resp):
        """Handles GET requests"""
        # Return the note for a particular ID
        if req.get_param("id"):
            result = {'note': r.db(PROJECT_DB).table(PROJECT_TABLE).get(req.get_param("id")).run(db_connection)}
            # Return all notes
            note_cursor = r.db(PROJECT_DB).table(PROJECT_TABLE).run(db_connection)
            result = {'notes': [i for i in note_cursor]}
        resp.body = json.dumps(result)

    def on_post(self, req, resp):
        """Handles POST requests"""
            raw_json =
        except Exception as ex:
            raise falcon.HTTPError(falcon.HTTP_400, 'Error', ex.message)

            result = json.loads(raw_json, encoding='utf-8')
            sid = r.db(PROJECT_DB).table(PROJECT_TABLE).insert({'title': result['title'], 'body': result['body']}).run(db_connection)
            resp.body = 'Successfully inserted %s' % sid
        except ValueError:
            raise falcon.HTTPError(falcon.HTTP_400, 'Invalid JSON', 'Could not decode the request body. The JSON was incorrect.')

dbSetup()  # Make sure the database and table exist before serving
api = falcon.API()
api.add_route('/notes', NoteResource())

We can break down the code into following pieces.

  1. We imported falcon and database client
  2. Created a resource class called NoteResource
  3. Created two methods called on_get and on_post on NoteResource.
  4. In the on_get method, we check for an “id” parameter in the request and send one resource (note) or all resources (notes). req and resp are Falcon’s request and response objects, respectively.
  5. In the on_post method, we read the body as raw JSON and decode it to store title and body in the RethinkDB notes table.
  6. We create Falcon’s API class and add a route to it, ‘/notes’ in our case.

Now, in order to serve the API, we should start a WSGI server, because Falcon needs an independent server to deliver the API. So launch Gunicorn.

$ gunicorn app:api


This will run the Gunicorn WSGI server on port 8000. Visit http://localhost:8000/notes to view all stored notes.

If the notes are empty, then add one using a POST request to our API.
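You can also do it from the command line with curl; since on_post reads the raw request body as JSON, something like this should work (the field values are just examples):

$ curl -X POST http://localhost:8000/notes \
       -d '{"title": "At 9:00 AM", "body": "Daily standup"}'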

[Screenshot: POSTing a note to the API]

Now add one more note as shown above with different data, say { "title": "At 10:00 AM", "body": "Scrum meeting scheduled" }. Now visit http://localhost:8000/notes once again and you will find this:


[Screenshot: both notes returned by /notes]

If we want to fetch an element by id, then do it like this: http://localhost:8000/notes?id=d24866be-36f0-4713-81fd-750b1b2b3bd4. Now only the note with the given ID is displayed.

[Screenshot: a single note fetched by id]

This is how Falcon enables us to create a REST API easily at a very low level. There are many additional features available in Falcon; for more details visit the Falcon home page. If you want to see the full source code of the above demonstration, visit this link.

Please do comment if you have any query. Have a good day :).

Build a real time data push engine using Python and RethinkDB


Namaste everyone. Today we are going to talk about building real time data push engines. How to design models for the modern realtime web will be the limelight of this article. We are going to build a cool push engine that notifies “Super Heroes” in real time in the Justice League (DC). We can also develop real time chat applications very easily with the same principles.

What actually is a Data Push Engine?

A push engine is nothing but a piece of software that pushes notifications from the server to all the clients who subscribed to receive those events. When your app polls for data, it becomes slow, unscalable, and cumbersome to maintain. To overcome this burden, two proposals were made.

  1. Web Sockets
  2. Server Sent Events(SSE)

But using either of the above technologies alone is not sufficient for the modern real time web. Think of it this way: the query-response database access model works well on the web because it maps directly to HTTP’s request-response. However, modern marketplaces, streaming analytics apps, multiplayer games, and collaborative web and mobile apps require sending data directly to the client in realtime. For example, when a user changes the position of a button in a collaborative design app, the server has to notify the other users who are simultaneously working on the same project. Web browsers support these use cases via WebSockets and long-lived HTTP connections, but having the database itself notify you of updates is even better.

Seeing is believing

I am going to run my project first to make you confident with it. The project is a website which does the following. Code for this project is available in my GitHub repo.

  • I am going to start a Justice League website (like the one Superman runs).
  • The website collects the nickname and email of a superhero.
  • It notifies all existing heroes about new joinees in real time.

So I am going to tell you a small story. Just click the first image and navigate through to the last, one by one. Don’t forget to read the description below each image! Press Esc to exit the slide show.

I think you got something from the above story. If you didn’t, let me explain. Here we are asking information from clients and navigating them to their dashboard. From then on, all clients who are on the dashboard are notified about newly joined people instantly. No refresh, no AJAX polling, thanks to our push engine.

Are you kidding? I can implement that using WebSockets!

Yes, you are right. You can implement the above notification system purely with WebSockets. But why did I use a few more things to do it? Here is the answer.

“Designing the push logic with WebSocket code alone is cumbersome. The WebSocket code must do a push from the server and receive it in the client. Traditional databases do not know about WebSockets or Server Sent Events, so we would need to poll the database for changes, push them to an intermediate queue, and from there to the clients. I say remove that headache from our server. Just exploit the database’s capability of pushing changes in realtime whenever its data changes. That is why I chose RethinkDB plus WebSockets.“

How I built that push engine

I used two main ingredients to create data push engine shown above.

  1. Python Tornado web server ( for handling websocket requests and responses)
  2. RethinkDB ( for storing data and also to push real time changes to the server)

What is RethinkDB?

According to RethinkDB official website

RethinkDB is the first open-source, scalable JSON database built from the ground up for the realtime web. It inverts the traditional database architecture by exposing an exciting new access model – instead of polling for changes, the developer can tell RethinkDB to continuously push updated query results to applications in realtime. RethinkDB’s realtime push architecture dramatically reduces the time and effort necessary to build scalable realtime apps.

When is RethinkDB a good choice?

RethinkDB is a great choice when your applications could benefit from realtime feeds to your data.

The query-response database access model works well on the web because it maps directly to HTTP’s request-response. However, modern applications require sending data directly to the client in realtime. Use cases where companies benefited from RethinkDB’s realtime push architecture include:

  • Collaborative web and mobile apps
  • Streaming analytics apps
  • Multiplayer games
  • Realtime marketplaces
  • Connected devices

We know that most modern web demands fall into one of the above categories. So RethinkDB is extremely useful for people who want to exploit its real power for building real time apps.

RethinkDB has a dedicated Python driver. In our project we are just inserting a document and reading the changes on the users table. For getting familiar with the RethinkDB Python client, visit these links.

Setup for our data push engine

Install RethinkDB on Ubuntu 14.04 in this way.

$ source /etc/lsb-release && echo "deb $DISTRIB_CODENAME main" | sudo tee /etc/apt/sources.list.d/rethinkdb.list
$ wget -qO- | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install rethinkdb
$ sudo cp /etc/rethinkdb/default.conf.sample /etc/rethinkdb/instances.d/instance1.conf
$ sudo /etc/init.d/rethinkdb restart

Create virtualenv for the project and install required libraries

$ virtualenv rethink
$ source rethink/bin/activate
$ pip install tornado rethinkdb jinja2

Now everything is fine. My main application will be, and there are templates and static files in my project. The project structure looks like this.

|-- requirements.txt
|-- static
|   `-- js
|       `-- sockhand.js
`-- templates
    |-- detail.html
    `-- home.html
Now let us write our file.

#For tornado server stuff

import os
import tornado.ioloop
import tornado.web
import tornado.gen
import tornado.websocket
import tornado.httpserver
from tornado.concurrent import Future

from jinja2 import Environment, FileSystemLoader #For templating stuff

import rethinkdb as r #For db stuff

from rethinkdb.errors import RqlRuntimeError, RqlDriverError

from conf import * #Fetching db and table details here

#Load the template environment

template_env = Environment(loader=FileSystemLoader("templates"))

db_connection = r.connect(RDB_HOST,RDB_PORT) #Connecting to RethinkDB server

#Our superheroes who connect to the server
subscribers = set() 

#This is just for cross-checking that the database and table exist
def dbSetup():
    print PROJECT_DB, db_connection
        r.db_create(PROJECT_DB).run(db_connection)
        r.db(PROJECT_DB).table_create(PROJECT_TABLE).run(db_connection)
        print 'Database setup completed.'
    except RqlRuntimeError:
            r.db(PROJECT_DB).table_create(PROJECT_TABLE).run(db_connection)
            print 'Table creation completed'
        except RqlRuntimeError:
            print 'Table already exists.Nothing to do'
        print 'App database already exists.Nothing to do'

#There is a loop type in the python rethinkDB client. Set it to tornado
r.set_loop_type("tornado")

class MainHandler(tornado.web.RequestHandler): #Class that renders details page and Dashboard
    def get(self):
        detail_template = template_env.get_template("detail.html") #Loads template

    def post(self):
        home_template = template_env.get_template("home.html")
        email = self.get_argument("email")
        name = self.get_argument("nickname")
        connection = r.connect(RDB_HOST, RDB_PORT, PROJECT_DB)
        #Thread the connection
        threaded_conn = yield connection
        result = yield r.table(PROJECT_TABLE).insert({"name": name, "email": email}, conflict="error").run(threaded_conn)
        print 'log: %s inserted successfully' % result

#Sends the new user joined alerts to all subscribers who subscribed
def send_user_alert():
    while True:
        temp_conn = yield r.connect(RDB_HOST, RDB_PORT, PROJECT_DB)
        feed = yield r.table("users").changes().run(temp_conn)
        while (yield feed.fetch_next()):
            new_user_alert = yield
            for subscriber in subscribers:
                subscriber.write_message(new_user_alert)

class WSocketHandler(tornado.websocket.WebSocketHandler): #Tornado Websocket Handler
    def check_origin(self, origin):
        return True

    def open(self):
        subscribers.add(self) #Join client to our league

    def on_close(self):
        if self in subscribers:
            subscribers.remove(self) #Remove client

if __name__ == "__main__":
    dbSetup() #Check DB and Tables were pre created
    #Define tornado application
    current_dir = os.path.dirname(os.path.abspath(__file__))
    static_folder = os.path.join(current_dir, 'static')
    tornado_app = tornado.web.Application([('/', MainHandler), #For Landing Page (r'/ws', WSocketHandler), #For Sockets
(r'/static/(.*)', tornado.web.StaticFileHandler, { 'path': static_folder }) #Define static folder 

    #Start the server
    server = tornado.httpserver.HTTPServer(tornado_app)
    server.listen(8000) #Bind port 8888 to server

I am going to define database configuration parameters like db_name and table_name in a separate file,

import os

RDB_HOST = os.environ.get('RDB_HOST') or 'localhost'
RDB_PORT = os.environ.get('RDB_PORT') or 28015
PROJECT_DB = 'userfeed'
PROJECT_TABLE = 'users'

That’s it. We have our and ready. I will explain what I did above, point-wise, below.

  • importing tornado tools and the rethinkDB client driver
  • writing a function called dbSetup that checks whether the required database and table were created or not
  • using the MainHandler class to handle HTTP requests. For a GET request it displays the enter-details page, and for a POST it shows the dashboard.
  • WSocketHandler is the tornado websocket handler that adds or removes subscribers.
  • We have one method called send_user_alert. It is the actual pusher of changes to the clients. It does only two things: subscribing to the database table’s change feed, and sending those changes to the clients.

In RethinkDB we have a concept called change feeds. It is similar to Redis PUBSUB. We can subscribe to a particular change feed and RethinkDB returns us a cursor of infinite length. Whenever the db receives a change on that table, it pushes an event to the subscribed cursor with the new and old values of the data. For example:

#cursor is returned when we subscribe to changes on the users table
cursor = r.table("users").changes().run(connection)

#just loop through it infinitely to grab changes that RethinkDB pushes to the cursor
for document in cursor:
    print document
I think you get it by now. The other files in our project are templates and static files:

  • detail.html
  • home.html
  • sockhand.js

The code for the templates is quite obvious. You can find the templates here.

But we need to look into the js file.

//function that listens to the socket and does something when a notification comes
function listen() {
    var source = new WebSocket('ws://' + + '/ws');
    var parent = document.getElementById("mycol");
    source.onmessage = function(msg) {
        var message = JSON.parse(;
        //Build a message box for the newly joined superhero
        var child = document.createElement("DIV");
        child.className = 'ui red message';
        var text = message['new_val']['name'].toUpperCase() + ' joined the league on ' + Date();
        var content = document.createTextNode(text);
        //Attach the text to the box and show it at the top of the column
        parent.insertBefore(child, parent.firstChild);
        return false;

    console.log('I am ready');
}
Here we define a listen function that runs when the web page is loaded. It initializes a variable called source, of type WebSocket, and links it to the /ws URL that we defined in the Tornado application. It also sets a callback for when a message is received; that callback updates the DOM structure and adds information about the new user.

If you are still confused, then run the application yourself and see. The app we wrote above is a data push engine that routes directly from the database to the client. Go to the project link, clone it, install requirements.txt, then run and visit localhost:8000. If you still have any queries on how it works, feel free to comment below or reach me by mail.

I thought of introducing RethinkDB for absolute beginners, but the article would become very lengthy. I will surely come up with an article dedicated to RethinkDB in the near future.

In this way we can build a real time data push engine using Python and RethinkDB.

Points to ponder

  • Use RethinkDB for building real time applications. It is scalable too.
  • Use Tornado because it can easily handle concurrent connections without any fuss.
  • Remove queuing from your architectural design.
  • Use WebSockets for bidirectional communication.
  • Try out new things frequently.



Build a Collatz conjecture solver with Python and ZeroMQ

Connecting computers is so difficult that software and services to do this are a multi-billion dollar business. So today we are still connecting applications using raw UDP and TCP, proprietary protocols, HTTP, and Websockets. It remains painful, slow, hard to scale, and essentially centralized.

To fix the world, we needed to do two things. One, solve the general problem of “how to connect any code to any code, anywhere”. Two, wrap that up in the simplest possible building blocks that people could understand and use easily. It sounds ridiculously simple. And maybe it is. That’s kind of the whole point. ZeroMQ comes to rescue us from this problem. With an average hardware configuration, we can handle 2.5 to 8 million messages per second using ZeroMQ.

What is ZMQ?

ZeroMQ is a library used to implement messaging and communication systems between applications and processes, fast and asynchronously. It is fast like a bullet train. You can use it for multiple purposes, like the things listed below.

* Networking and concurrency library

* Asynchronous messaging

* Brokerless communication

* Multiple transport

* Cross-platform and open-source

Why a message queue is required in distributed applications, and how ZeroMQ can be the best communication practice between applications, will be explained in a few minutes.


I guess you captured the logic from the picture above: instead of hitting the server directly for each request, we can push it into a message queue, have workers process it, and then route the result to the appropriate location.
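Here is a rough sketch of that queue idea using ZeroMQ’s PUSH/PULL sockets. The port number, job payloads and the use of a thread for the worker are arbitrary choices for illustration; in a real system the worker would be a separate process.

import threading
import zmq

context = zmq.Context()

def worker():
    #Worker pulls jobs from the queue and processes them independently
    receiver = context.socket(zmq.PULL)
    receiver.connect("tcp://localhost:5557")
    for _ in range(3):
        job = receiver.recv_string()
        print("worker processed job %s" % job)

threading.Thread(target=worker).start()

#Producer pushes jobs into the queue instead of hitting a server directly
sender = context.socket(zmq.PUSH)
sender.bind("tcp://*:5557")
for i in range(3):
    sender.send_string(str(i))

Ok, now let us context-switch to the Collatz conjecture.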

What is Collatz conjecture?

This is the 3n+1 mathematical problem (or Collatz conjecture). The Collatz conjecture states that for any number n, the following function f(n) will always boil down to 1 as the result, if you keep feeding the previous result back into the function over and over again.
  f(n) = {
           3n+1, if n is odd
           n/2,  if n is even
           1,    if n is 1
         }
Eg: if n = 20, then:
           f(20) = 20/2 = 10
           f(10) = 10/2 = 5
           f(5)  = 3*5+1 = 16
           f(16) = 16/2 = 8
           f(8)  = 8/2 = 4
           f(4)  = 4/2 = 2
           f(2)  = 2/2 = 1
The term cycle count refers to the length of the sequence of numbers generated. In the above case the sequence is 20, 10, 5, 16, 8, 4, 2, 1, so the cycle count for f(20) is 8.
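A few lines of Python reproduce this count, as a quick sanity check:

def cycle_count(n):
    #count the numbers in the collatz sequence starting at n
    count = 1
    while n != 1:
        n = 3 * n + 1 if n % 2 == 1 else n // 2
        count += 1
    return count

print(cycle_count(20))   #prints 8: the sequence 20, 10, 5, 16, 8, 4, 2, 1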
We are going to build a Collatz conjecture cycle-finding server using Python and ZeroMQ. The complete code of this project is available at this location.

Beauty of collatz conjecture

All numbers lead to one. That is the philosophy of the Collatz conjecture. Visit this site to visually see the construction of Collatz numbers for an orbital length of 18.

Let us build a Collatz Conjecture Cycle Server

Now come to the coding part. Our aim is to construct a server that takes a number from the client, calculates the longest Collatz cycle from 1 to that number, and returns it back. For example, if we give an input of 1000, our server should calculate the Collatz cycles for 1, 2, 3, ……, 1000 separately and return the longest cycle of all. We will use:


  • Python
  • ZeroMQ
  • Gevent

We can build the same server with Python and Gevent alone, but that setup struggles beyond 10K connections. For scaling to millions, we should use the power of ZeroMQ.


  • Install ZeroMQ 4.1.2. The steps below show the installation procedure.
$ sudo apt-get install uuid uuid-dev uuid-runtime
$ sudo apt-get install libzmq-dbg libzmq-dev libzmq1
$ sudo apt-get install build-essential gcc
$ cd /tmp && wget
$ tar -xvf zeromq-4.1.2.tar.gz && cd zeromq-4.1.2
$ ./configure && make
$ sudo make install
$ sudo ldconfig
  • Install pyzmq
$ sudo apt-get install python-dev
$ sudo pip install pyzmq
  • Install gevent
$ sudo pip install gevent

Please be aware that the ZeroMQ installation will fail if all dependencies are not installed, and the pyzmq installation will fail if the python-dev dependency is not fulfilled. I hope you are now ready with the required setup on an Ubuntu 14.04 machine.

This ZeroMQ server serves requests through a TCP port to which a zmq socket is bound; the server collects data from that bound socket. Let us first design the function that returns the maximum Collatz cycle for a given input range. The algorithm looks like this:

Collatz Conjecture Algorithm

import gevent
from gevent import monkey


#Algorithm for finding the longest collatz conjecture cycle.
#Returns the max cycle length among cycles calculated for 1 to n.
def do_collatz(n):
    def collatz(m):
        #count the numbers in the sequence until we reach 1
        cycle = 1
        while m != 1:
            if m % 2 == 1:
                m = (3 * m) + 1
            else:
                m = m // 2
            cycle += 1
        return cycle
    #Gevent greenlets compute the cycles cooperatively
    jobs = [gevent.spawn(collatz, x) for x in range(1, n + 1)]
    gevent.joinall(jobs)
    return max([g.value for g in jobs])
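A quick check of the function before wiring it into the server (the expected value is my own hand calculation, so treat it as illustrative):

print(do_collatz(20))   #prints 21: the longest cycle for 1..20 starts at 18 (and at 19)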

Now let us use this function in the ZeroMQ server that we are going to write below. It just receives a number from the client, calls the do_collatz function, and returns the longest cycle back to the client.

ZeroMQ Collatz Server

import time
import zmq
import gevent
from gevent import monkey


#Create context
context = zmq.Context()

#Set the socket type to REP (reply)
socket = context.socket(zmq.REP)

#Bind socket to port 5555
socket.bind("tcp://*:5555")

#Algorithm for finding the longest collatz conjecture cycle
def do_collatz(n):
    def collatz(m):
        #count the numbers in the sequence until we reach 1
        cycle = 1
        while m != 1:
            if m % 2 == 1:
                m = (3 * m) + 1
            else:
                m = m // 2
            cycle += 1
        return cycle
    #Gevent greenlets compute the cycles cooperatively
    jobs = [gevent.spawn(collatz, x) for x in range(1, n + 1)]
    gevent.joinall(jobs)
    return max([g.value for g in jobs])

#Create a loop and listen for clients to send requests
while True:
    # Wait for the next request from a client
    number = int(socket.recv())
    print("Received request for finding max collatz cycle between 1.....%s" % number)
    # Send the maximum collatz cycle back to the client
    num = str(do_collatz(number))
    socket.send(num)


Now we have a server ready to serve any number of clients. Let us build a ZeroMQ client that sends a request to the above server and receives the maximum Collatz cycle for a given number.

ZeroMQ Collatz Client

import zmq

context = zmq.Context()

# Socket to talk to the server
print 'Connecting to the collatz server'

socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:5555")

number = raw_input("please give a number to calculate the collatz conjecture max cycle: ")
print 'Sending request %s ' % number
#Send the number to the server
socket.send(number)
#Wait for the reply and print the result
message = socket.recv()
print 'Collatz Conjecture max cycle of %s is <[ %s ]>' % (number, message)

That’s it. Our client is ready too. Now open a terminal with three tabs: one for the server and the other two for two clients to send requests. The output looks like this on my Ubuntu 14.04 machine.

Next, try giving an input of 1000 from one client and 7000 from the other. The server instantly returns the maximum Collatz cycle back to each client. It looks like this.


So it is clearly visible that our ZeroMQ server works perfectly, serving the clients and solving the Collatz conjecture problem. This is called the Request-Reply pattern of ZeroMQ. Here communication is achieved over TCP rather than HTTP. There are three more patterns that can be implemented using ZeroMQ (a small sketch of the first one follows the list). They are:

  • Publish/Subscribe Pattern: Used for distributing data from a single process (e.g. publisher) to multiple recipients (e.g. subscribers).
  • Pipeline Pattern: Used for distributing data to connected nodes.
  • Exclusive Pair Pattern: Used for connecting two peers together, forming a pair.
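As a taste of the first pattern, here is a minimal sketch of Publish/Subscribe with pyzmq. The port and topic are arbitrary choices of mine, and the two halves would normally run as separate processes (hence the commented subscriber half).

import zmq

context = zmq.Context()

#Publisher process: broadcast messages tagged with a topic
pub = context.socket(zmq.PUB)
pub.bind("tcp://*:5556")
pub.send_string("scores match finished")

#Subscriber process (run separately): filter on the 'scores' topic
#sub = context.socket(zmq.SUB)
#sub.connect("tcp://localhost:5556")
#sub.setsockopt_string(zmq.SUBSCRIBE, "scores")
#print(sub.recv_string())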

So ZeroMQ has a lot of scope. It is a good scalability solution for current distributed application architectures. All code for the above collatz-cycle project is available at the below GitHub link.



Build an API under 30 lines of code with Python and Flask



Hello everyone. Nowadays developers need to perform many jobs: web development, database development, API development and so on. Some companies even have a role called API developer on their openings sheet. What role APIs play now, and why one should learn to build them, is our topic today. Developing an API with Python is a very easy task compared to other languages. So sit back and grab this skill for yourself. Take my word, this skill is hot right now in the market.

What is a REST API?

REST (REpresentational State Transfer) is an architectural style, and an approach to communications that is often used in the development of Web services. The use of REST is often preferred over the more heavyweight SOAP (Simple Object Access Protocol) style because REST does not leverage as much bandwidth, which makes it a better fit for use over the Internet. The SOAP approach requires writing or using a provided server program (to serve data) and a client program (to request data).

In three simple lines, a REST API is:

1) A way to expose your internal system to the outside world.

2) A programmatic way of interfacing with third-party systems.

3) Communication between different domains and technologies.

I think we are sounding technical; let us jump into practical things. By the end of this tutorial, you will be comfortable creating any API using Python and Flask.

Ingredients to build our API

We are going to use these things to build a running API.

*  Python

*  Flask web framework

*  Flask-RESTFul extension

*  SQLite3

* SQLAlchemy

Let us build Chicago employees salary API under 30 lines of code

I am going to build a salary-info API for Chicago city employees. Do you know, it is damn easy? An API can give you a computation result or data from a remote database in a nice format; that is what an API is intended for. An API is a bridge between private databases and applications. I am collecting employee salary details from the Chicago city data website.

The code of this entire project can be found at this link.

Let’s begin the show……..

First, I downloaded the dataset as a CSV and dumped it into my SQLite database:

$ sqlite3 salaries.db
sqlite> .mode csv
sqlite> .import employee_chicago.csv salaries

That imports the CSV into a table named salaries.

Now we are going to build a Flask app that serves this data as a REST API.

$ virtualenv rest-api
$ source rest-api/bin/activate
$ mkdir ~/rest-app
$ cd ~/rest-app

Now we are in the main folder of the app. Create a file there (I will call it app.py). We need a few libraries to finish our task; install them by typing the commands below.

$ pip install flask
$ pip install flask-restful
$ pip install sqlalchemy

That’s it. We are ready to build a cool salary API that can even be accessed from mobile. Let us recall the REST API design: it has four methods, GET, PUT, POST and DELETE.



Here we are dealing with open data which can be accessed by multiple applications, so we implement GET here; the remaining REST methods then become quite obvious. The general shape of a Flask-RESTful resource is sketched below, followed by our real code.
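This is only a skeleton to show the shape; the Employee resource name and method bodies are placeholders, not part of our app:

from flask_restful import Resource

class Employee(Resource):
    def get(self, id):     #read a record
        pass
    def post(self):        #create a record
        pass
    def put(self, id):     #update a record
        pass
    def delete(self, id):  #remove a record
        pass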


from flask import Flask, request
from flask_restful import Resource, Api
from sqlalchemy import create_engine
from json import dumps

#Create a engine for connecting to SQLite3.
#Assuming salaries.db is in your app root folder

e = create_engine('sqlite:///salaries.db')

app = Flask(__name__)
api = Api(app)

class Departments_Meta(Resource):
    def get(self):
        #Connect to database
        conn = e.connect()
        #Perform query and return JSON data
        query = conn.execute("select distinct DEPARTMENT from salaries")
        return {'departments': [i[0] for i in query.cursor.fetchall()]}

class Departmental_Salary(Resource):
    def get(self, department_name):
        conn = e.connect()
        #Note: string formatting in SQL is unsafe in general; prefer
        #parameterized queries for anything beyond a demo like this
        query = conn.execute("select * from salaries where Department='%s'" % department_name.upper())
        #Zip column names with each row and let the extension dump it to JSON
        result = {'data': [dict(zip(query.keys(), i)) for i in query.cursor]}
        return result
        #We could have PUT, DELETE and POST here, but for our API the GET implementation is sufficient
api.add_resource(Departmental_Salary, '/dept/<string:department_name>')
api.add_resource(Departments_Meta, '/departments')

if __name__ == '__main__':
    app.run()

Save it as app.py and run it as:

 $ python app.py

That’s it. Your salary API is up and running now on localhost, port 5000. There are two rules defined in the API: one to get details of all available departments, and a second to get the full details of employees working in a particular department.

So now go to http://localhost:5000/departments and you will find this.
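You can also hit the same endpoint from Python; a quick check, assuming the app is running locally on port 5000:

import json
import urllib2

#query the departments rule we defined above and print the JSON payload
resp = urllib2.urlopen('http://localhost:5000/departments')
print(json.loads(resp.read())['departments'])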


See how Flask serves the database data as JSON through the REST API we defined. Next, modify the URL to peek at all the employees who work in the Police department: http://localhost:5000/dept/police

Oh man, it seems like police officers are well paid in Chicago, but they can’t beat a Django or Python developer who earns $100,000 per annum. Just kidding.

My code walk-through is as follows:

*  Downloaded the latest salary dataset from the Chicago data site.

*  Dumped that CSV into my SQLite db.

*  Used SQLAlchemy to connect to the database and do select operations.

*  Created Flask-RESTful classes to map functions to API URLs.

*  Returned the queried data as JSON, which can be used universally.

See how simple it is to create a data API. We can also add support for PUT, POST and DELETE on the data, and we can add an authentication system for fetching data through the API. Python and Flask are very powerful tools for creating APIs rapidly. The GitHub link is given below. Give it a try and extend it with the things mentioned above.

See you soon with more stories.

Docker , the future of Virtualization for your Django web development




Always use a virtual environment for your software development

Hello friends. I am a DevOps engineer who deals with development, production, monitoring, configuration management etc. of software. But for every developer there are a few housekeeping things which are quite irritating. You have a single Ubuntu system, and for a project’s sake you will install many database servers and other applications on it. The dependencies of your applications will cause trouble when you install both your favorite games and working projects on a single system.

Why Docker?

What if something goes wrong in your working project and everything in your system gets messed up? So I always suggest you separate your work from personal things. In this tutorial, I am going to show you how a developer can run a project and its applications in a light-weight virtual environment called a container. This container is created by Docker. We can access a docker container from the host system for code editing and to view execution. So always use a virtual environment for your software development.

What is Docker?

  • Open platform for developers and sysadmins to build, ship, and run distributed applications.
  • Light weight containers
  • It’s fast, not like a VM
  • Minimal resource usage
  • Run thousands of containers
  • Easy to run your whole production stack locally

Docker can create 100 light-weight environments on a single laptop. Unlike a virtual machine, your docker container will launch in about a second. It gives an isolated environment for a developer to work in. In this post, I am going to create a docker container, set up a Python Django project in it, and push it to my cloud repository.

Docker achieves its robust application (and therefore, process and resource) containment via Linux Containers (e.g. namespaces and other kernel features). Its further capabilities come from the project’s own parts and components, which abstract away the complexity of working with lower-level Linux tools/APIs used for system and application management with regard to securely containing processes.

Main Docker Parts

  1. docker daemon: used to manage docker (LXC) containers on the host it runs
  2. docker CLI: used to command and communicate with the docker daemon
  3. docker image index: a repository (public or private) for docker images

Main Docker Elements

  1. docker containers: directories containing everything-your-application
  2. docker images: snapshots of containers or base OS (e.g. Ubuntu) images
  3. Dockerfiles: scripts automating the building process of images

Installation Instructions for Ubuntu 14.04


$ sudo apt-get upgrade
$ sudo apt-get install docker.io
$ sudo service docker.io start
$ sudo ln -sf /usr/bin/docker.io /usr/local/bin/docker
$ sudo sed -i '$acomplete -F _docker docker' /etc/bash_completion.d/docker.io


Using the Docker

Now we use Ubuntu as our base image for creating containers, then modify a freshly created container and commit it.

$ sudo docker pull ubuntu

This pulls the base ubuntu image to your system. After a successful pull, we can launch a new container using the following command.

$ sudo docker run -i -t -p 8000:8000 ubuntu

This creates a new container with container port 8000 forwarded to host port 8000. The flags mean the following:

-i     attaches stdin and stdout

-t     allocates a terminal or console

-p     forwards ports from the container to the host

Things a developer needs

1) Edit project code from the host in your favorite editor.

2) Run the project in that virtual container environment.

3) See the output through the forwarded port.

So in order to edit code you need to access the container data from your host. Instead, you just mount a host directory in your docker container; then your project code lives on the host but runs in the container. Cool, right?

So we will modify the command for running the new container like this:

$ sudo docker run -i -t -p 8000:8000 -v /home/hproject:/home/cproject ubuntu

Now a docker container will be created and started with the following properties:

* Port 8000 of the container is forwarded to localhost:8000.

* The data volume /home/hproject of the host is mounted as /home/cproject in the container. It means the files lying in /home/hproject on our host system are perfectly accessible from the container at /home/cproject.

These two things are required by a Django developer because they need to modify code and view the output through the browser. Now they don’t care where the code runs; here the code runs in isolation, in a light-weight docker container. Vagrant has the same port forwarding and volume mounting strategy, but Vagrant is a VM while Docker is a VE.

Play with the container

Now we have a container started. It drops us into a bash shell with the # prompt. So now install the following things in it:

* Python


* Virtualenv

* Git

* Set up Django, MySQL and your project

That’s it. You can install anything the project requires. But remember, after doing your stuff, use exit to come out of the container.

# exit

Now, if you do not commit the container, all changes you made will be lost. So commit the container first. Before that, find out the container ID by typing this command:

$ sudo docker ps -a 

and find the latest exited container. You can give the container a name while committing:

$ sudo docker commit b03e4fb46172 mycontainer 

This commits the latest container we played with till now and also gives it the name mycontainer. If we wish to launch the container again to work with our project, we use the following commands.

$ sudo docker start mycontainer
$ sudo docker attach mycontainer

That’s it. You will enter a virtual container where you can run your Django project. If you plan to remove a container, just do:

$ sudo docker rm mycontainer

If you wish to push your container to the cloud, you should use the Docker Hub registry. Don’t forget to commit after coming out of the container. You can carry your entire project with its environment anywhere: package your project (Django, Flask) + environment (MySQL, PostgreSQL, Redis) into a tar file and export it to any place. That is the magic of Docker. To do that, just export a container to a TAR file.

Exporting the containers

$ sudo docker export mycontainer > /home/mycontainer.tar

Alternatively, you can save images, carry them to another system over FTP, and then load them on the target node.

Loading and saving the images

$ sudo docker save mycontainer > /tmp/mycontainer.tar

and then load it in target host as

$ sudo docker load < /tmp/mycontainer.tar
