Friday, March 2, 2012

Configuration Best Practices in Python

While researching different ways of saving an application's configuration data,
I came across a few interesting approaches.

ConfigParser (known as configparser in Python 3)
The ConfigParser class implements a basic configuration file parser for a format similar to the .ini files found on Windows.

Suppose I have the following configuration file. Every key must sit under a section header, here [Section]:
[Section]
java_home: /usr/java/jdk1.7.0
home = /home/sumit
something: this is a multiple line statement
    indented on each new line.

To read this, I can use the following:

import ConfigParser
# create a basic configuration parser
config = ConfigParser.ConfigParser()
# Use it to open our config file
# ('config.cfg' is a placeholder; the original file name is not given)
config.read('config.cfg')
# Read the value of each key under the section named 'Section'.
# The third argument is the raw flag: 0 leaves '%' interpolation enabled.
content = config.get('Section', 'home', 0)
print "received: " + str(content)
content = config.get('Section', 'java_home', 0)
print "received: " + str(content)
content = config.get('Section', 'something', 0)
print "received: " + str(content)
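
For readers on Python 3, the same lookup can be sketched with the renamed configparser module. This is only an illustration: the SAMPLE string is a hypothetical stand-in for the configuration file, loaded with read_string so that the snippet runs without any file on disk.

```python
import configparser

# Hypothetical stand-in for the configuration file shown above.
SAMPLE = """\
[Section]
java_home: /usr/java/jdk1.7.0
home = /home/sumit
something: this is a multiple line statement
    indented on each new line.
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# Both ':' and '=' are accepted as key/value delimiters.
home = config.get('Section', 'home')
java_home = config['Section']['java_home']  # mapping-style access also works
# Indented continuation lines are joined to the value with newlines.
something = config.get('Section', 'something')

print("received:", home)
print("received:", java_home)
print("received:", something)
```

To read from a real file instead, config.read() takes a file name in the same way as the Python 2 version.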

Writing a configuration is equally easy: we simply insert the desired values into the parser and then write it out to a file.

import ConfigParser
# RawConfigParser does no '%' interpolation, so '65.34%' is stored as-is
conf = ConfigParser.RawConfigParser()
conf.add_section('my section')
conf.set('my section', 'name', 'Ganesh')
conf.set('my section', 'bool', 'true')
conf.set('my section', 'percentage', '65.34%')

# finally, save our changes into a configuration file
with open('output.cfg', 'wb') as configuration_file:
    conf.write(configuration_file)
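
As a rough check that the written values survive a round trip, here is a minimal Python 3 sketch. It assumes a temporary directory for output.cfg (a choice made only for this illustration), and it opens the file in text mode, which Python 3 requires for writing a configuration.

```python
import configparser
import os
import tempfile

conf = configparser.RawConfigParser()
conf.add_section('my section')
conf.set('my section', 'name', 'Ganesh')
conf.set('my section', 'bool', 'true')
conf.set('my section', 'percentage', '65.34%')

# Assumption for this sketch: write into a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'output.cfg')
with open(path, 'w') as configuration_file:  # text mode in Python 3
    conf.write(configuration_file)

# Read the file back with a fresh parser to confirm what was stored.
loaded = configparser.RawConfigParser()
loaded.read(path)
name = loaded.get('my section', 'name')
flag = loaded.getboolean('my section', 'bool')  # 'true' is parsed as True
percentage = loaded.get('my section', 'percentage')

print(name, flag, percentage)
```

Values are always stored as strings, so typed getters such as getboolean are what turn them back into Python values.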

Other libraries provide more functionality for saving such configuration data, and some of them are built on top of ConfigParser. Numerous others exist for different or customized solutions to the same problem.

Binary parsing of data
This is serialization of the data (in simple words, flattening different forms of data into a byte stream).
We can use pickle, or its C-based implementation, cPickle, to save the data faster.
This is helpful when we need to save or load a large amount of data, or when the process has to be repeated many times in a short period.
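
A small sketch of both points follows. It falls back to plain pickle when cPickle is unavailable (as on Python 3, where the C accelerator is used automatically), and it compares the size of the text-based protocol 0 with the highest available protocol. The numbers list is made up for the illustration.

```python
try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle             # Python 3: C accelerator is picked up automatically

# Made-up sample data for the comparison.
numbers = list(range(1000))

text_form = pickle.dumps(numbers, 0)                          # ASCII-based protocol
binary_form = pickle.dumps(numbers, pickle.HIGHEST_PROTOCOL)  # compact binary protocol

# Print the two sizes in bytes; the binary form is the smaller one here.
print(len(text_form), len(binary_form))
```

Both forms load back to the same list; the binary protocols mainly save space and time.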

We store the data with the following routine:
import pickle

data = 'this is some form of data to be persisted'
# named "numbers" so that the built-in list type is not shadowed
numbers = [1, 2, 3, 4, 5]

# the with block closes the file once both objects are written
with open('data.dbi', 'wb') as opfile:
    # pickle the textual data using protocol 0
    pickle.dump(data, opfile, 0)
    # pickle the list data using the highest available protocol
    pickle.dump(numbers, opfile, -1)


This creates a file, data.dbi, which holds the serialized data (protocol 0 output is ASCII text, while the higher protocols are binary).
To recover the data from this file, we reverse the process and use pickle again to load the objects in the order they were written.

import pickle

# unpickle the data from the serialized file, in the order it was written
with open('data.dbi', 'rb') as pickle_file:
    data1 = pickle.load(pickle_file)  # the text string
    data2 = pickle.load(pickle_file)  # the list

print data1
print data2
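
Putting the two halves together, the whole round trip can be sketched in Python 3 as below. The temporary directory is an assumption made for this illustration, and the list is named numbers here to avoid shadowing the built-in list.

```python
import os
import pickle
import tempfile

data = 'this is some form of data to be persisted'
numbers = [1, 2, 3, 4, 5]

# Assumption for this sketch: keep data.dbi in a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data.dbi')

# Write both objects into the same file, one after the other.
with open(path, 'wb') as opfile:
    pickle.dump(data, opfile, 0)                          # protocol 0
    pickle.dump(numbers, opfile, pickle.HIGHEST_PROTOCOL)  # same effect as -1

# Each load call returns the next object, in the order they were dumped.
with open(path, 'rb') as pickle_file:
    data1 = pickle.load(pickle_file)
    data2 = pickle.load(pickle_file)

print(data1)
print(data2)
```

Using with blocks ensures the file is flushed and closed before it is read back.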



Thus, we have different ways of persisting the same data, and we can choose between them depending on whether the end user needs to view the configuration or whether efficient processing of the data matters more.