Wednesday, October 31, 2012

An insight into sonar plugin development

Over the past few months, I've been developing a language plugin for Sonar that displays different metrics for a specified language. I am writing this post because there is not much content available on the topic, even though Sonar is a widely popular tool.

Below are tips to avoid some of the common pitfalls in developing Sonar plugins:
  • Read the official documentation carefully, even though there isn't much of it.
  • Download the sources of the open source plugins and read through their code to understand how plugins are put together.
  • Read the books available on the topic in addition to the blogs.
  • Use Maven, which is a real lifesaver, and focus on TDD while coding, as you cannot hope to debug the process otherwise.

If you've followed these steps diligently, you will have a basic understanding of the innards of Sonar plugin development.
In brief, a Sonar plugin comprises a Java program that does the heavy lifting of loading and populating the code metrics, and a Ruby-based UI (embedded Ruby pages, or erb) that displays this data and uses view helpers to create eye-catching widgets. It is also possible to build a complete Rails application in place of the erb pages, but that's a different idea altogether.

In my case, I needed to populate the data from a database, so the class implementing the Sensor interface had the following method:

  public void analyse(Project project, SensorContext sensorContext)
Inside this method, I can use the Project object to read inputs to my process, such as project properties, and the SensorContext object to write the output in the form of measures:

sensorContext.saveMeasure(new Measure(MyMetricAnalysisClass.MetricName, Double.valueOf(metricVal)));

One of the nice features of working with erb templates in the UI for widgets is that you can use a lot of helper methods to create slick effects with your data. One such example is a piechart:
<%= piechart( 'Field1 = '+var1.to_s+'; Field2 = '+var2.to_s+';'  , { :size => "500x200"}) -%>

One caveat while transferring data from the server to the views is that only text data works well. Information about using data structures is sketchy at best, so we are left with passing the data delimited with ';' (see the first argument to the piechart helper method above). Surprisingly, other kinds of data cause widgets to fail with no descriptive error messages. For larger data sets, it is more intuitive to use a GWT page instead of a widget, as users can focus on the details provided in the page as opposed to the concise totals shown on the widgets.
I hope this post gives developers building plugins on this platform something to get started with.

Saturday, September 22, 2012

Going to classes, the online way

Recently, there have been a lot of announcements regarding free online classrooms for instructor-led e-learning. I was a bit skeptical at first, but I am now glad that I enrolled.

Some of the advantages that online courses offer over conventional courses:
  • All the comforts of an online/distance learning course, for free.
  • Vibrant discussion forums that help clear up doubts.
  • High quality material: the courses are taught just as they are in the real classes at top universities.
  • Hands-on learning: the learner gets to create programs and solve practical problems related to the topic.
  • Staying focused on a specific topic: as a professional developer, it is natural to get stuck with the mundane aspects of programming and forget the big picture, since the excitement of hacking on different topics without much time constraint, as in college, is no longer there.
I am currently enrolled in two courses at and one at edx.
My first course is a 10 week course about Big Data. I had an inkling that it would be about Hadoop and other high level aspects of it, but what surprised me is that three weeks into the course, the instructor is still tirelessly helping us understand the basics and theory: the importance of big data and its key challenges. That this is an IIT postgraduate level course lends further credibility to its undivided concentration on the background aspects of big data (which it intelligently breaks down into look, listen, learn, connect, predict, correct).
Another course that has just started is Programming in Scala by none other than its inventor himself, Martin Odersky. What is pleasant about the course is that right from day one, it uses test driven development and sbt for building and submitting programming exercises.
I also have high hopes for the upcoming SaaS course offered by MIT, which seems promising and gives me a chance to create a cloud based solution after a long time.

These courses do take up time, especially the after-work hours, but that is a better use of my time than just browsing around. A lot of effort and some headaches go into the process, but it is worth a try, whether you are a student, a developer or a curious bystander who is affected by these subjects in one way or another.

Wednesday, August 22, 2012

Problem with as/400 jdbc driver

While iterating over the ResultSet obtained from an AS/400 datasource today, I was constantly getting this error:
descriptor index not valid
Upon searching for this error, I found the solution, which is worth sharing with everyone.
JDBC numbers the columns of a ResultSet from 1 instead of 0, so a call such as resultSet.getString(1) fetches the first column, and not the second column as it might seem. Asking for column 0 therefore fails with the error above.
It would be nice if the error message explained this problem in a less cryptic manner. This is one of those nagging problems that repeat from time to time, as the API does not follow the usual zero-based Java convention in this case.

Tuesday, August 21, 2012

Book Review: Hadoop: The Definitive Guide, Second Edition by Tom White

Anyone interested in big data management today has at least a passing familiarity with Hadoop, an open source implementation of the map-reduce algorithm. Here's my review of the second edition of one of the most comprehensive books on the topic.
As a longtime Hadoop enthusiast who had already read the first edition, I was interested in finding out what this second edition has in store for readers.

The book builds on its predecessor. Apart from the addition of Hive and Sqoop, a case study covering graph visualization in social networks has been added. The Hadoop version covered has been updated, although as a developer I'd recommend using the latest stable release of Hadoop, as it is an active project. As Tom White is himself a committer on the project, various project insights are added along the way, as in the original edition.

From a first-time Hadoop adopter's point of view too, this text is easy to follow, and it flattens the Hadoop learning curve to a great extent.
The book starts by building the context, presenting the history and ecosystem of Hadoop and giving the reader a high level overview. The underpinnings of Hadoop, namely the MapReduce algorithm and its implementation, are covered in the next few chapters. These cover the practical aspects of running any Hadoop application in detail, including HDFS file manipulation and map-reduce operations. An exhaustive list of MapReduce techniques that come up in everyday development against the Hadoop API is then covered, along with examples.
Another highlight of this book is its comprehensive coverage of running and deploying Hadoop in various configurations. Closely knit data management tools in the Hadoop ecosystem, or its sub-projects, such as Pig, Hive, HBase, ZooKeeper and Sqoop are also covered.
This is followed by various case studies that make for an interesting read. It was disheartening to see no major updates to the case studies compared to the previous edition.

For someone who already has the original edition of this book, the second edition does not have much new to offer, but for someone who has not read any previous edition, this is a comprehensive book.
Note: This book was provided to me for review under the O'Reilly Blogger Review Program.

Tuesday, July 31, 2012

VM management with Vagrant

Setting up a virtualized environment on your own machine is always a headache-inducing and risky affair. With VirtualBox, however, one can use VMs without changing the OS internals (as is needed with Xen). The chief disadvantage of using VirtualBox directly is that the virtual machine quickly grows to a huge size, and if you are cloning or distributing your virtual machine, it easily becomes a hassle involving both the VM and its virtual hard disk.

With Vagrant, one can quickly create, configure and delete virtual machines, similar to a professional cloud environment like Amazon.
According to the documentation, 'Vagrant gives you the tools to build unique development environments for each project once and then easily tear them down and rebuild them only when they’re needed so you can save time and frustration.' The documentation starts with a 5 minute tutorial that demonstrates how easy it is to set up, connect to and configure different VM instances.

The configuration is handled by a Vagrantfile, which is written in a Ruby DSL. For those who prefer dedicated configuration management tools, Vagrant can also be configured through provisioners like Puppet or Chef. Thus, developers and project managers can quickly configure and set up environments for different projects.

By enabling the developer to configure the VM directly, the development process can be streamlined, in line with the DevOps movement.

Thursday, June 28, 2012

Appharbor: Heroku's cousin for .NET ?

These days, I am exploring AppHarbor, the PaaS for the .NET runtime, which was reviewed in the ThoughtWorks Technology Radar recently.
My initial experiences with the service have been good, including the various add-ons, most of which are free for limited use.
Setting up the application was easy, and I used one of their examples to play around and learn at the same time.
Check my application out for more details.

Wednesday, May 30, 2012

Reducing the repetitive java code

Just came across this neat hack for reducing boilerplate Java code. Project Lombok is a micro framework that injects commonly occurring code based on the annotations you supply. In Eclipse, it hooks into the IDE and adds the code on the fly, so no generated source files show up in the project.
After downloading, run the jar and specify your Eclipse installation. You then need to restart Eclipse for the changes to take effect. One of the most used annotations is @Data, which bundles several other annotations.
If the name of this annotation conflicts with something else in your code, you can opt for the member-specific annotations instead.

import lombok.Data;

// @Data generates getters, setters, toString, equals and hashCode
@Data
public class Pojo {
    private String name;
    private int id;
    private float weight;
}
In this case, we only need to specify the member fields; all the getters and setters, as well as commonly overridden methods like toString and hashCode, are generated automatically.
There are also provisions for other IDEs and for command line use, which make this a nifty tool for non-Eclipse users too.

Sunday, May 13, 2012

Repairing your problematic display drivers

I am sharing a recent experience with my laptop. It is an MSI CR-500 that houses an Nvidia GeForce 8200M G graphics card. In previous Ubuntu versions (11.04 and earlier), the stock graphics driver resulted in a suboptimal display, and Compiz and other desktop polishing tools could not be installed.
Out of habit, I tried to install a custom Nvidia binary driver on my newly installed Ubuntu 12.04 last week, but the driver failed to install correctly. The newer Linux kernel did not allow my existing driver to install, so I had to download a newly released one. Undaunted, I tried installing the new driver, but that failed too and my display manager went kaput.
To get things back, I simply reinstalled Ubuntu over the existing installation and got my settings back. Here's what I did:
  • Booted off the 12.04 live CD/pen drive.
  • In the install option window, went to the manual drive partitioning option.
  • Selected the partition containing 12.04 and set its mount point to /
  • Did not select the format disk option, to retain my earlier settings and installations.
The installer gave me a warning about deletion of files.

The OS was restored in 20 minutes and I was good to go!
NOTE: Not much got deleted, as I had a fairly new OS without many components installed. Surprisingly, the Java 7 unzipped at /usr/local/lib was untouched, but the Scala installed at /home/sumit was removed.

Friday, May 11, 2012

Practical Malware Analysis: Book review

This is my review of the book Practical Malware Analysis by Michael Sikorski and Andrew Honig, done under the O'Reilly Blogger Review Program.

This book teaches you the techniques and strategies followed by professionals to analyze and identify malware. As Windows continues to be the most used OS in the world, it is not surprising that malware ranging from annoying worms to cyber weapons like Stuxnet continues to spread by different means over the Windows operating system.

As this is a security book, I was looking forward to a lot of exercises and security tools that would help me find whatever details I might need about a piece of malware. The book does the necessary job but often strays off topic, delving into the basics for longer than necessary, which breaks the continuity of the text.

Tools such as OllyDbg, IDA Pro, WinDbg, etc. are covered in sufficient detail, and several chapters are dedicated to their uses in analysis and reverse engineering, which will be beneficial to a security professional. From a casual user's point of view, the expansive details might be more of a theoretical annoyance, and the book is in places too deep into the details.

Among the nicer features of this book is a keen focus on practical application of the material: each chapter ends with a set of labs that the reader is expected to complete. For me, this worked very well, as I was able to skim across various chapters and perform the lab routines to reinforce my understanding.

One of the caveats of the extended introductions to various terms is that they feel stretched a bit too long. The book deals almost exclusively with the Windows OS, so it could have been named Practical Windows Malware Analysis, which would aptly reflect its target environment. As a reader, I found the book rewarding when the chapters were followed in order and the lab exercises done.

Friday, April 20, 2012

Python coding conventions

There is often confusion in my team regarding which practices to follow while coding in Python.
I have researched it (mainly from PEP 8) and am sharing my findings.
Code Style
Zen of Python (PEP 20)
Type import this in the Python shell to see the Zen of Python.

The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


Naming Conventions
As the naming conventions of Python are otherwise loosely defined, some rules were created to be followed by all standardized modules.
Existing code is not fully consistent here. PEP 8 recommends lowercase words separated by underscores for functions, methods and variables (like the Java convention, but with an underscore where Java would use a capital letter), and CapWords, as in C#, for class names.

Underscore usage:
This is not a syntactical requirement, but a convention.
 A single leading underscore is a weak "internal use" indicator. E.g. from M import * does not import objects in M whose names start with an _
 A single trailing underscore is used by convention to avoid conflicts with a Python keyword. E.g. class_

Abbreviations within a name should be fully capitalized, like URLBroadcastListener

Code lay-out
Use 4 spaces per indentation level; avoid tabs and do not mix tabs with spaces.
Limit lines to 79 characters.
Avoid compound statements (a single line holding more than one statement, separated by a semicolon ; )
Blank lines can be used for the following:
To separate logical sections (only if necessary) within a method.
To separate different methods within a class.
Top-level class and function definitions should be separated by 2 blank lines.

Imports
Each import should be on a separate line, but several names can be imported on the same line if they come from the same module.
Order of imports:
  Standard imports
  Related third party imports
  Local application/library imports
Use absolute imports rather than relative imports within a package.
When a module contains a class of the same name, import it explicitly as: from namespace.classname import classname to reduce ambiguity

Whitespace formatting is largely left to our own discretion, and should make the code readable.
Reiterating this often forgotten best practice: never mix tabs with spaces.
Avoid whitespace:
Just inside parentheses, or before the opening parenthesis of a function definition or call. e.g. def meth ( arg): #Both the spaces around (
Just before a comma, semicolon or colon. e.g. def meth(arg , arg2) :
Just before the bracket that starts an index. e.g. collect ['paper']
More than a single space around operators. e.g. pi  =  3.14159
No whitespace should be used around the = sign of keyword arguments or default parameter values.
Compound statements on the same line (separated with a ; or : ) are discouraged as they affect readability.
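
The layout, import and whitespace rules above can be sketched in a short module. The function and class names are invented for this example and carry no special meaning:

```python
"""Layout sketch following the rules above (all names are illustrative)."""

# standard library imports come first, one module per line
import os
import sys

# several names from the same module may share a line
from os.path import basename, join


def build_path(folder, name, extension=".txt"):
    # no spaces just inside the parentheses, before commas, or around
    # the '=' of a default value or keyword argument
    return join(folder, name + extension)


def describe(path):
    # a single space around operators, none before the index bracket
    parts = {"name": basename(path), "size": len(path)}
    return parts["name"]


def platform_info():
    # one statement per line instead of 'a = 1; b = 2'
    separator = os.sep
    platform = sys.platform
    return platform, separator


class Report:
    # methods inside a class are separated by a single blank line
    def title(self):
        return "report"

    def body(self):
        return describe(build_path("docs", "report", extension=".md"))
```

Top-level definitions are separated by two blank lines, and every line stays within the 79 character limit.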

Comments & Documentation strings
Here too, common programming rules apply: comments should only be used where the code is not easily decipherable, and should be updated along with code changes.
Comments should be complete sentences, with longer ones ending in a period (.)
Block comments go before the block of code they describe, with each line starting with a single # and a space.
Inline comments should be used sparingly, only when that particular line does something critical that needs the developer's attention.
Documentation strings (aka documentation comments in some languages) should be written for all public classes, methods, functions and modules.
These should preferably span multiple lines.

Docstring Conventions
These conventions are optional, as they pertain to the documentation portion of Python. Still, they can be quite handy if you will be releasing the API documentation of your application later, as API generators like Docutils require that you conform to a specified convention.
A docstring is a string literal occurring as the first statement of a module, function, class or method.
It is written using triple double quotes (""") and can span a single line or multiple lines.
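
As a minimal sketch, here is what a multi-line and a single-line docstring might look like. The two functions are hypothetical and exist only to carry the docstrings:

```python
def scale(values, factor=2):
    """Return a new list with every value multiplied by factor.

    The summary line above ends with a period, and a blank line
    separates it from this longer description.
    """
    return [value * factor for value in values]


def is_empty(values):
    """Return True if values contains no items."""
    return len(values) == 0


# the docstring is available at runtime through __doc__ (and help())
summary = scale.__doc__.splitlines()[0]
```

Because the docstring is stored on the object itself, documentation tools can read it without parsing comments.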

It is worth noting that the Zen of Python must take precedence over your preferences or mine. This is necessary to allow programmers from diverse backgrounds to communicate effectively in code (ours was a mix of Java, C# and Visual Basic programmers, and we were using a wrapper of a C++ library, leading to major headaches and refactoring for the sake of code maintenance).

Monday, April 2, 2012

"Programming Entity Framework DbContext" by Julia Lerman & Rowan Miller; O'Reilly Media

This is my review of the book, programming entity framework DbContext by Julia Lerman & Rowan Miller, done under the Oreilly Blogger Review Program.

Targeted at developers familiar with, and using entity framework in their applications, especially at places where object context interacts with the core of the framework, this book offers developers all about DbContext API, with the how-tos for dealing with creating customized queries to the database when the built-in mechanism of the framework does not suffice. Julia Lerman is the leading authority on learning Entity Framwork and Rown Miller is the project manager in Microsoft's team of ADO.NET Entity Framework.

The point of having this book is simple, after all that ease of creation of database entity models (and their corresponding configuration files) , one needs to interact them with application code where middle tier code performs interactions with the entities.

The book jumps into the differences between DbContext and ObjectContext at the start and explains the download instructions as well as code examples explaining integration with outside code as well as edmx model designer. It then explains the query part  and demonstrates LINQ queries to interact with the entities using DbSet in an eloquent manner while explaining entitiy retrival in detail (something that is rare to find by in a .NET text). I felt at ease with the details and quality of the coverage of the DbContext API and it almost felt like an in depth coverage into an established ORM tool like Hibernate or JPA while dealing with entities and relationship sets.It then proceeds to explain problems and solutions regarding tracking change of entities without the availability of a database, as the disconnected context state management is a feature which is hard to figure by itself in n-tiered applications.

The second half of the book deals with the newly released contents in the entity framework. It covers change tracker in detail and provides ample code along with the details of the new api. The validation api is also new in the framework and is introduced comprehensively, and some of the examples that I attempted worked correctly. The customization of validations is also covered in detail which is very handy to use in practical applications. The chapter on advanced features is mostly about unit testing with dbcontext, and should be named so instead of its generic name.
The book finishes off with the insight about the upcoming entity framework 5, but lacks any code example as the beta for the same is not out.

Overall, the book follows a style quite similar to a cookbook, but packs in with some well guided theory as well. The lack of appendix is missing, which could be there instead of the last chapter but the rest of the book follows the contents around the topic and is indeed a pleasurable learning experience. For developers looking forward to another .NET book with extensive visuals and wizards, this would come as a disappointment but for those looking to solve their middle tier infrastructure plumbing with proper code, this is the solid reference to  be kept near while developing.
You can purchase this book from here.

Friday, March 2, 2012

Configuration Best practices in Python

While researching different ways of saving the configuration data of an application,
I came across some interesting approaches.

ConfigParser (known as configparser in Python 3)
The ConfigParser class implements a basic configuration file parser language which provides a structure similar to .ini files on Windows.

Suppose I have the following configuration file, saved as config.cfg (the file name is just an example):
[Section]
java_home: /usr/java/jdk1.7.0
home = /home/sumit
something: this is a multiple line statement
    indented on each new line.

To read this, I can use the following:

import ConfigParser
# create a basic configuration parser
config = ConfigParser.ConfigParser()
# use it to open our config file (file name assumed for this example)
config.read('config.cfg')
# read the value stored under the key 'home' in the section 'Section';
# the third argument is the raw flag, 0 keeps value interpolation enabled
content = config.get('Section', 'home', 0)
print "received: " + str(content)
content = config.get('Section', 'java_home', 0)
print "received: " + str(content)
content = config.get('Section', 'something', 0)
print "received: " + str(content)

Writing with a config parser is equally easy: we simply insert the desired values into the config and write it out.

import ConfigParser
conf = ConfigParser.RawConfigParser()
conf.add_section('my section')
conf.set('my section', 'name', 'Ganesh')
conf.set('my section', 'bool', 'true')
conf.set('my section', 'percentage', '65.34%')

# finally, save our changes into a configuration file
with open('output.cfg', 'wb') as configuration_file:
    conf.write(configuration_file)

There are still other libraries that provide more functionality for saving such configuration data, some of which are built on top of ConfigParser. Numerous other libraries also exist that offer different or customized solutions to the same problem.
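
For readers on Python 3, the same write and read round trip might look roughly like the sketch below. It uses the renamed configparser module; the temporary directory is only there to keep the example self-contained:

```python
import configparser
import os
import tempfile

# RawConfigParser does no '%' interpolation, so '65.34%' is stored as-is
conf = configparser.RawConfigParser()
conf.add_section("my section")
conf.set("my section", "name", "Ganesh")
conf.set("my section", "bool", "true")
conf.set("my section", "percentage", "65.34%")

# write to a temporary file; note the text mode 'w' in Python 3
path = os.path.join(tempfile.mkdtemp(), "output.cfg")
with open(path, "w") as configuration_file:
    conf.write(configuration_file)

# read the file back with a fresh parser
loaded = configparser.RawConfigParser()
loaded.read(path)

name = loaded.get("my section", "name")
flag = loaded.getboolean("my section", "bool")
percentage = loaded.get("my section", "percentage")
```

Values are always stored as strings; helpers such as getboolean convert them on the way out.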

Binary parsing of data
This is the serialization of the data (or in simple words, the flattening of different forms of data into a binary format).
We can use pickle, or its C-based implementation, cPickle, in order to save the data faster.
This is helpful if we are trying to save or load a large amount of data, or if the process has to run repeatedly in a short amount of time.

We serialize the data through the following routine:
import pickle

data = 'this is some form of data to be persisted'
numbers = [1, 2, 3, 4, 5]

opfile = open('data.dbi', 'wb')

# pickle the textual data using protocol 0
pickle.dump(data, opfile, 0)

# pickle the list data using the highest available protocol
pickle.dump(numbers, opfile, -1)

# close the file so that everything is flushed to disk
opfile.close()


This creates a file, data.dbi, which holds the serialized data.
To recover the data from this file, we reverse the process and use pickle again to load the original data.

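import pickle

# unpickle the data back from the serialized file
pickle_file = open('data.dbi', 'rb')

# objects come back in the order in which they were dumped
data1 = pickle.load(pickle_file)   # the text
data2 = pickle.load(pickle_file)   # the list

pickle_file.close()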
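
The same round trip can be written a little more defensively with context managers, which close the file even when something goes wrong. This is a sketch in Python 3; the temporary directory is only there so the example does not leave files behind in the working directory:

```python
import os
import pickle
import tempfile

text = "this is some form of data to be persisted"
numbers = [1, 2, 3, 4, 5]

path = os.path.join(tempfile.mkdtemp(), "data.dbi")

# 'with' closes the file even if pickling fails
with open(path, "wb") as output_file:
    pickle.dump(text, output_file, 0)
    pickle.dump(numbers, output_file, pickle.HIGHEST_PROTOCOL)

# objects are loaded back in the order in which they were dumped
with open(path, "rb") as input_file:
    restored_text = pickle.load(input_file)
    restored_numbers = pickle.load(input_file)
```

Keep in mind that pickle files should only be loaded from trusted sources, as unpickling can execute arbitrary code.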



Thus, we have different ways of persisting the same data, and can pick one depending on whether the end user needs to view the configuration or the data must be processed efficiently.

Friday, February 24, 2012

Playing with Mvc3

I have known this framework since the launch of its second iteration, but did not try it, as it never appealed to me enough to switch to it from Rails. Today, I decided to give this revamped framework a go, since so much has been said about it in the .NET crowd and a customer at work preferred it over Silverlight (on which we had been displaying our applications till now).
Out of the box, this framework does not appeal per se, but given the amount of documentation and the ability to plug in only the desired elements of the architecture (an open ended approach), it seemed to me a mixed bag: it moves .NET developers away from design-oriented towards code-oriented programming (I have always considered Razor to be loosely based on Velocity), but still does not support third party frameworks (unlike various JEE/Spring applications, where we can mix and drop different components into different tiers).
Surprisingly, scaffolding was also missing from this Rails offshoot, but I enabled it through an extra tool.
Now that I am done with basic scaffolding and T4 templates, I am on my way to discover more about this framework, its highlights and its limitations over this weekend.

Tuesday, January 31, 2012

Strange error in wxpython List Control

While deleting elements from a wxPython ListCtrl object recently, the application showed the following strange error as a pop up:
Could not retrieve information about list control item #Number
The number was always that of the last element of the list control.
The list control wiki does not explain this issue either.
This question was discussed before, but was wrongly asked.
The error actually lies in finding the last element of the list control: items are indexed from 0, so using self.logwin.GetItemCount() as an index results in an error. The last valid element is actually at self.logwin.GetItemCount()-1
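
The off-by-one can be illustrated without a GUI. In the sketch below, FakeListCtrl is a made-up stand-in that only mimics the GetItemCount and DeleteItem calls of wx.ListCtrl; it is not part of wxPython, and the real control reports the problem through a pop up rather than an exception:

```python
class FakeListCtrl:
    """Hypothetical stand-in that mimics the two wx.ListCtrl calls used here."""

    def __init__(self, rows):
        self._rows = list(rows)

    def GetItemCount(self):
        return len(self._rows)

    def DeleteItem(self, index):
        # this stand-in rejects indexes outside 0 .. count - 1
        if not 0 <= index < len(self._rows):
            raise IndexError("no list control item #%d" % index)
        del self._rows[index]


def delete_all_rows(list_ctrl):
    """Delete every row, starting from the last valid index."""
    last_index = list_ctrl.GetItemCount() - 1
    # walking backwards keeps the remaining indexes valid while deleting
    for index in range(last_index, -1, -1):
        list_ctrl.DeleteItem(index)


logwin = FakeListCtrl(["first", "second", "third"])
delete_all_rows(logwin)
remaining = logwin.GetItemCount()
```

Starting the loop at GetItemCount() instead of GetItemCount() - 1 would touch an item that does not exist, which is the situation described above.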
Hopefully this solves the problem for others who stumble upon this post.

Monday, January 30, 2012

Introduction towards using Ruby enVironment Manager

As I have been using RVM quite a lot recently, I felt it was worth giving a quick introduction to it.

Like Bundler, the dependency management tool, it has more or less become the de facto method of setting up a development environment in the Ruby community.

Basically, it is a Bundler for the entire Ruby platform, or the Ruby application stack. With it, we can specify for a given project the version of Ruby that the application runs on (yes, even different implementations like JRuby and IronRuby) and its gems.

To install, it is recommended that you follow the instructions on the website itself, as bleeding edge projects tend to migrate and change their locations quite often.

After a single step installation and setup, you can perform different tasks, some of which are:

Installation of platforms

rvm install [version]

e.g.: rvm install ruby-1.9.3

This installs the given version of the Ruby platform by fetching its source tarball into the .rvm/archives folder in your home directory and building that Ruby from source.

Testing of all platforms managed by rvm

rvm all do ruby [filename.rb]

This runs the ruby command on all installed platforms.

Selection of an installed platform

rvm use [version]

e.g.: rvm use 1.8.7

This searches for an appropriate version of Ruby, and prints the steps for installing that version if it is not present in rvm.


In line with the compartmentalization introduced at the platform level, rvm provides the same at the gem level. We can create different 'schemes' (gemsets) containing specific versions of specified gems.

Gemsets are namespaces holding different combinations of gems. This saves developers a lot of headaches, as newer versions can be tried without compromising the stability of existing systems.

Creating gemsets

rvm gemset create [gemsetname]

This creates a gemset, which is then used in the following manner:

rvm [version]@[gemsetname]

This sets our current gemset, and now we can install any gems that we like into it using gem install gemname -v version.

The version@gemset name is unique and will have these configurations saved.

Using gemsets

After creation, you can load a gemset configuration through either of:

rvm use version@gemset

rvm gemset use gemset

There are a host of other options that we may use, but these are enough to get started (as in git, where the necessary commands are a breeze once we understand their intent). Since I am currently working only on hobby projects in Ruby, I have not used rvm's advanced features yet. I will update here as I come across other exciting facets of this technology.

Tuesday, January 10, 2012

Experiences as a software engineer

I will soon be completing my first year of working as a software engineer. A year ago, I was an eager college student, dabbling heavily in open source stuff to make use of the extra time that I had in college. I had the prerequisite knowledge to be a developer even before college, but in India, you are considered worthless if you do not have anything on paper, and are deemed knowledgeable if you have degrees and certifications.

To start off, I consider myself lucky to be in my present job. Last year, during my fifth semester, a trip to various local companies to 'test the waters' turned out to be an opportunity in disguise. I was interviewed and then selected without much fanfare. On my first day as an intern (my last semester involved industrial training), I jumped headfirst into connection pooling using Spring, which was required here, and there has been no looking back (in direct contrast to people working in multinationals, who start their training with much fanfare but later migrate to management, as growth as a developer is restricted).
Among other things, I had a Java and Ruby background and started as a Java developer. One of the perks of being in a product company is that you have the time and opportunity to try out, learn and explore new stuff, all while doing your job.
After working considerably in R&D over product development, I was put to the test over the previous year in scaling web scraping applications, porting our product to the .NET platform, and creating a keyboard/mouse capture application for assembling a test management mashup. Although I miss web applications a wee bit, the middleware and logic are what keep me busy. In parallel, I am reading a lot of books on becoming a better developer, such as The Clean Coder and The Passionate Programmer.
As for updating my blog with new stuff once every month, I learnt and made some interesting exercises during my free time, but the paucity of time, as well as my frequent forays into DZone and other knowledge portals, led to the cancellation of various blog posts. I am thinking of reverting to small posts that provide a high level overview of the technology practices that I consider important in the current scenario, instead of writing tutorial-like posts.
I have quite a few of these items in mind, and will post them as time allows.