Sunday, February 20, 2011

Increasing usage of Functional Programming in driving scalable architecture

One of the most exciting technologies that I've been pursuing lately is nothing new, but as an old wine in a new bottle, provides an alternative method to solve emerging issues for addressing performance. I am referring to Functional Programming, that has been growing in importance during some period in past few years due to various ongoing development in today's huge data processing applications used in enterprise environments. I've been seeing the rise of this otherwise academic language, and is increasingly used/considered to be usable for the past few years due to various ongoing developments.

Features

Functional programming provides various features that separates it as a different paradigm for programming( http://en.wikipedia.org/wiki/Functional_programming ). I'll be explaining various features as per this post's context.
Higher Order Functions : FP revolves around functions! Think of this as a small version of class in OOP. If you've worked with, say anonymous or inner classes in the past. You've unknowingly been doing FP in OOP, and needless to say, it was difficult to learn and maintain. Generics solve the same problem as First Class Functions, but the resultant implementation leaves a lot to be desired.
Recursion : Again a handy feature that provides a lot of bang for small amount of buck, or code! If you know how to use your functions, you can do a lot with your code without having to write boilerplate or plumbing code in your algorithmic implementations.
Immutability : Now this is what I am talking about! We all are familiar with multi-threaded programming, yet the tools for using it are hardly ever used as multi threaded programming involves sharing of resources, which is not a good thing. In all other programming paradigms, we are familiar with the foll construct :
      int a = 0;   Initialization at this line in the memory
   a = 1;       Memory address pointed by a gets 'Updated'
The problem lies with the second statement as we don't know when and more importantly which thread is going to invoke it. But if you recall from your Mathematics, we used to have following notations there:
   let a=0      Initialize a
   a=5          Assign a something new, and ignore previous state
As the value of a is immutable, we can safely have as many threads to it as we need. This means, as our applications are thread safe, we can have concurrent / parallel executions of the same code by 2 or 200000 invocations without adverse effects.

Applicability in meeting the challenges in scaling and performance


Today's applications need to fully utilize the underlying hardware. Moore's law doesn't holds when we consider raw cycle performance, but is still applicable if we consider the overall increase in processing capability through hyper threaded and multi core processor hardware. The information generation and its processing needs has increased too, leading to use of parallel computations. This is where the FP comes in handy. Its true that infrastructure can be abstracted from the application (as with cloud platforms) but if we do that ourselves, we can have greater flexibility and scalability for our solutions.
One feature of many 'newer' breed of FP languages like F# or scala that are hybrid in nature that I haven't discussed before is the ease of use of these languages is the ability to construct Domain Specific Languages; DSLs. From the business' perspectives, DSLs are fast becoming key to map the technology-business divide and keep application agility and re-usability simultaneously.

My exposure these days is with scala, which is a hybrid FP language that runs on JVM, and is interoperable with java. It is attracting the same kind of curiosity for java developers right now which ruby (largely because of Rails) did a few years back. However, this language aims at scalability, which is also its full name (scala stands for scalable architecture) within the JVM itself, making it an ideal candidate for solving middleware performance and scalability related issues. The fact that it replaced Rails in handling message processing in Twitter speaks for itself. I am also working on a research paper that explains the use of scala (and FP in general) to process massive amounts of data in distributed/cloud environment. Will post some more information here as it gets finalized.