epistemologic

Amit Rathore’s blog about software development and project management

Archive for the 'meta' Category


Easy external DSLs for Java applications

Posted by Amit Rathore on April 27, 2007

Or JRuby for fun and profit

I’ve been developing software for some time now, and have recently found myself applying ideas from various esoteric areas of computer-science to every day tasks of building common-place applications (often these concepts are quite old, indeed some were thought of in 1958).

One powerful idea that I’ve been playing with recently is that of embedding a DSL (domain specific language) into your basic application framework - and writing most of the features of the application in that DSL.

This is simply an implementation of the concept of raising the level of abstraction. The point, of course, being that when writing code in a DSL implemented in such a fashion, one can express ideas in terms of high-level abstractions that represent actual concepts from the problem domain. In other words, it is like using a programming language that has primitives rooted in the domain.

A lot of people have been writing about this kind of software design - and most implement these ideas in a dynamic language of their choice. How does one go about doing the same in a language like Java? That is what this article is about. And I cheat in my answer. Consider the following design stack -

Creating DSLs in JRuby

I propose that you only implement basic and absolutely required pieces of functionality in Java - the things that rely on, say, external systems that expose a Java interface, or some EJB-type resource, or some other reason that requires the use of Java. The functionality you develop here is then exposed through an interface to the layers above. You can also add APIs for other support services you might need.

The layer above is a bunch of JRuby code that behaves as a facade to the Java API underneath. This leaves you with a Ruby API to that underlying Java (and other whatever, really!) stuff - and makes it possible to code against that functionality in pure Ruby. The JRuby interpreter runs as part of the deployable and simply executes all that Ruby code transparently. As far as your Ruby code is concerned, it doesn’t even care that some of the calls are internally implemented in Java. Sweet!

We can stop here. At this point, we are in a position to write the rest of our application in a nice dynamic language like Ruby. For some people, a nice fluent interface in Ruby suffices - and keeps all developers happy. This is depicted on the top-left part of the diagram.

However, we can go one step further, and implement a DSL in Ruby that raises the level of abstraction even more - as referred to earlier. So on top of the Ruby layer, you’d implement a set of classes that allow you to write code in simple domain-like terms, which then get translated into a form executable by the Ruby interpreter. This is shown in the top right part of the diagram. Ultimately, potentially any one (developers, QA or business analysts) could express their intent in something that looked very much like English.

So where to write what?

How much to put in your Java layer depends on the situation - some people (like me) prefer to write as little as possible in such a static language. Others like the static typing and the associated tool support, and prefer to put more code here. When merely shooting for a little bit of dynamism through the DSL engine in the layers above, most of the code could be written in Java, and a fluent API in the dynamic language could be enough. When shooting for rapid feature turn-around and a lot more flexibility, most of the code could be in the DSL or in the external dynamic language.

The answer to this question really depends on things like the requirements, team structure, skill-sets, and other such situational factors.

OK, so where’s the code?

My intention with this post was to stay at a high level - and talk of how one could structure an application to make it possible to embed a scripting engine into it, and to give an overview of the possibilities this creates. In subsequent posts, I will talk about how actual DSLs can be created, tested, and also how a team might be structured around it.

Posted in DSL, code, java, jruby, languages, meta, multi-paradigm, ruby | No Comments »

Turing complete languages and productivity

Posted by Amit Rathore on March 2, 2007

I was having a conversation with a colleague about varying levels of programmer productivity - specifically around language choice. As usual, one of the things that came up was the idea of Turing completeness. The important point to note is that the idea of Turing equivalence is completely separate from expressibility, efficiency, convenience, or well - productivity.

My take on it is this - for simple systems and programs - this idea of languages being equivalent approaches triviality. For non-trivial systems, however, the story changes quite a bit. If the system is sufficiently complex (and needs to be indistinguishable from magic), the best way to do it in a lower-level but “equivalent” language like C or Java is to build abstractions like crazy. Including building an interpreter for a higher level language.

I think that is the only way one can truly claim that languages are equal. Java is “equal” to Ruby because you can use JRuby to write all the code of consequence. C is infinitely superior to C++ when you embed Lua in it. Or Blub is better than Smalltalk because you implemented a Lisp in the former.

After all, what does it mean to write a system in a particular language when you can throw in a completely different layer of abstraction through a library like this? When these lines are blurring so much, and programming models and paradigms are becoming incestuous, is it not time to stop having silly debates about languages and start building systems with what works best?

P.S. - Of course, to be able to do the right thing, you need to have a toolbox with the right tools. Start collecting them, now!

Posted in languages, meta, multi-paradigm | No Comments »

More on Lisp syntax, and language extensions

Posted by Amit Rathore on February 16, 2007

Following my recent post on the topic, I thought of one more thing that the syntax of Lisp allows you to do. Being homoiconic, and the fact that code manipulation is so simple (it’s all lists), layering on “language extensions” becomes possible. For example, if Betty Programmer realizes that OO is a great way to design and write code but that Lisp by itself doesn’t provide an OO facility (there are no “class” constructs, no inheritance etc.) - she doesn’t need to despair.

She can write code to add an OOP system to the language. Yes, this means Lisp really blurs the distinction between the language designer and the programmer. In other words, while it’s fairly obvious that Lisp is very well suited to writing DSLs, it is also possible to fundamentally extend the language as well - like adding an OO system, or pattern-matching, or logic-programming (ala Prolog).

Now, obviously, I’m not proficient enough yet to do anything of this sort. But, as I said before, it is my intention to learn :)

Lisp. A language where being meta is something worth thinking about.

Posted in code, lambda, languages, lisp, meta | 3 Comments »

Lisp syntax, and when code is data

Posted by Amit Rathore on February 14, 2007

Like I said earlier, my friend Ravi introduced me to Lisp several years ago, but it has taken me many years to really want to learn it well enough. I’ll write about my reasons in another post. In any event, at the beginning of this year, I started to pick it up again, promising myself that I’d be serious. This time. So far so good.

I think I’ve started to grok one of the core ideas of Lisp. I had always read that the syntax of Lisp was one of its strengths. And I had always struggled with that idea, knowing it was important, yet was quite unable to really put my finger on it. I think I’m closer to it today.

If you had to create a programming language to write programs that wrote programs (as in, say DSLs) - what design choices would you make?

For one thing, you’d have to be able to generate and manipulate (walk parse-trees, compare and transform nodes etc.) code as though it were just another data-structure. Right? OK, so the code that was being generated would look like and behave like data.

You would then create an EVAL function that could run the generated code. Maybe your generated code would in turn produce generated code, so to keep things easy and simple, your language syntax would be the same as that of the generated language. In other words, you’d end up with a homoiconic programming language. Finally, you would bootstrap your language processor and arrive at your final metacircular evaluator.

To recap, this language would have syntax that looked and behaved like data and because of it could generate and manipulate that data, which itself could be code. What would this data structure look like? One obvious choice for this is a tree (because of parse-trees). If you think about it, XML is just like a tree. But it’s kludgey. What we want is something like XML but without all the cruft. For example -


<program>
	<function name=\"add_to_stock\">
		<param name=\"counter\" />
			<call_function name=\"increment\">
				<argument value=\"counter\"/>
			</call_function>
	</function>

	<function name=\"remove_from_stock\">
		<param name=\"item\"/>
			<call_function  name=\"decrement_from_stock_file\">
				<argument value=\"item\"/>
			</call_function>
	</funtion>
</program>

The syntax is truly disgusting, but useful - especially if you need to programmatically generate it. Let’s now try to make it easier for humans, too. I’m going to remove the ‘program’ tag, because all this stuff is code. I’m going to then change from XML tags to simple ‘(’ and ‘)’ without the names - and make an assumption - the first word that appears is always a function call. Except for define - which I’ll use to denote a definition for a function. I’ll also lose the XML attribute names, assuming that words that follow the function name are always parameters (unless it’s a code block itself - which would get evaluated first). So, we’re left with -




(define (add_to_stock counter)
    (increment counter))

 (define (remove_from_stock item)
    (decrement_from_stock_file item))



Where does this leave us?

It’s the same exact XML syntax, but it’s just a bit modified and has a few rules thrown in. Importantly, it’s still as easy to generate as XML. It’s just a list of lists of words. As in, a unit of code in this format would always start and end with parenthesis, and they would enclose either a bunch of zero or more symbols, or other lists.

In fact, a language that was good at list processing and had an eval function would probably do a really good job with this stuff!

Posted in code, lambda, languages, lisp, meta | 2 Comments »

Beyond objects - I

Posted by Amit Rathore on February 9, 2007

… It’s in words that the magic is–Abracadabra, Open Sesame, and the rest–but the magic words in one story aren’t magical in the next. The real magic is to understand which words work, and when, and for what; the trick is to learn the trick.
… And those words are made from the letters of our alphabet: a couple-dozen squiggles we can draw with the pen. This is the key! And the treasure, too, if we can only get our hands on it! It’s as if–as if the key to the treasure is the treasure!

John Barth, Chimera

It’s a copy of a quotation that appears in The Structure and Interpretation of Programs. It speaks to the fact that if you know the name of something, as in a way to refer to it, then you have control over it. Like functions or closures for instance. If you can refer to it by name, you can tell it to someone else. Or pass it to another function. You can remember the name, so you can speak it when you’re ready. When the context is right. Like when all the variables, objects, and functions have aligned.

The idea of metalinguistic abstraction is a fundamental one. It’s a bottom-up way of thinking about your problem-space, and figuring out primitives and operations on those primitives. Rather than trying to build a system that satisfies a particular set of constraints in the given problem-space, the idea is to build a way to express more complex concepts using those very primitives and operations. That way, changes to the system can be made by changing higher level constructs - and going further down the layered stack of abstractions as needed, depending on how fundamental the changes are. This way, changes (or any new set of constraints) can be handled in a more clean and elegant fashion, and the entire system benefits automatically by a change made at a lower level.

If this sounds like building what these days is called a domain-specific language (DSL), then sure, but it is not a new way of doing things. It is however, possible to do a lot of this stuff easier in “enterprise-type” scenarios, because of a larger acceptance of more dynamic languages. Like Ruby, for example. It isn’t particularly easy to write software this way - it involves a change in the way most of us were taught programming. Everything is not imperative any more, code doesn’t have to be “written” before it runs, lines between data and code begin to blur. On that last point, what is a closure? Is it data or is it a procedure? It’s sometimes one, sometimes the other, and occasionally both, and indeed, sometimes it is neither. Here’s an example -


def create_coord(x, y)
Proc.new { |selector|
if selector == :x
x
elsif selector == :y
y
end
}
end

c = create_coord 10, 20
puts “X is ” + c.call(:x).to_s
puts “Y is ” + c.call(:y).to_s

So, the question here is, what is c? Is it data? Or is it code, that does something? Here, c is an object that represents a cartesian coordinate, but without any explicit data elements like variables or attributes… so, what is it?

Note - This is Ruby code, but a different language, (like Lua, for example, that has syntactic sugar for things like keys on a hash table) could make it unnecessary for the call(:symbol) syntax and instead, just calling x and y would translate it to a call on the object with the given method name.

The point, however, is that there are more ways to think about abstraction than just OO, and to paraphrase Paul Graham, some of them can transcend objects. The bottom-up approach itself can be implemented using plain objects and OO thinking, but there are other ways which can make for some powerful expressablilty. The code above approaches functional programming, and I’m interested in scaling that model to build large systems from pure or near-pure functional constructs. I’m nowhere close to being proficient in writing software this way, but it is my intention to learn.

Posted in code, design, lambda, lisp, meta | 2 Comments »

On becoming a master

Posted by Amit Rathore on June 22, 2006

For a while now, I’ve been convinced that it is becoming increasingly important for regular software developers to really know Computer Science. Yes, the internals that you don’t really care about ’cause all you care about is “business value”. Yes, I’m talking about finite automata, algorithms, operating systems, compiler theory, artificial intelligence, functional programming… Without this knowledge and the deep understanding of where to apply it, it is very hard to progress to beyond journeyman. I’ve also realized that what many people claim is master-level of competence, is really not quite the truth. If someone who is supposed to be a master programmer doesn’t understand computer science internals, then he or she is really not all that masterful.

Those who think that these are merely details left to the systems guys are just plain wrong. While you may not write the next RSA, the enlightenment that all this knowledge gives you, when coupled with actually applying this theory to “application development” will take you to the next level.

Posted in code, meta | No Comments »

On DSLs

Posted by rathoreamit on January 20, 2006

I’ve been busy, working on two very interesting projects these past several weeks (on the side - my current project for work is unbelievably mind-numbing). Both these projects are in Ruby and use Ruby On Rails. There, got that out of the way.

I’ve always been in search of software designs that result in a clean DSL (Domain Specific Language)… lisp style, bottom up, bring the language up to the problem, rather than the other way around, etc. What I’ve managed so far is a set of mini-DSLs in my applications. One for each module, so to speak.

And perhaps, that is the way things work in general; rather than the unrealistic goal of building *one* DSL which is supposed to capture your entire domain - build several, smaller ones, which each capture aspects of the domain.

Posted in architecture, code, design, meta | No Comments »