Three Haiku on Clojure

These are three random topics about Clojure that are sort of connected, in that all three address Clojure’s incredible flexibility and genericity from different angles. They were born from a few recent small projects that I started.

My favorite facility for polymorphism in Clojure

Object-oriented programming, at least as presented in C++ and Java, lets you do three things much more easily than languages like C:

  1. Encapsulation. You can hide data from certain parts of the code.
  2. Inheritance. You can create hierarchies of related objects which share some behaviors and modify other behaviors to suit themselves.
  3. Polymorphism. You can implement a function which behaves differently depending on the type of its arguments.

I’m being deliberately vague. Even though these three principles were all over the multiple choice section of the exam on Java when I was in school, they actually transcend object-oriented programming. They’re things which are helpful in all languages. Languages differ in how they support these principles, though.

Encapsulation lets us limit the number of things we have to worry about by only allowing certain parts of the code to modify data. If we have a global variable and we discover in debugging that its value is wrong, the piece of code that made its value wrong could be anywhere. If we have a private member of a class, the piece of code that made its value wrong can only be in that class.

Clojure doesn’t really support encapsulation. As in JavaScript and C, data in Clojure is either local or global. You can do weird contortions with closures to make things private, like people do in JavaScript. I wouldn’t, but you can. Since Clojure data is almost all immutable, encapsulation isn’t nearly as necessary anyway.

Clojure doesn’t really support inheritance either. But how much of the inheritance we see in Java is about avoiding code duplication, and how much is about the type system? Clojure is dynamically typed, so it doesn’t need inheritance for a type system. And Clojure has macros, the ultimate way of reducing code duplication, so it doesn’t really need inheritance for that reason either. You can build your own inheritance using derive and the polymorphism tools, but the language doesn’t support it directly, not really.

Clojure supports all kinds of sophisticated mechanisms for polymorphism. Records, types, protocols, multimethods. Probably some more that I’m forgetting or that only exist in the upcoming 1.8.1 alpha pre-release snapshot. All of them are useful and interesting, but my favorite example of Clojure polymorphism is probably tree-seq.

Why do I love tree-seq so much? Because this is the only polymorphic function in any language I can think of that both does not care, in any way, in the slightest, not even a little, how you represent the data it works on, and is also so simple that you can make a call to it in one line with practically no boilerplate:

;; Each node is a [name children] vector; children is nil for a leaf.
(tree-seq #(not (nil? %))
          #(get % 1)
          [:root
           [[:a [[:b nil] [:c nil]]]
            [:d [[:e [[:f nil]]]
                 [:g nil]]]]])

It works on this horror of nested vectors, producing a lazy sequence of every node in the tree, in depth-first order. Even though this thing is about the crudest way I can imagine to represent a binary tree, tree-seq doesn’t care and still processes it.

It also works on slightly more sophisticated structures:

(defrecord Node [name children])
(def t
    (->Node :root
      [(->Node :a
               [(->Node :b nil) (->Node :c nil)])
       (->Node :d [(->Node :e
                           [(->Node :f nil)])
                   (->Node :g nil)])]))
(tree-seq #(not (nil? %)) :children t)

tree-seq is so cool, you can even do weird things like storing the tree’s structural information in one place and its data in a completely different place:

(def children {:root [:a :d], :a [:b :c], :d [:e :g], :e [:f]})
;; Start from the root key; the map alone says who has children.
(tree-seq #(not (nil? (get children % nil)))
          #(get children % nil)
          :root)
;; => (:root :a :b :c :d :e :f :g)

Sure, this example is weird and pointless. But it works. tree-seq does not give a hot toddy how your tree is represented, as long as there’s some way to tell which nodes are leaf nodes and some way to get the children of non-leaf nodes.

There’s something refreshing about that freedom. It’s nice to think “For the rest of my code, it would be really nice to represent a tree as XXX” and know that Clojure’s built-in functions can work with that representation. No need to create abstract anonymous protected thread-non-mutable generic reified construction apparati for your new tree representation, or write five hundred lines of XML to register it with your dependency injection framework.
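For what it's worth, the same contract (some way to tell branches from leaves, plus some way to get children) is easy to sketch in Python. This is a rough transliteration of the idea, not how clojure.core actually implements tree-seq, and the names tree_seq, is_branch, nested, and child_map are made up for the illustration.

```python
def tree_seq(is_branch, children, root):
    """Lazily yield every node of a tree in depth-first pre-order."""
    yield root
    if is_branch(root):
        for child in children(root):
            yield from tree_seq(is_branch, children, child)


# Representation 1: nested (name, children) tuples.
nested = ("root", [("a", [("b", []), ("c", [])]),
                   ("d", [("e", [("f", [])]), ("g", [])])])
names_from_nested = [
    node[0]
    for node in tree_seq(lambda n: bool(n[1]), lambda n: n[1], nested)
]

# Representation 2: structure kept in a separate dict keyed by node name.
child_map = {"root": ["a", "d"], "a": ["b", "c"], "d": ["e", "g"], "e": ["f"]}
names_from_map = list(tree_seq(lambda n: n in child_map, child_map.get, "root"))
```

Both calls walk the same logical tree, even though one representation nests its nodes and the other keeps the structure in a lookup table off to the side.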

You Can Only Luminus Once Per Project

When I wrote Odyssey Through Three Web Frameworks, I was under the mistaken impression that Luminus was a web framework like Rails. It’s not, and you will be disappointed if you expect it to be.

Facts about Luminus:

  1. It’s implemented as a Leiningen template, and in the Java world it would probably be part of Eclipse.
  2. It generates a bunch of useful namespaces for you, and includes a bunch of useful libraries for you, and it does give you some stub functions, but it has nothing as extensive as Rails’s ActiveRecord or Django’s admin interface.
  3. If you don’t like the libraries that Luminus includes for you, it’s easy to pick a different one. Luminus uses Selmer, which is like a Clojure implementation of Django’s template language. I use Enlive. Luminus was okay with that, and we were still able to work together.
  4. Luminus still does not do security.
  5. Luminus is Homu Homu’s favorite Clojure technology, even if she can only use it once a day.

In short, Luminus is still just as do-it-yourself as Clojure always was, while helping out with boilerplate and giving just a bit more guidance than you would get without it.

The thought-polluting effects of Clojure

Recently, I wrote some Python code. It’s been a while, and it was surprisingly good to be back, even though I was definitely not loving on LXML.

A few years ago, whenever I started a new program, I would write a class. I would think about how to carve things up into classes. A few times, I wrote C programs, and I felt lost and confused because I couldn’t put things in classes.

I did use standalone functions in C++ and Python, but only when the function was completely unrelated to the rest of the program. If it was some random utility, I would write it as a standalone function. And if I was overloading an operator that C++ requires to be overloaded as a standalone function, I would write it as a standalone function, after first trying to write it as a method, staring at the bizarre compiler errors for five minutes, and then remembering that you can only overload certain operators as standalone functions. But I usually wanted everything in a class, even when I wasn’t really using the facilities of classes for anything. Both versions of PySounds, for example, stick all the logic in a class, then have a main module that instantiates an object of that class. In PySounds2, the main module does other UI-related things as well, but in PySounds1, it was almost literally just the code if __name__ == "__main__": PySounds().runStuff(lexfile, scfile).

Then suddenly I ran into Clojure. In Clojure, there are no classes. You write everything as standalone functions and stick it in a namespace. I wouldn’t say I felt lost and confused over this. Unlike C, Clojure has namespaces, so I didn’t feel like I was making program soup. But it did take some getting used to. I proceeded to get used to it.

Now, when I go to write some Python, I don’t jump right to classes. I go as far as I can with standalone functions and modules. If I write some code that feels twisted because of the data access patterns, I’ll consider whether it seems classy or not. If it does, I might write a class. Or I might not; I might just add some keyword arguments to a function. I might just stick two pieces of data in a tuple and pass that in or out. I organize Python like it’s Clojure, with modules as namespaces.
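A minimal sketch of what that looks like in practice, assuming a made-up task of tallying word counts from tab-separated lines. The function names and the data format are hypothetical; the point is only that a tuple and a keyword argument cover what I might once have wrapped in a class.

```python
from collections import Counter


def parse_line(line):
    """Split a 'word<TAB>count' line into a (word, count) tuple, not an object."""
    word, count = line.split("\t")
    return word, int(count)


def tally(lines, *, min_count=1):
    """Sum counts per word; a keyword argument stands in for configurable state."""
    totals = Counter()
    for word, count in map(parse_line, lines):
        totals[word] += count
    return {word: n for word, n in totals.items() if n >= min_count}


result = tally(["cat\t2", "dog\t1", "cat\t3"], min_count=2)
```

Here the module plays the role of the namespace, and nothing needs to be instantiated before the functions can be called.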

This is still pretty idiomatic in Python, but a while back, when I unconsciously applied the same pattern to Java, things got ugly. Java wants, needs, can’t live without, everything in classes. I wanted things not to be in classes. I wrote a class with a bunch of static methods. When I put it up on Code Review.SE, someone complained that it wasn’t object-oriented. And that was just, because it wasn’t, because I didn’t want it to be and Java did and we fought and ended up at a lousy compromise, as usually happens when you disagree with Java.

I’m not sure if there’s really any significance to this change in approach. I’ve felt for at least the past couple years that object-oriented programming was a lot more heavy machinery than I needed for most of the programs I work on. Especially in Java, when you start getting into generics and things get crazy-complicated.

I kind of appreciate Martin Odersky’s approach to Scala, where he knowingly cordoned off all the crazy type rules and made them library implementors’ problem, both at the language level and culturally in the Scala community. Actually, I appreciate a lot of things Odersky has said and done, which is why I have kind of a soft spot for Scala even though the language, on first inspection, doesn’t really match my aesthetic taste. I would like to spend some more time with Scala someday. That might revive my interest in object-oriented programming a bit. Right now it’s at an all-time low.

On the other hand, I recently went over some posts on Code Review.SE where the posters seemed to have totally ignored Clojure’s rudimentary object-oriented features, either knowingly or from ignorance, even though those features were perfect for their problem domain. (Both games, as I remember.) I realized that I really don’t know these features that well, so I got started on a simulation program where I would use them. I haven’t gotten far yet, but I’ve been quite pleased with records and protocols so far. They seem to be simple, flexible, and performant, while also being more sophisticated than regular maps or C structs.

Conclusion

I guess the only takeaway here is that Clojure has had a positive effect in code I write in other languages, by pushing me to make things more polymorphic and generic, and by spurring me to resist needless complexity. Paradoxically, by refusing to be object-oriented, Clojure has brought me greater understanding of object-oriented programming, both its upsides and its downsides. Now, instead of mindlessly classifying everything, I see how far I can go without classes, and when some particular code seems like it could really use a class, I use one. Rather than heavyweight classes with fifty member variables and hundreds of methods, I prefer lightweight classes with a single obvious purpose.

It’s tempting when you first learn about OOP to use it all the time, for everything, especially when your professors and your languages are expecting it. But it’s good to find out how things would go if you didn’t have OOP, whether you’re using C, JavaScript, or Clojure.


Dependency Injection As I Understand It

I’ve spent about a year trying to understand dependency injection. Not because it was actually that hard to understand what the Wikipedia article or this Stack Overflow question or this MSDN article were saying. I may be an idiot, but I did learn something in my CS degree. Actually, I learned two things: that despite what my middle school teachers told me, I’m an idiot; and that one ought never dereference the null pointer, though one sometimes may accidentally dereference a pointer to a block of memory which has already been deallocated, hence unintentionally dereferencing the null pointer, which, as I said, one ought never do, especially as, when one is running one’s program in kernel mode, dereferencing the null pointer may clobber the operating system’s private memory block and bring the entire system crashing down.

No, the reason it took me a year to understand dependency injection is that it seemed so idiotically simple that even an idiotic idiot like me ought to be able to understand it easily. It seemed that it ought to be part of some bullet list of basic object-oriented design practices in some beginner’s textbook like the Gaddis series, not a big, fancy concept with a long name and SO questions and MSDN articles and frameworks. However, the thought that the programming gods would make a mistake like that was so loathsome that my brain simply refused to even conceive it. Anything which requires a framework is clearly large and complex, I thought; programmers, after all, don’t just go around making frameworks to do trivial things that should be a footnote in the “Introduction to Classes and Objects” chapter of some Gaddis book, even in Java, in which language you’re not a real programmer until you’ve made a framework. It was only yesterday, when dependency injection rather suddenly made complete sense to me, that I was able to conceive such a thought.

This is, according to my current understanding, the essence of dependency injection. Say I have some Python code that looks like this:

import sqlite3

def write_to_database(db_name, stuff_to_write):
  conn = sqlite3.connect(db_name)
  cursor = conn.cursor()
  try:
    # Now do stuff with conn and cursor.
    pass
  finally:
    cursor.close()
    conn.close()  # This function opened the connection, so it closes it too.

That code depends on having a connection to a database. So we pass it the name of the database, and it opens one. (If you’re not familiar with SQLite, its databases are just files on disk written in a certain binary format, so handling SQLite databases is just like handling regular files. You don’t need to set up a database server on a port like you do with MySQL or PostgreSQL. So db_name here would just be a path.)

But suppose we had some other database package which had the same API as Python’s SQLite3 module, and we wanted to have this function write to it instead? Clearly, it’s stupid to write the exact same function over again except with the line conn = fricktardSQLEleventyOneDotSeven.connect(db_name) in place of conn = sqlite3.connect(db_name). I guess we could move this function into a class, and then write two child classes, SQLiteDBWriter and FricktardSQLEleventyOneDotSevenDBWriter, with a helper method get_db_connection(db_name) which is overridden to return a connection to an SQLite database in one class, and to return a connection to a FricktardSQL database in the other. That doesn’t really seem any smarter, does it?
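Here is roughly what that subclass version might look like, as a sketch under assumptions: the table name stuff, the column item, and the class name DBWriter are invented for the example, and since FricktardSQL is fictional, its subclass can only be a stub.

```python
import sqlite3


class DBWriter:
    """Base class: the writing logic lives here, the choice of database does not."""

    def get_db_connection(self, db_name):
        raise NotImplementedError

    def write_to_database(self, db_name, stuff_to_write):
        conn = self.get_db_connection(db_name)
        cursor = conn.cursor()
        try:
            # Hypothetical "stuff": insert each item into a one-column table.
            cursor.execute("CREATE TABLE IF NOT EXISTS stuff (item TEXT)")
            cursor.executemany(
                "INSERT INTO stuff (item) VALUES (?)",
                [(item,) for item in stuff_to_write],
            )
            conn.commit()
            cursor.execute("SELECT COUNT(*) FROM stuff")
            return cursor.fetchone()[0]
        finally:
            cursor.close()
            conn.close()


class SQLiteDBWriter(DBWriter):
    def get_db_connection(self, db_name):
        return sqlite3.connect(db_name)


class FricktardSQLEleventyOneDotSevenDBWriter(DBWriter):
    def get_db_connection(self, db_name):
        # No such driver exists; a real subclass would return its connection here.
        raise NotImplementedError("FricktardSQL is fictional")


row_count = SQLiteDBWriter().write_to_database(":memory:", ["a", "b"])
```

That is three classes and an overridden helper method just to vary a single line.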

Or we could do this:

def write_to_database(conn, cursor, stuff_to_write):
  try:
    # Now do stuff with conn and cursor.
    pass
  finally:
    cursor.close()

The caller decides what kind of database it wants to connect to, and just passes that connection on to write_to_database. As long as conn and cursor support the same methods as the versions the SQLite3 module gives you, you can easily inject a connection to a FricktardSQL database as the connection which the function depends on. And that’s dependency injection!
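To make that concrete, here is a small sketch with two callers. The body of the function (inserting items into a table called stuff) is my own filler for the "do stuff" part, and RecordingConnection and RecordingCursor are hypothetical stand-ins for some other driver with the same API as sqlite3.

```python
import sqlite3


def write_to_database(conn, cursor, stuff_to_write):
    """Write each item through whatever connection and cursor the caller injected."""
    try:
        cursor.executemany(
            "INSERT INTO stuff (item) VALUES (?)",
            [(item,) for item in stuff_to_write],
        )
        conn.commit()
    finally:
        cursor.close()


class RecordingCursor:
    """Hypothetical stand-in for another driver's cursor; it only records calls."""

    def __init__(self, log):
        self.log = log

    def executemany(self, sql, rows):
        self.log.append((sql, list(rows)))

    def close(self):
        self.log.append("closed")


class RecordingConnection:
    """Hypothetical stand-in for another driver's connection."""

    def __init__(self):
        self.log = []

    def cursor(self):
        return RecordingCursor(self.log)

    def commit(self):
        self.log.append("committed")


# Caller 1 injects a real SQLite connection (in-memory, so nothing touches disk).
sqlite_conn = sqlite3.connect(":memory:")
sqlite_conn.execute("CREATE TABLE stuff (item TEXT)")
write_to_database(sqlite_conn, sqlite_conn.cursor(), ["a", "b"])
sqlite_count = sqlite_conn.execute("SELECT COUNT(*) FROM stuff").fetchone()[0]

# Caller 2 injects the stand-in; write_to_database itself is unchanged.
fake_conn = RecordingConnection()
write_to_database(fake_conn, fake_conn.cursor(), ["a", "b"])
```

A side benefit of this shape is that the stand-in doubles as a test fake: you can check what the function tried to write without touching a real database.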

This is somewhat less clear-cut in Java, because of the static type system. In Java, the com.nerus.zipacna.camullus.yu.anubis.apophis.tealc.FricktardSQLConnection class has to share a declared supertype with the SQLite connection class (a common interface or base class), or else the compiler won’t let you pass it to org.my.program.to.write.stuff.to.a.database.DatabaseWriter.writeToDatabase, even if it has exactly the same methods. So there might be cases where you have to resort to that separate class version I sketched out above, or something like it, if you’re using a language with a nominal static type system like Java’s. I assume this sort of case is what the frameworks exist for.

Some statically typed languages have language-level ways to get around this. For instance, C# has optional dynamic typing, so you could just declare the connection reference in your method signature to be dynamic and write the method like the Python version. Scala lets you mix in traits on a per-instance basis; as far as I can tell, you could write a Scala version like this:

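// CursorType, Stuff, stuff, and FricktardSQLDBConnection are placeholders.
trait Cursory {
  // Abstract on purpose: any connection class that already defines
  // cursor(): CursorType satisfies this when the trait is mixed in.
  def cursor(): CursorType
}

def writeToDatabase(conn: Cursory, stuffToWrite: Stuff) = {
  // Write stuff to database
}

val ftSQLDBConnection = new FricktardSQLDBConnection with Cursory
writeToDatabase(ftSQLDBConnection, stuff)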

In general, you could add the Cursory trait to any kind of connection object and allow it to be used by the writeToDatabase function. I have a “hello, world”-level knowledge of Scala, so corrections are welcome.

That’s dependency injection as I understand it. The moment I changed my write_to_database(db_name, stuff_to_write) function to write_to_database(conn, cursor, stuff_to_write), it suddenly clicked that I had just done dependency injection. However, corrections and clarifications are welcome as always. Maybe you don’t think I really did dependency injection? Maybe you don’t think “FricktardSQL” should be pronounced “fricktard-skew-ell” and support the alternative pronunciation, “fricktard-sequel” (i.e. it’s pronounced like what I said when I saw the trailer for Furious 7, to which I shall not link since I desire not to give it any publicity, even from the two or three people that occasionally read this blog)? Or maybe you think dependency injection doesn’t really exist and everything we think is dependency injection is an illusion of our patriarcho-capitalist philosophies? I welcome it all.