May 2008

Created 31st May, 2008 09:26 (UTC), last edited 31st May, 2008 09:46 (UTC)

Back around 1995, when I started to write web applications, it was already pretty clear that asking a web server something and getting it to process a form were really both kinds of remote procedure call (RPC).

I remember thinking, on first hearing about SOAP, that it would be nice to have a consistent format for this mechanism. Instead of staying simple, though, it seems to have grown rather baroque over the years.

The basic idea of RPC over HTTP is pretty straightforward. Where things get complicated is in the myriad ways of actually doing anything useful – how do you handle parameters and pass values, for example?

http://example.com/factorial/6
http://example.com/factorial?n=6

Both of the above URLs are perfectly reasonable ways of asking for the factorial of 6, so which should you use? The first feels more like a positional calling convention – something used in most programming languages. The second is a named parameter mechanism. From a programming language syntax point of view we might think of it as something like these two:

factorial(6)
factorial(n = 6)
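As a rough sketch of that correspondence, path segments can be read as positional arguments and the query string as keyword arguments. The FUNCTIONS registry and the call_url helper are hypothetical names made up for this illustration, not part of any real service.

```python
from math import factorial
from urllib.parse import parse_qsl, urlsplit

# Hypothetical registry mapping the first path segment to a Python function.
FUNCTIONS = {"factorial": lambda n: factorial(int(n))}

def call_url(url):
    """Read path segments as positional arguments and the query string as keywords."""
    parts = urlsplit(url)
    name, *args = [segment for segment in parts.path.split("/") if segment]
    kwargs = dict(parse_qsl(parts.query))
    return FUNCTIONS[name](*args, **kwargs)

print(call_url("http://example.com/factorial/6"))    # like factorial(6)
print(call_url("http://example.com/factorial?n=6"))  # like factorial(n = 6)
```

Both calls reach the same function; only the calling convention differs.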

I suspect that when looking at URLs we're probably more familiar with the second form even though most of the languages we program in prefer the first. Odd.¹

[1] A browser given either of the two URLs will cache the result, but the caching behaviour differs subtly between them. The first form will be cached much more aggressively, which of course fits in better with a function like factorial which is pure (in the functional programming sense): it will always return the same answer for the same input. It's kind of neat that standard HTTP proxies and caching mechanisms will perform a sort of distributed memoization for this type of function call.
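For comparison, here is what memoization of a pure function looks like locally. This is only an analogy: functools.lru_cache stands in for the HTTP cache, and the calls list is an assumption added to make the cache hits visible.

```python
from functools import lru_cache

calls = []  # records every input that was actually computed

@lru_cache(maxsize=None)
def factorial(n):
    calls.append(n)
    return 1 if n <= 1 else n * factorial(n - 1)

first = factorial(6)
second = factorial(6)  # answered from the cache, nothing is recomputed
print(first, second, calls)
```

The second call adds nothing to calls, much as a cached response never reaches the origin server.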

Another thing to think about is what the first part of the URL really is. A binary function might be more illuminating.

http://example.com/add/2/5
http://example.com/add/2

These are somewhat analogous to the following:

add 2 5
add 2

The first will simply return the answer, but the second is just partial application and it will return a function. Really the URL before we apply the parameters is a form of lambda. Of course if we have lambdas then we should also be thinking of higher order functions.
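In Python the same distinction can be sketched with functools.partial. The add function here is just the obvious stand-in for the URL endpoint, assumed for illustration.

```python
from functools import partial

def add(a, b):
    return a + b

total = add(2, 5)          # full application: an answer
add_two = partial(add, 2)  # partial application: still a function

print(total)
print(add_two(5))
```

Here add_two plays the role of the shorter URL: nothing is computed until the remaining argument arrives.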

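def ntofirst(f):
    # Wrap f so that the keyword argument n is passed positionally instead.
    def deco(*args, **kwargs):
        # Move n out of the keyword arguments and append it to the positional ones.
        args += (kwargs.pop('n'),)
        return f(*args, **kwargs)
    return deco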

The above Python decorator² is the sort of thing that we might employ to turn the URL http://example.com/factorial?n=6 into http://example.com/factorial/6. Calling this over the web, I suppose, might look something like this:

[2] Notice how neatly Python's dual notion of positional and keyword arguments fits into the URL calling convention.

http://example.com/ntofirst/example.com%2Ffactorial
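A URL like this can be assembled by percent-encoding the inner URL so that it survives as a single path segment (a slash encodes as %2F). The higher_order_url helper below is a hypothetical name used only for this sketch.

```python
from urllib.parse import quote, unquote

def higher_order_url(host, func, target):
    # safe="" makes quote() encode "/" as well, so the target stays one segment.
    return f"http://{host}/{func}/{quote(target, safe='')}"

url = higher_order_url("example.com", "ntofirst", "example.com/factorial")
print(url)

# The receiving side can recover the original target from the last segment.
print(unquote(url.rsplit("/", 1)[1]))
```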

But what is returned? In normal programming terms the thing we get back is a lambda with a closure. Clearly for the web that needs to be represented by a URL too, and this is where things get harder.

I can see two reasonable types that might be returned here. The first is a version of the function application function. That is, the URL returned will apply the parameters to the URL and return the result: when it is given a query string of n=6 it will return the answer 720.

The second is that it should be a form of function binding. That is, when given the query string n=6 it will return a URL with the argument already bound, rather than the answer. This isn't how the Python decorator works, but it is the way that this sort of thing works in C++ (all the arguments are bound to the lambda, but the lambda isn't actually invoked).
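The two options can be put side by side in a small sketch. Everything here is an assumption made for illustration: bind_n, apply_n and fake_fetch are invented names, and fake_fetch evaluates factorial URLs locally instead of making a real HTTP request.

```python
from math import factorial
from urllib.parse import parse_qsl, urlsplit

def bind_n(target_url, query):
    """Function binding: build the positional URL but do not invoke it."""
    n = dict(parse_qsl(query))["n"]
    return f"{target_url}/{n}"

def apply_n(target_url, query, fetch):
    """Function application: bind n, then invoke the resulting URL."""
    return fetch(bind_n(target_url, query))

def fake_fetch(url):
    # Hypothetical stand-in for an HTTP GET against example.com.
    last_segment = urlsplit(url).path.rsplit("/", 1)[1]
    return factorial(int(last_segment))

bound = bind_n("http://example.com/factorial", "n=6")
result = apply_n("http://example.com/factorial", "n=6", fake_fetch)
print(bound)   # a URL, not yet invoked
print(result)  # the answer
```

Seen this way, application is just binding followed by a fetch, which may be an argument for making binding the primitive.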

What I'm really wondering, though, is whether a good answer to this won't also serve as a self-descriptive RPC mechanism whose definition bootstraps from the requirements of the higher order functions we want to be able to write. What you might also end up with is an interesting programming language which could be executed by a web browser simply by following links.