Archive for December, 2009

mod_rewrite, subfolders and trailing slashes

December 30, 2009

I’m posting this because I spent many hours debugging this issue and couldn’t find a single site on the internet that provided a solution. I’m not thrilled with the answer I have, so if anyone knows of a better way to do this, I’m all ears.

The goal

I’m currently starting to host multiple web applications on a single domain in separate subfolders. For example, wordify should be served from http://www.snoyman.com/wordify/. Since I’m using Apache on my host, I need to use mod_rewrite to accomplish this. Anyone who has ever dealt with mod_rewrite probably has the same fear of it that I do. However, I’m purposely not going to go into a mod_rewrite rant.

Anyway, the mod_rewrite rules need to perform an internal rewrite (meaning the user’s URL is not changed via a 301 redirect) that hands the request to the CGI script. For example, I want a request to http://www.snoyman.com/wordify/toword/57/ to be treated by Apache as http://www.snoyman.com/wordify/dispatch.cgi/toword/57/. No problem.

The problem

My initial solution caused a problem when people went to http://www.snoyman.com/wordify (no trailing slash). In particular, it would redirect them to a URL based on the absolute filename on the system of the request (eg, http://www.snoyman.com/f5/snoyman/public/…). My temporary solution was to set up a 301 redirect from that ugly URL to the correct location, but that’s not an efficient solution since it requires a whole extra round trip for the user to get to the page.

It turns out that mod_rewrite does funny stuff with converting URLs to pathnames, and there’s no way to specify which type of value you would like to deal with.

The solution

Below is my new .htaccess file. Here’s what you need to know to follow this and adapt it to your own purposes:

  • The root of my website is /f5/snoyman/public
  • All of the wordify files are kept in /f5/snoyman/public/wordify
  • The CGI program for generating the content is called dispatch.cgi

Options +ExecCGI
AddHandler cgi-script .cgi

Options +FollowSymlinks

RewriteEngine On
RewriteRule ^/f5/snoyman/public/wordify$ /wordify/ [R=301,S=1]
RewriteCond $1 !^dispatch\.cgi
RewriteRule ^(.*) dispatch.cgi/$1 [L]

A few notes:

  • The first two lines are purely for CGI purposes. You might not need them, or may wish to use different file extensions.
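  • The FollowSymlinks option, on the other hand, is needed for the rewrite: mod_rewrite will not run from a .htaccess file unless FollowSymLinks (or SymLinksIfOwnerMatch) is enabled.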
  • The first RewriteRule line does the trailing-slash addition. It specifies R=301 so that the user receives a 301 permanent redirect. The S=1 is a skip option so that the following rule is not executed.
  • The RewriteCond applies to the second RewriteRule, and makes sure we don’t end up with an infinite redirect loop. Remember, mod_rewrite will reloop through all your rules each time one of the rules makes a change to the URL.
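  • As far as I can tell, the L option only ends the current pass through the rules. In a .htaccess file the rewritten URL is then fed back through the rule set, so L definitely doesn’t guarantee that a rule is the last one executed. That’s why we need the S on the first rule.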
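
To make the control flow easier to follow, here is a toy model of the two rules in Haskell. It is only a sketch of the logic described in the notes above, not of how Apache implements it: the names Outcome, onePass and settle are made up for this illustration, and the model simply assumes the rules see the path in the form each pattern expects.

```haskell
import Data.List (isPrefixOf)

-- | What the toy rule set decides for one request.
data Outcome
  = Redirect301 String -- ^ external redirect: the browser is sent to a new URL
  | Internal String    -- ^ internal rewrite: the browser's URL is unchanged
  deriving (Eq, Show)

-- | One pass over the two rules. The argument is the path as the rules see
-- it (an assumption of this model, not Apache's exact behaviour).
onePass :: String -> Outcome
onePass path
  -- Rule 1: add the missing trailing slash, then skip rule 2 (the S=1 flag).
  | path == "/f5/snoyman/public/wordify" = Redirect301 "/wordify/"
  -- RewriteCond: leave paths that already start with dispatch.cgi alone.
  | "dispatch.cgi" `isPrefixOf` path = Internal path
  -- Rule 2: hand everything else to the CGI script.
  | otherwise = Internal ("dispatch.cgi/" ++ path)

-- | Run passes until one changes nothing, mimicking the re-looping
-- described in the notes.
settle :: String -> Outcome
settle path = case onePass path of
  Internal path' | path' /= path -> settle path'
  outcome -> outcome
```

In this model, settle "toword/57/" takes two passes and ends at Internal "dispatch.cgi/toword/57/". Without the dispatch.cgi check, the second pass would prepend dispatch.cgi/ again and never stop.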

It’s ugly, but it works. The only downside I’d like to address is to disallow access from http://www.snoyman.com/wordify/dispatch.cgi/…, but I’m not too worried about that. Following good RESTful principles would make information available at precisely one URL, but the dispatch.cgi is unlikely to be stumbled upon by mistake.

Yesod RESTful web framework sample

December 21, 2009

Update: the example should now be working (as of 2009-12-22 18:10 UTC). Thanks to Chris and Felipe for the bug reports below.

View the sample being discussed here. For full effect, try with and without Javascript.

I’ve been working for a while on a web framework, previously under the name “restful”, but more recently renamed to Yesod. In future posts, I hope to give a more general overview of the features of this framework, but for now I’m just interested in showing a single code sample.

In order to get a nice test suite going, I’ve been racking up simple example web apps. While trying to think of one, I ran across a post on Happstack, which incidentally had a great sample app: factorials. I’m not trying to compare features of Yesod against Happstack here, just give proper attribution for the idea.

The code is available as part of my github repo. I’ve also converted the code to HTML. The file is well enough documented; the rest of this post will try to point out the features of Yesod that make this demonstration notable.

Multiple representations

This is probably the most important piece. Every web framework that exists can generate an HTML page. The vast majority can also generate JSON. Most of them know to set the content-type header correctly (I hope). Yesod, however, takes the same data and can give it different representations.

The trick is in the Yesod.Rep module, in the HasReps typeclass. Any instance of this typeclass can specify multiple renderings of itself. For example, HtmlObject has both HTML and JSON representations (more are possible, but probably unnecessary). You can wrap an HtmlObject with a TemplateFile and then have the data displayed nicely with an HStringTemplate template. To top it all off: HtmlObject handles all the entity encodings for you, so no more cross-site scripting attacks (exaggeration, I know).
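
The general idea can be sketched in a few lines. This is not the actual Yesod.Rep code: the class Representable, the Factorial type and chooseRep are hypothetical names invented here, and representations are plain strings to keep the example small.

```haskell
import Data.List (find)
import Data.Maybe (listToMaybe, mapMaybe)

type ContentType = String

-- | Hypothetical stand-in for the idea behind HasReps: one value, several
-- renderings, each tagged with its content type.
class Representable a where
  reps :: a -> [(ContentType, String)]

-- | A tiny example value: an input and its factorial.
data Factorial = Factorial Integer Integer

instance Representable Factorial where
  reps (Factorial n result) =
    [ ("text/html", "<p>" ++ show n ++ "! = " ++ show result ++ "</p>")
    , ("application/json", "{\"input\":" ++ show n ++ ",\"result\":" ++ show result ++ "}")
    ]

-- | Pick the first content type the client accepts, in the client's order of
-- preference; fall back to the first representation offered.
chooseRep :: Representable a => [ContentType] -> a -> Maybe (ContentType, String)
chooseRep accepted x = listToMaybe (mapMaybe pick accepted ++ offered)
  where
    offered = reps x
    pick ct = find ((== ct) . fst) offered
```

The handler only builds a Factorial value. Whether the client gets HTML or JSON is decided afterwards, from the request's list of accepted content types.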

Simplified routes

I was always annoyed when using Django that I specified my routes using regexes. There’s no need. I’ve never seen a webapp that did something beyond breaking up pieces across slashes and routing based on that. To get really fancy, you can accept only digits for one of the path pieces.

If you look in the code, you’ll see what looks like a quasi-quoted YAML file. Well, that’s exactly what it is. Yesod includes some Template Haskell to use this YAML file to generate a completely compile-time checked set of routes. It guarantees:

  • No overlapping routes exist.
  • Within each route, there are no duplicate handlers for any verb (request method).
  • Each specified handler takes the right arguments. For example, the resource path “/user/#userid/variable/$varname/” would require a function that takes an Int (for the #userid) and String (for the $varname).

There is also a version of the TH function which does not check for overlapping patterns.
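
To show how little machinery slash-based routing needs, here is a runtime sketch of matching a path against a pattern such as “/user/#userid/variable/$varname/”. Yesod does its checking at compile time with Template Haskell; this sketch does not attempt that, and Piece, Arg, parsePattern and matchRoute are names made up for the example.

```haskell
import Data.Char (isDigit)

-- | One piece of a route pattern, mirroring the "#userid" / "$varname" notation.
data Piece = Static String | IntPiece | StringPiece
  deriving (Eq, Show)

-- | A captured argument.
data Arg = IntArg Int | StringArg String
  deriving (Eq, Show)

-- | Split on slashes, dropping empty pieces (leading, trailing or doubled slashes).
splitPath :: String -> [String]
splitPath s = case break (== '/') s of
  (piece, rest) ->
    [piece | not (null piece)] ++ case rest of
      [] -> []
      (_ : rest') -> splitPath rest'

-- | Turn a pattern string into pieces: '#' marks an Int, '$' marks a String.
parsePattern :: String -> [Piece]
parsePattern = map toPiece . splitPath
  where
    toPiece ('#' : _) = IntPiece
    toPiece ('$' : _) = StringPiece
    toPiece s = Static s

-- | Match a request path against a pattern, collecting the captured arguments.
matchRoute :: [Piece] -> String -> Maybe [Arg]
matchRoute pieces path = go pieces (splitPath path)
  where
    go [] [] = Just []
    go (Static s : ps) (x : xs) | s == x = go ps xs
    go (IntPiece : ps) (x : xs)
      | not (null x) && all isDigit x = (IntArg (read x) :) <$> go ps xs
    go (StringPiece : ps) (x : xs) = (StringArg x :) <$> go ps xs
    go _ _ = Nothing
```

With the pattern from the list above, matchRoute (parsePattern "/user/#userid/variable/$varname/") "/user/42/variable/colour/" gives Just [IntArg 42,StringArg "colour"], while a non-numeric user id gives Nothing.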

Swappable backends

This example uses the hack-handler-simpleserver so it can be easily tested on a local system without running a web server. However, swap that for hack-handler-cgi, and you’ve got a CGI program. In fact, it will work with any Hack handler.

Various features

There’s a bunch of features in use here under the surface, such as automatic URL cleanup (trailing slashes and the like), JSON-P support, etc. There’s even more power not being used: OpenID authentication, client-side encrypted session data, request method override, etc. These will all be documented before release.

Conclusion

Yesod has been in development for quite a while now (over a year I believe). It’s the core for a few of my sites (photoblog is the largest), and is rapidly approaching its first release. It’s been on hold while some of its underlying libraries matured (failure, attempt and data-object). However, if you’re interested in building Ajax sites following RESTful principles, it could very well be the framework you’re looking for.

data-object family

December 17, 2009

A big thanks to Nicolas Pouillard, who co-authored data-object (as well as some of the underlying libraries like attempt) for coming up with many of the great ideas here.

Introduction

Before you get worried, this has nothing to do with object-oriented programming. The term “object” here refers to a JSON object, which basically means a data type which can represent three things:

  • Scalars
  • Sequences (or lists)
  • Mappings (or dictionaries)
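
As a rough sketch, a type with those three cases could look like the following. The library’s actual definition may differ in detail; the example value and the scalars helper are made up for illustration, and the type is assumed to be polymorphic in key and value as described further down.

```haskell
-- | A sketch of a JSON-style object type, polymorphic in key and value.
data Object key val
  = Scalar val
  | Sequence [Object key val]
  | Mapping [(key, Object key val)]
  deriving (Eq, Show)

-- | Example value (hypothetical data): a mapping holding a scalar and a sequence.
person :: Object String String
person =
  Mapping
    [ ("name", Scalar "Alice")
    , ("languages", Sequence [Scalar "Haskell", Scalar "English"])
    ]

-- | Collect every scalar in document order.
scalars :: Object key val -> [val]
scalars (Scalar v) = [v]
scalars (Sequence os) = concatMap scalars os
scalars (Mapping kvs) = concatMap (scalars . snd) kvs
```

Here scalars person evaluates to ["Alice","Haskell","English"].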

This format happens to be an incredibly useful thing, and the goal of data-object is to provide the Object data type in one place where other libraries can use it, and thus easily exchange data with other libraries. So far, this library has been used for:

  • data-object-json: a wrapper around json-b for JSON parsing/emitting
  • data-object-yaml: a binding to the libyaml C library. (Note: the C source code is included in the package, so you don’t need to have it installed separately on your system.)
  • json2yaml: a simple utility program for converting JSON to YAML files (I was shocked that I couldn’t find something like this elsewhere).
  • It is also playing a prominent role in the Yesod web framework to provide such features as automatic string escaping, JSON output and interfacing with HStringTemplate.

Hopefully that gives you an idea that this library is useful. Before rolling your own data type to do basically the same thing, please consider using this library instead.

Overview of design choices

The datatype itself is incredibly simple; the important points are what go along with it.

  • The Object datatype is polymorphic in both the key and value. You can make String->String objects, Int->String, or anything else you like.
  • This library depends on convertible-text, which provides generic conversion type classes.
  • There is a Template Haskell function included to automatically generate a number of instances.
  • There are three specific aliases provided for Object in their own modules: TextObject, StringObject and ScalarObject.
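
Being polymorphic in both parameters means that changing the key or value type is just a traversal. The sketch below repeats a simplified Object type so it stands alone. The helpers convertObject and showKeys are hypothetical, written for this example rather than taken from the library, and StringObject is assumed here to mean Object String String.

```haskell
-- | Simplified stand-in for the Object type, repeated so the example is self-contained.
data Object key val
  = Scalar val
  | Sequence [Object key val]
  | Mapping [(key, Object key val)]
  deriving (Eq, Show)

-- | Assumed meaning of the alias for this sketch.
type StringObject = Object String String

-- | Convert keys and values independently.
convertObject :: (k1 -> k2) -> (v1 -> v2) -> Object k1 v1 -> Object k2 v2
convertObject _ fv (Scalar v) = Scalar (fv v)
convertObject fk fv (Sequence os) = Sequence (map (convertObject fk fv) os)
convertObject fk fv (Mapping kvs) =
  Mapping [(fk k, convertObject fk fv o) | (k, o) <- kvs]

-- | Turn an Int-keyed object into a String-keyed one.
showKeys :: Object Int String -> StringObject
showKeys = convertObject show id
```

The real library leans on the conversion type classes from convertible-text for this; plain functions are used here only to keep the sketch dependency-free.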

What, no code samples?

Sorry, not this time. If you want to see example code that uses the data-object library, I recommend data-object-json and json2yaml (data-object-yaml has a lot of C library cruft).

Also, this library is still young, so I’m very much open to suggestions.

Two language extensions

December 7, 2009

Below are my ideas for two language extensions which I think add a lot to the Haskell language without adding too much ambiguity. At least, I haven’t found any issues with the ideas so far, but I’m sure plenty of other people will be able to ;).

AutomaticClassSynonyms

Let’s say I’ve got the Failure class, and I’d like to define a MonadFailure class, simply for convenience, which is a subclass of both Failure and Monad. Well, defining the class is easy:

class (Monad m, Failure m) => MonadFailure m

I believe that this should automatically make anything which is an instance of both Monad and Failure an instance of MonadFailure, since the definition of MonadFailure is completely empty. I look at class instances as needing to address two issues:

  • Existence: is there some instance which makes sense?
  • Uniqueness: of those instances which make sense, which one should I use?

Here, there is no room for ambiguity: there exists an instance which makes sense (eg, instance MonadFailure Maybe), and there is precisely one instance which makes sense. There is no alternative way to define this instance.

Therefore, I think that in this case we should not complain if we have two instances for the same data type, since we know that the instances will be identical. That would make this extension work very nicely with existing code. It also adds no new syntax.
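
For comparison, the closest approximation I know of in GHC is a single catch-all instance, which needs the FlexibleInstances and UndecidableInstances extensions. The sketch below uses a simplified, hypothetical Failure class (one method taking a String) rather than the real one, and safeDiv is just an example function.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}

-- | Simplified, hypothetical stand-in for the Failure class.
class Failure m where
  failure :: String -> m a

instance Failure Maybe where
  failure _ = Nothing

-- | The empty "synonym" class.
class (Monad m, Failure m) => MonadFailure m

-- | One catch-all instance: anything that is both a Monad and a Failure
-- is a MonadFailure.
instance (Monad m, Failure m) => MonadFailure m

-- | Example use: only the combined constraint is mentioned.
safeDiv :: MonadFailure m => Int -> Int -> m Int
safeDiv _ 0 = failure "division by zero"
safeDiv x y = return (x `div` y)
```

The catch is that any hand-written instance, such as instance MonadFailure Maybe, now overlaps with the catch-all one and is rejected when it is used, which is the kind of complaint the proposed extension would drop.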

What I didn’t say

I specifically do not think this extension should make automatic instances of classes which have default definitions for all their functions. The first example that comes to mind is Exception: even though both fromException and toException have default definitions, I think the user should still have to explicitly instantiate Exception, even if a type is already an instance of Typeable and Show.

SubClassOverloading

This extension is a bit more complicated. For motivation, let’s look at the interaction between Monad and Applicative. For most cases, a Monad can define an Applicative instance as such:

import Control.Applicative (Applicative (..))
import Control.Monad (ap, liftM)

instance Functor MyMonad where
  fmap = liftM
instance Applicative MyMonad where
  pure = return
  (<*>) = ap

Well, that’s irritating! Instead of just writing a five line Monad instance, I have to write five extra boilerplate lines.
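
Here is the same boilerplate spelled out for a concrete type, as a minimal sketch. Counter and tick are hypothetical names invented for the example. One difference from the snippet above: so that it compiles cleanly on a recent GHC, pure is defined directly and return is left to its default, instead of writing pure = return.

```haskell
import Control.Monad (ap, liftM)

-- | Hypothetical example monad: a computation that threads a step counter.
newtype Counter a = Counter {runCounter :: Int -> (a, Int)}

-- The boilerplate: both instances fall out mechanically from the Monad instance.
instance Functor Counter where
  fmap = liftM

instance Applicative Counter where
  pure x = Counter (\n -> (x, n))
  (<*>) = ap

-- The instance we actually wanted to write.
instance Monad Counter where
  Counter m >>= f = Counter $ \n ->
    let (x, n') = m n
     in runCounter (f x) n'

-- | Increment the counter.
tick :: Counter ()
tick = Counter (\n -> ((), n + 1))
```

Nothing in the Functor and Applicative instances depends on Counter itself, which is the point: the same lines get copied for every monad.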

As a separate issue, Applicative is not defined as a superclass of Monad, and therefore I cannot treat all Monads as Applicatives. But we can’t add that superclass requirement without breaking existing code.

So I say we allow the definition of Monad as such:

class Applicative m => Monad m where
  fail :: String -> m a -- or we could just take this out...
  (>>=) :: m a -> (a -> m b) -> m b -- the same
  return :: a -> m a -- also the same
  fmap = liftM -- a default definition for a superclass function
  pure = return
  (<*>) = ap

And suddenly all Monads are Applicative! Since every function in the Functor and Applicative classes is given a default definition in Monad, they can be automatically derived.

But what if you want to define a special version of fmap? Simple: do it like always! The definition in Monad is merely the default; if the compiler finds a separate instance for your data type, it uses that instead. This way, old code still works without a hitch.

The downside

The only downside I can see is that suddenly you’ll have instances of classes where before there were none. Not that having Applicative instances in and of itself is a downside, but there might be cases where it would define inappropriate instances (not that I can think of any off-hand). On the other hand, this would be mitigated slightly by the requirement of the type-class author to explicitly turn on this flag.

Let the beatings begin

Well, this is my first time suggesting any changes to Haskell, so I expect to be thoroughly scolded for my preposterous, heretical notions. Even if these suggestions are lacking, however, I hope we eventually get something which allows these kinds of features in Haskell.

String-like

December 4, 2009

While working on a type-safe method for embedding HTML fragments, I was reminded of some of my annoyances with my web-encodings package. In particular, I hated how I was doing all of these automatic conversions from and to lazy bytestrings, strict bytestrings and strings (ie [Char]). It’s always caused me a few headaches:

  • Often times I need to explicitly set types with a type signature.
  • I know that I’m needlessly wasting cycles.
  • There is more than one way to convert between a String and a ByteString; in particular, Latin-1 encodings (ie, Data.ByteString.Char8.pack) versus UTF-8.
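
The last point is easy to see with a character outside Latin-1. This sketch assumes the bytestring and text packages are available; packLatin1, packUtf8 and lambda are names chosen for the example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Latin-1 style packing: each Char is truncated to its low 8 bits.
packLatin1 :: String -> B.ByteString
packLatin1 = B8.pack

-- | UTF-8 packing, going through Text.
packUtf8 :: String -> B.ByteString
packUtf8 = TE.encodeUtf8 . T.pack

-- | "\955" is the Greek letter lambda (U+03BB), which does not fit in one byte.
lambda :: String
lambda = "\955"
```

B.unpack (packLatin1 lambda) is the single byte 0xBB (the code point with its high bits silently dropped), while B.unpack (packUtf8 lambda) is the two bytes 0xCE 0xBB. For plain ASCII input the two functions agree, which is what makes the difference easy to miss.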

I decided that it would be a good idea to provide these functions for all string-like data types. In addition to strings, strict bytestrings and lazy bytestrings, I also want to support strict and lazy text. My first idea was to provide five different modules in web-encodings. I did not relish the thought of writing it, much less maintaining it.

class StringLike

Then I had an idea. When doing HTML escaping, for example, all I really need to do is call “concatMap escapeHtmlChar”, where escapeHtmlChar might look like:

escapeHtmlChar '<' = "&lt;"
escapeHtmlChar '>' = "&gt;"
...
escapeHtmlChar c = [c]

I could obviously write 5 versions of the escapeHtml function, each calling a specialized version of concatMap. In fact, it’s very simple to do so: all five data types involved provide a concatMap function. I might need a little tweaking for packing at some points, but it’s very simple.
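
To make the duplication concrete, here are two of the five versions as a sketch, assuming the bytestring package. The set of escaped characters is my own choice for the example (the list above is abbreviated), and the names escapeHtmlString and escapeHtmlBS are made up.

```haskell
import qualified Data.ByteString.Char8 as B8

-- | Escape one character. Only four characters are handled in this sketch.
escapeHtmlChar :: Char -> String
escapeHtmlChar '<' = "&lt;"
escapeHtmlChar '>' = "&gt;"
escapeHtmlChar '&' = "&amp;"
escapeHtmlChar '"' = "&quot;"
escapeHtmlChar c = [c]

-- | Version for String, using the Prelude's concatMap.
escapeHtmlString :: String -> String
escapeHtmlString = concatMap escapeHtmlChar

-- | Version for strict bytestrings. The "little tweaking for packing" is the
-- B8.pack needed because B8.concatMap expects a function returning a ByteString.
escapeHtmlBS :: B8.ByteString -> B8.ByteString
escapeHtmlBS = B8.concatMap (B8.pack . escapeHtmlChar)
```

The two functions differ only in which concatMap they call and in the packing step.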

But of course I still didn’t want to have five functions. So I decided to create the “StringLike” typeclass. It looks something like this:

class StringLike a where
    head :: a -> Char
    tail :: a -> a
    lengthLT :: Int -> a -> Bool
    concatMap :: (Char -> String) -> a -> a
    -- ... (many more functions)

As simple as this looks, there are a few things to note:

  • The basic type is always a Char. This means that we are treating bytestrings as if they are encoded in Latin-1.
  • Based on a suggestion by Daniel Fischer, there is no length function. Instead, there are length comparison functions, which is probably what’s needed in general.
  • There’s a fine line of when to use String and when to use the type itself. For example, I think the first argument to concatMap should be a function returning a String, not the specific type. tail should most definitely return the type itself. But there are some corner cases, such as the isPrefixOf function.
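
A cut-down version of the class with two instances might look like this. It is a sketch, not the code on github: only two methods are included, the reading of lengthLT n s as “length of s is less than n” is my assumption, and the bytestring package is assumed to be available.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Prelude hiding (concatMap)
import qualified Prelude as P
import qualified Data.ByteString.Char8 as B8

-- | Two of the methods from the class above.
class StringLike a where
  lengthLT :: Int -> a -> Bool -- assumed meaning: is the length less than n?
  concatMap :: (Char -> String) -> a -> a

instance StringLike [Char] where
  -- Looks at no more than n elements, so it is safe on very long strings.
  lengthLT n s = n > 0 && null (drop (n - 1) s)
  concatMap = P.concatMap

instance StringLike B8.ByteString where
  lengthLT n s = B8.length s < n
  concatMap f = B8.concatMap (B8.pack . f)

-- | One escaping function for every instance.
escapeHtml :: StringLike a => a -> a
escapeHtml = concatMap escape
  where
    escape '<' = "&lt;"
    escape '>' = "&gt;"
    escape '&' = "&amp;"
    escape c = [c]
```

Having the function passed to concatMap return a String keeps escape free of any packing. The cost is that each instance has to convert the result into its own type.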

You can see the whole StringLike typeclass on github.

The ugly

Well, since my functions (encodeHtml, decodeUrl, etc) are still dealing with type classes instead of concrete values, I might still need an occasional type signature to get it to work. However, since there’s only one type involved, it should be much easier. For example, stringing together a number of these functions is completely unambiguous.

Also, I’ve lost the ability to pattern match strings. Instead, I must manually check the length and use head and tail functions. This is made most clear by the decodeUrl function. I have a feeling view patterns might be of assistance here, but I haven’t looked into it yet.
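
For what it’s worth, here is a sketch of how view patterns could bring back something close to pattern matching. It assumes the ViewPatterns extension and the bytestring package. The one-method Uncons class and the plusToSpace function are hypothetical, and the example handles only the “+ means space” part of URL decoding, not percent escapes.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ViewPatterns #-}

import qualified Data.ByteString.Char8 as B8
import qualified Data.List as L

-- | Hypothetical one-method class: split off the first character, if any.
class Uncons a where
  uncons :: a -> Maybe (Char, a)

instance Uncons [Char] where
  uncons = L.uncons

instance Uncons B8.ByteString where
  uncons = B8.uncons

-- | Turn every '+' into a space, as URL decoding does for form data.
-- Each clause applies uncons to the argument and matches on the result.
plusToSpace :: Uncons a => a -> String
plusToSpace (uncons -> Just ('+', rest)) = ' ' : plusToSpace rest
plusToSpace (uncons -> Just (c, rest)) = c : plusToSpace rest
plusToSpace _ = []
```

The clauses read much like the (c : rest) patterns they replace, and the same definition works for a String or a ByteString argument.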

Useful?

I’m curious if the community would find this useful as a standalone package. If I were to release it, it would probably be two modules:

  • Data.StringLike would simply be the basic operations any string-like type should provide.
  • Data.StringLike.Extra would be higher-level functions built on top of this. Most likely, it would all go in a typeclass so individual types could provide more efficient versions of specific functions.

I look forward to hearing some opinions on this.