Monday, July 21, 2008

Designing maintainable view code

This is something I've been meaning to talk about for a while but never got around to. As it turns out, there are a few gems scattered around the world of web development that I think can help us design view code in MVC frameworks that isn't cryptic to look at.

De facto "tag soup" view code is primarily composed of three distinct types of markup:

  • HTML markup: the tags that describe whether something is a title or a table cell
  • variables: the stuff that we populate in the controllers and that often looks like a variation of <%=MyVariable%>
  • actual code: if statements, for loops and formatting helper functions

So what makes tag soup messy?

The server-side angled brackets peppered everywhere can make things really difficult to read, because there's no good way to correlate code indentation with where tokens begin and end. You either end up with long lines with a ton of brackets:

<a href="<%=url%>"><%=text%></a>

Or you indent everything, making opening HTML tags span multiple lines (and let's not get into what the output code will look like):

<a
  href="<%=url%>"
>
  <%=text%>
</a>

Language designers have seen this problem before: HEREDOC and WYSIWYG string notations are prime examples of how to deal with substrings that contain a different realm of logic than the rest of the source code. So let's do what they do: replace a common token with a not so common one.

We could turn this:

<a href="<%=url%>"><%=text%></a>

into:

<a href="{url}">{text}</a>

Much easier to read, no?

But there's a problem: it makes inline javascript impossible to parse unambiguously. Now the parser needs to know that "function() {a}" is server-side generated code and "function() {b}" is actual client-side javascript. This is a problem because we don't have any escaping rules. And as a matter of fact, people don't want escaping rules here. Who in their right mind wants to write "function() \{a\}"?
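
To make the ambiguity concrete, here's a rough sketch in Python (picked purely for illustration) of a naive engine that treats every {name} as a server-side variable. The render_braces function and the sample page are made up for this example, and so is the choice to render unknown names as empty strings.

```python
import re

def render_braces(template, variables):
    """Naive engine: every {name} is assumed to be a server-side variable."""
    def substitute(match):
        # Assumption for this sketch: unknown names render as an empty string.
        return str(variables.get(match.group(1), ""))
    return re.sub(r"\{(\w+)\}", substitute, template)

page = '<a href="{url}" onclick="function() {a}">{text}</a>'
html = render_braces(page, {"url": "/home", "text": "Home"})
# The client-side {a} gets swallowed along with the real variables.
```

The engine has no way to tell that {a} was meant for the browser, so it eats it just like {url} and {text}.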

One way to deal with this dilemma that I saw in the D language and that I found incredibly clever is to use the grave accent symbol (`). It's not used by any popular web language, and it never appears as a standalone symbol in content. We can rewrite our code to look like this:

<a href="`url`">`text`</a>

Still clean and much less prone to ambiguity problems. We should still make the parsing engine also check for an escaping token, just in case you happen to be blogging about D (hey I'm doing it right now!).
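
Here's a minimal sketch of what such a parsing step could look like, again in Python just for illustration. The render function is hypothetical, and the doubled backtick as the escaping token is my own assumption, not something borrowed from D or any existing framework.

```python
import re

# Assumed syntax: `name` is a server-side variable,
# and a doubled backtick (``) stands for one literal backtick.
_TOKEN = re.compile(r"``|`([A-Za-z_][A-Za-z0-9_]*)`")

def render(template, variables):
    """Replace `name` tokens with values; curly braces are left alone."""
    def substitute(match):
        name = match.group(1)
        if name is None:
            # Matched the `` escape, so emit a single literal backtick.
            return "`"
        # Values are inserted verbatim (no HTML escaping in this sketch).
        # An undefined name raises KeyError instead of failing silently.
        return str(variables[name])
    return _TOKEN.sub(substitute, template)

link = render('<a href="`url`">`text`</a>', {"url": "/home", "text": "Home"})
script = render("function() {a}", {})
```

With this, link comes out as <a href="/home">Home</a>, while the inline "function() {a}" passes through untouched.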

But but...

Yeah yeah, I already hear some people cringing at this. It looks "weird". Whatever.

We're designing our view code, so if you don't like the traditional symbol-parsing-oriented approach, we can take the other route and go with conventions. Like so:

<a href="SS_URL">SS_TEXT</a>

If we start from the assumption that "SS_" is a prefix that can only appear before server-side variables, then the chances of colliding with javascript and content are minimal. You'll still need a mechanism for escaping, though. Never forget that you could be blogging about "SS_" conventions.

We could again define an escaping pattern, or you could take a risk and go lispy, i.e. have undefined symbols return the name of the symbols themselves. Basically:

//content
SS_MYVAR does <a href="SS_URL">SS_TEXT</a>
//pseudocode
if (SS_MYVAR == undefined) return "SS_MYVAR";

Of course, you could run into trouble later if you're going with the last approach, so make sure you understand your requirements before bashing the ` symbol too much.
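
The lispy fallback can be sketched like this, in Python for illustration. The render_ss function and the rule that names are "SS_" followed by capitals, digits or underscores are assumptions I'm making for the example.

```python
import re

# Assumed convention: server-side names are SS_ plus capitals, digits or underscores.
_SS_NAME = re.compile(r"\bSS_[A-Z0-9_]+\b")

def render_ss(template, variables):
    """Substitute SS_ names; an undefined name falls back to its own text."""
    def substitute(match):
        name = match.group(0)
        return str(variables.get(name, name))
    return _SS_NAME.sub(substitute, template)

content = 'SS_MYVAR does <a href="SS_URL">SS_TEXT</a>'
page = render_ss(content, {"SS_URL": "/home", "SS_TEXT": "Home"})
```

Here SS_MYVAR survives as literal text because nothing defines it. One way this can bite you: a typo such as SS_ULR would quietly end up in the page instead of raising an error.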

What about code?

So we have a few ideas on how to make variables easier to read, but what about code? It doesn't look incredibly clean when tangled up with HTML in the view, and it certainly doesn't belong in the controller layer.

There are a few ideas that I think are interesting to explore:

XSLT or any home-brewed variation of it: XSLT is supported natively by most popular languages. For custom languages, assuming your framework can parse HTML, parsing a <foreach> tag or attribute and building your DOM programmatically from these rules should not be hard. As a matter of fact, the so-called "tag soup" does just that, except that it doesn't always respect the DOM rules (e.g. <<%=0 < 1 ? 'br' : 'hr'%> /> is not something you'll want to feed into an (X)HTML parser). The problem with leaving ifs and fors in the view is that some types of layouts create a lot of clutter: for example, when you need to add a class name to every third item of a list, the simplest solutions usually involve either putting some of the view logic in the controller or peppering the view with counter variables and increment calls.
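
For the every-third-item case, one way to keep the counters out of the template is a small view helper. This is only a sketch in Python: with_row_classes, its parameters and the "third" class name are all made up for illustration.

```python
def with_row_classes(items, every=3, class_name="third"):
    """Pair each item with the class name it should get, or "" for none."""
    return [
        (item, class_name if position % every == 0 else "")
        for position, item in enumerate(items, start=1)
    ]

rows = with_row_classes(["a", "b", "c", "d", "e", "f"])
```

The view then only loops over (item, class) pairs and never touches a counter. Whether a helper like this counts as view code or controller code is debatable, which is sort of the point.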

Unobtrusive code: if you do javascript, you must have heard of this. Basically, code is separated from the markup and functionality is attached via query languages, much like how CSS is applied to HTML. CSS selectors and XPath are popular DSLs used to accomplish this. The beauty of doing things this way is that now your view is language agnostic (at least as far as if statements and for loops go) and you're free to use expressive language features in your DOM manipulation layer, if you have these features and can afford to use them. The downside is that now you have an extra step to think about: you need to declare variables in the controller, then reference them in the DOM manipulation code, then define conventions so that the manipulation layer knows what data goes in which tag in the view.
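
A rough server-side sketch of the unobtrusive idea, using Python's standard xml.etree.ElementTree and its limited XPath support. The VIEW markup, the bind_links function and the "item"/"link" class names are conventions I invented for the example, and it assumes the view is well-formed XML.

```python
import copy
import xml.etree.ElementTree as ET

# Plain markup: no variables, no loops, only class names used as hooks.
VIEW = '<ul><li class="item"><a class="link" href="#">placeholder</a></li></ul>'

def bind_links(view, links):
    """Clone the template <li> once per (url, text) pair and fill it in."""
    root = ET.fromstring(view)
    template = root.find("./li[@class='item']")
    root.remove(template)
    for url, text in links:
        item = copy.deepcopy(template)
        anchor = item.find(".//a[@class='link']")
        anchor.set("href", url)
        anchor.text = text
        root.append(item)
    return ET.tostring(root, encoding="unicode")

menu = bind_links(VIEW, [("/home", "Home"), ("/about", "About")])
```

The view stays pure markup and the loop lives in ordinary code, but notice the extra convention: the manipulation layer has to know that "item" repeats and that "link" receives the URL and text.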

One can go a step further and eliminate the markup altogether. You could now abstract away the gory entrails of HTML (I believe that's what Objective-J is aiming to do, correct me if I'm wrong). The trade-off with that approach is that you risk running into leaky abstractions: you could get burnt by whitespace-related issues like HTML compression with white-space:pre in the CSS, IE bugs due to the CSS "display" rule or whatever.

Cook up your own language: this is an interesting choice. Perhaps the most famous example of this approach is HAML. DSLs can make code incredibly concise and expressive, but they arguably have the steepest learning curve of the three and are not necessarily readable when, for example, an intern needs to go in and translate an error message in a form. Also, again, abstraction brings about risks, so weigh them carefully.

You're stupid, my way is better

We've all heard that line when bringing up new ideas.

Designing stuff is hard, especially when other people actually have to use it. It doesn't make a whole lot of sense to even try anything here if you're in a team of stuck-up zealots, and by no means are these suggestions supposed to be exhaustive.

Views are a specific domain, and view code is a domain specific language. Languages are tools and developers are their users.

Users are niches to be targeted, and at the same time, they have disparate opinions that you need to listen to. That is the tao of software development :)

update: fixing some typos
