Keith's Blog: 2009

Tuesday, September 22, 2009

The better way

http://www.realworldtech.com/forums/index.cfm?action=detail&id=67226&threadid=66595&roomid=2

This link describes _exactly_ what I think is the future of OS design. Andrew Tanenbaum has said about Minix 3, "a bug in a driver [...] cannot bring down the entire OS." That's a big claim, and these types of claims are what make microkernels sound attractive.

But as Linus has argued so very often, distributed algorithms are not easy, and so microkernels will never supplant traditional kernels.

But there may be a better way. Get rid of the idea that the only way to protect the kernel is to run untrusted code in a user process. New idea: untrusted code is written in a language that can restrict, either statically or at runtime, what that code is allowed to do, so that it can't do anything really bad.

When such code is loaded into the kernel, the kernel's "trusty" (no pun intended) compiler combs through it, adding any necessary bounds checking on any pointer dereferences, and bingo. You can make the same guarantee in a traditional-ish kernel as a microkernel. (The only thing non-traditional about this new type of kernel would be the inclusion of said compiler.)
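To make the load-time idea concrete, here is a toy sketch in Python. It is not taken from Minix, Linux, or any real kernel: the tiny instruction set (LOAD, STORE, ADD, CHECK) and the helpers `instrument` and `run` are made up for illustration. The "loader" walks the untrusted code and inserts a bounds check before every memory access, and the interpreter then refuses any access outside the region the code was given.

```python
# Toy model only: untrusted "driver" code is a list of (op, arg) tuples.
# The op names and both helper functions are hypothetical.

class BoundsError(Exception):
    """Raised when instrumented code touches memory outside its region."""


def instrument(program):
    """Return a copy of program with a CHECK inserted before each memory access."""
    checked = []
    for op, arg in program:
        if op in ("LOAD", "STORE"):
            checked.append(("CHECK", arg))
        checked.append((op, arg))
    return checked


def run(program, memory):
    """Interpret an instrumented program against its own memory region."""
    acc = 0
    for op, arg in program:
        if op == "CHECK":
            # Reject negative addresses too; Python would otherwise wrap them.
            if not 0 <= arg < len(memory):
                raise BoundsError(f"address {arg} outside region of size {len(memory)}")
        elif op == "LOAD":
            acc = memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "ADD":
            acc += arg
        else:
            raise ValueError(f"unknown op {op!r}")
    return acc


region = [10, 20, 30]
well_behaved = instrument([("LOAD", 0), ("ADD", 5), ("STORE", 2)])
run(well_behaved, region)  # region becomes [10, 20, 15]

buggy = instrument([("LOAD", 7)])
try:
    run(buggy, region)
except BoundsError as err:
    print("rejected:", err)
```

The buggy code fails with an error of its own instead of reading someone else's memory, which is the kind of guarantee the paragraph above is after. A real system would do this on compiled code or bytecode and would try to prove many checks unnecessary ahead of time.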

Wednesday, June 24, 2009

Appropriate OO in Multi-Person Projects

After my last couple of posts, it hit me that it kinda sounds like I'm advocating no OO whatsoever in multi-person projects. That is not true. I just think it shouldn't be the organizing principle on such projects.

But, I do think that little entities that live a life of their own and are not integral to the program can be encapsulated as objects.

In other words, if I were part of a team writing a word processor, I wouldn't create a class called "document." I would, however, decide how a document would be stored in memory (it should be a well-understood data structure for everyone on the team), but I wouldn't use a class. I would implement a collection of functions to operate on different parts of a document, but these wouldn't be object methods.
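A minimal sketch of what that might look like, in Python. The layout (a dict with a title and a list of paragraph strings) and every function name here are assumptions made up for the example, not a real word processor's design. The point is only that the structure is out in the open and the operations are ordinary functions.

```python
# Hypothetical in-memory layout that everyone on the team knows:
#   {"title": str, "paragraphs": [str, ...]}

def new_document(title):
    """Create an empty document in the agreed-upon layout."""
    return {"title": title, "paragraphs": []}


def append_paragraph(doc, text):
    """Add a paragraph to the end of the document."""
    doc["paragraphs"].append(text)


def word_count(doc):
    """Count whitespace-separated words across all paragraphs."""
    return sum(len(p.split()) for p in doc["paragraphs"])


def find_paragraphs(doc, needle):
    """Return the indices of paragraphs that contain needle."""
    return [i for i, p in enumerate(doc["paragraphs"]) if needle in p]


doc = new_document("Draft")
append_paragraph(doc, "Microkernels isolate drivers.")
append_paragraph(doc, "Monolithic kernels share one address space.")
print(word_count(doc))                  # 3 + 6 words
print(find_paragraphs(doc, "drivers"))  # only the first paragraph matches
```

Anyone who needs a new operation writes another function against the same layout; nobody has to negotiate a change to a class interface first.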

On the other hand, some small things have a limited amount of information associated with them and are somewhat peripheral to the primary concerns of the program. Take a "string," for instance. It just has an array of characters and a length (and possibly a handful of other metadata). It may, therefore, be appropriate to have a "string" class.

I guess my rule of thumb would be, if it is one of the data structures that defines the application, don't hide it in a class. If it is a data structure for which a CPAN module might exist (Perl), then it is probably appropriate to use a class.

Tuesday, June 23, 2009

Here are a couple of super-lame graphics to illustrate the difference between separating programmers in a large project using OO isolation mechanisms, vs letting everyone know and hack on a single data structure (using code review and testing for protection rather than isolation).

The circles are programs or portions of a program, and the squares are programmers.

First, OO:

Next, better:

In the OO case, you have portions of the program that are hard-separated (separated via the OO system). It is difficult to change the interactions between them, even when those interactions are wrong, and the communication necessary to do so is expensive.

In the better case, the program is more monolithic, but has well understood data structures. Everybody hacks on the same code base (their own copy of it, of course). (Functions, files and namespaces may be used to hierarchically separate portions.) Each programmer submits his changes to a code-reviewer (more senior programmer). If he is interacting correctly with the data structures, his changes are incorporated. The project can organically become whatever it needs to become.

Multi-Person Projects

I've heard the claim that OO helps manage large multi-programmer projects. False (usually).

In a previous entry, you may notice that I argue that inheritance reduces flexibility. I will now take that one step further and say that most aspects of a rigid OO system reduce flexibility.

Aside: I concede that OO can provide a lot of code savings, as well as a higher-level way to talk about your program (which may help readability). This higher-level treatment helps organize the code, but it does nothing to ensure sane data structures.

But there is a fundamental problem with OO (which, incidentally, may also be its biggest strength). You can hide implementations. In fact (and this is scary), you can hide data structures. As long as you provide the promised interface, you can really screw up the internals and get away with it.

Oh, and what happens when you need to change the interface of your class? Big teams that use OO usually divide programmers and teams along class interfaces. That way, one programmer can work on one side of an interface, another programmer can work on the other side, and the two never have to speak (which is good because programmers are introverts). But what happens when they do have to speak? What happens when you inevitably discover that the interface was not quite right? Communication about changing an interface is usually not cheap. And the nature of OO is such that the interface must be decided up front.

There is a better way. Do you want a project that many people can work on without unresolvable conflicts? Do this.

(1) Get the data structures right. Your application should be defined by its data structures. If the data structures change, you have a different application than the original. If the code changes, but the data structures remain unchanged, then it's still the same program.

(2) Teach the data structures to all programmers so that everyone can work with them confidently.

(3) Use code-review to make sure someone isn't improperly mucking with some data structure. (This is in contrast to using a pre-encoded class structure to force everyone to play nicely.)

This keeps the sanity and structure of the program at the human level, and out of the code. That is, the compiler is not used to enforce any rigid structure. You don't end up making ridiculous decisions because you chose some class interface incorrectly and lack the resources to change it.
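One way to back up code review with testing is to write the structure's rules down as a plain function that tests (or a reviewer) can run against any change. The sketch below is hypothetical: the layout (a text string plus a list of `(start, end, name)` style runs) and the `check_document` helper are invented for the example. It assumes the team has agreed that style runs are sorted, non-overlapping, and stay inside the text.

```python
# Hypothetical shared layout:
#   {"text": str, "styles": [(start, end, name), ...]}
# Agreed rules (assumed for this sketch): runs are sorted, do not overlap,
# and each run satisfies 0 <= start < end <= len(text).

def check_document(doc):
    """Return a list of rule violations; an empty list means the structure is sane."""
    problems = []
    text, runs = doc["text"], doc["styles"]
    prev_end = 0
    for i, (start, end, name) in enumerate(runs):
        if not 0 <= start < end <= len(text):
            problems.append(f"run {i} ({name}) has bad range {start}:{end}")
        if start < prev_end:
            problems.append(f"run {i} ({name}) overlaps or precedes the previous run")
        prev_end = max(prev_end, end)
    return problems


good = {"text": "hello world", "styles": [(0, 5, "bold"), (6, 11, "italic")]}
bad = {"text": "hello world", "styles": [(0, 5, "bold"), (3, 8, "italic")]}

print(check_document(good))  # no violations
print(check_document(bad))   # reports the overlapping run
```

Nothing stops a programmer from writing to the structure directly; the check just makes a mistake show up in a test run or a review instead of in production.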

In support of these arguments, look at the code in the "Git" SCMS and the Linux kernel. The data structures are well understood by those working on the projects. The code can get a bit ugly, but it stays flexible. The only things that can hold the project back are (1) the intelligence (and number) of those working on it, and (2) the quality of the data structures, not arbitrary class definitions.

Now, having said all this, I will concede that I use OO all the time on my one-man projects due to the code savings. Since I'm the only one working on all sides of any class interface, it is cheap to change anything that needs to change. So these arguments really apply to multi-programmer projects.
