Hacker News Clone

The Brittleness Of Type Hierarchies

by singular on 5/8/2012, 12:05 PM with 73 comments

by mattiask on 5/8/2012, 2:20 PM
As I've become more experienced over the years I've come to believe that the proliferation of patterns to "fix" OOP problems (dependency injection, immutability, builder patterns, events, etc.) is a symptom of inherent flaws in the object-oriented model.
It feels like we're bending over backwards to fix a model that promotes lots of problematic designs while not doing much to resolve them (or to support basic things like concurrency/parallelism). We can never predict all possible problems, or predict the future, so no language can be "perfect", but a language could inherently be more agile and flexible. I wish I could say what a better model would be. I'd hazard a guess that it's something based more on composition and functional programming than on inheritance and classes. Perhaps even metaprogramming and/or code generation.
For now it seems OOP is the worst paradigm for programming, except all other paradigms of programming.
by madhadron on 5/8/2012, 3:02 PM
Apparently most of the readers have missed the point. He says up front that what he's describing doesn't really happen seriously in small pieces of code. The code example is an illustration, one that I thought was very clear.
As for a solution? The only purpose of inheritance or subtyping is polymorphism. You may be doing polymorphism in a very roundabout way (if (isa(X)) { ...get a field from X... }), but it's still polymorphism under the hood. There's actually a very good argument against inheritance for polymorphism: you can't straightforwardly write a statically typed, polymorphic max function. You have to introduce generics to the language. That way lies the Standard Template Library and generic functions a la Common Lisp or Dylan (which is a pretty wonderful world).
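A minimal Java sketch of the max point above (not from the original comment; the class and method names are made up for illustration). The first method uses a bounded type parameter; the second relies on subtyping alone, so the type check moves to run time:

```java
final class GenericMax {
    // With a bounded type parameter the compiler ties the argument types and
    // the result type together, so the caller gets back a T without casting.
    static <T extends Comparable<? super T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Subtyping alone: every argument is just "a Comparable". Mixed arguments
    // compile, fail only at run time with a ClassCastException, and the caller
    // has to cast the result back to the type it wanted.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Comparable maxBySubtyping(Comparable a, Comparable b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        int larger = max(3, 7);              // T is inferred as Integer
        String later = max("bond", "stock"); // T is inferred as String
        System.out.println(larger + " " + later);

        Integer viaCast = (Integer) maxBySubtyping(3, 7); // cast required
        System.out.println(viaCast);
    }
}
```

Calling maxBySubtyping(3, "stock") compiles fine and throws at run time, which is the kind of hole that generics close.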
Now, in implementation you may want some of the polymorphisms to be due to the same fields being in the same memory offset in all subtypes, which seems different, but why must it be? Why shouldn't it be a declaration about a family of types? I may have to go play with that...though I think it's equivalent to how it's done in Forth. So much seems to be.
by DanielBMarkham on 5/8/2012, 1:28 PM
Here's the thing: while you do not have to do a Big Design Up Front, there's no reason in the world why you can't have a lot of conversations around future behavior of the system as you go about working through your first few sprints.
While good Agile teams can do whatever is put in front of them, there is an implicit assumption in project work: if you start out building a securities system you're not going to be changing over to a system to feed and care for circus elephants in the middle of the project. That is, there is a fixed and limited set of nouns and relationships which comprise 90-95% of the problem domain that can easily be discovered simply by talking about the problem.
I'm not trying to disparage the author: this is a real problem. I'm just pointing out that mature teams cover the domain fairly completely in an informal fashion (perhaps a few hours of conversation spread out over a week or two) before writing anything. That's not design, that's just understanding the world of the customer. [Insert long rant here about how most programming teams have forgotten or hardly use any sort of analysis techniques]
Of course, the best of these teams still run into the same problem down the road, but it should be a pretty long ways down the road. Like years. If not, you probably never really understood what the hell you were doing in the first place. (Not the programming part, the part about fully understanding the user)
Type hierarchies can allow for flexibility easily. It's up to the team to spot where flexibility is going to be needed and put it in there. Brittleness is a risk just like any other project risk.
by jdlshore on 5/8/2012, 9:00 PM
I commented on the author's blog, but I thought people here might be interested as well:
I agree with Chris Parnin (in the comments of the author's blog)--this isn't a type hierarchy problem. It's an incremental design problem. It's true that inheritance should be used with caution, and this example (intentionally) overuses it, but the deeper problem seems to be that the author doesn't understand how refactoring and incremental design work.
Let's stipulate that your initial guesses about a domain will almost always be wrong. In this example, the author assumed that all securities will have an Isin, but it turns out they don't. Options are a type of security that doesn't have an Isin.
One solution is to hack Option as a subtype of Security. As the author shows, this leads to a big mess. A much better solution is to refactor as soon as you notice that the domain is wrong.
Here's how it works:
Step 1: Notice that Options are securities, but they don't have Isins. Observe that the domain model is wrong. Smack yourself on the forehead.
Step 2: Realize that Security is not in fact representative of all Securities. Rename it IdentifiedSecurity (or IsinSecurity, if you prefer). This is an automated refactoring in C# and Java, and will automatically rename all uses of the class as well.
Step 3: Create a new superclass called Security and move Description and Exchange to that superclass, if desired.
Step 4: Create Option as a subclass of the new Security superclass.
Step 5: Enjoy your improved design. Some parts of the application (such as Trade) will be too conservative and use IdentifiedSecurity when they could use Security; those are easily fixed on a case-by-case basis as needed.
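Roughly where steps 2 to 5 end up, sketched in Java rather than the article's C#. This is only an illustration: the fields and constructors are simplified guesses at the article's model, not its actual code.

```java
// Simplified stand-in for the article's Exchange type.
final class Exchange {
    final String name;
    Exchange(String name) { this.name = name; }
}

// Step 3: the new, more general superclass holds only what every security has.
abstract class Security {
    final String description;
    final Exchange exchange;
    Security(String description, Exchange exchange) {
        this.description = description;
        this.exchange = exchange;
    }
}

// Step 2: the old Security class under its new name; Isin lives here now.
class IdentifiedSecurity extends Security {
    final String isin;
    IdentifiedSecurity(String description, Exchange exchange, String isin) {
        super(description, exchange);
        this.isin = isin;
    }
}

class Stock extends IdentifiedSecurity {
    Stock(String description, Exchange exchange, String isin) {
        super(description, exchange, isin);
    }
}

// Step 4: Option is a Security, but it never acquires an Isin field.
class Option extends Security {
    final double strikePrice;
    Option(String description, Exchange exchange, double strikePrice) {
        super(description, exchange);
        this.strikePrice = strikePrice;
    }
}

// Step 5: Trade still asks for an IdentifiedSecurity after the rename. That is
// too conservative, and can be widened to Security when options need trading.
class Trade {
    final IdentifiedSecurity security;
    final long quantity;
    Trade(IdentifiedSecurity security, long quantity) {
        this.security = security;
        this.quantity = quantity;
    }
}

class RefactoringDemo {
    public static void main(String[] args) {
        Exchange exchange = new Exchange("Example Exchange");
        Security option = new Option("Example call option", exchange, 100.0);
        // option.isin would not compile: the type no longer promises an Isin.
        System.out.println(option.description + " on " + option.exchange.name);
    }
}
```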
For more about incremental design, see Martin Fowler's _Refactoring,_ Joshua Kerievsky's _Refactoring to Patterns,_ or the "Incremental Design" chapter of my book (http://jamesshore.com/Agile-Book/incremental_design.html). You can also see me aggressively apply incremental design in my Let's Play TDD screencast, here: http://jamesshore.com/Blog/Lets-Play .
by tomgallard on 5/8/2012, 12:55 PM
If I were solving this problem in C# I'd probably lean towards defining specific properties of objects in interfaces, rather than just through a type hierarchy.
So we might have the interface ISecurityWithIsin , IOption etc.
This has the advantage of allowing
a.) easy use of mocking, dependency injection for testing.
b.) Classes can implement more than one of these interfaces.
The question then becomes: where do you put your base, shared functionality (e.g. a method that is common to all stocks with Isin numbers)? Possibly this becomes another set of classes...
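One possible shape for this, sketched in Java rather than C# (so without the I prefix on interface names). All type and method names here are hypothetical, and the ISIN used is a placeholder. For the shared-functionality question, this sketch uses a default method on the interface, which needs Java 8 or later; a separate helper class would be the other obvious option.

```java
// Base capability: everything tradable can describe itself.
interface Instrument {
    String description();
}

// Only securities that really have an ISIN implement this interface.
interface SecurityWithIsin extends Instrument {
    String isin();

    // Shared behaviour lives on the interface as a default method.
    // The first two characters of an ISIN are its country code.
    default String isinCountryCode() {
        return isin().substring(0, 2);
    }
}

interface OptionContract extends Instrument {
    double strikePrice();
}

final class CommonStock implements SecurityWithIsin {
    private final String description;
    private final String isin;
    CommonStock(String description, String isin) {
        this.description = description;
        this.isin = isin;
    }
    public String description() { return description; }
    public String isin() { return isin; }
}

final class EquityOption implements OptionContract {
    private final String description;
    private final double strikePrice;
    EquityOption(String description, double strikePrice) {
        this.description = description;
        this.strikePrice = strikePrice;
    }
    public String description() { return description; }
    public double strikePrice() { return strikePrice; }
}

final class IsinReport {
    // Depends only on the capability it needs, so a test can pass in a stub
    // and an EquityOption cannot be passed here by mistake.
    static String line(SecurityWithIsin security) {
        return security.isinCountryCode() + " " + security.description();
    }
}

final class InterfaceDemo {
    public static void main(String[] args) {
        System.out.println(IsinReport.line(new CommonStock("Example stock", "XX0000000000")));
    }
}
```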
by hamidpalo on 5/8/2012, 1:18 PM
```
    We are faced with the dilemma - a lot of the code is now reliant on Isin, and NullReferenceExceptions are getting thrown all over the place because the field isn’t getting populated
```
Sounds like it's catching bugs. You get to fix all the Isin places in a single unit test pass. Or you change the types and it's all fixed by getting it to compile.
The examples provided by the author are absolutely horrible code. A type hierarchy can be more than two deep. How about adding another base type for securities with ISINs?
Moreover, if you ever see code like security is Option, or a switch based on the name of the type, it is a great sign of poorly architected code.
The solutions provided aren't really solutions at all. How would functional programming solve the problem? If anything a lot of functional languages are even more rigidly typed than C#.
by anuraj on 5/8/2012, 1:02 PM
One of the fundamental principles of OO is to favor interface inheritance (design) over class inheritance (implementation). A better way to enable code reuse is through association. That is why languages like Java do not allow multiple class inheritance, but allow you to inherit from multiple interfaces. You are doing it the wrong way!
by carsongross on 5/8/2012, 1:47 PM
As always: It Depends (tm).
Type hierarchies have their place, but can be overused and abused. My typical approach is to be fairly conservative with base classes, and rely on them more for shared implementations rather than polymorphism. Gosu also supports composition (See http://lazygosu.org/ search for delegates) for shared implementations, but it is syntactically heavier-weight, even if it can be cleaner.
For polymorphism, I'm more inclined to use interfaces. I think there is a place for explicit (java-style) as well as go-style (implicit) interfaces.
The real culprit here is overdesign/premature abstraculation: you can go batshit early on in a project with almost any language feature and compromise your flexibility. Broadly, write as little code as possible, balanced with readability (e.g. don't go ape-shit with obscure macros) and using standard idioms, and let the underlying abstractions emerge when they are ready.
The older I get, the more I feel like less code is the most important thing by a long shot.
by joeyh on 5/8/2012, 4:44 PM
Here I've translated the example to haskell:
```
    data Exchange = Exchange { bic :: String, name :: String }
    data Security = Security { description :: String, exchange :: Exchange, isin :: String }
    data Stock = Stock { security :: Security }
    data Bond = Bond { security :: Security, expiry :: EpochTime }
    data Trade = Trade { price :: Decimal, quantity :: Decimal, security :: Security }
```
Now to add Option:
```
    data Option = Option { security :: Security, call :: Bool, lotSize :: Decimal, maturity :: EpochTime, strikePrice :: Decimal }
```
And in the example, the problem is that the Option uses Security, which has an isin, which doesn't make sense for Option. In haskell, this is a sort of problem which is typically fixed by adjusting the data types. There are many ways they could be changed, some will model the domain better than others. Let's just make the same quick fix used in the example, of allowing isin to not be set:
```
    data Security = Security { description :: String, exchange :: Exchange, isin :: Maybe String }
```
This means that isin is Nothing or Just a String. As soon as this change is made, every place in the program that directly accessed the isin will fail to compile. Fixing the compilation errors will involve adding a case to handle isin-less Securities.
```
    - foo (Security { isin = i }) = ...
    + foo (Security { isin = Just i }) = ...
    + foo (Security { isin = Nothing }) = ...
```
The code does become somewhat ugly with these cases, but you know every case has been covered, and that it will work.
Maybe later it's decided to go back and fix it to use the separation between physical and derivative securities that was originally considered but not done due to lack of time. It could then look like this:
```
    data Security = PhysicalSecurity { description :: String, exchange :: Exchange, isin :: String } 
                  | DerivativeSecurity { description :: String, exchange :: Exchange }
```
Again this type change would drive a pass through the code, fixing it up to compile.
```
    foo (PhysicalSecurity { isin = i }) = ...
    foo (DerivativeSecurity {}) = ...
```
Again you'll know when you're done because the program will successfully compile. In this case, splitting the data type seems to have led to better, clearer code. It might be worthwhile to factor out a helper type to simplify the Security type:
```
    data SecurityBase = SecurityBase { description :: String, exchange :: Exchange }
    data Security = PhysicalSecurity { base :: SecurityBase, isin :: String }
                  | DerivativeSecurity { base :: SecurityBase }
```
Although you may find this complicates other things as you "follow the types" and change the code to match. There are surely other approaches; so far this has stuck with simple data types, but typeclasses could also be used. You may want to constrain Bonds to using a PhysicalSecurity, and Options to using a DerivativeSecurity, and there are various ways that could be enforced. And so on.
What was surprising to me coming to haskell from a background in loosely typed languages (and low-level langs like C) is that the types are not a straitjacket that is set in stone from the start, but ebb and flow as you refine your understanding of the problem domain. What well-chosen types in haskell do constrain is the mistakes you want to be prevented from making. These days if I find myself repeatedly making a mistake in my code, I adjust the types to prevent that sort of mistake in the future.
---
Side note: The above code will not compile as written, because it exposes an annoying problem in haskell's record syntax. There are several fields named "security" that conflict with one another. This is typically dealt with by using ugly field names (stockSecurity, bondSecurity, tradeSecurity, optionSecurity), or more advanced things like lenses, or by putting the data types in separate modules and using module namespacing.
by pnathan on 5/8/2012, 4:41 PM
I spent a few years using type hierarchies intensely in the early 00s and found the experience excruciatingly bad. The crystalline structure of your types quickly shatters on the shoals of reality and you are left taping the pieces together. After a particularly bad experience I generally stopped writing OO beyond simple structs.
Around 2010 I started reading rpg's writings on Lisp and software development; that opened my thoughts to a different thought process of how to design software with objects that I haven't really finished working out.
I do agree with you: the C++ modality of inheritance doesn't really work in many cases. It's a tool, but a tool that works badly often. I think a more CLOS or Haskellian viewpoint will yield better results in the long run.
by aphyr on 5/8/2012, 5:49 PM
I'm in favor of decoupling data structure from interface entirely, via records + protocols. Hierarchies have their place, but ultimately can't deal with cross-cutting concerns. Mixins with structural typing is one way to approach the problem, but for formal contracts I prefer Clojure's approach: http://www.ibm.com/developerworks/java/library/j-clojure-pro...
by achy on 5/8/2012, 2:40 PM
How can he write such an article, stating that he used C# because he knows it, and not tackle the problem using the main resource for such issues in C# / Java: interfaces? Using interfaces, you can decouple all of those classes from each other, and never have this issue in the first place. If the stated model is the way he would typically tackle a problem in C#, then there are fundamental issues with his choices, which is not a failing of the type system.
by stephenjudkins on 5/8/2012, 8:24 PM
I'm surprised he didn't mention typeclasses as a solution to this general problem. In many cases it's an unambiguously better solution than inheritance. A particular strength is that typeclasses can be easily defined or overridden at call-sites as easily as at where data types are defined. OOP forces an uncomfortably close complecting of data and operations on that data, leading to the difficulties enumerated in this blog post.
by darklajid on 5/8/2012, 9:12 PM
I stopped when I read this:
```
  Functional programming is enjoying a great upswing in interest
  and popularity these days. I wonder whether the stronger type
  systems of these languages...
```
Functional programming == stronger type system? I thought I could use JS in a functional way without a lot of safety nets. Clojure isn't Scala. Is he right? What am I missing?
by Bjartr on 5/8/2012, 2:21 PM
"Or do we find some other, less salubrious way around the problem?"
Since salubrious means "good" or "healthy", this statement doesn't make much sense. If you're going to use words your readers are likely going to have to look up, at least use them correctly.
EDIT: at least, it doesn't make sense insofar as I understood the intent of the sentence.
by Scramblejams on 5/8/2012, 3:06 PM
Could someone who understands CLOS well weigh in on how this problem might be approached from that point of view?
by mariusmg on 5/8/2012, 3:27 PM
Yeah, because it's so hard NOT to use inheritance.....
by algolicious on 5/8/2012, 3:55 PM
I'm not quite sure what the issue is here. It turns out that the author modeled the domain incorrectly. At least that incorrect model is completely explicit in the code. If it weren't spelled out explicitly, the coupling that the author speaks of would be insidiously spread throughout the code. In order to make the change that the author wants, it's as easy as introducing a new abstract class. In fact, if all the existing code correctly assumes the existence of an Isin, we can create the following set of classes:
abstract class BaseSecurity { public string Description { get; set; } public Exchange Exchange { get; set; } }
then modify Security to derive from BaseSecurity:
abstract class Security : BaseSecurity { public string Isin { get; set; } }
Then you are done, except for two issues: first is that any serialized data needs to be regenerated, and second is that you can't trade BaseSecurities. However, this trading functionality can be written separately without disturbing the existing ecosystem of software. This is what your type hierarchy buys you.
On the other hand, if we insist that this is not correct, and Security should have Isin removed, then we can add a new PhysicalSecurity between Security and the various implementations, so that Stock and Bond inherit from PhysicalSecurity and Trade refers to a PhysicalSecurity.
In that case, the problem is that a lot of code was written with the incorrect assumption that an Isin exists in every security. Now we have to take a step back and ask how to fix that code on a case by case basis. No matter what language you use, it's always possible to write bad code with incorrect assumptions, and in that case you must pay the price. Hopefully you would be clear with your client on the delays required.
Now we can ask ourselves how a static language treats the situation differently than a dynamic language. The author seems to think a dynamic language would help, providing only praise in his description of them.
With a static language, we can simply remove Isin from the definition of Option. This will cause a lot of compilation failures. However, every place where there is a compilation failure is a place in the code which had an incorrect assumption. Each of these incorrect assumptions must be considered individually. After all, this represents the model for a trading system, and any bugs would likely result in severe financial consequences.
In a dynamic language, the definition could be changed, but there would not be any inherent mechanism to catch the now-incorrect calls. Instead, we would just get the NullPointerExceptions that the author complains about and which jeopardize the viability of the financial trading system. Perhaps the coders would have written beautiful unit tests that would help, but that could be the case in any static language as well. Of course, it's also possible that the coders would have created trivial unit tests or no tests at all.
In any case, I see this situation as a win for static type systems rather than a loss.
by tmitchel2 on 5/8/2012, 12:43 PM
The idea of composition over inheritance directly solves the issues discussed here... interfaces with a bit of DI work wonders.
http://en.wikipedia.org/wiki/Composition_over_inheritance
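A rough Java sketch of what that could look like here, with no common base class at all. Every name in it (Listing, ListedStock, ListedOption, ListingFormatter, Blotter) is invented for illustration, and the dependency injection is just a constructor argument rather than a DI framework.

```java
// The shared data becomes a component that other types hold, not a superclass.
final class Listing {
    final String description;
    final String exchangeName;
    Listing(String description, String exchangeName) {
        this.description = description;
        this.exchangeName = exchangeName;
    }
}

// A stock has a Listing and an ISIN.
final class ListedStock {
    final Listing listing;
    final String isin;
    ListedStock(Listing listing, String isin) {
        this.listing = listing;
        this.isin = isin;
    }
}

// An option has a Listing and a strike price; no ISIN field exists to be null.
final class ListedOption {
    final Listing listing;
    final double strikePrice;
    ListedOption(Listing listing, double strikePrice) {
        this.listing = listing;
        this.strikePrice = strikePrice;
    }
}

// Behaviour that varies is an interface handed in from outside.
interface ListingFormatter {
    String format(Listing listing);
}

final class Blotter {
    private final ListingFormatter formatter;

    // Constructor injection: tests can pass a trivial formatter.
    Blotter(ListingFormatter formatter) {
        this.formatter = formatter;
    }

    String describe(ListedStock stock) {
        return formatter.format(stock.listing) + " [" + stock.isin + "]";
    }

    String describe(ListedOption option) {
        return formatter.format(option.listing) + " strike " + option.strikePrice;
    }
}

final class CompositionDemo {
    public static void main(String[] args) {
        Blotter blotter = new Blotter(l -> l.description + " @ " + l.exchangeName);
        Listing listing = new Listing("Example call option", "Example Exchange");
        System.out.println(blotter.describe(new ListedOption(listing, 100.0)));
    }
}
```

The trade-off is that code handling "any security" now needs either overloads like the two describe methods or an interface over the shared Listing.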
by ehosca on 5/8/2012, 2:22 PM
It's a problem if you don't know your domain...
You have no business designing type hierarchies if you don't have a clue about the domain you are modeling.
by anuraj on 5/8/2012, 2:38 PM
I think one of the fundamental issues here is how the programmer views OO as a programming methodology alone. OO is more a collaboration tool which helps large teams come up with complex functionality. The architect or lead designer comes up with system level abstractions and module contracts. The module designer then comes up with module level abstractions and interfaces. Finally the programmer is supposed to code to the interface given to him. Thus large projects can be managed better as each person knows their roles and responsibilities and system can be thought of as composed of blackboxes.
This works only when the architect knows his job and module designers are good. Good programmers often do not make good architects (though good architects are often good programmers too). In OO, design comes first, second and third; implementation comes last. This creates a situation where programmers do not have enough work towards the beginning of the project. But as in any normal scenario, this textbook version works only 80% of the time. The remaining 20% are situations where we do not know the abstractions to begin with or implementation feasibility is questionable.
This is where I use the programming resources to prototype that 20% of the functionality while the design is going on in parallel. In cases where abstractions may change, keep them at a very high level and evolve the design over time. By providing hooks to refactor and evolve the design over time, you can insulate yourself against the future to some extent.
As long as every programmer is not forced to think in OO design terms and is given a simple contract of coding to the interface, it works. That said, good architects are rare; the job requires experience and expertise in abstract thinking, and most programmers do not end up as one.