Interfaces and the Need to Know Implementations

In many languages, interfaces and abstract classes serve as constructs for abstractions. Other languages rely on duck typing for the same purpose. Either way, we hide the underlying complexity of implementations behind simpler façades. Yet though we use these abstractions, all too often we don't trust them.

Developers frequently talk about how they can't use an abstract type. They need to know exactly what concrete type of object they are dealing with. But in many cases, we don't or even can't know. It's possible that it is some second- or third-party type that we've never seen. How could we possibly trust some unknown code that is just shoved willy-nilly into ours?

But do we really need to know the concrete runtime type? Knowing can simplify debugging. When I need to chase a bug, I need to know where to look. If it is in some indeterminate subtype, how will I be able to track it down? This is a legitimate concern. But tools help there. Modern debuggers will make the unknown subtype known. Well-defined, properly segmented abstractions can actually make debugging easier by eliminating where the bug isn't and isolating where the bug is. The other reason to know the runtime type is for managing performance. The difference between two implementations could be significant depending on the use. We would have to know which we are using to determine or guarantee a performance expectation.

So why can't we just trust the abstractions? When we are working with our own code, we feel we have to know what the real type is. When it comes to the standard library, though, we have an implicit trust. When we use Arrays.asList() in Java, we only care that it returns a List. We can look up the implementation, but that's not the type we get back. It theoretically can be any type of List that fulfills the specification. Similarly, we can write a method that accepts a Set as a parameter. We could get a HashSet, but we might also get a TreeSet. Ideally, we don't have to care which we get. There may be differences in performance, but that is the concern of the caller passing in the Set, as long as we specify in what ways performance might be affected. When we ask for a value out of a Map and get a null, we don't start rooting around in the code for HashMap to find out why we got a null. We assume we're doing something wrong in the calling code. Why should we treat our own abstractions differently?
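
As a small illustration of the point about Set, here is a sketch of a method that depends only on the interface. The class name TrustTheInterface and the method countKnown are made up for this example; only the java.util types are real.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class, invented for this sketch.
final class TrustTheInterface {

    // Relies only on the Set contract (contains), never on the concrete class.
    static int countKnown(Set<String> known, List<String> candidates) {
        int hits = 0;
        for (String candidate : candidates) {
            if (known.contains(candidate)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Arrays.asList promises a List; we never name the class behind it.
        List<String> candidates = Arrays.asList("a", "b", "z");

        Set<String> hashed = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> sorted = new TreeSet<>(hashed);

        // The caller chooses the implementation; the answer is the same.
        System.out.println(countKnown(hashed, candidates));
        System.out.println(countKnown(sorted, candidates));
    }
}
```

Both calls give the same count. The only thing that can differ is the cost of contains, and that choice belongs to the caller who built the Set.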

If we trust the abstractions of the standard library so implicitly, we should trust our own abstractions with the same confidence. How can we develop the same level of trust in our own code? We assume that the standard library is reliable because it is well-tested. This means both real-world testing and specified tests like unit tests, integration tests, etc. While real-world testing can only be achieved through experience, we can easily build the specified tests. Writing tests to the interface gives us confidence our code is doing what we expect. Using those tests to ensure functionality gives us confidence that our code works. These tests can be the wall at our back when we are asserting that our code lives up to our abstraction.
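
One way to write tests to the interface is a contract check that receives a factory and never names a concrete class. The sketch below uses plain Java rather than a particular test framework; the Counter interface, MapCounter, and CounterContract are all hypothetical names invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical abstraction, invented for this sketch.
interface Counter {
    void increment(String key);

    // Contract: returns 0 for a key that was never incremented.
    int count(String key);
}

// One possible implementation; the contract check below never names it.
final class MapCounter implements Counter {
    private final Map<String, Integer> counts = new HashMap<>();

    public void increment(String key) {
        counts.merge(key, 1, Integer::sum);
    }

    public int count(String key) {
        return counts.getOrDefault(key, 0);
    }
}

final class CounterContract {

    // Exercises only what the interface promises.
    static void check(Supplier<Counter> factory) {
        Counter counter = factory.get();
        require(counter.count("a") == 0, "unseen keys count as 0");
        counter.increment("a");
        counter.increment("a");
        require(counter.count("a") == 2, "each increment adds exactly 1");
        require(counter.count("b") == 0, "keys are counted independently");
    }

    private static void require(boolean ok, String rule) {
        if (!ok) {
            throw new AssertionError("Counter contract violated: " + rule);
        }
    }
}
```

Calling CounterContract.check(MapCounter::new) checks one implementation. Any future implementation can be passed to the same check without changing it.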

Tests are a great start, but they do not guarantee bug-free code. There are times when we do have to debug. Two additional conditions make this effort easier. First, we need to craft our exceptions to point directly to the problem. If there is an abnormal condition that we can identify in code, such as getting an unexpected null, we should not just let a generic NullPointerException escape (or worse, swallow the error). Instead, we should create a specific error message (if not a specific exception type) that points directly to the fact that we did not get back what we expected. Incorporating this level of exception management requires diligence, but pays dividends.
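
A minimal sketch of that first condition might look like the following. PriceCatalog and MissingPriceException are hypothetical names invented for the example; the point is only that the failure names what was missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception type, invented for this sketch.
final class MissingPriceException extends RuntimeException {
    MissingPriceException(String sku, int knownSkus) {
        super("No price configured for SKU '" + sku + "' (catalog holds "
                + knownSkus + " SKUs)");
    }
}

// Hypothetical class, invented for this sketch.
final class PriceCatalog {
    private final Map<String, Integer> centsBySku = new HashMap<>();

    void put(String sku, int cents) {
        centsBySku.put(sku, cents);
    }

    int priceInCents(String sku) {
        Integer cents = centsBySku.get(sku);
        if (cents == null) {
            // Without this check, unboxing the null below would throw a
            // NullPointerException that does not say which SKU was missing.
            throw new MissingPriceException(sku, centsBySku.size());
        }
        return cents;
    }
}
```

When the lookup fails, the message already tells us which key to investigate, so we start debugging in the calling code or the data rather than inside the map.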

The second condition for simplifying debugging is to write units so we can interrogate them in isolation. With the testing regime mentioned earlier, we should already be doing this to support simpler unit tests. Interrogating units in isolation allows us to localize problems. We can debug an individual implementation without needing to pull in the entire application. How can we do this without knowing the runtime type, though? We are programming to the interface, so all implementations should behave the same. We can write a test for the bug condition and run it against all implementations. They should all behave similarly, so we can find bugs by seeing where they differ from each other and from the expectations set forth in the abstraction. This lets the testing framework work for us to find the bugs. The abstraction gives us confidence and direction for the implementation.
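
Here is a sketch of running one bug condition against every implementation. The Tokenizer interface, its two implementations, and BugProbe are hypothetical and exist only for this example. The bug itself rests on real String.split behavior: splitting an empty string yields one empty element rather than none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction, invented for this sketch.
interface Tokenizer {
    // Contract: words separated by spaces; empty input yields no tokens.
    List<String> tokens(String input);
}

final class SplitTokenizer implements Tokenizer {
    public List<String> tokens(String input) {
        // Bug: "".split(" ") returns one empty string, not zero tokens.
        return Arrays.asList(input.split(" "));
    }
}

final class ScanningTokenizer implements Tokenizer {
    public List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c != ' ') {
                current.append(c);
            } else if (current.length() > 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }
}

final class BugProbe {

    // Runs the same bug condition (empty input) against each implementation
    // and reports the ones that break the contract.
    static List<String> failing(Map<String, Tokenizer> implementations) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Tokenizer> entry : implementations.entrySet()) {
            List<String> actual = entry.getValue().tokens("");
            if (!actual.isEmpty()) {
                failures.add(entry.getKey() + " returned " + actual.size()
                        + " token(s) for empty input");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, Tokenizer> implementations = new LinkedHashMap<>();
        implementations.put("SplitTokenizer", new SplitTokenizer());
        implementations.put("ScanningTokenizer", new ScanningTokenizer());
        failing(implementations).forEach(System.out::println);
    }
}
```

The probe never needs to know which implementation the application happened to use at runtime. The one that disagrees with the contract, and with its sibling, identifies itself.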

