Ask Joel IT: Static is Evil

Well, static is not evil, but how people use it sometimes is. My friend Susan has taught Java for years and will forever be remembered for the mantra that "static is evil." In an object-oriented boot camp that Susan taught, one group named themselves "Static is Awesome" just to make her say it! Spreading the message is a good thing. Not because static is evil, but because misusing static leads to brittle programs.

The word static means "fixed in position": it cannot be moved. In computer languages we have long applied the keyword static to "fix" the location of something. The how and why of fixing that location has changed over the years. Regardless of the particulars, something that is fixed is contrary to the principles of object-oriented programming, where dynamic and abstract structures create robust and adaptable programs! Let's take a look at why that happens...

What is static?

The C programming language is a structured, procedural language that relies on global data and functions. A modularization feature of C is to group data and functions together in a library that an application can link to. An identifier is a name for a variable or function. Global identifiers in one library can conflict with the identifiers in another library.

But C has a solution for variables and functions in a library that are not needed by other application code: the static keyword "fixes" the identifier so that it belongs only to the source file in the library that defines it. The result is that static keeps the compiler from exposing the identifier to the linker, so nothing outside of that file can see it, and that removes potential conflicts! That also creates the ability to "encapsulate" functionality in a library that nobody else can see or use. I frequently took advantage of that in programs that I wrote in the 1970s and early 1980s, long before C++.

When C++ came along, all of a sudden classes provided the encapsulation that I previously built using libraries. C++ is compiled into libraries linked to applications just like C, and the keyword static still hides global variables and functions just like it does in C. But while the compiler allows it, I really should not expect to find any global variables or functions in a robust C++ program!

In C++, Java, and C#, data and methods in a class are by default "instance" data and methods, bound to a specific object, an instance of the class. Every instance has its own copy of the data; each Customer, for example, has its own name.
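Here is a minimal sketch of instance data in Java. The class NamedCustomer and the names passed to it are hypothetical, made up only for this illustration; they are not part of the example projects.

```java
// Hypothetical class for illustration only.
class NamedCustomer {

    private final String name; // instance field: every object gets its own copy

    NamedCustomer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class InstanceDataDemo {

    public static void main(String[] args) {
        NamedCustomer first = new NamedCustomer("Ada");
        NamedCustomer second = new NamedCustomer("Grace");

        System.out.println(first.getName());  // prints "Ada"
        System.out.println(second.getName()); // prints "Grace"
    }
}
```

Setting the name on one instance has no effect on the other, because the field lives inside each object.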

Inside of a class declaration the meaning of the static keyword changes from C to C++, Java, and C#: now it is used to bind data or methods to the class itself instead of any instance. So it "fixes" the location to the class! But what really happens when we do that?

Well, if what is declared as static is private then it is shared across all instances of the class. In effect, it is a global identifier visible between the class instances. If what is static is public then the effect is to create a global identifier visible across the whole application!
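To make the difference concrete, here is a small sketch with one private and one public static field. The classes Registry and UnrelatedReport are hypothetical names invented for this illustration.

```java
// Hypothetical classes for illustration only.
class Registry {

    private static int created = 0;     // shared, but only visible to Registry code
    public static String region = "US"; // readable and writable from anywhere

    Registry() {
        created = created + 1;
    }

    int getCreated() {
        return created;
    }
}

class UnrelatedReport {

    String header() {
        // Nothing stops unrelated code from changing the public static field.
        Registry.region = "EU";
        return "Region: " + Registry.region;
    }
}
```

The private field behaves like a global that only Registry instances can see. The public field is a true application-wide global: UnrelatedReport can change it without ever holding a Registry instance.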

Here the count field is static and shared between the instances of the class. To make that work a single instance of the variable is created in memory, and all the Customer instances reference it. Here is part of the Customer class coded in Java:

public class Customer {

@tab;private static int count = 0;
@tab;private String name;

@tab;public Customer() {

@tab;@tab;count = count + 1;
@tab;}

@tab;public int getCount() {

@tab;@tab;return count;
@tab;}

@tab;...
}

Note that the class constructor increments the shared count field by one. Here is a test method that verifies that getCount returns two in both instances after two instances are created:

@Test
public void countIsStatic() {

@tab;int expectedCount = 2;
@tab;Customer customer1 = new Customer();
@tab;Customer customer2 = new Customer();

@tab;assertEquals(expectedCount, customer1.getCount());
@tab;assertEquals(expectedCount, customer2.getCount());
}

It is the idea of something being "global" that is so offensive in object-oriented programming. Why? Because global variables or functions are a throwback to structured and procedural programming. The client code that uses a static field or method has a solid, direct dependency on it. And when you have a dependency like that it is very difficult to adapt to changes down the road.

Can we eliminate static fields and methods entirely? No. There are four places where it is very appropriate to use them:

To support data shared across class instances.
To define constants that must be used across class instances or the whole application.
To group related but un-cohesive methods together: e.g. the Math class in Java and C#.
To define a global entry point to find something: e.g. a method to locate a singleton instance.
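The four uses above can be sketched in a few lines of Java. Every class name here (Ticket, TaxRules, AppSettings) and the tax rate are hypothetical, chosen only to illustrate the list.

```java
import java.math.BigDecimal;

// Hypothetical classes for illustration only.
class Ticket {

    private static int nextId = 1; // 1. data shared across class instances

    final int id;

    Ticket() {
        id = nextId;
        nextId = nextId + 1;
    }
}

class TaxRules {

    // 2. a constant used across the whole application (the rate is made up)
    static final BigDecimal DEFAULT_RATE = new BigDecimal("0.07");

    private TaxRules() { } // utility class, never instantiated

    // 3. a stateless helper grouped with related helpers, in the spirit of Math
    static BigDecimal taxOn(BigDecimal amount) {
        return amount.multiply(DEFAULT_RATE);
    }
}

class AppSettings {

    private static final AppSettings INSTANCE = new AppSettings();

    private AppSettings() { }

    // 4. a global entry point used to locate the singleton instance
    static AppSettings getInstance() {
        return INSTANCE;
    }
}
```

In each case static holds something that really does belong to the class rather than to one object, and none of the static methods is standing in for behavior that should be polymorphic.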

When static is a bad idea!

I always try to teach a subject by reinforcing the positive reasons to do something, so approaching static with examples of what is bad is backwards for me. But I think it works better here because I can examine some fundamental mistakes that I have seen made, and explain why they are mistakes.

Static methods because main is static

public class MyProgram {

@tab;public static void main(String [] args) {

@tab;@tab;display(args);
@tab;}

@tab;private static void display(String [] args) {

@tab;@tab;for (String arg : args) {

@tab;@tab;@tab;System.out.println(arg);
@tab;@tab;}
@tab;}
}

This example is an abbreviated version of what I see all the time: the methods that main calls are made static so that main can find them. The problem is that this is not object-oriented programming; it is procedural programming disguised as a class. And it is a direct result of what I have seen written in more than one book: static methods can only refer to static methods and data in a class.

Those authors need to state their intent better: the variable "this" ("Me" in Visual Basic .NET) does not exist in a static method because there is no inherent instance. But given a reference to an instance of the class, a static method can reach everything non-static in the class, even things marked as private, because after all the static method is still a member of the class.

public class MyProgram {

@tab;public static void main(String [] args) {

@tab;@tab;MyProgram program = new MyProgram();

@tab;@tab;program.display(args);
@tab;}

@tab;private void display(String [] args) {

@tab;@tab;for (String arg : args) {

@tab;@tab;@tab;System.out.println(arg);
@tab;@tab;}
@tab;}
}

So we used a static method main to kick the program off because Java forces us to. But then we turned around, created an instance of the class, called a private method using that instance, and everything else going forward works in an object-oriented way.

Static methods instead of a singleton

We know that creating instance after instance of a class that has no data just to get the functionality is a waste of resources. Even without data, memory still needs to be allocated and eventually released, and repeatedly allocating and releasing a small amount of memory is very inefficient. A good example is a payment provider that processes a credit card transaction; the class provides the authorization but does not store any data.

In the next example every Sale instance can share the same payment provider. So, what if we make the methods static? Then we never need to create an instance!

The authorizePayment method is underlined in the UML diagram indicating that it is static. Here is what the code would look like:

public class CardInfo {

@tab;private String name;
@tab;private String number;
@tab;private int ccv;
@tab;private java.sql.Date expires;

@tab;public String getName() { return name; }
@tab;...
}

public class IntuitPaymentProvider {

@tab;public static boolean authorizePayment(BigDecimal amount, CardInfo cardInfo) {

@tab;@tab;...
@tab;}
}

public class Sale {

@tab;public boolean pay(BigDecimal amount, CardInfo cardInfo) {

@tab;@tab;return IntuitPaymentProvider.authorizePayment(amount, cardInfo);
@tab;}
}

Unfortunately we just built a procedural program again: we created static methods because we wanted global functions. And since IntuitPaymentProvider was the only payment provider we modeled the pay method in the Sale class around it. So when the boss says we need to change to a new provider we will have to go in and adjust all the client code that uses the static methods.

public class PaypalCardAuthorization {

@tab;public static boolean submit(String cardNumber, String name, int ccv, int month,
@tab;@tab;int year, BigDecimal amount) {

@tab;@tab;...
@tab;}
}

public class Sale {

@tab;public boolean pay(BigDecimal amount, CardInfo cardInfo) {

@tab;@tab;Calendar cardExpires = Calendar.getInstance();

@tab;@tab;cardExpires.setTimeInMillis(cardInfo.getExpires().getTime());

@tab;@tab;int month = cardExpires.get(Calendar.MONTH);
@tab;@tab;int year = cardExpires.get(Calendar.YEAR);

@tab;@tab;return PaypalCardAuthorization.submit(cardInfo.getNumber(), cardInfo.getName(),
@tab;@tab;@tab;cardInfo.getCcv(), month, year, amount);
@tab;}
}

You can see that we had to modify a big chunk of the Sale class to get it to talk to the new PayPal provider we are using. The bad part is that this violates the design principle "open for extension, closed for modification." The point of that principle is that when you open up a class to modify it because a dependency changes, you risk breaking that class and all the other code that depends on it. We cannot always avoid opening up a class, but we certainly want to minimize doing it. The article this link references takes a closer look at Robert Martin's first five design principles, of which open/closed is the second principle (Martin).

The way to avoid changes in the Sale class is to design to an abstraction. We can build an interface that the Sale class will use, and then adapt whatever third-party module we choose to that interface. That explains the extra layer of adapter classes in the following diagram. The Sale class only knows about the abstraction PaymentProvider; it does not care what concrete class we give it to use as long as that class provides the interface.

The chances are good that we will need to change the provider in the future; after all, if the boss did it once it will happen again. So all we need to do is add a new adapter class on top of the third-party stuff (PaymentProvider is open for extension!) and then the Sale class can use it (Sale remains closed for modification!). We have just utilized the adapter pattern from the Gang of Four design patterns book (Gamma).

But... static does not agree with this pattern: you cannot inherit and override static methods, and static methods cannot implement an interface. So the PaymentProvider cannot be a class with static methods; it will not work. To achieve the polymorphism the adapters must provide the interface to the Sale class, so the methods cannot be static there either. If all the Sale instances need to share the same adapter then the adapter must be a real instance, a singleton. The article referenced by this link discusses how and why to build singletons in more detail. My immediate goal here was to show why we do not twist static methods into the singleton role.
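If you want to see for yourself that static methods are not overridden, here is a small sketch. The classes BaseProvider and DerivedProvider are hypothetical names invented for this demonstration.

```java
// Hypothetical classes for illustration only.
class BaseProvider {

    static String staticName() {
        return "base";
    }

    String instanceName() {
        return "base";
    }
}

class DerivedProvider extends BaseProvider {

    // This hides BaseProvider.staticName(); it does not override it.
    static String staticName() {
        return "derived";
    }

    @Override
    String instanceName() {
        return "derived";
    }
}

class StaticHidingDemo {

    public static void main(String[] args) {
        BaseProvider provider = new DerivedProvider();

        // Instance method: chosen at run time from the actual object.
        System.out.println(provider.instanceName());      // prints "derived"

        // Static methods: chosen at compile time from the class named in the call.
        System.out.println(BaseProvider.staticName());    // prints "base"
        System.out.println(DerivedProvider.staticName()); // prints "derived"
    }
}
```

Client code has to name the class to call a static method, so there is no way to hand it a different implementation behind the same abstraction.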

Here is what the PaypalPaymentAdapter and the Sale class look like for the model above:

public class PaypalPaymentAdapter implements PaymentProvider {

@tab;@Override
@tab;public boolean authorizePayment(BigDecimal amount, CardInfo cardInfo) {

@tab;@tab;Calendar cardExpires = Calendar.getInstance();

@tab;@tab;cardExpires.setTimeInMillis(cardInfo.getExpires().getTime());

@tab;@tab;int month = cardExpires.get(Calendar.MONTH);
@tab;@tab;int year = cardExpires.get(Calendar.YEAR);

@tab;@tab;return PaypalCardAuthorization.submit(cardInfo.getNumber(), cardInfo.getName(),
@tab;@tab;@tab;cardInfo.getCcv(), month, year, amount);
@tab;}
}

public class Sale {

@tab;public boolean pay(BigDecimal amount, CardInfo cardInfo) {

@tab;@tab;PaymentProvider paymentProvider = new PaypalPaymentAdapter();

@tab;@tab;return paymentProvider.authorizePayment(amount, cardInfo);
@tab;}
}

Of course, even though we store the PaypalPaymentAdapter in a variable of the abstract type PaymentProvider, the Sale class is still dependent on the concrete class PaypalPaymentAdapter in this example. The fix is a factory that moves the choice of the payment adapter outside of the Sale class, but that is more in-depth than we need to go here.
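Without going in-depth, here is a rough sketch of what such a factory could look like. Treat all of it as assumptions: the PaymentProvider interface is inferred from the adapter's @Override method, CardInfo is cut down to a stub, and FakePaymentAdapter and PaymentProviderFactory are hypothetical names that do not come from the example projects.

```java
import java.math.BigDecimal;

// Stub standing in for the article's CardInfo class.
class CardInfo {

    private final String number;

    CardInfo(String number) {
        this.number = number;
    }

    String getNumber() {
        return number;
    }
}

// The abstraction; the signature is assumed from the adapter shown earlier.
interface PaymentProvider {

    boolean authorizePayment(BigDecimal amount, CardInfo cardInfo);
}

// Hypothetical stand-in adapter: approves any positive amount.
class FakePaymentAdapter implements PaymentProvider {

    @Override
    public boolean authorizePayment(BigDecimal amount, CardInfo cardInfo) {
        return amount.signum() > 0;
    }
}

// Hypothetical factory: the only place that names a concrete adapter.
class PaymentProviderFactory {

    private static PaymentProvider provider = new FakePaymentAdapter();

    private PaymentProviderFactory() { }

    static PaymentProvider getProvider() {
        return provider;
    }

    // Lets configuration code or a test swap in a different adapter.
    static void setProvider(PaymentProvider replacement) {
        provider = replacement;
    }
}

class Sale {

    public boolean pay(BigDecimal amount, CardInfo cardInfo) {
        PaymentProvider paymentProvider = PaymentProviderFactory.getProvider();

        return paymentProvider.authorizePayment(amount, cardInfo);
    }
}
```

Yes, the factory uses static methods, but only as the global entry point for finding the shared provider instance. The authorization itself stays an instance method behind the interface, so Sale no longer mentions any concrete adapter.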

All of the examples from this article are set up as Java projects in the zip file StaticIsEvilExamples.zip^† for you. The projects were built in STS (based on Eclipse), but you should be able to look at the files with any text editor.

Wrap it up

These two examples are the most common misuses of static members that I see. If you understand why they should not be used then you can apply that to other situations as well. There are perfectly good reasons for using static: constants shared across instances or an application, the getInstance method in a singleton class to retrieve the one instance, etc.

The bottom line: static methods are attached to a class, and when used in client code they create a direct dependency between that code and the class. In an object-oriented program we want to use abstractions and polymorphism to remove as many dependencies as possible. So static and polymorphism simply do not agree with each other very well. Anything that is static just does not feel very object-oriented.

References

See the references page.

^† If downloading a zip file is blocked by security on your network try these tar files:

http://askjoelit.com/download/StaticIsEvilExamples.tar

Wednesday, June 25, 2014
