Constructors vs static methods in C#

In C#, we create objects by calling their constructor. Simple, just a “new Class()”. There are other ways to create objects though: one is to call a static method of that class, like “Class.Create()”. Why do such methods exist, and why can’t they be simple constructors?

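Let’s start with an example. We all know that you can initialize an integer with a simple int i = 5; But for the sake of argument, we’ll use the type behind int, the Int32 struct, and pretend that it has a constructor that takes a value. (The real Int32 has no such constructor; new Int32() only gives you zero.) With that imaginary constructor, you could initialize an Int32 using: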

var i = new Int32(5);

or by using the Parse method if the input parameter is a string:

var i = Int32.Parse("5");

Why isn’t the latter also a constructor with a string parameter? Why isn’t it simply:

var i = new Int32("5");

Why indeed. Let’s rewind a bit…

Why do we have constructors at all?

Because a meaningful initial state of an object may not be just a bunch of zeros. You might need some extra code to avoid an invalid state and bring the object to the desired initial state. Constructors incorporate your custom initialization code into the object creation ceremony using a concise syntax.
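To make that concrete, here is a minimal sketch of a constructor that refuses to produce an object in an invalid state. The Percentage class and its 0..100 rule are hypothetical, made up for this illustration:

```csharp
using System;

// Hypothetical type for illustration only.
public sealed class Percentage
{
    public int Value { get; }

    public Percentage(int value)
    {
        // All zeros would be a valid Percentage, but 150 would not:
        // reject it before the caller ever gets an instance.
        if (value < 0 || value > 100)
            throw new ArgumentOutOfRangeException(nameof(value), "Expected 0..100.");

        Value = value;
    }
}

public static class Program
{
    public static void Main()
    {
        var half = new Percentage(50);
        Console.WriteLine(half.Value); // prints 50

        try
        {
            new Percentage(150);
        }
        catch (ArgumentOutOfRangeException)
        {
            // The constructor threw, so no Percentage was handed out.
            Console.WriteLine("rejected");
        }
    }
}
```

The caller either gets a valid Percentage or an exception; there is no third outcome to defend against.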

Constructors are also atomic from the caller’s point of view. Under normal use, the compiler and runtime ensure that the code calling new never receives an instance that is allocated but not fully initialized, as long as the constructor itself doesn’t leak a reference to the object before it finishes. That’s handy for avoiding bugs.

Let’s imagine we didn’t have constructors. In that case, we would have to allocate and initialize the object separately, possibly like this:

var i = new Int32;
i.Init(5);

It would mean that between allocation and initialization, there would be an accessible reference to the newly created object, albeit in an invalid state. That’s unsafe for multi-threaded programming and prone to bugs, especially if the reference is accessible outside the scope that created it.
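Real C# lets you imitate that split with an ordinary class, which makes the problem easy to see. This is only a sketch; TwoPhaseName and its Init method are hypothetical names:

```csharp
using System;

// Hypothetical type that separates allocation from initialization.
public sealed class TwoPhaseName
{
    private string _value; // stays null until Init is called

    public void Init(string value)
    {
        _value = value ?? throw new ArgumentNullException(nameof(value));
    }

    public bool IsInitialized => _value != null;

    // Throws NullReferenceException if called before Init.
    public int Length => _value.Length;
}

public static class Program
{
    public static void Main()
    {
        var name = new TwoPhaseName();

        // The object exists and is reachable, but it is not usable yet.
        Console.WriteLine(name.IsInitialized); // prints False

        name.Init("Ada");
        Console.WriteLine(name.Length); // prints 3
    }
}
```

Any code that gets hold of name between the two calls sees a half-built object, and nothing in the type system warns about it.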

Constructors make our lives easier. They save us from having to name every piece of initialization code, because they are nameless in C#. The memory allocation and custom initialization are lumped together in a single, simple, atomic call.

What is Int32.Parse()?

Static methods that return an instance of the same class usually do more than set a few fields and perform simple validation. Int32.Parse() does a lot: it validates the string according to a specific culture by going over each character, and converts it to an integer using the provided options.
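A quick way to get a feel for those options is to call the overloads that take a NumberStyles value and a format provider. The inputs below are my own picks, and the dotGrouping format is a hand-built NumberFormatInfo rather than a real culture:

```csharp
using System;
using System.Globalization;

public static class Program
{
    public static void Main()
    {
        // Plain decimal digits, parsed with the invariant culture.
        int a = int.Parse("5", CultureInfo.InvariantCulture);

        // Same method name, different rules: hexadecimal digits.
        int b = int.Parse("FF", NumberStyles.HexNumber, CultureInfo.InvariantCulture);

        // A custom format in which '.' groups thousands and ',' marks decimals.
        var dotGrouping = new NumberFormatInfo
        {
            NumberGroupSeparator = ".",
            NumberDecimalSeparator = ","
        };
        int c = int.Parse("1.234", NumberStyles.AllowThousands, dotGrouping);

        // TryParse reports failure instead of throwing FormatException.
        bool ok = int.TryParse("five", NumberStyles.Integer,
                               CultureInfo.InvariantCulture, out int d);

        Console.WriteLine($"{a} {b} {c} {ok}"); // prints 5 255 1234 False
    }
}
```

The same string can mean different numbers depending on the style and format passed in, which is a lot of behavior to hide behind a nameless constructor.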

Then, let’s look at what would happen if we used new Int32(string) for parsing:

  • The behavior would be ambiguous. It would imply that Int32 supported a text representation of integers, possibly in another hidden field. We might hope that it would parse the string and convert it to an integer representation, but we wouldn’t know. Even if we assumed it would parse the string, we could not differentiate it from a method like, say, ParseHex(string).
  • It would imply little or no overhead. Constructors are expected to be lean: they do simple validation and initialization. We may not know that the whole globalization stack and complicated string parsing are at work when we initialize our object. Our poor coder might opt to use new Int32("5") without knowing that it is far slower than simply assigning the integer. It would scale badly and cause performance issues.
  • The intent may not be clear. C# constructors are nameless, which means they have no way to convey their behavior to the caller. On the other hand, the name “Parse” tells a lot. It implies that there is some processing going on behind the scenes, and that the processing follows some rules. The name belongs to a family of Parse() methods in other classes, so you can leverage your existing knowledge. You get a better idea about its performance characteristics and what it does. With a constructor, you’d have to be extra careful with the documentation.

I think all of these can be summarized as: it would be unclear and ambiguous.

When to use a static method?

It becomes easier to decide if we know what can go wrong when we try to solve all our problems with constructors. Static methods that return an object instance can be useful when:

  • The initialization code does significantly more work than simple assignment and/or validation, and the performance penalty is worth emphasizing with a name.
  • The intent is different than simply initializing a class so it needs to be conveyed to the programmer with a method name.
  • There may be multiple methods with the same set of parameters and they need to be differentiated. We wouldn’t have that problem though if we had used a language that supported argument labels like Swift, or arbitrary names for constructors like Object Pascal.
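The last point is the easiest one to show in code. In this sketch, the Temperature class is hypothetical: both factory methods take a single double, so they could never coexist as constructors, but as named static methods they are unambiguous:

```csharp
using System;

// Hypothetical type for illustration only.
public sealed class Temperature
{
    public double Celsius { get; }

    // Private, so callers must go through one of the named factories.
    private Temperature(double celsius)
    {
        Celsius = celsius;
    }

    public static Temperature FromCelsius(double degrees)
    {
        return new Temperature(degrees);
    }

    public static Temperature FromFahrenheit(double degrees)
    {
        // Standard Fahrenheit-to-Celsius conversion.
        return new Temperature((degrees - 32.0) * 5.0 / 9.0);
    }
}

public static class Program
{
    public static void Main()
    {
        var boiling = Temperature.FromFahrenheit(212.0);
        var freezing = Temperature.FromCelsius(0.0);

        Console.WriteLine(boiling.Celsius);  // prints 100
        Console.WriteLine(freezing.Celsius); // prints 0
    }
}
```

With two constructors taking a double, the compiler would reject the class outright; with names, the call site even documents which unit the argument is in.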

There is one more case when using a static method over a constructor might be preferable. Take Tuple for instance. You can initialize a Tuple with this code:

var tuple = Tuple.Create(1, 2, 3);

Why isn’t it just a constructor? Because C# does not support generic type inference on constructors. A static method allows you to omit the type arguments, unlike new Tuple<int, int, int>(1, 2, 3), and let the compiler work its magic. You might opt for static methods in such cases.
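You can apply the same trick to your own generic types by pairing the generic class with a non-generic static class of the same name, which is how Tuple and Tuple<T1, T2, T3> relate to each other. The Box types below are hypothetical and only sketch the pattern:

```csharp
using System;

// Hypothetical generic type for illustration only.
public sealed class Box<T>
{
    public T Value { get; }

    public Box(T value)
    {
        Value = value;
    }
}

// Non-generic companion: its generic method lets the compiler infer T.
public static class Box
{
    public static Box<T> Create<T>(T value)
    {
        return new Box<T>(value);
    }
}

public static class Program
{
    public static void Main()
    {
        // The constructor needs every type argument spelled out.
        var explicitTuple = new Tuple<int, int, int>(1, 2, 3);

        // The static method infers them from the arguments.
        var inferredTuple = Tuple.Create(1, 2, 3);
        Console.WriteLine(explicitTuple.Equals(inferredTuple)); // prints True

        // Same idea with our own type: T is inferred as string.
        var box = Box.Create("hello");
        Console.WriteLine(box.Value.Length); // prints 5
    }
}
```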

There might be other cases that I have missed, but the basic distinction of “simple initialization of object fields” versus “complex and time-consuming logic” can be a good starting point when deciding.

On a final note, it’s usually a good idea to think about why you’re doing things in a certain way. That might even expose redundant or problematic traditions in the culture of software development. We are so involved with our daily work sometimes that we blindly waste hours performing completely unhelpful tasks, like over-commenting our code. Let this be a reminder to you to do a reality check once in a while.