Closure in C#
Assumptions
In this blog post, I’m allowing myself to assume that:
- You know the C# programming language (or another language that supports anonymous functions, such as JavaScript), and
- Your code already uses delegates and lambda expressions (or anonymous methods).
Motivation
Delegates and Lambda expressions are so lovely:
- They make our code more concise and facilitate some tasks that would otherwise be complex and cumbersome.
- The introduction of delegates to the C# language allowed us to treat functions as first-class citizens, assign functions to variables, take functions as arguments of another function, or return a function from another function.
- Used with LINQ (Language Integrated Query), they offer powerful tools to process data in our daily programming tasks.
- And so on.
As with any powerful tool, though, there are some basic concepts that we need to understand to avoid the subtle bugs lurking in our codebase. One concept that gives lambda expressions their power is the concept of Closure.
What is a closure?
Let’s perform a few experiments with lambda expressions.
Experiment 1
Let’s suppose we have the code below:
public class ClosureCsharp
{
    public static void Main()
    {
        int x = 5;
        int n = 2;
        Func<int, int> f = y => y * n;

        Console.WriteLine($"{x} x {n} = {f(x)}");

        n = 3;
        Console.WriteLine("After assignment.");
        Console.WriteLine($"{x} x {n} = {f(x)}");
    }
}
When running this code snippet, we get the output below:
    5 x 2 = 10
    After assignment.
    5 x 3 = 15
Great! We got the correct result as we expected. But how?
As we all know, statements inside a C# code block are evaluated from top to bottom: a declaration or assignment is visible to the expressions that follow it inside the block, but an expression that uses a variable is not aware of any later assignment to that variable. Thus, after the assignment int n = 2, we might think that the code at line 7 (the declaration of f) would be translated to:
    Func<int, int> f = y => y * 2;
and that any later assignment to n would not affect the behavior of f. But why did the assignment n = 3 cause the behavior of f to change? Well, as you might be expecting, the answer is closures.
What is happening at line 7 is that the lambda expression captures the variable n itself, not its current value, in a context, so that every modification of n has an impact on the function f. This process is called variable capturing: the context “closes over” the variable, hence the name closure.
As the picture above shows, the delegate f will refer to the created context. Any assignment to the variable n would impact this context, hence changing the behavior of f.
At line 11, after the assignment n = 3, the captured variable will be updated to 3. That is how we get the correct answer: 5 x 3 = 15.
Also, it’s worth noting that the compiler sets up this context at compile time, and the process is completely transparent to the programmer. How it is implemented varies with the kind of variable captured (a local variable, an instance variable, a method argument, etc.) and with any optimization performed by the compiler. The context is often implemented as a private class: the captured variables become its fields, the lambda body becomes an instance method, and the delegate points to that method on an object created from this class. You may find more details about this in Jon Skeet’s book C# in Depth.
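To make the idea of a context more concrete, here is a rough hand-written approximation of what the compiler might produce for Experiment 1. The names ClosureByHand, Context, Multiply, and Run are hypothetical and chosen only for illustration; the real generated class has a compiler-chosen name and may be shaped differently.

```csharp
using System;

public static class ClosureByHand
{
    // Hypothetical stand-in for the compiler-generated context class.
    private sealed class Context
    {
        public int n;                          // the captured variable lives here as a field
        public int Multiply(int y) => y * n;   // the body of the lambda y => y * n
    }

    public static (int Before, int After) Run()
    {
        int x = 5;
        var context = new Context { n = 2 };   // stands in for: int n = 2;
        Func<int, int> f = context.Multiply;   // stands in for: f = y => y * n;

        int before = f(x);                     // reads context.n, which is 2
        context.n = 3;                         // stands in for: n = 3;
        int after = f(x);                      // reads context.n, which is now 3
        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"before = {before}, after = {after}");
        // prints "before = 10, after = 15"
    }
}
```

In this sketch, the local variable n no longer exists on its own: every read and write goes through the field of the context object, which is why the delegate observes the later assignment.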
Experiment 2
As the Motivation section discusses, functions are first-class citizens in C# thanks to delegates and lambda expressions. As such, they can be assigned to variables (as we’ve already done), passed as method arguments, and returned by methods. So now, let’s suppose that the lambda expression is provided by a helper function MultiplierBy():
public class ClosureCsharp
{
    public static void Main()
    {
        int x = 5;
        int n = 2;
        Func<int, int> f = MultiplierBy(n);

        Console.WriteLine($"{x} x {n} = {f(x)}");

        n = 3;
        Console.WriteLine($"{x} x {n} = {f(x)}");
    }

    static Func<int, int> MultiplierBy(int n)
    {
        return y => y * n;
    }
}
Which gives us the output:
    5 x 2 = 10
    5 x 3 = 10
Damn! Wrong answer after the assignment at line 11!
I want to raise two points from this experiment:
- First, at line 7, even though we’ve declared a delegate that gets its value from a helper function, no variable capturing is performed there. This reinforces that variable capturing is a compiler trick: what gets captured is decided once and for all at compile time, not at runtime. The compiler creates the “capturing context” only where it finds a lambda expression (or an anonymous method) that refers to variables visible at that point, which leads us to our second point.
- Variable capturing is performed inside the MultiplierBy() helper function. Here, the context captures the parameter n of this method, not the variable n declared at line 6. Since an int argument is passed by value, this parameter is an independent copy that holds 2. The delegate that refers to this context is then returned to the calling method and assigned to f. The image below depicts this process:
After the assignment at line 11, we have the situation below:
Here, n is outside the context referred to by f; hence, any assignment performed to n won’t impact f.
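The contrast between capturing the helper’s parameter and capturing the caller’s variable can be sketched as follows. MultiplierBy is the helper from the experiment; MultiplierByLive, MultiplierSketch, and Run are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class MultiplierSketch
{
    // Same helper as in the experiment: the lambda captures the parameter n,
    // which is a copy of the value passed by the caller.
    static Func<int, int> MultiplierBy(int n) => y => y * n;

    // Hypothetical variant: receives a delegate that reads the caller's variable.
    static Func<int, int> MultiplierByLive(Func<int> getN) => y => y * getN();

    public static int[] Run()
    {
        int x = 5;
        int n = 2;
        Func<int, int> fixedF = MultiplierBy(n);           // copies the value 2
        Func<int, int> liveF = MultiplierByLive(() => n);  // () => n captures the variable n

        n = 3;                                             // only liveF can observe this
        return new[] { fixedF(x), liveF(x) };
    }

    public static void Main()
    {
        int[] results = Run();
        Console.WriteLine($"fixed: {results[0]}, live: {results[1]}");
        // prints "fixed: 10, live: 15"
    }
}
```

The second delegate follows the assignment only because the lambda () => n is written in the scope where the caller’s n is visible, so the capture happens there rather than inside the helper.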
Experiment 3: multiple variable capture
In this last experiment, let’s consider this code snippet:
public class ClosureCsharp
{
    public static void Main()
    {
        List<Action> actions = new List<Action>();
        for (var i = 0; i < 10; i++)
        {
            var msg = $"Do action #{i}";
            actions.Add(() => Console.WriteLine(msg));
        }

        foreach (var action in actions)
        {
            action();
        }
    }
}
Here, we declare a list of Action delegates. A for loop is used to populate this list, and then a foreach loop is used to iterate over this list to call each Action.
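When running this code, we get the output:
    Do action #0
    Do action #1
    Do action #2
    Do action #3
    Do action #4
    Do action #5
    Do action #6
    Do action #7
    Do action #8
    Do action #9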
Great! So far, so good.
Once again, it may seem more or less evident that we got the correct result, but let’s examine what is happening inside the for loop.
As we have seen so far, since the lambda expression uses the variable msg, a context will be created to capture this variable.
Visually, this can be described as follows:
- First, the compiler sees that the lambda expression uses a variable it “sees” in its scope. The compiler will then create a “mold” of the context to be used:
- Since msg is declared for each iteration, an instance of this context is created for each iteration, and each context contains a different “version” of msg.
- For each iteration, a reference to a context is added to the actions collection.
Hence, when we iterate over the actions collection, we iterate over the created contexts (or closures).
Now, let’s modify our code a little:
public class ClosureCsharp
{
    public static void Main()
    {
        List<Action> actions = new List<Action>();

        for (int i = 0; i < 10; i++)
        {
            actions.Add(() => Console.WriteLine($"Do action #{i}"));
        }

        foreach (var action in actions)
        {
            action();
        }
    }
}
At line 9, we’ve decided to save a line of code by passing the interpolated string directly to the lambda expression without using a temporary variable msg.
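Removing a temporary variable won’t hurt that much, huh? Well, let’s see what this will give us as output:
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10
    Do action #10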
Phew! Once again, it’s closures in action. This one is among the subtleties that give us a headache when debugging a codebase that uses a lot of lambda expressions inside loops.
So, what’s going on here? Well, let’s take a look at the for loop again:
As previously mentioned, variable capturing occurs inside the for loop, but this time, it’s the turn of the variable i to be captured.
Since the variable i is declared once for the whole for loop rather than once per iteration, a closure is still created for each iteration, but all of them share a reference to this single captured variable. Visually, this can be seen as follows:
So, any update to the value of i will impact all the created closures. The loop stops when the condition i < 10 fails, so the final value of i is 10, and all the closures capturing i share this value. That’s why we got the repeated text as output.
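A common way to avoid this surprise is the one Experiment 3 started with: copy the loop variable into a variable declared inside the loop body, so that each iteration gets its own captured variable. The sketch below compares both behaviors on a shorter loop; LoopCaptureSketch, RunShared, and RunCopied are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LoopCaptureSketch
{
    // Every lambda captures the single loop variable i.
    public static List<int> RunShared()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            getters.Add(() => i);
        }
        return Invoke(getters);   // i is 3 once the loop has finished
    }

    // Every lambda captures its own per-iteration variable.
    public static List<int> RunCopied()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;              // a new variable for each iteration
            getters.Add(() => copy);
        }
        return Invoke(getters);
    }

    // Calls each delegate after the loop has completed.
    private static List<int> Invoke(List<Func<int>> getters)
    {
        var values = new List<int>();
        foreach (var getter in getters)
        {
            values.Add(getter());
        }
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunShared()));  // prints "3, 3, 3"
        Console.WriteLine(string.Join(", ", RunCopied()));  // prints "0, 1, 2"
    }
}
```

Note that a foreach loop does not need this workaround in C# 5 and later, because its iteration variable is scoped to each iteration; the for loop variable, by contrast, is declared once for the whole loop.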
Now, let’s push this experiment a little further.
public class ClosureCsharp
{
    public static void Main()
    {
        List<Action> actions = new List<Action>();
        int i;

        for (i = 0; i < 10; i++)
        {
            actions.Add(() => Console.WriteLine($"Do action #{i}"));
        }

        i = 11;
        actions.Add(() => Console.WriteLine($"Do outside action #{i}"));


        i = 12;
        actions.Add(() => Console.WriteLine($"Do another outside action #{i}"));

        i = 13;

        foreach (var action in actions)
        {
            action();
        }
    }
}
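Here, at line 6, we’ve declared the variable i outside the for loop so that it remains accessible after the loop. At lines 14 and 18, we’ve added two more actions to the actions list. Both of those outside actions refer to the variable i. Before iterating over the actions collection, we assign 13 to i. That gives us the output below:
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do action #13
    Do outside action #13
    Do another outside action #13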
Once again, both outside lambda expressions captured the variable i, so they will share it with the ten others created inside the for loop (as shown in the picture below). As we may now expect, the assignment of value 13 to the captured variable before iterating over the actions collection will impact every closure sharing it.
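The same sharing works in both directions: closures can also write to a captured variable, and every other closure (as well as the enclosing method) sees the change. Here is a minimal sketch, with SharedCaptureSketch, Run, counter, increment, and read as hypothetical names chosen for illustration.

```csharp
using System;

public static class SharedCaptureSketch
{
    public static int Run()
    {
        int counter = 0;
        Action increment = () => counter++;  // writes to the captured variable
        Func<int> read = () => counter;      // reads the same captured variable

        increment();
        increment();
        counter += 10;                       // a direct assignment is shared as well
        return read();                       // 2 increments + 10
    }

    public static void Main()
    {
        Console.WriteLine(Run());            // prints "12"
    }
}
```

Both lambdas and the method body work on one and the same variable, which is exactly why the assignment i = 13 in the experiment above affected all twelve actions.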
Bottom line
As seen in this blog post, closures are an essential concept to keep in mind when using lambda expressions (or anonymous methods) in our code. They make lambda expressions as intuitive as possible, but care should always be taken, especially when lambdas are created inside loops.