## Tail recursion in C#

Regardless of the programming language you’re using, there are tasks for which the most natural implementation uses a recursive algorithm (even if it’s not always the optimal solution). The trouble with the recursive approach is that it can use a lot of space on the stack: when you reach a certain recursion depth, the memory allocated for the thread stack runs out, and you get a stack overflow error that usually terminates the process (`StackOverflowException` in .NET).

###### Tail recursion? What’s that?

Some languages, functional languages in particular, natively support an optimization technique called tail recursion. The idea is that if the recursive call is the last instruction in a recursive function, there is no need to keep the current call context on the stack, since we won’t have to go back to it: we only need to replace the parameters with their new values and jump back to the beginning of the function. The recursion is thus transformed into an iteration, so it can’t cause a stack overflow. This notion being quite new to me, I won’t try to give a full course on tail recursion… much smarter people have already taken care of that! I suggest you read the Wikipedia article on tail recursion, which is a good starting point for understanding it.

Unfortunately, the C# compiler doesn’t support tail recursion, which is a pity, since the CLR supports it. However, all is not lost! Some people had a very clever idea to work around this issue: a technique called a “trampoline” (because it makes the function “bounce”), which makes it easy to transform a recursive algorithm into an iterative one. Samuel Jack has a good explanation of this concept on his blog. In the rest of this article, we will see how to apply this technique to a simple algorithm, using the class from Samuel Jack’s article; then I’ll present another implementation of the trampoline, which I find more flexible.

###### A simple use case in C#

Let’s see how we can transform a simple recursive algorithm, like the computation of the factorial of a number, into an algorithm that uses tail recursion (incidentally, the factorial can be computed much more efficiently with a non-recursive algorithm, but let’s assume we don’t know that…). Here’s a basic implementation that results directly from the definition:

```csharp
BigInteger Factorial(int n)
{
    if (n < 2)
        return 1;
    return n * Factorial(n - 1);
}
```

(Note the use of `BigInteger`: if we are to make the recursion deep enough to observe the effects of tail recursion, the result will be far beyond the capacity of an `int` or even a `long`…)

If we call this method with a large value (around 20000 on my machine), we get an error which was quite predictable: `StackOverflowException`. We made so many nested calls to the `Factorial` method that we exhausted the capacity of the stack. So we’re going to modify this code so that it can benefit from tail recursion…

As mentioned above, the key requirement for tail recursion is that the method calls itself as the last instruction. It seems to be the case here… but it’s not: the last operation is actually the multiplication, which can’t be executed until we know the result of `Factorial(n-1)`. So we need to redesign this method so that it ends with a call to itself, with different arguments. To do that, we can add a new parameter named `product`, which will act as an accumulator:

```csharp
BigInteger Factorial(int n, BigInteger product)
{
    if (n < 2)
        return product;
    return Factorial(n - 1, n * product);
}
```

For the first call, we’ll just have to pass 1 for the initial value of the accumulator.
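
To make the “replace the parameters and jump back to the beginning” idea concrete, here is a rough sketch of the loop that the accumulator version is equivalent to once the tail call is eliminated. The `FactorialSketch` class and the `FactorialLoop` name are made up for this illustration; the one-argument overload is just a hypothetical convenience that seeds the accumulator with 1.

```csharp
using System.Numerics;

public static class FactorialSketch
{
    // Hypothetical convenience overload: seeds the accumulator with 1.
    public static BigInteger Factorial(int n)
    {
        return FactorialLoop(n, BigInteger.One);
    }

    // Loop form of the accumulator version: instead of calling itself,
    // the method overwrites its parameters and goes back to the top.
    public static BigInteger FactorialLoop(int n, BigInteger product)
    {
        while (true)
        {
            if (n < 2)
                return product;
            product = n * product; // new value of 'product' (uses the old 'n')
            n = n - 1;             // new value of 'n'
        }
    }
}
```

Since nothing is pushed on the stack from one iteration to the next, the depth of the “recursion” no longer matters.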

We now have a method that meets the requirements for tail recursion: the recursive call to `Factorial` really is the last instruction. Now that we have put the algorithm in this form, the final transformation to enable tail recursion using Samuel Jack’s trampoline is trivial:

```csharp
Bounce<int, BigInteger, BigInteger> Factorial(int n, BigInteger product)
{
    if (n < 2)
        return Trampoline.ReturnResult<int, BigInteger, BigInteger>(product);
    return Trampoline.Recurse<int, BigInteger, BigInteger>(n - 1, n * product);
}
```
• Instead of returning the final result directly, we call `Trampoline.ReturnResult` to tell the trampoline that we now have a result
• The recursive call to `Factorial` is replaced with a call to `Trampoline.Recurse`, which tells the trampoline that the method needs to be called again with different parameters

This method can’t be used directly: it returns a `Bounce` object, and we don't really know what to do with this… To execute it, we use the `Trampoline.MakeTrampoline` method, which returns a new function on which tail recursion is applied. We can then use this new function directly:

```csharp
Func<int, BigInteger, BigInteger> fact = Trampoline.MakeTrampoline<int, BigInteger, BigInteger>(Factorial);
BigInteger result = fact(50000, 1);
```
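
If you don’t have Samuel Jack’s class at hand, the following is a minimal sketch of what a two-parameter trampoline with this surface could look like. To be clear, this is my own guess written for illustration, not his actual code: only the names `Bounce`, `ReturnResult`, `Recurse` and `MakeTrampoline` come from the calls above, and everything else (the `HasResult`, `Param1` and `Param2` members, the `TrampolineDemo` class) is an assumption.

```csharp
using System;
using System.Numerics;

// Hypothetical reconstruction for illustration only.
public class Bounce<T1, T2, TResult>
{
    public Bounce(bool hasResult, TResult result, T1 param1, T2 param2)
    {
        HasResult = hasResult;
        Result = result;
        Param1 = param1;
        Param2 = param2;
    }

    public bool HasResult { get; private set; }
    public TResult Result { get; private set; }
    public T1 Param1 { get; private set; }
    public T2 Param2 { get; private set; }
}

public static class Trampoline
{
    // "We are done, here is the result."
    public static Bounce<T1, T2, TResult> ReturnResult<T1, T2, TResult>(TResult result)
    {
        return new Bounce<T1, T2, TResult>(true, result, default(T1), default(T2));
    }

    // "Call me again with these arguments."
    public static Bounce<T1, T2, TResult> Recurse<T1, T2, TResult>(T1 arg1, T2 arg2)
    {
        return new Bounce<T1, T2, TResult>(false, default(TResult), arg1, arg2);
    }

    // Wraps the bouncing function in a loop, so the stack never grows.
    public static Func<T1, T2, TResult> MakeTrampoline<T1, T2, TResult>(
        Func<T1, T2, Bounce<T1, T2, TResult>> function)
    {
        return (arg1, arg2) =>
        {
            Bounce<T1, T2, TResult> bounce = function(arg1, arg2);
            while (!bounce.HasResult)
                bounce = function(bounce.Param1, bounce.Param2);
            return bounce.Result;
        };
    }
}

public static class TrampolineDemo
{
    public static Bounce<int, BigInteger, BigInteger> Factorial(int n, BigInteger product)
    {
        if (n < 2)
            return Trampoline.ReturnResult<int, BigInteger, BigInteger>(product);
        return Trampoline.Recurse<int, BigInteger, BigInteger>(n - 1, n * product);
    }
}
```

The important part is the `while` loop in `MakeTrampoline`: each call to the user function returns immediately, and the loop decides whether to call it again.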

We can now compute the factorial of large numbers, with no risk of causing a stack overflow… Admittedly, it’s not very efficient: as mentioned before, there are better ways of computing a factorial, and furthermore, computations involving `BigInteger`s are much slower than with `int`s or `long`s.

###### Can we make it better?

Well, you can guess that I wouldn’t be asking the question unless the answer was yes… The trampoline implementation demonstrated above does its job well enough, but I think it could be made more flexible and easier to use:

• It only works if you have 2 parameters (of course we can adapt it for a different number of parameters, but then we need to create new methods with adequate signatures for each different arity)
• The syntax is quite unwieldy: there are 3 type arguments, and we need to specify them every time because the compiler doesn’t have enough information to infer them automatically
• Having to use `MakeTrampoline` just to create a new function that we can then call isn’t very convenient; it would be more intuitive to have an `Execute` method that returns the result directly

And finally, I think the terminology isn’t very explicit… Names like `Trampoline` and `Bounce` sound like fun, but they don’t really reveal the intent.

So I tried to improve the system to make it more convenient. My solution is based on lambda expressions. There is only one type argument (the return type), and the parameters are passed through a closure, so there is no need for multiple methods to handle different numbers of parameters. Here’s what the `Factorial` method looks like with my implementation:

```csharp
RecursionResult<BigInteger> Factorial(int n, BigInteger product)
{
    if (n < 2)
        return TailRecursion.Return(product);
    return TailRecursion.Next(() => Factorial(n - 1, n * product));
}
```

It can be used as follows:

```csharp
BigInteger result = TailRecursion.Execute(() => Factorial(50000, 1));
```

It’s more flexible, more concise, and more readable…in my opinion at least. The downside is that performance is slightly worse than before (it takes about 20% longer to compute the factorial of 50000), probably because of the delegate creation at each level of recursion.

Here’s the full code for the `TailRecursion` class:

```csharp
public static class TailRecursion
{
    public static T Execute<T>(Func<RecursionResult<T>> func)
    {
        do
        {
            var recursionResult = func();
            if (recursionResult.IsFinalResult)
                return recursionResult.Result;
            func = recursionResult.NextStep;
        } while (true);
    }

    public static RecursionResult<T> Return<T>(T result)
    {
        return new RecursionResult<T>(true, result, null);
    }

    public static RecursionResult<T> Next<T>(Func<RecursionResult<T>> nextStep)
    {
        return new RecursionResult<T>(false, default(T), nextStep);
    }
}

public class RecursionResult<T>
{
    private readonly bool _isFinalResult;
    private readonly T _result;
    private readonly Func<RecursionResult<T>> _nextStep;

    internal RecursionResult(bool isFinalResult, T result, Func<RecursionResult<T>> nextStep)
    {
        _isFinalResult = isFinalResult;
        _result = result;
        _nextStep = nextStep;
    }

    public bool IsFinalResult { get { return _isFinalResult; } }
    public T Result { get { return _result; } }
    public Func<RecursionResult<T>> NextStep { get { return _nextStep; } }
}
```
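
To illustrate the flexibility argument, here is a small sketch that uses the same approach for something the two-parameter trampoline can’t express directly: two single-parameter methods that call each other. The `Parity` class is invented for this example, and the block carries a condensed copy of the classes above only so that it compiles on its own; it assumes `n` is not negative.

```csharp
using System;

// Condensed copy of the classes above, so that the sketch is self-contained.
public class RecursionResult<T>
{
    public RecursionResult(bool isFinalResult, T result, Func<RecursionResult<T>> nextStep)
    {
        IsFinalResult = isFinalResult;
        Result = result;
        NextStep = nextStep;
    }

    public bool IsFinalResult { get; private set; }
    public T Result { get; private set; }
    public Func<RecursionResult<T>> NextStep { get; private set; }
}

public static class TailRecursion
{
    public static T Execute<T>(Func<RecursionResult<T>> func)
    {
        while (true)
        {
            RecursionResult<T> step = func();
            if (step.IsFinalResult)
                return step.Result;
            func = step.NextStep;
        }
    }

    public static RecursionResult<T> Return<T>(T result)
    {
        return new RecursionResult<T>(true, result, null);
    }

    public static RecursionResult<T> Next<T>(Func<RecursionResult<T>> nextStep)
    {
        return new RecursionResult<T>(false, default(T), nextStep);
    }
}

// Hypothetical example: mutual recursion (assumes n >= 0).
public static class Parity
{
    public static RecursionResult<bool> IsEven(int n)
    {
        if (n == 0)
            return TailRecursion.Return(true);
        return TailRecursion.Next(() => IsOdd(n - 1));
    }

    public static RecursionResult<bool> IsOdd(int n)
    {
        if (n == 0)
            return TailRecursion.Return(false);
        return TailRecursion.Next(() => IsEven(n - 1));
    }
}
```

A call such as `TailRecursion.Execute(() => Parity.IsEven(1000000))` goes through a million steps without growing the stack, since each step returns to the loop in `Execute` before the next one starts.
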
###### Is there a better way to accomplish tail recursion in C#?

Sure! But it gets a little tricky, and it’s not pure C#. As I mentioned before, the CLR supports tail recursion, through the `tail` instruction. Ideally, the C# compiler would automatically generate this instruction for methods that are eligible for tail recursion, but unfortunately that’s not the case, and I don’t think this will ever be supported given the low demand for this feature.

Anyway, we can cheat a little by helping the compiler to do its job: the .NET Framework SDK provides tools named ildasm (IL disassembler) and ilasm (IL assembler), which can help to fill the gap between C# and the CLR… Let’s go back to the classical recursive implementation of `Factorial`, which doesn’t yet use tail recursion:

```csharp
static BigInteger Factorial(int n, BigInteger product)
{
    if (n < 2)
        return product;
    return Factorial(n - 1, n * product);
}
```

If we compile this code and disassemble it with ildasm, we get the following IL code:

```
.method private hidebysig static valuetype [System.Numerics]System.Numerics.BigInteger
Factorial(int32 n,
valuetype [System.Numerics]System.Numerics.BigInteger product) cil managed
{
// Code size       41 (0x29)
.maxstack  3
.locals init (valuetype [System.Numerics]System.Numerics.BigInteger V_0,
bool V_1)
IL_0000:  nop
IL_0001:  ldarg.0
IL_0002:  ldc.i4.2
IL_0003:  clt
IL_0005:  ldc.i4.0
IL_0006:  ceq
IL_0008:  stloc.1
IL_0009:  ldloc.1
IL_000a:  brtrue.s   IL_0010

IL_000c:  ldarg.1
IL_000d:  stloc.0
IL_000e:  br.s       IL_0027

IL_0010:  ldarg.0
IL_0011:  ldc.i4.1
IL_0012:  sub
IL_0013:  ldarg.0
IL_0014:  call       valuetype [System.Numerics]System.Numerics.BigInteger [System.Numerics]System.Numerics.BigInteger::op_Implicit(int32)
IL_0019:  ldarg.1
IL_001a:  call       valuetype [System.Numerics]System.Numerics.BigInteger [System.Numerics]System.Numerics.BigInteger::op_Multiply(valuetype [System.Numerics]System.Numerics.BigInteger,
valuetype [System.Numerics]System.Numerics.BigInteger)
IL_001f:  call       valuetype [System.Numerics]System.Numerics.BigInteger Program::Factorial(int32,
valuetype [System.Numerics]System.Numerics.BigInteger)
IL_0024:  stloc.0
IL_0025:  br.s       IL_0027

IL_0027:  ldloc.0
IL_0028:  ret
} // end of method Program::Factorial

```

It’s a bit hard on the eyes if you’re not used to reading IL code, but we can see roughly what’s going on… The recursive call is at offset `IL_001f`; this is where we’re going to fiddle with the generated code to introduce tail recursion. If we look at the documentation for the `tail` instruction, we see that it must immediately precede a `call` instruction, and that the instruction following the `call` must be `ret` (return). Right now, we have several instructions following the recursive call, because the compiler introduced a local variable to store the return value. We just need to modify the code so that it doesn’t use this variable, and add the `tail` instruction in the right place:

```
.method private hidebysig static valuetype [System.Numerics]System.Numerics.BigInteger
Factorial(int32 n,
valuetype [System.Numerics]System.Numerics.BigInteger product) cil managed
{
// Code size       41 (0x29)
.maxstack  3
.locals init (valuetype [System.Numerics]System.Numerics.BigInteger V_0,
bool V_1)
IL_0000:  nop
IL_0001:  ldarg.0
IL_0002:  ldc.i4.2
IL_0003:  clt
IL_0005:  ldc.i4.0
IL_0006:  ceq
IL_0008:  stloc.1
IL_0009:  ldloc.1
IL_000a:  brtrue.s   IL_0010

IL_000c:  ldarg.1
IL_000d:  ret		// Return directly instead of storing the result in V_0
IL_000e:  nop

IL_0010:  ldarg.0
IL_0011:  ldc.i4.1
IL_0012:  sub
IL_0013:  ldarg.0
IL_0014:  call       valuetype [System.Numerics]System.Numerics.BigInteger [System.Numerics]System.Numerics.BigInteger::op_Implicit(int32)
IL_0019:  ldarg.1
IL_001a:  call       valuetype [System.Numerics]System.Numerics.BigInteger [System.Numerics]System.Numerics.BigInteger::op_Multiply(valuetype [System.Numerics]System.Numerics.BigInteger,
valuetype [System.Numerics]System.Numerics.BigInteger)
IL_001f:  tail.
IL_0020:  call       valuetype [System.Numerics]System.Numerics.BigInteger Program::Factorial(int32,
valuetype [System.Numerics]System.Numerics.BigInteger)
IL_0025:  ret		// Return directly instead of storing the result in V_0

} // end of method Program::Factorial

```

If we reassemble this code with ilasm, we get a new executable, which runs without issues even for the large values that made the old code crash. Performance is also pretty good: about 3 times as fast as the version using the `Trampoline` class. If we compare the performance for smaller values (so that the old code doesn’t crash), we can see that it’s also 3 times as fast as the recursive version with no tail recursion.

Of course, this is just a proof of concept… it doesn’t seem very realistic to perform this transformation manually in a “real” project. However, it might be possible to create a tool that rewrites assemblies automatically after the compilation to introduce tail recursion.

## [C#] A simple implementation of the WeakEvent pattern

As you probably know, incorrect usage of events is one of the main causes of memory leaks in .NET applications: an event keeps references to its listener objects (through a delegate), which prevents the garbage collector from collecting them when they’re not used anymore. This is especially true of static events, because the references are kept for the entire lifetime of the application. If the application often adds handlers to the event and never removes them, memory usage will keep growing as long as the application runs, until no more memory is available.
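
To make the leak concrete, here is a small sketch of the situation. All the names (`Publisher`, `Listener`, `LeakDemo`) are made up for this illustration: a listener subscribes to a static event in its constructor, the calling code drops it immediately, and only a weak reference is kept to observe what the garbage collector does.

```csharp
using System;

public static class Publisher
{
    // A static event: its invocation list lives as long as the application.
    public static event EventHandler SomethingHappened;

    public static void Raise()
    {
        EventHandler handler = SomethingHappened;
        if (handler != null)
            handler(null, EventArgs.Empty);
    }
}

public class Listener
{
    public static int CallCount;

    public Listener()
    {
        // The delegate created here holds a strong reference to 'this'.
        Publisher.SomethingHappened += OnSomethingHappened;
    }

    private void OnSomethingHappened(object sender, EventArgs e)
    {
        CallCount++;
    }
}

public static class LeakDemo
{
    // Creates a listener, keeps no strong reference to it,
    // and returns a weak reference so we can check whether it survives.
    public static WeakReference CreateAndForgetListener()
    {
        return new WeakReference(new Listener());
    }
}
```

Even after a forced `GC.Collect()`, the weak reference returned by `CreateAndForgetListener` stays alive and the handler keeps being called by `Publisher.Raise()`, because the static event still references the listener through its delegate.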

The “obvious” solution, of course, is to unsubscribe from the event when you’re done with it. Unfortunately, it’s not always obvious to know when you can unsubscribe… an object that goes out of scope usually isn’t aware of it, so it doesn’t have a chance to unsubscribe from the event.

Another approach is to implement the WeakEvent pattern, whose principle is to keep only weak references to the listeners. That way, listeners that never unsubscribed can still be reclaimed by the garbage collector. Microsoft included in WPF a few types to deal with the WeakEvent pattern (the `WeakEventManager` class and the `IWeakEventListener` interface), and gives guidelines on how to implement your own weak event. However, this technique is not very convenient, because you need to create dedicated classes to expose new events, and the listeners need to implement a specific interface.

So I thought about another implementation, which allows creating weak events almost the same way as normal events. My first idea was to use a list of `WeakReference`s to store the list of subscribed delegates. But this doesn’t work so well, because of the way we typically use delegates:

```csharp
myObject.MyEvent += new EventHandler(myObject_MyEvent);
```

We create a delegate, subscribe it to the event, and… drop it. So the only accessible reference to the delegate is actually a weak reference, and there’s nothing to prevent its garbage collection… and that’s exactly what happens! After a variable period of time (from my observations, no more than a few seconds), the delegate is garbage collected, and isn’t called anymore when the event is raised.

Rather than keeping a weak reference to the delegate itself, we should use a less transient object: the target object of the delegate (`Delegate.Target`) would be a better choice. So I created the `WeakDelegate<TDelegate>` class, which wraps a delegate by storing separately the method and a weak reference to the target:

```csharp
public class WeakDelegate<TDelegate> : IEquatable<TDelegate>
{
    private WeakReference _targetReference;
    private MethodInfo _method;

    public WeakDelegate(Delegate realDelegate)
    {
        // Static methods have no target: in that case there is nothing to track
        if (realDelegate.Target != null)
            _targetReference = new WeakReference(realDelegate.Target);
        else
            _targetReference = null;
        _method = realDelegate.Method;
    }

    public TDelegate GetDelegate()
    {
        return (TDelegate)(object)GetDelegateInternal();
    }

    private Delegate GetDelegateInternal()
    {
        if (_targetReference != null)
        {
            return Delegate.CreateDelegate(typeof(TDelegate), _targetReference.Target, _method);
        }
        else
        {
            return Delegate.CreateDelegate(typeof(TDelegate), _method);
        }
    }

    public bool IsAlive
    {
        get { return _targetReference == null || _targetReference.IsAlive; }
    }

    #region IEquatable<TDelegate> Members

    public bool Equals(TDelegate other)
    {
        Delegate d = (Delegate)(object)other;
        // _targetReference is null for static methods, so don't dereference it blindly
        object target = _targetReference != null ? _targetReference.Target : null;
        return d != null
            && d.Target == target
            && d.Method.Equals(_method);
    }

    #endregion

    internal void Invoke(params object[] args)
    {
        Delegate handler = GetDelegateInternal();
        handler.DynamicInvoke(args);
    }
}
```
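
The class above relies on the fact that a delegate is little more than a (target, method) pair, so it can be taken apart and rebuilt at will. Here is a small sketch of just that round trip, without the weak reference; the `Greeter` and `DelegateParts` names are invented for the example.

```csharp
using System;
using System.Reflection;

public class Greeter
{
    private readonly string _greeting;

    public Greeter(string greeting)
    {
        _greeting = greeting;
    }

    public string Greet(string name)
    {
        return _greeting + ", " + name;
    }
}

public static class DelegateParts
{
    // Splits a delegate into its target and its method, then rebuilds
    // an equivalent delegate from those two parts.
    public static Func<string, string> Rebuild(Func<string, string> original)
    {
        object target = original.Target;     // null when the method is static
        MethodInfo method = original.Method;

        if (target == null)
            return (Func<string, string>)Delegate.CreateDelegate(typeof(Func<string, string>), method);

        return (Func<string, string>)Delegate.CreateDelegate(typeof(Func<string, string>), target, method);
    }
}
```

For instance, rebuilding a delegate created from `new Greeter("Hello").Greet` gives a new delegate that still returns `"Hello, World"` when called with `"World"`. `WeakDelegate<TDelegate>` does the same thing on demand, except that it only holds the target weakly in the meantime.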

Now, we just need to manage a list of these `WeakDelegate<TDelegate>` objects. This is done by the `WeakEvent<TEventHandler>` class:

```csharp
public class WeakEvent<TEventHandler>
{
    private List<WeakDelegate<TEventHandler>> _handlers;

    public WeakEvent()
    {
        _handlers = new List<WeakDelegate<TEventHandler>>();
    }

    public virtual void AddHandler(TEventHandler handler)
    {
        Delegate d = (Delegate)(object)handler;
        _handlers.Add(new WeakDelegate<TEventHandler>(d));
    }

    public virtual void RemoveHandler(TEventHandler handler)
    {
        // also remove "dead" (garbage collected) handlers
        _handlers.RemoveAll(wd => !wd.IsAlive || wd.Equals(handler));
    }

    public virtual void Raise(object sender, EventArgs e)
    {
        // iterate over a copy, since dead handlers are removed along the way
        var handlers = _handlers.ToArray();
        foreach (var weakDelegate in handlers)
        {
            if (weakDelegate.IsAlive)
            {
                weakDelegate.Invoke(sender, e);
            }
            else
            {
                _handlers.Remove(weakDelegate);
            }
        }
    }

    protected List<WeakDelegate<TEventHandler>> Handlers
    {
        get { return _handlers; }
    }
}
```

This class automatically handles the removal of “dead” (garbage collected) handlers, and provides a `Raise` method to call the handlers. It can be used as follows:

```csharp
private WeakEvent<EventHandler> _myEvent = new WeakEvent<EventHandler>();
public event EventHandler MyEvent
{
    add { _myEvent.AddHandler(value); }
    remove { _myEvent.RemoveHandler(value); }
}

protected virtual void OnMyEvent()
{
    _myEvent.Raise(this, EventArgs.Empty);
}
```

This is a bit longer to write than a “regular” event, but considering the benefits, it’s very acceptable. Anyway, you can easily create a Visual Studio snippet to quickly create a weak event, with only 3 fields to fill in:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets  xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
<CodeSnippet Format="1.0.0">
<Header>
<Title>wevt</Title>
<Shortcut>wevt</Shortcut>
<Description>Code snippet for a weak event</Description>
<Author>Thomas Levesque</Author>
<SnippetTypes>
<SnippetType>Expansion</SnippetType>
</SnippetTypes>
</Header>
<Snippet>
<Declarations>
<Literal>
<ID>type</ID>
<ToolTip>Event type</ToolTip>
<Default>EventHandler</Default>
</Literal>
<Literal>
<ID>event</ID>
<ToolTip>Event name</ToolTip>
<Default>MyEvent</Default>
</Literal>
<Literal>
<ID>field</ID>
<ToolTip>Name of the field holding the registered handlers</ToolTip>
<Default>_myEvent</Default>
</Literal>
</Declarations>
<Code Language="csharp">
<![CDATA[private WeakEvent<$type$> $field$ = new WeakEvent<$type$>();
public event $type$ $event$
{
add { $field$.AddHandler(value); }
remove { $field$.RemoveHandler(value); }
}

protected virtual void On$event$()
{
$field$.Raise(this, EventArgs.Empty);
}
$end$]]>
</Code>
</Snippet>
</CodeSnippet>
</CodeSnippets>
```

This snippet gives the following result in Visual Studio:

## Automating null checks with Linq expressions

###### The problem

Have you ever written code like the following?

```csharp
X xx = GetX();
string name = "Default";
if (xx != null && xx.Foo != null && xx.Foo.Bar != null && xx.Foo.Bar.Baz != null)
{
    name = xx.Foo.Bar.Baz.Name;
}
```

I bet you have! You just need to get the value of `xx.Foo.Bar.Baz.Name`, but you have to test every intermediate object to ensure that it’s not null. It can quickly become annoying if the property you need is nested in a deep object graph…

###### A solution

Linq offers a very interesting feature which can help solve that problem: expressions. C# 3.0 makes it possible to retrieve the abstract syntax tree (AST) of a lambda expression and perform all kinds of manipulations on it. It is also possible to dynamically generate an AST, compile it to obtain a delegate, and execute it.
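
If you have never played with expression trees, here is a small sketch of both directions: inspecting the tree the compiler builds from a lambda, and building a tree by hand before compiling it. The `ExpressionBasics` class and its method names are made up for this illustration; the `System.Linq.Expressions` calls are the standard ones.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionBasics
{
    // Inspecting: assigning a lambda to Expression<...> gives us data, not code.
    public static string DescribeBody()
    {
        Expression<Func<string, int>> lengthOf = s => s.Length;

        // The body of "s => s.Length" is a member access on the parameter "s"
        var member = (MemberExpression)lengthOf.Body;
        return member.Member.Name + " on " + member.Expression.NodeType;
    }

    // Generating: build (a, b) => a * b + 1 node by node, then compile it.
    public static Func<int, int, int> BuildMultiplyAddOne()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");

        Expression body = Expression.Add(
            Expression.Multiply(a, b),
            Expression.Constant(1));

        return Expression.Lambda<Func<int, int, int>>(body, a, b).Compile();
    }
}
```

`DescribeBody()` returns `"Length on Parameter"`, and the delegate returned by `BuildMultiplyAddOne()` gives 43 for the arguments 6 and 7.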

How is this related to the problem described above? Well, Linq makes it possible to analyse the AST for the expression that accesses the `xx.Foo.Bar.Baz.Name` property, and rewrite that AST to insert null checks where needed. So we’re going to create a `NullSafeEval` extension method, which takes as parameters the lambda expression defining how to access a property, and the default value to return if a null object is encountered along the way.

That method will transform the expression `xx.Foo.Bar.Baz.Name` into this:

```csharp
(xx == null)
    ? defaultValue
    : (xx.Foo == null)
        ? defaultValue
        : (xx.Foo.Bar == null)
            ? defaultValue
            : (xx.Foo.Bar.Baz == null)
                ? defaultValue
                : xx.Foo.Bar.Baz.Name;
```

Here’s the implementation of the `NullSafeEval` method:

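```csharp
public static TResult NullSafeEval<TSource, TResult>(this TSource source, Expression<Func<TSource, TResult>> expression, TResult defaultValue)
{
// The constant is typed explicitly so that both branches of the generated
// conditional have the same type, even when defaultValue is null
var safeExp = Expression.Lambda<Func<TSource, TResult>>(
NullSafeEvalWrapper(expression.Body, Expression.Constant(defaultValue, typeof(TResult))),
expression.Parameters[0]);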

var safeDelegate = safeExp.Compile();
return safeDelegate(source);
}

private static Expression NullSafeEvalWrapper(Expression expr, Expression defaultValue)
{
Expression obj;
Expression safe = expr;

while (!IsNullSafe(expr, out obj))
{
var isNull = Expression.Equal(obj, Expression.Constant(null));

safe =
Expression.Condition
(
isNull,
defaultValue,
safe
);

expr = obj;
}
return safe;
}

private static bool IsNullSafe(Expression expr, out Expression nullableObject)
{
nullableObject = null;

if (expr is MemberExpression || expr is MethodCallExpression)
{
Expression obj;
MemberExpression memberExpr = expr as MemberExpression;
MethodCallExpression callExpr = expr as MethodCallExpression;

if (memberExpr != null)
{
// Static fields don't require an instance
FieldInfo field = memberExpr.Member as FieldInfo;
if (field != null && field.IsStatic)
return true;

// Static properties don't require an instance
PropertyInfo property = memberExpr.Member as PropertyInfo;
if (property != null)
{
MethodInfo getter = property.GetGetMethod();
if (getter != null && getter.IsStatic)
return true;
}
obj = memberExpr.Expression;
}
else
{
// Static methods don't require an instance
if (callExpr.Method.IsStatic)
return true;

obj = callExpr.Object;
}

// Value types can't be null
if (obj.Type.IsValueType)
return true;

// Instance member access or instance method call is not safe
nullableObject = obj;
return false;
}
return true;
}
```

In short, this code walks up the lambda expression tree, and surrounds each property access or instance method call with a conditional expression (condition ? value if true : value if false).

And here’s how we can use this method:

```csharp
string name = xx.NullSafeEval(x => x.Foo.Bar.Baz.Name, "Default");
```

Much clearer and more concise than our initial code, isn’t it? :)

Note that the proposed implementation handles not only properties, but also method calls, so we could write something like this:

```csharp
string name = xx.NullSafeEval(x => x.Foo.GetBar(42).Baz.Name, "Default");
```

Indexers are not handled yet, but they could be added quite easily; I will leave it to you to do it if you have a use for it ;)

###### Limitations

Even though that solution can seem very interesting at first sight, please read what follows before you integrate this code into a real world program…

• First, the proposed code is just a proof of concept, and as such, hasn’t been thoroughly tested, so it’s probably not very reliable.
• Secondly, keep in mind that dynamic code generation from an expression tree is tough work for the CLR, and will have a big impact on performance. A quick test shows that using the `NullSafeEval` method is about 10000 times slower than accessing the property directly…

A possible approach to limit that issue would be to cache the delegates generated for each expression, to avoid regenerating them every time. Unfortunately, as far as I know there is no simple and reliable way to compare two Linq expressions, which makes it much harder to implement such a cache.

• Last, you might have noticed that intermediate properties and methods are evaluated several times; not only is this bad for performance, but more importantly, it could have side effects that are hard to predict, depending on how the properties and methods are implemented.

A possible workaround would be to rewrite the conditional expression as follows:

```csharp
Foo foo = null;
Bar bar = null;
Baz baz = null;
var name =
    (x == null)
    ? defaultValue
    : ((foo = x.Foo) == null)
        ? defaultValue
        : ((bar = foo.Bar) == null)
            ? defaultValue
            : ((baz = bar.Baz) == null)
                ? defaultValue
                : baz.Name;
```

Unfortunately, this is not possible in .NET 3.5: that version only supports simple expressions, so it’s not possible to declare variables, assign values to them, or write several distinct instructions. However, in .NET 4.0, support for Linq expressions has been largely improved, and makes it possible to generate that kind of code. I’m currently trying to improve the `NullSafeEval` method to take advantage of the new .NET 4.0 features, but it turns out to be much more difficult than I had anticipated… If I manage to work it out, I’ll let you know and post the code!
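
In the meantime, here is a hand-built sketch of what the .NET 4.0 API allows for a single level of nesting. It is not the improved `NullSafeEval`, only an illustration of `Expression.Variable`, `Expression.Assign` and `Expression.Block`; the `Person` and `Address` classes and the `SingleEvaluationSketch` name are invented for the example, and the generated code assumes the `Person` itself is not null.

```csharp
using System;
using System.Linq.Expressions;

public class Address
{
    public string City { get; set; }
}

public class Person
{
    private Address _address;

    // Counts how many times the Address getter is called
    public int AddressReads;

    public Address Address
    {
        get { AddressReads++; return _address; }
        set { _address = value; }
    }
}

public static class SingleEvaluationSketch
{
    // Builds: p => { Address a; ((a = p.Address) == null) ? defaultValue : a.City }
    public static Func<Person, string> BuildCityAccessor(string defaultValue)
    {
        ParameterExpression p = Expression.Parameter(typeof(Person), "p");
        ParameterExpression a = Expression.Variable(typeof(Address), "a");

        Expression body = Expression.Condition(
            Expression.Equal(
                Expression.Assign(a, Expression.Property(p, "Address")),
                Expression.Constant(null, typeof(Address))),
            Expression.Constant(defaultValue, typeof(string)),
            Expression.Property(a, "City"));

        // The block declares the temporary variable; its value is the value
        // of its last expression
        BlockExpression block = Expression.Block(new[] { a }, body);

        return Expression.Lambda<Func<Person, string>>(block, p).Compile();
    }
}
```

With this form, the `Address` getter is called exactly once per evaluation, whether the address is null or not, which is what the `AddressReads` counter is there to check.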

To conclude, I wouldn’t recommend using that technique in real programs, at least not in its current state. However, it gives an interesting insight into the possibilities offered by Linq expressions. If you’re new to this, you should know that Linq expressions are used (among other things):

• To generate SQL queries in ORMs like Linq to SQL or Entity Framework
• To build complex predicates dynamically, like in the PredicateBuilder class by Joseph Albahari
• To implement “static reflection”, which has generated a lot of buzz on technical blogs lately

## [C#] Parent/child relationship and XML serialization

Today I’d like to present an idea that occurred to me recently. Nothing about WPF this time, this is all about C# class design!

###### The problem

It’s very common in C# programs to have an object that owns a collection of child items with a reference to their parent. For instance, this is the case for Windows Forms controls, which have a collection of child controls (`Controls`), and a reference to their parent control (`Parent`).

This kind of structure is quite easy to implement: it just requires a bit of plumbing to maintain the consistency of the parent/child relationship. However, if you want to serialize the parent object to XML, it can get tricky… Let’s take a simple, purely theoretical example:

```csharp
public class Parent
{
    public Parent()
    {
        this.Children = new List<Child>();
    }

    public string Name { get; set; }

    public List<Child> Children { get; set; }

    public void AddChild(Child child)
    {
        this.Children.Add(child);
        child.ParentObject = this;
    }

    public void RemoveChild(Child child)
    {
        this.Children.Remove(child);
        child.ParentObject = null;
    }
}
```

```csharp
public class Child
{
    public string Name { get; set; }

    public Parent ParentObject { get; set; }
}
```

Let’s create an instance of `Parent` with a few children, and try to serialize it to XML:

```csharp
Parent p = new Parent { Name = "The parent" };
p.AddChild(new Child { Name = "First child" });
p.AddChild(new Child { Name = "Second child" });

string xml;
XmlSerializer xs = new XmlSerializer(typeof(Parent));
using (StringWriter wr = new StringWriter())
{
    xs.Serialize(wr, p);
    xml = wr.ToString();
}

Console.WriteLine(xml);
```

When we try to serialize the `Parent` object, an `InvalidOperationException` occurs, saying that a circular reference was detected: indeed, the parent references the children, which in turn reference the parent, which references the children… and so on. The obvious solution to that issue is to suppress the serialization of the `Child.ParentObject` property, which can be done easily by using the `XmlIgnore` attribute. With that change the serialization works fine, but the problem is not solved yet: when we deserialize the object, the `ParentObject` property of the children is not set, since it wasn’t serialized… the consistency of the parent/child relationship is broken!

A simple and naive solution would be to loop through the `Children` collection after the deserialization, in order to set the `ParentObject` manually. But it’s definitely not an elegant approach… and since I really like elegant code, I thought of something else ;)
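
For comparison, here is roughly what that naive fix-up looks like, as a standalone sketch: simplified copies of the two classes (with `XmlIgnore` on `ParentObject`), and a hypothetical `NaiveRoundTrip.Clone` helper that serializes, deserializes, and then repairs the relationship by hand.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Child
{
    public string Name { get; set; }

    [XmlIgnore] // breaks the cycle during serialization
    public Parent ParentObject { get; set; }
}

public class Parent
{
    public Parent()
    {
        this.Children = new List<Child>();
    }

    public string Name { get; set; }

    public List<Child> Children { get; set; }
}

public static class NaiveRoundTrip
{
    // Hypothetical helper: XML round trip followed by a manual fix-up
    public static Parent Clone(Parent original)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Parent));
        using (StringWriter writer = new StringWriter())
        {
            xs.Serialize(writer, original);
            using (StringReader reader = new StringReader(writer.ToString()))
            {
                Parent copy = (Parent)xs.Deserialize(reader);

                // ParentObject wasn't serialized, so it has to be restored
                // manually for every child
                foreach (Child child in copy.Children)
                    child.ParentObject = copy;

                return copy;
            }
        }
    }
}
```

It works, but every piece of code that deserializes a `Parent` has to remember to run that loop, which is exactly the kind of plumbing I’d rather not scatter around.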

###### The solution

The idea I had to solve this problem consists of a specialized generic collection, `ChildItemCollection<P,T>`, and an `IChildItem<P>` interface that must be implemented by the children.

The `IChildItem<P>` interface just defines a `Parent` property of type `P`:

```csharp
/// <summary>
/// Defines the contract for an object that has a parent object
/// </summary>
/// <typeparam name="P">Type of the parent object</typeparam>
public interface IChildItem<P> where P : class
{
    P Parent { get; set; }
}
```

The `ChildItemCollection<P,T>` class implements `IList<T>` by delegating the implementation to a `List<T>` (or to a collection passed to the constructor), and maintains the parent/child relationship:

```csharp
/// <summary>
/// Collection of child items. This collection automatically sets the
/// Parent property of the child items when they are added or removed
/// </summary>
/// <typeparam name="P">Type of the parent object</typeparam>
/// <typeparam name="T">Type of the child items</typeparam>
public class ChildItemCollection<P, T> : IList<T>
where P : class
where T : IChildItem<P>
{
private P _parent;
private IList<T> _collection;

public ChildItemCollection(P parent)
{
this._parent = parent;
this._collection = new List<T>();
}

public ChildItemCollection(P parent, IList<T> collection)
{
this._parent = parent;
this._collection = collection;
}

#region IList<T> Members

public int IndexOf(T item)
{
return _collection.IndexOf(item);
}

public void Insert(int index, T item)
{
if (item != null)
item.Parent = _parent;
_collection.Insert(index, item);
}

public void RemoveAt(int index)
{
T oldItem = _collection[index];
_collection.RemoveAt(index);
if (oldItem != null)
oldItem.Parent = null;
}

public T this[int index]
{
get
{
return _collection[index];
}
set
{
T oldItem = _collection[index];
if (value != null)
value.Parent = _parent;
_collection[index] = value;
if (oldItem != null)
oldItem.Parent = null;
}
}

#endregion

#region ICollection<T> Members

public void Add(T item)
{
if (item != null)
item.Parent = _parent;
_collection.Add(item);
}

public void Clear()
{
foreach (T item in _collection)
{
if (item != null)
item.Parent = null;
}
_collection.Clear();
}

public bool Contains(T item)
{
return _collection.Contains(item);
}

public void CopyTo(T[] array, int arrayIndex)
{
_collection.CopyTo(array, arrayIndex);
}

public int Count
{
get { return _collection.Count; }
}

public bool IsReadOnly
{
get { return _collection.IsReadOnly; }
}

public bool Remove(T item)
{
bool b = _collection.Remove(item);
// Only detach the item if it was actually in this collection
if (b && item != null)
item.Parent = null;
return b;
}

#endregion

#region IEnumerable<T> Members

public IEnumerator<T> GetEnumerator()
{
return _collection.GetEnumerator();
}

#endregion

#region IEnumerable Members

System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return (_collection as System.Collections.IEnumerable).GetEnumerator();
}

#endregion
}
```

Now let’s see how this class can be used in the case of the above example… First let’s change the `Child` class so that it implements the `IChildItem<Parent>` interface:

```csharp
public class Child : IChildItem<Parent>
{
public string Name { get; set; }

[XmlIgnore]
public Parent ParentObject { get; internal set; }

#region IChildItem<Parent> Members

Parent IChildItem<Parent>.Parent
{
get
{
return this.ParentObject;
}
set
{
this.ParentObject = value;
}
}

#endregion
}
```

Note that here the `IChildItem<Parent>` interface is implemented explicitly: this is a way to “hide” the `Parent` property, which will only be accessible when manipulating the `Child` object through a variable of type `IChildItem<Parent>`. We also define the `set` accessor of the `ParentObject` property as `internal`, so that it can’t be modified from another assembly.

In the `Parent` class, the `List<Child>` just has to be replaced by a `ChildItemCollection<Parent, Child>`. We also remove the `AddChild` and `RemoveChild` methods, which are no longer necessary since the `ChildItemCollection<P,T>` takes care of setting the `Parent` property.

```csharp
public class Parent
{
    public Parent()
    {
        this.Children = new ChildItemCollection<Parent, Child>(this);
    }

    public string Name { get; set; }

    public ChildItemCollection<Parent, Child> Children { get; private set; }
}
```

Note that we give the `ChildItemCollection<Parent, Child>` constructor a reference to the current object: this is how the collection knows which object is the parent of its elements.

The code previously used to serialize a `Parent` now works fine. During the deserialization, the `Child.ParentObject` property is not assigned when the `Child` itself is deserialized (since it has the `XmlIgnore` attribute), but when the `Child` is added to the `Parent.Children` collection.

In the end, this solution lets us keep the parent/child relationship when the object graph is serialized to XML, without resorting to inelegant tricks… However, note that the consistency of the relationship can still be broken if `ParentObject` is changed by code outside the `ChildItemCollection<P,T>` class. To prevent that, some logic must be added to the `set` accessor to maintain consistency; I only omitted that part for the sake of clarity and simplicity.