Wonga’s backend is composed of autonomous services that communicate via messaging using NServiceBus. All in all, our backend services handle close to 2,000 different message types. Our range of products is offered in many countries, and local regulations impose significant variations. To manage this complexity, our Continuous Integration system builds, tests and packages a different variant of our backend services for every offering. Consequently, the set of handled message types varies across offerings.

In order to populate a Configuration Management Database, we want to find all message types handled in an offering. The collected data can later be used to validate configurations (e.g. message routes) or to configure various monitoring systems. In NServiceBus, message handlers implement the IHandleMessages<T> interface, where T is the message type. Hence, in order to identify all the handlers, we have to find all the concrete implementations of IHandleMessages<T>.

The obvious solution is to write a simple .NET executable loading all the packaged assemblies into the CLR. Thanks to reflection, we can identify all the subtypes of IHandleMessages<T> quite easily:

Directory.GetFiles(path, "*.dll")
  .Select(dll => Assembly.LoadFrom(dll))
  .SelectMany(assembly => assembly.GetTypes())
  .Where(type => type.IsClass && !type.IsAbstract)
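  // An open generic type is never assignable from a closed implementation,
  // so compare the generic type definition of each implemented interface.
  .Where(type => type.GetInterfaces().Any(i =>
      i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IHandleMessages<>)));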

However, reflection is usually a poor fit for static code analysis, mainly because an assembly loaded into an AppDomain cannot be unloaded on its own; it stays loaded until the whole AppDomain is unloaded. Hence, we use the Cecil library from the Mono project to map the assembly into memory, parse the CIL, and walk the internal tables manually. Many tools use Cecil, including NDepend.

If we rewrite the expression above using Cecil, we get something that looks quite similar:

Directory.GetFiles(path, "*.dll")
  .Select(dll => ModuleDefinition.ReadModule(dll))
  .SelectMany(module => module.GetTypes())
  .Where(type => type.IsClass && !type.IsAbstract)
  .Where(type => IsSubType(type, type.Module.GetType("NServiceBus", "IHandleMessages`1")));

It is up to us to implement IsSubType, though:

bool IsSubType(TypeDefinition type, TypeReference superType)
{
  return
      IsType(type, superType) ||
      (type.BaseType != null && IsSubType(type.BaseType.Resolve(), superType)) ||
      (type.Interfaces
          // "interface" is a C# keyword, so the lambda parameter is named iface.
          .Select(iface => iface.Resolve())
          .Any(iface => IsSubType(iface, superType)));
}

bool IsType(TypeReference a, TypeReference b)
{
  return a.Namespace == b.Namespace && a.Name == b.Name;
}

The base case checks whether the two types are the same. If they are, then by definition, type is a subtype of superType. Otherwise, there are two possible recursion paths. First, if type extends a class, we check whether that base class is a subtype of superType. If that fails, we check whether any of the implemented interfaces is a subtype of superType.

Now that our code finds the message handlers packaged in an offering, we can find the handled message types. As previously mentioned, the IHandleMessages<T> interface is parameterized by the message type, so at first glance it looks like it shouldn’t be too complicated.

Directory.GetFiles(path, "*.dll")
  .Select(dll => ModuleDefinition.ReadModule(dll))
  .SelectMany(module => module.GetTypes())
  .Where(type => type.IsClass && !type.IsAbstract)
  .Where(type => IsSubType(type, type.Module.GetType("NServiceBus", "IHandleMessages`1")))
  .Select(GetMessageType)

A simple although incorrect implementation of GetMessageType is the following:

TypeReference GetMessageType(TypeDefinition handlerDefinition)
{
  // Only looks at the interfaces listed directly on the handler.
  TypeReference referenceToIHandleMessages = handlerDefinition.Interfaces
      .Where(iface => IsType(
          iface, handlerDefinition.Module.GetType("NServiceBus", "IHandleMessages`1")))
      .Single();
  return ((GenericInstanceType)referenceToIHandleMessages).GenericArguments.Single();
}

We look for IHandleMessages<T> in the interfaces implemented by the handler. Then, we extract the only type argument from the reference to IHandleMessages<T>. Note that in order to get the type argument, we cast the reference to IHandleMessages<T> to a GenericInstanceType. In the picture below, IHandleMessages<T> is a TypeDefinition and IHandleMessages<ILoanApproved> is a GenericInstanceType. (In the definition, T is a generic type parameter, while in the reference ILoanApproved is a generic type argument.)

Although this implementation of GetMessageType works in the vast majority of the cases, it does not work in the general case. As you have probably noticed, we are assuming that the handler implements IHandleMessages<T> directly. In other words, GetMessageType does not work if the handler either inherits from a class implementing IHandleMessages<T> or implements another interface extending IHandleMessages<T>. Let’s have a look at three examples breaking the current implementation:
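
As an illustration, here is a sketch of three handler shapes of the kind described above: a handler inheriting from a class that implements the interface, a handler implementing an interface that extends it, and a handler inheriting from a generic base class. IHandleMessages<T> is declared locally as a stand-in for the NServiceBus interface, and every type name other than ILoanApproved and FirstHandler<T, U> is made up for the example.

```csharp
// Local stand-ins so the sketch compiles on its own (assumptions, not NServiceBus types).
public interface ILoanApproved { }
public interface IHandleMessages<T> { void Handle(T message); }

// 1. The handler inherits from a class that implements IHandleMessages<T>.
public class BaseHandler : IHandleMessages<ILoanApproved>
{
    public virtual void Handle(ILoanApproved message) { }
}
public class DerivedHandler : BaseHandler { }

// 2. The handler implements another interface that extends IHandleMessages<T>.
public interface IHandleLoanApproved : IHandleMessages<ILoanApproved> { }
public class InterfaceHandler : IHandleLoanApproved
{
    public void Handle(ILoanApproved message) { }
}

// 3. The message type travels through a type parameter of a generic base class.
public abstract class FirstHandler<T, U> : IHandleMessages<U>
{
    public abstract void Handle(U message);
}
public class SecondHandler : FirstHandler<int, ILoanApproved>
{
    public override void Handle(ILoanApproved message) { }
}
```

In the third shape the handler never mentions IHandleMessages<T> at all, and the message type only appears as the second argument of FirstHandler<T, U>.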

We are trying to solve the general problem, so let’s start by rewriting GetMessageType to delegate to a generic GetActualTypeArgument method.

TypeReference GetMessageType(TypeDefinition handlerDefinition)
{
  return GetActualTypeArgument(
      handlerDefinition, handlerDefinition.Module.GetType("NServiceBus", "IHandleMessages`1"), 0);
}

GetActualTypeArgument takes three parameters: the subtype, the supertype, and the index of the supertype’s type parameter we want to unify. As IHandleMessages<T> has only one type parameter, the index is 0.

TypeReference GetActualTypeArgument(
  TypeReference subType, TypeReference superType, int typeParameterIndex)
{
  IList<TypeReference> linearHierarchy =
      GetLinearHierarchy(subType, superType);
  IList<TypeParameterInstantiation> chain =
      GroupInTypeParameterInstantiations(typeParameterIndex, linearHierarchy);
  return Unify(chain);
}

The method works in three steps. First, we extract the linear hierarchy of type references from the supertype to the subtype. Second, for every type reference in the hierarchy, we resolve its definition and extract type parameter instantiations. Finally, we unify the chain of instantiations to resolve for the type parameter we are looking for.

The implementation of GetLinearHierarchy is quite similar to IsSubType. Its goal is to return only the relevant type references from the hierarchy, e.g. only the types from IHandleMessages<T> to the handler.

IList<TypeReference> GetLinearHierarchy(TypeReference subType, TypeReference superType)
{
  TypeDefinition subTypeDefinition = subType.Resolve();

  Collection<TypeReference> interfaces = subTypeDefinition.Interfaces;
  // "interface" is a C# keyword, so the loop variable is named iface.
  foreach (TypeReference iface in interfaces)
  {
      if (IsType(iface, superType))
      {
          return new List<TypeReference>(new [] { iface, subType });
      }

      IList<TypeReference> linearHierarchy = GetLinearHierarchy(iface, superType);
      if (linearHierarchy != null)
      {
          linearHierarchy.Add(subType);
          return linearHierarchy;
      }
  }

  TypeReference baseType = subTypeDefinition.BaseType;
  if (baseType == null)
  {
      return null;
  }
  else if (IsType(baseType, superType))
  {
      return new List<TypeReference>(new[] { baseType, subType });
  }
  else
  {
      IList<TypeReference> linearHierarchy = GetLinearHierarchy(baseType, superType);
      if (linearHierarchy != null)
      {
          linearHierarchy.Add(subType);
          return linearHierarchy;
      }
  }
  return null;
}

The next step is to resolve the definition of every type reference in the hierarchy and extract type parameter instantiations. The pictures below demonstrate what we are trying to achieve:

GroupInTypeParameterInstantiations returns a list of TypeParameterInstantiations, a simple class encapsulating a GenericParameter and a TypeReference. Note that as exemplified in the pictures above, the TypeReference might be anything, including a GenericParameter.
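
The class itself is not shown here, so the following is only a minimal sketch of what it could look like. The constructor argument order and the GenericArgument property match how the class is used in the code of this post; the GenericParameter property name and the ToString override are assumptions added for readability.

```csharp
using Mono.Cecil;

// Pairs a generic parameter with the type reference substituted for it,
// e.g. T := ILoanApproved, or T := U when the argument is itself a parameter.
public sealed class TypeParameterInstantiation
{
    public TypeParameterInstantiation(
        TypeReference genericArgument, GenericParameter genericParameter)
    {
        GenericArgument = genericArgument;
        GenericParameter = genericParameter;
    }

    // What the parameter is instantiated with; may itself be a GenericParameter.
    public TypeReference GenericArgument { get; private set; }

    // The parameter being instantiated (property name is an assumption).
    public GenericParameter GenericParameter { get; private set; }

    // Renders the pair in the "T := U" notation used in the text.
    public override string ToString()
    {
        return GenericParameter.FullName + " := " + GenericArgument.FullName;
    }
}
```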

IList<TypeParameterInstantiation> GroupInTypeParameterInstantiations(
  int index, IList<TypeReference> linearHierarchy)
{
  TypeReference type = linearHierarchy.First();
  List<TypeReference> tail = linearHierarchy.Skip(1).ToList();

  TypeDefinition typeDefinition = type.Resolve();
  GenericParameter genericParameter = typeDefinition.GenericParameters[index];
  TypeReference genericArgument = ((GenericInstanceType)type).GenericArguments[index];
  var instantiation = new TypeParameterInstantiation(genericArgument, genericParameter);

  if (genericArgument is GenericParameter)
  {
      TypeDefinition subTypeDefinition = tail.First().Resolve();
      for (var j = 0; j < subTypeDefinition.GenericParameters.Count; j++)
      {
          if (subTypeDefinition.GenericParameters[j].FullName == genericArgument.FullName)
          {
              IList<TypeParameterInstantiation> result =
                  GroupInTypeParameterInstantiations(j, tail);
              result.Add(instantiation);
              return result;
          }
      }
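      // The argument is a type parameter that the next type in the hierarchy does not declare.
      throw new InvalidOperationException(
          "Type parameter " + genericArgument.FullName + " not found in " + subTypeDefinition.FullName);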
  }
  return new List<TypeParameterInstantiation>(new[] { instantiation });
}

GroupInTypeParameterInstantiations starts from the top of the hierarchy. We extract the generic argument from the type reference and the generic parameter from the type definition using the index, and store them in a TypeParameterInstantiation.

If the generic argument is a generic parameter as in T := U, then we need to recurse in order to find an instantiation for U. Otherwise, we’re done.

The recursion is a little bit tricky. We first need to find the index of the type parameter in the next type in the hierarchy. As U is a type parameter, we look at the subtype definition, which is FirstHandler<T, U>. We iterate through its type parameters to find the index of U, which is 1. Then, we call GroupInTypeParameterInstantiations with that index on the rest of the hierarchy.

Now that we have all the instantiations, the last step is to unify them in order to find a substitution for the type parameter we are looking for. This is quite simple, as we just have to return the first generic argument from the chain:

TypeReference Unify(IList<TypeParameterInstantiation> chain)
{
  return chain.First().GenericArgument;
}
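
To make the chain concrete, here is a small runnable sketch that traces the same two instantiations, T := U and U := ILoanApproved. It deliberately uses System.Reflection rather than Cecil so that it runs without any packaged assemblies, so it illustrates the reasoning and not the Cecil implementation. IHandleMessages<T> is a local stand-in for the NServiceBus interface, and SecondHandler with its int argument is an assumption.

```csharp
using System;

// Local stand-ins so the sketch compiles on its own (assumptions, not NServiceBus types).
public interface ILoanApproved { }
public interface IHandleMessages<T> { void Handle(T message); }

public abstract class FirstHandler<T, U> : IHandleMessages<U>
{
    public abstract void Handle(U message);
}

public class SecondHandler : FirstHandler<int, ILoanApproved>
{
    public override void Handle(ILoanApproved message) { }
}

public static class ChainDemo
{
    public static Type ResolveMessageType()
    {
        // Step 1: IHandleMessages<U> as written on the open definition FirstHandler<T, U>.
        Type interfaceReference = typeof(FirstHandler<,>).GetInterfaces()[0];
        Type argument = interfaceReference.GetGenericArguments()[0]; // T := U

        // Step 2: U is itself a type parameter; find its index in FirstHandler<T, U>.
        int index = argument.GenericParameterPosition; // 1

        // Step 3: read the argument at that index from the reference SecondHandler uses.
        Type baseReference = typeof(SecondHandler).BaseType; // FirstHandler<int, ILoanApproved>
        return baseReference.GetGenericArguments()[index]; // U := ILoanApproved
    }

    public static void Main()
    {
        Console.WriteLine(ResolveMessageType().Name); // prints "ILoanApproved"
    }
}
```

The last instantiation found is the one at the bottom of the hierarchy, which is why Unify only needs the first element of the chain.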

GetMessageType now successfully returns the handled message type of handlers that do not implement IHandleMessages<T> directly. There is still one case that is not handled: handlers that implement IHandleMessages<T> multiple times, once per message type. In order to address this issue, one would need to modify GetLinearHierarchy to return multiple hierarchies and then find and unify the instantiations of each linear hierarchy independently.
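
A hypothetical handler of that kind could look like the sketch below. IHandleMessages<T> is again a local stand-in, and ILoanDeclined and LoanDecisionHandler are names invented for the example.

```csharp
// Local stand-ins so the sketch compiles on its own (assumptions, not NServiceBus types).
public interface ILoanApproved { }
public interface ILoanDeclined { }
public interface IHandleMessages<T> { void Handle(T message); }

// One handler, two message types: there are two paths up to IHandleMessages<T>,
// but GetLinearHierarchy returns as soon as it has found the first one.
public class LoanDecisionHandler :
    IHandleMessages<ILoanApproved>,
    IHandleMessages<ILoanDeclined>
{
    public void Handle(ILoanApproved message) { }
    public void Handle(ILoanDeclined message) { }
}
```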

Finally, it is also worth noting that GetActualTypeArgument doesn’t work if a type argument in the hierarchy is a parameterized type, as in the following example:
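
A hypothetical hierarchy with this shape is sketched here. IHandleMessages<T> is a local stand-in, and Envelope<T>, EnvelopeHandler<T> and LoanApprovedHandler are names invented for the example.

```csharp
// Local stand-ins so the sketch compiles on its own (assumptions, not NServiceBus types).
public interface ILoanApproved { }
public interface IHandleMessages<T> { void Handle(T message); }

// A parameterized message wrapper.
public class Envelope<T>
{
    public T Body { get; set; }
}

// The type argument passed to IHandleMessages is Envelope<T>: a generic
// instance, not a bare type parameter.
public abstract class EnvelopeHandler<T> : IHandleMessages<Envelope<T>>
{
    public abstract void Handle(Envelope<T> message);
}

public class LoanApprovedHandler : EnvelopeHandler<ILoanApproved>
{
    public override void Handle(Envelope<ILoanApproved> message) { }
}
```

Reading the code above, the argument Envelope<T> is not a GenericParameter, so the recursion stops immediately and the result would be Envelope<T> with T left unresolved, rather than Envelope<ILoanApproved>.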

GroupInTypeParameterInstantiations needs an extra branch checking whether genericArgument is an instance of GenericInstanceType. If it is, then we need to recursively extract all the type parameters from the argument. The complexity arises from the fact that we now potentially have to track multiple type arguments, which makes the implementation of Unify more complicated.

We will implement these changes in an upcoming blog post. If you want to tackle the challenge on your own, feel free to send your solution to talent@wonga.com. We would love to talk to you.