While working on Xtext and the inference of meta-models of a given grammar I encountered different problems that could be broken down to a well-known question: “What is the least common compatible type of those?”. While Liskov discusses how subtypes and supertypes could be derived in general I already had a hierarchy of defined types. Books like Types and Programming Languages seem to cover questions like this exhaustively and come to a quite simple view at this topic.
The following diagrams are geared to the UML notation. Boxes are types in this context whereas an arrow indicates a supertype relationship.
In Figure 1, A is the direct supertype of B where C is a direct and E an indirect subtype of A. Each type is compatible with A. Type E and F are compatible with B, which is also the least common super type of E and F. Every type is compatible to itself and the least common compatible type of D and G is D again.
So far, this is pretty obvious. The difficulties arise when dealing with Ecore, or EMF in the context of Eclipse. The meta-model of EMF says, that any EClass (read: Type) can have any number of supertypes. This leads into potential scenarios of multiple-inheritance where things are not so obvious anymore.
Figure 2 illustrates a simple scenario, where D and E have more than one supertype. Both, D and E are compatible with B and C. One can argue whether B or C should be called the common compatible type of D and E. It’s even harder to tell, whether B or C is the least common compatible type. Or are they both?
In the context of Xtext it was necessary to name a single type that works not only as an arbitrary compatible type in general (where both B and C would fit) but would unambiguously be the least common compatible type. In Figure 2 this would be A.
Some argued B is as “tight” as C. Others said, the absence of a unique root causes some sort of ambiguity. In the end we agreed that neither C nor B nor A would be “unambiguous enough” to fulfill the requirements of a so-called LUCCT. We also agreed that this question is more challenging than picking the most specific type that is compatible with F and D.
After thinking about this it turns out that a proper solution is not easy but it is no rocket science either.
- First, find the intersection of all compatible types of the inspected types.
- Next, concentrate on this set only and find all types that are directly or indirectly connected to every other type in the set. These are the LUCCT candidates.
- Last, select the most specific candidate of step 2 as the result. If there are no candidates there’s no LUCCT for the inspected types. Yes, there is always a most specific candidate otherwise it would not be a candidate.
In the given scenario of Figure 4 where one wants to determine the LUCCT of G and I the set of compatible types consists of A, B, C, D, E and F. These types on their own are all directly or indirectly subtypes or supertypes of A and D whereas pairwise compared B and C or E and F are not. Since D is more specific than A the LUCCT of G and I is D, voilà!
The gentle reader may apologize the absence of terms like graph reduction or isthmus to explain this method.