Implementing Value Equality in C#

Updated on 2019-11-17

Robustly implement value equality semantics in your classes in C#

Introduction

This article demonstrates two techniques for implementing value equality semantics in C#.

Background

Reference equality and value equality are two different ways to determine whether two objects are equal.

With reference equality, two object references are compared by memory address. If both references point to the same memory address, they are equal. Otherwise, they are not. Under reference equality, the data the object holds is not considered. The only time two objects are equal is if they actually refer to the same instance.

Often, we would prefer to use value equality. With value equality, two objects are considered equal if all of their fields have the same data, whether or not they point to the same memory location. That means multiple instances can be equal to each other, unlike with reference equality.
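To make the distinction concrete, here is a minimal sketch. Point and EqualityKindsDemo are hypothetical names used only for illustration and are not part of the demo project. Point does not override Equals(), so it keeps the default reference equality, while System.String compares by value:

```csharp
using System;

// Hypothetical type for illustration only. It does not override Equals(),
// so it keeps the default reference equality that classes get.
public class Point
{
    public int X;
    public int Y;
}

public static class EqualityKindsDemo
{
    public static void Run()
    {
        var p1 = new Point { X = 1, Y = 2 };
        var p2 = new Point { X = 1, Y = 2 };
        var p3 = p1;

        // same data, different instances: not equal by reference
        Console.WriteLine(ReferenceEquals(p1, p2)); // False
        Console.WriteLine(p1.Equals(p2));           // False (default Object.Equals)
        // two references to the same instance: equal
        Console.WriteLine(ReferenceEquals(p1, p3)); // True

        // System.String overrides Equals() to compare characters, so
        // distinct instances holding the same text are equal by value
        var s1 = new string('a', 3);
        var s2 = new string('a', 3);
        Console.WriteLine(ReferenceEquals(s1, s2)); // False
        Console.WriteLine(s1.Equals(s2));           // True
    }
}
```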

.NET provides a couple of facilities for implementing value equality semantics, depending on how you intend to use them.

One way to do it is to override Equals() and GetHashCode() (and overload the == and != operators) on the class itself. Doing so means the class will always use value semantics. This might not be what you want: you may still need to distinguish between instances, and value comparisons cost more than reference comparisons. Often, however, this is exactly what you need. Use your best judgement.

Another way to do it is to create a class that implements IEqualityComparer<T>. This allows your class to be compared using value semantics within classes like Dictionary<TKey,TValue>, while normal comparisons still use reference equality. Sometimes, this is precisely what you need.

We'll explore both mechanisms here.

Coding this Mess

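First, consider the Employee class (the listings in this article assume using System; and using System.Collections.Generic; at the top of the file):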

public class Employee
{
    public int Id;
    public string Name;
    public string Title;
    public DateTime Birthday;
}

As you can see, this is a very simple class that represents a single employee. By default, classes use reference equality semantics, so getting value semantics takes some additional work.

We can use value semantics with this class, or with any class, by creating a class that implements IEqualityComparer<T>:

// a class for comparing two employees for equality
// this class is used by the framework in classes like
// Dictionary<TKey,TValue> to do key comparisons.
public class EmployeeEqualityComparer : IEqualityComparer<Employee>
{
    // static singleton field
    public static readonly EmployeeEqualityComparer Default = new EmployeeEqualityComparer();
    // compare two employee instances for equality
    public bool Equals(Employee lhs,Employee rhs)
    {
        // always check this first to avoid unnecessary work
        if (ReferenceEquals(lhs, rhs)) return true;
        // short circuit for nulls
        if (ReferenceEquals(lhs, null) || ReferenceEquals(rhs, null))
            return false;
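        // compare each of the fields
        // strings are compared ordinally so that Equals()
        // agrees with the hashcodes used in GetHashCode()
        // (string.Compare() is culture-sensitive and can
        // treat differently-hashed strings as equal)
        return lhs.Id == rhs.Id &&
            string.Equals(lhs.Name, rhs.Name, StringComparison.Ordinal) &&
            string.Equals(lhs.Title, rhs.Title, StringComparison.Ordinal) &&
            lhs.Birthday == rhs.Birthday;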
    }
    // gets the hashcode for the employee
    // this value must be the same as long
    // as the fields are the same.
    public int GetHashCode(Employee lhs)
    {
        // short circuit for null
        if (null == lhs) return 0;
        // get the hashcode for each field
        // taking care to check for nulls
        // we XOR the hashcodes for the
        // result
        var result = lhs.Id.GetHashCode();
        if (null != lhs.Name)
            result ^= lhs.Name.GetHashCode();
        if (null != lhs.Title)
            result ^= lhs.Title.GetHashCode();
        result ^= lhs.Birthday.GetHashCode();
        return result;
    }
}

Once you've done that, you can then pass this class to, for example, a dictionary:

var d = new Dictionary<Employee, int>(EmployeeEqualityComparer.Default);

Doing this allows the dictionary to use value semantics for key comparisons. This means that keys are compared by the values of their fields rather than by instance identity. Note that Employee is the dictionary key above. I've often used equality comparer classes when I needed to use collections as keys in dictionaries. This is a reasonable application, as you normally do not want value semantics for collections even if you need them in particular cases.
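As a rough sketch of the collections-as-keys case, a comparer for lists might look like the following. ListEqualityComparer<T> and ListKeyDemo are hypothetical names invented for this example, not part of the demo project, and the hash combining (multiply by 31, then add) is just one common choice:

```csharp
using System.Collections.Generic;

// Hypothetical helper: compares two lists element by element.
public class ListEqualityComparer<T> : IEqualityComparer<IList<T>>
{
    public static readonly ListEqualityComparer<T> Default = new ListEqualityComparer<T>();

    public bool Equals(IList<T> lhs, IList<T> rhs)
    {
        if (ReferenceEquals(lhs, rhs)) return true;
        if (ReferenceEquals(lhs, null) || ReferenceEquals(rhs, null)) return false;
        if (lhs.Count != rhs.Count) return false;
        // compare the items using T's own notion of equality
        var ec = EqualityComparer<T>.Default;
        for (var i = 0; i < lhs.Count; ++i)
            if (!ec.Equals(lhs[i], rhs[i])) return false;
        return true;
    }

    public int GetHashCode(IList<T> list)
    {
        if (list == null) return 0;
        var ec = EqualityComparer<T>.Default;
        var result = 17;
        unchecked
        {
            // order-sensitive combination of the item hashcodes
            foreach (var item in list)
                result = result * 31 + (item == null ? 0 : ec.GetHashCode(item));
        }
        return result;
    }
}

public static class ListKeyDemo
{
    // returns true: a different list instance with the
    // same items finds the existing key
    public static bool Run()
    {
        var d = new Dictionary<IList<int>, string>(ListEqualityComparer<int>.Default);
        d.Add(new List<int> { 1, 2, 3 }, "first");
        return d.ContainsKey(new List<int> { 1, 2, 3 });
    }
}
```

Lists compared this way should not be modified while they are in use as keys, since changing the items changes the hashcode.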

Moving on to the second method, implementing value semantics on the class itself:

// represents a basic employee
// with value equality
// semantics
public class Employee2 :
    // implementing this interface tells the .NET
    // framework classes that we can compare based on
    // value equality.
    IEquatable<Employee2>
{
    public int Id;
    public string Name;
    public string Title;
    public DateTime Birthday;

    // implementation of
    // IEquatable<Employee2>.Equals()
    public bool Equals(Employee2 rhs)
    {
        // short circuit if rhs and this
        // refer to the same memory location
        // (reference equality)
        if (ReferenceEquals(rhs, this))
            return true;
        // short circuit for nulls
        if (ReferenceEquals(rhs, null))
            return false;
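        // compare each of the fields
        // strings are compared ordinally so that Equals()
        // agrees with the hashcodes used in GetHashCode()
        // (string.Compare() is culture-sensitive and can
        // treat differently-hashed strings as equal)
        return Id == rhs.Id &&
            string.Equals(Name, rhs.Name, StringComparison.Ordinal) &&
            string.Equals(Title, rhs.Title, StringComparison.Ordinal) &&
            Birthday == rhs.Birthday;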
    }
    // basic .NET value equality support
    public override bool Equals(object obj)
        => Equals(obj as Employee2);
    // gets the hashcode based on the value
    // of Employee2. The hashcodes MUST be
    // the same for any Employee2 that
    // equals another Employee2!
    public override int GetHashCode()
    {
        // go through each of the fields,
        // getting the hashcode, taking
        // care to check for null strings
        // we XOR the hashcodes together
        // to get a result
        var result = Id.GetHashCode();
        if (null != Name)
            result ^= Name.GetHashCode();
        if (null != Title)
            result ^= Title.GetHashCode();
        result ^= Birthday.GetHashCode();
        return result;
    }
    // enable == support in C#
    public static bool operator==(Employee2 lhs,Employee2 rhs)
    {
        // short circuit for reference equality
        if (ReferenceEquals(lhs, rhs))
            return true;
        // short circuit for null
        if (ReferenceEquals(lhs, null) || ReferenceEquals(rhs, null))
            return false;
        return lhs.Equals(rhs);
    }
    // enable != support in C#
    public static bool operator !=(Employee2 lhs, Employee2 rhs)
    {
        // essentially the reverse of ==
        if (ReferenceEquals(lhs, rhs))
            return false;
        if (ReferenceEquals(lhs, null) || ReferenceEquals(rhs, null))
            return true;
        return !lhs.Equals(rhs);
    }
}

As you can see, this is a bit more involved. We have the Equals() and GetHashCode() methods, which should be familiar, but we also have an Equals(object) override and two operator overloads, and we implement IEquatable<Employee2>. Despite this extra code, the basic idea is the same as with the first method.

We implement Equals(Employee2 rhs) and GetHashCode() almost the same way as we did in the first method, but we also need to override Equals(object) and forward the call. In addition, we create two operator overloads for == and !=, which repeat the reference equality and null checks and then forward to Equals().
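A side note on GetHashCode(): XOR is valid, but it ignores field order, so an employee whose Name and Title are swapped gets the same hashcode. That only costs performance, not correctness. The sketch below contrasts XOR with an order-sensitive combination. HashSketch and its methods are hypothetical names, and 17 and 31 are conventional constants rather than anything the framework requires. On targets that provide it (.NET Core 2.1 and later), System.HashCode.Combine() does a similar job for you.

```csharp
public static class HashSketch
{
    // what the listings above do: XOR the field hashcodes
    public static int CombineXor(int h1, int h2)
    {
        return h1 ^ h2;
    }

    // an order-sensitive alternative: multiply, then add
    public static int CombineOrdered(int h1, int h2)
    {
        unchecked // let the arithmetic overflow silently
        {
            var result = 17;
            result = result * 31 + h1;
            result = result * 31 + h2;
            return result;
        }
    }
}
```

With this sketch, CombineXor(1, 2) and CombineXor(2, 1) both return 3, while CombineOrdered(1, 2) returns 16370 and CombineOrdered(2, 1) returns 16400.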

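Once a class implements value equality itself, as Employee2 does, Equals(), the == and != operators, and the framework collections all use value semantics, which is what we want. To compare by reference you must ask for it explicitly, either with ReferenceEquals() or by casting both operands to object before using ==.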
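Here is a small sketch of how reference comparisons remain available on such a class. Tag is a hypothetical, stripped-down stand-in for Employee2 with a single field, and ReferenceCheckDemo is likewise invented for this example:

```csharp
using System;

// Hypothetical minimal class with value equality, standing in for Employee2.
public sealed class Tag : IEquatable<Tag>
{
    public string Name;

    public bool Equals(Tag rhs)
        => !ReferenceEquals(rhs, null) && Name == rhs.Name;
    public override bool Equals(object obj)
        => Equals(obj as Tag);
    public override int GetHashCode()
        => Name == null ? 0 : Name.GetHashCode();
    public static bool operator ==(Tag lhs, Tag rhs)
        => ReferenceEquals(lhs, null) ? ReferenceEquals(rhs, null) : lhs.Equals(rhs);
    public static bool operator !=(Tag lhs, Tag rhs)
        => !(lhs == rhs);
}

public static class ReferenceCheckDemo
{
    public static void Run()
    {
        var a = new Tag { Name = "x" };
        var b = new Tag { Name = "x" };
        // value semantics through the members we defined
        Console.WriteLine(a == b);                 // True
        Console.WriteLine(a.Equals(b));            // True
        // reference semantics when we ask for them explicitly
        Console.WriteLine(ReferenceEquals(a, b));  // False
        // operators are chosen by static type, so comparing
        // as object bypasses the overload on Tag
        Console.WriteLine((object)a == (object)b); // False
    }
}
```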

Examples of using this can be found in the Main() method of the demo project's Program class:

static void Main(string[] args)
{
    // prepare 2 employee instances
    // with the same data
    var e1a = new Employee()
    {
        Id = 1,
        Name = "John Smith",
        Title = "Software Design Engineer in Test",
        Birthday = new DateTime(1981, 11, 19)
    };
    var e1b = new Employee()
    {
        Id = 1,
        Name = "John Smith",
        Title = "Software Design Engineer in Test",
        Birthday = new DateTime(1981, 11, 19)
    };
    // these will return false, since the 2 instances are different
    // this is reference equality:
    Console.WriteLine("e1a.Equals(e1b): {0}", e1a.Equals(e1b));
    Console.WriteLine("e1a==e1b: {0}", e1a==e1b);
    // this will return true since this class is designed
    // to compare the data in the fields:
    Console.WriteLine("EmployeeEqualityComparer.Equals(e1a,e1b): {0}",
        EmployeeEqualityComparer.Default.Equals(e1a, e1b));
    // prepare a dictionary:
    var d1 = new Dictionary<Employee, int>();
    d1.Add(e1a,0);
    // will return true since the dictionary has a key with this instance
    Console.WriteLine("Dictionary.ContainsKey(e1a): {0}", d1.ContainsKey(e1a));
    // will return false since the dictionary has no key with this instance
    Console.WriteLine("Dictionary.ContainsKey(e1b): {0}", d1.ContainsKey(e1b));
    // prepare a dictionary with our custom equality comparer:
    d1 = new Dictionary<Employee, int>(EmployeeEqualityComparer.Default);
    d1.Add(e1a, 0);
    // will return true since the instance is the same
    Console.WriteLine("Dictionary(EC).ContainsKey(e1a): {0}", d1.ContainsKey(e1a));
    // will return true since the fields are the same
    Console.WriteLine("Dictionary(EC).ContainsKey(e1b): {0}", d1.ContainsKey(e1b));

    // prepare 2 Employee2 instances
    // with the same data:
    var e2a = new Employee2()
    {
        Id = 1,
        Name = "John Smith",
        Title = "Software Design Engineer in Test",
        Birthday = new DateTime(1981, 11, 19)
    };
    var e2b = new Employee2()
    {
        Id = 1,
        Name = "John Smith",
        Title = "Software Design Engineer in Test",
        Birthday = new DateTime(1981, 11, 19)
    };
    // these will return true because Employee2 overrides
    // Equals() and overloads == to compare the fields
    Console.WriteLine("e2a.Equals(e2b): {0}", e2a.Equals(e2b));
    Console.WriteLine("e2a==e2b: {0}", e2a == e2b);
    // prepare a dictionary:
    var d2 = new Dictionary<Employee2, int>();
    d2.Add(e2a, 0);
    // these will return true, since Employee2 implements
    // Equals() and GetHashCode():
    Console.WriteLine("Dictionary.ContainsKey(e2a): {0}", d2.ContainsKey(e2a));
    Console.WriteLine("Dictionary.ContainsKey(e2b): {0}", d2.ContainsKey(e2b));
}

Points of Interest

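Structs get a kind of value equality by default: ValueType.Equals() compares each field by calling that field's own Equals(). This works until a field's type uses reference equality (an array, for example), so you may find yourself implementing value semantics on a struct anyway if you need to compare those fields by value. Note also that the default implementation can fall back to reflection, which is slow, and that structs do not get == and != operators unless you define them.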
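The following sketch illustrates the point. Reading and Reading2 are hypothetical structs made up for this example. Reading relies on the default behavior, so two values holding distinct arrays with the same contents are not equal. Reading2 compares the array contents itself:

```csharp
using System;

// Hypothetical struct relying on the default ValueType.Equals():
// the array field is compared by reference.
public struct Reading
{
    public int Id;
    public int[] Samples;
}

// The same data with value semantics implemented by hand.
public struct Reading2 : IEquatable<Reading2>
{
    public int Id;
    public int[] Samples;

    public bool Equals(Reading2 rhs)
    {
        if (Id != rhs.Id) return false;
        // same array instance (or both null)
        if (object.ReferenceEquals(Samples, rhs.Samples)) return true;
        if (Samples == null || rhs.Samples == null) return false;
        if (Samples.Length != rhs.Samples.Length) return false;
        // compare the array contents
        for (var i = 0; i < Samples.Length; ++i)
            if (Samples[i] != rhs.Samples[i]) return false;
        return true;
    }
    public override bool Equals(object obj)
        => obj is Reading2 && Equals((Reading2)obj);
    public override int GetHashCode()
    {
        unchecked
        {
            var result = Id;
            if (Samples != null)
                foreach (var s in Samples)
                    result = result * 31 + s;
            return result;
        }
    }
    // structs get no == or != unless we define them
    public static bool operator ==(Reading2 lhs, Reading2 rhs) => lhs.Equals(rhs);
    public static bool operator !=(Reading2 lhs, Reading2 rhs) => !lhs.Equals(rhs);
}
```

With these definitions, two Reading values built from new[] { 1, 2 } compare as not equal, while the equivalent Reading2 values compare as equal with both Equals() and ==.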

History

  • 17th November, 2019 - Initial submission