Determining the intersection point of two lists in O(1) space

Alex Muscar

May 21, 2022

This was meant to be a single post outlining an approach for determining the intersection point of two singly-linked lists in constant space and its generalisation to work with other data structures. Since the result would be a relatively lengthy read I’ll publish the first part covering the linked list algorithm in this post, and I’ll leave the generalisation for a future post.

May 24, 2022: Laurențiu pointed out an error in the original version of the C++ implementation. The last loop was comparing the node values instead of the node identities. I’ve updated the implementation accordingly.

Problem statement

A while back a friend shared an interesting problem with me1: given the heads of two singly-linked lists, it asks for an algorithm to find the node at which they intersect. We can assume the input lists don’t have cycles. The linked lists must retain their structure after the algorithm is finished.

One obvious solution is to store the first list’s nodes in a set or similar data structure and check if any of the nodes in the second list are present in it.

To make things more interesting my friend also asked if this can be done in constant space.

A constant space solution

If the lists were the same length this would be easy to solve in constant space: just walk over the two lists at the same time and compare the elements. Since our lists can be any length we can’t use this approach.

What we do know is that our lists don’t have loops2. Can we can take advantage of this observation in any way? Since the lists don’t have loops they have an end. What if we walked the lists backwards comparing each element? That would work, but walking the lists backwards is awkward because they are singly-linked. The constraint on space rules out any additional data structures where we could store info abot the reverse lists. If only we had doubly linked lists.

Hm, maybe there’s a way forward: it’s not a doubly-linked list we really want, but the back pointers. What if we traversed each list reversing pointers along the way? We’d be left with two singly linked lists that we can then traverse to compare elements. Looks like we have a solution.

But the constraints ask for the structure of the lists to be left unchanged after the algorithm finishes. So we’d have to walk the lists again reversing pointers to their original direction. This works, but it’s getting complicated. Not to mention that we have to traverse each list 3 times. Can we come up with a simpler solution?

Why did we want to start at the end of the lists? Because the lists have different lengths so we don’t know how many nodes to skip from each until they get “in sync”, and we can start comparing nodes.

There’s an asumption we’re making here–and I think the original problem makes the same assumption implicitly. Namely, that the two lists, once they intersect, have the same elements. That’s an important assumption, and it’s what allows the approach outlined above to work. If the lists just intersected in one element, and then went their separate ways we’d face the same problem as when we wanted to start looking from the beginning of the lists. Keep in mind that I’m making this assumption moving forward.

Going back to walking the lists in reverse, we can look at it in a slightly different light: we’re walking a list that branches into two lists at some element. That’s the element we want to find. We don’t care about the nodes that come after. That’s a useful insight.

Our initial problem was that we didn’t know when the lists would get in sync so we can start comparing elements. But using our insight, it seems safe to just skip elements from the lists until each has an equal number of elements, that is, until we’re at the same distance from the split point on both branches. Then we can just compare the elements, and we’re done. Note that this approach still works when the two lists don’t intersect (why?).

Given two pointers A and B to singly-linked lists, we came up with the following agorithm:

  1. [Determine list lengths.] Set M ← length(A), and N ← length(B).
  2. [Sync A’s starting point with B.] If M > N, then set A ← next(A), M ← M − 1 and repeat this step.
  3. [Sync B’s starting point with A.] If N > M, then set B ← next(B), N ← N − 1 and repeat this step.
  4. [Search for first common node in lock-step] If A ≠ NULL ∧ B ≠ NULL ∧ A ≠ B set A ← next(A), B ← next(B) and repeat this step.
  5. [Done.] Return A.

(Why does it just work to return A?)

It’s finally time to write some code. Since the focus is on the intersection point algorithm we’ll define the simplest node structure we can work with:

template<typename T>
struct node
{
    T value;
    node* next;
};

Yes, it uses raw pointers which goes against modern C++ best practices. This is not production code.

Determining the length of a list is a straightforward task:

template<typename T>
int length(node<T>* a)
{
    int n = 0;
    while (a != nullptr) {
        n++;
        a = a->next;
    }
    return n;
}

With the boilerplate out of the way, the algorithm can be implemented as follows:

template<typename T>
node<T>* intersection_point(node<T>* a, node<T>* b)
{
    auto m = length(a);
    auto n = length(b);
    while (m > n) {
        a = a->next;
        m--;
    }
    while (n > m) {
        b = b->next;
        n--;
    }
    while (a != nullptr && b != nullptr && a != b) {
        a = a->next;
        b = b->next;
    }
    return a;
}

  1. A previous version of the post linked to the problem on a competitive coding site. Since then I’ve removed the link.↩︎

  2. There is another implicit assumption that the lists are finite. Since computers have a limited amount of memory this might sound pedantic, but it’s going to be important in the next post.↩︎