In this post I will present solution to the following problem:

We have a non empty singly linked list with a cycle in it.

We must find first element of a cycle in a linear time.

For example, given a singly linked list:
Algorithm should return node `N3`

as it is the first node counting
from head of a list that is part of cycle.

##### Boundary conditions

To better understand this problem let’s think about boundary conditions for our algorithm.

One of the boundary conditions is that entire list forms single cycle:
In this case our algorithm should return first element of a list (`N1`

node for a list
presented on picture above).

Another boundary condition is that only last element
of the list forms a cycle:
In this case algorithm should return last element of the list (`N4`

node for
a list presented on picture above).

The last boundary condition is a list that consists of only one element: In this case algorithm should return that single element.

For all except last described boundary conditions we should consider lists with odd and even number of elements.

##### Create test suite

Let’s code these boundary conditions and a few of other generic cases as a set of unit tests. To represent list nodes we will use following Java class:

Our algorithm will be represented by `CycleStart.find`

method:

To avoid code duplication we will use JUnit 4 parameterized unit tests. AssertJ will be used as an assertion library:

Each of arrays returned by static `data()`

method contains
`CycleStartTest`

constructor parameters. For each of these arrays JUnit will create
`CycleStartTest`

instance with parameters passed to class constructor. Then
JUnit will call all methods annotated with `@Test`

on that instance.
In our case we have two parameters `listSize`

and `cycleSize`

, I think these
are pretty self explanatory.

The last missing piece is `ListWithCycle`

helper that creates list of given size
with cycle of given size:

Now with unit tests covering all normal and border cases we can start implementing our algorithm.

##### Create simple implementation

Let’s start with simple implementation that works in `O(N)`

time but uses `O(N)`

additional memory.
The idea behind this algorithm is simple. We will be tracking all
already visited nodes, when we visit some node N twice we will know that
node N must be start of a cycle.

This is implemented in Java as:

We use `IdentityHashMap<K,V>`

to track already visited nodes.
`IdentityHashMap<K,V>`

is special purpose `Map<K,V>`

implementation that uses *references* to objects
as a keys.
Internally it uses `==`

instead of `Object.equals()`

to compare keys for equality, and
`System.identityHashCode()`

instead of `Object.hashCode()`

to compute hashes.

This algorithm passes all unit tests but can’t we do any better?

In fact we can, there is a beautiful algorithm that runs in `O(N)`

time and
uses `O(1)`

memory.

##### Create efficient implementation

Let’s create a more efficient implementation. To understand how it works we must first familiarize ourselves with fast and slow pointer method. Main usage of fast and slow pointer method is cycle detection in singly linked lists. The main idea is to have two pointers that traverse the same list. One pointer “slow” moves only one element at time, other pointer “fast” moves two elements at time. Singly linked list contains cycle if both pointers met. This can be implemented in Java as:

In our algorithm we’ll use simple fact from fast/slow pointer method: when fast and slow pointers are inside cycle, fast pointer cannot “jump over” slow pointer. In other words situation like: is not possible.

PROOF: First we assume that slow pointer always moves first and that
we stop algorithm when fast and slow pointer meet. (If entire list is one
cycle we would not count first step - when fast and slow pointers point to the list head).
Let’s assume that fast pointer jumped over slow pointer as depicted on image above.
We know that slow pointer always moves first and it moves only one element at time,
so before slow pointer moved it was at `N2`

node. But that was the node that
fast pointer was pointing to before it moved. In other words before slow and fast pointers
moved they pointed to the same node so our algorithm should have stopped, but it didn’t.
Here we have contradiction that completes the proof.

Consequence of above proof is simple fact: let’s say slow and fast pointers are pointing at cycle first element and that cycle has C elements in total. After at most C slow pointer moves, pointers must meed again.

Now let’s go back to our cycle start finding algorithm. To help us reason we will introduce some variables:

- N - number of nodes in list before cycle (
`N = 3`

in our example below) - C - number of nodes in cycle (
`C = 4`

in our example below) - K - position of slow pointer inside cycle
(first element of cycle has
`k: 0`

, second`k: 1`

etc.)

Let’s consider following situation, we ran fast/slow pointer algorithm on 7 element list
depicted below:
After N iterations of algorithm slow pointer enters
first element of the cycle and has position `K = 0`

. Meanwhile fast
pointer that moves 2 times faster has position `Kfast = N % C`

.
From this point every iteration of algorithm moves slow pointer from position `K`

to `K+1`

,
and fast pointer from position `Kfast`

to `Kfast+2`

. In other words after `S`

iterations
of algorithm slow pointer has position `S % C`

and fast pointer has position `(N+2*S) % C`

.
We know that fast pointer cannot jump over slow pointer, both pointers must met
in some `Smet`

iteration (where `Smet <= C`

). When this happens we have:

or in more mathematical notation:

We can transform this equation using modulo arithmetic into:

and finally into:

This last equation is very important, it tells us that after another `N`

iterations of the
algorithm slow pointer will point at cycle first element:

From this we can get main idea of our new algorithm. First run fast/slow pointer algorithm until pointers met. Then we know that after another N iterations slow pointer will point at cycle first element. But hey if we start moving from list head after N iterations we will point at first cycle element too! We don’t know N but if we start moving slow pointer one element at time and simultaneously we start moving another pointer from list head one element at time they must met at cycle first element:

Above description of algorithm will work only when `C > 1`

and `N > 1`

. Case when `C = 1`

is simple and I leave it as an exercise to reader.
Case when `N = 0`

: entire list forms a single cycle is easy too, we solve it now.
Slow and fast pointers will met after `Smet`

iterations, then:

In other words because `Smet <= C`

and `Smet = 0 (mod C)`

pointers can met only at the first element of the cycle (at position 0 or C).
First element of the cycle is a list head, so in this case our algorithm will work too!

Now let’s move to implementation:

All tests are green so we are done, yay!