Home

 › 

Articles

 › 

What is the Merge Sort Algorithm, and How Does it Work? (With Examples)

Merge Sort

What is the Merge Sort Algorithm, and How Does it Work? (With Examples)

When you’re trying to sort an array of data, you’re going to need to be familiar with sorting algorithms. As you can expect, these are algorithms that can be implemented to sort data. What’s more, there are many kinds of these, ranging from simple to implement to more complex. Each of these algorithms has its own advantages and drawbacks, mostly to do with how easy they are to use and how quickly they can be executed. Merge sort is a relatively simple algorithm that has many applications. Let’s take a closer look at how merge sort works and how to implement it with this article.

What is Merge Sort?

Merge sort is a sorting algorithm used to sort an array of data. It’s known as a “divide-and-conquer” algorithm. These types of algorithms rely on breaking a complicated problem down into smaller problems that are easier to solve. The solutions for these “subproblems” are then combined to obtain a solution for the initial problem. Let’s take a look at the divide-and-conquer process in more detail.

How Divide-and-conquer Algorithms Work

As mentioned, divide-and-conquer works by splitting a problem down into simpler and smaller problems. We can illustrate this with an example.

Given an array, A, we can designate a subproblem as sorting a subsection of this array. For example, a subarray starting at index x and ending at index y. This would be denoted as A[x…y].

The divide process involves splitting the subarray A[x…y] into two subarrays. If we call z the midpoint, we get A[x…z] and A[z+1…y].

The conquer step involves sorting both subarrays A[x…z] and A[z+1…y]. If we haven’t reached a singular array, we can divide these subarrays again.

Once the obtained subarray is singular, we end up with two sorted subarrays. We can then combine the sorted A{x…z] and A[z+1…y} to give the sorted array A[x…y].

The Algorithm Behind Merge Sort

We can illustrate the merge sort process with the following pseudocode:

Merge_sort(arr_start,end)

if beg<end
set mid = (start + end)/2
Merge_sort(arr, start, mid)
Merge_sort(arr, mid + 1, end)
Merge(arr, start, mid, end)
end of if

End(Merge_sort)

There are actually two ways to implement merge sort – the top-down approach, and the bottom-up approach. It’s a good time to give a brief overview of these processes.

The Top-down Approach

As the name suggests, this algorithm starts at the top with the initial array, splits the array in half, recursively sorts these subarrays (that is, by splitting them and repeating the process), and then merges the results into a sorted array.

The Bottom-up Approach

On the other hand, the bottom-up approach relies on an iterative method. It starts with an array of a single element and then merges the elements on either side whilst sorting them. These combined and sorted arrays are merged and sorted until only one element remains in the sorted array. The main difference is that, while the top-down approach sorts each subarray recursively, the bottom-up approach breaks the array down into singular arrays. Then, each array is sorted and merged iteratively.

Working of Merge Sort

Following this, a working example of how merge sort is used would be useful.

Let’s say we have an array, arr. Arr = [37, 25, 42, 2, 7, 89, 14].

Firstly, we perform a check to see if the left index of the array is less than the right index. If this is true, the midpoint is calculated. This is equal to “(start_index” + (“end_index”-1))/2, which, in this case, is 3. Therefore, the element at index 3 is the midpoint, which is 2.

In this way, this array of 7 elements has been split into two arrays of size 4 and 3 for the left and right subarrays respectively:

Left array = [37, 25, 42, 2]

Right array = [7, 89, 14]

We continue by calculating the midpoint again, and splitting the arrays into two until division is no longer possible. The end result is:

[37], [25], [42], [2], [7], [89], [14]

And then, we can start merging and sorting these subarrays as such:

[25, 37], [2, 42], [7,89], [14]

These subarrays are then merged and sorted again to give:

[2, 25, 37, 45] and [7, 14, 89]

Finally, these subarrays are merged and sorted to give the final sorted array:

[2, 7, 14, 25, 37, 45, 89]

Implementation of Merge Sort

By now, we’ve covered how the merge sort algorithm works, the steps involved, and the working behind it with an example. So, it’s finally time to see how to implement this algorithm using code. For illustrative purposes, we’ll use the Python programming language. The process can be described using the following code:

def mergeSort(arr):

    if len(arr) > 1:

       mid = len(arr)//2

       L = arr[:mid]

       R = arr[mid:]

       mergeSort(L)

       mergeSort(R)

       i = j = k = 0

       while i < len(L) and j < len(R):
           if L[i] <= R[j]:
               arr[k] = L[i]
               i += 1
           else:
               arr[k] = R[i]
               j += 1
           k += 1

       while i < len(L):
           arr[k] = L[i]
           i += 1
           k += 1

       while j < len(R):
           arr[k] = R[j]
           j += 1
           k += 1

def printList(arr):
    for i in range(len(arr)):
        print(arr[i], end=" ")
    print()

if__name__ == '__main__':
   arr = 12, 11, 13, 5, 6, 7]
   print("Given array is", end="\n")
   printList(arr)
   mergeSort(arr)
   print("Sorted array is", end="\n")
   printList(arr)

This looks like a lot of code, but it’s manageable if we break it down.

Explanation of the code

Firstly, we’re defining the function mergeSort as a function of the array “arr”.

The first “if” statement checks if the length of the array is greater than 1. If so, we can proceed with dividing the array.

Then, we calculate the midpoint of the array, and spit it into two subarrays, “L” and “R”.

The mergeSort function is recursively called on L and R, dividing them until each subarray only contains one element.

At this point, three variables are initialized to zero = i, j and k. The first two are the beginning and end indices of the subarray being operated on. “k” is a variable that keeps track of the current position in the array.

The next part, beginning with the “while” statement, is the main section of the algorithm. The algorithm is iterated over L and R, comparing the elements at i and j and copying the smaller element to the array. Along with k, these index variables are incremented, as per the increment operator “=-”.

The next section of “while” statements dictates that any remaining elements in the two arrays are copied to the initial array.

We’re almost finished. Next, we define the printlist function as a function of the array “arr”.

Lastly, we use the “if_name_ == ‘_main_’” line to ensure that the code is only executed when the script is run directly, and not when it’s imported.

The original array is printed as “Given array is”, and the final, sorted array is printed as “Sorted array is”. The final line prints this sorted array, after the initial array has been merge sorted as per the previous code.

To conclude, below is a screenshot showing this code being implemented in Python:

Merge Sort
Python code for the merge sort algorithm.

Best and Worst Use Cases of Insertion Sort

Merge sort is a simple algorithm, but, as with any algorithm, has its best use cases and worst cases. These are illustrated below in terms of time complexity and space complexity.

Time Complexity with Insertion Sort

Time complexity in each case can be described in the following table:

CaseTime Complexity
BestO(n log n)
AverageO(n log n)
WorstO(n log n)

The best case refers to when the array is partially sorted, the average case refers to when the array is jumbled and the worst case refers to when the array is in ascending or descending order, and you want the opposite. As such, this would require the most sorting.

Fortunately, merge sort has the same time complexity in each case, of O(n log n). This is because, in each case, each element needs to be copied and compared. This leads to a logarithmic dependency on the input size since the array is recursively divided until each subarray contains one element. As far as sorting algorithms go, the time complexity of merge sort is better than some others, such as insertion sort and selection sort. As such, merge sort is one of the best sorting algorithms for working with large datasets.

Next, we’ll examine the space complexity of merge sort.

Space Complexity of Merge Sort

CaseTime Complexity
BestO(n)
AverageO(n)
WorstO(n2)

Compared to other sorting algorithms, the space complexity of merge sort doesn’t fare quite as well This is because, instead of a complexity of O(1) (where memory usage is constant and only one temporary variable is stored), merge sort has a space complexity of O(n), where n is the input size. In its worst case, n/2 temporary arrays must be created, with each having a size of n/2. This means there’s a space requirement of O(n2/4), or O(n2). But, in the average and best cases, the number of arrays needed is log(n) base 2, meaning that the complexity is proportional to n. Overall, the complexity of merge sort is relatively good, and the algorithm is suitable for large datasets, but the memory usage is more significant than other methods.

Wrapping Up

In conclusion, the benefits and drawbacks of the merge sort algorithm have been explained, along with how the algorithm works, illustrated with examples. Merge sort is a suitable sorting algorithm for large datasets, even if its memory usage is relatively high. In brief, it’s a relatively simple algorithm to understand and a great one to implement in many environments.

Frequently Asked Questions

What is merge sort?

Merge sort is a divide-and-conquer algorithm, used to sort arrays of data.

How does merge sort work?

Merge sort works by dividing an array into two smaller subarrays, and sorting these subarrays recursively by splitting them down further. The final sorted subarrays are then merged to give the sorted version of the initial array.

What is the time complexity of merge sort?

The time complexity of merge sort is O(n log n), where n is the input size.

What is the space complexity of merge sort?

The space complexity of merge sort is O(n) in the best, average and worst cases. As such, it’s fairly dependent on memory.

What are the applications of merge sort?

Merge sort is best used for sorting large datasets, external sorting (where data is too large to fit into memory), counting the number of inversions in an array, and merging multiple sorted arrays into one array.

What are the benefits of merge sort?

Merge sort is quite simple and efficient, especially for large data sets, and can be used to merge multiple arrays into one sorted array.

What are the drawbacks of merge sort?

Merge sort is slower than some other sorting algorithms when used for small datasets. Merge sort also requires additional memory space compared to simpler algorithms. Additionally, merge sort must go through its entire process even if the array is already sorted.

Is merge sort a stable algorithm?

Yes, merge sort is a stable algorithm, because it maintains the relative order of equal elements once the array is sorted. It does this by comparing the values of elements, and, when they are determined to be equal, the one from the left subarray is chosen first when the subarrays are to be merged. This is useful in situations such as sorting database records where multiple keys are in use, i.e. relational databases.

What are the alternatives to merge sort?

Merge sort is generally more efficient than insertion sort and selection sort, even though it requires more memory.

Can merge sort be made more efficient?

Yes, by combining with insertion sort. The latter can be used for smaller arrays, particularly where the size is equal to 43 or less. This is because the number of operations needed to sort this size array will be less in insertion sort than merge sort.

To top