Binary trees are widely used in development, but can often sound mysterious or intimidating to the uninitiated. With a concept like binary trees, we feel it’s better to not over-explain the granular aspects of it — particularly its algorithmic components.
Let’s start with a basic definition of a binary tree. What is it, and what does it do? Basically, a binary tree is a method of structuring data hierarchically, so that the most influential data points are placed above the least influential data points. Binary trees are mostly comprised of nodes of data organized into parent-child relationships, though they do make room for some outliers known as leaves.
Binary trees come with a wide variety of use cases in computer and data science. They’re used to sort and index databases, build search algorithms, construct dictionaries, set up priority queues in pipelines, create decision trees, and dictate the behavior of AI characters in video games.
We see binary trees applied so widely because they’re flexible, versatile, and efficient. Generally speaking, binary trees are actually a rather intuitive concept, so we’ll strive to make our coverage of it intuitive as well. Let’s dive in.
Binary Trees: Terms You Should Know
Before we dive deeper into the uses of binary trees, as well as the different types of binary trees, let’s make sure we understand the basic terms and components related to binary trees.
Nodes denote the most fundamental units of binary tree design. A node is simply a unit of data containing information on the data and any relevant references to its parents or children. Nodes can consist of numbers, but they can also comprise other units of data such as: strings, Boolean values (“True” or “False”), symbols, variables, functions, or a combination of one or more of these.
These make up the topmost nodes in a binary tree. Yes, it’s somewhat of a misnomer, since any good arborist will inform you that roots are generally at the bottom of a tree. Oh well. The metaphor mostly works, since the roots of a binary tree are still the most significant nodes within the system. They’re the fundamental building blocks for the binary tree as a whole.
Parents are nodes of data that possess children, or dependents. In other words, they are somewhat more critical points of data that produce their own sub-fields of data.
A child is a node of data that’s connected to and dependent upon a parent node. Depending on how the binary tree is structured, they can be either a “left child” or a “right child”— meaning that they’re either placed to the left or to the right of their parent node.
While most of the time you’ll see parent-child relationships in binary trees, there are also nodes in binary trees that do not have children. They are like parents in that they tend to be primary categories rather than subcategories, but they do not produce “offspring.”
Benefits and Uses of Binary Trees
Now that we’ve broken down the main terms seen in binary trees, let’s get a sense of what it is that makes binary trees so special. First off, binary trees are extremely efficient. Perhaps most importantly, they require relatively little memory.
Indeed, whenever you can give structure to your system, you automatically reduce its memory, but this is especially true if you keep the structure simple, which is certainly the case with binary design. Binary trees are efficient in other ways besides memory, though. They also facilitate efficient access and manipulation of data nodes.
Because all the nodes connect back to roots, and since most of them are organized into parent-child relationships, it’s relatively easy to search for data within the tree. By the same token, it’s quick and easy to insert or delete nodes.
When Would You Use Binary Trees?
Due to these advantages, binary trees can be readily deployed in a variety of contexts. For instance, when constructing a database, developers often use binary trees to index the database. One simple example of indexing would be for an e-commerce company to label all of its products with a product ID. Such indexing allows users to quickly locate and access data, typically by using SQL to make specific calls.
People will also use binary trees to build dictionaries which, if you’ve had exposure to Python, are series of data organized into key-value pairs — with the key being any kind of identifier and the value being the value of the key. Since we know that binary trees are organized into parent-child relationships, it’s easy to see how dictionaries fit so readily into the binary tree schema.
If you’re interested in game development, it might interest you to know that binary trees are a popular way of dictating the behavior of an AI character. Basically, binary trees allow you to define the basic actions and motivations of a character, and then create sets of possibilities that define how that character might react when confronted with various external events. To take one example, what might a police officer do if the protagonist bumps into him?
Decision trees operate in a similar vein. Within a supervised machine learning environment, a binary tree structure allows you to hierarchically organize the behavioral and cognitive patterns of your ML model. You can get as granular as you’d like with your ML algorithm while also limiting memory consumption and avoiding an overly complex data structure.
Types of Binary Trees
Although all binary trees follow the same basic parameters — that is, all binary trees contain roots, parent, and child nodes — not all will be structured the same way. Depending on the binary tree’s function, it will adhere to a different structure. We’ll go over the most common types below.
Complete Binary Tree
A complete tree occurs when all levels, except perhaps the deepest (or lowest) level, are completely filled. In this scenario, all leaf elements must lean to the left, like so:
This type of binary tree is mostly used for heap sorting and, by extension, building a heap data architecture. Heap sorting is a type of algorithm premised on the condition that the value of any node must be greater than or equal to the value of its children.
Once put in place, the algorithm will continuously remove the maximum element from the heap and place it at the farthest left point of the binary tree, as seen above. So, it allows you to sort large datasets quickly and efficiently. Additionally, it enables you to implement a priority queue, meaning a data structure that sorts a set of elements by descending priorities.
Full Binary Tree
Also known as a proper binary tree, a full binary tree denotes one in which every parent node possesses either two or no children. No parent will ever have just one child. And yes — if the primary node has no children, it is indeed a leaf.
Full binary trees are similar to complete binary trees, except that with full binary trees, the last leaf element must have a right sibling. As with complete trees, full binary trees are primarily used for heap sorting.
Perfect Binary Tree
Perfection is the name of the game with this binary tree because it very haughtily insists that every parent node has exactly two child nodes and that all leaf nodes are at the same level.
/ | | \
4 5 6 7
Perfect binary trees also provide functionality similar to complete and full binary trees, except that they serve a different kind of data structure in which the leaf nodes end up on the same level as one another.
Balanced Binary Tree
Balanced trees are characterized by the fact that the left and right subtrees of any node differ by a factor of no greater than one. Visually speaking, these don’t appear different from other binary trees to the naked eye. Your system, though, does pick up on differences in height, and that’s ultimately what matters.
Balanced trees are always as short, or compressed, as possible, which makes for much more efficient searching and traversing within the tree. They also do not require you to manually balance, as they perform this function automatically.
Because they are self-balancing and as short as possible, balanced binary trees are especially space-efficient and flexible. Plus, they allow for more advanced binary tree structures. However, their advanced functionality also means that they’re more difficult to use and maintain, so be wary of this before implementing.
Binary Trees: Resources for Learning
If you’re currently in, or planning on enrolling in, a software development or data science bootcamp, odds are your instructor will be spending some time on binary trees. In which case, lucky you for already having exposure to the concept!
However, if you’ve already completed your training but somehow missed the boat on binary trees (perhaps you fell asleep the day it was covered), you might need to pursue some independent learning on the topic.
Before you do, though, understand that there are some obvious prerequisites. First, since binary trees are heavily algorithmic, you’ll need to understand a thing or two about the underlying mathematical concepts. Second, you need to have a firm grounding in at least one programming language.
To the latter point, binary trees fortunately can be built in a variety of languages. The main ones include Python, Java, C, and C++. The typical favorite is Python, but the other coding languages can get the job done as well.
Now, where should you go to learn? If you’re brand new to the practice, you might consider signing up for a comprehensive bootcamp. Low-cost options can be found on Udemy, Coursera, and the like.
If you’re comfortable learning on your own, though, the best options seem to be this guide offered by Programiz, or this tutorial created by freeCodeCamp. These tutorials distinguish themselves from the pack by being well-organized, effectively written, and complete with sample diagrams and code.
The image featured at the top of this post is ©jijomathaidesigners/Shutterstock.com.