Home

 › 

Articles

 › 

PostgreSQL Types: The 13 Data Types You Need to Know

PostgreSQL data types

PostgreSQL Types: The 13 Data Types You Need to Know

You might have at some point found yourself trying to understand PostgreSQL and its different data types but couldn’t grasp it. Perhaps you are a programmer, database administrator, or aspiring data analyst trying to figure out the best data types to use in a PostgreSQL database use case. 

Whatever the reason, you’re in the right place. In this article, we’ll take an in-depth look at PostgreSQL data types, from their specific applications to errors and pitfalls to avoid.

Data Types in Databases

However, before we get started, let’s take a step back and talk about what data types are in relation to databases. Simply put, a data type is a classification of data based on the type of value it represents. In a relational database, for example, each column must have a data type that defines the type of data stored in that column. This helps ensure data integrity and accuracy when querying the database.

PostgreSQL is a relational database and it supports a wide range of data types, each with its own unique characteristics and uses. Understanding these data types is crucial to building efficient, scalable, and reliable database systems. So, without further ado, let’s dive into some of the PostgreSQL data types you need to be aware of.

Numeric Data Types

In PostgreSQL, the numeric data types store numbers with a specified precision and scale. The precision determines the maximum number of digits stored, while the scale determines the number of digits to the right of the decimal point. These data types are useful in many applications that require precise numerical calculations, such as financial software and scientific simulations.

PostgreSQL has various numeric data types, including SMALLINT, INTEGER, BIGINT,  DECIMAL/NUMERIC, FLOAT/REAL, and DOUBLE PRECISION. The SMALLINT data type stores a small integer with a range of -32768 to +32767. The INTEGER data type stores a standard integer with a range of -2147483648 to +2147483647. The BIGINT data type stores a large integer with a range of approximately -9.223 × 10^18 to -9.223 × 10^18.

The DECIMAL/NUMERIC data type stores fixed-point numbers with a user-specified precision and scale, while the FLOAT/REAL data type stores floating-point numbers with user-specified precision. The DOUBLE PRECISION data type stores double-precision floating-point numbers.

When working with numeric data types, it’s important to avoid common errors and pitfalls. One common mistake is using the wrong data type for a particular calculation or operation. For example, if you need to store a large number with high precision, you might be tempted to use a FLOAT data type. However, FLOAT data types are not always precise and can result in rounding errors. Instead, it’s better to use a DECIMAL/NUMERIC data type for such calculations.

Character Data Types

The character data type is one of the most commonly used data types in PostgreSQL. It stores strings of text, such as names, addresses, and descriptions. Character data types in PostgreSQL have three variations: CHAR, VARCHAR, and TEXT.

CHAR is a fixed-length data type that’s useful when you need to store a consistent amount of data in each column. For example, if you’re storing phone numbers and want to make sure they’re always in the same format, you could use CHAR(10) to store them as 10-digit numbers. If you don’t use the entire length of a CHAR column, PostgreSQL will pad it with spaces to the specified length.

VARCHAR, on the other hand, is a variable-length data type that’s useful when you need to store varying amounts of data. It’s great for storing text that varies in length, like blog posts. Because it only uses as much space as it needs, it can save storage space compared to using CHAR.

The maximum length specifier, which ranges from 1 to 10485760 (10 MB), determines the maximum length for VARCHAR. The actual storage size depends on the length of the string being stored. Nevertheless, do not that exceeding the maximum length specifier can cause errors.

TEXT, meanwhile, is a variable-length data type of unlimited length. This is useful when you need to store large amounts of text data, such as long articles. There’s no maximum limit for the length of a TEXT column and it can store as much data as needed, limited only by the disk space available to the database server. It’s good to always use the right character data type when setting up your database to avoid wasted storage space or truncated data.

Monetary Data Types

As the name suggests, monetary data types are used to store currency values. PostgreSQL offers two monetary data types: MONEY, which represents currency amounts with a fixed precision of two decimal places, and SMALLMONEY, which has a fixed precision of four decimal places. These data types are commonly used in applications that deal with financial transactions, such as accounting software or e-commerce platforms.

Monetary data types may not be suitable for all financial calculations though. In some cases, it may be necessary to use decimal or numeric data types instead. For example, if you need to perform complex financial calculations that involve rounding or scaling, decimal or numeric data types would be more appropriate.

You will also want to avoid performing calculations on monetary data types using floating-point arithmetic, which can result in rounding errors and other inaccuracies. It’s instead best to use the appropriate functions and operators provided by PostgreSQL when working with monetary data types.

Data Brokers
PostgreSQL offers two monetary data types used in applications that focus on financial transactions.

©LookerStudio/Shutterstock.com

Binary Data Type

In PostgreSQL, binary data types store binary data such as images, audio files, or other types of non-textual data. The two binary data types supported by PostgreSQL are BYTEA HEX format and BYTEA ESCAPE format.

The BYTEA HEXformat stores binary data as a series of hexadecimal digits, with each byte represented by two characters. This format is often used when working with binary data in programming languages that do not support binary data natively.

In contrast, the BYTEA ESCAPE format stores binary data as a sequence of escape sequences that represent non-printable characters. This format is commonly utilized for binary data in SQL statements or in database backups. Always ensure that you sanitize and validate binary data correctly to prevent security vulnerabilities like injection attacks.

Boolean

In PostgreSQL, the Boolean data type is represented by the values “true” and “false”. They often represent binary values, such as on/off or yes/no responses, or in more complex decision-making processes. For instance, you can use them to filter results based on certain conditions or even to control the flow of a program.

Always ensure that you format the values correctly. For instance, PostgreSQL requires values to be entered in lowercase (“true” and “false”) rather than uppercase. Another common mistake to avoid is treating a Boolean value as a string. Also, keep in mind that a Boolean value is different from a string, and you should not use them interchangeably.

Date/Time

The date/time data type in PostgreSQL has a variety of time-related applications. It’s used to store dates, times, or both in a single data type. With the DATE data type, you can store only the date without the time, while with the TIME data type, you can store only the time without the date. The TIMESTAMP data type allows you to store both date and time with specified precision. Lastly, the INTERVAL data type lets you store a period of time.

When working with date/time data types, it’s crucial to keep time zones and daylight saving time in mind. PostgreSQL automatically adjusts timestamps for time zones, but it’s important to remain consistent in your use of time zones to avoid confusion. 

It’s also a good idea to avoid relying solely on the default time zone settings, which can lead to unexpected results. Ensure that you’re setting time zones explicitly and taking into account any differences in daylight saving time. By doing so, you’ll be able to accurately store and manipulate dates and times for your application’s needs.

Composite Data Type

In PostgreSQL, composite data types allow you to group together multiple related fields into a single data type. This is particularly useful when you have a set of fields that are commonly used together and need to be treated as a single unit.

To define a composite data type, you use the CREATE TYPE statement. For example, let’s say you have a table of addresses and you want to group together the street address, city, state, and zip code fields into a single composite data type called “address.” You could define the composite data type like this:

CREATE TYPE address AS (
street_address TEXT,
city TEXT,
state TEXT,
zip TEXT
);

Once you’ve defined the composite data type, you can use it in your table definitions. For example:

CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT,
shipping_address address,
billing_address address
);

In this example, the “customers” table has two fields that use the “address” composite data type: “shipping_address” and “billing_address.” This makes it easy to query and manipulate the related address fields as a single unit. Composite data types can’t be directly indexed so if you need to, you’ll have to create an index on one or more of the fields within the composite type.

Text search is a vital component of modern applications, especially those that rely on large volumes of text data. PostgreSQL’s Text Search data type is a powerful tool for storing and searching text documents. The Text Search data type has several variations, including TSVECTOR and TSQUERY, which allow for full-text search functionality in your application.

Using TSVECTOR, PostgreSQL can create indexes on the text data and perform searches on the indexed data, which makes searching faster and more efficient. TSQUERY allows for complex text queries, including Boolean operators, phrase matching, and stemming. By using TSVECTOR and TSQUERY, developers can efficiently and accurately search large volumes of text data while maintaining consistency and accuracy in their search results.

Geometric

If you work with GIS (Geographic Information System) applications, you’ll need to store and manipulate geometric shapes such as points, lines, and polygons. This is where the geometric data type in PostgreSQL comes in handy.

The geometric data type enables you to store different types of geometric shapes in a database. This data type comes with a range of functions that allow you to perform various geometric operations, such as measuring distances between two points, finding the intersection of two lines, and so on.

One of the key advantages of using the geometric data type in PostgreSQL is that it allows you to perform spatial queries, which can be very useful for GIS applications. Spatial queries enable you to locate all objects within a given radius, identify objects that intersect with a given polygon, or find objects contained within a given polygon.

When working with geometric data types, you’ll need to be mindful of the coordinate system you’re using, as different coordinate systems can affect the results of geometric operations. By keeping these considerations in mind, you can make the most of this data type in PostgreSQL.

Network Address

The Network Address data type is used to store IP addresses and MAC addresses. PostgreSQL supports two main network address data types: INET and MACADDR.

INET holds IP addresses, which identify devices on a network. Whether it’s your laptop or your smartphone, every device connected to the internet has an IP address. MACADDR, on the other hand, holds MAC addresses, which uniquely identify network interfaces.

Besides the two, PostgreSQL also supports two other network-related data types, CIDR and MACADDR8. The CIDR data type is used to store IP network blocks in a compact form. It stores both the IP address and the subnet mask in a single column, making it easy to calculate network addresses and ranges. The MACADDR8 data type is an extension of the MACADDR data type, which provides more space to store longer MAC addresses. Bear in mind that these data types are specific to networking applications and may not be applicable in other contexts.

Static IP vs Dynamic IP
PostgreSQL supports network address data types used to store IP and MAC addresses.

©Habichtland/Shutterstock.com

JSON

If you’re a web developer, chances are you’re no stranger to JSON data. JSON (JavaScript Object Notation) data, is a lightweight format for exchanging data commonly used for representing complex data structures in a human-readable format. The JSON data type in PostgreSQL is specifically designed to store and manage JSON data, making it a go-to choice for web applications and APIs.

One major benefit of using the JSON data type is that it allows for flexible schema design. You can store JSON data with varying keys and values, allowing for easy adaptation to changing data requirements. This can be particularly useful when working with APIs, where the structure of the data may change over time.

You may want to take note of the possibility of queries involving JSON data being more complex than with traditional relational data. So it’s important that you plan your schema and queries carefully.

Array

Array data types allow you to store multiple values of the same data type in a single column, saving you time and simplifying your database design. PostgreSQL supports arrays of almost all data types, from simple types like integers and strings to complex data types like JSON and geometric shapes. You can even create multi-dimensional arrays for more complex data structures.

That being the case, array data types aren’t suitable for virtually all use cases. Arrays are not suitable when you need to perform complex queries or updates that require modifying individual elements of an array. In these cases, it’s often better to use a separate table to store the array elements as rows, rather than as a single column.

Another common pitfall with arrays is using them for large data sets. While arrays can be very convenient for storing small sets of data, they can quickly become unwieldy and slow when dealing with large datasets. This is because arrays are stored as a single column, which can make it challenging to perform complex queries and updates efficiently.

UUID

UUIDs, or universally unique identifiers, are a way to generate unique identifiers for your data. They’re often used as primary keys in databases, ensuring that each record has a unique identifier. But not all UUID algorithms are the same. It’s important to choose an algorithm that produces truly unique identifiers.

The UUID data type is commonly used in distributed systems, where multiple nodes may be generating new records simultaneously. In this scenario, using a traditional integer primary key could lead to conflicts and inconsistencies. The UUID data type provides a reliable way to generate unique identifiers across multiple nodes.

UUIDs can come with some downsides though. They take up more storage space than traditional integer primary keys, and they can be slower to generate. Using UUIDs could also make it more difficult to search for and retrieve specific records.

PostgreSQL Data Types Compared to Other RDBMSs

Just like PostgreSQL, other popular RDBMSs (relational database management systems) such as MySQL and SQL Server support various data types. Let’s briefly take a look at how PostgreSQL data types compare to these two.

MySQL and PostgreSQL share many of the same data types, including integers, floating-point numbers, and character strings. There are some differences nonetheless. For example, PostgreSQL has a broader range of data types than MySQL, including arrays and network address types.

Similarly, SQL Server shares many data types with PostgreSQL, including integers, decimal numbers, and character strings. However, SQL Server also includes some data types that PostgreSQL does not, such as geographic spatial data types.

PostgreSQL continues growing in popularity among developers and database administrators because of its robust and flexible data types. Besides that, it also offers advanced features like user-defined types and composite types that provide further customization and flexibility in data management.

Final Word

Data types are a fundamental concept in databases. PostgreSQL offers quite a number of them including commonly used types like integers and character strings, as well as more specialized types like UUID and monetary data types. 

It’s important to understand the unique features and potential pitfalls of each data type to ensure proper usage and avoid errors. Understanding them is also helpful in the successful implementation and management of a PostgreSQL database as well as in building a robust back-end for your application.

PostgreSQL Types: The 13 Data Types You Need to Know FAQs (Frequently Asked Questions) 

What is the difference between a text and a varchar data type in PostgreSQL?

Both TEXT and VARCHAR data types can be used to store character strings, but the main difference is that TEXT has no length limit, while VARCHAR requires a specified length. This means that TEXT can be used for any length of character string, while VARCHAR is more efficient for shorter strings.

What is a composite data type in PostgreSQL?

A composite data type in PostgreSQL is a user-defined data type that can combine multiple fields of different data types into a single data type.

How does PostgreSQL handle NULL values?

PostgreSQL allows NULL values to be stored in table columns, which represent the absence of a value. When querying a table with NULL values, special operators like IS NULL and IS NOT NULL can be used to filter the results.

What is the difference between a SMALLINT and an INTEGER data type in PostgreSQL?

Both SMALLINT and INTEGER data types are used to store whole numbers, but SMALLINT is a 2-byte data type that can store values from -32768 to 32767, while INTEGER is a 4-byte data type that can store values from -2147483648 to 2147483647.

How do I ensure that UUIDs generated by my application are unique?

You can use a UUID generator library or function that employs a reliable algorithm for generating UUIDs. You could also configure your database to use UUID as a primary key and set up a unique index on the UUID column to prevent duplicate UUIDs from being inserted.

To top