What are the COMPRESS and DECOMPRESS functions in SQL Server?
The COMPRESS function, available in SQL Server 2016 and later, uses the GZIP algorithm to convert input data into a compressed VARBINARY(MAX) value, significantly reducing the storage footprint of large strings, JSON, and XML. The DECOMPRESS function reverses this process. Because DECOMPRESS returns binary data, you must use an explicit CAST or CONVERT (e.g., CAST(DECOMPRESS(col) AS VARCHAR(MAX))) to retrieve the original human-readable value.
COMPRESS and DECOMPRESS are native T-SQL functions introduced in SQL Server 2016 that let developers handle data compression directly within the database engine using the GZIP algorithm. Unlike Row or Page compression, which happens at the storage layer, these functions provide application-level compression: data can be stored in a compact binary format and expanded only when explicitly requested.
This enhancement is a game-changer for environments dealing with massive amounts of semi-structured data like JSON, XML, or logs, as it significantly reduces the storage footprint and minimizes Network I/O during data transmission.
Quick Summary: Features and Benefits of T-SQL Compression
Using native compression in SQL Server 2016 provides a specific set of advantages depending on your data type.
| Feature | Primary Benefit | Best Use Case |
|---|---|---|
| GZIP Algorithm | Reduces storage by 60-80% for text. | Large JSON, XML, or Log strings. |
| Column-Level Control | Granular control without locking tables. | High-traffic OLTP tables. |
| Native T-SQL | Reduces Network I/O and Cloud costs. | Azure SQL and Hybrid environments. |
| CAST Requirement | Ensures data integrity on retrieval. | Standardizing application output. |
COMPRESS vs. Row/Page Compression
How is this different from Row or Page Compression?
While SQL Server has offered Row and Page compression for years, the COMPRESS and DECOMPRESS functions are fundamentally different. Standard table compression is handled automatically by the Database Engine at the storage level. In contrast, these T-SQL functions are Application-Level compression tools.
Use COMPRESS when you want to save space on specific large text fields (like audit logs or JSON payloads) without the CPU overhead of compressing the entire table. It is also the only way to ensure data remains compressed while traveling over the network to your application, potentially reducing Network Latency in high-traffic environments.
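As a minimal sketch of this pattern (the dbo.AuditLog table and its columns are hypothetical, not from the original article), a large JSON payload can be compressed at insert time:

```sql
-- Hypothetical audit table: only the payload column is stored compressed.
CREATE TABLE dbo.AuditLog
(
    AuditID   INT IDENTITY(1,1) PRIMARY KEY,
    EventTime DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
    Payload   VARBINARY(MAX) NULL  -- GZIP output of COMPRESS
);

-- COMPRESS accepts the string directly and returns VARBINARY(MAX).
INSERT INTO dbo.AuditLog (Payload)
VALUES (COMPRESS(N'{"user":"jdoe","action":"login","result":"success"}'));
```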
How GZIP Compression Works in SQL Server
The COMPRESS function takes an input (string or binary) and transforms it into a VARBINARY(MAX) type using the GZIP algorithm. Conversely, DECOMPRESS takes that binary data and expands it back to its original state.
Because the compression happens at the column level, you have granular control over exactly what data gets shrunk. This is particularly useful for:
- Cloud Cost Management: Reducing the size of data in Azure SQL Database to lower storage tier costs.
- Offloading CPU: Allowing the SQL engine to store compressed data while the application handles the logic of when to expand it.
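Before committing to this design, it is worth measuring the actual savings on your own data. One quick way, sketched below, is to compare DATALENGTH before and after compression (the REPLICATE call merely fabricates a large repetitive string for demonstration):

```sql
-- Build a roughly 60 KB repetitive string; CONVERT to NVARCHAR(MAX) so
-- REPLICATE is not truncated at 4,000 characters.
DECLARE @json NVARCHAR(MAX) =
    REPLICATE(CONVERT(NVARCHAR(MAX), N'{"sensor":"A1","reading":23.5},'), 1000);

SELECT DATALENGTH(@json)           AS OriginalBytes,
       DATALENGTH(COMPRESS(@json)) AS CompressedBytes;
```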
T-SQL COMPRESS and DECOMPRESS Syntax & Examples
To use these functions effectively, you must understand that DECOMPRESS always returns a VARBINARY(MAX) result. To see the original text, you must explicitly CAST or CONVERT the result back to its original data type (e.g., VARCHAR or NVARCHAR).
Example: COMPRESS and DECOMPRESS Functions in SQL Server 2016
The following script demonstrates the lifecycle of a string through the compression engine:
```sql
-- Example: COMPRESS and DECOMPRESS functions in SQL Server 2016
SELECT
     'MyTechMantra' AS [Input Value To Compress]
    ,COMPRESS('MyTechMantra') AS [Compressed Value]
    ,DECOMPRESS(COMPRESS('MyTechMantra')) AS [Decompressed Value]
    ,CAST(DECOMPRESS(COMPRESS('MyTechMantra')) AS VARCHAR(MAX)) AS [Derive Actual Input Value]
GO
```
Key Observation: In the example above, the [Decompressed Value] column appears as hexadecimal “gibberish” to the human eye because it is still VARBINARY(MAX). The [Derive Actual Input Value] column shows the readable result after applying CAST.
⚠️ DBA WARNING: Performance Impact
While COMPRESS is excellent for storage, do not use it on columns that are frequently used in WHERE clauses or JOIN conditions. Because the data is stored as a binary blob, SQL Server must decompress the value for every single row to evaluate the filter. In large tables, this leads to massive CPU spikes and slow query performance. Always decompress at the application layer or only in the final SELECT list.
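To illustrate the anti-pattern (using the hypothetical dbo.EventLogs table defined in the workaround sketch below):

```sql
-- Anti-pattern: DECOMPRESS must run on every row before LIKE can be
-- evaluated, so no index helps and CPU cost grows with the table.
SELECT LogID
FROM dbo.EventLogs
WHERE CAST(DECOMPRESS(Payload) AS VARCHAR(MAX)) LIKE '%timeout%';
```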
Dealing with “Binary Filtering” (The Workaround)
Can you filter by a compressed column?
A common challenge with COMPRESS is that you cannot search for a string like WHERE CompressedColumn LIKE '%SearchTerm%' because the data is stored as a binary GZIP blob.
To maintain performance while benefiting from compression, the best practice is to use a Shadow Column or a Persisted Computed Column. By storing a small, uncompressed “Key Field” (like a CategoryID or a short Hash) alongside the compressed blob, you can filter your data efficiently without forcing the SQL engine to decompress every row during a scan.
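Here is a minimal sketch of the shadow-column pattern; the table, column, and index names are illustrative, not from the original article:

```sql
-- The small, uncompressed CategoryID acts as the searchable "shadow" key.
CREATE TABLE dbo.EventLogs
(
    LogID      INT IDENTITY(1,1) PRIMARY KEY,
    CategoryID INT NOT NULL,             -- uncompressed key used for filtering
    Payload    VARBINARY(MAX) NOT NULL   -- GZIP blob produced by COMPRESS
);

CREATE INDEX IX_EventLogs_CategoryID ON dbo.EventLogs (CategoryID);

INSERT INTO dbo.EventLogs (CategoryID, Payload)
VALUES (42, COMPRESS('{"event":"timeout","user":"jdoe"}'));

-- Filter on the indexed shadow column; decompress only the surviving rows.
SELECT LogID,
       CAST(DECOMPRESS(Payload) AS VARCHAR(MAX)) AS PayloadText
FROM dbo.EventLogs
WHERE CategoryID = 42;
```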
Use Cases: T-SQL Compression vs. Page Compression
Row and Page compression remain the right choice for general table storage, but the COMPRESS function excels in a few specific scenarios:
- JSON and XML Blobs: Page compression is often ineffective for large LOB (Large Object) data. Native COMPRESS can often shrink these by 60-80%.
- Archive Tables: For data that is rarely accessed but must be kept for compliance, storing it in a compressed binary format saves massive amounts of disk space.
- Network Bandwidth: Compressed data travels faster over the network. If your application resides in a different data center than your SQL Server, sending compressed VARBINARY can improve query response times.
Performance Considerations: CPU vs. Storage
Before implementing COMPRESS across your entire database, consider the trade-offs:
- CPU Overhead: Compression and decompression are CPU-intensive tasks. If your server is already running at 90% CPU utilization, adding these functions may cause bottlenecks (see the timing sketch after this list).
- Indexing Limitations: You cannot create a useful index on a VARBINARY(MAX) compressed column. If you need to search the data, store the searchable keys in standard columns and compress only the “payload” data.
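To quantify the CPU trade-off on your own workload, SET STATISTICS TIME gives a quick before-and-after comparison. This sketch reuses the hypothetical dbo.EventLogs table from the earlier workaround example:

```sql
SET STATISTICS TIME ON;

-- Baseline: return the compressed blob untouched.
SELECT Payload FROM dbo.EventLogs;

-- Inline decompression: expect noticeably higher CPU time on large tables.
SELECT CAST(DECOMPRESS(Payload) AS VARCHAR(MAX)) AS PayloadText
FROM dbo.EventLogs;

SET STATISTICS TIME OFF;
```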
Conclusion
Implementing the COMPRESS and DECOMPRESS functions in SQL Server 2016 is a strategic move for modernizing data storage. By shifting from plain-text storage to GZIP-compressed binary, you can significantly reduce storage costs and optimize I/O performance. However, always remember to cast your decompressed results to ensure data readability within your T-SQL queries.
Next Step: Next in this series, we will explore STRING_SPLIT and STRING_ESCAPE, two essential functions that simplify how we handle delimited strings and JSON formatting in T-SQL.
Frequently Asked Questions (FAQs) on COMPRESS and DECOMPRESS
1. What algorithm does SQL Server use for the COMPRESS function?
SQL Server uses the standard GZIP algorithm. This means that data compressed in SQL Server can technically be decompressed by other applications that support GZIP.
2. Does DECOMPRESS automatically return the original data type?
No. DECOMPRESS always returns VARBINARY(MAX). You must use CAST or CONVERT to transform it back to VARCHAR, NVARCHAR, or other types.
3. Can I compress an existing column using an ALTER TABLE command?
Not directly. You would typically add a new VARBINARY(MAX) column, update it using the COMPRESS function on the old column, and then drop the original column.
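A sketch of that migration might look like the following; dbo.Documents and its Body column are hypothetical, and on large tables the UPDATE should be batched to limit transaction log growth:

```sql
-- 1. Add the new compressed column.
ALTER TABLE dbo.Documents ADD BodyCompressed VARBINARY(MAX) NULL;
GO

-- 2. Populate it from the existing plain-text column (batch this on big tables).
UPDATE dbo.Documents
SET    BodyCompressed = COMPRESS(Body);
GO

-- 3. Drop the original column once the data is verified.
ALTER TABLE dbo.Documents DROP COLUMN Body;
GO
```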
4. Is there a limit to how much data I can compress?
The limit is the same as the VARBINARY(MAX) data type, which is 2GB.
5. How does this affect the Transaction Log?
Since the data is smaller, it results in smaller log records during INSERT and UPDATE operations, which can actually improve log throughput.
6. Can I use COMPRESS on numeric data types?
Yes, but you must first convert the numeric data to a string or binary format. However, the space savings on small integers are usually negligible.
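A quick demonstration of why this rarely pays off: the fixed GZIP header and footer alone are larger than a 4-byte integer, so the “compressed” value below comes back bigger than the original.

```sql
-- An INT occupies 4 bytes; its compressed string form is several times
-- larger because of the GZIP container overhead.
SELECT DATALENGTH(COMPRESS(CONVERT(VARCHAR(20), 12345))) AS CompressedBytes;
```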
7. Does this function work in Azure SQL Database?
Yes, COMPRESS and DECOMPRESS are fully supported in Azure SQL Database and Managed Instance.
8. Is it better than Columnstore compression?
They serve different purposes. Columnstore is for analytical processing (OLAP), while COMPRESS/DECOMPRESS is ideal for row-level operations (OLTP) on large text fields.
9. Will COMPRESS help with memory usage?
Yes, because the data remains compressed in the Buffer Pool (memory), it allows you to fit more data into the same amount of RAM.
SQL Server 2016 T-SQL Enhancement Series
- STRING_SPLIT and STRING_ESCAPE Functions
- FORMATMESSAGE Statement
- SERVERPROPERTY Function
- TRUNCATE TABLE WITH PARTITIONS
- DROP IF EXISTS
- ALTER TABLE WITH (ONLINE = ON | OFF)
- MAXDOP for DBCC CHECKDB, DBCC CHECKTABLE and DBCC CHECKFILEGROUP
- ALTER DATABASE SET AUTOGROW_SINGLE_FILE
- ALTER DATABASE SET AUTOGROW_ALL_FILES
- COMPRESS and DECOMPRESS Functions
