Overview of Column Store Indexes (CSI).

Since SQL 2012, Microsoft enhancing the capabilities of column store indexes that can dramatically enhance the performance of SQL queries. You may ask, what is column store indexes & how it is different than traditional row store index. Here is the example to clear doubts.

Here is the small table.

ID Name Salary
1 emp1 1000
2 emp2 1200
3 emp3 1100

 Data will be saved in following format when

A.    Row Store index is used
Page 1  (1, emp1, 1000),
Page 2  (2, emp2, 1200),
page3  (3, emp3, 1100)

B.    Column Store index is used
Page 1 (1, 2, 3),
Page 2 (emp1, emp2, emp3),
Page 3 (1000, 1200, 1100)

Now without discussing a lot, let see the demonstration to see difference in performance.

-- Creating Sample demo database. 
USE [master]
CREATE DATABASE ColumnStoreDemo
GO

-- Creating Table to demo CS indexes
USE [ColumnStoreDemo]
CREATE TABLE CSTable
(
EMPID INT IDENTITY(1,1),
FirstName VARCHAR(100),
LastName VARCHAR (100),
City VARCHAR (100),
ProductGroup VARCHAR (100),
SALARY BIGINT,
JoingDate DATE DEFAULT GETDATE()
)
GO

-- Creating a row store clustered & non-clustered indexes. 
CREATE CLUSTERED INDEX CSTable_CI_EMPID ON CSTable (EMPID)
CREATE NONCLUSTERED INDEX CSTable_CI_FirstName ON CSTable (FirstName)
GO

-- Creating a non clustered column store index. 
CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_CSTable_NC_ColumnStore] ON CSTable (EMPID, FirstName, LastName, City, ProductGroup, JoingDate
)
GO
-- Query to show all indexes of a given table.
SELECT
ind.name, 
obj.name, 
ind.type_desc
FROM sys.indexes ind INNER JOIN sys.objects obj ON ind.object_id = obj.object_id
WHERE obj.name like 'CSTable'
GO

Indexes

-- Inserting 7 unique rows in 10k batches. 
INSERT INTO CSTable (Firstname, LastName, City, ProductGroup, SALARY)
VALUES
('Pradeep', 'Papnai', 'Pune', 'EV', 1000),
('Vikram', 'Jadhav', 'Pune', 'EV', 2500), 
('Rahul', 'Sharma', 'Pune', 'Antivirus', 1500),
('Rahul', 'Gupta', 'Pune', 'Exchange', 1900),
('Ajay', 'Gadge', 'Mumbai', 'EV', 2100), 
('Ajay', 'Jain', 'Mumbai', 'EV', 1600), 
('Rahul', 'Katiyar', 'Delhi', 'EV', 1660)
GO 10000

Run following two commands together while including ‘actual execution plan

USE [ColumnStoreDemo]
SELECT COUNT(*), CITY, PRODUCTGROUP FROM CSTable
GROUP BY CITY, PRODUCTGROUP

SELECT COUNT(*), CITY, PRODUCTGROUP FROM CSTable
GROUP BY CITY, PRODUCTGROUP
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) ---We are ignoring column store index so it fail to row clustered indexes. 
GO

Example1

Column Store Index (CSI) feature introduce with SQL 2012 and significant improvement made in subsequent version of SQL servers (2014/2016). CSI uses xVelocity technology that was based on Vertipaq engine. Column store indexes store data column wise, each page stores data from a single column. The traditional row store index (heap/cluster/non clustered) save data for each row in data pages.

During completion, query optimized automatically considers & priorities column store indexes unless we ignore it manually (with key word “OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX))

Column store increases the performance by reducing the I/O as it…
–    Read less data that is highly compressed.
–    process data in units called “batches” from disk to memory.
–    Segment elimination (contain minimum & maximum value for Integer datatype)

Row Group  is the group of 1 million (1,048,576) logically contiguous rows.
Column Segment contains values from one column for a set of rows.

Architecture
Segments are compressed and Segment is unit of transfer between disk and memory.

Architecture2
SQL 2012 offering for ColumnStore
–    A Table (Heap or BTree) can have only one NonClustered ColumnStore Index.
–    A Table with NonClustered ColumnStore Index becomes readonly and cannot be updated.
–    The NonClustered ColumnStore Index uses Segment Compression for high compression of Columnar data.

SQL 2014 offering for ColumnStore.
–    You can create one Clustered ColumnStore Index on a Table, and no further Indexes can be created.
–    A Table with Clustered ColumnStore Index can be updated with INSERT/UPDATE/DELETE operations.
–    A Table with NonCluster ColumnStore index make entire table as readonly.
–    Both Clustered & NonClustered ColumnStore Index has new Archival Compression option (COLUMNSTORE_ARCHIVE) to further compress the data.

SQL 2016 offering for column store.
–    A Table can NonClustered ColumnStore Index with read/write capablities.
–    A Clustered ColumnStore Index table can have one or more NonClustered RowStore Indexes.
–    Clustered ColumnStore Index can now be unique (Primary & Foreign key is allowed)
–    Ability to create Filtered NonClustered ColumnStore Index by specifying a WHERE clause to it.

More Details about this topic
Clustered Column store Indexes
MSDN

Advertisements

#columnstoreindex, #sql-performance, #sql2016