What is Collation in Database?

Prerna Kaushik
4 min readMay 17, 2021

--

Understanding database or its functionalities or operations or simple yet beneficial things that you can easily achieve through SQL commands has always been an enlightening and learning experience for me. For me personally I think how database is structured, how stored procedures, table structures, queries are written, how you maintain your data plays a vital role in the performance of the project. There is a lot to learn and a lot to discuss. Starting here with the concept of collation, let’s discuss what is collation, what it means, what it signifies and how to use it?

Image courtesy : Astera.com

WHAT IS COLLATION

“Collation is set of rules maintained that instruct database on how to sort and compare data in SQL server.”

Collation is the set of rules that define how data characters are stored, compared and sorted in the database. there are multiple ways to set these rules. You can set collation rules on Server level, database level or even on column level. Going further in this article we will study briefly about all three.

SQL Server Level :

SQL server offers you to set collation at the time of installation of SQL server setup. Collation set at the installation time is SQL server level collation.

Database level :

Second option we have is setting up collation on the database level. By default the databases inherit the collation set up at the SQL server level. So, in case you want to change the collation at database level, you can set it up while creating the database using the specified SQL command.

CREATE DATABASE [ExampleDatabase] COLLATE SQL_Latin1_General_CP1_CS_AS GO

Here [ExampleDatabase] means the name of the database you want to apply collation to.

COLUMN LEVEL :

Similar to database level , By default the databases inherit the collation set up at the one step higher level which is your database level. So, in case you want to change the collation at column level, you can set it up while creating the table using the specified SQL command.

CREATE TABLE Example (ExampleId int, ExampleName varchar(500) COLLATE SQL_Latin1_General_CP437_BIN)

Here Example is the name of the table and ExampleName is the column you want to apply collation to.

UPDATE AND VIEW COLLATION COMMANDS

To update and view the collation after creation of database or table you can use given commands. To view the collation set up on SQL server level you can run command-

SELECT SERVERPROPERTY(‘collation’)

For database level -

ALTER DATABASE [DatabaseName] COLLATE SQL_Latin1_General_CP437_BIN

SELECT DATABASEPROPERTYEX(‘DatabaseName’,’collation’)

Here DatabaseName is the name of the database you want to change or view collation for.

To change the column level collation, we can use

ALTER TABLE dbo.Example ALTER COLUMN ExampleName VARCHAR(500) COLLATE Latin1_General_100_CI_AI_SC_UTF8

sp_help Example

Here Example is the name of the table and ExampleName is the column you want to change or view collation for.

COLLATION OPTIONS

Collation offers you multiple options to be set in your collation in which your data will be sorted or compared. for example case sensitivity or accent sensitivity. Here are some of the very commonly used options listed below.

  1. Case sensitive (_CS) : lower case and upper case are treated as different and lower case occurs before upper case in sorting.
  2. Case insensitive (_CI) : lower case and upper case are treated as identical.
  3. Accent sensitive (_AS) : used to consider accented and non accented letter separately. for example ‘a’ and ’ ấ’are not treated identical.
  4. Accent insensitive (_AI) : used to consider accented and non accented letter identical. for example ‘a’ and ’ ấ’are treated identical.
  5. Width sensitive (_WS) : differentiates in full width and half width characters present. In case _WS is not mentioned then it is width-insensitive and hence full and half width characters will be considered identical.

FAQs ON COLLATION

  1. How to find collations on the system?

SELECT name, description FROM sys.fn_helpcollations();

2. How to set default database collation?

USE master;

GO

DROP DATABASE IF EXISTS TestCollations;

GO

CREATE DATABASE TestCollations

COLLATE Indic_General_100_CS_AS;

GO

3. How can i find Unicode collations in system?

SELECT name, description FROM sys.fn_helpcollations() WHERE COLLATIONPROPERTY(name, ‘CodePage’) = 0;

4. What is _KS in collation?

_KS signifies Kana-sensitivity. it separates the two types of Japanese kana characters which are namely Hiragana and Katakana. If _KS in not mentioned then the collation is Kana-insensitive. which means that SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes.

I hope this was beneficial to you. I tried to sum up all the knowledge I gained by reading documents, videos on internet and implementation in code. In case you have any suggestions, improvements or add on to the article please do comment below. It would come to be a great source of learning.

Thank you!

Note: It’s a tough phase that the whole world is going through. I hope everyone reading this is sitting back home, safe and sound. Social distancing is our way of killing this pandemic. We are fighting together with this situation of COVID-19 and we will get through this. We are in it together. Stay Safe! Stay Inside!

--

--