Which collation should I use for biblical Hebrew?
Which SQL Server collation should I use for biblical Hebrew? The database under consideration needs to accommodate diacritics (i.e., vowels, accents, trope, etc.).
sql-server database-design configuration sql-server-2017 collation
asked 9 hours ago by brian12345 (new contributor); edited 8 hours ago by MDCCL
2 Answers
First, regardless of anything else, you want to use the newest set of collations, which are the `_100_` series, as they have newer / more complete sort weights and linguistic rules than the older series with no version number in the name (technically, those are version `80`). (NOTE: testing is showing that the version `100` collations are not allowing cantillation marks to be ignored while the non-versioned collations are, which is the opposite of how it should be... will update with what I find out.)
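To see which Hebrew collations of each series are actually installed, something like the following sketch may help. It relies on the built-in `sys.fn_helpcollations()` and `COLLATIONPROPERTY`; the column alias `VersionGroup` is just an illustrative name. As far as I know, `COLLATIONPROPERTY(..., 'Version')` reports `0` for the unversioned names and `2` for the `_100_` names.

```sql
-- List the Hebrew collations available on this instance,
-- grouped by the version number that SQL Server reports for each.
SELECT c.name,
       COLLATIONPROPERTY(c.name, 'Version') AS VersionGroup, -- illustrative alias
       c.description
FROM   sys.fn_helpcollations() AS c
WHERE  c.name LIKE N'Hebrew[_]%'
ORDER BY VersionGroup, c.name;
```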
Next, it depends on how you will be interacting with the string values. Hebrew has no case variations, nor do the concepts of "width" or "Kana" apply. Accent-sensitive / insensitive might apply to diacritics used for vowels and cantillation marks (i.e. trope). Maybe answer these questions:
Do you need to distinguish between `א` and `אֽ`, or between `ש`, `שׁ`, and `שׂ`? If so, you want an `_AS` collation. (Actually, testing is showing that vowels and cantillation marks are always sensitive in version 100 collations, but not in unversioned / version 80 collations, which is the opposite of how this should be.) The following test shows that a vowel difference still compares as different in a newer, accent-insensitive collation, which I think should compare as the same:

    -- U+05D0 (alef) vs. U+05D0 + U+05BD, under three collations
    SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_100_CI_AI_SC;
    -- no rows (was expecting 1 row!!)

    SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AI;
    -- 1 row (expected)
    -- א    אֽ

    SELECT NCHAR(0x05D0), NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) = NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AS;
    -- no rows (expected)

Even just a cantillation mark difference is an issue with the newer accent-insensitive collations, and this seems odd given that the mark is Hebrew Accent Geresh (U+059C) (which has "accent" in the name!):

    -- Same letter and vowel, with and without U+059C (Hebrew Accent Geresh)
    SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
           NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
           NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_100_CS_AI_SC;
    -- no rows (was expecting 1 row!!)

    SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
           NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
           NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AI;
    -- 1 row (expected)
    -- אֽ֜    אֽ

    SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
           NCHAR(0x05D0) + NCHAR(0x05BD)
    WHERE  NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
           NCHAR(0x05D0) + NCHAR(0x05BD) COLLATE Hebrew_CS_AS;
    -- no rows (expected)

Do you need to distinguish between upper-case and lower-case in other languages, such as English? If so, then you want a `_CS` collation; otherwise you can use a `_CI` (i.e. case-insensitive) collation.
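If you want to repeat the sensitivity tests above on your own instance, the comparisons can be collected into one result set. This is only a sketch: the variable names and the `ComparesEqual` column are made up for illustration, and no particular output is claimed here since it depends on the collation behavior being tested.

```sql
-- Compare a bare letter with the same letter plus one combining mark
-- under several collations, returning one row per collation.
DECLARE @Alef       NVARCHAR(10) = NCHAR(0x05D0);                 -- U+05D0
DECLARE @AlefMarked NVARCHAR(10) = NCHAR(0x05D0) + NCHAR(0x05BD); -- U+05D0 + U+05BD

SELECT t.CollationName, t.ComparesEqual
FROM (VALUES
    (N'Hebrew_CS_AI',
     CASE WHEN @Alef = @AlefMarked COLLATE Hebrew_CS_AI        THEN 1 ELSE 0 END),
    (N'Hebrew_CS_AS',
     CASE WHEN @Alef = @AlefMarked COLLATE Hebrew_CS_AS        THEN 1 ELSE 0 END),
    (N'Hebrew_100_CI_AI_SC',
     CASE WHEN @Alef = @AlefMarked COLLATE Hebrew_100_CI_AI_SC THEN 1 ELSE 0 END),
    (N'Hebrew_100_CS_AS_SC',
     CASE WHEN @Alef = @AlefMarked COLLATE Hebrew_100_CS_AS_SC THEN 1 ELSE 0 END)
) AS t (CollationName, ComparesEqual);
```

Swapping a different mark into `@AlefMarked` (for example `NCHAR(0x059C)`) lets you check cantillation marks the same way.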
You do not want a binary collation (`_BIN` or `_BIN2`), as those compare the underlying stored values rather than applying linguistic rules: they treat Hebrew letters that carry the same vowel and cantillation mark as different when the combining characters are stored in different orders, and they cannot ignore vowels and other marks to equate things like `א` and `אֽ`.
For example (vowel and cantillation mark combining characters in opposite order):

    -- Same marks, opposite order: linguistic collation vs. binary collation
    SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
           NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C)
    WHERE  NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
           NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C) COLLATE Hebrew_100_CS_AS_SC;
    -- 1 row
    -- אֽ֜    אֽ֜

    SELECT NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD),
           NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C)
    WHERE  NCHAR(0x05D0) + NCHAR(0x059C) + NCHAR(0x05BD) =
           NCHAR(0x05D0) + NCHAR(0x05BD) + NCHAR(0x059C) COLLATE Hebrew_100_BIN2;
    -- no rows
Also, the collations ending in _SC support supplementary characters (i.e. full UTF-16), so it is usually best to pick one of those, if available.
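A minimal sketch of what _SC changes, assuming SQL Server 2012 or newer. The test character is U+10900, picked purely as an example of a supplementary character and built here from its two UTF-16 surrogate code units. Per the documentation for LEN, a surrogate pair is counted as a single character only under an _SC collation, so the two columns should report 2 and 1 respectively.

```sql
-- U+10900 expressed as a UTF-16 surrogate pair (high 0xD802, low 0xDD00).
DECLARE @Supplementary NVARCHAR(10) = NCHAR(0xD802) + NCHAR(0xDD00);

SELECT LEN(@Supplementary COLLATE Hebrew_100_CI_AS)    AS LenWithoutSC, -- counts UTF-16 code units
       LEN(@Supplementary COLLATE Hebrew_100_CI_AS_SC) AS LenWithSC;    -- counts code points
```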
So, perhaps Hebrew_100_CI_AS_SC or Hebrew_100_CS_AS_SC for the columns, and you can override per expression / predicate via the COLLATE clause if you need to use a variation, such as COLLATE Hebrew_CS_AI.
Also, you will need to store the data in NVARCHAR columns / variables.
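Putting those recommendations together, a sketch might look like the following. The table and column names (#Verse, VerseText) are hypothetical, and the column collation is just one of the two suggested above. Going by the test results earlier in this answer, the first query should find nothing (the column is accent-sensitive), while the second, which overrides the collation for that one predicate, should match the stored value.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE #Verse
(
    VerseId   INT IDENTITY(1, 1) PRIMARY KEY,
    VerseText NVARCHAR(4000) COLLATE Hebrew_100_CS_AS_SC NOT NULL
);

INSERT INTO #Verse (VerseText)
VALUES (NCHAR(0x05D0) + NCHAR(0x05BD)); -- alef + meteg

-- Default comparison: uses the column's accent-sensitive collation.
SELECT VerseId, VerseText
FROM   #Verse
WHERE  VerseText = NCHAR(0x05D0);

-- Per-predicate override: ignore vowels and other marks for this search only.
SELECT VerseId, VerseText
FROM   #Verse
WHERE  VerseText = NCHAR(0x05D0) COLLATE Hebrew_CS_AI;

DROP TABLE #Verse;
```

Keeping the stricter collation on the column and relaxing it per query means indexes and uniqueness checks stay mark-sensitive, though a predicate with an overriding COLLATE generally cannot seek on an index built with the column's collation.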
edited 7 hours ago
answered 8 hours ago
Solomon Rutzky
It depends on a lot of things. A collation controls sorting, comparison, and the code page used for non-Unicode data.
This repo has a good list of options around Hebrew.
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
| Hebrew_BIN | Hebrew, binary sort |
| Hebrew_BIN2 | Hebrew, binary code point comparison sort |
| Hebrew_CI_AI | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AI_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AI_KS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AI_KS_WS | Hebrew, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CI_AS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CI_AS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CI_AS_KS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CI_AS_KS_WS | Hebrew, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AI | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AI_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AI_KS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AI_KS_WS | Hebrew, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_CS_AS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_CS_AS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_CS_AS_KS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_CS_AS_KS_WS | Hebrew, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_BIN | Hebrew-100, binary sort |
| Hebrew_100_BIN2 | Hebrew-100, binary code point comparison sort |
| Hebrew_100_CI_AI | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AI_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AI_KS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AI_KS_WS | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CI_AS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CI_AS_KS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CI_AS_KS_WS | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AI | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AI_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AI_KS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AI_KS_WS | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CS_AS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
| Hebrew_100_CS_AS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive |
| Hebrew_100_CS_AS_KS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
| Hebrew_100_CS_AS_KS_WS | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive |
| Hebrew_100_CI_AI_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AI_KS_WS_SC | Hebrew-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CI_AS_KS_WS_SC | Hebrew-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AI_KS_WS_SC | Hebrew-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters |
| Hebrew_100_CS_AS_KS_WS_SC | Hebrew-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters |
+---------------------------+---------------------------------------------------------------------------------------------------------------------+
answered 8 hours ago
scsimon