When should IGNORE_DUP_KEY option be used on an index?
Someone said it's preferable to craft your queries to avoid duplicate key exceptions, but I'm not convinced that's more performant than just setting IGNORE_DUP_KEY = ON for the index.
My goal is to ensure a row or set of rows exists for one or more users before attempting to update those rows. That way, when I run an update statement like the one below and no rows are affected, I know it's because the [Count] portion of the predicate wasn't satisfied, as opposed to the row not existing at all (i.e. the [ID] portion of the predicate not being satisfied).
UPDATE [Inventory] SET [Count] = [Count] + 1 WHERE [ID] = 3 AND ([Count] + 1) <= @MaxInventory
I could run a check like EXISTS (SELECT 1 FROM [Inventory] WHERE [ID] = 3) for that single row, and only insert the row if it does not exist. That simply avoids unnecessary inserts. The insert, if necessary, would still have to contend with concurrent transactions, so duplicate key exceptions can still occur.
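A minimal sketch of that single-row pattern (the [Inventory] column list and the default [Count] of 0 are my assumptions, not established above):

IF NOT EXISTS (SELECT 1 FROM [Inventory] WHERE [ID] = 3)
BEGIN
    -- A concurrent transaction can still insert [ID] = 3 between the check
    -- and this insert, so a duplicate key error (or a skipped row, if
    -- IGNORE_DUP_KEY = ON) remains possible here.
    INSERT INTO [Inventory] ([ID], [Count]) VALUES (3, 0);
END;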
I'm curious whether it's more performant to just turn on IGNORE_DUP_KEY in this scenario, rather than allowing the error to be thrown and caught. Specifically, I'm curious whether just attempting the insert and letting it ignore duplicate keys is as fast as, or possibly even faster than, running an exists check first.
This becomes even more important when I'm checking for and initializing multiple records at once. For example, if I need to ensure records for thousands of users exist ahead of a single update statement, the logic would be much simpler if I just ran that insert statement up front, letting it ignore duplicate keys. Avoiding duplicates would be more complex, because I'd have to first query the table for which records don't exist, then attempt to add just those records (again, ignoring duplicate keys). Just inserting may be faster, even if all the records exist. I could meet it halfway and check whether ANY of the records are missing, such as with a left join or a COUNT comparison, but why bother if the insert ignoring duplicate keys is just faster?
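A rough sketch of the set-based variant, assuming a hypothetical @UserIDs table variable holding the user IDs to initialize:

INSERT INTO [Inventory] ([ID], [Count])
SELECT u.[ID], 0
FROM @UserIDs AS u
WHERE NOT EXISTS (SELECT 1 FROM [Inventory] AS i WHERE i.[ID] = u.[ID]);
-- With IGNORE_DUP_KEY = ON, rows inserted concurrently between this
-- statement's read and its write are skipped with a warning instead of
-- failing the whole statement.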
Is it a good idea to use IGNORE_DUP_KEY and just attempt inserts, instead of bothering with checking for row existence ahead of time? If not, why?
sql-server index sql-server-2019
edited 8 hours ago
Erik Darling
asked 8 hours ago
Triynko
1 Answer
It's definitely an atypical setting to enable for an index. I don't think you're going to find many people who jump up and talk about using it.
Of course, there are a couple helpful posts on the subject by Paul White:
- IGNORE_DUP_KEY slower on clustered indexes
- A creative use of IGNORE_DUP_KEY
If your concern is around upsert patterns (or something similar), this article by Michael Swart is really informative:
- SQL Server UPSERT Patterns and Antipatterns

answered 8 hours ago
Erik Darling
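For reference, a minimal sketch of enabling the option discussed above (the index and table names are illustrative):

CREATE UNIQUE NONCLUSTERED INDEX IX_Inventory_ID
ON [Inventory] ([ID])
WITH (IGNORE_DUP_KEY = ON);

-- An insert that collides on [ID] now produces the informational message
-- "Duplicate key was ignored." and skips the offending row, instead of
-- raising error 2601 and rolling back the statement.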
In the first article, it sounds like the insert proceeds as usual, still throws the exception, but catches and handles it internally after it tries to insert each row. That seems like a poor implementation, since it should be aware of the option and simply return a non-success code internally, rather than actually generating an exception. That's the whole 'exceptions shouldn't be used for control flow' idea. Good to know, though, and it certainly points towards a solution that doesn't rely on this setting. On the other hand, it also says performance could be faster if there are no duplicates.
– Triynko
7 hours ago
I'll likely mark this as the answer, because the 1st link alone provides all of the implementation details necessary to determine the best solution for this problem. The 'insert only and always' approach is certainly a no-go, especially for a clustered index, because of all the overhead of latches and errors involved. Since rows are only created once, incurring these costs with every call thereafter is senseless. On the other hand, including an 'exists' check beforehand will avoid that, such that IGNORE_DUP_KEY can remain on for concurrent insert collisions, which are rare in comparison.
– Triynko
7 hours ago
It's interesting that the 3rd link for UPSERT Patterns mentions use of IGNORE_DUP_KEY as an anti-pattern, but that's only for the 'always and only insert' pattern where we always attempt the insert. What I propose is keeping IGNORE_DUP_KEY on to resolve concurrency conflicts when an insert is needed, but keeping the pre-check to determine whether the insert is needed, which more closely resembles the 'better' patterns mentioned in that article.
– Triynko
6 hours ago
One last thought. Assuming the pre-check is in place, IGNORE_DUP_KEY is probably still more efficient, even with the errors. For example, suppose I run a pre-check and determine there are 534 rows that need to be inserted. Meanwhile, a concurrent operation inserts one of those rows, such that when I attempt to insert the 534 rows I thought I needed, the operation fails. If it fails late in the insert, the transaction has to roll back all the rows it inserted. It's going to be much more efficient to turn IGNORE_DUP_KEY on and ignore that one failed row. It's also more efficient than serializable isolation.
– Triynko
5 hours ago
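Putting the comments together, a sketch of the combined pattern they describe: a pre-check so the common case skips the insert entirely, with IGNORE_DUP_KEY = ON on the unique index absorbing the rare concurrent collision (the @UserIDs variable is again hypothetical):

IF EXISTS (SELECT 1 FROM @UserIDs AS u
           WHERE NOT EXISTS (SELECT 1 FROM [Inventory] AS i
                             WHERE i.[ID] = u.[ID]))
BEGIN
    -- Reached only when at least one row is missing. A concurrent insert
    -- between the check and this statement is absorbed by
    -- IGNORE_DUP_KEY = ON rather than rolling back the whole insert.
    INSERT INTO [Inventory] ([ID], [Count])
    SELECT u.[ID], 0
    FROM @UserIDs AS u
    WHERE NOT EXISTS (SELECT 1 FROM [Inventory] AS i WHERE i.[ID] = u.[ID]);
END;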