Why doesn't an NVMe connection on an SSD make non-sequential access faster?
Why would it matter whether you're transferring one 1GB file at quintuple the speed (of a SATA SSD) or 1,000 1MB files at quintuple the speed? Either way, it should amount to quintuple the speed of the SATA SSD.
But in real life, non-sequential access turns out to gain only a little over a SATA SSD.
EDIT
Since the answers are concentrating on the difference between large and small files, let me clarify my question:
- Yes, small files will have overhead.
- And yes, they will waste time by reading data that will be ignored.
- But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.
Saying that there is wasted drive access doesn't change that. I'm not asking why 1,000 1MB files are slower than one 1GB file. I'm asking why:
(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)
hard-drive ssd sata nvme
asked 8 hours ago by ispiro, edited 6 hours ago
"in real life..." You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.
– Twisty Impersonator
7 hours ago
@TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.
– ispiro
7 hours ago
Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...
– Mokubai♦
7 hours ago
@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.
– ispiro
7 hours ago
@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?
– ispiro
6 hours ago
2 Answers
Copying many files involves the overhead of creating their entries in the
Master File Table (MFT),
which is an integral component of the NTFS file system
(other file systems have equivalents).
This means that creating a file entails first searching the MFT for the name
to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.
It is this bookkeeping overhead that dramatically slows down the copying of
many files.
The overhead involves matching work in the operating system:
updating RAM tables, computer interrupts, system calls and so on.
Closing a file can also prompt the operating system to flush it to the disk,
which takes time and disturbs the smooth copying of the data,
preventing NVMe from getting close to its potential performance.
Note: This problem of slow copying of many files is not unique to NVMe.
We see exactly the same problem with any kind of fast disk, mechanical
or SSD, SATA or NVMe.
To my way of thinking, this proves that the problem lies in inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or the disk driver.
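A rough way to see this per-file overhead for yourself is to write the same number of bytes once as a single file and once as many small files, and time both. This is only a sketch: the function names `write_files` and `compare`, the sizes, and the explicit `os.fsync` call (used here to stand in for the flush on close described above) are all assumptions for illustration, not how any particular copy tool works.

```python
import os
import tempfile
import time


def write_files(directory, n_files, total_bytes, sync=True):
    """Write total_bytes split evenly over n_files; return elapsed seconds."""
    chunk = b"\0" * (total_bytes // n_files)
    start = time.perf_counter()
    for i in range(n_files):
        path = os.path.join(directory, f"part_{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(chunk)
            if sync:
                # Assumption: force the data to the device before closing,
                # to mimic a flush happening for every file.
                f.flush()
                os.fsync(f.fileno())
    return time.perf_counter() - start


def compare(total_bytes=8 * 1024 * 1024, n_small=512):
    """Return (seconds for one big file, seconds for n_small small files)."""
    with tempfile.TemporaryDirectory() as big_dir, \
            tempfile.TemporaryDirectory() as small_dir:
        one_big = write_files(big_dir, 1, total_bytes)
        many_small = write_files(small_dir, n_small, total_bytes)
    return one_big, many_small
```

Calling `compare()` returns the two timings. The byte count is identical in both runs, so any gap between them comes from per-file work (creating entries, syncing, closing) rather than from the amount of data moved. The actual numbers depend heavily on the OS, file system and drive.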
This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)
– ispiro
8 hours ago
I added more info to the answer.
– harrymc
7 hours ago
"updating RAM tables" shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for "computer interrupts". "Closing a file also causes the operating system to flush it to the disk" - this should be 5 times (or whatever) as fast. "system calls" - they contain 2 parts: IO, which should see that speed gain, and non-IO, which should be negligible. Everything breaks up into 2: IO, where we should see the full speed gain, and non-IO, which should be negligible. Unless NVMe is really reaching near-RAM speed.
– ispiro
7 hours ago
It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.
– harrymc
6 hours ago
The problem here is that while NVMe drives, and SSDs in general, are faster than spinning rust because they use flash memory, the ability of NVMe to transfer multiple gigabytes of data per second comes from the way the flash memory is arranged around the controller.
Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings it can achieve a much higher aggregate speed by having data written to and read from multiple chips simultaneously.
Effectively, large transfers can take advantage of this parallelism and request multiple blocks from multiple chips, reducing what would be 8 seek times to a single seek (across multiple chips) along with a larger transfer. The controller has buffering and queueing so that it can stream the data out sequentially in whichever direction is required.
The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) to cache data in a small internal buffer, further reducing delays for future requests.
The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to work through a queue, going between flash devices: requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data, and so on.
If the data being read or written is on another chip, then the controller might be able to use multiple channels, but if a lot of the requests end up on the same chip for a period, as they could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.
So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.
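As a toy illustration of the striping argument, here is a minimal model in which pages are striped round-robin across chips, each chip serves its pages one at a time, and all chips work in parallel. The 8 chips, the 50 µs per page read and the name `batch_time_us` are assumptions picked for the example, not figures from any real drive.

```python
from collections import Counter


def batch_time_us(page_ids, n_chips=8, page_read_us=50):
    """Time to serve a batch of page reads.

    Assumed model: page p lives on chip p % n_chips (RAID0-style striping),
    a chip reads one page at a time, and chips run in parallel, so the
    busiest chip determines how long the whole batch takes.
    """
    per_chip = Counter(page % n_chips for page in page_ids)
    return max(per_chip.values()) * page_read_us if per_chip else 0


striped = list(range(64))               # 64 consecutive pages: 8 per chip
same_chip = [i * 8 for i in range(64)]  # 64 pages that all map to chip 0

print(batch_time_us(striped))    # 400  (8 pages per chip * 50 us)
print(batch_time_us(same_chip))  # 3200 (64 pages on one chip * 50 us)
```

Both batches read 64 pages, but the one that lands entirely on a single chip takes 8 times as long in this model, which is the "performance of a single flash chip" case.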
answered 6 hours ago by Mokubai♦
There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.
– harrymc
6 hours ago
Thanks. This makes sense.
– ispiro
6 hours ago
@harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?
– ispiro
6 hours ago
@harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency: the wear leveller, or "flash translation layer", which is wholly within the NVMe controller. For large reads and writes this again becomes a single check followed by a burst out to the memory devices; for a queue of small requests it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance-wise.
– Mokubai♦
6 hours ago
1
@ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.
– Mokubai♦
6 hours ago
Copying many files involves the overhead of creating their entries in the
Master File Table (MFT),
which is an integral component of the NTFS file system
(other file systems have equivalent structures).
This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.
It is this bookkeeping overhead that dramatically slows down the copying of
many files.
The overhead involves matching work in the operating system:
updating RAM tables, computer interrupts, system calls and so on.
Closing a file can also cause the operating system to flush it to the disk,
which takes time and disturbs the smooth copying of the data,
preventing NVMe from getting closer to its potential performance.
Note: This problem of slow copying of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, whether mechanical
or SSD, SATA or NVMe.
To my way of thinking, this proves that the problem lies in inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or the disk driver.
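One way to see the per-file bookkeeping cost described above is to write the same number of bytes once as many small files and once as a single large file, and compare the timings. The following Python sketch does that in a temporary directory. The file count, file size, and function names are arbitrary choices for illustration, and the numbers will vary with the OS, file system, and caching; writes that only reach the OS cache are included, so this measures the create/write/close path rather than raw device speed.

```python
import os
import tempfile
import time

def write_many_small(directory, count, size):
    """Create `count` files of `size` bytes each; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        # Each iteration pays for create, write and close of a new file.
        with open(os.path.join(directory, f"small_{i}.bin"), "wb") as f:
            f.write(payload)
    return time.perf_counter() - start

def write_one_large(directory, total_size):
    """Write `total_size` bytes into a single file; return elapsed seconds."""
    payload = b"x" * total_size
    start = time.perf_counter()
    with open(os.path.join(directory, "large.bin"), "wb") as f:
        f.write(payload)
    return time.perf_counter() - start

COUNT, SIZE = 2000, 4096  # arbitrary: 2000 files of 4 KiB each

with tempfile.TemporaryDirectory() as tmp:
    small_s = write_many_small(tmp, COUNT, SIZE)
    large_s = write_one_large(tmp, COUNT * SIZE)
    small_files = len([n for n in os.listdir(tmp) if n.startswith("small_")])
    large_bytes = os.path.getsize(os.path.join(tmp, "large.bin"))

print(f"{COUNT} files of {SIZE} bytes: {small_s:.3f} s")
print(f"1 file of {COUNT * SIZE} bytes: {large_s:.3f} s")
```

Both runs move the same amount of data, so any difference between the two timings comes from the per-file work rather than from the payload itself.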
edited 5 hours ago
answered 8 hours ago
harrymcharrymc
283k16 gold badges300 silver badges615 bronze badges
This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)
– ispiro
8 hours ago
I added more info to the answer.
– harrymc
7 hours ago
"updating RAM tables" shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for "computer interrupts". "Closing a file also causes the operating system to flush it to the disk" - this should be 5 times (or whatever) as fast. "system calls" - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.
– ispiro
7 hours ago
It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.
– harrymc
6 hours ago
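The IO versus non-IO split argued over in these comments can be put into numbers with Amdahl's law: if only a fraction of the total time is IO, then speeding up the IO alone gives less than the full gain. The Python sketch below is only illustrative. The 5x factor is the figure used in the comments, while the IO fractions and the function name `overall_speedup` are assumptions, not measurements.

```python
def overall_speedup(io_fraction, io_speedup):
    """Amdahl's law: overall speedup when only the IO share of the total
    time gets faster by `io_speedup` and the rest stays the same."""
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)

# Hypothetical shares of total copy time spent on device IO.
for fraction in (1.0, 0.9, 0.5):
    print(f"IO share {fraction:.0%}: {overall_speedup(fraction, 5.0):.2f}x overall")
```

If the non-IO share really is negligible, the result stays close to 5x; if it is not, the overall gain drops well below that, which is the case harrymc's comments argue for.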
1
in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.
– Twisty Impersonator
7 hours ago
@TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.
– ispiro
7 hours ago
Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...
– Mokubai♦
7 hours ago
@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.
– ispiro
7 hours ago
1
@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?
– ispiro
6 hours ago