Why doesn't an NVMe connection on an SSD make non-sequential access faster?

Why would it matter if you’re transferring a 1GB file at quintuple the speed (of SATA SSDs) or transferring 1,000 1MB files at quintuple the speed? It should always amount to being quintuple the speed of the SSD.



But in real life, non-sequential access on NVMe turns out to have only a small benefit over a SATA SSD.



EDIT



Since the answers are concentrating on the difference between large and small files, let me clarify my question:



  • Yes, small files will have overhead.

  • And yes, they will waste time by reading data that will be ignored.

  • But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.

Saying that there is wasted drive access doesn't change that. I'm not asking why 1GB is not as fast as 1000 1MBs. I'm asking why:




(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)











hard-drive ssd sata nvme














edited 6 hours ago by ispiro

asked 8 hours ago by ispiro










  • "in real life..." You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

    – Twisty Impersonator
    7 hours ago












  • @TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

    – ispiro
    7 hours ago











  • Seek latency becomes a real problem with random reads, and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943. This question feels like a slightly differently phrased duplicate of that one...

    – Mokubai
    7 hours ago












  • @Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

    – ispiro
    7 hours ago






  • @JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

    – ispiro
    6 hours ago






















2 Answers


































The problem here is that while NVMe drives, and SSDs in general, are faster than spinning rust because they use flash memory, the ability of NVMe to transfer multiple gigabytes of data per second comes from the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively, large transfers can take advantage of transfer parallelism by requesting multiple blocks from multiple chips at once, reducing what would be 8 seek times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.
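As a rough illustration of the RAID0-like arrangement described above, here is a minimal Python sketch of one request being split across several flash chips. The chip count is borrowed from the "8 seeks" figure above; the latency and per-chip speed are made-up assumptions, not measurements of any real drive, and the function names are hypothetical.

```python
# Toy model of a flash controller striping one request across chips (RAID0-like).
# All numbers are illustrative assumptions, not measurements of a real drive.

CHIPS = 8                 # hypothetical number of flash chips
CHIP_LATENCY_US = 50.0    # assumed fixed access ("seek") time per chip request
CHIP_MB_PER_S = 100.0     # assumed streaming speed of a single chip


def transfer_time_us(size_mb: float, chips_used: int) -> float:
    """Time for one request whose data is split evenly over `chips_used` chips.

    The chips work in parallel, so the fixed latency is paid once and each
    chip only has to stream its own share of the data.
    """
    share_mb = size_mb / chips_used
    return CHIP_LATENCY_US + share_mb / CHIP_MB_PER_S * 1_000_000


def throughput_mb_per_s(size_mb: float, chips_used: int) -> float:
    """Effective throughput of that single request."""
    return size_mb / (transfer_time_us(size_mb, chips_used) / 1_000_000)


if __name__ == "__main__":
    print(f"8 MB from 1 chip : {throughput_mb_per_s(8.0, 1):7.1f} MB/s")
    print(f"8 MB from {CHIPS} chips: {throughput_mb_per_s(8.0, CHIPS):7.1f} MB/s")
```

In this model the striped request pays the access latency once and streams from every chip at the same time, which is where the large aggregate figure comes from.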



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache data in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip, then the controller might be able to use multiple channels, but if a lot of the requests end up on the same chip for a period, as they could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.
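One way to see why this caps the NVMe advantage is a toy model in which the host sees the lower of two speeds: what the flash can deliver and what the interface can carry. The sketch below assumes the SATA and NVMe drives share the same flash internals; the interface ceilings, chip count, latency and chip speed are all assumed round numbers, and the function names are hypothetical.

```python
# Toy model: delivered speed is limited by the slower of the flash array and
# the host interface. All figures are rough assumptions for illustration.

SATA_LIMIT_MB_S = 550.0    # assumed usable SATA ceiling
NVME_LIMIT_MB_S = 3000.0   # assumed usable NVMe (PCIe) ceiling
CHIPS = 8                  # hypothetical number of flash chips
CHIP_LATENCY_S = 50e-6     # assumed fixed access time per chip request
CHIP_MB_S = 400.0          # assumed streaming speed of one chip


def flash_mb_s(request_mb: float, chips_busy: int) -> float:
    """Throughput the flash side can sustain for requests of a given size
    when `chips_busy` chips are working in parallel."""
    per_request_s = CHIP_LATENCY_S + request_mb / CHIP_MB_S
    return chips_busy * request_mb / per_request_s


def delivered_mb_s(request_mb: float, chips_busy: int,
                   interface_mb_s: float) -> float:
    """The host sees whichever is slower: the flash or the interface."""
    return min(flash_mb_s(request_mb, chips_busy), interface_mb_s)


def nvme_over_sata(request_mb: float, chips_busy: int) -> float:
    """Speed ratio NVMe / SATA for the same flash internals."""
    return (delivered_mb_s(request_mb, chips_busy, NVME_LIMIT_MB_S)
            / delivered_mb_s(request_mb, chips_busy, SATA_LIMIT_MB_S))


if __name__ == "__main__":
    # Large requests striped over every chip: the interface is the bottleneck.
    print("1 MB, all chips:", round(nvme_over_sata(1.0, CHIPS), 2))
    # 4 KB requests that happen to queue up on a single chip: the flash is.
    print("4 KB, one chip :", round(nvme_over_sata(4 / 1024, 1), 2))
```

With these assumed figures the large striped transfer shows a ratio of a bit over 5, because only the interface differs, while the 4 KB single-chip case shows a ratio of 1, because the flash is slower than either interface.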































  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    6 hours ago












  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago







  • @harrymc That could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency: the wear leveller, or "flash translation layer", which sits wholly within the NVMe controller. For large reads and writes this again becomes a single check followed by a burst out to the memory devices; for a queue of small requests it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I don't mean to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance-wise.

    – Mokubai
    6 hours ago






  • @ispiro Performance SATA SSDs would have effectively the same internals as NVMe drives; you just don't really notice it because they are bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but in the best case NVMe can go way ahead.

    – Mokubai
    6 hours ago


































Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that dramatically slows down the copying of
many files.
The overhead involves matching work in the operating system:
updating RAM tables, computer interrupts, system calls, and so on.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
keeping NVMe from reaching performance closer to its potential.
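The effect of this per-file bookkeeping can be sketched with a simple model: a fixed cost per file that does not shrink when the drive gets faster, plus the raw transfer time, which does. The drive speeds and the 1 ms per-file cost below are hypothetical values chosen only to make the arithmetic easy, and the function names are not from any real tool.

```python
# Toy model of a copy job: fixed per-file overhead plus raw transfer time.
# Every number here is an assumption chosen for illustration only.

SATA_MB_S = 500.0            # assumed sequential speed of a SATA SSD
NVME_MB_S = 2500.0           # assumed NVMe speed, five times the SATA figure
PER_FILE_OVERHEAD_S = 0.001  # hypothetical OS bookkeeping cost per file


def copy_time_s(n_files: int, total_mb: float, drive_mb_s: float,
                overhead_s: float = PER_FILE_OVERHEAD_S) -> float:
    """Total time: bookkeeping that does not speed up with the drive,
    plus the data transfer, which does."""
    return n_files * overhead_s + total_mb / drive_mb_s


def nvme_speedup(n_files: int, total_mb: float) -> float:
    """How many times faster the NVMe copy finishes than the SATA copy."""
    return (copy_time_s(n_files, total_mb, SATA_MB_S)
            / copy_time_s(n_files, total_mb, NVME_MB_S))


if __name__ == "__main__":
    print("1 x 1000 MB:", round(nvme_speedup(1, 1000.0), 2))
    print("1000 x 1 MB:", round(nvme_speedup(1000, 1000.0), 2))
```

Under these made-up figures the single large file gets close to the full fivefold gain, while the thousand small files finish only about twice as fast, because the fixed part of the work is not sped up at all.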



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This, to my way of thinking, proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or the disk driver.

































  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago












  • I added more info to the answer.

    – harrymc
    7 hours ago











  • "updating RAM tables" shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for "computer interrupts". "Closing a file also causes the operating system to flush it to the disk" - this should be 5 times (or whatever) as fast. "system calls" - they contain 2 parts: IO, which should see that speed gain, and non-IO, which should be negligible. Everything breaks up into 2: IO, where we should see the full speed gain, and non-IO, which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    7 hours ago












  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago













Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "3"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);














draft saved

draft discarded
















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsuperuser.com%2fquestions%2f1479346%2fwhy-doesnt-an-nvme-connection-on-an-ssd-make-non-sequential-access-faster%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









3
















The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer

























  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    6 hours ago












  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago







  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago















3
















The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer

























  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    6 hours ago












  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago







  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago













3














3










3









The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.
















answered 6 hours ago by Mokubai















  • There is a problem with that: the disk knows nothing of files, only of sectors. So if the OS fed it enough data, it wouldn't need to slow down. This means that the bottleneck is the OS being too slow on many files, not the way NVMe works.
    – harrymc, 6 hours ago

  • Thanks. This makes sense.
    – ispiro, 6 hours ago

  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but the time is lost by the OS elsewhere?
    – ispiro, 6 hours ago

  • @harrymc That could be the case if a flash device were a dumb slab of storage like spinning rust, but there is another source of seek latency: the wear leveller, or "flash translation layer", which lives wholly within the NVMe controller. For large reads and writes this again becomes a single check followed by a burst out to the memory devices; for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I don't mean to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance-wise.
    – Mokubai, 6 hours ago

  • @ispiro Performance SATA SSDs have effectively the same internals as NVMe drives; you just don't really notice it because they are bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but in the best case NVMe can go way ahead.
    – Mokubai, 6 hours ago

































Copying many files involves the overhead of creating their entries in the
Master File Table (MFT),
which is an integral component of the NTFS file system
(other file systems have equivalents).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that dramatically slows down the copying of
many files.
The overhead involves corresponding work in the operating system:
updating RAM tables, computer interrupts, system calls and so on.
Closing a file can also cause the operating system to flush it to the disk,
which takes time and disturbs the smooth copying of the data,
keeping NVMe from achieving performance closer to its potential.
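A back-of-the-envelope model shows why a fixed per-file cost can swamp a faster drive. The Python sketch below is only an illustration: the 1 ms per-file overhead and the two bandwidth figures are hypothetical, and the model assumes that the per-file overhead does not shrink when the drive gets faster.

```python
GIB = 1024 ** 3


def copy_time_s(n_files, total_bytes, per_file_overhead_s, bandwidth_bytes_per_s):
    """Toy model: fixed bookkeeping cost per file plus streaming time for the data."""
    return n_files * per_file_overhead_s + total_bytes / bandwidth_bytes_per_s


def speedup(n_files, total_bytes=GIB, per_file_overhead_s=1e-3):
    """Speed ratio of a 5x faster drive over a slower one for the same copy.

    The bandwidths are made-up round numbers: 0.5 GiB/s for the slower drive
    and 2.5 GiB/s for the faster one.
    """
    slow = copy_time_s(n_files, total_bytes, per_file_overhead_s, 0.5 * GIB)
    fast = copy_time_s(n_files, total_bytes, per_file_overhead_s, 2.5 * GIB)
    return slow / fast


# The same 1 GiB of data, copied as one file or as 100,000 small files.
for n_files in (1, 100_000):
    print(n_files, round(speedup(n_files), 2))
```

With a single large file the faster drive comes close to its full 5x advantage in this model. With 100,000 small files the per-file term dominates and the two drives finish in almost the same time.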



Note: This problem of slow copying of many files is not unique to NVMe.
We see exactly the same problem with any kind of fast disk, whether mechanical
or SSD, SATA or NVMe.
To my way of thinking, this suggests that the problem lies in inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or the disk driver.














answered 8 hours ago by harrymc (edited 5 hours ago)















  • This shouldn't matter, because every access to the MFT should be 5 times as fast. The bottom line is that we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would indeed be the same, is negligible.)
    – ispiro, 8 hours ago

  • I added more info to the answer.
    – harrymc, 7 hours ago

  • Updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). The same goes for computer interrupts. "Closing a file also causes the operating system to flush it to the disk": this should be 5 times (or whatever) as fast. System calls contain 2 parts: IO, which should get that speed gain, and non-IO, which should be negligible. Everything breaks up into 2: IO, where we should see the full speed gain, and non-IO, which should be negligible. Unless NVMe is really reaching near-RAM speed.
    – ispiro, 7 hours ago

  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.
    – harrymc, 6 hours ago
















