Intel DC S3700 Review – A Full Performance Characterisation

Myce/OakGate 4K Read and Write Latency Tests

Here are the specifications for the tests –

These tests steadily increase the random 4K IO demand in
terms of IOPS, and report the drive's response in terms of Average IOPS, Average
Latency and Maximum Latency.  The test is designed to show a drive's maximum IOPS
capability and to report the all-important Latency numbers for each level of IOPS
demanded.  The Maximum Latency numbers give us an insight into the occurrence
of Latency peaks that could cause an unexpectedly slow response from time to time.
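To make the mechanics concrete, here is a minimal Python sketch of a rate-stepped latency probe, assuming a hypothetical device path and a single Queue Depth 1 worker. A single Python thread cannot actually sustain tens of thousands of IOPS, and a real appliance issues many IOs in parallel with O_DIRECT to bypass the page cache, so treat this purely as an illustration of the measurement idea –

```python
import os
import random
import statistics
import time

DEV = "/dev/sdX"          # hypothetical target device - change before use
BLOCK = 4096              # 4K IO size
SPAN = 100 * 1024**3      # assumed 100 GiB test region on the drive
STEP_SECONDS = 10         # assumed time to hold each demanded IOPS level

def run_step(fd, target_iops, seconds):
    """Issue random 4K reads paced at target_iops; return achieved IOPS, avg/max latency (us)."""
    latencies = []
    interval = 1.0 / target_iops            # spacing needed to hit the demanded rate
    next_issue = time.monotonic()
    deadline = next_issue + seconds
    while time.monotonic() < deadline:
        offset = random.randrange(0, SPAN // BLOCK) * BLOCK   # 4K-aligned random offset
        t0 = time.monotonic()
        os.pread(fd, BLOCK, offset)
        latencies.append((time.monotonic() - t0) * 1e6)       # Microseconds
        next_issue += interval
        pause = next_issue - time.monotonic()
        if pause > 0:                        # the drive is keeping up: pace the next IO
            time.sleep(pause)
    return len(latencies) / seconds, statistics.mean(latencies), max(latencies)

fd = os.open(DEV, os.O_RDONLY)               # a real test would also use O_DIRECT
for demand in range(5000, 80001, 5000):      # steadily increase the demanded IOPS
    iops, avg_us, max_us = run_step(fd, demand, STEP_SECONDS)
    print(f"demand {demand:>6} IOPS -> achieved {iops:8.0f}, "
          f"avg {avg_us:7.1f} us, max {max_us:9.1f} us")
os.close(fd)
```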

Here are the results –

Firstly, here is a graph showing the result for the
Pre-Conditioning in Step 2 –

I find the repetitive 'saw-tooth' shape of the graph line
fascinating; it suggests to me that there is some form of cyclic pattern to the
DC S3700 firmware's operations.


4K Latency Read Test

You can see that the drive can no longer meet the increase
in IOPS demand at just over 75,000 IOPS (right on Intel’s spec).

 

You can see that the average read latency remains below 175
Microseconds all the way up to 75,000 IOPS.

 

Here you can see some curious Maximum Latency Peaks,
seemingly occurring at regular time intervals. This demands further
investigation, so here is a zoom into the Latency scores that contributed to
the 70,000 IOPS plot –

 

As this is the first time we are looking at a High
Resolution Latency Histogram, here's an explanation – the axis along the bottom
is the Latency value, measured in Microseconds; the left-hand vertical axis is
the count of the IOs in the observation period (a Round) that had a given
Latency value (please note that the count axis is logarithmic, so that the
small counts remain visible alongside the huge number of IOs measured); the
right-hand vertical axis is the % of the Total IOs observed that have a Latency
<= a given Latency value, and the rate of getting to 100% is highlighted by the
red graph line.
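To make the construction concrete, here is a minimal Python sketch of how such a histogram can be built from raw per-IO latencies. The sample data is synthetic, and the layout (Latency along the bottom, a logarithmic count axis on the left, a cumulative % axis on the right traced in red) is my assumption of the chart described above –

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the per-IO latencies of one Round, in Microseconds:
# the bulk of IOs land around 50 us, with a handful of large outliers added.
rng = np.random.default_rng(0)
latencies = rng.gamma(shape=4.0, scale=12.0, size=1_000_000)
latencies[::100_000] += 10_000

bins = np.arange(0, latencies.max() + 10, 10)          # 10 us wide latency bins
counts, edges = np.histogram(latencies, bins=bins)
cumulative = 100.0 * np.cumsum(counts) / counts.sum()  # % of IOs at or below each bin

fig, ax_count = plt.subplots()
ax_count.bar(edges[:-1], counts, width=10, align="edge")
ax_count.set_yscale("log")                 # log scale keeps tiny outlier counts visible
ax_count.set_xlabel("Latency (Microseconds)")
ax_count.set_ylabel("IO count (log)")

ax_pct = ax_count.twinx()                  # second vertical axis for the cumulative %
ax_pct.plot(edges[:-1], cumulative, color="red")   # the red 'rate of getting to 100%' line
ax_pct.set_ylabel("% of IOs <= latency")
plt.show()
```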

You can now see that the overwhelming majority of IOs are
below 500 Microseconds and that there are relatively few outlying exceptions.

So nothing much to worry about, but I felt that the
seemingly regular occurrences of the latency spikes warranted asking Intel if
they could offer an explanation.

Intel’s response was – “It seems the test that was run
displayed expected results given the atypical workload and will not cause any
significant customer impact. That said, we would prefer not to quote max
latency in this context, it doesn’t say much about the operation of the
drive…. We believe OakGate has the concept of QoS binning testing much like
we use for measuring latencies.  This is a much preferred way to look at it. We
use the Max latencies graph as a way to highlight if we need to dig into QoS if
we have a massive outlier.”

QoS stands for ‘Quality of Service’.

OK, that's a fair comment and I have no argument, but I was
personally interested in looking a bit harder.  So, having prefilled the drive, I
ran a 4K Sequential Read Test at a Queue Depth of 1, for 2 hours.  The test
repeated a loop 120 times, and each loop consisted of a 9-second warm-up period
followed by 51 seconds of performance monitoring.  This would allow us to see
whether the Maximum Latency spikes occur in sequential operations, and also to
compare the baseline latencies observed against Intel's specification. Here is the
result –

You can see that the regular spikes are again occurring in
this test – starting in the 7th Round, and then recurring every 13th
Round thereafter. So, let's look at the detailed latency
histogram of all the IOs in a round with a spike (Round 60) –

You can see here that 99.9% of the IOs in Round 60 completed
with a Latency <= 50 Microseconds.  If you look carefully, there is just one
relatively massive outlier, with a Latency of 10,005 Microseconds. This
histogram is an implementation of the 'QoS binning' approach that Intel
mentioned they prefer. Intel specifies the DC S3700's QoS as – Read/Write 500
Microseconds for 99.9% of IOs, for 4KB random IOs in a Steady State, at a Queue
Depth of 1 (we'll look at this in a moment).
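As an aside, the 'QoS binning' check itself is easy to express in code. Here is a minimal Python sketch that reads the 99.9th percentile off a latency sample and compares it to the 500 Microseconds figure; the sample is synthetic and merely stands in for the measured per-IO latencies –

```python
import numpy as np

def qos_check(latencies_us, spec_us=500.0, pct=99.9):
    """Report the latency at the given percentile and whether it meets the spec."""
    observed = np.percentile(latencies_us, pct)
    return observed, observed <= spec_us

# Synthetic stand-in sample: ~1M IOs mostly a few tens of us,
# plus one massive 10,005 us outlier like the one seen in Round 60.
rng = np.random.default_rng(1)
sample = rng.gamma(shape=3.0, scale=10.0, size=1_000_000)
sample[0] = 10_005.0

lat, ok = qos_check(sample)
print(f"99.9th percentile latency: {lat:.0f} us -> "
      f"{'within' if ok else 'outside'} the 500 us spec")
```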

Now let’s look at the detailed latency histogram for all of
the IOs in a typical round.

Here you can see that 99.9% of the IOs in Round 55 completed with a
Latency <= 30 Microseconds. Intel specifies that the typical Sequential Read
Latency for a DC S3700 is 50 Microseconds. I don't like the use of 'typical',
as I don't understand exactly what it means – but in the context of this test, the
specified 'typical' Read Latency is, by any reasonable interpretation of the
word, certainly beaten.

Last word on the Maximum Latency peaks – I can’t help
wondering what activity the firmware performs roughly every 13 minutes that
could give rise to the regular Max Latency spikes, but whatever it is, I guess
it must be important.
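Out of curiosity, picking such a cycle out of the per-round Maximum Latency figures is straightforward. Here is a minimal Python sketch; the values are made up to mimic the pattern observed here, not measured data –

```python
# round_max_us: one Maximum Latency figure per Round. These values are
# fabricated for illustration: a spike in Round 7 and every 13th Round after.
round_max_us = [50 if (r - 7) % 13 else 10_000 for r in range(1, 121)]

# Flag the rounds whose maximum latency is an order of magnitude above the median.
median = sorted(round_max_us)[len(round_max_us) // 2]
spikes = [i + 1 for i, v in enumerate(round_max_us) if v > 10 * median]

gaps = [b - a for a, b in zip(spikes, spikes[1:])]
print("spike rounds:", spikes[:5], "...")
print("gaps between spikes:", set(gaps))   # a single value here means a regular cycle

# Each Round is 9 s warm-up + 51 s monitoring = 60 s, so a 13-Round
# cycle corresponds to a spike roughly every 13 minutes.
```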

So, as we have touched upon the subject of Intel's QoS
specification, let's take a look at it now.  Remember, Intel specifies the
DC S3700's QoS as – 'Read/Write 500 Microseconds (99.9%)' for Random 4K IOs at a
Queue Depth of 1, when in a Steady State.

I was not sure whether this meant 99.9% of IOs in mixed
Read/Write traffic, or in 100% Reads traffic and 100% Writes traffic
separately, so I decided to test both.  I first ran a test that consisted
of i) a Purge, ii) Preconditioning the drive by performing 4K random Writes for 2
hours, and iii) performing a 50/50 mix of 4K random Reads and Writes for
120 rounds, with each round consisting of 9 seconds of warm-up time and 51 seconds
of performance monitoring.  I then ran the test again, with step iii)
becoming – perform 100% Random 4K Writes for 120 rounds and then 100% Random
4K Reads for 120 rounds.
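As a rough illustration of the round structure used in step iii), here is a minimal Python sketch of a Queue Depth 1, 50/50 mixed workload with a discarded warm-up window. The device path and test region are assumptions, a real test would add O_DIRECT and run against a preconditioned drive, and note that the writes are destructive –

```python
import os
import random
import time

DEV = "/dev/sdX"            # hypothetical target device - WARNING: writes destroy its data
BLOCK = 4096
SPAN = 100 * 1024**3        # assumed test region
WARM_UP, MONITOR = 9, 51    # seconds per round, as in the tests above
buf = bytes(BLOCK)          # 4K of zeroes to write

def one_round(fd, read_fraction=0.5):
    """Run one round: 9 s warm-up (samples discarded), then 51 s of measured IO."""
    reads, writes = [], []
    t_end_warm = time.monotonic() + WARM_UP
    t_end = t_end_warm + MONITOR
    while time.monotonic() < t_end:
        offset = random.randrange(0, SPAN // BLOCK) * BLOCK
        is_read = random.random() < read_fraction
        t0 = time.monotonic()
        if is_read:
            os.pread(fd, BLOCK, offset)
        else:
            os.pwrite(fd, buf, offset)
        lat_us = (time.monotonic() - t0) * 1e6
        if time.monotonic() >= t_end_warm:      # keep only post-warm-up samples
            (reads if is_read else writes).append(lat_us)
    return reads, writes

fd = os.open(DEV, os.O_RDWR)     # a real test would add O_DIRECT to bypass the page cache
for rnd in range(1, 121):        # 120 rounds of 60 s = 2 hours
    r, w = one_round(fd)
    print(f"round {rnd:3}: {len(r)} reads, {len(w)} writes")
os.close(fd)
```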

Here are the results for the 50/50 Read/Write Mix –

You can see the Average Read Latency is about 155
Microseconds, and that it blips up on a 13-round cycle.

You can see the typical Average Write Latency per round is 25
Microseconds, except where it blips up on a 13-round cycle.

You can see here in the High Resolution Read Latency
Histogram for Round 28 that 99.9% of Read IOs have a Latency <= 2,240
Microseconds, which doesn't compare well with Intel's spec. of 500 Microseconds
(so perhaps Intel's spec. is for separate Reads and Writes); however, 95% of
IOs have a Latency <= 530 Microseconds.

You can see here in the High Resolution Write Latency
Histogram for Round 25 that 99.9% of Write IOs have a Latency <= 450
Microseconds, which is inside Intel's spec. of 500.

Here are the results for the separate Writes and Reads Tests –

Here you can see that the Average Write Latency per round is
around 35 Microseconds.

In the High Resolution Write Latency Histogram for Round 64
you can see that 99.9% of the Write IOs have a latency <= 430 Microseconds,
which is comfortably within Intel’s spec. of 500.

Here you can see that the Average Read Latency per Round is
90 Microseconds, except where it blips up on a 13-round cycle.

In the High Resolution Read Latency Histogram for Round 28
(a typical round) you can see that 99.9% of the Read IOs have a latency <=
100 Microseconds, which is very comfortably within Intel’s spec. of 500.

I do wish manufacturers would make it clear what their
numbers mean (or is it just that I lack experience?).


4K Latency Write Test

Here you can see that the drive can no longer meet the
increase in IOPS demand at just over 32,500 IOPS (a wee bit short of Intel's spec in
the context of this test).

Here we can see that the average write latency remains below 200
Microseconds all the way up to 32,500 IOPS.

 

The result is a bit spiky, but arguably this is typical and
I feel it is no cause for concern.


Now let's head to the next page, to look at the results
for the Myce/OakGate Reads and Writes Tests…