SimpliVity Effective Storage Capacity
The SimpliVity Data Virtualization Platform (DVP) deduplicates, compresses, and optimizes all data at inception, once and forever, globally. When sizing a SimpliVity environment we take into account the capacity efficiency provided by the DVP; this is what we call the effective capacity.
The ratio we use when sizing is a conservative 2.25:1 (1.5:1 deduplication and 1.5:1 compression; the two components multiply to give the overall ratio). Our customers typically see much higher efficiency ratios, but we size with this conservative figure because some data, such as rich media (video, images, etc.), does not deduplicate or compress well. Even when primary data does not deduplicate or compress well, the SimpliVity DVP still provides significant savings for backup and replication.
Here is the general formula I use when sizing for the SimpliVity DVP capacity efficiency:
( production data + ( production data * expected growth % / 100 ) ) / 2.25 efficiency ratio
For example, a customer with 125 TB of data and 25% expected growth:
( 125 + ( 125 * 25/100 ) ) / 2.25 = ~69.44 TB
In this example I would ensure the SimpliVity environment provides at least 69.44 TB of usable physical capacity, enough to host the 156.25 TB of projected production data at the 2.25:1 sizing ratio.
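For anyone who wants to script the sizing, here is a minimal sketch of the same math in Python (the function name and defaults are my own illustration; only the formula and the 2.25:1 ratio come from the guideline above):

```python
def required_capacity_tb(production_tb, growth_pct, efficiency=2.25):
    """Estimate the usable physical capacity needed, applying the
    conservative sizing ratio (1.5:1 dedupe * 1.5:1 compression = 2.25:1)."""
    projected_tb = production_tb * (1 + growth_pct / 100)  # production data after expected growth
    return projected_tb / efficiency                       # divide by the efficiency ratio

# The 125 TB / 25% growth example from above:
print(round(required_capacity_tb(125, 25), 2))  # 69.44
```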
Easy enough, but what does this look like when deployed in an actual customer environment? Do we really see this efficiency on production data?
Here is an example, from an actual customer environment, of SimpliVity DVP efficiency on PRIMARY PRODUCTION DATA:
In this case there are 128.79 TB of production data in the environment. The environment consists of approximately 250 virtual machines running workloads including Exchange, SQL, SharePoint, file services, and a number of other application and web services that support business operations. The physical storage footprint after SimpliVity's inline deduplication and compression is 30.64 TB, an efficiency ratio of 5.1:1 (3.4:1 deduplication and 1.5:1 compression) before any backups. The capacity savings are 98.15 TB, and the entire production environment is hosted on 4 SimpliVity OmniCubes (8 rack units!!!) providing N+1 availability for compute and storage.
Since SimpliVity deduplicates and compresses data at inception, this is also 98.15 TB of IO that was avoided. The real advantage of the SimpliVity DVP is this IO avoidance: duplicate data is never written in the first place. Approximately 98.15 TB of production data never had to be written to disk, a tremendous IO savings. This post is specific to capacity; to learn more about the advantages of the IO avoidance SimpliVity provides, you must unlearn what you have learned.
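To make the arithmetic behind these numbers explicit, here is the same bookkeeping as a small Python sketch (the variable names are mine; the figures are the ones quoted above):

```python
logical_tb = 128.79   # production data as seen by the virtual machines
physical_tb = 30.64   # physical footprint after inline dedupe and compression

# The capacity saved is also IO that never had to be written to disk.
print(f"Capacity savings: {logical_tb - physical_tb:.2f} TB")  # 98.15 TB
# The overall ratio is the product of its two components:
print(f"Efficiency ratio: {3.4 * 1.5:.1f}:1")                  # 5.1:1
```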
Here is another example, from an actual customer environment, of SimpliVity efficiency which includes primary production data and BACKUPS:
Here the customer has 35.99 TB of production data, 953.01 TB of local backups, and 80.71 TB of backups replicated from other datacenters within the SimpliVity Federation. That is 1.04 PB (yep, PB) of data stored in a 15.02 TB physical footprint: over 1.03 PB of savings, or 71.2:1 efficiency. The environment consists of around 45 virtual machines running a mix of workloads including Exchange, SQL, file services, and several application servers. Even looking only at the production data (35.99 TB), that is an efficiency of 2.4:1. The entire primary production environment runs on 2 SimpliVity OmniCubes providing N+1 availability for compute and storage.
Again, we are focused on capacity here, but think about it: 1.03 PB of IO avoidance (mind = blown). Not only do we see great efficiency on the production data, but the backups had no capacity impact on the environment, and thanks to IO avoidance, no performance impact either.
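The same back-of-the-envelope check works for this example (again a sketch with my own variable names; the figures are from the report above, and a PB is taken as 1024 TB to match the post):

```python
production_tb = 35.99
local_backups_tb = 953.01
replicated_backups_tb = 80.71
physical_tb = 15.02

logical_tb = production_tb + local_backups_tb + replicated_backups_tb  # 1069.71 TB, ~1.04 PB
print(f"Overall efficiency: {logical_tb / physical_tb:.1f}:1")             # 71.2:1
print(f"Production-only efficiency: {production_tb / physical_tb:.1f}:1")  # 2.4:1
print(f"Capacity savings: {(logical_tb - physical_tb) / 1024:.2f} PB")     # 1.03 PB
```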
These two customer examples are typical of what our customers see once they deploy SimpliVity. The SimpliVity DVP provides significant capacity savings for both production data and the associated backups. The capacity savings are very compelling in themselves, but because they are achieved inline, at the inception of data, the IO avoidance also increases the performance of the environment as a whole.
I used your sizing guideline on one system; however, when we put the customer's system into production and loaded it with data, there is a massive difference between the space the sizing says it will use and the space it actually uses.
The system has 32 TB of data stored. Efficiency is 2.5:1. That should result in a 60% savings and a total of about 13 TB. Yet when you check the actual used numbers, it is using 21 TB.
Someone said it is expected behaviour (although undocumented), as there is overhead when storing data.
I don't see you listing how to calculate this overhead, but given that it is using about 65% on top of the customer's data for this "overhead", I'm guessing it is quite important to mention.
Unless something is completely wrong with the system, as I find it very strange that it uses 65% additional space on overhead and I cannot find anything in any SimpliVity documentation saying it should.
See an example report from the GUI here:
https://www.dropbox.com/s/czj7psiosl19ha9/2016-06-30_17-45-11.jpg?dl=0
Interesting, as I am seeing the exact same problem right now and am working with support. Did this ever get resolved? No matter what I move in or out, the GUI shows 21 TB.
Hi guys, any update on how to calculate the SimpliVity storage capacity?
Can you share it with us?
Thanks a lot.