How to cut costs for your cloud analytics
Learn how you can save up to 70% of your cost by using ZebClient instead of Amazon FSx for Lustre.
Moving advanced analytics workloads to the cloud tends to make your cloud bill skyrocket. ZebClient avoids that problem. It keeps data in use close to the compute nodes and serves it at ultra-high speed, while using low-cost cloud storage for your data lake. The ZebClient architecture provides up to 70% cost reduction compared to an architecture based on Amazon FSx for Lustre. This article describes how this is possible.
Enable cloud storage for high-performance use cases
Cloud storage is designed for optimal sharing of resources to provide a cost-effective and scalable solution. The drawback is that it is not designed to deliver the extreme performance, super-high throughput and guaranteed low latency needed to cover modern use cases such as advanced analytics, including AI components. There is still a gap between the performance requirements of the software, the super-powers of ultra-modern compute hardware, and the standard capabilities of cloud storage.
ZebClient bridges that gap by serving data to applications through an acceleration layer that secures the speed and response time needed. With ZebClient, cloud storage becomes the preferred storage even for extremely demanding applications. As a consequence, radical cost savings materialise compared to the high-performance data storage solutions available today. The cost savings emerge whether the original storage is on-prem or cloud-based.
Minimise compute times
A side benefit of using the ZebClient acceleration layer is that expensive compute time can be reduced to a minimum. When running demanding applications in the cloud, bringing down compute time is of particular interest. ZebClient serves data to applications at accelerated speeds, which means that the available compute power can be fully utilised. The result is that insights can be produced more quickly and more cost-efficiently than otherwise.
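The link between data delivery speed and compute cost can be made concrete with a little arithmetic. The sketch below is only an illustration under simple assumptions: the function name, its parameters, and the idea of expressing storage stalls as a single utilisation fraction are ours, not part of ZebClient or any AWS API.

```python
def compute_cost(busy_hours, utilisation, nodes, hourly_rate):
    """Estimate the bill for a job on pay-per-hour compute nodes.

    busy_hours  -- hours of pure computation each node has to perform
    utilisation -- fraction of wall-clock time the node spends computing
                   rather than waiting for data (0 < utilisation <= 1)
    nodes       -- number of compute nodes running in parallel
    hourly_rate -- price per node-hour (hypothetical, in any currency)
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    # Nodes are billed for wall-clock time, including time spent idle
    # while waiting for storage.
    wall_clock_hours = busy_hours / utilisation
    return wall_clock_hours * nodes * hourly_rate
```

Under this simple model, a node that waits for data half of the time costs twice as much per job as one that is kept fully busy, which is why faster data delivery shows up directly on the compute bill.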
Design for scalability and flexibility
Replacing expensive file storage with low-cost cloud storage for your analytics brings immediate cost savings. Cloud-based storage also provides a fully scalable solution capable of handling even massive data growth without the threshold costs associated with traditional storage solutions. The ZebClient acceleration layer is equally scalable and can be scaled out and in as your performance needs and data lake size change.
On the other hand, a cloud-based IT strategy carries an obvious risk: vendor lock-in. Placing your important IT infrastructure in the hands of a single cloud provider increases your risk exposure and results in a higher cost. ZebClient addresses this by disaggregating data storage from compute. This architecture enables you to move your compute and/or your storage to a new provider or to multiple providers when needed, or to design for a multi-cloud solution from the start. ZebClient provides the tools you need to avoid costly vendor lock-in effects.
ZebClient – Lustre cost comparison
Lustre is a well-known system providing high-performance storage, scalability, a global namespace, and the ability to distribute very large files across many nodes. For advanced analytics, it provides multiple benefits compared to traditional on-prem solutions or standard cloud storage. In our cost comparison, ZebClient and Lustre both support AWS i3en.6xlarge compute instances for the analytics application. To provide higher performance, the number of i3en.6xlarge nodes is scaled out as performance requirements increase.
The ZebClient design further uses AWS C6in instances for the acceleration layer, AWS EBS volumes for short-term storage in the acceleration layer, and standard AWS S3 cloud storage for the data lake.
The corresponding Lustre solution is based on an Amazon FSx for Lustre persistent SSD file system, designed to match the read performance delivered to the application by ZebClient. Depending on the total amount of data in the solution, different levels of per-unit Lustre throughput are used to meet this performance level.
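The sizing logic described above can be sketched as follows. This is a minimal illustration, not the method used in the benchmark: the function name is hypothetical, and the tier values in ASSUMED_TIERS_MB_S_PER_TIB are assumptions that should be checked against the current Amazon FSx for Lustre documentation before use.

```python
# ASSUMPTION: per-unit throughput tiers in MB/s per TiB of storage.
# Verify these against the current Amazon FSx for Lustre documentation.
ASSUMED_TIERS_MB_S_PER_TIB = (125, 250, 500, 1000)


def pick_throughput_tier(required_mb_s, data_tib,
                         tiers=ASSUMED_TIERS_MB_S_PER_TIB):
    """Return the lowest tier whose aggregate throughput meets the target.

    required_mb_s -- read throughput the application needs, in MB/s
    data_tib      -- total amount of data stored, in TiB
    Returns None when even the highest tier is too slow for this capacity,
    in which case extra storage would have to be provisioned.
    """
    if required_mb_s <= 0 or data_tib <= 0:
        raise ValueError("throughput and capacity must be positive")
    needed_per_tib = required_mb_s / data_tib
    for tier in sorted(tiers):
        if tier >= needed_per_tib:
            return tier
    return None
```

The point of the sketch is that the same throughput target maps to different tiers depending on data volume: a small data set needs a high per-unit tier (or over-provisioned capacity) to reach the target, while a large data set can reach it on a lower tier.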
When benchmarking the two solutions, ZebClient proved to deliver ultra-high performance per core to the application nodes used. This performance level sets the baseline that the Lustre solution needs to deliver. Translated into cost per total performance level and total amount of data in the system, ZebClient demonstrates its ability to save up to 70% of the cost of a comparable Lustre solution.
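To reproduce this kind of comparison for your own workload, the calculation can be structured roughly as below. This is a sketch under stated assumptions: LineItem, monthly_cost and saving_fraction are hypothetical helper names, and no prices are included because they must come from the current AWS price list for your region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """One billed component of an architecture (hypothetical structure)."""
    name: str
    quantity: float    # e.g. number of instances or TiB stored
    unit_price: float  # price per unit per month, supplied by the user


def monthly_cost(items):
    """Sum quantity * unit price over all components of one architecture."""
    return sum(item.quantity * item.unit_price for item in items)


def saving_fraction(baseline_cost, alternative_cost):
    """Fraction of the baseline cost saved by the alternative.

    0.7 means the alternative costs 70% less than the baseline;
    a negative value means the alternative is more expensive.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1.0 - alternative_cost / baseline_cost
```

For the comparison described in this article, one list of line items would hold the acceleration-layer instances, the EBS volumes and the S3 data lake, and the other would hold the FSx for Lustre file system at the throughput level needed to match the same read performance. The i3en.6xlarge application nodes appear in both architectures, so they can be left out when only the difference is of interest.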