Data Lake as a Service

Key features

The Bigstep bare metal cloud data lake integrates with your existing applications and systems. It delivers unparalleled throughput and enterprise-grade security, all at a cost of pennies per gigabyte. It has never been easier to work with big data.

Infinitely scalable active data store for just £30/TB/month;
Fully encrypted data in transit;
Multi-terabit throughput architecture;
Fully HDFS compatible and native NoSQL integration.

The Best Thing Since Hadoop

The bare metal data lake efficiently holds structured and unstructured data of any type: clickstream data, social media feeds, audio and video, machine data and logs, sensor data, CRM or ERP exports, RDBMS or NoSQL exports, and just about anything else. Make your datasets available to any app or workload at any time and start discovering unprecedented business insights.

Native HDFS Integration

Hadoop-compatible applications (Spark, Kafka, Drill, Flink, NoSQL DBs) can access our data lake service through the binary HDFS protocol.

File-Level Replication

Replication is configured on a per-file basis. This lets you decide how strongly your most sensitive data is safeguarded against loss.

Enterprise-Grade Security

The bare metal data lake uses encryption of data in transit and Kerberos-based authentication to enforce data security.

Supports Files of Any Size

There are no restrictions on how much data you may store in the data lake and no restrictions on individual file sizes.

High Throughput

A single link provides 40 Gbps of throughput, so data moves freely and quickly reaches mission-critical applications that need to process large volumes of data in an instant.

Multiple Availability Regions

You can access data from any bare metal region (UK, US or RO), and you can also replicate it across zones to ensure maximum availability.

Request a Demo

Where There's Data There Is a Way

There is no need for overly expensive on-premises storage solutions that are difficult to scale and manage. No matter how large the dataset or how varied the data types you want to collect, process and analyze, the Bigstep bare metal data lake is the go-to service for all your big data use cases.

Why Data Lake as a Service

The bare metal data lake changes the way big data works.

It integrates with and expands your current enterprise data warehouse (EDW).
It serves as a high-capacity repository for any structured or unstructured data.
It scales infinitely, but you only pay per use.
It frees you from the hassle of buying new hardware appliances and purchasing expensive licenses.
It is accessible from anywhere in the world.

How does data lake pricing work?

Using a "Data Lake as a Service" means that you only pay for the capacity you use. 1 GB of data costs £0.03 per month, with no additional upload fees. For example, if you upload a batch of 100 GB of data for a quick project, you will pay only £3 for an entire month of storage. Replication is billed the same way: we measure the raw capacity used across all your files.
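The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an official pricing calculator: the function name and the `replication_factor` parameter are hypothetical, and the sketch assumes that "raw capacity" means the logical size of your files multiplied by the number of replicas you keep.

```python
from decimal import Decimal

# Price taken from the text above: GBP 0.03 per GB per month.
PRICE_PER_GB_MONTH = Decimal("0.03")


def monthly_cost_gbp(logical_gb, replication_factor=1):
    """Estimate one month of storage cost in GBP.

    Assumption (not confirmed by the source): raw capacity equals
    logical size times the number of replicas kept for the data.
    """
    if logical_gb < 0 or replication_factor < 1:
        raise ValueError("size must be >= 0 and replication_factor >= 1")
    raw_gb = Decimal(str(logical_gb)) * replication_factor
    return raw_gb * PRICE_PER_GB_MONTH


# 100 GB stored once, as in the example above: 3.00 GBP
print(monthly_cost_gbp(100))

# The same 100 GB kept as three replicas (hypothetical setting): 9.00 GBP
print(monthly_cost_gbp(100, replication_factor=3))
```

Using `Decimal` rather than floating point keeps the currency arithmetic exact.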

Security

Data Ownership Control

Files have individual access permissions and ownership controls that mirror the user groups and hierarchy in the bare metal cloud's user management and authentication system. The data lake service uses Kerberos as the default authentication protocol.

Data Encryption

To protect you against unauthorized access, the data lake service encrypts data while it is transmitted across networks.

The First Data Lake as a Service in the World

The Bare Metal Data Lake is the first of its kind and was designed to host big files on the order of terabytes or above. It supports both structured and unstructured data, regardless of size or source. Every file consists of multiple blocks, each of which can be downloaded in parallel from different source machines. Add up to 40 Gbps throughput per node, and it's easy to see how this turns into a multi-terabit traffic architecture.

The bare metal data lake service matches the distributed replication schema in Hadoop. File blocks are distributed evenly across data nodes, and replicas are never placed on the same machines or disks. Thanks to this replication system, individual disk failures do not affect stored data. Compared to traditional RAID solutions, this has the benefit of increased throughput and performance: a client can simultaneously download different parts of a big file from different data nodes.
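The idea of reading different parts of one file at the same time can be illustrated with a small Python sketch. It is a simplified model rather than the service's actual client: the function names are hypothetical, and an in-memory byte string stands in for the data nodes so the example runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_ranges(file_size, block_size):
    """Return (offset, length) pairs that cover file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    ranges = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


def fetch_range(source, offset, length):
    """Stand-in for a network read of one block from one data node."""
    return source[offset:offset + length]


def parallel_read(source, block_size, workers=4):
    """Fetch all blocks concurrently and reassemble them in order."""
    ranges = split_into_ranges(len(source), block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input ranges.
        parts = pool.map(lambda r: fetch_range(source, *r), ranges)
        return b"".join(parts)


data = bytes(range(256)) * 10           # 2,560 bytes of sample data
assert parallel_read(data, 100) == data
print(split_into_ranges(250, 100))      # [(0, 100), (100, 100), (200, 50)]
```

In a real deployment each `fetch_range` call would read from a different data node, which is where the aggregate throughput gain comes from.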

Simple Infrastructure Integration

The Bare Metal Cloud offers complete infrastructure integration with the data lake service through multiple protocols:

HDFS
A binary protocol for Hadoop-compatible applications such as Spark, Kafka, Drill, Flink or NoSQL DBs.

WebHDFS
An HTTP-based protocol that can be used by many web-enabled applications.

FUSE
A locally mounted file system which can be used by any application.

Data Lake Client Libraries
A data lake client that helps users access their data lake from the command line interface.
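As an illustration of the HTTP-based option, the following Python sketch builds request URLs in the standard Apache Hadoop WebHDFS layout (`/webhdfs/v1/<path>?op=...`). The host name, port, file paths and the `https` scheme are placeholders assumed for the example, not values published by the service, and the sketch only constructs URLs: it does not perform the Kerberos authentication that a real request would need.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, scheme="https", **params):
    """Build a WebHDFS REST URL for the given operation.

    host, port and scheme are hypothetical placeholders; use the
    endpoint details supplied for your own data lake.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /data/file.csv")
    query = urlencode({"op": op, **params})
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# List a directory (LISTSTATUS is a standard WebHDFS GET operation).
print(webhdfs_url("datalake.example.com", 443, "/data/logs", "LISTSTATUS"))

# Read 1,024 bytes of a file starting at byte 0 (OPEN with offset/length).
print(webhdfs_url("datalake.example.com", 443, "/data/logs/app.log",
                  "OPEN", offset=0, length=1024))

# Change the replication factor of one file (SETREPLICATION, sent as PUT).
print(webhdfs_url("datalake.example.com", 443, "/data/logs/app.log",
                  "SETREPLICATION", replication=3))
```

Any HTTP client can then issue these requests, which is what makes this protocol convenient for web-enabled applications that cannot speak the binary HDFS protocol.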

Quick Data Migration

High throughput in the Bare Metal Cloud gets all of your data where you need it lightning fast. Large volumes of data that seem forever stuck in cloud storage solutions such as Amazon S3 can be easily migrated through HTTP to the bare metal data lake. They will therefore benefit from unlimited storage and added bare metal performance.

Ready to give it a try?