Building Data Lakes in the Cloud
Paper Abstract
This paper walks through the steps required to build a data lake in the cloud and connect it to on-premises environments. It covers best practices for architecting cloud data lakes and key aspects such as performance, security, benefits, and software solutions, and presents technologies ranging from basic HDFS storage to real-time processing with Spark Streaming.
What's Inside
- Data lakes in the cloud
Find out how cloud-based data lakes can be connected to on-premises environments.
- Security solutions
Learn about the authentication protocols and data encryption, down to a per-file basis, that safeguard data lakes in the cloud.
- Software and performance solutions
Discover how to increase performance by deploying directly on a bare metal cloud and have a flexible data lake architecture ready within minutes.