Data Lake Administration

Client

A leading auto insurance provider in Ohio

Business Problem

  • The client wanted to optimize costs and automate cluster management
  • The client also wanted to avoid data quality issues in downstream analytics tools such as Qlik, Informatica, and SAS

Abzooba’s Solution:

  • A data lake on AWS, with Elastic MapReduce (EMR, 10 clusters) processing data on auto-scaling clusters
  • Redshift (5 clusters) as the data warehouse
  • AWS Glue as the ETL tool for data ingestion
  • Qlik (6 clusters), Informatica (5 clusters), and SAS (2 clusters) for interactive dashboards that provide analytical insights
  • Cluster monitoring through Amazon CloudWatch
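
As an illustration of the auto-scaling item above, here is a minimal sketch of how a size bound could be attached to an EMR cluster with boto3's `put_managed_scaling_policy`. The case study does not say which scaling mechanism was used, so the choice of EMR managed scaling, the helper names `build_managed_scaling_policy` and `apply_policy`, the capacity limits (2 to 10 instances), and the `us-east-2` region are all assumptions made for the example.

```python
from typing import Any, Dict


def build_managed_scaling_policy(min_units: int, max_units: int) -> Dict[str, Any]:
    """Return a ManagedScalingPolicy dict that bounds cluster size in instances."""
    if min_units < 1 or max_units < min_units:
        raise ValueError("require 1 <= min_units <= max_units")
    return {
        "ComputeLimits": {
            "UnitType": "Instances",
            "MinimumCapacityUnits": min_units,
            "MaximumCapacityUnits": max_units,
        }
    }


def apply_policy(cluster_id: str, policy: Dict[str, Any],
                 region: str = "us-east-2") -> None:
    """Attach the policy to one EMR cluster (needs boto3 and AWS credentials).

    The default region is an assumption, not taken from the case study.
    """
    import boto3  # imported lazily so the builder above works without boto3

    emr = boto3.client("emr", region_name=region)
    emr.put_managed_scaling_policy(ClusterId=cluster_id,
                                   ManagedScalingPolicy=policy)


# Hypothetical limits: never fewer than 2 instances, never more than 10.
policy = build_managed_scaling_policy(min_units=2, max_units=10)
```

Keeping the policy builder separate from the AWS call means the limits can be reviewed and validated before anything is applied; the same policy could then be passed to `apply_policy` once per cluster ID.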

Business Benefits:

  • Net savings of $40K-$50K per month
  • A centralized administrator for managing the big data clusters and analytics workspace
  • Auto-scaling of clusters to optimize costs
  • 24/7/365 support

