Hadoop

13 May

Challenge: To automate the deployment and upgrade of the stack and its components

Solution: We created and maintained a manageable Hadoop stack for one of the biggest telecom operators in India

User Group: Telecom Operators

Within the huge community around Hadoop, a number of companies are already trying to create their own Hadoop distributions. Drawing on our extensive expertise in Hadoop stacks and services, we created and maintained a manageable Hadoop stack for one of the biggest telecom operators in India.
Initial requirements were the following:
- automatic deployment of a stack and its components
- automatic upgrade process
- ability to deliver patches for components without full redeployment of stack version
- UI for cluster administration
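Automation of this kind is typically driven through Ambari's REST API. As a minimal sketch (the host, cluster name, and helper function below are hypothetical, not part of the delivered stack), this is the shape of a scripted call that moves a service to a target state:

```python
import json
import urllib.request

# Hypothetical Ambari server address, for illustration only
AMBARI_URL = "http://ambari-host:8080/api/v1"

def service_state_request(cluster, service, state):
    """Build (without sending) the Ambari v1 REST call that moves a
    service to a target state, e.g. "INSTALLED" or "STARTED"."""
    url = f"{AMBARI_URL}/clusters/{cluster}/services/{service}"
    body = {
        "RequestInfo": {"context": f"Set {service} to {state} via script"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    # Ambari rejects state-changing requests without the X-Requested-By header
    headers = {"X-Requested-By": "ambari", "Content-Type": "application/json"}
    return urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="PUT"
    )

# Example: construct a request that would start HDFS on a cluster named "prod"
req = service_state_request("prod", "HDFS", "STARTED")
print(req.get_method(), req.full_url)
```

A deployment script chains such calls: register the stack version, install services, then start them, polling the returned request resource until it completes.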

The stack was created on top of the Apache Bigtop open-source project, with Ambari as the cluster management tool. A set of scripts was implemented on top of Ambari, as well as custom patches for the Hive and Spark components to support high-load queries (processing of daily reporting for the customer's operations).
Maintaining the cluster (100+ nodes) includes optimizing Hive/Spark queries and delivering patches that improve cluster stability and performance. With the daily growth in the volume of data to be processed, our team was responsible for maintaining pipelines and optimizing the cluster. Moreover, a custom version of the Hive component was delivered (a set of fixes from a later open-source version was backported to the existing version), so we were able to fit reporting into a 24-hour timeframe.
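The actual patches are customer-specific, but Hive query tuning of this kind usually starts from `hive-site.xml` settings along these lines (illustrative values, not the production configuration):

```xml
<!-- Illustrative hive-site.xml fragment; example values, not the production config -->
<property>
  <name>hive.vectorized.execution.enabled</name>
  <value>true</value> <!-- process rows in batches to speed up scans and aggregations -->
</property>
<property>
  <name>hive.exec.parallel</name>
  <value>true</value> <!-- run independent stages of a query concurrently -->
</property>
<property>
  <name>hive.cbo.enable</name>
  <value>true</value> <!-- cost-based optimizer for join ordering -->
</property>
<property>
  <name>hive.auto.convert.join</name>
  <value>true</value> <!-- convert joins against small tables into map joins -->
</property>
```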
