Becoming a Big Data Architect: Expert techniques for architecting end to end Big Data solutions to get valuable insights
A comprehensive guide to design, build and execute effective Big Data strategies using Hadoop
The complex structure of today's data requires sophisticated solutions for transforming it and representing it semantically, so that the information is more accessible to users. Apache Hadoop, along with a host of other Big Data tools, empowers you to build such solutions with relative ease. This book presents unique ideas and techniques for conquering different data processing and analytics challenges on your path to becoming an expert Big Data architect.
The book begins by quickly laying down the principles of enterprise data architecture and how they relate to the Apache Hadoop ecosystem. You will get a complete understanding of data lifecycle management with Hadoop, followed by modeling of structured and unstructured data in Hadoop. The book will also show you how to design real-time streaming pipelines by leveraging tools such as Apache Spark, and how to build efficient enterprise search solutions using tools such as Elasticsearch. You will build enterprise-grade analytics solutions on Hadoop, and learn how to visualize your data using tools such as Tableau and Python.
This book also covers techniques for deploying your Big Data solutions on-premises and in the cloud, as well as expert techniques for managing and administering your Hadoop cluster. By the end of this book, you will have all the knowledge you need to build expert Big Data systems that cater to any data or insight requirements, leveraging the full suite of modern Big Data frameworks and tools. You will have the necessary skills and know-how to become a true Big Data expert.
This book is for Big Data professionals who want to fast-track their careers in the Hadoop industry and become expert Big Data architects. Project managers and mainframe professionals looking to build a career in Big Data and Hadoop will also find this book useful. Some understanding of Hadoop is required to get the best out of this book.
Chapter 1. Enterprise Data Architecture Principles
Chapter 2. Hadoop Life Cycle Management
Chapter 3. Hadoop Design Consideration
Chapter 4. Data Movement Techniques
Chapter 5. Data Modeling in Hadoop
Chapter 6. Designing Real-Time Streaming Data Pipelines
Chapter 7. Large-Scale Data Processing Frameworks
Chapter 8. Building Enterprise Search Platform
Chapter 9. Designing Data Visualization Solutions
Chapter 10. Developing Applications Using the Cloud
Chapter 11. Production Hadoop Cluster Deployment