Introduction to BigData and Cloud Technologies

BigData and Cloud explained with real-world examples in this intensive 1-day workshop

How can we store or process large volumes of data? How do we deal with a massive stream of events arriving at high velocity? What really is BigData? What does a BigData-ready system look like? And how can the cloud help?

BigData and Cloud technologies are mentioned a lot, but it's hard to know where to start or which technologies to use.

Join our internationally renowned instructors for a full day packed with knowledge sharing. We will give you an overview of the vast BigData landscape, with short deep-dives along the way: everything you need to get started with the technologies that changed the world.

Objectives

This course aims to give a solid overview to developers and decision-makers new to the field, as well as insights and valuable pointers to developers with practical experience.

  • Understanding the challenges with BigData, and approaches for solving them.
  • Storage, Compute and Stream Processing - and an overview of commonly used technologies.
  • Showcasing the characteristics and architecture of a BigData-ready system.
  • Using Cloud technologies, and an overview of the notable ones.

Prerequisites

None.

Syllabus

  • What is BigData? And, more importantly, what is not BigData?
  • Dealing with the challenges of Volume, Velocity and Variety of data
  • The Hadoop ecosystem and its current state
  • Why do we need new technologies?
  • Distributed file systems (HDFS, consistent hashing, and more)
  • File formats, and why they matter
  • NoSQL and the CAP Theorem
  • Properties of modern storage solutions (schemaless, redundancy, relaxed consistency)
  • Polyglot persistence
  • Overview of NoSQL technologies
  • Batch processing and why we can't do ad-hoc querying
  • The Map/Reduce model and locality of data
  • Hadoop's MapReduce and YARN
  • Computation frameworks like Apache Spark and Flink
  • Monitoring of batch jobs and Spark computations
  • Machine learning
  • Distributed systems, microservices, containers and 'serverless'
  • Discovery, synchronization and configuration management
  • Queue systems and commit logs
  • Data workflows and pipelines
  • Design guidelines (explicit consistency expectations, idempotence, and more)
  • Monitoring and alerting (ELK, Graphite, Grafana, Graylog, Redash, Riemann, and more)
  • Data warehousing
  • Micro batching
  • Stream processing (Apache Storm, Heron and similar technologies)
  • Lambda architecture
  • Why cloud?
  • Overview of Amazon Web Services
  • Overview of Google Cloud Platform
  • Overview of Microsoft Azure
  • Comparison highlighting the notable strengths of each
  • BigData on the cloud

Q&A panel with our experts - ask us anything.

Ready to get started?

Enroll Now