Bring your own (installation instructions will be sent prior to course start)
Included
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources.
Presto is proven at scale in a variety of use cases at Airbnb, Comcast, GrubHub, Facebook, FINRA, LinkedIn, Lyft, Netflix, Twitter, Uber and many more. In the last few years it has seen unprecedented growth in popularity, in both on-premises and cloud deployments, over object stores, HDFS, NoSQL and RDBMS data stores.
Join the industry giants who use Presto to give human analysts and automated processes alike the ability to query data at huge scale across many data sources (S3, SQL databases, NoSQL databases, and more).
Objectives
Participants will learn and experience Presto using hands-on exercises demonstrating Presto's key capabilities:
Install, deploy and configure Presto.
Query data on S3 and HDFS using standard SQL.
Execute SQL queries across multiple data sources using Presto's query federation.
Understand what it takes to deploy and use Presto in real-world scenarios.
Prerequisites
No programming experience is required. Basic Linux skills and SQL knowledge are required.
Syllabus
Big Data, Data Warehousing, Data Lakes and Clouds.
What is Presto and why is it needed.
Use cases.
Presto Architecture.
Catalogs, Schemas and Tables.
Installation and configuration.
The Presto CLI and Web UI.
Data Sources and Connectors.
Lab: Using the Apache Hive connector to query data on HDFS and S3.
JDBC and ODBC connectivity.
Using Presto from BI Tools and IDEs.
Lab: Query Presto using Superset, Redash or Zeppelin.
Partitioning and Bucketing.
File formats: Avro, ORC, Parquet.
Lab: Analyzing real data at scale on S3.
Query planning and execution.
Cost-based optimizations.
Query performance monitoring and tuning.
Understanding joins and spill to disk.
More built-in connectors: MySQL, PostgreSQL.
Query relational data using Presto.
Lab: Executing queries across multiple data sources with Presto.
Deployment options and administrative tools.
Cluster best practices and high-availability.
Resource groups.
Security overview: Authentication, Authorization and Encryption.
Everything you need to know about Presto SQL to get started querying and analyzing data on S3, HDFS and pretty much anywhere.