BigData Boutique in cooperation with
Hands-on Presto: Fast SQL on Anything
Everything you need to know about PrestoSQL to get started querying and analyzing data on S3, HDFS and pretty much anywhere.
Monday, August 17, 2020
10:00 AM to 3:00 PM
We are BigData Boutique
BigData Boutique is a team of Data Engineers, DevOps and Big Data experts offering services to help you succeed in your BigData projects, whether you are just starting out or are already in production and moving towards the next level of stability and performance.
We are uniquely situated to offer the best advice around. Code we’ve written has been powering complex systems efficiently and without faults for years now. We want to help your code do the same.
Reach out today to hear how we can help you with your Presto deployment, Elastic cluster, Spark jobs, Kafka brokers, data warehouse and more.
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources.
Proven at scale in a variety of use cases at Airbnb, Facebook, LinkedIn, Netflix, Twitter, Uber and many more, Presto has seen unprecedented growth in popularity over the last few years, in both on-premises and cloud deployments over object stores, HDFS, NoSQL and RDBMS data stores.
Join the industry giants who use Presto to give human analysts and automated processes alike the ability to query data at huge scale across many data sources (S3, SQL databases, NoSQL databases, and more).
In this hands-on training you will learn how to build cost-effective Data Warehouse and Data Lake solutions in your cloud using Presto, from the basics to the more advanced features.
Participants will learn and experience Presto using hands-on exercises demonstrating Presto's key capabilities:
- Install, deploy and configure Presto.
- Query data on S3 and HDFS using standard SQL.
- Execute SQL queries across multiple data sources using Presto's query federation.
- Understand what it takes to deploy and use Presto in real-world scenarios.
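To give a flavour of the query federation mentioned in the list above, here is a minimal sketch in Python that only builds the SQL text. Presto addresses every table as catalog.schema.table, so a single statement can join tables that live in different systems. The catalog, schema, table and column names below (`hive.web.page_views`, `mysql.crm.customers`, `customer_id`, `country`) are hypothetical, and the sketch assumes a cluster where a `hive` catalog points at files on S3 or HDFS and a `mysql` catalog points at a relational database.

```python
# Hypothetical catalog, schema, table and column names, for illustration only.

def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build the catalog.schema.table name Presto uses to address a table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql(events_table: str, customers_table: str) -> str:
    """Return one SQL statement that joins tables from two different catalogs."""
    return (
        "SELECT c.country, count(*) AS views\n"
        f"FROM {events_table} AS e\n"
        f"JOIN {customers_table} AS c ON e.customer_id = c.id\n"
        "GROUP BY c.country\n"
        "ORDER BY views DESC"
    )


# Assumed setup: 'hive' catalog over object storage, 'mysql' catalog over an RDBMS.
page_views = qualified_name("hive", "web", "page_views")
customers = qualified_name("mysql", "crm", "customers")
query = federated_join_sql(page_views, customers)
print(query)
```

The resulting statement could be pasted into the Presto CLI or sent through any Presto client. The sketch stops at building the string so that it runs without a cluster.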
No programming experience is required. Basic Linux skills and SQL knowledge are required.
- Big Data, Data Warehousing, Data Lakes and Clouds.
- What is Presto and why is it needed.
- Use cases.
- Presto Architecture.
- Catalogs, Schemas and Tables.
- Installation and configuration.
- The Presto CLI and Web UI.
- Data Sources and Connectors.
- Lab: Using the Apache Hive connector to query data on HDFS and S3.
- JDBC and ODBC connectivity.
- Using Presto from BI Tools and IDEs.
- Lab: Query Presto using Superset, Redash or Zeppelin.
- Partitioning and Bucketing.
- File formats: Avro, ORC, Parquet.
- Lab: Analyzing real data at scale on S3.
- Query planning and execution.
- Cost-based optimizations.
- Query performance monitoring and tuning.
- Understanding joins and spill to disk.
- More built-in connectors: MySQL, PostgreSQL.
- Query relational data using Presto.
- Lab: Executing cross-data-source queries with Presto.
- Deployment options and administrative tools.
- Cluster best practices and high-availability.
- Resource groups.
- Security overview: Authentication, Authorization and Encryption.