Using lakeFS with Amazon Athena
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
Integration Overview
To query lakeFS data from Athena, first export your table metadata to the AWS Glue Data Catalog, then query the exported tables from Athena. The automated Data Catalog Exports feature handles the export and allows you to:
- Query data directly from lakeFS branches and commits
- Access tables using branch names as schemas
- Leverage lakeFS versioning capabilities in your SQL queries
Getting Started
For a complete step-by-step guide on setting up Athena with lakeFS, see the Glue Data Catalog integration guide, which includes:
- Table Configuration: Define your table schema using _lakefs_tables/<table>.yaml
- Automated Export: Set up Lua hooks to export table metadata to the Glue Data Catalog
- Query Setup: Use Athena to query your lakeFS data with branch-specific schemas
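
As a rough illustration of the query step, the sketch below builds a SELECT statement that uses a branch name as the schema and submits it to Athena through boto3. It is not taken from the integration guide: the helper names (quote_identifier, build_branch_query, run_query), the example branch "main", the table "animals", and the S3 output location are all hypothetical, and the actual Glue database and table names depend on how your export hook is configured.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an identifier for an Athena SELECT, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_branch_query(branch: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that treats the lakeFS branch name as the schema.

    Assumption: the export created a Glue database named after the branch.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT * FROM {quote_identifier(branch)}.{quote_identifier(table)} "
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str, output_location: str, region: str = "us-east-1") -> str:
    """Submit the query to Athena and return the query execution ID.

    boto3 is imported here so the query-building helpers above can be used
    and tested without AWS dependencies or credentials.
    """
    import boto3

    athena = boto3.client("athena", region_name=region)
    response = athena.start_query_execution(
        QueryString=sql,
        # Hypothetical bucket; Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    sql = build_branch_query("main", "animals")
    print(sql)
    # Requires AWS credentials and an existing results bucket:
    # execution_id = run_query(sql, "s3://my-athena-results/lakefs/")
```

Running the script prints `SELECT * FROM "main"."animals" LIMIT 10`. Quoting both identifiers keeps the query valid when a branch name contains characters such as hyphens, which Athena would otherwise reject in an unquoted identifier.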