Design and maintain enterprise-scale data pipelines using AWS cloud services, handling schema evolution in data feeds and delivering analytics-ready datasets to BI platforms. This role requires hands-on expertise with the full AWS data stack and a proven ability to build data solutions that scale.
Essential Functions
- Build and orchestrate ETL/ELT workflows using Apache Airflow for complex data pipeline management
- Develop serverless data processing with AWS Lambda and EventBridge for real-time transformations
- Create scalable ETL jobs using AWS Glue with automated schema discovery and catalog management
- Execute database migrations and continuous replication using AWS DMS
- Design and optimize Amazon Redshift data warehouses and Amazon Athena federated queries
- Implement streaming data pipelines with Apache Kafka for real-time ingestion
- Manage schema changes in data feeds with automated detection and pipeline adaptation
- Create data feeds for Tableau and BusinessObjects reporting platforms
Supervisory Responsibilities
No supervisory responsibilities
Required Skills/Abilities
- Airflow: DAG development, custom operators, workflow orchestration, production deployment
- Lambda: Serverless functions, event triggers, performance optimization
- EventBridge: Event-driven architecture, rule configuration, cross-service integration
- Glue: ETL job development, crawlers, Data Catalog, schema management
- DMS: Database migrations, continuous replication, heterogeneous database integration
- Redshift: Cluster management, query optimization, workload management
- Athena: Serverless analytics, partitioning strategies, federated queries
Tableau (Expert Level)
- Develop and maintain data analogs, data cubes, queries, data visualizations, and reports
- Assist with code testing, governance, data quality, and documentation efforts
- Report data through visualizations and reports
- Collaborate with data stewards to test, clean, and standardize data
- Identify patterns and derive meaningful insights from data through qualitative and quantitative analysis
Data Technologies
- Apache Kafka: Stream processing, topic design, producer/consumer development
- SQL: Advanced querying across multiple database platforms (PostgreSQL, MySQL, Oracle)