Design and maintain enterprise-scale data pipelines using AWS cloud services, handling schema evolution in data feeds and delivering analytics-ready datasets to BI platforms. This role requires hands-on expertise with the full AWS data stack and proven ability to build enterprise-grade data solutions that scale.
Essential Functions
- Build and orchestrate ETL/ELT workflows using Apache Airflow for complex data pipeline management (see the DAG sketch after this list)
- Develop serverless data processing with AWS Lambda and EventBridge for real-time transformations
- Create scalable ETL jobs using AWS Glue with automated schema discovery and catalog management
- Execute database migrations and continuous replication using AWS DMS
- Design and optimize Amazon Redshift data warehouses and Amazon Athena federated queries
- Implement streaming data pipelines with Apache Kafka for real-time ingestion
- Manage schema changes in data feeds with automated detection and pipeline adaptation
- Create data feeds for Tableau and BusinessObjects reporting platforms
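For illustration only, a minimal sketch of the kind of Airflow workflow this role orchestrates; the DAG id, task names, and schedule are hypothetical, and the example assumes Airflow 2.4 or later:

```python
# Minimal illustrative Airflow DAG; dag_id, task names, and schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Placeholder extract step; a real task would stage data from S3, DMS output, or an API.
    return ["order_1", "order_2"]


def load_to_redshift(**context):
    # Placeholder load step; a real task would COPY the staged data into Redshift.
    rows = context["ti"].xcom_pull(task_ids="extract_orders")
    print(f"Loading {len(rows)} records")


with DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_redshift", python_callable=load_to_redshift)
    extract >> load
```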
Supervisory Responsibilities
No supervisory responsibilities
Required Skills/Abilities
- Airflow: DAG development, custom operators, workflow orchestration, production deployment
- Lambda: Serverless functions, event triggers, performance optimization (see the handler sketch after this list)
- EventBridge: Event-driven architecture, rule configuration, cross-service integration
- Glue: ETL job development, crawlers, Data Catalog, schema management
- DMS: Database migrations, continuous replication, heterogeneous database integration
- Redshift: Cluster management, query optimization, workload management
- Athena: Serverless analytics, partitioning strategies, federated queries
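As a rough sketch of the Lambda/EventBridge pattern listed above, a handler that consumes an EventBridge event; the `detail` fields shown are hypothetical:

```python
# Sketch of an AWS Lambda handler invoked by an EventBridge rule; the detail fields are hypothetical.
import json


def handler(event, context):
    # EventBridge delivers the custom payload under the "detail" key.
    detail = event.get("detail", {})
    records = detail.get("records", [])

    # A real transformation would validate and forward the records (e.g., to S3 or Kinesis);
    # here we only log a summary.
    print(json.dumps({"source": event.get("source"), "record_count": len(records)}))
    return {"status": "ok", "records_processed": len(records)}
```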
Tableau (Expert Level)
- Develop and maintain data models, data cubes, queries, data visualizations, and reports
- Assist with code testing, governance, data quality, and documentation efforts
- Present data through visualizations and reports
- Collaborate with data stewards to test, clean, and standardize data
- Identify patterns and draw meaningful insights from data through qualitative and quantitative analysis
Data Technologies
- Apache Kafka: Stream processing, topic design, producer/consumer development
- SQL: Advanced querying across multiple database platforms (PostgreSQL, MySQL, Oracle)
- Python/Scala: Data processing, automation, custom pipeline development
- Schema Management: Automated detection, evolution strategies, backward compatibility (see the sketch after this list)
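As one illustration of the automated schema detection mentioned under Schema Management, a small sketch that diffs an incoming feed's schema against an expected baseline; the field names and type strings are hypothetical:

```python
# Sketch of schema-change detection for an incoming feed; the schemas shown are hypothetical.
EXPECTED_SCHEMA = {"order_id": "string", "amount": "double", "created_at": "timestamp"}


def diff_schema(incoming: dict) -> dict:
    """Return fields added, removed, or changed relative to the expected schema."""
    added = {k: v for k, v in incoming.items() if k not in EXPECTED_SCHEMA}
    removed = {k: v for k, v in EXPECTED_SCHEMA.items() if k not in incoming}
    changed = {
        k: {"expected": EXPECTED_SCHEMA[k], "actual": v}
        for k, v in incoming.items()
        if k in EXPECTED_SCHEMA and EXPECTED_SCHEMA[k] != v
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    incoming = {"order_id": "string", "amount": "decimal(10,2)", "currency": "string"}
    # Flags 'currency' as added, 'created_at' as removed, and 'amount' as changed.
    print(diff_schema(incoming))
```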
Analytics Platforms
- Tableau: Data source optimization, extract management, connection configuration, data modeling, dashboard design and redevelopment
- BusinessObjects: Universe design, report development, data feed creation
Education and Experience
- 5+ years AWS data platform development
- 3+ years production Airflow experience with complex workflow orchestration
- Proven experience managing high-volume data feeds (TB+ daily) with schema evolution
- Database migration expertise using DMS for enterprise-scale projects
- BI integration experience with Tableau and BusinessObjects platforms
- 2+ years of Tableau experience
Key Competencies
- Design fault-tolerant data pipelines with automated error handling and recovery
- Handle schema changes in real-time and batch data feeds without pipeline disruption
- Optimize performance across streaming and batch processing architectures
- Implement data quality validation and monitoring frameworks (see the sketch after this list)
- Coordinate cross-platform data synchronization and lineage tracking
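As one possible shape for the data quality validation mentioned above, a minimal row-level check; the rules and field names are hypothetical:

```python
# Minimal sketch of row-level data quality checks; rule logic and field names are hypothetical.
def validate_row(row: dict) -> list:
    """Return human-readable failures for a single record."""
    failures = []
    if not row.get("order_id"):
        failures.append("order_id is missing")
    if row.get("amount") is not None and row["amount"] < 0:
        failures.append("amount is negative")
    return failures


if __name__ == "__main__":
    print(validate_row({"order_id": "", "amount": -5.0}))
    # -> ['order_id is missing', 'amount is negative']
```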
Preferred Qualifications
- AWS Data Analytics Specialty or Solutions Architect Professional certification
- Experience with Infrastructure as Code (Terraform, CloudFormation)
- Knowledge of DataOps practices and CI/CD for data pipelines
- Containerization experience (Docker, ECS, EKS) for data workloads
Working Conditions
Work is generally performed within an indoor office environment utilizing standard office equipment.
The general office environment requires frequent sitting; dexterity of the hands and fingers to operate a computer keyboard and mouse; walking and standing for extended periods; and lifting of less than 20 pounds.