Lead Data Engineer
Fully Remote (Chicago, IL)
Description

Legacy.com is the place where life stories live on. We are the global leader in online memorial tributes, a top-50 news website in the United States, and a destination for over 40 million unique visitors each month. Founded in 1998, Legacy.com is honored to help consumers express condolences, share direct support for families, and celebrate the people who have touched their lives. At this focused moment, users are uniquely open to profound messaging, and they respond by sharing memories, purchasing gifts, finding event information, and making plans. Legacy delivers its community-targeted content daily via 1,500 local news media partners. We continue to broaden our offerings in our mission to help people seeking information and guidance on all aspects of end-of-life. Our vision is to become a full-service end-of-life and memorialization resource.


The Lead Data Engineer is responsible for designing, building, and maintaining high-performance data pipelines and infrastructure that power data-driven applications and insights. In this role, you will collaborate with cross-functional teams, including Product, Analytics, and Infrastructure, to ensure data integrity, reliability, and scalability. You will also mentor junior data engineers, champion best practices, and help shape the future of our data ecosystem. 


We are only accepting applications from qualified candidates located in the following US states: AZ, CA, CO, CT, FL, IL, IN, MA, NC, NJ, TX, WI.


Candidates must be willing to travel to one or more of the Company's locations for meetings and workshops on a quarterly basis, or as needed by the business.


Legacy.com is unable to provide visa sponsorship for this position. Applicants must be eligible to work in the United States without sponsorship.


Responsibilities

  Data Pipeline & Platform Development
  • Develop and optimize ETL/ELT workflows using Python, Scala, or Java with tools like Airflow or dbt.
  • Build real-time and batch data pipelines leveraging technologies such as Kafka, Spark, or AWS Kinesis.
  • Ensure data integrity and quality through validation checks, unit tests, and monitoring dashboards.

  Collaboration & Leadership
  • Work closely with Product, Analytics, and Infrastructure teams to translate data requirements into technical solutions.
  • Lead blameless post-mortems to determine the root causes of data incidents and implement long-term resolutions.
  • Participate in Agile ceremonies, offering clear updates and technical insights.

  Performance & Scalability
  • Monitor and optimize pipeline performance, identifying bottlenecks and tuning systems for high throughput.
  • Scale infrastructure to handle growing data volumes, monitoring with tools like AWS CloudWatch, Grafana, or Datadog.

  DevOps & Infrastructure
  • Manage infrastructure as code (Terraform, CloudFormation) to provision and maintain cloud environments (AWS, Azure, or GCP).
  • Use containerization (Docker, Kubernetes) to deploy, run, and manage data services at scale.
  • Implement security best practices (encryption, access controls) in collaboration with Security teams.

  Mentoring & Growth
  • Lead and mentor junior data engineers, fostering a culture of knowledge-sharing and continuous improvement.
  • Participate in hiring and onboarding, setting clear expectations and best practices.

  Best Practices & Innovation
  • Champion data engineering best practices in coding, testing, documentation, and architecture.
  • Stay current on emerging trends (e.g., MLOps, serverless) and proactively suggest improvements.
  • Advocate for sunsetting legacy systems or approaches that no longer meet business needs.

Requirements

  • 10+ years of professional experience in data engineering or a related field. 
  • Expertise in building and maintaining ETL/ELT pipelines using Python, Scala, or Java. 
  • Strong SQL skills and experience with relational databases (e.g., PostgreSQL, MySQL) and data warehousing solutions (Snowflake, BigQuery, or Azure Synapse). 
  • Hands-on experience with streaming platforms (Kafka, Kinesis) and big data processing (Spark). 
  • Proficiency with cloud platforms (AWS, Azure, or GCP) and Infrastructure as Code (Terraform, CloudFormation). 
  • Proven track record of optimizing performance, troubleshooting data pipelines, and scaling infrastructure. 
  • Excellent communication and collaboration skills, including experience leading and mentoring teams. 
  • Deep understanding of CI/CD and testing frameworks for data engineering. 

Preferred/Bonus Skills 

  • Knowledge of container orchestration (Kubernetes) for data workloads. 
  • Familiarity with MLOps to integrate machine learning pipelines with data pipelines. 
  • Experience with event-driven and serverless architectures (AWS Lambda, Azure Functions). 
  • Background in data security and compliance (GDPR, HIPAA, SOC 2). 
  • Experience using AI-assisted coding tools (GitHub Copilot, Amazon CodeWhisperer) to boost productivity. 

Benefits 

Legacy.com offers a generous, comprehensive benefits package, including:

  • Medical, Dental and Vision Insurance
  • Health Savings Account and Flexible Spending Account options with a generous employer contribution (based on plan selection)
  • Basic Life and Supplemental Life insurance
  • Disability Insurance
  • Voluntary Supplemental Benefits (Hospital Indemnity, Accident, Critical Illness)
  • Flexible Paid Time Off
  • 401k plan with discretionary employer match
  • Paid Medical and Parental Leave
  • Beautiful, modern office with fully stocked kitchen and weekly catered lunch

Final compensation is determined based on a variety of factors, such as experience, education, certifications, and geographic location.


California Applicants - By applying to this job, you acknowledge that you have read Legacy.com's CCPA Applicant Privacy Notice.

Salary Description
$175,000 - $200,000 + Bonus