Fetcherr, experts in deep learning, algorithms, e-commerce, and digitization, is disrupting traditional systems with its cutting-edge AI technology. At its core is the Large Market Model (LMM), an adaptable AI engine that forecasts demand and market trends with precision, empowering real-time decision-making. Initially focused on the airline industry, Fetcherr aims to bring its dynamic AI-driven solutions to other industries as well.
Fetcherr is seeking a Data Engineer to build large-scale, optimized data pipelines using cutting-edge technology and tools. We're looking for someone with advanced Python skills and a deep understanding of memory and CPU optimization in distributed environments. This is a high-impact role whose responsibilities directly influence the company's strategic decisions and data-driven initiatives.
Key Responsibilities:
- Design and build scalable, cross-client data pipelines and transformation workflows using modern ELT tools, ensuring high performance, reusability, and cost-efficiency across diverse data products. Leverage orchestration frameworks like Dagster to manage dependencies, retries, and monitoring (see the Dagster sketch after this list).
- Develop and operate distributed data processing systems that handle large-scale workloads efficiently, adapting to dynamic data volumes and infrastructure constraints. Apply frameworks such as Dask or Spark to unlock parallelism and optimize compute resource utilization (see the Dask sketch after this list).
- Deliver robust, maintainable Python solutions by applying sound software engineering principles, including modular architecture, reusable components, and shared libraries. Ensure code quality and operational resilience through CI/CD best practices and containerized deployments.
- Collaborate with data scientists, engineers, and product teams to deliver validated, analytics-ready data that aligns with business requirements. Support team-wide adoption of data modeling standards and efficient data access patterns.
- Proactively safeguard data quality and reliability by implementing anomaly detection, validation frameworks, and statistical or ML-based techniques to forecast trends and catch regressions early. Enforce backward compatibility and data contract integrity across pipeline changes (a statistical anomaly-check sketch appears after this list).
- Document workflows, interfaces, and architectural decisions in a clear and structured manner to support long-term maintainability. Maintain up-to-date data contracts, system runbooks, and onboarding guides for effective cross-team collaboration.
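For illustration, here is a minimal sketch of the kind of Dagster-orchestrated pipeline the first responsibility describes. The assets `raw_fares` and `clean_fares`, the CSV source, and the retry settings are hypothetical placeholders, not a description of Fetcherr's actual pipelines:

```python
import pandas as pd
from dagster import Definitions, RetryPolicy, asset, materialize

@asset(retry_policy=RetryPolicy(max_retries=3, delay=30))
def raw_fares() -> pd.DataFrame:
    # Extract step; a real pipeline would read from a warehouse or API,
    # with Dagster handling retries per the policy above.
    return pd.read_csv("fares.csv")  # hypothetical source file

@asset
def clean_fares(raw_fares: pd.DataFrame) -> pd.DataFrame:
    # Transform step; Dagster infers the dependency from the parameter name,
    # so monitoring and retries apply per asset rather than per script.
    return raw_fares.dropna(subset=["price"])

defs = Definitions(assets=[raw_fares, clean_fares])

if __name__ == "__main__":
    materialize([raw_fares, clean_fares])  # runs both assets in dependency order
```

Modeling each step as an asset is what makes dependencies, retries, and monitoring declarative rather than hand-rolled.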
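In the same spirit, a minimal Dask sketch of partition-wise parallel processing; the Parquet path, column names, and cluster sizing are illustrative assumptions:

```python
import dask.dataframe as dd
from dask.distributed import Client

if __name__ == "__main__":
    # Local cluster for the sketch; production would target a shared scheduler,
    # with worker counts and memory limits tuned to the workload.
    client = Client(n_workers=4, memory_limit="2GB")

    # Lazy, partitioned read: nothing is loaded into memory yet.
    fares = dd.read_parquet("s3://example-bucket/fares/")  # hypothetical path

    # The aggregation runs in parallel across partitions on the workers.
    daily_avg = fares.groupby("flight_date")["price"].mean()

    print(daily_avg.compute())  # .compute() triggers the distributed execution
    client.close()
```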
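Finally, one simple statistical technique relevant to the data-quality responsibility: a robust z-score over daily row counts that flags sudden volume regressions. The counts, the threshold, and the `flag_volume_anomalies` helper below are illustrative, not part of any specified framework:

```python
import pandas as pd

def flag_volume_anomalies(daily_counts: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return days whose row counts deviate sharply from the median."""
    median = daily_counts.median()
    mad = (daily_counts - median).abs().median()  # median absolute deviation
    robust_z = 0.6745 * (daily_counts - median) / mad  # robust z-score
    return daily_counts[robust_z.abs() > threshold]

counts = pd.Series(
    [10_250, 10_190, 10_400, 2_100, 10_310],  # fabricated daily row counts
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
print(flag_volume_anomalies(counts))  # flags the 2,100-row day as a regression
```

The median/MAD pair is used instead of mean/standard deviation because a single bad day would otherwise inflate the deviation estimate and mask itself.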