Our team’s mission is to build a reliable, scalable, and flexible "micro-data" platform for developing, managing, and serving signals for all of Pinterest. This platform aggregates and serves all signals about pinners, pins, and boards, and indexes all pins through batch, incremental, and real-time pipelines to power our discovery engine, which generates and ranks pin candidates.
- Own, improve, and scale the existing static signal platform, which coordinates tens of jobs and processes hundreds of TB of data each day
- Own and maintain the system that periodically rebuilds the full index of Pinterest’s catalog of ideas, an index that powers almost all of Pinterest’s products
- Drive the roadmap for the next-generation real-time Pinterest signal platform and build the system to instantly and incrementally update the indices
- Deep expertise in batch or real-time data processing (Hadoop or Spark) at consumer Internet scale
- Strong ability to work cross-functionally and drive projects end-to-end
- Expert in C/C++ or Java
- Fluent in Python