ML Infrastructure
2023 · 3 months

ML Feature Store Platform

Series B FinTech Startup

Built a centralized feature store that reduced model development time from weeks to days, while ensuring consistent feature computation between training and production serving.

Feast · Apache Spark · BigQuery · Redis · Apache Airflow · Python · GCP · Docker

The Problem

The data science team was spending 80% of their time on feature engineering and only 20% on actual model development. Each data scientist was recomputing the same features independently, creating subtle inconsistencies between training and serving — a common driver of model degradation in production. The team needed a shared infrastructure layer.

Architecture & Strategy

Designed a hybrid feature store with Feast as the registry and serving layer, backed by separate offline and online stores, each optimized for its own access pattern.

  • Deployed Feast as the feature registry and serving layer, providing a unified API for both training data retrieval and online inference

  • Built Apache Spark feature pipelines computing 200+ features from raw transaction data on daily and hourly Airflow schedules

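  • Used BigQuery as the offline store for historical training datasets, leveraging its columnar storage for efficient scans over feature history and point-in-time-correct joins so each training row sees only feature values available at its event timestamp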

  • Integrated Redis as the online store for low-latency (single-digit-millisecond) feature serving during real-time fraud scoring

  • Created a feature governance workflow with automated data quality checks and statistical drift detection alerts
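
To make the point-in-time correctness in the list above concrete, here is a minimal sketch using only the Python standard library. It is not the project's pipeline code: the function name `point_in_time_join`, the tuple layouts, and the `ttl` parameter are all hypothetical choices made for illustration. The idea is that each training label is joined to the latest feature value recorded at or before the label's event time, and values older than a freshness window are treated as missing, so no future information leaks into training data.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def point_in_time_join(label_events, feature_rows, ttl):
    """Attach to each label event the latest feature value observed at or
    before the event time, ignoring values older than `ttl`.

    label_events: iterable of (entity_id, event_ts)          -- hypothetical layout
    feature_rows: iterable of (entity_id, feature_ts, value) -- hypothetical layout
    ttl:          timedelta giving the maximum allowed feature age
    """
    # Group feature rows by entity and sort each group by timestamp.
    grouped = {}
    for entity_id, feature_ts, value in feature_rows:
        grouped.setdefault(entity_id, []).append((feature_ts, value))

    # Keep parallel lists of timestamps and values so bisect can search them.
    index = {}
    for entity_id, rows in grouped.items():
        rows.sort(key=lambda row: row[0])
        index[entity_id] = ([ts for ts, _ in rows], [v for _, v in rows])

    joined = []
    for entity_id, event_ts in label_events:
        timestamps, values = index.get(entity_id, ([], []))
        # Position of the last feature row with feature_ts <= event_ts.
        pos = bisect_right(timestamps, event_ts) - 1
        value = None
        if pos >= 0 and event_ts - timestamps[pos] <= ttl:
            value = values[pos]
        joined.append(
            {"entity_id": entity_id, "event_ts": event_ts, "feature": value}
        )
    return joined


if __name__ == "__main__":
    # Made-up sample data: a transaction-count feature for one card.
    features = [
        ("card_1", datetime(2023, 1, 1, 0), 3),
        ("card_1", datetime(2023, 1, 1, 12), 7),
        ("card_1", datetime(2023, 1, 2, 6), 9),
    ]
    labels = [("card_1", datetime(2023, 1, 1, 13))]
    for row in point_in_time_join(labels, features, ttl=timedelta(days=1)):
        print(row)
```

In the sample data, the label at 13:00 on 1 January picks up the value written at 12:00 rather than the later value from 2 January, which is the behaviour that keeps training data consistent with what the online store would have served at that moment.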

Results

  • Model development cycle time reduced from 3–4 weeks to 4–5 days per model

  • Training-serving skew eliminated across all 12 production models

  • Feature reuse reached 67%: new models drew on existing features rather than recomputing them from scratch

  • Online feature serving latency at p99: 2.3 ms, enabling real-time fraud scoring with no UX impact