ML Feature Store Platform

Series B FinTech Startup

Built a centralized feature store that reduced model development time from weeks to days, while ensuring consistent feature computation between training and production serving.

Feast, Apache Spark, BigQuery, Redis, Apache Airflow, Python, GCP, Docker

The Problem

The data science team was spending 80% of their time on feature engineering and only 20% on actual model development. Each data scientist was recomputing the same features independently, creating subtle inconsistencies between training and serving — a common driver of model degradation in production. The team needed a shared infrastructure layer.

Architecture & Strategy

Designed a hybrid feature store built around Feast, backed by separate offline and online stores, each optimized for its own access pattern.

Deployed Feast as the feature registry and serving layer, providing a unified API for both training data retrieval and online inference
Built Apache Spark feature pipelines computing 200+ features from raw transaction data on daily and hourly Airflow schedules
Used BigQuery as the offline store for historical training datasets: its columnar storage keeps large historical scans efficient, and point-in-time-correct joins keep future feature values out of training rows
Integrated Redis as the online store for low-latency (single-digit-millisecond) feature serving during real-time fraud scoring
Created a feature governance workflow with automated data quality checks and statistical drift detection alerts
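
To make the point-in-time idea concrete, here is a minimal sketch in plain Python. It is not the project's code and does not use the Feast or BigQuery APIs; the feature name txn_count_7d, the entity ids, and all sample values are hypothetical. For each labeled event, it picks the most recent feature value recorded at or before the event's timestamp.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history: entity id -> list of (event_timestamp, value),
# sorted by timestamp. "txn_count_7d" is an illustrative feature name.
FEATURE_HISTORY = {
    "card_1": [
        (datetime(2024, 1, 1), 3),
        (datetime(2024, 1, 5), 7),
        (datetime(2024, 1, 10), 12),
    ],
}

# Hypothetical labeled events: (entity id, event timestamp, fraud label).
LABELS = [
    ("card_1", datetime(2024, 1, 6), 0),
    ("card_1", datetime(2024, 1, 10), 1),
    ("card_2", datetime(2024, 1, 6), 0),
]


def point_in_time_lookup(history, as_of):
    """Return the latest value with timestamp <= as_of, or None if there is none."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx > 0 else None


def build_training_rows(labels, feature_history):
    """Attach to each labeled event the feature value known at that moment."""
    rows = []
    for entity_id, event_ts, label in labels:
        history = feature_history.get(entity_id, [])
        rows.append({
            "entity_id": entity_id,
            "event_timestamp": event_ts,
            "txn_count_7d": point_in_time_lookup(history, event_ts),
            "label": label,
        })
    return rows


if __name__ == "__main__":
    for row in build_training_rows(LABELS, FEATURE_HISTORY):
        print(row)
```

In this sketch the January 6 event receives the value recorded on January 5 (7), not the later January 10 value (12), so no future information reaches the training row. An entity with no history, such as card_2, gets None.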

Results

Model development cycle time reduced from 3–4 weeks to 4–5 days per model
Training-serving skew eliminated across all 12 production models
Feature reuse reached 67% — new models leveraged existing features rather than recomputing from scratch
Online feature serving latency at p99: 2.3 ms, enabling real-time fraud scoring with no UX impact
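
For readers unfamiliar with the p99 figure, the sketch below shows one common way to compute a percentile from raw latency samples, the nearest-rank method. The project's actual measurement method is not stated, so treat this as an assumption; the function name and the synthetic sample values are hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank definition: rank = ceil(pct / 100 * n), using 1-based ranks.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic latencies in milliseconds (1.0, 2.0, ..., 100.0), for illustration only.
    latencies_ms = [float(i) for i in range(1, 101)]
    print(percentile_nearest_rank(latencies_ms, 99))
```

A p99 of 2.3 ms therefore means that 99% of feature lookups completed in 2.3 ms or less, with only the slowest 1% taking longer.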
