Infrastructure & Tooling for MLOps

Storage & Compute Layer
- all the compute resources a company has access to, and the mechanisms that determine how these resources can be used
- can be sliced into smaller compute units to be used concurrently
  - threads: used to execute a job
  - instance: “permanent” unit
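To picture compute being split into smaller units that run concurrently, here is a minimal sketch using Python's standard thread pool. The job function `run_job` and the worker count are hypothetical placeholders, not part of any specific platform.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id: int) -> str:
    # Hypothetical job: stands in for any unit of work (e.g. scoring one batch).
    return f"job-{job_id} done"


def run_concurrently(job_ids, max_workers=4):
    # Each worker thread acts as a small compute unit that picks up one job at a time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(run_job, job_ids))


if __name__ == "__main__":
    print(run_concurrently(range(3)))
```

The pool itself plays the role of the longer-lived unit here, while each submitted job is short-lived.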
Dev Environment
IDE, versioning, CI/CD, SSH, containers
Resource Management
- CRON: runs a script at a predetermined time and reports whether it failed or succeeded
- SCHEDULER: a cron that also handles dependencies between jobs (needs to know the available resources)
- ORCHESTRATORS
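The difference between a cron and a scheduler can be sketched in a few lines. This is a toy illustration, not how any real tool is implemented: `Job`, `run_and_report`, `schedule`, and the `cpus` field are all hypothetical names. `run_and_report` mimics cron (run, then report success or failure), while `schedule` also checks dependencies and available resources.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Job:
    name: str
    run: Callable[[], None]
    depends_on: list = field(default_factory=list)
    cpus: int = 1  # hypothetical resource requirement


def run_and_report(job):
    # Cron-like behaviour: run the job and report whether it succeeded or failed.
    try:
        job.run()
        return "succeeded"
    except Exception:
        return "failed"


def schedule(jobs, available_cpus):
    # Scheduler-like behaviour: respect dependencies and available resources.
    status = {}
    pending = list(jobs)
    progressed = True
    while pending and progressed:
        progressed = False
        for job in list(pending):
            deps = [status.get(d) for d in job.depends_on]
            if any(s in ("failed", "skipped") for s in deps):
                status[job.name] = "skipped"  # an upstream job did not succeed
            elif job.cpus > available_cpus:
                status[job.name] = "skipped"  # not enough resources to run it
            elif all(s == "succeeded" for s in deps):
                status[job.name] = run_and_report(job)
            else:
                continue  # dependencies not finished yet; retry on the next pass
            pending.remove(job)
            progressed = True
    for job in pending:
        status[job.name] = "skipped"  # unresolved dependency (missing or cyclic)
    return status
```

Jobs run one after another in this sketch; a real scheduler would also place them on machines and run them in parallel.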
DS Workflow Management
specify workflows as DAGs where each step is a node and each dependency is an edge
tools: Airflow, Argo, Prefect, Kubeflow, Metaflow
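A workflow expressed as a DAG can be sketched with Python's standard `graphlib` module (Python 3.9+). The step names are hypothetical, and this only illustrates the idea of ordering steps by their dependencies; it does not use the API of any of the tools listed above.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a step (node);
# its value is the set of steps it depends on (incoming edges).
workflow = {
    "extract": set(),
    "featurize": {"extract"},
    "train": {"featurize"},
    "evaluate": {"train"},
}


def execution_order(dag):
    # static_order() yields each step only after all of its dependencies.
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    print(execution_order(workflow))
```

`TopologicalSorter` raises `graphlib.CycleError` if the steps depend on each other in a loop, which is why workflows have to be acyclic.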
ML Platform
shared set of tools for ML Deployment
- Model Store:
  - model definition
  - model parameters
  - featurize & predict functions
  - dependencies
  - data (version or endpoint)
  - model generation point
  - experiment artifacts
  - tags
  MLflow is the most popular
- Feature Store:
  - feature management
  - feature consistency
  - feature computation
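The items a model store keeps track of, listed under Model Store above, can be pictured as one record per model version. The sketch below is a hypothetical in-memory schema (`ModelRecord`, `InMemoryModelStore`, and every field name are assumptions made for illustration); it is not the MLflow API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ModelRecord:
    # Hypothetical schema; field names mirror the Model Store items in these notes.
    model_definition: str                 # e.g. architecture or class name
    model_parameters: dict                # learned or configured parameters
    featurize: Callable[[dict], list]     # raw input -> features
    predict: Callable[[list], Any]        # features -> prediction
    dependencies: list                    # e.g. pinned package versions
    data: str                             # data version or endpoint
    model_generation_point: str           # where or how the model was produced
    experiment_artifacts: list = field(default_factory=list)
    tags: dict = field(default_factory=dict)


class InMemoryModelStore:
    def __init__(self):
        self._records = {}

    def register(self, name, record):
        # Append a new version and return its 1-based version number.
        versions = self._records.setdefault(name, [])
        versions.append(record)
        return len(versions)

    def get(self, name, version):
        return self._records[name][version - 1]
```

Storing `featurize` next to `predict` is one way to keep features consistent between training and serving, since both paths call the same function.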