Production Data Engineering for Machine Learning
Designing, Building, and Operating the Data Foundations of Real-World ML Systems Framework and Blueprints
No se pudo agregar al carrito
Solo puedes tener X títulos en el carrito para realizar el pago.
Add to Cart failed.
Por favor prueba de nuevo más tarde
Error al Agregar a Lista de Deseos.
Por favor prueba de nuevo más tarde
Error al eliminar de la lista de deseos.
Por favor prueba de nuevo más tarde
Error al añadir a tu biblioteca
Por favor intenta de nuevo
Error al seguir el podcast
Intenta nuevamente
Error al dejar de seguir el podcast
Intenta nuevamente
Obtén 30 días de Standard gratis
$8.99 al mes después de que termine la prueba. Cancela en cualquier momento
Compra ahora por $7.99
-
Narrado por:
-
Virtual Voice
-
De:
-
Jordan O'Neal
Este título utiliza narración de voz virtual
Voz Virtual es una narración generada por computadora para audiolibros..
Inside this book, readers will learn how to:
- Design durable schemas: Build feature schemas with explicit nullability, evolvable enumerations, and semantic versioning so retraining cycles and upstream changes do not require quarter-long migrations.
- Choose the right ingestion strategy: Apply a structured decision framework for batch, micro-batch, and streaming ingestion based on freshness requirements, cost tolerance, and source-system characteristics, with every connector designed to be replay-safe by default.
- Define and enforce data contracts: Establish enforceable producer-consumer contracts that prevent silent breakage between teams and wire those contracts into CI as build gates rather than relying on coordination by Slack message.
- Validate features before they reach a model: Build a layered validation stack covering per-record checks at ingestion, statistical distribution tests at the batch level, and contract tests across pipeline stages.
- Detect drift before it reaches customers: Instrument pipelines with per-feature distribution monitors and freshness alerts calibrated against historical baselines, turning data quality into an observable, actionable signal.
- Build lineage and observability: Trace every record from source through training, feature serving, and inference, so when a model misbehaves the root cause is answered in minutes, not days.
- Architect and operate a feature store: Eliminate training-serving skew through shared feature definitions and point-in-time correctness, and manage the full feature lifecycle from proposal through deprecation.
- Protect data through privacy and compliance: Apply sensitivity classification, access controls, and audit logging as first-class engineering concerns embedded in pipeline design.
- Engineer data for LLMs and RAG systems: Manage the retrieval corpora, embedding pipelines, chunking strategies, and quality gates that production large language model and retrieval-augmented generation systems require.
adbl_web_anon_alc_button_suppression_c
Todavía no hay opiniones