End-to-End Machine Learning Workflow in AWS SageMaker

Introduction

I completed this project as part of a Udacity scholarship, and it walks through a complete workflow in SageMaker. AWS SageMaker provides a powerful environment for building, training, and deploying machine learning (ML) models at scale. This project demonstrates an end-to-end ML pipeline using AWS SageMaker, AWS Lambda, and AWS Step Functions. The workflow automates data collection, preprocessing, model training, deployment, and inference, forming a seamless machine learning pipeline in the cloud.

Workflow Overview

This ML project follows a structured approach involving multiple AWS services:

  1. Data Collection & Preprocessing – Using a SageMaker Jupyter notebook, data is collected and cleaned to ensure consistency and quality before training.
  2. Model Training & Deployment – A built-in SageMaker model is trained for image classification and deployed as an endpoint.
  3. Lambda Functions for Processing & Inference – Three AWS Lambda functions are created:
    • Image Serialization – Converts input images into a format suitable for model inference.
    • Inference Execution – Calls the SageMaker endpoint and retrieves predictions.
    • Thresholding – Filters predictions based on a confidence level of at least 70%.
  4. AWS Step Functions for Orchestration – Step Functions automate the execution of the Lambda functions and handle task coordination.
  5. Execution Logging & Permissions – The project includes an execution-detail.json file that logs execution steps, while IAM roles are configured to provide the necessary permissions for each service.

Implementation Details

1. Data Collection & Preprocessing

The project begins with a SageMaker Jupyter notebook, where the dataset is loaded, cleaned, and transformed for training. Essential steps include:

  • Handling missing values
  • Normalizing image data
  • Splitting the dataset into training and validation sets
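The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the function name, the 80/20 split, and the assumption of 8-bit pixel data are mine.

```python
import numpy as np

def preprocess(images, labels, val_fraction=0.2, seed=42):
    """Normalize 8-bit pixel values to [0, 1] and split into
    training and validation sets (a sketch of the notebook steps)."""
    images = images.astype("float32") / 255.0   # normalize image data
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))          # shuffle before splitting
    n_val = int(len(images) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])
```

Handling missing values depends on the dataset at hand (for images, this usually means dropping corrupt or unreadable files rather than imputing).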

2. Model Training & Deployment

A built-in SageMaker image classification model is selected and trained using the preprocessed data. After training, the model is deployed as an endpoint, enabling real-time predictions.
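A hedged sketch of what training and deploying the built-in image-classification algorithm looks like with the SageMaker Python SDK. The bucket, role ARN, instance types, and hyperparameter values are placeholders, not the project's actual settings; the SDK import is deferred so the helper can be read and tested without AWS access.

```python
def build_hyperparameters(num_classes, num_training_samples, epochs=10):
    """Assemble hyperparameters for the built-in image-classification
    algorithm (values here are illustrative, not the project's)."""
    return {
        "num_classes": str(num_classes),
        "num_training_samples": str(num_training_samples),
        "epochs": str(epochs),
        "image_shape": "3,224,224",
    }

def train_and_deploy(role_arn, bucket):
    # Imported lazily so this sketch can be read without the SageMaker SDK.
    import sagemaker
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator

    session = sagemaker.Session()
    image_uri = image_uris.retrieve("image-classification", session.boto_region_name)
    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="ml.p3.2xlarge",      # GPU instance for training
        output_path=f"s3://{bucket}/output",
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(**build_hyperparameters(2, 1000))
    estimator.fit({
        "train": f"s3://{bucket}/train",
        "validation": f"s3://{bucket}/validation",
    })
    # Deploy the trained model as a real-time inference endpoint.
    return estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```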

3. AWS Lambda Functions

Three Lambda functions are implemented to handle different aspects of the inference pipeline:

  • Serialization Function – Prepares the input image for processing.
  • Inference Function – Calls the SageMaker endpoint to obtain predictions.
  • Thresholding Function – Filters predictions, retaining only those with a confidence of at least 70%.
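The three functions might be sketched as below. This is an illustration under my own assumptions: the handler names, the error message, and the image/PNG content type are placeholders, and boto3 is imported inside the inference function so the pure-logic handlers can run without AWS credentials.

```python
import base64
import json

CONFIDENCE_THRESHOLD = 0.70  # retain predictions with confidence >= 70%

def serialize_image(image_bytes):
    """Lambda 1 (sketch): base64-encode raw image bytes so they can be
    passed between Step Functions states as JSON-safe text."""
    return {"image_data": base64.b64encode(image_bytes).decode("utf-8")}

def run_inference(event, endpoint_name="image-classification-endpoint"):
    """Lambda 2 (sketch): call the SageMaker endpoint with the decoded image.
    The endpoint name and content type are placeholder assumptions."""
    import boto3  # imported here so the other handlers stay AWS-free
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="image/png",
        Body=base64.b64decode(event["image_data"]),
    )
    event["inferences"] = json.loads(response["Body"].read())
    return event

def threshold_filter(event):
    """Lambda 3 (sketch): pass the event through only if at least one
    class confidence meets the threshold; otherwise fail the execution."""
    if not any(conf >= CONFIDENCE_THRESHOLD for conf in event["inferences"]):
        raise ValueError("THRESHOLD_CONFIDENCE_CHECK_FAILED")
    return event
```

Raising an exception in the thresholding function makes the Step Functions execution fail visibly for low-confidence inputs instead of silently forwarding them.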

4. AWS Step Functions for Workflow Automation

AWS Step Functions manage the execution of the Lambda functions in a structured sequence. The execution-detail.json file captures a JSON record of each workflow execution, supporting traceability and debugging.
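A state machine chaining the three Lambda functions could look like the following Amazon States Language sketch. The state names and function ARNs are placeholders, not the project's actual definition:

```json
{
  "Comment": "Sketch: chain the three Lambda functions (ARNs are placeholders)",
  "StartAt": "SerializeImage",
  "States": {
    "SerializeImage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:serialize-image",
      "Next": "RunInference"
    },
    "RunInference": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:run-inference",
      "Next": "FilterLowConfidence"
    },
    "FilterLowConfidence": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:filter-threshold",
      "End": true
    }
  }
}
```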

Key Considerations

  • IAM Roles & Permissions: The Lambda functions require IAM roles with policies granting S3 full access, Lambda execution, and SageMaker full access in order to function correctly.
  • Scalability & Cost Optimization: SageMaker enables scalable training and deployment, while Lambda ensures efficient inference processing without dedicated infrastructure.
  • Enhancements & Future Work: Possible improvements include model retraining automation, performance monitoring, and multi-model support.

Conclusion

This AWS SageMaker ML workflow provides a robust and scalable solution for deploying machine learning models in the cloud. By integrating SageMaker with Lambda functions and Step Functions, the project automates the ML lifecycle from data processing to inference, offering a seamless and efficient pipeline.

Link to the project
