Project: Guardrails Model Training

DDWAW-DAED — EY GDS Consulting (Mar 2025 – Present)

An end-to-end language model fine-tuning and evaluation platform built on AWS, enabling scalable model customization with guardrails for enterprise-grade safety and alignment.

Business Problem Statement

The client required a robust pipeline to fine-tune and evaluate large language models for guardrails use cases, ensuring responses remain safe, aligned, and governed. Existing tooling was limited to an on-premises Dell cluster with no cloud-native integration, no synthetic data strategy, and no repeatable evaluation framework, making scalable model adaptation and deployment impractical.

Role & Contributions

Cloud Architecture & PoC Development

  • Architected the high-level cloud solution using AWS services, defining the end-to-end approach for model customization, deployment, and inference aligned with enterprise PoC architecture standards.
  • Developed a fine-tuning application PoC — including dataset & model onboarding, training workflow, and evaluation pipelines — enabling seamless end-to-end language model customization and establishing the foundational workflow for scalable model adaptation.
  • Enhanced the fine-tuning framework on AWS by extending the baseline fine-tuning script provided by the onsite team (originally developed for the Dell cluster), ensuring compatibility and cloud-native integration.
  • Deployed base and fine-tuned models on AWS via CMI and executed comprehensive latency-focused inference testing using benchmark scenarios, evaluating performance across base models and measuring improvements achieved through fine-tuning.
  • Led the end-to-end deployment and validation of the application within the EY AWS Sandbox environment, ensuring seamless integration and successful functional testing.
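The latency-focused inference testing described above can be sketched as a small benchmarking harness. This is a minimal illustration only: the `infer` callable, prompt list, and percentile choices are assumptions, and in practice `infer` would wrap the client for the deployed base or fine-tuned endpoint rather than a local stub.

```python
import statistics
import time

def benchmark_latency(infer, prompts, runs=3):
    """Measure per-request inference latency (seconds) over benchmark prompts.

    `infer` is any callable prompt -> response; here a stub stands in for
    the deployed model endpoint (illustrative, not the project's API).
    """
    samples = []
    for _ in range(runs):
        for prompt in prompts:
            start = time.perf_counter()
            infer(prompt)
            samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50": statistics.median(samples),
        "p95": samples[int(0.95 * (len(samples) - 1))],
        "mean": statistics.fmean(samples),
    }

# Stubbed usage; a real comparison would run this against both the base
# and the fine-tuned endpoint and diff the two reports.
report = benchmark_latency(lambda p: "ok", ["What is MFA?", "Summarize policy X."])
```

Running the same harness against base and fine-tuned deployments gives directly comparable p50/p95 figures for the improvement measurements mentioned above.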

Governance, SDG Pipeline & Model Training

  • Established foundational setup and governance: selected base models, transferred model weights to the cluster, set up the Vanguard governance cadence, open-code processes, and budget tracking, and defined the model transfer protocol.
  • Designed and operationalized the Synthetic Data Generation (SDG) pipeline: developed the SDG strategy from analysis of existing data, defined target data volumes and formats, selected and labeled QA pairs, generated and validated synthetic datasets, authored the SDG scripts, and stood up a red-teaming SDG pipeline to improve data quality and safety.
  • Built the complete evaluation framework — including evaluation scripts, baseline model evaluation, and configuration validation — and stood up all base models on the cluster for inference, followed by detailed inference testing and benchmarking.
  • Defined training and output data formats, formatted training datasets, deployed training scripts across model families, and executed the initial training run followed by multiple training iterations covering hypothesis documentation, data and algorithm modifications, evaluation runs, and results analysis.
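The dataset-formatting step above can be illustrated with a short sketch that maps labeled QA pairs into a chat-style JSONL training format. The schema (a `messages` list with `system`/`user`/`assistant` roles) is a common fine-tuning target format and is an assumption here; the project's actual data format is not shown.

```python
import json

def to_training_record(question, answer,
                       system="You are a safety-aligned assistant."):
    """Map one labeled QA pair into a chat-format training record.

    Rejects empty fields so malformed pairs are caught before training
    (a stand-in for the fuller validation the pipeline would apply).
    """
    if not question.strip() or not answer.strip():
        raise ValueError("QA pair must be non-empty")
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

def write_jsonl(pairs, path):
    """Write (question, answer) pairs as one JSON record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for q, a in pairs:
            f.write(json.dumps(to_training_record(q, a)) + "\n")

record = to_training_record("Is this request in policy scope?", "Yes, it is.")
```

One record per line keeps the dataset streamable, which matters when the same formatter is reused across model families with different tokenizers.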

Tools & Frameworks Used