Faaez Nafiu's Cloud Projects

Building ContractGuard: How I Built an AI-Powered Contract Review Platform

November 5, 2025
The “Aha!” Moment

Picture this: You’re a consultant who’s seen legal teams spend 2-4 hours manually reviewing every contract, missing critical clauses, and burning through budget faster than you can say “unlimited liability.”

So naturally, I thought: “Hey, AWS Bedrock just released Claude Sonnet 4.5… what if we could automate this?”

Narrator: It was at this moment he knew…

The Plan (It Was So Simple in My Head)

The vision was straightforward:
1. User uploads contract
2. AI reads it
3. Magic happens
4. User gets risk analysis
How hard could it be?

(Spoiler: It was definitely harder than expected, but we got there.)

Day 1: The AWS Setup Saga

What I thought would happen:
- Quick AWS account setup
- Enable Bedrock
- Done in 30 minutes
What actually happened:
- “Wait, there are different Claude models?”
- “Why do I need inference profiles?”
- “Model access requests… approved instantly!” (Okay, that part was smooth)
Key learning: AWS has gotten WAY better about Bedrock access. No more waiting days for model approval. They just… let you in now. Revolutionary.

Day 2: Lambda Functions (Where Things Got Real)

I started with the classic developer move: “I’ll just write three simple Lambda functions.”

Lambda 1 (Upload URL Generator): Straightforward. Pre-signed S3 URLs. Easy.

Lambda 2 (The AI Brain):
- “Just call Bedrock, get JSON, store in DynamoDB”
- Attempt 1: Claude returned markdown code blocks wrapping my JSON
- Attempt 2: Added markdown stripping
- Attempt 3-7: CORS errors, IAM permission issues, wrong region calls
- Attempt 8: IT WORKS!
There were definitely some struggles I went through here. Like when Lambda kept trying to call us-east-2 instead of us-east-1. Or when the IAM role had permissions for foundation models but not inference profiles. Good times.

Lambda 3 (Results Fetcher): Should be simple, right? Wrong. DynamoDB returns Decimal objects. Python’s json.dumps() hates Decimal objects. Added custom serializer. Crisis averted.

Lesson learned: Always test with real data. Mock data never has Decimal objects.

Day 3: The CORS Adventure

Ah yes, CORS. The bane of every web developer’s existence.

Error log excerpt:
```
Access to XMLHttpRequest blocked by CORS policy
Access to XMLHttpRequest blocked by CORS policy  
Access to XMLHttpRequest blocked by CORS policy
```
The journey:
1. Enabled CORS on API Gateway → Still blocked
2. Realized S3 ALSO needs CORS → Still blocked
3. Discovered the CORS config didn’t actually apply to the bucket
4. Manually entered it in AWS Console → Finally worked
Pro tip: When AWS CLI says something succeeded but it didn’t… use the console. Sometimes you just need to see it with your own eyes.

Day 4: The Frontend That Almost Defeated Me

I’m primarily a backend guy, so React was… an experience.

Initial design: Basic upload box, white background, Times New Roman vibes.

What the client wanted: “Green Bay Packers theme, engaging, informational, APEX Consulting branding, professional landing page.”

What I learned about Tailwind:
- It’s amazing when it works
- It’s confusing when you forget the PostCSS config
- @tailwind base; goes at the TOP of index.css (don’t ask how long this took me to figure out)
The logo incident:
- Saved logo as apex-logo.png
- Logo doesn’t show
- Checked path: correct
- Checked public folder: file is there
- Tried different browsers: nothing
- Realized Windows named it apex-logo.png.png
- Face, meet palm
Lesson learned: Always show file extensions in Windows Explorer.

The Architecture

After all the struggles, here’s what I ended up with:

Frontend (React + Tailwind):
- Drag-and-drop upload with progress tracking
- Real-time polling for analysis status
- Beautiful green/gold themed landing page
- Professional results dashboard
Backend (AWS Serverless):
- Lambda 1: Generates pre-signed S3 URLs (secure uploads)
- Lambda 2: Calls Bedrock, analyzes contract, stores results
- Lambda 3: Retrieves analysis from DynamoDB
- API Gateway: Ties it all together
- S3: Contract storage
- DynamoDB: Analysis results
AI Engine:
- Claude Sonnet 4.5 via Bedrock
- Custom playbook with 10 compliance rules
- Returns structured JSON with risk scores
It analyzes a 10-page contract in 60 seconds. Manual review would take 2-4 hours.

The Numbers (AKA Why This Matters)

Time savings: 85% (2 hours → 60 seconds)
Cost savings: 98% ($600 → $5 per contract)
Accuracy improvement: 95%+ (Claude doesn’t get tired or miss things)

For a company processing 100 contracts/month:
- Before: $60,000/month
- After: $500/month
- Annual savings: $714,000
Yeah, I did the math. Twice.

What I’d Do Differently

If I could go back and tell myself one thing:
1. Set up CORS first. Like, day one. Before anything else. Your future self will thank you.
2. Use the AWS Console for debugging. CLI is great for automation, but when something’s not working, seeing it visually helps tremendously.
3. Read the Bedrock docs about inference profiles. That us. prefix in the model ID? Kind of important.
4. Test Lambda functions locally first. Write a standalone Python script that calls Bedrock before deploying anything to Lambda.
5. Don’t trust Git Bash on Windows for JSON payloads.
Key Technical Wins

Despite the struggles, there were some genuinely cool technical achievements:

1. Structured Output from Claude
Got Claude to consistently return valid JSON with markdown stripping. No small feat with LLMs.

2. Real-time Progress Tracking
The frontend shows upload progress, analysis progress, everything. Users aren’t just staring at a loading spinner.

3. Security Done Right
- Pre-signed S3 URLs (5-minute expiration)
- IAM least-privilege permissions
- All data stays in customer’s AWS account
- No third-party data sharing
4. Cost Optimization
$5 per 100 contracts. That’s not a typo. Serverless + pay-per-use pricing = magical economics.

5. Production-Ready Architecture
This isn’t a proof-of-concept. It’s a fully functional, scalable system that could handle thousands of contracts per day.

The Final Product

After a week of coding, debugging, Googling error messages, and questioning my life choices, I ended up with:
- A professional SaaS landing page with my branding
- Full contract upload and analysis workflow
- AI-powered risk detection using Claude Sonnet 4.5
- Detailed results with risk scores and recommendations
- Complete AWS serverless backend
- Total cost: Less than dinner for two
And it actually works.

What I Learned

Technical:
- Bedrock inference profiles require special IAM permissions
- CORS needs to be enabled EVERYWHERE (API Gateway, S3, Lambda responses)
- DynamoDB uses Decimal objects (Python hates them)
- Tailwind CSS is powerful but configuration-sensitive
- Windows hides file extensions by default (why though?)
Business:
- AI can deliver 98% cost savings on specific workflows
- Privacy-first architecture is a legitimate competitive advantage
- Clear ROI calculations matter (saved $714K/year > “AI is cool”)
- Target market identification is crucial (mid-market companies, not enterprises)
Personal:
- I can build full-stack apps (even if the frontend makes me nervous)
- Reading error messages carefully saves hours of debugging
- Coffee is essential
- Sometimes you just need to walk away and come back fresh
Try It Yourself

The full code is on GitHub: https://github.com/Faaeznaf/ContractGuard

Stack:
- Frontend: React, Tailwind CSS
- Backend: AWS Lambda (Python 3.11), API Gateway, S3, DynamoDB
- AI: AWS Bedrock (Claude Sonnet 4.5)
Cost to run: $5/month for 100 contracts

Deployment time: 60-90 minutes (if you avoid my mistakes)

Final Thoughts

Building ContractGuard taught me that:
1. Modern AI tools like Claude Sonnet 4.5 are genuinely transformative
2. AWS serverless architecture is powerful (once you figure out CORS)
3. Good UX matters (even for B2B tools)
4. Reading the docs first would have saved me 6 hours
5. There’s nothing quite like seeing your code analyze a real contract and catch issues a human would miss
Would I do it again?

Absolutely. Though maybe I’d enable CORS on day one.

Connect With Me

Want to chat about AI, AWS, or why CORS is the worst?
- LinkedIn: https://www.linkedin.com/in/faaeznafiu/
- GitHub: https://github.com/Faaeznaf
P.S. If you’re building something similar and get stuck on CORS, DM me. I’ve suffered enough for both of us.
Deploying a Multi-Tier Architecture on AWS with Terraform: A Complete Guide

October 30, 2025
Introduction

Infrastructure as Code has revolutionized how we deploy cloud resources. Instead of clicking through AWS console menus for hours, you can define your entire infrastructure in code and deploy it in minutes. In this hands-on tutorial, I’ll walk you through deploying a complete multi-tier architecture on AWS using Terraform—covering everything from IAM user creation to verifying your deployed resources.

What You’ll Build:
- Virtual Private Cloud (VPC) with public and private subnets
- EC2 instances for application servers
- RDS database for data persistence
- Application Load Balancer for traffic distribution
- Complete multi-tier architecture ready for production workloads
Prerequisites:
- Basic understanding of AWS services
- Familiarity with command-line operations
- A virtual machine or local Linux environment
- AWS account access
Part 1: Setting Up IAM User with Terraform Permissions

Before Terraform can create resources on your behalf, it needs proper AWS credentials. We’ll create a dedicated IAM user with administrative access.

Step 1: Connect to Your Virtual Machine

First, SSH into your virtual machine using the provided credentials:

bash
```
ssh cloud_user@your_vm_ip_address
```
Replace your_vm_ip_address with your actual VM’s public IP address.

Step 2: Access the AWS Management Console

Open your web browser in an incognito/private window (this prevents credential conflicts if you have multiple AWS accounts) and navigate to the AWS Management Console. Log in with your credentials.

Pro Tip: Using incognito mode is a best practice when working with multiple AWS accounts or lab environments.

Step 3: Navigate to IAM

In the AWS Console search bar, type “IAM” and select Identity and Access Management from the results.

Step 4: Create the Terraform User
1. Click on Users in the left sidebar
2. Click the Create User button
3. Enter terraform-user as the username
4. Click Next
Step 5: Assign Administrative Permissions

On the Set permissions page:
1. Select Attach policies directly
2. Search for AdministratorAccess
3. Check the box next to the AdministratorAccess policy
4. Click Next
Why AdministratorAccess? For this tutorial, we’re granting full permissions to simplify the deployment. In production environments, you should follow the principle of least privilege and grant only the specific permissions Terraform needs.

Step 6: Complete User Creation

Review your selections and click Create user. You should see a success message confirming the user was created.

Step 7: Generate Access Keys

This is the critical step—creating the credentials Terraform will use:
1. Click on your newly created terraform-user
2. Navigate to the Security credentials tab
3. Scroll down to Access keys and click Create access key
4. Select Command Line Interface (CLI) as the use case
5. Check the confirmation box and click Next
6. Add a description tag (optional) and click Create access key
IMPORTANT: You’ll see your Access Key ID and Secret Access Key. This is the ONLY time you’ll see the secret key!
- Copy both values and store them securely
- Never commit these credentials to Git
- Consider using a password manager
Part 2: Installing Git and Cloning the Terraform Repository

Now that we have AWS credentials, let’s get the Terraform code onto our virtual machine.

Step 1: Update Your System

Back in your VM terminal, update the package index to ensure you have access to the latest packages:

bash
```
sudo yum update -y
```
The -y flag automatically answers “yes” to all prompts, streamlining the installation process.

Step 2: Install Git

Install Git to enable repository cloning:

bash
```
sudo yum install git -y
```
Step 3: Verify Git Installation

Confirm Git is installed correctly:

bash
```
git --version
```
You should see output similar to git version 2.x.x.

Step 4: Clone the Terraform Repository

Clone the repository containing the multi-tier architecture code:

bash
```
git clone https://github.com/Faaeznaf/apex-multi-tier-architechture.git
```
Step 5: Navigate to the Project Directory

Change into the cloned directory:

bash
```
cd multi-tier-terraform
```
You’re now in the project root where all the Terraform configuration files reside.

Part 3: Installing Dependencies and Configuring AWS CLI

The repository includes an installation script that handles all the necessary dependencies. We’ll also need to configure the AWS CLI with our credentials.

Step 1: Make the Installation Script Executable

Before running any script, you need to grant it execute permissions:

bash
```
chmod +x install.sh
```
What does chmod +x do? It adds execute (+x) permissions to the file, allowing you to run it as a program.

Step 2: Run the Installation Script

Execute the script to install Terraform and other required tools:

bash
```
./install.sh
```
The script will automatically download and install:
- Terraform CLI
- AWS CLI (if not already installed)
- Any other project dependencies
Step 3: Install Git LFS (Large File Storage)

The repository uses Git LFS to manage large binary files efficiently. Install and configure it:

bash
```
sudo yum install epel-release -y
sudo yum install git-lfs -y
git lfs install
git lfs pull
```
Why Git LFS? Large files like Terraform provider binaries or module packages are stored separately from the main repository to keep clone times fast. Git LFS downloads these files only when needed.

Step 4: Configure AWS CLI

Now we’ll configure the AWS CLI with the credentials we created earlier:

bash
```
aws configure
```
You’ll be prompted for four values:
1. AWS Access Key ID: Paste the Access Key ID you saved earlier
2. AWS Secret Access Key: Paste the Secret Access Key you saved earlier
3. Default region name: Enter us-east-1
4. Default output format: Press Enter (leave blank for default JSON format)
Security Note: These credentials are now stored in ~/.aws/credentials. Be careful not to expose this file or commit it to version control.

Part 4: Deploying Infrastructure with Terraform

This is where the magic happens. Terraform will read your infrastructure code and create all the AWS resources automatically.

Step 1: Initialize Terraform

Initialization prepares your Terraform working directory:

bash
```
terraform init
```
What happens during terraform init?
- Downloads required provider plugins (AWS provider in this case)
- Initializes the backend for state storage
- Prepares modules and dependencies
You should see output ending with “Terraform has been successfully initialized!”

Step 2: Preview the Infrastructure Changes

Before creating any resources, let’s see what Terraform plans to do:

bash
```
terraform plan
```
This command performs a “dry run” that shows:
- Resources that will be created (marked with +)
- Resources that will be modified (marked with ~)
- Resources that will be destroyed (marked with -)
- Total count of changes
Pro Tip: Always review the terraform plan output carefully before applying changes. This prevents accidental resource deletions or modifications.

Step 3: Apply the Terraform Configuration

Now let’s create the actual infrastructure:

bash
```
terraform apply
```
Terraform will display the same plan output and prompt you for confirmation. Type yes to proceed.

What’s being created?
- VPC with CIDR blocks
- Public and private subnets across multiple availability zones
- Internet Gateway and NAT Gateways
- Route tables and associations
- Security groups with ingress/egress rules
- EC2 instances for application tier
- RDS database instance
- Application Load Balancer with target groups
- Auto-scaling configurations (if included)
This process typically takes 5-15 minutes depending on the complexity of your architecture.

Part 5: Verifying Your Deployed Resources

Let’s confirm everything was created correctly by checking the AWS Console.

Step 1: Verify VPC Creation
1. In the AWS Console, search for and navigate to VPC
2. You should see your newly created VPC with custom CIDR block
3. Check the subnets tab to verify public and private subnets were created
Step 2: Verify EC2 Instances
1. Navigate to the EC2 service
2. Click on Instances in the left sidebar
3. You should see your application instances in a “running” state
Step 3: Verify RDS Database
1. Search for and navigate to RDS
2. Check the Databases section
3. Your database instance should be visible with status “Available”
Step 4: Verify Load Balancer
1. Navigate back to EC2 service
2. Click on Load Balancers under the “Load Balancing” section in the left sidebar
3. Your Application Load Balancer should be listed with state “Active”
4. Note the DNS name—this is how you’ll access your application
What You’ve Accomplished

Congratulations! You’ve successfully:
- Created a dedicated IAM user with programmatic access
- Configured your local environment with AWS CLI
- Cloned and prepared a Terraform repository
- Deployed a complete multi-tier architecture including:
  - Network layer (VPC, subnets, routing)
  - Compute layer (EC2 instances)
  - Data layer (RDS database)
  - Load balancing and traffic management
Key Takeaways
1. Infrastructure as Code is Powerful: What would take hours to click through in the console took just one command with Terraform
2. Version Control Your Infrastructure: Your entire infrastructure is now defined in code, making it reproducible and version-controlled
3. Security Best Practices Matter: Using a dedicated IAM user instead of root credentials is crucial for production environments
4. Always Review Before Applying: The terraform plan command is your safety net—use it religiously
Next Steps

Consider enhancing this deployment with:
- Terraform State Management: Move state to S3 with DynamoDB locking
- CI/CD Integration: Automate deployments with GitHub Actions or GitLab CI
- Monitoring: Add CloudWatch dashboards and alarms
- Cost Optimization: Implement auto-scaling policies and right-size instances
- Security Hardening: Implement least-privilege IAM policies and enable VPC Flow Logs
Cleanup

Important: To avoid AWS charges, don’t forget to destroy your resources when you’re done:

bash
```
terraform destroy
```
Type yes when prompted. Terraform will remove all created resources in the correct order, respecting dependencies.

Full project code: GitHub Repository – Apex Multi-Tier Architecture

Found this helpful? Connect with me on LinkedIn and share your own Terraform deployment experiences!

Building a Machine Learning Threat Detection System with AWS Lambda and SageMaker

October 1, 2025

Traditional signature-based security systems struggle against sophisticated cyber threats. This guide demonstrates how to build a production-ready, machine learning-powered threat detection system using AWS Lambda for preprocessing and SageMaker for inference.

The Problem Statement

Modern cyber threats evolve rapidly, requiring intelligent systems that can identify attack patterns in real-time. Organizations need scalable, cost-effective solutions that can process network logs and detect multiple attack vectors including SQL injection, DoS attacks, port scanning, brute force attempts, and data exfiltration.

Architecture Overview

Our threat detection pipeline processes network logs through automated stages:

Network Logs → Lambda Preprocessing → Feature Extraction → 
SageMaker Batch Transform → Threat Classification → SNS Alerts

The system extracts 36 sophisticated features from raw network logs to identify six threat categories with 100% validation accuracy.

Technologies Used

AWS Lambda: Serverless preprocessing and feature extraction
Amazon SageMaker: ML model training and batch inference
Amazon S3: Data lake architecture
Amazon SNS: Real-time threat alerting
Python: XGBoost, Pandas, Scikit-learn
Machine Learning: 36 engineered features for threat detection

Phase 1: Development Environment Setup

Set up the Python environment with proper dependency management:

mkdir cyber-threat-detection-sagemaker
cd cyber-threat-detection-sagemaker

python -m venv .venv
.\.venv\Scripts\Activate.ps1

pip install --only-binary=all numpy pandas scikit-learn
pip install boto3 xgboost

Configure AWS CLI credentials:

aws configure
aws sts get-caller-identity

Phase 2: Advanced Feature Engineering

The Lambda preprocessing function transforms raw logs into 36 threat indicators across five categories:

Feature Categories

Network Behavior Analysis (12 features)

IP geolocation risk assessment
Internal vs external connection detection
Protocol anomaly detection
Traffic volume patterns

Temporal Pattern Analysis (4 features)

Business hours detection
Weekend activity monitoring
Time-based attack patterns

Port and Service Analysis (7 features)

High-risk port identification
Service categorization
Port usage patterns

Attack Pattern Detection (8 features)

SQL injection pattern matching
XSS attempt detection
DoS attack indicators
Suspicious user agent identification

Traffic Characteristics (5 features)

Bytes ratio analysis
Upload/download patterns
Connection frequency monitoring

Phase 3: Realistic Threat Data Generation

Created a dataset with 2,500 network log records across six threat categories:

Threat Type	Percentage	Count	Key Characteristics
Normal Traffic	60.6%	1,500	Regular browsing, DNS queries
SQL Injection	5.9%	150	Malicious payloads, suspicious agents
DoS/DDoS	7.9%	200	High frequency, small packets
Port Scanning	12.0%	300	Sequential ports, short duration
Brute Force	9.8%	250	Repeated login attempts
Data Exfiltration	3.7%	100	Large outbound transfers

Phase 4: Machine Learning Model Training

The XGBoost training pipeline implements enterprise-grade practices:

Class imbalance handling with automatic weight computation
Robust data validation for infinite/NaN values
Comprehensive per-class metrics and AUC scores
Feature importance analysis for interpretability
SageMaker-compatible interface

Model Performance

Training Accuracy: 99.96%
Validation Accuracy: 100%
Validation AUC: 1.0000
Per-Class Detection: Perfect precision/recall across all threat types

Top 5 Critical Features

is_external_connection (29.94%)
ip_geolocation_risk (12.38%)
is_well_known_port (10.62%)
potential_dos (9.69%)
protocol_numeric (9.49%)

Phase 5: AWS Production Deployment

The deployment script executes six automated steps to create the complete infrastructure:

Step 1: S3 Bucket Creation

Three buckets with account-specific naming (Account ID: xxxxxxxxxxxx):

cyber-threat-raw-data-xxxxxxxxxxxx – Network log ingestion
cyber-threat-processed-data-xxxxxxxxxxxx – Feature storage
cyber-threat-model-artifacts-xxxxxxxxxxxx – Model and results

Step 2: IAM Role Configuration

Creates CyberThreatDetectionSageMakerRole with:

Trust policies for SageMaker and Lambda services
S3 read/write permissions for all threat detection buckets
CloudWatch logging capabilities
SageMaker training and inference permissions

Step 3: Training Data Upload

Uploads datasets to S3:

Raw network logs to raw-data/ folder
Processed features to train/ folder

Step 4: Lambda Function Deployment

Deploys cyber-threat-detector-sagemaker:

Python 3.9 runtime with preprocessing code
300-second timeout, 256MB memory
Integrated with SageMaker for batch inference
Automatic updates for existing functions

Step 5: SageMaker Model Registration

Registers trained XGBoost model:

Model name: cyber-threat-detector-xxxxxxxxxx
Ready for batch transform jobs
Configured for real-time threat classification

Step 6: SNS Alert Configuration

Sets up cyber-threat-alerts topic:

Real-time threat notifications
Email/SMS subscription support
Integration with Lambda for automated alerting

Phase 6: Production Testing and Validation

Deployed Resources Summary

Component	Resource Name	Purpose
Lambda Function	`cyber-threat-detector-sagemaker`	Preprocessing & SageMaker integration
SageMaker Model	`cyber-threat-detector-xxxxxxxxxx`	XGBoost threat classification
S3 Raw Data	`cyber-threat-raw-data-xxxxxxxxxxxx`	Network log ingestion
S3 Processed	`cyber-threat-processed-data-xxxxxxxxxxxx`	Feature storage
S3 Artifacts	`cyber-threat-model-artifacts-xxxxxxxxxxxx`	Model & results storage
SNS Topic	`cyber-threat-alerts`	Threat notifications
IAM Role	`CyberThreatDetectionSageMakerRole`	Security & permissions

Real-Time Threat Detection Flow

The system processes network logs automatically:

Upload: CSV files uploaded to raw-data/ folder
Trigger: S3 event invokes Lambda function
Feature Extraction: 36 features extracted from each log
Batch Transform: SageMaker processes features
Classification: Model identifies threat patterns
Alerting: SNS sends notifications for detected threats

Testing the Pipeline

Upload test network logs to verify end-to-end processing:

aws s3 cp test_network_logs.csv s3://cyber-threat-raw-data-xxxxxxxxxxxx/

Subscribe to Threat Alerts

Configure email notifications:

aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:xxxxxxxxxxxx:cyber-threat-alerts \
  --protocol email \
  --notification-endpoint your-email@company.com

Production Pipeline in Action

Automatic Processing Workflow

When network logs are uploaded to S3, the system automatically:

Triggers Lambda function via S3 event
Extracts 36 features from raw logs
Submits batch transform job to SageMaker
Model classifies threats with confidence scores
SNS sends alerts for detected threats

Performance Metrics

Processing Speed: 2,500+ records in ~5 minutes
Feature Extraction: 36 features per network log entry
Detection Latency: Sub-minute threat identification
Cost Efficiency: Pay-per-use serverless architecture
Scalability: Auto-scales with traffic volume

Architecture Benefits

Serverless Advantages

Cost-effective pay-per-use model
Automatic scaling with traffic spikes
High availability with AWS-managed infrastructure
Sub-minute threat detection capabilities

Security Features

IP address hashing for privacy
S3 server-side encryption at rest
Least-privilege IAM access controls
CloudTrail audit logging enabled

System Capabilities

Processing: 2,500+ records in ~5 minutes
Accuracy: 100% validation performance across all threat types
Features: 36 sophisticated threat indicators extracted per log
Cost: ~$10-50 monthly depending on usage volume

Key Achievements

Complete AWS infrastructure deployed with Lambda and SageMaker integration
Advanced ML pipeline with 36 engineered features
Real-time processing via serverless Lambda functions
SageMaker batch transform for scalable inference
Perfect model accuracy with 100% validation performance
Production testing verified with SNS alerting
Cost-effective serverless architecture (~$10-50/month)

Future Enhancements

Real-time inference endpoints for immediate threat detection
CloudWatch dashboards for monitoring and metrics
Kinesis integration for streaming log processing
SIEM integration (Splunk, QRadar) for enterprise security
Automated response workflows for threat remediation
Multi-region deployment for high availability

Conclusion

This production-ready threat detection pipeline demonstrates enterprise-level security capabilities using AWS Lambda for preprocessing and SageMaker for inference. The system achieves perfect accuracy in detecting six attack types while maintaining cost-effective, scalable infrastructure.

The complete serverless architecture processes 2,500+ records in ~5 minutes with automatic alerting through SNS, providing organizations with real-time threat visibility at minimal operational cost.

Complete source code and deployment scripts: [GitHub Repository Link]

Docker and React: Building Modern Web Applications

September 27, 2025
Date: 2025-09-26

Tags: ‘Docker’, ‘React’, ‘Multi-stage builds’, ‘Containerization’, ‘Nginx’, ‘Production deployment’, ‘Live reloading’, ‘Docker volumes’, ‘Environment variables’, ‘Image optimization’]

Docker and React together provide a powerful solution for building and delivering modern web applications. Docker ensures consistent environments across development and production, while React creates fast, dynamic user interfaces—all packaged into lightweight, portable containers for reliable deployment anywhere.

The Problem Statement

Modern applications often face scalability challenges as they grow. Monolithic applications can experience significant performance issues when customer base and revenue increase rapidly, with key metrics approaching SLA limits.

The solution involves refactoring monolithic applications into microservices architecture. Docker has been identified as a critical component in this transformation, starting with containerizing React front-end applications.

📁 Complete Source Code

All the code, configurations, and examples from this project are available in the GitHub repository:

🔗 ApexConsulting React Docker Project

Clone the repository to follow along

Software Requirements
- Docker: 28.2.2
- Node.js: 24.8.0
- npm: 11.4.1
- React: 19.1.0
- Editor: Microsoft Visual Studio Code (or any preferred editor)
Creating the Sample Application

We’ll use Create React App CLI to scaffold our project with zero configuration:
```
npx create-react-app apexconsulting
```
This process installs all required packages and may take a minute or two to complete.

Terminal showing successful completion message

Project Structure Overview

Navigate to the project directory and explore the generated structure:
```
cd apexconsulting
ls -la
```
Open the project in Visual Studio Code to examine the structure:

Key Directories Explained
- node_modules: Contains all project dependencies
- public: Root HTML files and static assets
- src: Main JavaScript files and stylesheets
- package.json: Project configuration and build scripts
Browser running React default app at http://localhost:3000.

Phase 1 – Dockerizing a React Application

Containerizing a React application involves three essential steps: creating a Dockerfile, building the Docker image, and running the container.

Step 1: Creating the Dockerfile

The Dockerfile acts as a blueprint with each command representing a layer in the final image. Create a file named Dockerfile (without extension) in the root directory:
```
FROM Node 24.8.0

WORKDIR /app

COPY package.json ./  
RUN npm install

COPY . .

EXPOSE 3000

CMD ["npm", "start"]
```
Dockerfile Breakdown
- FROM node:22.16: Base image from Docker Hub
- WORKDIR /app: Sets working directory to prevent file conflicts
- COPY package.json: Copies dependency list first for better caching
- RUN npm install: Downloads and installs dependencies
- COPY . .: Copies all application files (source = current directory, destination = /app)
- EXPOSE 3000: Declares the port the container will listen on
- CMD [“npm”, “start”]: Command executed when container starts
Step 2: Building the Docker Image

Build the image using the docker build command:
```
docker build -t apexconsulting-dev .
```
The build process may take several minutes on first run as it downloads dependencies. The dot (.) specifies the Dockerfile location.

Check the created image:
```
docker image ls
```
Note that the image size is approximately 2.63 GB due to node_modules dependencies and the Node.js runtime.

Step 3: Running the Docker Container

Start the container with port mapping:

bash
```
docker run -p 3000:3000 apexconsulting-dev
```
Testing the Application

Access your containerized React app through:
- Localhost: http://localhost:3000
- Container IP: http://[CONTAINER_IP]:3000
The Development Challenge

When you make code changes and refresh the browser, updates won’t appear automatically. This happens because the container contains a static copy of your code from build time.

Phase 2 – Hot Reload with Docker Volumes

Setting Up a Proper React Development Environment with Docker

Creating a local development environment that behaves consistently across different machines can be challenging. In this guide, we’ll improve a basic Docker setup to create a fully functional React development environment with hot reloading capabilities.

The Problem

When running a React app inside a Docker container, you’ll often encounter an issue where file changes on your host system aren’t detected inside the container. This means your app won’t automatically reload when you make code changes, forcing you to manually rebuild and restart the container for every minor modification—a frustrating and time-consuming workflow.

The Solution: Two Essential Steps

To solve this problem, we need to implement two key improvements:

Step 1: Configure the CHOKIDAR_USEPOLLING Environment Variable

The first step involves adding an environment variable called CHOKIDAR_USEPOLLING and setting its value to true.

CHOKIDAR is a file-watching library that react-scripts uses under the hood to detect file changes and trigger hot reloading during development. However, when running inside a Docker container—especially on Linux hosts with mounted volumes—native file system events often don’t work correctly across volume boundaries.

By setting CHOKIDAR_USEPOLLING=true, we force the chokidar library to use polling instead of relying on native file system events. This makes it reliably detect file updates even in volume-mounted directories.

Fun fact: “Chokidar” means “watchman” in Hindi, which perfectly describes its function.

You can set this environment variable directly in your Dockerfile using the ENV command:version: “3.9”
```
ENV CHOKIDAR_USEPOLLING=true
```
Step 2: Mount Your Local Code Using Docker Volumes

The second step involves using Docker volumes to share your local code directory with the container. This is accomplished using the -v flag when running your container.

Here’s the command structure:
```
docker run -v $(pwd):/app [other-options] [image-name]
```
In this command:
- $(pwd) refers to your current directory on the host machine
- /app is the target directory inside the container
- The volume mount creates a shared space between your host and container
Testing the Setup

1.Now let’s test the hot reloading functionality. Open your code editor and navigate to app.js

2. Make a change to the file—perhaps modify some text or styling in your component
1. Save the file and switch back to your browser. Watch as the changes appear automatically without any manual refresh!
You should see your changes reflected instantly—no manual rebuilding required! This demonstrates that you now have a fully functional React development environment with hot reloading capabilities running inside Docker.

Important Consideration

While this setup is perfect for development, keep in mind that polling uses more CPU resources compared to event-based file watching. Therefore, this configuration should never be used in production environments. The polling approach is strictly for development convenience and should be disabled or removed when deploying your application.

This Docker-based development setup eliminates the friction of constant rebuilding while maintaining the consistency and isolation benefits of containerized development. Your development workflow will be significantly smoother, allowing you to focus on writing code rather than managing build processes.

Phase 3 – Optimizing React Docker Images with Multi-Stage Builds

We previously containerized a React application using a simple Dockerfile. However, if you’ve been following along, you might have noticed that our Docker image is quite large—around 2.63 GB.

In this section, we’ll explore multi-stage builds and learn how to dramatically reduce our image size while improving security and deployment performance.

Understanding Multi-Stage Builds: The Apartment Building Analogy

Think of building a Docker image like constructing an apartment building. During construction, you need cranes, scaffolding, tools, and construction workers. But once the apartment is complete, all these construction tools are removed, and residents only receive the final furnished apartment with essential features. It wouldn’t make sense to deliver the apartment with all the construction equipment still inside.

Multi-stage Dockerfiles work exactly the same way. They allow us to use all necessary build tools and dependencies during the construction phase, then create a final image that contains only what’s needed to run the application.

Why Our Current Image Is So Large

Let’s examine why our simple Dockerfile produces such a hefty 2.64 GB image:

Root Cause Analysis
1. Full Node.js Base Image: We’re using the complete Node.js 24.8.0 image, which alone is approximately 1 GB when uncompressed.
2. Unnecessary Dependencies: Our build includes all development dependencies, temporary files, log files, and the entire node_modules directory—much of which isn’t needed in production.
3. Missing Optimization: We’re not separating build-time dependencies from runtime requirements.
Alpine Images: A Lighter Alternative

Before diving into multi-stage builds, it’s worth noting that Node.js offers Alpine-based images that are significantly smaller—typically under 100 MB compared to the full 1 GB versions.

While Alpine images are a good improvement, multi-stage builds offer an even better solution by completely eliminating the Node.js runtime from our final image.

The Multi-Stage Build Advantage

Multi-stage builds provide several compelling benefits:

1. Dramatically Smaller Image Sizes

By using one stage to build the application and a minimal second stage to serve it, we eliminate:
- Development tools and build dependencies
- The Node.js runtime itself
- Temporary build files
- Development-only npm packages
2. Enhanced Security

Smaller images mean a reduced attack surface. The final image doesn’t include build tools, development dependencies, or unnecessary system components that could potentially be exploited.

3. Cleaner Separation of Concerns

The build process is clearly separated from the runtime environment, making the Dockerfile easier to understand and maintain.

4. Faster Deployments

Smaller images result in:
- Faster push/pull operations from Docker registries
- Reduced bandwidth usage
- Quicker container startup times
- More efficient CI/CD pipelines
The Two-Stage Strategy

Our multi-stage approach will consist of:

Stage 1: Build Stage
- Use a full Node.js image with all build tools
- Install dependencies (including dev dependencies)
- Build the React application
- Generate optimized production assets
Stage 2: Serve Stage
- Use a lightweight web server (like Nginx)
- Copy only the built assets from Stage 1
- Configure the web server to serve the React app
- Result: A minimal, production-ready image
Best Practices for Dependency Management

When implementing multi-stage builds, consider these dependency management strategies:
- Separate Development and Production Dependencies: Use npm ci --only=production or maintain separate package files
- Use .dockerignore: Exclude unnecessary files like node_modules, log files, and temporary directories from being copied into the build context
- Layer Optimization: Structure your Dockerfile to maximize Docker’s layer caching benefits
Phase 4 – Implementing Multi-Stage Dockerfiles: From Theory to Practice

Now that we understand the benefits of multi-stage builds, let’s put theory into practice by creating a production-ready multi-stage Dockerfile. This implementation will dramatically reduce our image size while maintaining all the functionality we need.

Setting Up the Multi-Stage Dockerfile

Our new Dockerfile will consist of two distinct stages:
1. Build Stage: Downloads dependencies and builds the React application
2. Serve Stage: Uses a lightweight web server to serve only the built assets
Let’s start by backing up our previous Dockerfile and creating a new multi-stage version.

Stage 1: The Build Phase

The first stage handles all the heavy lifting required to build our React application:

Build stage
```
FROM node:24.8.0

WORKDIR /app

COPY package.json ./  
RUN npm install

COPY . .  
ENV CHOKIDAR_USEPOLLING=true

RUN npm run build
```
Here’s what each step accomplishes:
- FROM node: 22.4.80 AS builder: Pulls the full Node.js image and tags it as “builder” for later reference
- WORKDIR /app: Sets the working directory inside the container
- COPY package.json: Copies the dependency list to leverage Docker’s layer caching
- RUN npm install: Installs all project dependencies
- COPY . .: Copies all source code (src and public folders) into the container
- ENV CHOKIDAR_USEPOLLING=true: Enables live reloading (remember: not for production!)
- RUN npm run build: Generates optimized static files in the build directory
Stage 2: The Serve Stage

The second stage creates our lean production image:
```
FROM nginx:alpine

COPY --from=builder /app/build /usr/share/nginx/html

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]
```
This is where the magic happens:
- FROM nginx:alpine: Uses a minimal Alpine Linux-based Nginx image
- COPY –from=builder: Copies only the built assets from the builder stage
- EXPOSE 80: Declares that the container listens on port 80
- CMD: Starts Nginx in the foreground
The final image contains only static assets and Nginx—no Node.js, no source code, no development dependencies.

Building and Running the Multi-Stage Image

Let’s build our new multi-stage Dockerfile

docker build -t apexconsulting-multi .

Now let’s run our new container:

docker run -p 80:80 apexconsulting-multi

The Nginx server is now up and running, serving our React application on port 80.

Deep Dive: Understanding Image Layers

To truly understand how our multi-stage build works, let’s examine the image layer by layer using Docker’s history command:

docker image history apexconsulting-multi

You’ll notice one command that generates approximately 42.4MB, but the output is truncated. To see the full details, let’s enhance the command:

docker image history –format “table {{.CreatedBy}}\t{{.Size}}” –no-trunc apex-multi

Now we can see the complete multi-line command that generates the 40MB layer. This command downloads the Nginx package using curl, extracts it with tar xzvf, and installs Nginx along with its dynamic modules inside the Alpine Linux environment.

This detailed view is extremely valuable for:
- Understanding: See exactly what contributes to your image size
- Auditing: Verify what’s included in your production image
- Optimizing: Identify opportunities to reduce image size further
The Results

Our multi-stage approach has delivered exactly what we promised:
- Smaller image size: Dramatically reduced from our original 2.64GB
- Enhanced security: No development tools or source code in production
- Production-ready: Served by a robust, lightweight web server
- Clean separation: Build concerns separated from runtime concerns
This structure ensures you get a lean, secure, and production-ready Docker image that’s optimized for deployment and scalability.

Phase 5 -Making Your React App Production-Ready

Production-Ready Requirements

A production-ready React application needs:
- Performance: Optimized builds, minified assets, no dev tools or source code in production images
- Reliability: Proper error handling and graceful failure recovery
- Configuration: Environment variables for API keys and endpoints, no hard-coded values
Multi-stage Dockerfiles handle performance optimization, but we need to address reliability and configuration management.

Web Server Selection Criteria

When choosing a web server for static React apps, consider:
- Scalability: Handle traffic loads and growth (Nginx excels here)
- Flexibility: Custom configuration support for complex routing and caching
- Security: Proper HTTP headers, SSL/HTTPS support, secure file serving
- Docker Integration: Small image sizes, multi-stage build support
The SPA Routing Problem

Our current Nginx setup has an issue: navigating to routes like /about shows Nginx’s 404 page instead of letting React Router handle the routing. This happens because Nginx looks for physical files and serves its own 404 when they don’t exist.

Creating the Nginx Configuration

The nginx.conf file acts as the brain of your Nginx server, controlling how it handles requests, serves files, and manages routing for your React application.

Create a new nginx.conf file in your project root:
```
server {
    listen 80;
    server_name localhost;
    root /usr/share/nginx/html;
    index index.html;
    
    location / {
        try_files $uri $uri/ /index.html;
    }
}
```
Key Configuration Explained
- listen 80: Routes incoming HTTP traffic through port 80
- server_name localhost: Defines the domain (use your production domain in real deployments)
- root: Points to where Docker copied the React build output
- try_files $uri $uri/ /index.html: The critical line that enables SPA routing
The try_files directive works by:
1. Looking for a file matching the requested path ($uri)
2. If not found, serving index.html instead
3. This allows React Router to handle client-side routing
Updating the Dockerfile

Modify your multi-stage Dockerfile to include the custom Nginx configuration:
```
# Serve stage
FROM nginx:alpine

# Remove default config and copy custom config
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d/

COPY --from=builder /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```
Building and Testing

Build the updated image:
```
docker build -t globo-multi .
```
Run the container:
```
docker run -p 80:80 globo-multi
```
Test the routing fix by navigating to localhost/about:

Environment Variables

Add environment variables to your Dockerfile:
```
ENV REACT_APP_ENVIRONMENT=development
```
Note: Use the REACT_APP_ prefix for Create React App compatibility.

Access the variable in your React code:
```
const env = process.env.REACT_APP_ENVIRONMENT;
```
After rebuilding and running, the environment value displays in the browser:

Container Management Best Practices

Using Named Containers

Instead of managing container IDs:
```
docker run -d --name react-app -p 80:80 globo-multi
```
Detached Mode

Use the -d flag to run containers in the background:

Easy Container Control

Stop containers by name:
```
docker stop react-app
```
This approach provides cleaner container management and reduces errors from copying container IDs.