• The “Aha!” Moment

    Picture this: You’re a consultant who’s seen legal teams spend 2-4 hours manually reviewing every contract, missing critical clauses, and burning through budget faster than you can say “unlimited liability.”

    So naturally, I thought: “Hey, AWS Bedrock just released Claude Sonnet 4.5… what if we could automate this?”

    Narrator: It was at this moment he knew…


    The Plan (It Was So Simple in My Head)

    The vision was straightforward:

    1. User uploads contract
    2. AI reads it
    3. Magic happens
    4. User gets risk analysis

    How hard could it be?

    (Spoiler: It was definitely harder than expected, but we got there.)


    Day 1: The AWS Setup Saga

    What I thought would happen:

    • Quick AWS account setup
    • Enable Bedrock
    • Done in 30 minutes

    What actually happened:

    • “Wait, there are different Claude models?”
    • “Why do I need inference profiles?”
    • “Model access requests… approved instantly!” (Okay, that part was smooth)

    Key learning: AWS has gotten WAY better about Bedrock access. No more waiting days for model approval. They just… let you in now. Revolutionary.


    Day 2: Lambda Functions (Where Things Got Real)

    I started with the classic developer move: “I’ll just write three simple Lambda functions.”

    Lambda 1 (Upload URL Generator): Straightforward. Pre-signed S3 URLs. Easy.

    Lambda 2 (The AI Brain):

    • “Just call Bedrock, get JSON, store in DynamoDB”
    • Attempt 1: Claude returned markdown code blocks wrapping my JSON
    • Attempt 2: Added markdown stripping
    • Attempt 3-7: CORS errors, IAM permission issues, wrong region calls
    • Attempt 8: IT WORKS!

    There were definitely some struggles I went through here. Like when Lambda kept trying to call us-east-2 instead of us-east-1. Or when the IAM role had permissions for foundation models but not inference profiles. Good times.

    Lambda 3 (Results Fetcher): Should be simple, right? Wrong. DynamoDB returns Decimal objects. Python’s json.dumps() hates Decimal objects. Added custom serializer. Crisis averted.

    Lesson learned: Always test with real data. Mock data never has Decimal objects.


    Day 3: The CORS Adventure

    Ah yes, CORS. The bane of every web developer’s existence.

    Error log excerpt:

    Access to XMLHttpRequest blocked by CORS policy
    Access to XMLHttpRequest blocked by CORS policy  
    Access to XMLHttpRequest blocked by CORS policy
    

    The journey:

    1. Enabled CORS on API Gateway → Still blocked
    2. Realized S3 ALSO needs CORS → Still blocked
    3. Discovered the CORS config didn’t actually apply to the bucket
    4. Manually entered it in AWS Console → Finally worked

    Pro tip: When AWS CLI says something succeeded but it didn’t… use the console. Sometimes you just need to see it with your own eyes.


    Day 4: The Frontend That Almost Defeated Me

    I’m primarily a backend guy, so React was… an experience.

    Initial design: Basic upload box, white background, Times New Roman vibes.

    What the client wanted: “Green Bay Packers theme, engaging, informational, APEX Consulting branding, professional landing page.”

    What I learned about Tailwind:

    • It’s amazing when it works
    • It’s confusing when you forget the PostCSS config
    • @tailwind base; goes at the TOP of index.css (don’t ask how long this took me to figure out)

    The logo incident:

    • Saved logo as apex-logo.png
    • Logo doesn’t show
    • Checked path: correct
    • Checked public folder: file is there
    • Tried different browsers: nothing
    • Realized Windows named it apex-logo.png.png
    • Face, meet palm

    Lesson learned: Always show file extensions in Windows Explorer.


    The Architecture

    After all the struggles, here’s what I ended up with:

    Frontend (React + Tailwind):

    • Drag-and-drop upload with progress tracking
    • Real-time polling for analysis status
    • Beautiful green/gold themed landing page
    • Professional results dashboard

    Backend (AWS Serverless):

    • Lambda 1: Generates pre-signed S3 URLs (secure uploads)
    • Lambda 2: Calls Bedrock, analyzes contract, stores results
    • Lambda 3: Retrieves analysis from DynamoDB
    • API Gateway: Ties it all together
    • S3: Contract storage
    • DynamoDB: Analysis results

    AI Engine:

    • Claude Sonnet 4.5 via Bedrock
    • Custom playbook with 10 compliance rules
    • Returns structured JSON with risk scores

    It analyzes a 10-page contract in 60 seconds. Manual review would take 2-4 hours.


    The Numbers (AKA Why This Matters)

    Time savings: 85% (2 hours → 60 seconds)
    Cost savings: 98% ($600 → $5 per contract)
    Accuracy improvement: 95%+ (Claude doesn’t get tired or miss things)

    For a company processing 100 contracts/month:

    • Before: $60,000/month
    • After: $500/month
    • Annual savings: $714,000

    Yeah, I did the math. Twice.


    What I’d Do Differently

    If I could go back and tell myself one thing:

    1. Set up CORS first. Like, day one. Before anything else. Your future self will thank you.
    2. Use the AWS Console for debugging. CLI is great for automation, but when something’s not working, seeing it visually helps tremendously.
    3. Read the Bedrock docs about inference profiles. That us. prefix in the model ID? Kind of important.
    4. Test Lambda functions locally first. Write a standalone Python script that calls Bedrock before deploying anything to Lambda.
    5. Don’t trust Git Bash on Windows for JSON payloads.

    Key Technical Wins

    Despite the struggles, there were some genuinely cool technical achievements:

    1. Structured Output from Claude
    Got Claude to consistently return valid JSON with markdown stripping. No small feat with LLMs.

    2. Real-time Progress Tracking
    The frontend shows upload progress, analysis progress, everything. Users aren’t just staring at a loading spinner.

    3. Security Done Right

    • Pre-signed S3 URLs (5-minute expiration)
    • IAM least-privilege permissions
    • All data stays in customer’s AWS account
    • No third-party data sharing

    4. Cost Optimization
    $5 per 100 contracts. That’s not a typo. Serverless + pay-per-use pricing = magical economics.

    5. Production-Ready Architecture
    This isn’t a proof-of-concept. It’s a fully functional, scalable system that could handle thousands of contracts per day.


    The Final Product

    After a week of coding, debugging, Googling error messages, and questioning my life choices, I ended up with:

    • A professional SaaS landing page with my branding
    • Full contract upload and analysis workflow
    • AI-powered risk detection using Claude Sonnet 4.5
    • Detailed results with risk scores and recommendations
    • Complete AWS serverless backend
    • Total cost: Less than dinner for two

    And it actually works.


    What I Learned

    Technical:

    • Bedrock inference profiles require special IAM permissions
    • CORS needs to be enabled EVERYWHERE (API Gateway, S3, Lambda responses)
    • DynamoDB uses Decimal objects (Python hates them)
    • Tailwind CSS is powerful but configuration-sensitive
    • Windows hides file extensions by default (why though?)

    Business:

    • AI can deliver 98% cost savings on specific workflows
    • Privacy-first architecture is a legitimate competitive advantage
    • Clear ROI calculations matter (saved $714K/year > “AI is cool”)
    • Target market identification is crucial (mid-market companies, not enterprises)

    Personal:

    • I can build full-stack apps (even if the frontend makes me nervous)
    • Reading error messages carefully saves hours of debugging
    • Coffee is essential
    • Sometimes you just need to walk away and come back fresh

    Try It Yourself

    The full code is on GitHub: https://github.com/Faaeznaf/ContractGuard

    Stack:

    • Frontend: React, Tailwind CSS
    • Backend: AWS Lambda (Python 3.11), API Gateway, S3, DynamoDB
    • AI: AWS Bedrock (Claude Sonnet 4.5)

    Cost to run: $5/month for 100 contracts

    Deployment time: 60-90 minutes (if you avoid my mistakes)


    Final Thoughts

    Building ContractGuard taught me that:

    1. Modern AI tools like Claude Sonnet 4.5 are genuinely transformative
    2. AWS serverless architecture is powerful (once you figure out CORS)
    3. Good UX matters (even for B2B tools)
    4. Reading the docs first would have saved me 6 hours
    5. There’s nothing quite like seeing your code analyze a real contract and catch issues a human would miss

    Would I do it again?

    Absolutely. Though maybe I’d enable CORS on day one.


    Connect With Me

    Want to chat about AI, AWS, or why CORS is the worst?


    P.S. If you’re building something similar and get stuck on CORS, DM me. I’ve suffered enough for both of us.

  • Introduction

    Infrastructure as Code has revolutionized how we deploy cloud resources. Instead of clicking through AWS console menus for hours, you can define your entire infrastructure in code and deploy it in minutes. In this hands-on tutorial, I’ll walk you through deploying a complete multi-tier architecture on AWS using Terraform—covering everything from IAM user creation to verifying your deployed resources.

    What You’ll Build:

    • Virtual Private Cloud (VPC) with public and private subnets
    • EC2 instances for application servers
    • RDS database for data persistence
    • Application Load Balancer for traffic distribution
    • Complete multi-tier architecture ready for production workloads

    Prerequisites:

    • Basic understanding of AWS services
    • Familiarity with command-line operations
    • A virtual machine or local Linux environment
    • AWS account access

    Part 1: Setting Up IAM User with Terraform Permissions

    Before Terraform can create resources on your behalf, it needs proper AWS credentials. We’ll create a dedicated IAM user with administrative access.

    Step 1: Connect to Your Virtual Machine

    First, SSH into your virtual machine using the provided credentials:

    bash

    ssh cloud_user@your_vm_ip_address
    

    Replace your_vm_ip_address with your actual VM’s public IP address.

    Step 2: Access the AWS Management Console

    Open your web browser in an incognito/private window (this prevents credential conflicts if you have multiple AWS accounts) and navigate to the AWS Management Console. Log in with your credentials.

    Pro Tip: Using incognito mode is a best practice when working with multiple AWS accounts or lab environments.

    Step 3: Navigate to IAM

    In the AWS Console search bar, type “IAM” and select Identity and Access Management from the results.

    Step 4: Create the Terraform User

    1. Click on Users in the left sidebar
    2. Click the Create User button
    3. Enter terraform-user as the username
    4. Click Next

    Step 5: Assign Administrative Permissions

    On the Set permissions page:

    1. Select Attach policies directly
    2. Search for AdministratorAccess
    3. Check the box next to the AdministratorAccess policy
    4. Click Next

    Why AdministratorAccess? For this tutorial, we’re granting full permissions to simplify the deployment. In production environments, you should follow the principle of least privilege and grant only the specific permissions Terraform needs.

    Step 6: Complete User Creation

    Review your selections and click Create user. You should see a success message confirming the user was created.

    Step 7: Generate Access Keys

    This is the critical step—creating the credentials Terraform will use:

    1. Click on your newly created terraform-user
    2. Navigate to the Security credentials tab
    3. Scroll down to Access keys and click Create access key
    4. Select Command Line Interface (CLI) as the use case
    5. Check the confirmation box and click Next
    6. Add a description tag (optional) and click Create access key

    IMPORTANT: You’ll see your Access Key ID and Secret Access Key. This is the ONLY time you’ll see the secret key!

    • Copy both values and store them securely
    • Never commit these credentials to Git
    • Consider using a password manager

    Part 2: Installing Git and Cloning the Terraform Repository

    Now that we have AWS credentials, let’s get the Terraform code onto our virtual machine.

    Step 1: Update Your System

    Back in your VM terminal, update the package index to ensure you have access to the latest packages:

    bash

    sudo yum update -y
    

    The -y flag automatically answers “yes” to all prompts, streamlining the installation process.

    Step 2: Install Git

    Install Git to enable repository cloning:

    bash

    sudo yum install git -y
    

    Step 3: Verify Git Installation

    Confirm Git is installed correctly:

    bash

    git --version
    

    You should see output similar to git version 2.x.x.

    Step 4: Clone the Terraform Repository

    Clone the repository containing the multi-tier architecture code:

    bash

    git clone https://github.com/Faaeznaf/apex-multi-tier-architechture.git
    

    Step 5: Navigate to the Project Directory

    Change into the cloned directory:

    bash

    cd multi-tier-terraform
    

    You’re now in the project root where all the Terraform configuration files reside.


    Part 3: Installing Dependencies and Configuring AWS CLI

    The repository includes an installation script that handles all the necessary dependencies. We’ll also need to configure the AWS CLI with our credentials.

    Step 1: Make the Installation Script Executable

    Before running any script, you need to grant it execute permissions:

    bash

    chmod +x install.sh
    

    What does chmod +x do? It adds execute (+x) permissions to the file, allowing you to run it as a program.

    Step 2: Run the Installation Script

    Execute the script to install Terraform and other required tools:

    bash

    ./install.sh
    

    The script will automatically download and install:

    • Terraform CLI
    • AWS CLI (if not already installed)
    • Any other project dependencies

    Step 3: Install Git LFS (Large File Storage)

    The repository uses Git LFS to manage large binary files efficiently. Install and configure it:

    bash

    sudo yum install epel-release -y
    sudo yum install git-lfs -y
    git lfs install
    git lfs pull
    

    Why Git LFS? Large files like Terraform provider binaries or module packages are stored separately from the main repository to keep clone times fast. Git LFS downloads these files only when needed.

    Step 4: Configure AWS CLI

    Now we’ll configure the AWS CLI with the credentials we created earlier:

    bash

    aws configure
    

    You’ll be prompted for four values:

    1. AWS Access Key ID: Paste the Access Key ID you saved earlier
    2. AWS Secret Access Key: Paste the Secret Access Key you saved earlier
    3. Default region name: Enter us-east-1
    4. Default output format: Press Enter (leave blank for default JSON format)

    Security Note: These credentials are now stored in ~/.aws/credentials. Be careful not to expose this file or commit it to version control.


    Part 4: Deploying Infrastructure with Terraform

    This is where the magic happens. Terraform will read your infrastructure code and create all the AWS resources automatically.

    Step 1: Initialize Terraform

    Initialization prepares your Terraform working directory:

    bash

    terraform init
    

    What happens during terraform init?

    • Downloads required provider plugins (AWS provider in this case)
    • Initializes the backend for state storage
    • Prepares modules and dependencies

    You should see output ending with “Terraform has been successfully initialized!”

    Step 2: Preview the Infrastructure Changes

    Before creating any resources, let’s see what Terraform plans to do:

    bash

    terraform plan
    

    This command performs a “dry run” that shows:

    • Resources that will be created (marked with +)
    • Resources that will be modified (marked with ~)
    • Resources that will be destroyed (marked with -)
    • Total count of changes

    Pro Tip: Always review the terraform plan output carefully before applying changes. This prevents accidental resource deletions or modifications.

    Step 3: Apply the Terraform Configuration

    Now let’s create the actual infrastructure:

    bash

    terraform apply
    

    Terraform will display the same plan output and prompt you for confirmation. Type yes to proceed.

    What’s being created?

    • VPC with CIDR blocks
    • Public and private subnets across multiple availability zones
    • Internet Gateway and NAT Gateways
    • Route tables and associations
    • Security groups with ingress/egress rules
    • EC2 instances for application tier
    • RDS database instance
    • Application Load Balancer with target groups
    • Auto-scaling configurations (if included)

    This process typically takes 5-15 minutes depending on the complexity of your architecture.


    Part 5: Verifying Your Deployed Resources

    Let’s confirm everything was created correctly by checking the AWS Console.

    Step 1: Verify VPC Creation

    1. In the AWS Console, search for and navigate to VPC
    2. You should see your newly created VPC with custom CIDR block
    3. Check the subnets tab to verify public and private subnets were created

    Step 2: Verify EC2 Instances

    1. Navigate to the EC2 service
    2. Click on Instances in the left sidebar
    3. You should see your application instances in a “running” state

    Step 3: Verify RDS Database

    1. Search for and navigate to RDS
    2. Check the Databases section
    3. Your database instance should be visible with status “Available”

    Step 4: Verify Load Balancer

    1. Navigate back to EC2 service
    2. Click on Load Balancers under the “Load Balancing” section in the left sidebar
    3. Your Application Load Balancer should be listed with state “Active”
    4. Note the DNS name—this is how you’ll access your application


    What You’ve Accomplished

    Congratulations! You’ve successfully:

    • Created a dedicated IAM user with programmatic access
    • Configured your local environment with AWS CLI
    • Cloned and prepared a Terraform repository
    • Deployed a complete multi-tier architecture including:
      • Network layer (VPC, subnets, routing)
      • Compute layer (EC2 instances)
      • Data layer (RDS database)
      • Load balancing and traffic management
    Key Takeaways
    1. Infrastructure as Code is Powerful: What would take hours to click through in the console took just one command with Terraform
    2. Version Control Your Infrastructure: Your entire infrastructure is now defined in code, making it reproducible and version-controlled
    3. Security Best Practices Matter: Using a dedicated IAM user instead of root credentials is crucial for production environments
    4. Always Review Before Applying: The terraform plan command is your safety net—use it religiously
    Next Steps

    Consider enhancing this deployment with:

    • Terraform State Management: Move state to S3 with DynamoDB locking
    • CI/CD Integration: Automate deployments with GitHub Actions or GitLab CI
    • Monitoring: Add CloudWatch dashboards and alarms
    • Cost Optimization: Implement auto-scaling policies and right-size instances
    • Security Hardening: Implement least-privilege IAM policies and enable VPC Flow Logs
    Cleanup

    Important: To avoid AWS charges, don’t forget to destroy your resources when you’re done:

    bash

    terraform destroy
    

    Type yes when prompted. Terraform will remove all created resources in the correct order, respecting dependencies.

    Full project code: GitHub Repository – Apex Multi-Tier Architecture

    Found this helpful? Connect with me on LinkedIn and share your own Terraform deployment experiences!

  • Traditional signature-based security systems struggle against sophisticated cyber threats. This guide demonstrates how to build a production-ready, machine learning-powered threat detection system using AWS Lambda for preprocessing and SageMaker for inference.

    The Problem Statement

    Modern cyber threats evolve rapidly, requiring intelligent systems that can identify attack patterns in real-time. Organizations need scalable, cost-effective solutions that can process network logs and detect multiple attack vectors including SQL injection, DoS attacks, port scanning, brute force attempts, and data exfiltration.

    Architecture Overview

    Our threat detection pipeline processes network logs through automated stages:

    Network Logs → Lambda Preprocessing → Feature Extraction → 
    SageMaker Batch Transform → Threat Classification → SNS Alerts
    

    The system extracts 36 sophisticated features from raw network logs to identify six threat categories with 100% validation accuracy.

    Technologies Used

    • AWS Lambda: Serverless preprocessing and feature extraction
    • Amazon SageMaker: ML model training and batch inference
    • Amazon S3: Data lake architecture
    • Amazon SNS: Real-time threat alerting
    • Python: XGBoost, Pandas, Scikit-learn
    • Machine Learning: 36 engineered features for threat detection
    Phase 1: Development Environment Setup

    Set up the Python environment with proper dependency management:

    mkdir cyber-threat-detection-sagemaker
    cd cyber-threat-detection-sagemaker
    
    python -m venv .venv
    .\.venv\Scripts\Activate.ps1
    
    pip install --only-binary=all numpy pandas scikit-learn
    pip install boto3 xgboost
    

    Configure AWS CLI credentials:

    aws configure
    aws sts get-caller-identity
    
    Phase 2: Advanced Feature Engineering

    The Lambda preprocessing function transforms raw logs into 36 threat indicators across five categories:

    Feature Categories

    1. Network Behavior Analysis (12 features)
    • IP geolocation risk assessment
    • Internal vs external connection detection
    • Protocol anomaly detection
    • Traffic volume patterns
    1. Temporal Pattern Analysis (4 features)
    • Business hours detection
    • Weekend activity monitoring
    • Time-based attack patterns
    1. Port and Service Analysis (7 features)
    • High-risk port identification
    • Service categorization
    • Port usage patterns
    1. Attack Pattern Detection (8 features)
    • SQL injection pattern matching
    • XSS attempt detection
    • DoS attack indicators
    • Suspicious user agent identification
    1. Traffic Characteristics (5 features)
    • Bytes ratio analysis
    • Upload/download patterns
    • Connection frequency monitoring
    Phase 3: Realistic Threat Data Generation

    Created a dataset with 2,500 network log records across six threat categories:

    Threat TypePercentageCountKey Characteristics
    Normal Traffic60.6%1,500Regular browsing, DNS queries
    SQL Injection5.9%150Malicious payloads, suspicious agents
    DoS/DDoS7.9%200High frequency, small packets
    Port Scanning12.0%300Sequential ports, short duration
    Brute Force9.8%250Repeated login attempts
    Data Exfiltration3.7%100Large outbound transfers
    Phase 4: Machine Learning Model Training

    The XGBoost training pipeline implements enterprise-grade practices:

    • Class imbalance handling with automatic weight computation
    • Robust data validation for infinite/NaN values
    • Comprehensive per-class metrics and AUC scores
    • Feature importance analysis for interpretability
    • SageMaker-compatible interface

    Model Performance

    • Training Accuracy: 99.96%
    • Validation Accuracy: 100%
    • Validation AUC: 1.0000
    • Per-Class Detection: Perfect precision/recall across all threat types

    Top 5 Critical Features

    1. is_external_connection (29.94%)
    2. ip_geolocation_risk (12.38%)
    3. is_well_known_port (10.62%)
    4. potential_dos (9.69%)
    5. protocol_numeric (9.49%)
    Phase 5: AWS Production Deployment

    The deployment script executes six automated steps to create the complete infrastructure:

    Step 1: S3 Bucket Creation

    Three buckets with account-specific naming (Account ID: xxxxxxxxxxxx):

    • cyber-threat-raw-data-xxxxxxxxxxxx – Network log ingestion
    • cyber-threat-processed-data-xxxxxxxxxxxx – Feature storage
    • cyber-threat-model-artifacts-xxxxxxxxxxxx – Model and results

    Step 2: IAM Role Configuration

    Creates CyberThreatDetectionSageMakerRole with:

    • Trust policies for SageMaker and Lambda services
    • S3 read/write permissions for all threat detection buckets
    • CloudWatch logging capabilities
    • SageMaker training and inference permissions

    Step 3: Training Data Upload

    Uploads datasets to S3:

    • Raw network logs to raw-data/ folder
    • Processed features to train/ folder

    Step 4: Lambda Function Deployment

    Deploys cyber-threat-detector-sagemaker:

    • Python 3.9 runtime with preprocessing code
    • 300-second timeout, 256MB memory
    • Integrated with SageMaker for batch inference
    • Automatic updates for existing functions

    Step 5: SageMaker Model Registration

    Registers trained XGBoost model:

    • Model name: cyber-threat-detector-xxxxxxxxxx
    • Ready for batch transform jobs
    • Configured for real-time threat classification

    Step 6: SNS Alert Configuration

    Sets up cyber-threat-alerts topic:

    • Real-time threat notifications
    • Email/SMS subscription support
    • Integration with Lambda for automated alerting
    Phase 6: Production Testing and Validation

    Deployed Resources Summary

    ComponentResource NamePurpose
    Lambda Functioncyber-threat-detector-sagemakerPreprocessing & SageMaker integration
    SageMaker Modelcyber-threat-detector-xxxxxxxxxxXGBoost threat classification
    S3 Raw Datacyber-threat-raw-data-xxxxxxxxxxxxNetwork log ingestion
    S3 Processedcyber-threat-processed-data-xxxxxxxxxxxxFeature storage
    S3 Artifactscyber-threat-model-artifacts-xxxxxxxxxxxxModel & results storage
    SNS Topiccyber-threat-alertsThreat notifications
    IAM RoleCyberThreatDetectionSageMakerRoleSecurity & permissions

    Real-Time Threat Detection Flow

    The system processes network logs automatically:

    1. Upload: CSV files uploaded to raw-data/ folder
    2. Trigger: S3 event invokes Lambda function
    3. Feature Extraction: 36 features extracted from each log
    4. Batch Transform: SageMaker processes features
    5. Classification: Model identifies threat patterns
    6. Alerting: SNS sends notifications for detected threats

    Testing the Pipeline

    Upload test network logs to verify end-to-end processing:

    aws s3 cp test_network_logs.csv s3://cyber-threat-raw-data-xxxxxxxxxxxx/
    
    Subscribe to Threat Alerts

    Configure email notifications:

    aws sns subscribe \
      --topic-arn arn:aws:sns:us-east-1:xxxxxxxxxxxx:cyber-threat-alerts \
      --protocol email \
      --notification-endpoint your-email@company.com
    
    Production Pipeline in Action

    Automatic Processing Workflow

    When network logs are uploaded to S3, the system automatically:

    1. Triggers Lambda function via S3 event
    2. Extracts 36 features from raw logs
    3. Submits batch transform job to SageMaker
    4. Model classifies threats with confidence scores
    5. SNS sends alerts for detected threats

    Performance Metrics

    • Processing Speed: 2,500+ records in ~5 minutes
    • Feature Extraction: 36 features per network log entry
    • Detection Latency: Sub-minute threat identification
    • Cost Efficiency: Pay-per-use serverless architecture
    • Scalability: Auto-scales with traffic volume
    Architecture Benefits

    Serverless Advantages

    • Cost-effective pay-per-use model
    • Automatic scaling with traffic spikes
    • High availability with AWS-managed infrastructure
    • Sub-minute threat detection capabilities

    Security Features

    • IP address hashing for privacy
    • S3 server-side encryption at rest
    • Least-privilege IAM access controls
    • CloudTrail audit logging enabled

    System Capabilities

    • Processing: 2,500+ records in ~5 minutes
    • Accuracy: 100% validation performance across all threat types
    • Features: 36 sophisticated threat indicators extracted per log
    • Cost: ~$10-50 monthly depending on usage volume
    Key Achievements
    • Complete AWS infrastructure deployed with Lambda and SageMaker integration
    • Advanced ML pipeline with 36 engineered features
    • Real-time processing via serverless Lambda functions
    • SageMaker batch transform for scalable inference
    • Perfect model accuracy with 100% validation performance
    • Production testing verified with SNS alerting
    • Cost-effective serverless architecture (~$10-50/month)
    Future Enhancements
    • Real-time inference endpoints for immediate threat detection
    • CloudWatch dashboards for monitoring and metrics
    • Kinesis integration for streaming log processing
    • SIEM integration (Splunk, QRadar) for enterprise security
    • Automated response workflows for threat remediation
    • Multi-region deployment for high availability
    Conclusion

    This production-ready threat detection pipeline demonstrates enterprise-level security capabilities using AWS Lambda for preprocessing and SageMaker for inference. The system achieves perfect accuracy in detecting six attack types while maintaining cost-effective, scalable infrastructure.

    The complete serverless architecture processes 2,500+ records in ~5 minutes with automatic alerting through SNS, providing organizations with real-time threat visibility at minimal operational cost.


    Complete source code and deployment scripts: [GitHub Repository Link]

  • Date: 2025-09-26

    Tags: ‘Docker’, ‘React’, ‘Multi-stage builds’, ‘Containerization’, ‘Nginx’, ‘Production deployment’, ‘Live reloading’, ‘Docker volumes’, ‘Environment variables’, ‘Image optimization’]

    Docker and React together provide a powerful solution for building and delivering modern web applications. Docker ensures consistent environments across development and production, while React creates fast, dynamic user interfaces—all packaged into lightweight, portable containers for reliable deployment anywhere.

    The Problem Statement

    Modern applications often face scalability challenges as they grow. Monolithic applications can experience significant performance issues when customer base and revenue increase rapidly, with key metrics approaching SLA limits.

    The solution involves refactoring monolithic applications into microservices architecture. Docker has been identified as a critical component in this transformation, starting with containerizing React front-end applications.

    📁 Complete Source Code

    All the code, configurations, and examples from this project are available in the GitHub repository:

    🔗 ApexConsulting React Docker Project

    Clone the repository to follow along

    Software Requirements

    • Docker: 28.2.2
    • Node.js: 24.8.0
    • npm: 11.4.1
    • React: 19.1.0
    • Editor: Microsoft Visual Studio Code (or any preferred editor)

    Creating the Sample Application

    We’ll use Create React App CLI to scaffold our project with zero configuration:

    npx create-react-app apexconsulting
    
    

    This process installs all required packages and may take a minute or two to complete.

    Terminal showing successful completion message

    Project Structure Overview

    Navigate to the project directory and explore the generated structure:

    cd apexconsulting
    ls -la
    
    

    Open the project in Visual Studio Code to examine the structure:

    Key Directories Explained

    • node_modules: Contains all project dependencies
    • public: Root HTML files and static assets
    • src: Main JavaScript files and stylesheets
    • package.json: Project configuration and build scripts

    Browser running React default app at http://localhost:3000.

    Phase 1 – Dockerizing a React Application

    Containerizing a React application involves three essential steps: creating a Dockerfile, building the Docker image, and running the container.

    Step 1: Creating the Dockerfile

    The Dockerfile acts as a blueprint with each command representing a layer in the final image. Create a file named Dockerfile (without extension) in the root directory:

    FROM Node 24.8.0
    
    WORKDIR /app
    
    COPY package.json ./  
    RUN npm install
    
    COPY . .
    
    EXPOSE 3000
    
    CMD ["npm", "start"]
    
    
    Dockerfile Breakdown
    • FROM node:22.16: Base image from Docker Hub
    • WORKDIR /app: Sets working directory to prevent file conflicts
    • COPY package.json: Copies dependency list first for better caching
    • RUN npm install: Downloads and installs dependencies
    • COPY . .: Copies all application files (source = current directory, destination = /app)
    • EXPOSE 3000: Declares the port the container will listen on
    • CMD [“npm”, “start”]: Command executed when container starts
    Step 2: Building the Docker Image

    Build the image using the docker build command:

    docker build -t apexconsulting-dev .

    The build process may take several minutes on first run as it downloads dependencies. The dot (.) specifies the Dockerfile location.

    Check the created image:

    docker image ls

    Note that the image size is approximately 2.63 GB due to node_modules dependencies and the Node.js runtime.

    Step 3: Running the Docker Container

    Start the container with port mapping:

    bash

    docker run -p 3000:3000 apexconsulting-dev
    
    Testing the Application

    Access your containerized React app through:

    • Localhost: http://localhost:3000
    • Container IP: http://[CONTAINER_IP]:3000

    The Development Challenge

    When you make code changes and refresh the browser, updates won’t appear automatically. This happens because the container contains a static copy of your code from build time.

    Phase 2 – Hot Reload with Docker Volumes

    Setting Up a Proper React Development Environment with Docker

    Creating a local development environment that behaves consistently across different machines can be challenging. In this guide, we’ll improve a basic Docker setup to create a fully functional React development environment with hot reloading capabilities.

    The Problem

    When running a React app inside a Docker container, you’ll often encounter an issue where file changes on your host system aren’t detected inside the container. This means your app won’t automatically reload when you make code changes, forcing you to manually rebuild and restart the container for every minor modification—a frustrating and time-consuming workflow.

    The Solution: Two Essential Steps

    To solve this problem, we need to implement two key improvements:

    Step 1: Configure the CHOKIDAR_USEPOLLING Environment Variable

    The first step involves adding an environment variable called CHOKIDAR_USEPOLLING and setting its value to true.

    CHOKIDAR is a file-watching library that react-scripts uses under the hood to detect file changes and trigger hot reloading during development. However, when running inside a Docker container—especially on Linux hosts with mounted volumes—native file system events often don’t work correctly across volume boundaries.

    By setting CHOKIDAR_USEPOLLING=true, we force the chokidar library to use polling instead of relying on native file system events. This makes it reliably detect file updates even in volume-mounted directories.

    Fun fact: “Chokidar” means “watchman” in Hindi, which perfectly describes its function.

    You can set this environment variable directly in your Dockerfile using the ENV command:version: “3.9”

    ENV CHOKIDAR_USEPOLLING=true
    Step 2: Mount Your Local Code Using Docker Volumes

    The second step involves using Docker volumes to share your local code directory with the container. This is accomplished using the -v flag when running your container.

    Here’s the command structure:

    docker run -v $(pwd):/app [other-options] [image-name]

    In this command:

    • $(pwd) refers to your current directory on the host machine
    • /app is the target directory inside the container
    • The volume mount creates a shared space between your host and container
    Testing the Setup

    1.Now let’s test the hot reloading functionality. Open your code editor and navigate to app.js

    2. Make a change to the file—perhaps modify some text or styling in your component

    1. Save the file and switch back to your browser. Watch as the changes appear automatically without any manual refresh!

    You should see your changes reflected instantly—no manual rebuilding required! This demonstrates that you now have a fully functional React development environment with hot reloading capabilities running inside Docker.

    Important Consideration

    While this setup is perfect for development, keep in mind that polling uses more CPU resources compared to event-based file watching. Therefore, this configuration should never be used in production environments. The polling approach is strictly for development convenience and should be disabled or removed when deploying your application.


    This Docker-based development setup eliminates the friction of constant rebuilding while maintaining the consistency and isolation benefits of containerized development. Your development workflow will be significantly smoother, allowing you to focus on writing code rather than managing build processes.

    Phase 3 – Optimizing React Docker Images with Multi-Stage Builds

    We previously containerized a React application using a simple Dockerfile. However, if you’ve been following along, you might have noticed that our Docker image is quite large—around 2.63 GB.

    In this section, we’ll explore multi-stage builds and learn how to dramatically reduce our image size while improving security and deployment performance.

    Understanding Multi-Stage Builds: The Apartment Building Analogy

    Think of building a Docker image like constructing an apartment building. During construction, you need cranes, scaffolding, tools, and construction workers. But once the apartment is complete, all these construction tools are removed, and residents only receive the final furnished apartment with essential features. It wouldn’t make sense to deliver the apartment with all the construction equipment still inside.

    Multi-stage Dockerfiles work exactly the same way. They allow us to use all necessary build tools and dependencies during the construction phase, then create a final image that contains only what’s needed to run the application.

    Why Our Current Image Is So Large

    Let’s examine why our simple Dockerfile produces such a hefty 2.64 GB image:

    Root Cause Analysis
    1. Full Node.js Base Image: We’re using the complete Node.js 24.8.0 image, which alone is approximately 1 GB when uncompressed.
    2. Unnecessary Dependencies: Our build includes all development dependencies, temporary files, log files, and the entire node_modules directory—much of which isn’t needed in production.
    3. Missing Optimization: We’re not separating build-time dependencies from runtime requirements.
    Alpine Images: A Lighter Alternative

    Before diving into multi-stage builds, it’s worth noting that Node.js offers Alpine-based images that are significantly smaller—typically under 100 MB compared to the full 1 GB versions.

    While Alpine images are a good improvement, multi-stage builds offer an even better solution by completely eliminating the Node.js runtime from our final image.

    The Multi-Stage Build Advantage

    Multi-stage builds provide several compelling benefits:

    1. Dramatically Smaller Image Sizes

    By using one stage to build the application and a minimal second stage to serve it, we eliminate:

    • Development tools and build dependencies
    • The Node.js runtime itself
    • Temporary build files
    • Development-only npm packages
    2. Enhanced Security

    Smaller images mean a reduced attack surface. The final image doesn’t include build tools, development dependencies, or unnecessary system components that could potentially be exploited.

    3. Cleaner Separation of Concerns

    The build process is clearly separated from the runtime environment, making the Dockerfile easier to understand and maintain.

    4. Faster Deployments

    Smaller images result in:

    • Faster push/pull operations from Docker registries
    • Reduced bandwidth usage
    • Quicker container startup times
    • More efficient CI/CD pipelines
    The Two-Stage Strategy

    Our multi-stage approach will consist of:

    Stage 1: Build Stage

    • Use a full Node.js image with all build tools
    • Install dependencies (including dev dependencies)
    • Build the React application
    • Generate optimized production assets

    Stage 2: Serve Stage

    • Use a lightweight web server (like Nginx)
    • Copy only the built assets from Stage 1
    • Configure the web server to serve the React app
    • Result: A minimal, production-ready image
    Best Practices for Dependency Management

    When implementing multi-stage builds, consider these dependency management strategies:

    • Separate Development and Production Dependencies: Use npm ci --only=production or maintain separate package files
    • Use .dockerignore: Exclude unnecessary files like node_modules, log files, and temporary directories from being copied into the build context
    • Layer Optimization: Structure your Dockerfile to maximize Docker’s layer caching benefits
    Phase 4 – Implementing Multi-Stage Dockerfiles: From Theory to Practice

    Now that we understand the benefits of multi-stage builds, let’s put theory into practice by creating a production-ready multi-stage Dockerfile. This implementation will dramatically reduce our image size while maintaining all the functionality we need.

    Setting Up the Multi-Stage Dockerfile

    Our new Dockerfile will consist of two distinct stages:

    1. Build Stage: Downloads dependencies and builds the React application
    2. Serve Stage: Uses a lightweight web server to serve only the built assets

    Let’s start by backing up our previous Dockerfile and creating a new multi-stage version.

    Stage 1: The Build Phase

    The first stage handles all the heavy lifting required to build our React application:

    Build stage
    FROM node:24.8.0
    
    WORKDIR /app
    
    COPY package.json ./  
    RUN npm install
    
    COPY . .  
    ENV CHOKIDAR_USEPOLLING=true
    
    RUN npm run build
    
    

    Here’s what each step accomplishes:

    • FROM node: 22.4.80 AS builder: Pulls the full Node.js image and tags it as “builder” for later reference
    • WORKDIR /app: Sets the working directory inside the container
    • COPY package.json: Copies the dependency list to leverage Docker’s layer caching
    • RUN npm install: Installs all project dependencies
    • COPY . .: Copies all source code (src and public folders) into the container
    • ENV CHOKIDAR_USEPOLLING=true: Enables live reloading (remember: not for production!)
    • RUN npm run build: Generates optimized static files in the build directory
    Stage 2: The Serve Stage

    The second stage creates our lean production image:

    FROM nginx:alpine
    
    COPY --from=builder /app/build /usr/share/nginx/html
    
    EXPOSE 80
    
    CMD ["nginx", "-g", "daemon off;"]
    
    

    This is where the magic happens:

    • FROM nginx:alpine: Uses a minimal Alpine Linux-based Nginx image
    • COPY –from=builder: Copies only the built assets from the builder stage
    • EXPOSE 80: Declares that the container listens on port 80
    • CMD: Starts Nginx in the foreground

    The final image contains only static assets and Nginx—no Node.js, no source code, no development dependencies.

    Building and Running the Multi-Stage Image

    Let’s build our new multi-stage Dockerfile

    docker build -t apexconsulting-multi .

    Now let’s run our new container:

    docker run -p 80:80 apexconsulting-multi

    The Nginx server is now up and running, serving our React application on port 80.

    Deep Dive: Understanding Image Layers

    To truly understand how our multi-stage build works, let’s examine the image layer by layer using Docker’s history command:

    docker image history apexconsulting-multi

    You’ll notice one command that generates approximately 42.4MB, but the output is truncated. To see the full details, let’s enhance the command:

    docker image history –format “table {{.CreatedBy}}\t{{.Size}}” –no-trunc apex-multi

    Now we can see the complete multi-line command that generates the 40MB layer. This command downloads the Nginx package using curl, extracts it with tar xzvf, and installs Nginx along with its dynamic modules inside the Alpine Linux environment.

    This detailed view is extremely valuable for:

    • Understanding: See exactly what contributes to your image size
    • Auditing: Verify what’s included in your production image
    • Optimizing: Identify opportunities to reduce image size further

    The Results

    Our multi-stage approach has delivered exactly what we promised:

    • Smaller image size: Dramatically reduced from our original 2.64GB
    • Enhanced security: No development tools or source code in production
    • Production-ready: Served by a robust, lightweight web server
    • Clean separation: Build concerns separated from runtime concerns

    This structure ensures you get a lean, secure, and production-ready Docker image that’s optimized for deployment and scalability.

    Phase 5 -Making Your React App Production-Ready
    Production-Ready Requirements

    A production-ready React application needs:

    • Performance: Optimized builds, minified assets, no dev tools or source code in production images
    • Reliability: Proper error handling and graceful failure recovery
    • Configuration: Environment variables for API keys and endpoints, no hard-coded values

    Multi-stage Dockerfiles handle performance optimization, but we need to address reliability and configuration management.

    Web Server Selection Criteria

    When choosing a web server for static React apps, consider:

    • Scalability: Handle traffic loads and growth (Nginx excels here)
    • Flexibility: Custom configuration support for complex routing and caching
    • Security: Proper HTTP headers, SSL/HTTPS support, secure file serving
    • Docker Integration: Small image sizes, multi-stage build support
    The SPA Routing Problem

    Our current Nginx setup has an issue: navigating to routes like /about shows Nginx’s 404 page instead of letting React Router handle the routing. This happens because Nginx looks for physical files and serves its own 404 when they don’t exist.

    Creating the Nginx Configuration

    The nginx.conf file acts as the brain of your Nginx server, controlling how it handles requests, serves files, and manages routing for your React application.

    Create a new nginx.conf file in your project root:

    server {
        listen 80;
        server_name localhost;
        root /usr/share/nginx/html;
        index index.html;
        
        location / {
            try_files $uri $uri/ /index.html;
        }
    }
    
    
    Key Configuration Explained
    • listen 80: Routes incoming HTTP traffic through port 80
    • server_name localhost: Defines the domain (use your production domain in real deployments)
    • root: Points to where Docker copied the React build output
    • try_files $uri $uri/ /index.html: The critical line that enables SPA routing

    The try_files directive works by:

    1. Looking for a file matching the requested path ($uri)
    2. If not found, serving index.html instead
    3. This allows React Router to handle client-side routing
    Updating the Dockerfile

    Modify your multi-stage Dockerfile to include the custom Nginx configuration:

    # Serve stage
    FROM nginx:alpine
    
    # Remove default config and copy custom config
    RUN rm /etc/nginx/conf.d/default.conf
    COPY nginx.conf /etc/nginx/conf.d/
    
    COPY --from=builder /app/build /usr/share/nginx/html
    EXPOSE 80
    CMD ["nginx", "-g", "daemon off;"]
    
    
    Building and Testing

    Build the updated image:

    docker build -t globo-multi .
    
    

    Run the container:

    docker run -p 80:80 globo-multi
    
    

    Test the routing fix by navigating to localhost/about:

    Environment Variables

    Add environment variables to your Dockerfile:

    ENV REACT_APP_ENVIRONMENT=development
    
    

    Note: Use the REACT_APP_ prefix for Create React App compatibility.

    Access the variable in your React code:

    const env = process.env.REACT_APP_ENVIRONMENT;
    

    After rebuilding and running, the environment value displays in the browser:

    Container Management Best Practices

    Using Named Containers

    Instead of managing container IDs:

    docker run -d --name react-app -p 80:80 globo-multi
    
    
    Detached Mode

    Use the -d flag to run containers in the background:

    Easy Container Control

    Stop containers by name:

    docker stop react-app
    
    

    This approach provides cleaner container management and reduces errors from copying container IDs.