System Architecture
FreightFlow is a microservices-based platform for real-time freight tracking and management, with background automation handling data ingestion, synchronization, and terminal scraping.
Visual Overview
Simplified Architecture
For a high-level view, the system consists of 4 main layers:
┌─────────────────────────────────────────────────────────────────┐
│ 🖥️ LAYER 1: FRONTEND │
│ React Dashboard → REST API → JWT Auth │
├─────────────────────────────────────────────────────────────────┤
│ ⚙️ LAYER 2: BACKEND CORE │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ FastAPI │ │ Celery │ │ Data Pipelines │ │
│ │ Server │ │ Workers │ │ • TAI Webhooks │ │
│ └──────┬──────┘ └──────┬──────┘ │ • Cargoes Flow │ │
│ │ │ │ • Gravitas CSV │ │
│ └────────────────┬───────────┴─────────────────────┘ │
├───────────────────────────┼─────────────────────────────────────┤
│ 💾 LAYER 3: DATA │ 🔍 LAYER 4: SCRAPER SERVICE │
│ ┌──────────────┐ │ ┌──────────────────────────┐ │
│ │ PostgreSQL 17│◄─────┼────►│ Scraper Microservice │ │
│ │ + Redis │ │ │ (14 Terminal Scrapers) │ │
│ └──────────────┘ │ └──────────────────────────┘ │
└───────────────────────────┴─────────────────────────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ 📡 TAI Platform │ │ 🌐 Terminal │
│ 🌊 Cargoes Flow │ │ Portals │
└─────────────────┘         └─────────────────┘

Data Ingestion Pipelines
FreightFlow receives data through four distinct pipelines, each feeding into the central PostgreSQL database.
Pipeline 1: TAI Platform Webhook
The TAI transportation management system (TMS) sends a real-time shipment JSON payload to our backend whenever a shipment is created or updated.
TAI Platform ──(JSON Webhook)──► /api/v1/webhooks/tms
│
▼
Celery Background Task
│
▼
TmsWebhookProcessor
│
┌───────────┴───────────┐
▼ ▼
ShipmentTai Cargoes Flow API
  (Upsert)              (Register & Sync)

How it works:
- TAI sends a JSON payload containing shipment details (MBL, container number, shipper, consignee, carrier, stops, attachments).
- The webhook endpoint immediately dispatches the payload to a Celery background task and returns a `task_id`.
- The `TmsWebhookProcessor` extracts and maps the TAI payload fields into our internal schema.
- It checks whether a `ShipmentTai` record already exists by `reference_number`:
  - Exists: updates the record with the latest data.
  - New: creates a new `ShipmentTai` record.
- If an MBL is present, it registers the shipment with the Cargoes Flow API for ocean tracking.
- If the MBL is missing, a `MissingMblShipment` record is created for future resolution.
IMPORTANT
Only Drayage shipment types are processed. Non-drayage shipments are silently skipped.
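The decision flow above can be sketched in plain Python. This is a hypothetical illustration of the processor's branching, not the actual implementation; the payload field names (`shipment_type`, `reference_number`, `mbl`) are assumptions about the TAI schema.

```python
# Hypothetical sketch of the TmsWebhookProcessor decision flow.
# Field names are assumptions, not the exact TAI payload schema.

def process_tai_payload(payload: dict, existing_refs: set[str]) -> str:
    """Return the action the processor would take for a TAI webhook payload."""
    if payload.get("shipment_type") != "Drayage":
        return "skipped"                      # non-drayage shipments are ignored
    ref = payload["reference_number"]
    action = "update" if ref in existing_refs else "create"
    if not payload.get("mbl"):
        return f"{action}+missing_mbl"        # queue a MissingMblShipment record
    return f"{action}+register_cargoes_flow"  # register MBL for ocean tracking
```

For example, a new drayage shipment with an MBL would yield `create+register_cargoes_flow`, while a known `reference_number` without an MBL would yield `update+missing_mbl`.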
Pipeline 2: Cargoes Flow Polling (Every 1h 20min)
A periodic Celery Beat task polls the Cargoes Flow API to refresh tracking data for all active containers.
Celery Beat (every 4800s / 1h 20min)
│
▼
poll_all_latest_updates_from_cargoesflow
│
▼
CargoesFlowPoller
│
├─► Get all active container numbers from DB
│
├─► For each container:
│ │
│ ├─► GET Cargoes Flow API (container_number)
│ ├─► Extract & map shipment data
│ └─► Upsert into CargoesFlowShipment + Container + Events
│
    └─► Commit all changes

Key details:
- Uses a Redis distributed lock (`lock:cargoes_flow_polling`) to prevent concurrent executions.
- Iterates through every unique, non-archived container number in the database.
- For each container, fetches the latest status from the Cargoes Flow external API.
- Maps the response into our internal models and performs a create-or-update on the shipment, container fields, and event timeline.
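The lock guard can be sketched as follows, assuming the lock follows the common Redis `SET NX` + expiry pattern (the real client call would be redis-py's `set(..., nx=True, ex=ttl)`). `FakeRedis` is an in-memory stand-in used only to make the sketch self-contained; the key name matches the lock described above.

```python
import time

class FakeRedis:
    """In-memory stand-in mimicking redis-py's set(key, val, nx=True, ex=ttl)."""
    def __init__(self):
        self.store = {}  # key -> (expires_at, value)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        expires_at, _ = self.store.get(key, (0.0, None))
        if nx and expires_at > now:
            return None  # lock already held by another worker
        self.store[key] = (now + (ex or float("inf")), value)
        return True

def run_polling_cycle(redis, poll_fn):
    """Run one Cargoes Flow polling cycle unless another worker holds the lock."""
    if not redis.set("lock:cargoes_flow_polling", "1", nx=True, ex=4800):
        return "skipped: lock held by another worker"
    return poll_fn()
```

A second worker calling `run_polling_cycle` while the lock is live gets the skip result instead of a duplicate polling run.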
Pipeline 3: Terminal Scraper (Every 6 Hours)
Automated scraping of terminal-specific data (availability, LFD, holds) for recently discharged containers.
Celery Beat (every 21600s / 6 hours)
│
▼
periodic_trigger_scrapers_for_discharged_containers
│
▼
Find containers with discharge event in last 72 hours
│
▼
For each container:
│
├─► Dispatch to Scraper Service (X-API-TOKEN)
│ │
│ └─► Scraper hits terminal portal
│ │
│ └─► Returns: LFD, Availability, Holds, Terminal Name
│
└─► Scraper Webhook (/api/v1/webhooks/scraper)
│
▼
process_scraper_webhook_task
│
├─► Update container.last_free_day (LFD)
├─► Update container.terminal_status (available/not-available)
├─► Update container.pod_terminal (terminal name, only if empty)
├─► Backfill event location_terminal_name on discharge events
    └─► Upsert custom holds (Customs, USDA, Terminal, Other Agency)

Pipeline 4: Gravitas CSV Import (New)
Gravitas is a separate entity tracking system that uses CSV import instead of TAI webhooks. It manages its own container lifecycle independently.
┌─────────────────────────────────────────────────────────────────────┐
│ Gravitas Pipeline │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Admin User ──► CSV Upload ──► POST /api/v1/gravitas/import │
│ │ │
│ ▼ │
│ GravitasService.import_csv() │
│ │ │
│ ┌───────────┴───────────┐ │
│ ▼ ▼ │
│ GravitasShipment Cargoes Flow API │
│ (Create/Update) (Query by MBL) │
│ │ │ │
│ │ ┌────────┴────────┐ │
│ │ ▼ ▼ │
│ │ GravitasContainer Container Events │
│ │ (Multi-container (Timeline) │
│ │ per MBL support) │
│ │ │ │
│ └───────────┬───────────┘ │
│ │ │
│ ▼ │
│ trigger_gravitas_scrapers_task │
│ (Terminal scraping for LFD/LRD) │
│ │
└─────────────────────────────────────────────────────────────────────┘

Key differences from TAI pipeline:
- No TAI dependency: Gravitas operates completely independently from the TAI system.
- CSV-based import: shipments are created via CSV upload with columns File No., MB/L No., Office, Consignee, Oversea Agent, Container No., Shipper.
- Multi-container MBL support: when the CSV has MBL numbers, the system queries Cargoes Flow to discover ALL containers under that MBL (not just the one listed).
- Separate data models: uses `GravitasShipment`, `GravitasContainer`, and `GravitasContainerEvent` tables.
- Gravitas-specific scrapers: dedicated scraper tasks that update Gravitas containers only.
- Dedicated dashboard: separate analytics and KPI tracking for Gravitas operations.
CSV Import Process:
- Admin uploads a CSV via `/api/v1/gravitas/import`.
- The system creates/updates `GravitasShipment` records by MBL number.
- Each shipment is registered with Cargoes Flow for ocean tracking.
- `sync_tracking()` queries Cargoes Flow by MBL to get all containers.
- Creates `GravitasContainer` records for each container found.
- Triggers `trigger_gravitas_scrapers_task` for terminal data (LFD, LRD).
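The parsing step can be sketched as grouping CSV rows by MB/L No., so one shipment can later fan out to every container Cargoes Flow reports for that MBL. The column names come from the list above; the function name and everything else is illustrative, not the actual `GravitasService` code.

```python
import csv
import io
from collections import defaultdict

def parse_gravitas_csv(text: str) -> dict[str, list[str]]:
    """Map each MBL number to the container numbers listed for it in the CSV."""
    by_mbl: dict[str, list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        mbl = row["MB/L No."].strip()
        if mbl:  # rows without an MBL cannot be registered with Cargoes Flow
            by_mbl[mbl].append(row["Container No."].strip())
    return dict(by_mbl)
```

Note that even if only one container appears per MBL in the CSV, the subsequent Cargoes Flow query by MBL may return additional containers, which is why grouping happens at the MBL level.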
Celery Beat Schedule (Production)
All periodic tasks run on a dedicated periodic queue with a separate Celery worker.
| Task | Interval | Queue | Description |
|---|---|---|---|
| `poll_all_latest_updates_from_cargoesflow` | Every 1h 20min | periodic | Sync all active containers with the Cargoes Flow API. |
| `poll_all_latest_updates_for_gravitas` | Every 1h 20min | periodic | Sync all active Gravitas containers with the Cargoes Flow API. |
| `fetch_or_update_route_map` | Every 1h 20min | periodic | Fetch map route data for shipments. |
| `periodic_trigger_scrapers_for_discharged_containers` | Every 6 hours | periodic | Trigger terminal scrapers for recently discharged containers. |
| `auto_archive_completed_containers` | Every 24 hours | periodic | Archive containers with completed/outdated status. |
TIP
On-demand tasks like `process_tms_webhook_task`, `process_shipment_import_task`, `trigger_gravitas_scrapers_task`, and `trigger_scrapers_for_container_task` run on the default queue.
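As a sketch, the table above could be expressed as a Celery `beat_schedule` dict with intervals in seconds and per-task queue routing. Task names match the table; the Celery app wiring itself is omitted, and the exact production config may differ.

```python
# Sketch of the production schedule as a Celery beat_schedule dict.
# Intervals in seconds; all periodic tasks route to the "periodic" queue.
beat_schedule = {
    "poll_all_latest_updates_from_cargoesflow": {
        "task": "poll_all_latest_updates_from_cargoesflow",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "poll_all_latest_updates_for_gravitas": {
        "task": "poll_all_latest_updates_for_gravitas",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "fetch_or_update_route_map": {
        "task": "fetch_or_update_route_map",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "periodic_trigger_scrapers_for_discharged_containers": {
        "task": "periodic_trigger_scrapers_for_discharged_containers",
        "schedule": 21600,  # 6 hours
        "options": {"queue": "periodic"},
    },
    "auto_archive_completed_containers": {
        "task": "auto_archive_completed_containers",
        "schedule": 86400,  # 24 hours
        "options": {"queue": "periodic"},
    },
}
```

Routing every periodic task to its own queue keeps long-running syncs from starving on-demand webhook processing on the default queue.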
Core Data Models
Legacy TAI-Based System
The central database schema connects TAI shipments, Cargoes Flow tracking, and terminal scraper data.
ShipmentTai (tai_shipments)
│
├── reference_number, MBL, carrier, shipper, consignee
│
└──► Container (containers)
│
├── container_number, mbl_number, status
├── pod_terminal, terminal_status, last_free_day
├── is_archived, is_completed (computed)
│
├──► ContainerShipmentEvent (container_shipment_events)
│ └── code, actual_time, location, terminal_name
│
└──► ContainerCustomHold (container_custom_holds)
└── hold_type (Customs, USDA, Terminal, Other Agency)
CargoesFlowShipment (cargoes_flow_shipments)
│
  └── mbl_number ──► Container (via foreign key relationship)

Gravitas System (New)
Separate tracking tables for Gravitas entity operations:
GravitasShipment (gravitas_shipments)
│
├── file_no, mbl_number, office, consignee
├── oversea_agent, container_number, shipper
│
└──► GravitasContainer (gravitas_containers)
│
├── container_number, mbl_number, status
├── pod_terminal, terminal_status, last_free_day
├── last_return_day, rail_last_free_day
├── rail_last_return_day, demurrage_fee, detention_fee
├── is_archived, is_completed (computed)
├── is_pod_awaiting, is_pod_full_out (computed)
├── is_lfd_needed, is_lrd_needed (computed)
│
├──► GravitasContainerEvent (gravitas_container_events)
│ └── code, actual_time, location, transport_mode
│
    └──► shipment_tags[] (LFD, LRD, POD awaiting, etc.)

Container Lifecycle Properties
Both container models include computed properties that drive the dashboard KPIs:
| Property | Logic |
|---|---|
| `is_pod_awaiting` | Discharged at destination but no full gate-out yet. |
| `is_pod_full_out` | Container picked up full from the port. |
| `is_completed` | Empty gate-in recorded or status is "completed"/"outdated". |
| `is_lfd_needed` | Awaiting at port and missing Last Free Day. |
| `is_lrd_needed` | Gated out full but missing Last Return Day. |
| `needs_manual_alert_input` | Missing fees at port after discharge/arrival. |
| `is_rail_shipment` | Has rail dates or rail-specific tracking events. |
| `is_in_transit` | Active shipment that hasn't arrived at destination port. |
Infrastructure
| Component | Technology | Purpose |
|---|---|---|
| API Server | FastAPI + Gunicorn + Uvicorn | Primary REST API. |
| Task Broker | Redis 7 | Celery message broker & application cache. |
| Task Workers | Celery (2 workers: default + periodic) | Background processing. |
| Scheduler | Celery Beat | Periodic task scheduling. |
| Database | PostgreSQL 17 | Persistent data storage. |
| Reverse Proxy | Traefik | SSL termination, routing, rate limiting. |
| Monitoring | Flower | Real-time Celery task dashboard. |
Microservices Communication
┌─────────────────────────────────────────────────────────────────┐
│ Service Communication Flow │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ X-API-TOKEN ┌──────────────┐│
│ │ Backend │ ───────────────────────────► │ Scraper ││
│ │ :8000 │ Trigger Scraping Tasks │ :9000 ││
│ └──────────────┘ └──────────────┘│
│ ▲ │ │
│ │ │ │
│ │ Webhook Results (X-Webhook-Key) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ REST API ┌──────────────┐│
│ │ Backend │ ───────────────────────────► │ Cargoes ││
│ │ │ API Key Auth │ Flow ││
│ └──────────────┘ └──────────────┘│
│ ▲ │ │
│ │ │ │
│ │ Webhook (Event Updates) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ JWT Bearer ┌──────────────┐│
│ │ Frontend │ ───────────────────────────► │ Backend ││
│ │ Dashboard │ REST API │ API ││
│ └──────────────┘ └──────────────┘│
│ │
└─────────────────────────────────────────────────────────────────┘
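The backend-to-scraper dispatch shown above can be sketched as follows. The internal URLs, route paths, and payload shape are illustrative assumptions; only the header names (`X-API-TOKEN` outbound, `X-Webhook-Key` on the result callback) and ports come from the diagram.

```python
import json

def build_scrape_request(container_number: str, api_token: str) -> dict:
    """Assemble the HTTP request the backend would send to the scraper service.

    Hypothetical sketch: the /scrape route and payload fields are assumptions.
    """
    return {
        "method": "POST",
        "url": "http://scraper:9000/scrape",  # assumed internal service route
        "headers": {
            "X-API-TOKEN": api_token,          # authenticates backend -> scraper
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "container_number": container_number,
            # scraper posts results back here, signed with X-Webhook-Key
            "callback_url": "http://backend:8000/api/v1/webhooks/scraper",
        }),
    }
```

Keeping authentication asymmetric (one token per direction) lets either credential be rotated without breaking the other half of the round trip.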