System Architecture

FreightFlow is a microservices-based platform for real-time freight tracking and management, built around automated background task processing.

Visual Overview


Simplified Architecture

For a high-level view, the system consists of four main layers:

┌─────────────────────────────────────────────────────────────────┐
│  🖥️  LAYER 1: FRONTEND                                          │
│     React Dashboard → REST API → JWT Auth                       │
├─────────────────────────────────────────────────────────────────┤
│  ⚙️  LAYER 2: BACKEND CORE                                      │
│     ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│     │   FastAPI   │  │   Celery    │  │  Data Pipelines     │ │
│     │   Server    │  │   Workers   │  │  • TAI Webhooks     │ │
│     └──────┬──────┘  └──────┬──────┘  │  • Cargoes Flow     │ │
│            │                │         │  • Gravitas CSV     │ │
│            └────────────────┬───────────┴─────────────────────┘ │
├───────────────────────────┼─────────────────────────────────────┤
│  💾  LAYER 3: DATA        │  🔍  LAYER 4: SCRAPER SERVICE        │
│     ┌──────────────┐      │     ┌──────────────────────────┐   │
│     │ PostgreSQL 17│◄─────┼────►│    Scraper Microservice  │   │
│     │   + Redis    │      │     │  (14 Terminal Scrapers)  │   │
│     └──────────────┘      │     └──────────────────────────┘   │
└───────────────────────────┴─────────────────────────────────────┘
         │                           │
         ▼                           ▼
┌─────────────────┐         ┌─────────────────┐
│ 📡 TAI Platform │         │ 🌐 Terminal     │
│ 🌊 Cargoes Flow │         │    Portals      │
└─────────────────┘         └─────────────────┘

Data Ingestion Pipelines

FreightFlow receives data through four distinct pipelines, each feeding into the central PostgreSQL database.

Pipeline 1: TAI Platform Webhook

The TAI platform (our Transportation Management System) sends a real-time JSON shipment payload to our backend whenever a shipment is created or updated.

TAI Platform ──(JSON Webhook)──► /api/v1/webhooks/tms
                                      │
                                      ▼
                              Celery Background Task
                                      │
                                      ▼
                           TmsWebhookProcessor
                          ┌───────────┴───────────┐
                          ▼                       ▼
                   ShipmentTai              Cargoes Flow API
                   (Upsert)                 (Register & Sync)

How it works:

  1. TAI sends a JSON payload containing shipment details (MBL, container number, shipper, consignee, carrier, stops, attachments).
  2. The webhook endpoint immediately dispatches the payload to a Celery background task and returns a task_id.
  3. The TmsWebhookProcessor extracts and maps the TAI payload fields into our internal schema.
  4. It checks if a ShipmentTai record already exists by reference_number:
    • Exists: Updates the record with the latest data.
    • New: Creates a new ShipmentTai record.
  5. If an MBL is present, it registers the shipment with the Cargoes Flow API for ocean tracking.
  6. If the MBL is missing, a MissingMblShipment record is created for future resolution.

IMPORTANT

Only Drayage shipment types are processed. Non-drayage shipments are silently skipped.
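The create-or-update decision in step 4 can be sketched as a small pure-Python stand-in. A plain dict plays the role of the tai_shipments table here; the real processor writes ShipmentTai rows through the ORM, so everything beyond the keying logic is an assumption for illustration:

```python
# Sketch of the ShipmentTai upsert from step 4, keyed by reference_number.
# A dict stands in for the database table; the real code uses the ORM.
def upsert_shipment_tai(db: dict, payload: dict) -> str:
    ref = payload["reference_number"]
    if ref in db:
        db[ref].update(payload)      # existing record: refresh with latest data
        return "updated"
    db[ref] = dict(payload)          # new record: create it
    return "created"
```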


Pipeline 2: Cargoes Flow Polling (Every 1h 20min)

A periodic Celery Beat task polls the Cargoes Flow API to refresh tracking data for all active containers.

Celery Beat (every 4800s / 1h 20min)
        │
        ▼
poll_all_latest_updates_from_cargoesflow
        │
        ▼
  CargoesFlowPoller
        │
        ├─► Get all active container numbers from DB
        │
        ├─► For each container:
        │       │
        │       ├─► GET Cargoes Flow API (container_number)
        │       ├─► Extract & map shipment data
        │       └─► Upsert into CargoesFlowShipment + Container + Events
        │
        └─► Commit all changes

Key details:

  • Uses a Redis distributed lock (lock:cargoes_flow_polling) to prevent concurrent executions.
  • Iterates through every unique, non-archived container number in the database.
  • For each container, fetches the latest status from the Cargoes Flow external API.
  • Maps the response into our internal models and performs a create-or-update on the shipment, container fields, and event timeline.
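A minimal sketch of this polling run, with the Redis lock and the Cargoes Flow client abstracted behind injected callables (the callable names and the dict-backed store are assumptions, not the real client API):

```python
# Sketch of a CargoesFlowPoller run: skip the run if the distributed lock
# is held elsewhere, otherwise fetch and upsert every active container.
def poll_all(containers, try_lock, fetch_status, store) -> int:
    if not try_lock("lock:cargoes_flow_polling"):
        return 0                          # another worker is already polling
    updated = 0
    for number in containers:
        data = fetch_status(number)       # GET latest tracking from Cargoes Flow
        store[number] = data              # upsert shipment + container + events
        updated += 1
    return updated                        # commit happens after the loop
```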

Pipeline 3: Terminal Scraper (Every 6 Hours)

Automated scraping of terminal-specific data (availability, LFD, holds) for recently discharged containers.

Celery Beat (every 21600s / 6 hours)
        │
        ▼
periodic_trigger_scrapers_for_discharged_containers
        │
        ▼
  Find containers with discharge event in last 72 hours
        │
        ▼
  For each container:
        │
        ├─► Dispatch to Scraper Service (X-API-TOKEN)
        │       │
        │       └─► Scraper hits terminal portal
        │               │
        │               └─► Returns: LFD, Availability, Holds, Terminal Name
        │
        └─► Scraper Webhook (/api/v1/webhooks/scraper)
                │
                ▼
        process_scraper_webhook_task
                │
                ├─► Update container.last_free_day (LFD)
                ├─► Update container.terminal_status (available/not-available)
                ├─► Update container.pod_terminal (terminal name, only if empty)
                ├─► Backfill event location_terminal_name on discharge events
                └─► Upsert custom holds (Customs, USDA, Terminal, Other Agency)
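The webhook handler's field updates above can be sketched with dict stand-ins (field names follow the list above; the result-key names and the dict shapes are assumptions):

```python
# Sketch of process_scraper_webhook_task's container updates.
def apply_scraper_result(container: dict, result: dict) -> dict:
    container["last_free_day"] = result.get("lfd")
    container["terminal_status"] = result.get("availability")
    # pod_terminal is only backfilled while it is still empty
    if not container.get("pod_terminal"):
        container["pod_terminal"] = result.get("terminal_name")
    return container
```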

Pipeline 4: Gravitas CSV Import (New)

Gravitas is a tracking system for a separate business entity; it ingests shipments via CSV import instead of TAI webhooks and manages its own container lifecycle independently.

┌─────────────────────────────────────────────────────────────────────┐
│                        Gravitas Pipeline                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Admin User ──► CSV Upload ──► POST /api/v1/gravitas/import         │
│                                     │                               │
│                                     ▼                               │
│                         GravitasService.import_csv()                │
│                                     │                               │
│                         ┌───────────┴───────────┐                   │
│                         ▼                       ▼                   │
│              GravitasShipment            Cargoes Flow API           │
│              (Create/Update)             (Query by MBL)             │
│                         │                       │                   │
│                         │              ┌────────┴────────┐          │
│                         │              ▼                 ▼          │
│                         │    GravitasContainer    Container Events  │
│                         │    (Multi-container     (Timeline)        │
│                         │     per MBL support)                      │
│                         │                       │                   │
│                         └───────────┬───────────┘                   │
│                                     │                               │
│                                     ▼                               │
│                    trigger_gravitas_scrapers_task                   │
│                    (Terminal scraping for LFD/LRD)                  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Key differences from TAI pipeline:

  • No TAI dependency: Gravitas operates entirely independently of the TAI system.
  • CSV-based import: shipments are created via CSV upload with the columns File No., MB/L No., Office, Consignee, Oversea Agent, Container No., and Shipper.
  • Multi-container MBL support: when the CSV includes an MBL number, the system queries Cargoes Flow to discover all containers under that MBL, not just the one listed.
  • Separate data models: uses the GravitasShipment, GravitasContainer, and GravitasContainerEvent tables.
  • Gravitas-specific scrapers: dedicated scraper tasks update Gravitas containers only.
  • Dedicated dashboard: separate analytics and KPI tracking for Gravitas operations.

CSV Import Process:

  1. Admin uploads CSV via /api/v1/gravitas/import
  2. System creates/updates GravitasShipment records by MBL number
  3. Each shipment is registered with Cargoes Flow for ocean tracking
  4. sync_tracking() queries Cargoes Flow by MBL to get all containers
  5. Creates GravitasContainer records for each container found
  6. Triggers trigger_gravitas_scrapers_task for terminal data (LFD, LRD)
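Steps 1–2 can be sketched with the stdlib csv module. The column names come from the list above; the per-MBL output shape is an assumption:

```python
import csv
import io

# Sketch of parsing a Gravitas CSV into per-MBL shipment records (steps 1-2).
def parse_gravitas_csv(text: str) -> dict:
    shipments = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Later rows with the same MBL update the existing record
        shipments[row["MB/L No."]] = {
            "file_no": row["File No."],
            "consignee": row["Consignee"],
            "container_no": row["Container No."],
        }
    return shipments
```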

Celery Beat Schedule (Production)

All periodic tasks run on a dedicated periodic queue with a separate Celery worker.

Task                                                | Interval       | Queue    | Description
----------------------------------------------------|----------------|----------|--------------------------------------------------------------
poll_all_latest_updates_from_cargoesflow            | Every 1h 20min | periodic | Sync all active containers with Cargoes Flow API.
poll_all_latest_updates_for_gravitas                | Every 1h 20min | periodic | Sync all active Gravitas containers with Cargoes Flow API.
fetch_or_update_route_map                           | Every 1h 20min | periodic | Fetch map route data for shipments.
periodic_trigger_scrapers_for_discharged_containers | Every 6 hours  | periodic | Trigger terminal scrapers for recently discharged containers.
auto_archive_completed_containers                   | Every 24 hours | periodic | Archive containers with completed/outdated status.

TIP

On-demand tasks like process_tms_webhook_task, process_shipment_import_task, trigger_gravitas_scrapers_task, and trigger_scrapers_for_container_task run on the default queue.
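The schedule above maps to a Celery beat_schedule configuration along these lines (task names, intervals, and the periodic queue come from the table; the entry keys and module-level layout are assumptions):

```python
# Sketch of the production Celery Beat schedule; every periodic task is
# routed to the dedicated "periodic" queue.
beat_schedule = {
    "poll-cargoes-flow": {
        "task": "poll_all_latest_updates_from_cargoesflow",
        "schedule": 4800.0,        # every 1h 20min
        "options": {"queue": "periodic"},
    },
    "trigger-discharged-scrapers": {
        "task": "periodic_trigger_scrapers_for_discharged_containers",
        "schedule": 21600.0,       # every 6 hours
        "options": {"queue": "periodic"},
    },
    "auto-archive": {
        "task": "auto_archive_completed_containers",
        "schedule": 86400.0,       # every 24 hours
        "options": {"queue": "periodic"},
    },
}
```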


Core Data Models

Legacy TAI-Based System

The central database schema connects TAI shipments, Cargoes Flow tracking, and terminal scraper data.

ShipmentTai (tai_shipments)

    ├── reference_number, MBL, carrier, shipper, consignee

    └──► Container (containers)

              ├── container_number, mbl_number, status
              ├── pod_terminal, terminal_status, last_free_day
              ├── is_archived, is_completed (computed)

              ├──► ContainerShipmentEvent (container_shipment_events)
              │        └── code, actual_time, location, terminal_name

              └──► ContainerCustomHold (container_custom_holds)
                       └── hold_type (Customs, USDA, Terminal, Other Agency)

CargoesFlowShipment (cargoes_flow_shipments)

    └── mbl_number ──► Container (via foreign key relationship)

Gravitas System (New)

Separate tracking tables for Gravitas entity operations:

GravitasShipment (gravitas_shipments)

    ├── file_no, mbl_number, office, consignee
    ├── oversea_agent, container_number, shipper

    └──► GravitasContainer (gravitas_containers)

              ├── container_number, mbl_number, status
              ├── pod_terminal, terminal_status, last_free_day
              ├── last_return_day, rail_last_free_day
              ├── rail_last_return_day, demurrage_fee, detention_fee
              ├── is_archived, is_completed (computed)
              ├── is_pod_awaiting, is_pod_full_out (computed)
              ├── is_lfd_needed, is_lrd_needed (computed)

              ├──► GravitasContainerEvent (gravitas_container_events)
              │        └── code, actual_time, location, transport_mode

              └──► shipment_tags[] (LFD, LRD, POD awaiting, etc.)

Container Lifecycle Properties

Both container models include computed properties that drive the dashboard KPIs:

Property                 | Logic
-------------------------|-------------------------------------------------------------
is_pod_awaiting          | Discharged at destination but no full gate-out yet.
is_pod_full_out          | Container picked up full from the port.
is_completed             | Empty gate-in recorded or status is "completed"/"outdated".
is_lfd_needed            | Awaiting at port and missing Last Free Day.
is_lrd_needed            | Gated out full but missing Last Return Day.
needs_manual_alert_input | Missing fees at port after discharge/arrival.
is_rail_shipment         | Has rail dates or rail-specific tracking events.
is_in_transit            | Active shipment that hasn't arrived at destination port.
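Three of these properties, sketched in code. The property names mirror the table; the boolean flags standing in for event-timeline lookups are assumptions:

```python
# Sketch of computed lifecycle properties from the table above.
class ContainerSketch:
    def __init__(self, status=None, discharged=False, full_gate_out=False,
                 empty_gate_in=False, last_free_day=None):
        self.status = status
        self.discharged = discharged          # discharge event at destination
        self.full_gate_out = full_gate_out    # picked up full from the port
        self.empty_gate_in = empty_gate_in    # empty returned to terminal
        self.last_free_day = last_free_day

    @property
    def is_completed(self):
        # empty gate-in recorded, or a terminal status
        return self.empty_gate_in or self.status in ("completed", "outdated")

    @property
    def is_pod_awaiting(self):
        # discharged at destination but not yet gated out full
        return self.discharged and not self.full_gate_out

    @property
    def is_lfd_needed(self):
        # awaiting at port and missing a Last Free Day
        return self.is_pod_awaiting and self.last_free_day is None
```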

Infrastructure

Component     | Technology                             | Purpose
--------------|----------------------------------------|---------------------------------------------
API Server    | FastAPI + Gunicorn + Uvicorn           | Primary REST API.
Task Broker   | Redis 7                                | Celery message broker & application cache.
Task Workers  | Celery (2 workers: default + periodic) | Background processing.
Scheduler     | Celery Beat                            | Periodic task scheduling.
Database      | PostgreSQL 17                          | Persistent data storage.
Reverse Proxy | Traefik                                | SSL termination, routing, rate limiting.
Monitoring    | Flower                                 | Real-time Celery task dashboard.

Microservices Communication

┌─────────────────────────────────────────────────────────────────┐
│                    Service Communication Flow                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌──────────────┐         X-API-TOKEN          ┌──────────────┐│
│   │   Backend    │ ───────────────────────────► │   Scraper    ││
│   │   :8000      │    Trigger Scraping Tasks   │   :9000      ││
│   └──────────────┘                             └──────────────┘│
│          ▲                                              │       │
│          │                                              │       │
│          │    Webhook Results (X-Webhook-Key)           │       │
│          └──────────────────────────────────────────────┘       │
│                                                                 │
│   ┌──────────────┐         REST API            ┌──────────────┐│
│   │   Backend    │ ───────────────────────────► │   Cargoes    ││
│   │              │     API Key Auth            │    Flow      ││
│   └──────────────┘                             └──────────────┘│
│          ▲                                              │       │
│          │                                              │       │
│          │         Webhook (Event Updates)              │       │
│          └──────────────────────────────────────────────┘       │
│                                                                 │
│   ┌──────────────┐         JWT Bearer          ┌──────────────┐│
│   │   Frontend   │ ───────────────────────────► │   Backend    ││
│   │   Dashboard  │        REST API             │   API        ││
│   └──────────────┘                             └──────────────┘│
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
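The backend→scraper leg of the diagram can be sketched as building an authenticated request. The header name and port come from the diagram; the endpoint path and payload shape are assumptions:

```python
# Sketch of dispatching a scrape job from the backend to the scraper service.
def build_scrape_request(container_number: str, api_token: str) -> dict:
    return {
        "method": "POST",
        "url": "http://scraper:9000/scrape",        # assumed endpoint path
        "headers": {"X-API-TOKEN": api_token},      # service-to-service auth
        "json": {"container_number": container_number},
    }
```

Results come back asynchronously on /api/v1/webhooks/scraper, authenticated with X-Webhook-Key, rather than in this request's response.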

FreightFlow Platform Documentation