System Architecture
FreightFlow is a microservices-based platform for real-time freight tracking and management, with background automation handling data ingestion, synchronization, and terminal scraping.
Visual Overview
Simplified Architecture
For a high-level view, the system consists of 4 main layers:
┌─────────────────────────────────────────────────────────────────┐
│ 🖥️ LAYER 1: FRONTEND │
│ React Dashboard → REST API → JWT Auth │
├─────────────────────────────────────────────────────────────────┤
│ ⚙️ LAYER 2: BACKEND CORE │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ FastAPI │ │ Celery │ │ Data Pipelines │ │
│ │ Server │ │ Workers │ │ • TAI Webhooks │ │
│ └──────┬──────┘ └──────┬──────┘ │ • Cargoes Flow │ │
│ │ │ │ • Gravitas CSV │ │
│ └────────────────┬───────────┴─────────────────────┘ │
├───────────────────────────┼─────────────────────────────────────┤
│ 💾 LAYER 3: DATA │ 🔍 LAYER 4: SCRAPER SERVICE │
│ ┌──────────────┐ │ ┌──────────────────────────┐ │
│ │ PostgreSQL 17│◄─────┼────►│ Scraper Microservice │ │
│ │ + Redis │ │ │ (14 Terminal Scrapers) │ │
│ └──────────────┘ │ └──────────────────────────┘ │
└───────────────────────────┴─────────────────────────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ 📡 TAI Platform │ │ 🌐 Terminal │
│ 🌊 Cargoes Flow │ │ Portals │
└─────────────────┘         └─────────────────┘

Data Ingestion Pipelines
FreightFlow receives data through four distinct pipelines, each feeding into the central PostgreSQL database.
Pipeline 1: TAI Platform Webhook
The TAI transportation management system (TMS) sends a real-time shipment JSON payload to our backend whenever a shipment is created or updated.
TAI Platform ──(JSON Webhook)──► /api/v1/webhooks/tms
│
▼
Celery Background Task
│
▼
TmsWebhookProcessor
│
┌───────────┴───────────┐
▼ ▼
ShipmentTai Cargoes Flow API
  (Upsert)              (Register & Sync)

How it works:
- TAI sends a JSON payload containing shipment details (MBL, container number, shipper, consignee, carrier, stops, attachments).
- The webhook endpoint immediately dispatches the payload to a Celery background task and returns a `task_id`.
- The `TmsWebhookProcessor` extracts and maps the TAI payload fields into our internal schema.
- It checks whether a `ShipmentTai` record already exists by `reference_number`:
  - Exists: updates the record with the latest data.
  - New: creates a new `ShipmentTai` record.
- If an MBL is present, it registers the shipment with the Cargoes Flow API for ocean tracking.
- If the MBL is missing, a `MissingMblShipment` record is created for future resolution.
IMPORTANT
Only Drayage shipment types are processed. Non-drayage shipments are silently skipped.
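The decision flow above can be sketched in plain Python. This is a hypothetical illustration of the processor's branching, not the actual implementation; the payload field names (`shipment_type`, `reference_number`, `mbl`) are assumptions about the TAI schema.

```python
# Hypothetical sketch of the TmsWebhookProcessor decision flow.
# Field names are assumptions, not the exact TAI payload schema.

def process_tai_payload(payload: dict, existing_refs: set[str]) -> str:
    """Return the action the processor would take for a TAI webhook payload."""
    if payload.get("shipment_type") != "Drayage":
        return "skipped"                      # non-drayage shipments are ignored
    ref = payload["reference_number"]
    action = "update" if ref in existing_refs else "create"
    if not payload.get("mbl"):
        return f"{action}+missing_mbl"        # queue a MissingMblShipment record
    return f"{action}+register_cargoes_flow"  # register MBL for ocean tracking
```

For example, a new drayage shipment with an MBL would yield `create+register_cargoes_flow`, while a known `reference_number` without an MBL would yield `update+missing_mbl`.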
Pipeline 2: Cargoes Flow Polling (Every 1h 20min)
A periodic Celery Beat task polls the Cargoes Flow API to refresh tracking data for all active containers.
Celery Beat (every 4800s / 1h 20min)
│
▼
poll_all_latest_updates_from_cargoesflow
│
▼
CargoesFlowPoller
│
├─► Get all active container numbers from DB
│
├─► For each container:
│ │
│ ├─► GET Cargoes Flow API (container_number)
│ ├─► Extract & map shipment data
│ └─► Upsert into CargoesFlowShipment + Container + Events
│
    └─► Commit all changes

Key details:
- Uses a Redis distributed lock (`lock:cargoes_flow_polling`) to prevent concurrent executions.
- Iterates through every unique, non-archived container number in the database.
- For each container, fetches the latest status from the Cargoes Flow external API.
- Maps the response into our internal models and performs a create-or-update on the shipment, container fields, and event timeline.
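The lock guard can be sketched as follows, assuming the lock follows the common Redis `SET NX` + expiry pattern (the real client call would be redis-py's `set(..., nx=True, ex=ttl)`). `FakeRedis` is an in-memory stand-in used only to make the sketch self-contained; the key name matches the lock described above.

```python
import time

class FakeRedis:
    """In-memory stand-in mimicking redis-py's set(key, val, nx=True, ex=ttl)."""
    def __init__(self):
        self.store = {}  # key -> (expires_at, value)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        expires_at, _ = self.store.get(key, (0.0, None))
        if nx and expires_at > now:
            return None  # lock already held by another worker
        self.store[key] = (now + (ex or float("inf")), value)
        return True

def run_polling_cycle(redis, poll_fn):
    """Run one Cargoes Flow polling cycle unless another worker holds the lock."""
    if not redis.set("lock:cargoes_flow_polling", "1", nx=True, ex=4800):
        return "skipped: lock held by another worker"
    return poll_fn()
```

A second worker calling `run_polling_cycle` while the lock is live gets the skip result instead of a duplicate polling run.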
Pipeline 3: Terminal Scraper (Every 6 Hours)
Automated scraping of terminal-specific data (availability, LFD, holds) for recently discharged containers.
Celery Beat (every 21600s / 6 hours)
│
▼
periodic_trigger_scrapers_for_discharged_containers
│
▼
Find containers with discharge event in last 72 hours
│
▼
For each container:
│
├─► Dispatch to Scraper Service (X-API-TOKEN)
│ │
│ └─► Scraper hits terminal portal
│ │
│ └─► Returns: LFD, Availability, Holds, Terminal Name
│
└─► Scraper Webhook (/api/v1/webhooks/scraper)
│
▼
process_scraper_webhook_task
│
├─► Update container.last_free_day (LFD)
├─► Update container.terminal_status (available/not-available)
├─► Update container.pod_terminal (terminal name, only if empty)
├─► Backfill event location_terminal_name on discharge events
    └─► Upsert custom holds (Customs, USDA, Terminal, Other Agency)

Pipeline 4: Gravitas CSV Import (New)
Gravitas is a separate entity tracking system that uses CSV import instead of TAI webhooks. It manages its own container lifecycle independently.
┌─────────────────────────────────────────────────────────────────────┐
│ Gravitas Pipeline │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Admin User ──► CSV Upload ──► POST /api/v1/gravitas/import │
│ │ │
│ ▼ │
│ GravitasService.import_csv() │
│ │ │
│ ┌───────────┴───────────┐ │
│ ▼ ▼ │
│ GravitasShipment Cargoes Flow API │
│ (Create/Update) (Query by MBL) │
│ │ │ │
│ │ ┌────────┴────────┐ │
│ │ ▼ ▼ │
│ │ GravitasContainer Container Events │
│ │ (Multi-container (Timeline) │
│ │ per MBL support) │
│ │ │ │
│ └───────────┬───────────┘ │
│ │ │
│ ▼ │
│ trigger_gravitas_scrapers_task │
│ (Terminal scraping for LFD/LRD) │
│ │
└─────────────────────────────────────────────────────────────────────┘

Key differences from TAI pipeline:
- No TAI dependency: Gravitas operates completely independently from the TAI system.
- CSV-based import: shipments are created via CSV upload with columns File No., MB/L No., Office, Consignee, Oversea Agent, Container No., Shipper.
- Multi-container MBL support: when the CSV has MBL numbers, the system queries Cargoes Flow to discover ALL containers under that MBL (not just the one listed).
- Separate data models: uses `GravitasShipment`, `GravitasContainer`, and `GravitasContainerEvent` tables.
- Gravitas-specific scrapers: dedicated scraper tasks that update Gravitas containers only.
- Dedicated dashboard: separate analytics and KPI tracking for Gravitas operations.
CSV Import Process:
- Admin uploads a CSV via `/api/v1/gravitas/import`.
- The system creates/updates `GravitasShipment` records by MBL number.
- Each shipment is registered with Cargoes Flow for ocean tracking.
- `sync_tracking()` queries Cargoes Flow by MBL to get all containers.
- Creates `GravitasContainer` records for each container found.
- Triggers `trigger_gravitas_scrapers_task` for terminal data (LFD, LRD).
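The parsing step can be sketched as grouping CSV rows by MB/L No., so one shipment can later fan out to every container Cargoes Flow reports for that MBL. The column names come from the list above; the function name and everything else is illustrative, not the actual `GravitasService` code.

```python
import csv
import io
from collections import defaultdict

def parse_gravitas_csv(text: str) -> dict[str, list[str]]:
    """Map each MBL number to the container numbers listed for it in the CSV."""
    by_mbl: dict[str, list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        mbl = row["MB/L No."].strip()
        if mbl:  # rows without an MBL cannot be registered with Cargoes Flow
            by_mbl[mbl].append(row["Container No."].strip())
    return dict(by_mbl)
```

Note that even if only one container appears per MBL in the CSV, the subsequent Cargoes Flow query by MBL may return additional containers, which is why grouping happens at the MBL level.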
Celery Beat Schedule (Production)
All periodic tasks run on a dedicated periodic queue with a separate Celery worker.
| Task | Interval | Queue | Description |
|---|---|---|---|
| `poll_all_latest_updates_from_cargoesflow` | Every 1h 20min | periodic | Sync all active containers with the Cargoes Flow API. |
| `poll_all_latest_updates_for_gravitas` | Every 1h 20min | periodic | Sync all active Gravitas containers with the Cargoes Flow API. |
| `fetch_or_update_route_map` | Every 1h 20min | periodic | Fetch map route data for shipments. |
| `periodic_trigger_scrapers_for_discharged_containers` | Every 6 hours | periodic | Trigger terminal scrapers for recently discharged containers. |
| `auto_archive_completed_containers` | Every 24 hours | periodic | Archive containers with completed/outdated status. |
TIP
On-demand tasks like `process_tms_webhook_task`, `process_shipment_import_task`, `trigger_gravitas_scrapers_task`, and `trigger_scrapers_for_container_task` run on the default queue.
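As a sketch, the table above could be expressed as a Celery `beat_schedule` dict with intervals in seconds and per-task queue routing. Task names match the table; the Celery app wiring itself is omitted, and the exact production config may differ.

```python
# Sketch of the production schedule as a Celery beat_schedule dict.
# Intervals in seconds; all periodic tasks route to the "periodic" queue.
beat_schedule = {
    "poll_all_latest_updates_from_cargoesflow": {
        "task": "poll_all_latest_updates_from_cargoesflow",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "poll_all_latest_updates_for_gravitas": {
        "task": "poll_all_latest_updates_for_gravitas",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "fetch_or_update_route_map": {
        "task": "fetch_or_update_route_map",
        "schedule": 4800,   # 1h 20min
        "options": {"queue": "periodic"},
    },
    "periodic_trigger_scrapers_for_discharged_containers": {
        "task": "periodic_trigger_scrapers_for_discharged_containers",
        "schedule": 21600,  # 6 hours
        "options": {"queue": "periodic"},
    },
    "auto_archive_completed_containers": {
        "task": "auto_archive_completed_containers",
        "schedule": 86400,  # 24 hours
        "options": {"queue": "periodic"},
    },
}
```

Routing every periodic task to its own queue keeps long-running syncs from starving on-demand webhook processing on the default queue.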
Core Data Models
Legacy TAI-Based System
The central database schema connects TAI shipments, Cargoes Flow tracking, and terminal scraper data.
ShipmentTai (tai_shipments)
│
├── reference_number, MBL, carrier, shipper, consignee
│
└──► Container (containers)
│
├── container_number, mbl_number, status
├── pod_terminal, terminal_status, last_free_day
├── is_archived, is_completed (computed)
│
├──► ContainerShipmentEvent (container_shipment_events)
│ └── code, actual_time, location, terminal_name
│
└──► ContainerCustomHold (container_custom_holds)
└── hold_type (Customs, USDA, Terminal, Other Agency)
CargoesFlowShipment (cargoes_flow_shipments)
│
  └── mbl_number ──► Container (via foreign key relationship)

Gravitas System (New)
Separate tracking tables for Gravitas entity operations:
GravitasShipment (gravitas_shipments)
│
├── file_no, mbl_number, office, consignee
├── oversea_agent, container_number, shipper
│
└──► GravitasContainer (gravitas_containers)
│
├── container_number, mbl_number, status
├── pod_terminal, terminal_status, last_free_day
├── last_return_day, rail_last_free_day
├── rail_last_return_day, demurrage_fee, detention_fee
├── is_archived, is_completed (computed)
├── is_pod_awaiting, is_pod_full_out (computed)
├── is_lfd_needed, is_lrd_needed (computed)
│
├──► GravitasContainerEvent (gravitas_container_events)
│ └── code, actual_time, location, transport_mode
│
    └──► shipment_tags[] (LFD, LRD, POD awaiting, etc.)

Container Lifecycle Properties
Both container models include computed properties that drive the dashboard KPIs:
| Property | Logic |
|---|---|
| `is_pod_awaiting` | Discharged at destination but no full gate-out yet. |
| `is_pod_full_out` | Container picked up full from the port. |
| `is_completed` | Empty gate-in recorded or status is "completed"/"outdated". |
| `is_lfd_needed` | Awaiting at port and missing Last Free Day. |
| `is_lrd_needed` | Gated out full but missing Last Return Day. |
| `needs_manual_alert_input` | Missing fees at port after discharge/arrival. |
| `is_rail_shipment` | Has rail dates or rail-specific tracking events. |
| `is_in_transit` | Active shipment that hasn't arrived at destination port. |
Infrastructure
| Component | Technology | Purpose |
|---|---|---|
| API Server | FastAPI + Gunicorn + Uvicorn | Primary REST API. |
| Task Broker | Redis 7 | Celery message broker & application cache. |
| Task Workers | Celery (2 workers: default + periodic) | Background processing. |
| Scheduler | Celery Beat | Periodic task scheduling. |
| Database | PostgreSQL 17 | Persistent data storage. |
| Reverse Proxy | Traefik | SSL termination, routing, rate limiting. |
| Monitoring | Flower | Real-time Celery task dashboard. |
Microservices Communication
┌─────────────────────────────────────────────────────────────────┐
│ Service Communication Flow │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ X-API-TOKEN ┌──────────────┐│
│ │ Backend │ ───────────────────────────► │ Scraper ││
│ │ :8000 │ Trigger Scraping Tasks │ :9000 ││
│ └──────────────┘ └──────────────┘│
│ ▲ │ │
│ │ │ │
│ │ Webhook Results (X-Webhook-Key) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ REST API ┌──────────────┐│
│ │ Backend │ ───────────────────────────► │ Cargoes ││
│ │ │ API Key Auth │ Flow ││
│ └──────────────┘ └──────────────┘│
│ ▲ │ │
│ │ │ │
│ │ Webhook (Event Updates) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ JWT Bearer ┌──────────────┐│
│ │ Frontend │ ───────────────────────────► │ Backend ││
│ │ Dashboard │ REST API │ API ││
│ └──────────────┘ └──────────────┘│
│ │
└─────────────────────────────────────────────────────────────────┘
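The backend-to-scraper dispatch shown above can be sketched as follows. The internal URLs, route paths, and payload shape are illustrative assumptions; only the header names (`X-API-TOKEN` outbound, `X-Webhook-Key` on the result callback) and ports come from the diagram.

```python
import json

def build_scrape_request(container_number: str, api_token: str) -> dict:
    """Assemble the HTTP request the backend would send to the scraper service.

    Hypothetical sketch: the /scrape route and payload fields are assumptions.
    """
    return {
        "method": "POST",
        "url": "http://scraper:9000/scrape",  # assumed internal service route
        "headers": {
            "X-API-TOKEN": api_token,          # authenticates backend -> scraper
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "container_number": container_number,
            # scraper posts results back here, signed with X-Webhook-Key
            "callback_url": "http://backend:8000/api/v1/webhooks/scraper",
        }),
    }
```

Keeping authentication asymmetric (one token per direction) lets either credential be rotated without breaking the other half of the round trip.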