uploader-bot/ARCHITECTURE.md


System Architecture Overview

This document is the single source of truth for the platform's architecture, protocols, data flows, and operational details. It supersedes previous scattered docs.

Contents

  • Components & Topology
  • Decentralized Layer (Membership, Replication, Metrics)
  • Upload/Conversion Pipeline
  • Content View & Purchase Flow
  • API Surface (selected endpoints)
  • Data Keys & Schemas
  • Configuration & Defaults
  • Observability & Metrics
  • Sequence Diagrams (Mermaid)

Components & Topology

  • Backend API: Sanic-based service (Telegram bots embedded) with PostgreSQL (SQLAlchemy + Alembic).
  • Storage: Local FS for uploaded/derived data; IPFS used for discovery/pinning; tusd for resumable uploads.
  • Converter workers: Dockerized ffmpeg pipeline (convert_v3, convert_process) driven by background tasks.
  • Frontend: Vite + TypeScript client served via nginx container.
  • Decentralized overlay (in-process DHT): Membership, replication lease management, windowed content metrics.

```mermaid
flowchart LR
  Client -- TWA/HTTP --> Frontend
  Frontend -- REST --> API[Backend API]
  API -- tus hooks --> tusd
  API -- SQL --> Postgres
  API -- IPC --> Workers[Converter Workers]
  API -- IPFS --> IPFS
  API -- DHT --> DHT[(In-Process DHT)]
  DHT -- CRDT Merge --> DHT
```

Decentralized Layer

Identity & Versions

  • NodeID = blake3(Ed25519 public key), ContentID = blake3(encrypted_blob)
  • schema_version = v1 embedded into DHT keys/records.
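
A minimal sketch of these derivations in Python, using the stdlib blake2b as a stand-in for blake3 (which ships as a third-party package); the key layout and helper names are illustrative, not the exact implementation:

```python
import hashlib

SCHEMA_VERSION = "v1"

def node_id(ed25519_public_key: bytes) -> str:
    # NodeID = hash of the node's Ed25519 public key (blake2b standing in for blake3)
    return hashlib.blake2b(ed25519_public_key, digest_size=32).hexdigest()

def content_id(encrypted_blob: bytes) -> str:
    # ContentID = hash of the encrypted blob, so identity is ciphertext-stable
    return hashlib.blake2b(encrypted_blob, digest_size=32).hexdigest()

def dht_key(kind: str, ident: str) -> str:
    # schema_version is embedded in every DHT key (this layout is an assumption)
    return f"{SCHEMA_VERSION}:{kind}:{ident}"
```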

Membership

  • Signed /api/v1/network.handshake with Ed25519; includes:
    • Node info, capabilities, metrics, IPFS metadata.
    • reachability_receipts: (issuer, target, ASN, timestamp, signature).
  • State: LWW-Set for members + receipts, HyperLogLog for population estimate.
  • Island filtering: nodes with reachability_ratio < q are excluded (k=5, q=0.6, TTL=600s).
  • N_estimate: max(valid N_local reports) across sufficiently reachable peers.
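
The filtering and estimation rules above can be sketched as follows; the receipt tuple shape matches the handshake description, but the ratio definition (distinct vouching ASNs out of k) is an assumption:

```python
def reachability_ratio(receipts, node, k=5):
    # Fraction of the k required receipts, counting distinct vouching ASNs (assumed definition)
    asns = {asn for (_issuer, target, asn, _ts, _sig) in receipts if target == node}
    return min(len(asns), k) / k

def filter_islands(nodes, receipts, q=0.6):
    # Island filtering: drop nodes whose reachability_ratio falls below q
    return [n for n in nodes if reachability_ratio(receipts, n) >= q]

def n_estimate(local_report, peer_reports):
    # N_estimate = max of valid N_local reports across reachable peers
    return max([local_report, *peer_reports])
```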

Replication & Leases

  • Compute prefix p = max(0, round(log2(N_estimate / R_target))) with R_target ≥ 3.
  • Responsible nodes: first p bits of NodeID equal first p bits of ContentID.
  • Leader = min NodeID among responsible.
  • Leader maintains replica_leases with TTL=600s and diversity: ≥3 IP first octets and ≥3 ASN if available.
  • Rendezvous ranking: blake3(ContentID || NodeID) for candidate selection.
  • Heartbeat interval 60s, miss threshold 3 → failover within ≤180s.
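
A sketch of the prefix and election math above (stdlib blake2b standing in for blake3; hex-string IDs assumed):

```python
import hashlib
import math

def prefix_bits(n_estimate: int, r_target: int = 3) -> int:
    # p = max(0, round(log2(N_estimate / R_target)))
    if n_estimate <= 0:
        return 0
    return max(0, round(math.log2(n_estimate / r_target)))

def hex_to_bits(hex_id: str) -> str:
    return bin(int(hex_id, 16))[2:].zfill(len(hex_id) * 4)

def is_responsible(node_id: str, content_id: str, p: int) -> bool:
    # Responsible iff the first p bits of NodeID equal the first p bits of ContentID
    return hex_to_bits(node_id)[:p] == hex_to_bits(content_id)[:p]

def rendezvous_score(content_id: str, node_id: str) -> str:
    # Candidate ranking by hash(ContentID || NodeID); blake2b as a blake3 stand-in
    return hashlib.blake2b(bytes.fromhex(content_id) + bytes.fromhex(node_id)).hexdigest()

def elect_leader(responsible_ids):
    # Leader = min NodeID among responsible nodes
    return min(responsible_ids)
```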

Metrics (Windowed CRDT)

  • On view: PN-Counter for views; HyperLogLog for uniques (ViewID = blake3(ContentID || device_salt)); G-Counter for watch_time, bytes_out, completions.
  • Keys are windowed by hour; commutative merges ensure deterministic convergence.
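
A self-contained sketch of the PN-Counter merge and hourly windowing (simplified; the real state also carries HyperLogLog and G-Counter payloads):

```python
import time

class PNCounter:
    """PN-Counter CRDT: per-node increment and decrement maps; merge is element-wise max."""
    def __init__(self):
        self.incs, self.decs = {}, {}

    def incr(self, node_id, amount=1):
        self.incs[node_id] = self.incs.get(node_id, 0) + amount

    def decr(self, node_id, amount=1):
        self.decs[node_id] = self.decs.get(node_id, 0) + amount

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        # Commutative, associative, idempotent: merge order does not matter
        for src, dst in ((other.incs, self.incs), (other.decs, self.decs)):
            for node_id, count in src.items():
                dst[node_id] = max(dst.get(node_id, 0), count)

def metric_window_id(ts=None, window_sec=3600):
    # Hourly windowing for MetricKey(content_id, window_id)
    return int(ts if ts is not None else time.time()) // window_sec
```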

```mermaid
stateDiagram-v2
  [*] --> Discover
  Discover: Handshake + receipts
  Discover --> Active: k ASN receipts & TTL ok
  Active --> Leader: Content prefix p elects min NodeID
  Leader --> Leased: Assign replica_leases (diversity)
  Leased --> Monitoring: Heartbeats every 60s
  Monitoring --> Reassign: Missed 3 intervals
  Reassign --> Leased
```

Upload & Conversion Pipeline

  1. Client uploads via tusd (resumable). Backend receives hooks (/api/v1/upload.tus-hook).
  2. Encrypted content is registered; converter workers derive preview/low/high (for media) or original (for binaries).
  3. Derivative metadata stored in DB and surfaced via /api/v1/content.view.
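
The hook handling in step 1 can be sketched as a plain dispatch function; the payload shape follows tusd's hook convention, while the size limit value and the return actions are illustrative assumptions:

```python
MAX_CONTENT_SIZE_MB = 2048  # assumed value; configured via MAX_CONTENT_SIZE_MB in practice

def handle_tus_hook(hook_name: str, payload: dict) -> dict:
    upload = payload.get("Upload", {})
    if hook_name == "pre-create":
        # Reject oversized uploads before any bytes are stored
        if upload.get("Size", 0) > MAX_CONTENT_SIZE_MB * 1024 * 1024:
            return {"RejectUpload": True}
        return {}
    if hook_name == "post-finish":
        # Register the content record and hand off to the converter workers
        return {"action": "enqueue_conversion", "upload_id": upload.get("ID")}
    return {}
```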

```mermaid
sequenceDiagram
  participant C as Client
  participant T as tusd
  participant B as Backend
  participant W as Workers
  participant DB as PostgreSQL

  C->>T: upload chunks
  T->>B: hooks (pre/post-finish)
  B->>DB: create content record
  B->>W: enqueue conversion
  W->>DB: store derivatives
  C->>B: GET /content.view
  B->>DB: resolve latest derivatives
  B-->>C: display_options + status
```

Content View & Purchase Flow

  • /api/v1/content.view/<content_address> resolves content and derivatives:
    • For binary content without previews: present original only when licensed.
    • For audio/video: use preview/low for unauth; decrypted_low/high for licensed users.
    • Frontend shows processing state when derivatives are pending.
  • Purchase options (TON/Stars) remain in a single row (UI constraint).
  • Cover art layout: fixed square slot; image fits without stretching; background follows page color, not black.
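
The resolution rules above can be sketched as a pure function; the derivative names and the `processing` sentinel are illustrative:

```python
def resolve_display_options(content_type, licensed, derivatives):
    """Map content type + license state to visible display options.

    `derivatives` maps derivative name -> ready (bool); names are assumed."""
    if not any(derivatives.values()):
        return ["processing"]
    if content_type == "binary":
        # Binary content without previews: original only when licensed
        return ["original"] if licensed else []
    if licensed:
        return [d for d in ("decrypted_low", "decrypted_high") if derivatives.get(d)]
    # Unlicensed media viewers get preview/low only
    return [d for d in ("preview", "low") if derivatives.get(d)]
```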

```mermaid
flowchart LR
  View[content.view] --> Resolve[Resolve encrypted/decrypted rows]
  Resolve --> Derivations{Derivatives ready?}
  Derivations -- No --> Status[processing/pending]
  Derivations -- Yes --> Options
  Options -- Binary + No License --> Hidden[Original hidden]
  Options -- Media + No License --> PreviewLow[Preview/Low]
  Options -- Licensed --> Full[Decrypted Low/High or Original]
```

Selected APIs

  • GET /api/system.version: liveness/protocol version.
  • POST /api/v1/network.handshake: signed membership exchange.
  • GET /api/v1/content.view/<content_address>: resolves display options, status, and downloadability.
  • GET /api/v1.5/storage/<file_hash>: static file access.
  • POST /api/v1/storage: legacy upload endpoint.

Data Keys & Schemas

  • MetaKey(content_id): tracks replica_leases, leader, conflict_log, revision.
  • MembershipKey(node_id): LWW-Set of members & receipts, HyperLogLog population, N_reports.
  • MetricKey(content_id, window_id): PN-/G-/HLL serialized state.

All DHT records are signed and merged via deterministic CRDT strategies + LWW dominance (logical_counter, timestamp, node_id).
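
A sketch of the LWW dominance rule, comparing (logical_counter, timestamp, node_id) lexicographically; the record shape is an assumption:

```python
def lww_key(record: dict):
    # Dominance order: logical_counter first, then timestamp, then node_id as tie-breaker
    return (record["logical_counter"], record["timestamp"], record["node_id"])

def merge_lww(a: dict, b: dict) -> dict:
    """Deterministic merge: the record with the greater dominance tuple wins."""
    return a if lww_key(a) >= lww_key(b) else b
```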


Configuration & Defaults

  • Network: NODE_PRIVACY, PUBLIC_HOST, HANDSHAKE_INTERVAL_SEC, TLS verify, IPFS peering.
  • DHT: DHT_MIN_RECEIPTS=5, DHT_MIN_REACHABILITY=0.6, DHT_MEMBERSHIP_TTL=600, DHT_REPLICATION_TARGET=3, DHT_LEASE_TTL=600, DHT_HEARTBEAT_INTERVAL=60, DHT_HEARTBEAT_MISS_THRESHOLD=3, DHT_MIN_ASN=3, DHT_MIN_IP_OCTETS=3, DHT_METRIC_WINDOW_SEC=3600.
  • Conversion resources: CONVERT_* limits (CPU/mem), MAX_CONTENT_SIZE_MB.
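
The DHT defaults above, collected into one illustrative config object; only the env-var names come from this document, the class and field names are assumptions:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class DHTConfig:
    # Defaults mirror the values listed above
    min_receipts: int = 5
    min_reachability: float = 0.6
    membership_ttl: int = 600
    replication_target: int = 3
    lease_ttl: int = 600
    heartbeat_interval: int = 60
    heartbeat_miss_threshold: int = 3
    metric_window_sec: int = 3600

    @classmethod
    def from_env(cls) -> "DHTConfig":
        # Override any default via its DHT_* environment variable
        return cls(
            min_receipts=int(os.getenv("DHT_MIN_RECEIPTS", 5)),
            min_reachability=float(os.getenv("DHT_MIN_REACHABILITY", 0.6)),
            membership_ttl=int(os.getenv("DHT_MEMBERSHIP_TTL", 600)),
            replication_target=int(os.getenv("DHT_REPLICATION_TARGET", 3)),
            lease_ttl=int(os.getenv("DHT_LEASE_TTL", 600)),
            heartbeat_interval=int(os.getenv("DHT_HEARTBEAT_INTERVAL", 60)),
            heartbeat_miss_threshold=int(os.getenv("DHT_HEARTBEAT_MISS_THRESHOLD", 3)),
            metric_window_sec=int(os.getenv("DHT_METRIC_WINDOW_SEC", 3600)),
        )
```

Note that the failover bound follows from these defaults: heartbeat_interval × heartbeat_miss_threshold = 180s.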

Observability & Metrics

Prometheus (exported in-process):

  • dht_replication_under / dht_replication_over / dht_leader_changes_total
  • dht_merge_conflicts_total
  • dht_view_count_total / dht_unique_view_estimate / dht_watch_time_seconds

Logs track replication conflict_log entries and HTTP structured errors (with session_id/error_id).


Sequence Diagrams (Consolidated)

Membership & N_estimate

```mermaid
sequenceDiagram
  participant A as Node A
  participant B as Node B
  A->>B: POST /network.handshake {nonce, ts, signature}
  B->>B: verify ts, nonce, signature
  B->>B: upsert member; store receipts
  B-->>A: {node, known_public_nodes, n_estimate, signature}
  A->>A: merge; recompute N_estimate = max(N_local, peers)
```
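
The "verify ts, nonce, signature" step can be sketched as freshness plus replay checks; the skew bound is an assumption, and Ed25519 signature verification is omitted from this sketch:

```python
import time

_seen_nonces = set()

def verify_handshake(ts: float, nonce: str, max_skew_sec: float = 120.0) -> bool:
    # Reject stale or future-dated timestamps (max_skew_sec is an assumed bound)
    if abs(time.time() - ts) > max_skew_sec:
        return False
    # Reject replayed nonces
    if nonce in _seen_nonces:
        return False
    _seen_nonces.add(nonce)
    return True
```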

Replication Leader Election

```mermaid
sequenceDiagram
  participant L as Leader
  participant Peers as Responsible Nodes
  L->>L: compute p from N_estimate
  L->>Peers: rendezvous scores for ContentID
  L->>L: assign leases (diversity)
  Peers-->>L: heartbeat every 60s
  L->>L: reassign on 3 misses (≤180s)
```

Metrics Publication

```mermaid
sequenceDiagram
  participant C as Client
  participant API as Backend
  participant M as MetricsAggregator
  participant D as DHT

  C->>API: GET content.view?watch_time&bytes_out
  API->>M: record_view(delta)
  M->>D: merge MetricKey(ContentID, Window)
  M->>API: update gauges
```

Run & Test

```shell
# Spin up services
docker compose -f /home/configs/docker-compose.yml --env-file /home/configs/.env up -d --build

# Backend unit tests (DHT integration)
cd uploader-bot
python3 -m unittest discover -s tests/dht
```