BinaryBeast uses FastCDC chunking, Blake3 hashing, and pack-based deduplication to efficiently store and transfer large binary files at scale.
A clean separation of concerns across CLI, daemon, and server — each optimized for its workload with dedicated data stores and communication protocols.
Cobra-powered command interface for asset upload, download, library management, and workspace sync. Communicates with daemon via Unix socket IPC.
Task orchestration engine with 4 specialized worker pools, pack accumulators, batch state tracking, and async SQLite persistence in WAL mode.
Gin-powered HTTP/2 REST API coordinating metadata in MongoDB and binary storage in S3/MinIO. Presigned URLs enable direct client-to-storage transfers.
Optimized multi-stage pipelines with single-pass hashing, cross-asset pack accumulation, and parallel presigned-URL transfers.
FastCDC splits files into 256KB–8MB variable chunks. Blake3 hashes both chunks and the full asset in a single streaming pass.
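The single-pass idea can be sketched in Go. This is a minimal illustration, not BinaryBeast's implementation: the Gear-style rolling hash stands in for FastCDC's boundary detection, `sha256` stands in for Blake3 (to stay in the standard library), and the size bounds are scaled down for the demo. The `0x280AE5C0` seed is taken from the config above; all function names are illustrative.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// Scaled-down bounds for the demo; the real pipeline uses 256KB min / 8MB max.
const (
	minSize = 256
	maxSize = 8192
	mask    = 0x3FF // boundary when the low 10 bits of the rolling hash are zero
)

// gear: a fixed pseudo-random byte-indexed table drives the rolling hash.
var gear [256]uint64

func init() {
	rng := rand.New(rand.NewSource(0x280AE5C0)) // seed from the chunker config
	for i := range gear {
		gear[i] = rng.Uint64()
	}
}

// cutPoint returns the length of the next chunk. Because the boundary depends
// only on content, identical content yields identical chunks in any file.
func cutPoint(data []byte) int {
	if len(data) <= minSize {
		return len(data)
	}
	end := len(data)
	if end > maxSize {
		end = maxSize
	}
	var h uint64
	for i := minSize; i < end; i++ {
		h = (h << 1) + gear[data[i]]
		if h&mask == 0 {
			return i + 1
		}
	}
	return end
}

// chunkAndHash splits data into content-defined chunks, hashing each chunk and
// the whole asset in one streaming pass (sha256 stands in for Blake3 here).
func chunkAndHash(data []byte) (chunkHashes []string, assetHash string) {
	asset := sha256.New()
	for len(data) > 0 {
		n := cutPoint(data)
		chunk := data[:n]
		asset.Write(chunk) // the same pass feeds the whole-asset digest
		chunkHashes = append(chunkHashes, fmt.Sprintf("%x", sha256.Sum256(chunk)))
		data = data[n:]
	}
	return chunkHashes, fmt.Sprintf("%x", asset.Sum(nil))
}

func main() {
	data := make([]byte, 100_000)
	rand.New(rand.NewSource(42)).Read(data)
	chunks, asset := chunkAndHash(data)
	fmt.Printf("chunks: %d, asset hash: %s…\n", len(chunks), asset[:16])
}
```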
Batch POST chunk hashes to server. Existing chunks skip upload — instant cross-asset deduplication at the chunk level.
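The server side of that exchange reduces to a set-membership filter. A minimal sketch, assuming the server keeps an index of known chunk hashes; the type and method names here are illustrative, not BinaryBeast's actual API:

```go
package main

import "fmt"

// chunkIndex models the server's view of which chunk hashes already exist.
type chunkIndex map[string]bool

// MissingChunks answers a batch existence check: the client POSTs all chunk
// hashes for an asset, and only hashes the server has never seen come back.
// Everything else is deduplicated away before a single byte is uploaded.
func (idx chunkIndex) MissingChunks(hashes []string) []string {
	missing := make([]string, 0, len(hashes))
	for _, h := range hashes {
		if !idx[h] {
			missing = append(missing, h)
		}
	}
	return missing
}

func main() {
	server := chunkIndex{"aaa": true, "bbb": true}
	// This asset shares two chunks with previously uploaded assets,
	// so only the third chunk needs to be uploaded.
	upload := server.MissingChunks([]string{"aaa", "bbb", "ccc"})
	fmt.Println(upload)
}
```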
Single-goroutine accumulator consolidates new chunks across multiple assets into ~100MB packs. 10x fewer S3 PUT requests.
CPU-bound workers compute Blake3 digest of assembled packs. Platform-optimized: ARM64 NEON on macOS, SIMD on Linux.
Direct presigned PUT to S3/MinIO — server never proxies data. Retry with exponential backoff on transient failures.
Server creates the asset record, generates a MsgPack download manifest, and stores it in S3 for instant future downloads.
Pre-computed MsgPack manifest retrieved via presigned URL. Contains pack hashes and byte-range write targets.
fallocate() on Linux reserves contiguous disk space instantly. Enables concurrent random-access writes without fragmentation.
Single batch request gets presigned GET URLs for all unique packs. Deduplicates pack references across manifest entries.
Parallel pack downloads from S3 with contiguous write merging. pwrite64 enables lock-free concurrent file writes.
Full-file Blake3 hash verification ensures bit-perfect reconstruction. Automatic cleanup on hash mismatch.
Every component is purpose-built for handling large binary assets efficiently, from hashing algorithms to network protocols.
Content-defined chunking with rolling hash boundary detection. Variable-size chunks (256KB–8MB) ensure identical content produces identical chunks regardless of file context.
avg 1MB · window 64B · seed 0x280AE5C0
Platform-optimized cryptographic hashing. ARM64 NEON on Apple Silicon, SIMD assembly on Linux x86_64. Single-pass hashing computes chunk and asset digests simultaneously.
256-bit · platform-optimized · streaming
Chunks consolidated into ~100MB packs via a single-goroutine accumulator. Cross-asset packing reduces S3 PUT requests by 10x while maintaining per-chunk addressability.
~100MB packs · cross-asset · 10x fewer PUTs
Direct client-to-S3 data transfers via presigned URLs. Server coordinates metadata only — never proxies binary data. Eliminates the server bandwidth bottleneck.
direct S3 · 1hr expiry · zero proxy
Dedicated pools for I/O-bound asset ops, network-bound chunk checks, CPU-bound pack building, and network-bound transfers. Each tuned for its workload characteristics.
asset · check · build · transfer
Write-Ahead Logging enables concurrent reads with serialized writes for the daemon task queue. A 30s busy timeout handles heavy concurrent workloads gracefully.
WAL · NORMAL sync · 10 connections
Managed 100MB reusable buffers with idle-timeout deallocation. A channel-based semaphore limits in-flight packs for bounded memory growth under load.
100MB buffers · semaphore · lazy alloc
fallocate() on Linux reserves contiguous disk space before download. Enables efficient pwrite64 random-access writes without filesystem fragmentation.
fallocate · pwrite64 · zero fragmentation
Native support for macOS (ARM64), Linux (amd64), and Windows. Platform-specific IPC (Unix sockets vs. named pipes), hashing, and file I/O optimizations.
darwin · linux · windows
JWT-based authentication with automatic token refresh via a background goroutine. JWKS caching with TTL avoids constant key fetches.
JWT · auto-refresh · JWKS cache
Pre-computed binary download manifests stored in S3 at upload time. Downloads skip chunk resolution entirely — just fetch the manifest and go.
binary · pre-computed · instant pulls
30-second rolling window with per-task-type I/O breakdown. Tracks upload/download speed, disk I/O, and cumulative throughput in real time.
30s window · per-task · real-time
BinaryBeast deduplicates at every level: chunk-level across assets, within-pack across concurrent uploads, and across versions via predecessor chains.
Modified 200 MB of a 2.4 GB file? Only the changed chunks upload. Same file uploaded by another user? Zero bytes transferred.
Four dedicated worker pools, each tuned to its workload profile. Tasks flow through a dispatcher to the right pool — no lock contention, no shared mutable state.
Single-goroutine pattern — no locks, no contention. Consolidates chunks from multiple concurrent uploads into optimal packs with 5-second timeout flush and drain-signal eager flush.
Batches download requests by pack hash for optimal network utilization. Flushes at 5,000 chunks or 2-second timeout. Tracks presigned URLs and maps write targets across files.
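The batching-by-pack-hash step reduces to grouping manifest entries and merging contiguous byte ranges. A sketch of that grouping logic; the field and function names are illustrative, not the actual manifest schema:

```go
package main

import (
	"fmt"
	"sort"
)

// manifestEntry maps a byte range within a pack to data needed for the
// download (field names are illustrative).
type manifestEntry struct {
	PackHash   string
	PackOffset int64
	Length     int64
}

type span struct{ Off, Len int64 }

// groupByPack deduplicates pack references across manifest entries and merges
// contiguous byte ranges, so each pack is fetched with as few ranged GETs as
// possible.
func groupByPack(entries []manifestEntry) map[string][]span {
	byPack := map[string][]span{}
	for _, e := range entries {
		byPack[e.PackHash] = append(byPack[e.PackHash], span{e.PackOffset, e.Length})
	}
	for hash, spans := range byPack {
		sort.Slice(spans, func(i, j int) bool { return spans[i].Off < spans[j].Off })
		merged := spans[:1]
		for _, s := range spans[1:] {
			last := &merged[len(merged)-1]
			if s.Off == last.Off+last.Len { // contiguous: extend instead of a new GET
				last.Len += s.Len
			} else {
				merged = append(merged, s)
			}
		}
		byPack[hash] = merged
	}
	return byPack
}

func main() {
	got := groupByPack([]manifestEntry{
		{"p1", 0, 4}, {"p1", 4, 4}, // contiguous in p1: one ranged GET
		{"p2", 8, 2}, {"p1", 16, 4}, // gap in p1: separate GET
	})
	fmt.Println(got) // map[p1:[{0 8} {16 4}] p2:[{8 2}]]
}
```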
Production-grade components chosen for performance, reliability, and operational simplicity.
Platform-specific optimizations via Go build tags. Each platform gets the best available primitives for hashing, I/O, and IPC.
Blake3 with ARM NEON acceleration. Unix domain socket IPC. mmap for memory-mapped file I/O.
SIMD-optimized Blake3. fallocate() for contiguous preallocation. Unix sockets. Full mmap support.
Named pipes for IPC. Pure Go Blake3. Truncate-based file preallocation with standard I/O fallbacks.
BinaryBeast handles the complexity of large binary file management so you can focus on what matters — building great software.