mirror of
https://git.numenor-labs.us/dsfx.git
synced 2025-04-29 08:10:34 +00:00
Blobs
This document is under construction. We are actively refining its content, so please check back for updates.
Overview
In DSFX, file data is managed through a system of "blobs." A blob is a content-addressable, encrypted chunk of data that forms the basic unit of file transfer and storage within the DSFX network. By decoupling file management from the underlying data, DSFX can efficiently distribute files, eliminate duplicates, and maintain privacy, because servers never see the raw file content.
How Blobs Work
-
Partitioning and Padding:
- When a user uploads a file, the client application partitions the file into 1 MB chunks.
- If a file is smaller than 1 MB, it is padded to reach the standard chunk size.
- This uniform chunk size supports simpler handling on both the client and server sides.
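The partitioning step can be sketched as follows. This is a minimal illustration, not the DSFX client code: the function name `partition`, the reading of "1 MB" as 1,048,576 bytes, zero-byte padding, and padding the short final chunk of a larger file are all assumptions made for the example.

```python
CHUNK_SIZE = 1024 * 1024  # assumption: "1 MB" is taken to mean 1,048,576 bytes


def partition(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks, padding the last one if it is short."""
    chunks = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: short chunks are right-padded with zero bytes.
        chunks.append(chunk.ljust(chunk_size, b"\x00"))
    return chunks


# A tiny chunk size keeps the demonstration readable.
for chunk in partition(b"abcdefghij", chunk_size=4):
    print(chunk)
```

Because padding hides the original length, a client built this way would need to record the file size elsewhere (for instance in its manifest) to strip the padding on reconstruction. In this sketch an empty input yields no chunks at all; how DSFX treats empty files is not stated here.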
-
Encryption:
- Each 1 MB chunk is encrypted using appropriate cryptographic methods.
- Encryption ensures that even if a server handles the blob, it remains oblivious to its underlying content.
-
Content-Addressable Storage:
- Once a chunk is encrypted, the resulting blob is identified by the SHA-256 hash of its encrypted contents.
- Because the blob’s identifier is computed from its content, any change in the underlying data results in a new identifier.
- This design supports both integrity verification and de-duplication: identical data results in the same blob hash, allowing multiple files with overlapping content to share the same underlying blobs.
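To make the addressing scheme concrete, the sketch below derives an identifier from a blob's bytes using Python's standard `hashlib`. The helper name `blob_id` and the lowercase hex encoding of the digest are assumptions for illustration; the byte strings stand in for already-encrypted chunks.

```python
import hashlib


def blob_id(encrypted_chunk: bytes) -> str:
    """Return the SHA-256 digest of an encrypted chunk as lowercase hex."""
    return hashlib.sha256(encrypted_chunk).hexdigest()


first = blob_id(b"same bytes")
second = blob_id(b"same bytes")
changed = blob_id(b"same bytez")

print(first == second)   # identical content gives an identical identifier
print(first == changed)  # a one-byte change gives a different identifier
```

Note that two blobs share an identifier only if their encrypted bytes are identical, so de-duplication across files depends on the same plaintext chunk encrypting to the same ciphertext. Whether and how the encryption scheme guarantees that is not specified in this document.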
-
Server Role:
- DSFX servers operate only in terms of these blobs. They are responsible solely for receiving, storing, and transferring blobs as units of data.
- The servers do not hold or understand metadata about the file or its meaning. Their responsibilities are limited to blob distribution and management.
-
Client Responsibilities:
- The client application is in charge of managing the file manifest. This manifest maps files to the specific ordered list of blob identifiers.
- Clients track which blobs make up a file, ensuring that files can be reconstructed accurately when needed.
- This model also enables advanced storage operations, such as de-duplication across different files or even accounts.
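The division of labor between server and client described above can be sketched end to end. Everything here is hypothetical: `Manifest`, `BlobStore`, `upload`, and `download` are illustrative names, the in-memory dictionary stands in for a server, the `size` field is an assumed way to strip padding, and encryption is left out entirely so the example runs on the standard library alone.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical client-side record mapping one file to its ordered blobs."""
    name: str
    size: int  # original length, kept so padding can be removed later
    blob_ids: list[str] = field(default_factory=list)


class BlobStore:
    """Stand-in for a server: it knows blobs by ID and nothing about files."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        bid = hashlib.sha256(blob).hexdigest()
        self._blobs[bid] = blob  # storing an identical blob again changes nothing
        return bid

    def get(self, bid: str) -> bytes:
        blob = self._blobs[bid]
        if hashlib.sha256(blob).hexdigest() != bid:
            raise ValueError("blob failed integrity check")
        return blob

    def __len__(self) -> int:
        return len(self._blobs)


def upload(store: BlobStore, name: str, data: bytes, chunk_size: int = 4) -> Manifest:
    manifest = Manifest(name=name, size=len(data))
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size].ljust(chunk_size, b"\x00")
        # Encryption is omitted in this sketch; a real client would encrypt here.
        manifest.blob_ids.append(store.put(chunk))
    return manifest


def download(store: BlobStore, manifest: Manifest) -> bytes:
    data = b"".join(store.get(bid) for bid in manifest.blob_ids)
    return data[:manifest.size]  # drop the padding added during upload


store = BlobStore()
m1 = upload(store, "a.txt", b"AAAABBBBCC")
m2 = upload(store, "b.txt", b"AAAAZZZZ")
print(download(store, m1), download(store, m2), len(store))
```

The two files share their first chunk, so the store ends up holding four blobs rather than five, while each manifest still lists its own blobs in order.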
Future Directions
-
Reference Counting:
- In the future, we plan to explore mechanisms such as reference counting. This would allow multiple user accounts to reference the same underlying blob.
- A reference counting system can enable efficient storage utilization and simplify garbage collection of unused data.
-
Enhanced Distribution:
- As the DSFX network evolves, we envision more sophisticated techniques for blob distribution and caching, reducing network overhead and speeding up file retrieval.
-
Extended Metadata:
- Future refinements to manifest management may add metadata that improves the client's ability to manage, search, and organize content.
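Since reference counting is listed above only as a direction to explore, the following is a speculative sketch of the general technique rather than a description of any DSFX design. The class `RefCountIndex` and its methods are hypothetical names.

```python
class RefCountIndex:
    """Hypothetical index tracking how many references each blob ID has."""

    def __init__(self) -> None:
        self._refs: dict[str, int] = {}

    def acquire(self, blob_id: str) -> int:
        """Add one reference and return the new count."""
        self._refs[blob_id] = self._refs.get(blob_id, 0) + 1
        return self._refs[blob_id]

    def release(self, blob_id: str) -> bool:
        """Drop one reference; return True when the blob is now unreferenced."""
        if blob_id not in self._refs:
            raise KeyError(blob_id)
        self._refs[blob_id] -= 1
        if self._refs[blob_id] == 0:
            del self._refs[blob_id]
            return True  # caller may garbage-collect the blob
        return False

    def count(self, blob_id: str) -> int:
        return self._refs.get(blob_id, 0)


index = RefCountIndex()
index.acquire("blob-1")  # referenced by a first account
index.acquire("blob-1")  # referenced by a second account
print(index.release("blob-1"))  # still referenced elsewhere
print(index.release("blob-1"))  # last reference gone, safe to collect
```

Under this scheme a blob is eligible for garbage collection only when its last reference is released, which is what lets multiple accounts share one stored copy.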