Designing MeshPack: How Standards Emerge from Real 3D Model Workflows

Most standards fail in one of two ways. Some are too abstract: they describe a clean world that no real workflow inhabits. Others are too local: they encode one implementation so precisely that nobody else can adopt them without inheriting its accidents.

3D asset workflows sit exactly between those failures. A single “model” might be an STL file, an OBJ with a material file, a 3MF archive, a folder of texture maps, a render preview, a license file, a README, and several variants that only make sense together. The same asset may live in a local folder, on a NAS, in Google Drive, in S3-compatible storage, on Thingiverse, or inside a marketplace listing.

Mesh-Sync needs to operate in that world without pretending it is cleaner than it is. That is why I started defining MeshPack: a portable .meshpack format for representing snapshots of 3D asset libraries, metadata, and cross-platform indices. At the time of writing, the standard repository is private while it is being prepared for public release at Mesh-Sync/standard-meshpack.

The core idea is stable: MeshPack is not just a file format. It is the boundary where messy storage becomes a portable, verifiable object that workers, importers, viewers, and marketplace connectors can all reason about.

flowchart LR
    subgraph Sources[Messy storage world]
        A[Local folders]
        B[NAS and cloud drives]
        C[S3 buckets]
        D[Marketplace exports]
    end

    Sources --> E[Storage connector]
    E --> F[MeshPack snapshot]
    F --> G[Integrity checks]
    G --> H[Provider-agnostic workers]
    H --> I[Importers, viewers, marketplaces]

MeshPack is the handoff object: storage connectors create it, downstream systems verify and process it.

This article walks through the design process behind MeshPack: the container structure, sharded index model, integrity sidecar, conformance levels, extension strategy, delta packs, and governance model that make the standard practical instead of decorative. For the product-level reasoning behind this boundary, see the Mesh-Sync strategic architecture post.

The Workflow Problem

The naive model of 3D asset management is simple: scan files, classify them, and store results. That breaks as soon as you ask ordinary user questions:

Is dragon.stl the main model, a variant, or one part of an assembly?
Is license.txt attached to one file, one folder, or the whole library?
Do Logo.png and logo.png refer to different files, or did a case-insensitive filesystem collapse them?
Can a worker analyze the structure of a 500 GB library without downloading every heavy mesh?
How do you prove that a package was not altered between scanning and processing?
How does a marketplace connector preserve existing platform identities while adding Mesh-Sync identities?

These are not implementation details. They are standard-definition questions. If the format does not make them explicit, every importer, worker, marketplace connector, and viewer invents its own answer.

MeshPack exists to make those answers portable.

The Terms That Matter

Before getting into the structure, it helps to pin down four terms that otherwise sound heavier than they are.

Term	Plain-language meaning	Why it exists
Sharded index	The file list split into smaller JSON files	Large libraries stay readable without loading one giant document
Integrity sidecar	A separate `.meshpack.integrity` file	The authoritative archive hash cannot safely live inside the archive it hashes
Conformance level	A named implementation tier	A simple viewer should not need every marketplace feature
Extension envelope	Versioned, namespaced extra metadata	The core stays small while geometry, printability, and marketplace fields can evolve

Those four choices carry most of the standard’s practical weight.

What MeshPack Is

Physically, a .meshpack file is a ZIP archive with a custom extension. The alternative extension .mpack is also recognized. ZIP is not glamorous, but it is useful: browsers can read it, desktop tools can inspect it, language ecosystems have stable libraries for it, and users can extract it when something goes wrong.

The standard repository describes MeshPack as a container for snapshots of file systems or 3D asset collections. It separates Scanning from Processing.

Scanning is provider-specific. It touches credentials, source filesystem behavior, path normalization, case sensitivity, folder traversal, and file hashing.

Processing should be provider-agnostic. A thumbnail worker, geometry worker, metamodel detector, importer, or marketplace connector should receive a stable package, not a bundle of storage-provider assumptions.

flowchart TB
    A[.meshpack ZIP archive] --> B[Required core]
    A --> C[Optional enrichment]

    B --> D[manifest.json]
    B --> E[_README.md]
    B --> F[index/ shards]

    F --> F1[part-00001.json]
    F --> F2[part-00002.json]

    C --> G[resources/ previews and small files]
    C --> H[mapping.db local cache]
    C --> I[extension metadata]

The archive has a small required core and optional enrichment areas. Readers can start safely without understanding every extension.

The core archive layout is intentionally small:

root/
  manifest.json
  _README.md
  index/
    _README.md
    part-00001.json
    part-00002.json
  resources/
    _README.md
    <hex-hash>.jpg
  mapping.db

Only some of that is required. manifest.json, the root _README.md, the index/ directory, and its shards are the core. resources/ is optional and is used for small embedded content such as thumbnails, licenses, READMEs, or lightweight preview files. mapping.db is an optional local acceleration cache and is excluded from integrity verification because SQLite files are not deterministic across generators.

The required _README.md files are deliberate. An extracted MeshPack should explain itself to a person, not only to an SDK.

The Manifest Is The Entry Point

The manifest.json file is the first entry in the ZIP archive. That ordering is a standard requirement, not an aesthetic preference. Readers should be able to parse package metadata before walking through arbitrarily large archives.

A manifest captures the global context:

{
  "format_version": "1.0.0",
  "created_at": "2026-01-06T12:00:00Z",
  "workspace_id": "550e8400-e29b-41d4-a716-446655440000",
  "platform_info": {
    "os": "linux",
    "path_separator": "/",
    "is_case_sensitive": true
  },
  "index_summary": {
    "total_files": 15420,
    "total_shards": 2
  },
  "hash_algo": "sha256",
  "shard_list": [
    {
      "id": "part-00001",
      "entries_count": 10000,
      "entries_hash": "sha256:..."
    }
  ]
}

The point is not to make the manifest enormous. The point is to make it sufficient for safe loading. A reader can check the format version, understand the source platform, know how many shards to expect, decide which hash algorithm to use, and detect whether the package belongs to a Mesh-Sync workspace.

The platform_info field looks mundane until it prevents cross-platform corruption. A case-sensitive Linux source can legally contain both Wheel.stl and wheel.stl; a Windows target cannot treat that distinction casually. MeshPack does not solve every import conflict, but it preserves the information required for a responsible importer to make a decision.

Shards Make Large Libraries Practical

A 3D library can contain hundreds of thousands of files. A single giant JSON file would be simple to define and miserable to operate. MeshPack splits the file index into deterministic shards:

Shards live under index/.
Names use part-00001.json, part-00002.json, and so on.
Shard IDs are one-based and five-digit zero-padded.
Entries are sorted by path using UTF-8 byte order before sharding.
A shard must not exceed 10,000 entries.
A shard should stay below 5 MB where possible.

Each shard is an object, not a raw array. That lets it carry its own shard_id, format_version, entries_count, and entries_hash beside the entries.

{
  "shard_id": "part-00001",
  "format_version": "1.0.0",
  "entries_count": 10000,
  "entries_hash": "sha256:...",
  "entries": [
    {
      "path": "vehicles/cars/sports_car_v2.stl",
      "original_name": "Sports Car V2.stl",
      "size_bytes": 500021,
      "hash": "sha256:e3b0c44298...",
      "mime_type": "model/stl",
      "units": "mm",
      "up_axis": "+Z"
    }
  ]
}

This design is intentionally boring in the best sense. Deterministic ordering enables reproducible output and stable hashes. Shard limits make memory usage predictable. Separate shard hashes make corruption local instead of forcing the reader to treat the entire package as unknowable.

File Entries Preserve More Than Paths

At L1 conformance, a file entry needs only the basic facts: path, original name, size, content hash, and modification time. But MeshPack is designed for richer 3D workflows, so entries can also carry units, up axis, bounding boxes, identifiers, metamodel memberships, assembly transforms, preview references, and extension data.

The path rules are strict:

Paths use forward slashes.
Paths are relative to the scanned root.
Paths must not contain ...
Paths must not be absolute.
Paths must not contain drive letters.
Paths must not exceed 1024 characters.

That is not pedantry. A package format that may be extracted, indexed, or imported by different tools has to reject path traversal before it becomes a filesystem bug.

For 3D data, optional fields make the standard more than a backup inventory. units, up_axis, and bounding_box give downstream systems enough context to reason about geometry without opening the original model. metamodels lets one file participate in one or more logical assets. assembly captures parent-child relationships and transforms using a canonical right-handed, +Y-up coordinate system. preview_ref links to a resource preview when the pack contains one.

The standard keeps the core small while leaving room for progressively richer interpretation.

Integrity Cannot Live Inside The Archive

One of the most important MeshPack design choices came from a simple impossibility: you cannot compute a hash of an archive and store the authoritative value inside that same archive without changing the archive.

MeshPack solves this with a detached sidecar file:

library.meshpack
library.meshpack.integrity

The .meshpack.integrity file is authoritative for the package hash. The manifest.pack_hash field is only informational. Verifiers must use the sidecar.

flowchart TD
    A[library.meshpack.integrity] --> B[Read expected pack hash and size]
    C[library.meshpack] --> D[Compare actual archive size]
    D --> E[Hash archive content]
    E --> F{Pack hash matches?}
    F -->|No| X[Reject package]
    F -->|Yes| G[Load manifest shard list]
    G --> H[Hash canonical shard entries]
    H --> I{Shard hashes match?}
    I -->|No| X
    I -->|Yes| J[Verify signatures when present]
    J --> K[Accept verified package]

Integrity is a verification workflow, not a single field. The sidecar lets the archive be checked without mutating itself.

Verification has several layers:

Read the sidecar.
Compare the stated archive size with the actual size.
Compute the archive hash, excluding entries such as mapping.db.
Compare the computed hash to sidecar.pack_hash.
Verify signatures when present.
Compute each shard hash from canonical JSON and compare it with manifest.shard_list.

Shard hashes use RFC 8785 JSON Canonicalization Scheme. Ordinary JSON serialization is not deterministic enough: key ordering, whitespace, numeric representation, and encoding differences can make identical logical data produce different bytes. MeshPack defines the hash over canonical JSON of entries sorted by path.

This gives the standard a property that architecture documents often ask for but formats rarely implement cleanly: the package is portable, but still verifiable.

Three Conformance Levels

One mistake in standard design is to require every tool to implement the whole universe. MeshPack avoids that by defining three conformance levels.

flowchart BT
    L1[L1 Minimal reader<br/>Read manifest, load shards, reject unsafe paths]
    L2[L2 Standard generator/importer<br/>Generate packs, verify integrity, embed resources]
    L3[L3 Full ecosystem implementation<br/>Resolve identities, merge deltas, validate extensions]

    L1 --> L2 --> L3

Conformance levels keep the standard adoptable. A viewer, sync agent, and marketplace pipeline should not have the same obligations.

Level	Name	Target implementers	Capabilities
L1	Minimal	File browsers, viewers, archive tools	Read manifest, load shards, iterate entries, reject unsafe paths, tolerate unknown fields
L2	Standard	Sync agents, import/export tools, 3D viewers	Generate packs, compute shard hashes, verify sidecar integrity, embed resources, handle platform conversion
L3	Full ecosystem	Mesh-Sync platform, worker pipelines, marketplaces	Resolve identities, handle assemblies, merge delta packs, verify signatures, validate official extensions

This split is one of the most important governance decisions in the standard. A lightweight viewer should not need to understand Mesh-Sync marketplace identity. A Mesh-Sync importer, however, must understand namespaces, workspaces, metamodels, assemblies, delta operations, signatures, and extension schemas.

The standard lets both exist without pretending they are the same class of implementation.

Identity Is Namespaced

3D assets move between systems. A Thingiverse thing ID, a Mesh-Sync model UUID, a marketplace listing ID, and a local storage item ID can all point to related views of the same object. A single id field would collapse that context.

MeshPack uses namespaced identifiers:

{
  "ids": [
    { "ns": "meshsync:model", "id": "550e8400-e29b-41d4-a716-446655440001" },
    { "ns": "thingiverse:thing", "id": "12345" }
  ]
}

The Mesh-Sync ecosystem reserves namespaces such as meshsync:workspace, meshsync:library, meshsync:model, meshsync:metamodel, meshsync:storage-item, and meshsync:media. Other platforms can preserve their own identifiers without fighting for ownership of the core schema.

That decision turns identity reconciliation into an explicit operation. Importers can define precedence rules: prefer canonical Mesh-Sync model identity, then metamodel identity, then platform-specific identifiers, then content hash fallback.

Extensions Keep The Core Honest

The extension model is where MeshPack tries to stay practical without becoming a dumping ground.

The core schemas define the common denominator. Vendor-specific or experimental metadata lives under an extensions object with namespaced keys. Official Mesh-Sync extensions use versioned envelopes:

{
  "extensions": {
    "meshsync_geometry": {
      "v1": {
        "vertex_count": 15420,
        "face_count": 30000,
        "is_manifold": true,
        "volume_mm3": 45000.0
      }
    }
  }
}

The current official extension families include:

Extension	Purpose
`meshsync_geometry`	Vertex, face, edge, topology, surface area, and volume metadata
`meshsync_dependencies`	Material, texture, and reference file dependencies
`meshsync_printability`	FDM, SLA, SLS, and MJF-oriented print analysis
`meshsync_content`	Titles, descriptions, language, and tags for content or listings
`meshsync_thumbnails`	Static thumbnails, animated previews, and GLB preview references

The version envelope matters. A writer can emit v2 while preserving v1 for older readers. A reader can select the highest version it understands and ignore the rest. Breaking extension changes do not have to break the entire MeshPack core.

This also makes canonical field naming visible. For example, the geometry extension defines vertex_count, face_count, edge_count, and volume_mm3 as the authoritative names. Legacy names such as vertices, faces, or volume_cubic_mm are documented as non-conforming. That detail looks small until it prevents three services from silently disagreeing about the same measurement.

Delta Packs Treat Sync As A First-Class Problem

For synchronization, MeshPack supports delta packs. A delta pack sets generation.partial = true and points to a base_pack_hash. Entries carry explicit operations: add, modify, and delete.

flowchart LR
    A[Base MeshPack] --> C[Merge algorithm]
    B[Delta pack] --> C

    B --> B1[add entries]
    B --> B2[modify entries]
    B --> B3[delete entries]

    C --> D[Merged state]
    D --> E[Flatten before distribution]

Delta packs model sync directly, but first-version discipline keeps chains flat before distribution.

Unchanged files are absent from the delta and inherited from the base. The merge algorithm is intentionally simple: load the base entries into a map keyed by path, apply deletes and upserts from the delta, and treat the resulting map as the merged state.

The standard also rejects unbounded complexity at this stage. Delta-of-delta chains are not supported in the first version; they should be flattened before distribution. That is standards discipline: every powerful feature needs an operational cost ceiling.

Governance Is Part Of The Standard

The standard repository is not just a schema folder. It includes a governance model, versioning strategy, contribution process, generators, validation tooling, samples, and conformance tests.

Significant changes require an RFC process. The format uses SemVer:

Major versions are incompatible schema or structure changes.
Minor versions are backward-compatible additions.
Patch versions are clarifications and bug fixes.

The SDKs for Python, Rust, and TypeScript are generated from JSON Schemas. That is a crucial process choice. Generated SDKs make the schemas executable across the ecosystem. A change to the standard is not complete because a document was edited; it is complete when schemas validate, SDKs regenerate, generated packages compile, samples still work, and conformance tests pass.

The repository already contains a validator, extension schemas, and conformance tests across L1, L2, and L3. A standard without a test suite is only an agreement to argue later.

Why This Belongs In Mesh-Sync

MeshPack gives Mesh-Sync a stable middle layer between storage discovery and higher-level processing. A storage connector can create a package. A heuristic worker can inspect folder structure. A geometry worker can attach canonical measurements. A marketplace connector can preserve external identities. A backend importer can verify integrity before trusting anything.

flowchart TD
    A[Storage connectors] --> B[MeshPack scan snapshot]
    B --> C[Integrity and conformance validation]
    C --> D[Processing workers]
    D --> E[Metamodel and marketplace enrichment]
    E --> F[Mesh-Sync platform]

    B --> G[External viewers and tools]
    B --> H[Future public standard users]

The strategic benefit is not only technical portability. It is decision portability. Once the package format can carry the right facts, downstream systems can disagree less about what a 3D asset is.

That is the larger lesson from MeshPack: standards emerge from pressure. You notice what keeps breaking, name the boundary, make the invariants testable, and leave enough room for future workflows to grow without rewriting the foundation.

MeshPack is being prepared for public release at github.com/Mesh-Sync/standard-meshpack. The product it supports is Mesh-Sync, a platform for automated classification and marketplace-oriented processing of 3D model libraries. The worker side of the same architecture is covered in the contract-first worker platform post.