Why Apache Iceberg

Existing data map tools are already roughly aligned (they already use Parquet)
- Allows for schema evolution, which can be used to ship an intentionally limited v1 and then easily evolve to v2, v3, …
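
To make the schema evolution point concrete, here is a minimal sketch of the idea behind it: columns are tracked by stable field IDs rather than by name or position, so files written under an older schema stay readable under a newer one. The schemas, column names, and the `read_row` helper are hypothetical illustrations in plain Python, not part of any Iceberg library.

```python
# Minimal sketch of ID-based schema evolution (illustrative only, not an Iceberg API).
# Each column has a stable field ID, so adding a column does not require
# rewriting data files that were written under an older schema.

SCHEMA_V1 = {1: "id", 2: "x", 3: "y"}
SCHEMA_V2 = {1: "id", 2: "x", 3: "y", 4: "label"}  # v2 adds an optional column


def read_row(stored: dict[int, object], schema: dict[int, str]) -> dict[str, object]:
    """Project a row stored by field ID onto the given schema.

    Fields missing from the stored row (written under an older schema)
    come back as None; fields the schema does not know about are dropped.
    """
    return {name: stored.get(field_id) for field_id, name in schema.items()}


old_row = {1: 7, 2: 0.25, 3: -1.5}                # written under v1
new_row = {1: 8, 2: 0.5, 3: 2.0, 4: "cluster A"}  # written under v2

print(read_row(old_row, SCHEMA_V2))  # v1 data read by a v2 reader
print(read_row(new_row, SCHEMA_V1))  # v2 data read by a v1 reader
```

A v2 reader sees the new column as missing (`None`) in v1 rows, and a v1 reader simply ignores the column it does not know about, which is what makes an intentionally limited v1 a low-risk starting point.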
Geo data types are new in Iceberg v3 (being finalized in summer 2025)
- Based on GeoArrow
- GeoArrow parser for deck.gl: geoarrow/deck.gl-layers
- Useful data structures for representing objects on 2D maps
Web-app context:
- Very lightweight data web-app “stack”
  - icebird and friends
  - icebird + hyparquet = 85 kB + 10 kB
  - Way lighter than DuckDB or LanceDB and their ilk
Lakehouse context:
- Iceberg is a widely adopted table format for data lakehouses
- Static files (over HTTP or local file system)
  - DataMapPlot generates static file data map web apps
  - “Lakehouse” is simply a fancy term for static storage (“object store”), plus a little DB machinery
- As of early 2025, an Iceberg catalog is available as a mainstream managed service
  - “The R2 Data Catalog in open beta, a managed Apache Iceberg catalog built directly into your Cloudflare R2 bucket.”
  - We can expect the same from the other cloud providers soon
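
As a rough illustration of the “little DB machinery” that sits on top of static files: a reader starts from a table metadata JSON file, looks up the current snapshot, and follows that snapshot's manifest list down to the data files. The sketch below covers only the first hop. The field names `current-snapshot-id`, `snapshots`, `snapshot-id`, and `manifest-list` come from the Iceberg table spec; the sample metadata is heavily trimmed, and the bucket path and the `current_manifest_list` helper are hypothetical.

```python
import json

# Illustrative, heavily trimmed table metadata. A real metadata file has many
# more fields, and the bucket path here is made up.
METADATA_JSON = """
{
  "format-version": 2,
  "location": "s3://example-bucket/datamap",
  "current-snapshot-id": 2,
  "snapshots": [
    {"snapshot-id": 1, "manifest-list": "s3://example-bucket/datamap/metadata/snap-1.avro"},
    {"snapshot-id": 2, "manifest-list": "s3://example-bucket/datamap/metadata/snap-2.avro"}
  ]
}
"""


def current_manifest_list(metadata: dict) -> str:
    """Return the manifest-list path of the table's current snapshot."""
    current_id = metadata["current-snapshot-id"]
    for snapshot in metadata["snapshots"]:
        if snapshot["snapshot-id"] == current_id:
            return snapshot["manifest-list"]
    raise LookupError(f"snapshot {current_id} not found in metadata")


metadata = json.loads(METADATA_JSON)
print(current_manifest_list(metadata))
```

Everything in this walk is a plain file read, which is why the same table can be served from an object store, a static HTTP host, or a local directory. The catalog's job is mainly to say which metadata file is the current one.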

Network transport weight of various data stacks:

Of course, just reading tables takes far less machinery than a full SQL engine, but perhaps a SQL engine is overkill here…