Appendix A: GIS Data Formats Reference

This appendix is a reference for the geospatial data formats most often used in Web GIS applications. Each entry covers the format's characteristics, use cases, advantages, and limitations, and most entries include code for working with the format in a web application.

Vector Data Formats

GeoJSON

Description: GeoJSON is a format for encoding geographic data structures using JavaScript Object Notation (JSON). It has become the de facto standard for web-based geospatial applications due to its simplicity and native JavaScript support.

Structure:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [-122.4194, 37.7749]
      },
      "properties": {
        "name": "San Francisco",
        "population": 883305
      }
    }
  ]
}

Supported Geometries:

  • Point

  • LineString

  • Polygon

  • MultiPoint

  • MultiLineString

  • MultiPolygon

  • GeometryCollection
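These types differ mainly in how deeply the `coordinates` array is nested: a Point holds one position, a LineString an array of positions, a Polygon an array of rings, and so on. As a rough illustration, the sketch below walks that nesting to compute a bounding box for any geometry. The helper name `geometryBBox` is made up for this example and is not part of any library; it assumes valid input.

```javascript
// Compute [minX, minY, maxX, maxY] for any GeoJSON geometry.
// Hypothetical helper for illustration; assumes well-formed input.
function geometryBBox(geometry) {
  const bbox = [Infinity, Infinity, -Infinity, -Infinity];

  function visitCoords(coords) {
    if (typeof coords[0] === "number") {
      // A single position: [longitude, latitude, (optional elevation)]
      bbox[0] = Math.min(bbox[0], coords[0]);
      bbox[1] = Math.min(bbox[1], coords[1]);
      bbox[2] = Math.max(bbox[2], coords[0]);
      bbox[3] = Math.max(bbox[3], coords[1]);
    } else {
      // An array of positions, rings, or polygons: go one level deeper
      coords.forEach(visitCoords);
    }
  }

  function visitGeometry(g) {
    if (g.type === "GeometryCollection") {
      g.geometries.forEach(visitGeometry);
    } else {
      visitCoords(g.coordinates);
    }
  }

  visitGeometry(geometry);
  return bbox;
}

const samplePolygon = {
  type: "Polygon",
  coordinates: [
    [
      [-122.4, 37.8],
      [-122.3, 37.8],
      [-122.3, 37.7],
      [-122.4, 37.7],
      [-122.4, 37.8],
    ],
  ],
};

console.log(geometryBBox(samplePolygon)); // [-122.4, 37.7, -122.3, 37.8]
```

The same recursion works unchanged for Multi* geometries because they only add one more level of nesting.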

Use Cases:

  • Web mapping applications

  • REST API responses

  • Client-side data processing

  • Real-time data streaming

Advantages:

  • Native JavaScript support

  • Human-readable format

  • Widespread adoption

  • Good compression with gzip

Limitations:

  • No styling information

  • Coordinates limited to WGS 84 longitude/latitude under RFC 7946

  • Can be verbose for large datasets

  • No topology support

JavaScript Example:

// Creating GeoJSON
const geoJSON = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      geometry: {
        type: "Polygon",
        coordinates: [
          [
            [-122.4, 37.8],
            [-122.3, 37.8],
            [-122.3, 37.7],
            [-122.4, 37.7],
            [-122.4, 37.8],
          ],
        ],
      },
      properties: {
        name: "Sample Area",
        area: 100,
      },
    },
  ],
};

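// Validating GeoJSON (shallow check: only confirms a known "type" value)
const GEOJSON_TYPES = [
  "Feature",
  "FeatureCollection",
  "GeometryCollection",
  "Point",
  "MultiPoint",
  "LineString",
  "MultiLineString",
  "Polygon",
  "MultiPolygon",
];

function isValidGeoJSON(obj) {
  // Bare geometries are valid GeoJSON objects too, not just features
  return obj != null && GEOJSON_TYPES.includes(obj.type);
}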

Shapefile

Description: Developed by Esri, Shapefiles are a widely used vector data format consisting of multiple files that together store geometry, attributes, and spatial indexing information.

File Components:

  • .shp - Main file containing geometry

  • .shx - Shape index file

  • .dbf - Attribute database file

  • .prj - Projection information

  • .cpg - Code page specification

  • .sbn/.sbx - Spatial index files

Use Cases:

  • Desktop GIS data exchange

  • Government data distribution

  • Legacy system integration

  • Bulk data import/export

Advantages:

  • Universal GIS support

  • Compact file size

  • Fast spatial indexing

  • Mature format with extensive tooling

Limitations:

  • Multiple file requirement

  • 2GB file size limit

  • Limited attribute types

  • Column name length restrictions (10 characters)
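The 10-character column name limit matters when exporting GeoJSON properties to a Shapefile, because long property names must be shortened without colliding. The sketch below shows one possible scheme: truncate, then replace trailing characters with a counter when a name is already taken. The function name `toDbfFieldNames` is hypothetical, and treating names as case-insensitive is an assumption; real export tools apply their own truncation rules.

```javascript
// Shorten property names to the 10-character DBF limit, keeping them unique.
// Hypothetical helper; the numbering scheme is an illustrative choice.
function toDbfFieldNames(names) {
  const used = new Set();

  return names.map((name) => {
    let candidate = name.slice(0, 10);
    let counter = 1;

    // On a collision, make room for a numeric suffix and try again
    while (used.has(candidate.toLowerCase())) {
      const suffix = String(counter++);
      candidate = name.slice(0, 10 - suffix.length) + suffix;
    }

    used.add(candidate.toLowerCase());
    return candidate;
  });
}

console.log(toDbfFieldNames(["population_2020", "population_2021", "name"]));
// ["population", "populatio1", "name"]
```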

Processing Example:

// Using the shapefile package to read Shapefiles in the browser
// (the package exposes named exports, so import the whole namespace)
import * as shapefile from "shapefile";

async function readShapefile(shpBuffer, dbfBuffer) {
  const source = await shapefile.open(shpBuffer, dbfBuffer);
  const features = [];

  let result = await source.read();
  while (!result.done) {
    features.push(result.value);
    result = await source.read();
  }

  return {
    type: "FeatureCollection",
    features: features,
  };
}

TopoJSON

Description: TopoJSON is an extension of GeoJSON that encodes topology, allowing for more compact representation of geographic data by eliminating redundant boundary information.

Structure:

{
  "type": "Topology",
  "arcs": [
    [
      [0, 0],
      [1, 0],
      [1, 1]
    ],
    [
      [1, 1],
      [0, 1],
      [0, 0]
    ]
  ],
  "objects": {
    "counties": {
      "type": "GeometryCollection",
      "geometries": [
        {
          "type": "Polygon",
          "arcs": [[0, 1]]
        }
      ]
    }
  }
}
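The structure above omits the optional `transform` member. When a topology is quantized, it carries a `transform` with `scale` and `translate` arrays, and each arc is stored as integers where the first position is absolute and every later position is a delta from the previous one. The sketch below decodes one such arc by hand to show the idea; in practice `topojson-client` does this internally. The `decodeArc` name and the sample values are invented for illustration.

```javascript
// Decode one quantized, delta-encoded TopoJSON arc into real coordinates.
// Hypothetical helper; sample transform and arc values are made up.
function decodeArc(arc, transform) {
  const [scaleX, scaleY] = transform.scale;
  const [translateX, translateY] = transform.translate;
  let x = 0;
  let y = 0;

  return arc.map(([dx, dy]) => {
    // Accumulate deltas to recover the quantized position
    x += dx;
    y += dy;
    // Apply the transform to recover the original coordinate
    return [x * scaleX + translateX, y * scaleY + translateY];
  });
}

const sampleTransform = { scale: [0.5, 0.5], translate: [10, 20] };
const sampleArc = [
  [0, 0],
  [2, 0],
  [0, 2],
  [-2, 0],
  [0, -2],
];

console.log(decodeArc(sampleArc, sampleTransform));
// [[10, 20], [11, 20], [11, 21], [10, 21], [10, 20]]
```

Storing small integer deltas instead of full floating-point coordinates is a large part of why quantized TopoJSON files compress well.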

Use Cases:

  • Large geographic datasets

  • Administrative boundaries

  • Choropleth maps

  • Data with shared boundaries

Advantages:

  • Significant file size reduction (often 80% smaller)

  • Preserves topology

  • Eliminates gaps and overlaps

  • Good for administrative boundaries

Limitations:

  • More complex structure

  • Requires processing before use

  • Needs a client library such as topojson-client to decode in the browser

  • Not suitable for all geometry types

JavaScript Example:

import * as topojson from "topojson-client";

// Convert TopoJSON to GeoJSON
function topoToGeo(topology, objectName) {
  return topojson.feature(topology, topology.objects[objectName]);
}

// Get mesh (boundaries)
function getMesh(topology, objectName) {
  return topojson.mesh(topology, topology.objects[objectName]);
}

GPX (GPS Exchange Format)

Description: GPX is an XML schema for GPS data exchange, commonly used for tracks, routes, and waypoints from GPS devices.

Structure:

<?xml version="1.0"?>
<gpx version="1.1" creator="GPS device" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>Sample Track</name>
    <trkseg>
      <trkpt lat="37.7749" lon="-122.4194">
        <ele>100</ele>
        <time>2023-01-01T12:00:00Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>

Elements:

  • <wpt> - Waypoints

  • <rte> - Routes

  • <trk> - Tracks

Use Cases:

  • GPS data import/export

  • Hiking and cycling applications

  • Fleet tracking

  • Fitness applications

JavaScript Example:

// Parse GPX to GeoJSON
function gpxToGeoJSON(gpxString) {
  const parser = new DOMParser();
  const gpx = parser.parseFromString(gpxString, "text/xml");

  const features = [];

  // Extract tracks
  const tracks = gpx.querySelectorAll("trk");
  tracks.forEach((track) => {
    const segments = track.querySelectorAll("trkseg");
    segments.forEach((segment) => {
      const points = segment.querySelectorAll("trkpt");
      const coordinates = Array.from(points).map((point) => [
        parseFloat(point.getAttribute("lon")),
        parseFloat(point.getAttribute("lat")),
      ]);

      features.push({
        type: "Feature",
        geometry: {
          type: "LineString",
          coordinates: coordinates,
        },
        properties: {
          name: track.querySelector("name")?.textContent || "Track",
        },
      });
    });
  });

  return { type: "FeatureCollection", features };
}

KML (Keyhole Markup Language)

Description: KML is an XML-based format for displaying geographic data, originally developed by Keyhole, Inc. for the viewer that became Google Earth and later adopted as an OGC standard. It supports both vector and raster data with rich styling options.

Structure:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>San Francisco</name>
      <Point>
        <coordinates>-122.4194,37.7749,0</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>

Features:

  • Rich styling support

  • Time-based data

  • Network links

  • Ground overlays

  • 3D models

Use Cases:

  • Google Earth integration

  • Data visualization with styling

  • Time-based animations

  • 3D geographic data

Advantages:

  • Rich styling capabilities

  • 3D support

  • Time-based data

  • Google ecosystem integration

Limitations:

  • XML verbosity

  • Complex specification

  • Limited web support

  • Google-centric features

Raster Data Formats

GeoTIFF

Description: GeoTIFF is a TIFF image format that includes geographic metadata, making it suitable for storing georeferenced raster data.

Characteristics:

  • Embedded coordinate system information

  • Multiple bands support

  • Compression options (LZW, JPEG, etc.)

  • Tiling support for large images

Use Cases:

  • Satellite imagery storage

  • Digital elevation models

  • Orthophotos

  • Scientific raster data

JavaScript Example:

// Using geotiff.js to read a GeoTIFF in the browser
import { fromUrl } from "geotiff";

async function readGeoTIFF(url) {
  const tiff = await fromUrl(url);
  const image = await tiff.getImage();

  const bbox = image.getBoundingBox();
  const rasters = await image.readRasters();

  return {
    bbox: bbox,
    width: image.getWidth(),
    height: image.getHeight(),
    data: rasters[0], // First band
  };
}

PNG/JPEG with World Files

Description: Standard image formats (PNG, JPEG) paired with world files that contain georeferencing information.

World File Format:

0.000100000000  // Pixel size in x direction
0.000000000000  // Rotation about y-axis
0.000000000000  // Rotation about x-axis
-0.000100000000 // Pixel size in y direction
-122.419400000  // X coordinate of upper left pixel center
37.774900000    // Y coordinate of upper left pixel center

File Extensions:

  • .pgw - PNG world file

  • .jgw - JPEG world file

  • .tfw - TIFF world file
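The six world file values define an affine transform from pixel indices to map coordinates. Reading the lines in order as A, D, B, E, C, F, a pixel at (column, row) maps to x = A·column + B·row + C and y = D·column + E·row + F. The sketch below applies that transform, assuming the file holds only the six numbers (the `//` annotations shown above are explanatory and would not appear in a real file). The function names `parseWorldFile` and `pixelToMap` are made up for this example.

```javascript
// Parse the six numeric lines of a world file into affine coefficients.
// Hypothetical helpers for illustration.
function parseWorldFile(text) {
  const values = text.trim().split(/\s+/).map(Number);
  if (values.length !== 6 || values.some((v) => Number.isNaN(v))) {
    throw new Error("A world file must contain exactly six numbers");
  }
  // Line order in the file is A, D, B, E, C, F
  const [A, D, B, E, C, F] = values;
  return { A, B, C, D, E, F };
}

// Map a pixel (column, row) to the coordinates of that pixel's center.
function pixelToMap({ A, B, C, D, E, F }, column, row) {
  return [A * column + B * row + C, D * column + E * row + F];
}

const worldFile = parseWorldFile(
  "0.0001\n0.0\n0.0\n-0.0001\n-122.4194\n37.7749\n"
);

console.log(pixelToMap(worldFile, 0, 0)); // [-122.4194, 37.7749]
console.log(pixelToMap(worldFile, 100, 50)); // roughly [-122.4094, 37.7699]
```

The negative y pixel size reflects that image rows count downward while map y coordinates increase northward.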

Use Cases:

  • Simple georeferenced images

  • Web map tiles

  • Aerial photography

  • Historical maps

Tile Formats

Slippy Map Tiles (XYZ)

Description: The standard tiling scheme used by most web mapping services, organized in a pyramid structure with zoom/x/y addressing.

URL Pattern:

https://tile.server.com/{z}/{x}/{y}.png

Characteristics:

  • 256x256 pixel tiles (typically)

  • Web Mercator projection (EPSG:3857)

  • Zoom levels 0-18+

  • PNG or JPEG format
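Filling in the URL pattern is a plain string substitution, with the one constraint that at zoom level z the x and y indices both run from 0 to 2^z − 1. A minimal sketch, where `tileUrl` is a hypothetical helper and the template is the placeholder URL from above:

```javascript
// Expand a {z}/{x}/{y} URL template for one tile.
// Hypothetical helper; rejects indices outside the grid for that zoom.
function tileUrl(template, z, x, y) {
  if (!Number.isInteger(z) || z < 0) {
    throw new RangeError(`Invalid zoom level: ${z}`);
  }
  const tilesPerAxis = 2 ** z;
  if (x < 0 || x >= tilesPerAxis || y < 0 || y >= tilesPerAxis) {
    throw new RangeError(`Tile ${z}/${x}/${y} is outside the grid`);
  }
  return template
    .replace("{z}", String(z))
    .replace("{x}", String(x))
    .replace("{y}", String(y));
}

console.log(tileUrl("https://tile.server.com/{z}/{x}/{y}.png", 12, 655, 1583));
// https://tile.server.com/12/655/1583.png
```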

Tile Calculation:

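// Returns the [x, y] indices of the tile containing a point
// (valid for latitudes within about ±85.0511 degrees)
function deg2tile(lat, lon, zoom) {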
  const lat_rad = (lat * Math.PI) / 180;
  const n = Math.pow(2, zoom);
  const x = Math.floor(n * ((lon + 180) / 360));
  const y = Math.floor(
    (n * (1 - Math.log(Math.tan(lat_rad) + 1 / Math.cos(lat_rad)) / Math.PI)) /
      2
  );
  return [x, y];
}

// Returns the [lat, lon] of the tile's north-west corner
function tile2deg(x, y, zoom) {
  const n = Math.pow(2, zoom);
  const lon = (x / n) * 360 - 180;
  const lat_rad = Math.atan(Math.sinh(Math.PI * (1 - (2 * y) / n)));
  const lat = (lat_rad * 180) / Math.PI;
  return [lat, lon];
}
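The `tile2deg` function above yields only a tile's north-west corner. A tile's full extent follows from evaluating the same formulas at (x, y) and (x + 1, y + 1). The sketch below does that in a self-contained way; `tileBounds` is a hypothetical name, and the [west, south, east, north] ordering is chosen to match the GeoJSON bbox convention.

```javascript
// Geographic extent of an XYZ tile as [west, south, east, north] in degrees.
// Hypothetical helper using the standard Web Mercator tile formulas.
function tileBounds(x, y, zoom) {
  const n = 2 ** zoom;
  const lonAt = (tileX) => (tileX / n) * 360 - 180;
  const latAt = (tileY) =>
    (Math.atan(Math.sinh(Math.PI * (1 - (2 * tileY) / n))) * 180) / Math.PI;

  // Tile rows count downward, so y is the northern edge and y + 1 the southern
  return [lonAt(x), latAt(y + 1), lonAt(x + 1), latAt(y)];
}

console.log(tileBounds(0, 0, 0));
// [-180, about -85.0511, 180, about 85.0511]
```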

Vector Tiles (MVT)

Description: Mapbox Vector Tiles (MVT) format stores vector data in a tile-based structure, enabling dynamic styling and interactive features.

Characteristics:

  • Protocol Buffer encoding

  • Multiple layers per tile

  • Feature attributes preserved

  • Dynamic styling capability

Structure:

// MVT layer structure
{
  version: 2,
  name: "roads",
  extent: 4096,
  keys: ["name", "highway"],
  values: ["Main St", "primary"],
  features: [
    {
      id: 1,
      tags: [0, 0, 1, 1], // Key-value pairs
      type: 2, // LineString
      geometry: [9, 4, 4, 18, 0, 16, 16, 0] // Encoded commands: a line through (2,2), (2,10), (10,10)
    }
  ]
}
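The `tags` and `geometry` arrays are compact integer encodings. `tags` alternates indices into the layer's `keys` and `values` arrays. `geometry` is a stream of command integers, where the low 3 bits give the command (1 = MoveTo, 2 = LineTo, 7 = ClosePath) and the remaining bits a repeat count, followed by zigzag-encoded coordinate deltas. The sketch below decodes both by hand for illustration, assuming well-formed input; libraries such as `@mapbox/vector-tile` normally handle this. The function names are invented for this example.

```javascript
// Zigzag decoding maps 0, 1, 2, 3, 4... to 0, -1, 1, -2, 2...
function zigzagDecode(n) {
  return (n >> 1) ^ -(n & 1);
}

// Decode an MVT geometry command stream into arrays of [x, y] tile coordinates.
// Hypothetical helper; assumes the stream is well formed.
function decodeGeometry(geometry) {
  const paths = [];
  let path = null;
  let x = 0;
  let y = 0;
  let i = 0;

  while (i < geometry.length) {
    const command = geometry[i] & 0x7;
    const count = geometry[i] >> 3;
    i++;

    for (let c = 0; c < count; c++) {
      if (command === 1 || command === 2) {
        // MoveTo and LineTo carry a (dx, dy) pair relative to the cursor
        x += zigzagDecode(geometry[i++]);
        y += zigzagDecode(geometry[i++]);
        if (command === 1) {
          path = [];
          paths.push(path);
        }
        path.push([x, y]);
      } else if (command === 7) {
        // ClosePath: close the ring by repeating its first point
        path.push(path[0].slice());
      } else {
        throw new Error(`Unknown command id: ${command}`);
      }
    }
  }
  return paths;
}

// Resolve alternating key/value indices into a properties object.
function decodeTags(tags, keys, values) {
  const properties = {};
  for (let i = 0; i < tags.length; i += 2) {
    properties[keys[tags[i]]] = values[tags[i + 1]];
  }
  return properties;
}

console.log(decodeGeometry([9, 4, 4, 18, 0, 16, 16, 0]));
// [[[2, 2], [2, 10], [10, 10]]]
console.log(decodeTags([0, 0, 1, 1], ["name", "highway"], ["Main St", "primary"]));
// { name: "Main St", highway: "primary" }
```

The decoded coordinates are in tile-local units, where 0 to `extent` (4096 here) spans the tile.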

Use Cases:

  • Interactive web maps

  • Custom styling

  • High-performance rendering

  • Offline mapping

JavaScript Example:

// Decoding MVT with @mapbox/vector-tile (VectorTile is a named export)
import { VectorTile } from "@mapbox/vector-tile";
import Protobuf from "pbf";

function decodeMVT(buffer) {
  const tile = new VectorTile(new Protobuf(buffer));
  const layers = {};

  for (const layerName in tile.layers) {
    const layer = tile.layers[layerName];
    const features = [];

    for (let i = 0; i < layer.length; i++) {
      const feature = layer.feature(i);
      features.push({
        geometry: feature.loadGeometry(),
        properties: feature.properties,
        type: feature.type,
      });
    }

    layers[layerName] = features;
  }

  return layers;
}

MBTiles

Description: MBTiles is a specification for storing tiled map data in SQLite databases, commonly used for offline mapping applications.

Database Schema:

CREATE TABLE tiles (
  zoom_level INTEGER,
  tile_column INTEGER,
  tile_row INTEGER, -- TMS scheme: row 0 is the southernmost row
  tile_data BLOB,
  PRIMARY KEY (zoom_level, tile_column, tile_row)
);

CREATE TABLE metadata (
  name TEXT,
  value TEXT
);
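One detail that often causes confusion: the MBTiles specification numbers `tile_row` using the TMS scheme, where row 0 is at the bottom, while XYZ tile URLs count y from the top. Looking up an XYZ tile therefore requires flipping the row. The sketch below builds a parameterized query for such a lookup; the helper names are hypothetical, and executing the query depends on whichever SQLite binding the application uses.

```javascript
// Convert an XYZ row index to the TMS row index used by MBTiles.
function xyzToTmsRow(z, y) {
  return 2 ** z - 1 - y;
}

// Build a parameterized lookup for one XYZ tile in an MBTiles database.
// Hypothetical helper; returns SQL text and parameters without running them.
function tileQuery(z, x, y) {
  return {
    sql:
      "SELECT tile_data FROM tiles " +
      "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
    params: [z, x, xyzToTmsRow(z, y)],
  };
}

console.log(tileQuery(12, 655, 1583).params); // [12, 655, 2512]
```

The flip is its own inverse, so the same function converts TMS rows back to XYZ rows.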

Metadata Fields:

  • name: Tileset name

  • format: png, jpg, pbf, etc.

  • bounds: Bounding box

  • center: Center point and zoom

  • minzoom/maxzoom: Zoom range

Use Cases:

  • Offline mapping

  • Mobile applications

  • Tile storage and distribution

  • Custom tile servers

Data Exchange Formats

CSV with Coordinates

Description: Comma-separated values format with coordinate columns, simple but widely supported for point data.

Example:

name,latitude,longitude,population
San Francisco,37.7749,-122.4194,883305
Los Angeles,34.0522,-118.2437,3971883

Use Cases:

  • Simple point data import

  • Spreadsheet integration

  • Lightweight data exchange

  • Non-GIS user accessibility

JavaScript Example:

function csvToGeoJSON(csvData) {
  // Naive parser: assumes no quoted fields or embedded commas
  const lines = csvData.trim().split(/\r?\n/);
  const headers = lines[0].split(",").map((header) => header.trim());
  const features = [];

  for (let i = 1; i < lines.length; i++) {
    const values = lines[i].split(",");
    if (values.length < headers.length) continue;

    const properties = {};
    let lat, lng;

    headers.forEach((header, index) => {
      const value = values[index].trim();
      if (["lat", "latitude", "y"].includes(header.toLowerCase())) {
        lat = parseFloat(value);
      } else if (
        ["lng", "lon", "longitude", "x"].includes(header.toLowerCase())
      ) {
        lng = parseFloat(value);
      } else {
        // Keep empty and non-numeric values as strings
        properties[header] =
          value === "" || isNaN(value) ? value : parseFloat(value);
      }
    });

    // Number.isFinite accepts 0, which a truthiness check would reject
    if (Number.isFinite(lat) && Number.isFinite(lng)) {
      features.push({
        type: "Feature",
        geometry: {
          type: "Point",
          coordinates: [lng, lat],
        },
        properties: properties,
      });
    }
  }

  return { type: "FeatureCollection", features };
}
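The `split(",")` call in the function above breaks on quoted fields such as `"Washington, D.C."`. For real-world files a dedicated CSV library is usually the safer choice, but the sketch below shows the idea of a quote-aware splitter for a single line. The name `splitCsvLine` is hypothetical, and the function assumes a field never spans multiple lines.

```javascript
// Split one CSV line into fields, honoring double-quoted fields.
// Hypothetical helper; does not handle fields that contain line breaks.
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        // A doubled quote inside a quoted field is a literal quote
        field += '"';
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }

  fields.push(field);
  return fields;
}

console.log(splitCsvLine('"Washington, D.C.",38.9072,-77.0369'));
// ["Washington, D.C.", "38.9072", "-77.0369"]
```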

Well-Known Text (WKT)

Description: WKT is a text markup language for representing vector geometry objects, commonly used in spatial databases and GIS applications.

Geometry Examples:

POINT (-122.4194 37.7749)
LINESTRING (-122.4 37.8, -122.3 37.7, -122.2 37.6)
POLYGON ((-122.4 37.8, -122.3 37.8, -122.3 37.7, -122.4 37.7, -122.4 37.8))
MULTIPOINT ((-122.4 37.8), (-122.3 37.7))

Use Cases:

  • Database geometry columns

  • Spatial SQL queries

  • Data validation

  • Geometry debugging

JavaScript Example:

// Simple WKT parser for basic geometries
function parseWKT(wkt) {
  wkt = wkt.trim();

  if (wkt.startsWith("POINT")) {
    // Split on any run of whitespace so extra spaces do not yield NaN
    const coords = wkt.match(/\(([^)]+)\)/)[1].trim().split(/\s+/);
    return {
      type: "Point",
      coordinates: [parseFloat(coords[0]), parseFloat(coords[1])],
    };
  }

  if (wkt.startsWith("LINESTRING")) {
    const coordString = wkt.match(/\(([^)]+)\)/)[1];
    const coordinates = coordString.split(",").map((coord) => {
      const [x, y] = coord.trim().split(/\s+/);
      return [parseFloat(x), parseFloat(y)];
    });
    return {
      type: "LineString",
      coordinates: coordinates,
    };
  }

  // Add more geometry types as needed
  throw new Error("Unsupported WKT geometry type");
}
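As one way to extend the parser above, the sketch below handles 2D `POLYGON` text, including interior rings (holes). It is kept separate and self-contained; `parseWKTPolygon` is a hypothetical name, and the function assumes single-line input without Z or M suffixes.

```javascript
// Parse a 2D WKT POLYGON into a GeoJSON Polygon geometry.
// Hypothetical helper; rejects POLYGON EMPTY and Z/M variants.
function parseWKTPolygon(wkt) {
  const match = wkt.trim().match(/^POLYGON\s*\(\s*(.*)\s*\)$/i);
  if (!match) {
    throw new Error("Not a 2D WKT POLYGON");
  }

  // Each ring is a parenthesized, comma-separated list of "x y" pairs
  const rings = match[1].match(/\(([^()]*)\)/g) || [];

  return {
    type: "Polygon",
    coordinates: rings.map((ring) =>
      ring
        .slice(1, -1) // drop the surrounding parentheses
        .split(",")
        .map((pair) => pair.trim().split(/\s+/).map(Number))
    ),
  };
}

const parsed = parseWKTPolygon(
  "POLYGON ((-122.4 37.8, -122.3 37.8, -122.3 37.7, -122.4 37.7, -122.4 37.8))"
);
console.log(parsed.coordinates[0].length); // 5
console.log(parsed.coordinates[0][0]); // [-122.4, 37.8]
```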

Format Conversion Examples

Universal Format Converter

class GeoDataConverter {
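  // Dispatches to the standalone converters defined earlier in this appendix
  static async convertToGeoJSON(data, format) {
    switch (format.toLowerCase()) {
      case "csv":
        return csvToGeoJSON(data);
      case "wkt":
        // parseWKT returns a bare geometry, so wrap it in a Feature
        return { type: "Feature", geometry: parseWKT(data), properties: {} };
      case "gpx":
        return gpxToGeoJSON(data);
      case "shapefile":
        // Assumes data is an object holding the .shp and .dbf contents
        return readShapefile(data.shp, data.dbf);
      case "kml":
        // No KML parser is defined in this appendix
        throw new Error("KML conversion requires an external parser");
      default:
        throw new Error(`Unsupported format: ${format}`);
    }
  }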

  static geoJSONToWKT(feature) {
    const geom = feature.geometry;
    // Format a list of positions as "x y, x y, ..."
    const pairs = (coords) => coords.map((c) => `${c[0]} ${c[1]}`).join(", ");

    switch (geom.type) {
      case "Point":
        return `POINT (${geom.coordinates[0]} ${geom.coordinates[1]})`;
      case "LineString":
        return `LINESTRING (${pairs(geom.coordinates)})`;
      case "Polygon": {
        // Emit every ring so interior rings (holes) are preserved
        const rings = geom.coordinates.map((ring) => `(${pairs(ring)})`);
        return `POLYGON (${rings.join(", ")})`;
      }
      default:
        throw new Error(`Unsupported geometry type: ${geom.type}`);
    }
  }

  static validateGeoJSON(geojson) {
    const errors = [];

    if (!geojson || !geojson.type) {
      errors.push("Missing type property");
      return errors;
    }

    if (geojson.type === "FeatureCollection") {
      if (!Array.isArray(geojson.features)) {
        errors.push("FeatureCollection must have features array");
      } else {
        geojson.features.forEach((feature, index) => {
          if (feature?.type !== "Feature") {
            errors.push(`Feature ${index} has a missing or invalid type`);
          }
          // geometry may be null (an unlocated feature) but must be present
          if (feature?.geometry === undefined) {
            errors.push(`Feature ${index} missing geometry`);
          }
        });
      }
    }

    return errors;
  }
}

Format Selection Guidelines

Vector Data Selection

| Use Case             | Recommended Format | Reason                                   |
|----------------------|--------------------|------------------------------------------|
| Web APIs             | GeoJSON            | Native JavaScript support, HTTP-friendly |
| Large datasets       | TopoJSON or MVT    | Smaller file size, topology preservation |
| Desktop GIS          | Shapefile          | Universal support, mature tooling        |
| GPS Data             | GPX                | Standard for GPS devices                 |
| Styled visualization | KML                | Rich styling, 3D support                 |

Raster Data Selection

| Use Case        | Recommended Format      | Reason                             |
|-----------------|-------------------------|------------------------------------|
| Web tiles       | PNG/JPEG                | Browser native support, efficient  |
| Analysis        | GeoTIFF                 | Preserves precision, metadata      |
| Large imagery   | Cloud Optimized GeoTIFF | Efficient streaming, partial reads |
| Simple overlays | PNG with world file     | Simple georeferencing              |

Performance Considerations

File Size Optimization:

  • Use appropriate precision for coordinates

  • Remove unnecessary properties

  • Apply compression (gzip for JSON formats)

  • Consider data generalization for display
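Coordinate precision is often the easiest of these to act on: six decimal places of a degree is roughly 10 cm on the ground, so longer decimals mostly add bytes. The sketch below rounds every coordinate in a GeoJSON object and returns a new object without modifying the input. The name `roundCoordinates` and the default of six decimals are choices made for this example.

```javascript
// Return a copy of a GeoJSON object with coordinates rounded to `decimals`.
// Hypothetical helper; handles geometries, Features, and FeatureCollections.
function roundCoordinates(geojson, decimals = 6) {
  const factor = 10 ** decimals;

  // Recurse through nested arrays until reaching a position of numbers
  const round = (coords) =>
    typeof coords[0] === "number"
      ? coords.map((v) => Math.round(v * factor) / factor)
      : coords.map(round);

  const roundGeometry = (g) => {
    if (g === null) return null; // Features may have a null geometry
    return g.type === "GeometryCollection"
      ? { ...g, geometries: g.geometries.map(roundGeometry) }
      : { ...g, coordinates: round(g.coordinates) };
  };

  switch (geojson.type) {
    case "FeatureCollection":
      return {
        ...geojson,
        features: geojson.features.map((f) => roundCoordinates(f, decimals)),
      };
    case "Feature":
      return { ...geojson, geometry: roundGeometry(geojson.geometry) };
    default:
      return roundGeometry(geojson);
  }
}

const precisePoint = {
  type: "Point",
  coordinates: [-122.41941234567, 37.77491234567],
};
console.log(roundCoordinates(precisePoint, 4).coordinates);
// [-122.4194, 37.7749]
```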

Loading Performance:

  • Chunk large datasets

  • Use progressive loading

  • Implement caching strategies

  • Consider format-specific optimizations

Browser Compatibility:

  • GeoJSON: Universal support

  • Shapefile: Requires JavaScript library

  • Vector tiles: Requires specialized libraries

  • WKT: Simple parsing, good support

This reference guide provides the foundation for selecting and working with appropriate data formats in Web GIS applications. The choice of format depends on specific use cases, performance requirements, and integration needs.