Relational databases for geotechnical data

Geotechnical data has a natural hierarchical structure that mirrors how ground investigations are actually performed. Understanding this structure, and how relational databases represent it, shows why a relational database is more flexible and powerful than traditional file-based formats such as GEF or AGS.

The natural hierarchy of ground investigations

Every ground investigation follows the same logical structure, from project level down to individual test results.

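These hierarchical relationships can be visualized as a tree, with the project at the root, investigation locations beneath it, in-situ tests and samples beneath each location, and laboratory tests beneath each sample.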

This tree-like structure emerges naturally because:

Projects contain multiple investigation locations.
Locations yield multiple observations, measurements, and samples.
Samples can be used in multiple laboratory tests.

Each level depends on the level above it, creating clear parent-child relationships throughout the dataset.

Relational databases represent hierarchical data

Relational databases are foundational to modern data management and are probably the most common type of database. They store data in linked tables, which makes them well suited to representing hierarchical structures. Each table represents one level of the hierarchy.

Primary keys: Unique identifiers

Every table has a primary key, a column that uniquely identifies each row:

project_uid in the Projects table
location_uid in the Locations table
sample_uid in the Samples table

Foreign keys: Linking relationships

Tables are linked through foreign keys, columns that reference primary keys in parent tables:

Locations table contains project_uid (linking to Projects)
In-situ test tables contain location_uid (linking to Locations)
Laboratory test tables contain sample_uid (linking to Samples)
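
The primary and foreign keys listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the table names, the TEXT column types, and the non-existent identifiers BH999 and P999 are assumptions made for the example, and only the key columns are included.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Each table has its own primary key; child tables reference their parent.
conn.executescript("""
CREATE TABLE projects (
    project_uid TEXT PRIMARY KEY
);
CREATE TABLE locations (
    location_uid TEXT PRIMARY KEY,
    project_uid  TEXT NOT NULL REFERENCES projects (project_uid)
);
CREATE TABLE samples (
    sample_uid   TEXT PRIMARY KEY,
    location_uid TEXT NOT NULL REFERENCES locations (location_uid)
);
""")

conn.execute("INSERT INTO projects VALUES ('P001')")
conn.execute("INSERT INTO locations VALUES ('BH001', 'P001')")

# A location that points at a non-existent project (hypothetical P999)
# violates the foreign key and is rejected by the database.
try:
    conn.execute("INSERT INTO locations VALUES ('BH999', 'P999')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

With the constraint in place, the database itself guarantees that every location belongs to an existing project, so this check does not have to live in application code.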

Example: One project, multiple relationships

Consider a project with 2 boreholes, where “Borehole 1 (BH001)” has 3 samples:

Projects table:

project_uid	horizontal_crs	vertical_crs
P001	EPSG:2326	EPSG:5738

Locations table:

location_uid	project_uid	easting	northing	depth_to_base	ground_level_elevation
BH001	P001	523441	181652	5.4	20.5
BH002	P001	523467	181678	6.8	19.8

Samples table:

sample_uid	location_uid	project_uid
S001	BH001	P001
S002	BH001	P001
S003	BH001	P001

This creates one-to-many relationships: one project has many locations, and one location has many samples.
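
As a rough check of the one-to-many relationship, the example rows can be loaded into an in-memory SQLite database and the samples counted per location. The lowercase table names and the reduced set of columns are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (location_uid TEXT PRIMARY KEY, project_uid TEXT);
CREATE TABLE samples (sample_uid TEXT PRIMARY KEY, location_uid TEXT, project_uid TEXT);
""")

# Rows taken from the example tables above (key columns only).
conn.executemany(
    "INSERT INTO locations VALUES (?, ?)",
    [("BH001", "P001"), ("BH002", "P001")],
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("S001", "BH001", "P001"), ("S002", "BH001", "P001"), ("S003", "BH001", "P001")],
)

# LEFT JOIN keeps locations that have no samples, so they show a count of 0.
sample_counts = conn.execute("""
    SELECT l.location_uid, COUNT(s.sample_uid) AS n_samples
    FROM locations AS l
    LEFT JOIN samples AS s ON s.location_uid = l.location_uid
    GROUP BY l.location_uid
    ORDER BY l.location_uid
""").fetchall()

print(sample_counts)  # [('BH001', 3), ('BH002', 0)]
```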

Relational database advantages

Query power: SQL enables complex queries that combine data from multiple tables
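
A query across the three example tables might look like the sketch below, which uses Python's sqlite3 module and the example data from earlier in this section. The column types and the elevation threshold of 20.0 are assumptions chosen only to make the filter visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    project_uid TEXT PRIMARY KEY, horizontal_crs TEXT, vertical_crs TEXT
);
CREATE TABLE locations (
    location_uid TEXT PRIMARY KEY,
    project_uid TEXT REFERENCES projects (project_uid),
    easting REAL, northing REAL, depth_to_base REAL, ground_level_elevation REAL
);
CREATE TABLE samples (
    sample_uid TEXT PRIMARY KEY,
    location_uid TEXT REFERENCES locations (location_uid),
    project_uid TEXT REFERENCES projects (project_uid)
);
""")
conn.execute("INSERT INTO projects VALUES ('P001', 'EPSG:2326', 'EPSG:5738')")
conn.executemany("INSERT INTO locations VALUES (?, ?, ?, ?, ?, ?)", [
    ("BH001", "P001", 523441, 181652, 5.4, 20.5),
    ("BH002", "P001", 523467, 181678, 6.8, 19.8),
])
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
    ("S001", "BH001", "P001"), ("S002", "BH001", "P001"), ("S003", "BH001", "P001"),
])

# One statement walks the hierarchy: sample -> location -> project.
# The threshold is passed as a parameter rather than pasted into the SQL.
rows = conn.execute("""
    SELECT s.sample_uid, l.easting, l.northing, p.horizontal_crs
    FROM samples AS s
    JOIN locations AS l ON l.location_uid = s.location_uid
    JOIN projects AS p ON p.project_uid = l.project_uid
    WHERE l.ground_level_elevation > ?
    ORDER BY s.sample_uid
""", (20.0,)).fetchall()

for row in rows:
    print(row)
```

Each returned row pairs a sample with the coordinates of its borehole and the coordinate reference system of its project, information that sits in three separate tables.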

Concurrent access: Multiple users can query and update data simultaneously

Extensibility: Custom fields can be added without breaking the existing structure
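
To illustrate the extensibility point, the sketch below adds a column to an existing table and reruns an older query unchanged. The column name drilling_method and the value "rotary" are hypothetical, invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (location_uid TEXT PRIMARY KEY, project_uid TEXT)")
conn.executemany(
    "INSERT INTO locations VALUES (?, ?)",
    [("BH001", "P001"), ("BH002", "P001")],
)

# A query written before the schema change.
existing_query = "SELECT location_uid FROM locations ORDER BY location_uid"
before = conn.execute(existing_query).fetchall()

# Add a custom field (hypothetical name); existing rows get NULL for it.
conn.execute("ALTER TABLE locations ADD COLUMN drilling_method TEXT")
conn.execute(
    "UPDATE locations SET drilling_method = ? WHERE location_uid = ?",
    ("rotary", "BH001"),
)

# The old query still returns the same result after the change.
after = conn.execute(existing_query).fetchall()
methods = conn.execute(
    "SELECT location_uid, drilling_method FROM locations ORDER BY location_uid"
).fetchall()

print(before == after)
print(methods)
```

Queries that name their columns explicitly are unaffected by the new field, whereas code that relies on column order or on SELECT * may need to be reviewed.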

Integration: Direct connectivity with GIS software, analysis tools, and web applications