Relational databases for geotechnical data
Geotechnical data has a natural hierarchical structure that mirrors how ground investigations are actually performed. Understanding this structure—and how relational databases represent it, reveals why this approach is more flexible and powerful than traditional file-based formats like AGS.
The natural hierarchy of ground investigations
Section titled “The natural hierarchy of ground investigations”Every ground investigation follows the same logical structure, from project level down to individual test results.
These hierarchical relationships can be visualized in a tree diagram like this:
This tree-like structure emerges naturally because:
- Projects contain multiple investigation locations
- Locations yield multiple observations/measurements and samples
- Samples undergo multiple laboratory tests
Each level depends on the level above it, creating clear parent-child relationships throughout the dataset.
Relational databases represent hierarchical data
Section titled “Relational databases represent hierarchical data”Relational databases are foundational to modern data management, and probably the most common type of database. They store data in linked tables, making them suitable for representing hierarchical structures. Each table represents one level of the hierarchy:
Primary keys: Unique identifiers
Section titled “Primary keys: Unique identifiers”Every table has a primary key: a column that uniquely identifies each row:
project_uid
in the Projects tablelocation_uid
in the Locations tablesample_uid
in the Samples tabletest_uid
in laboratory test tables
Foreign keys: Linking relationships
Section titled “Foreign keys: Linking relationships”Tables are linked through foreign keys: columns that reference primary keys in parent tables:
- Locations table contains
project_uid
(linking to Projects) - In-situ test tables contain
location_uid
(linking to Locations) - Laboratory test tables contain
sample_uid
(linking to Samples)
Example: One project, multiple relationships
Section titled “Example: One project, multiple relationships”Consider a project with 2 boreholes, where “Borehole 1 (BH001)” has 3 samples:
Projects table:
project_uid | project_name |
---|---|
P001 | Appartment |
Locations table:
location_uid | project_uid | location_name | easting | northing |
---|---|---|---|---|
BH001 | P001 | Borehole 1 | 523441 | 181652 |
BH002 | P001 | Borehole 2 | 523467 | 181678 |
Samples table:
sample_uid | location_uid | depth_top | depth_base |
---|---|---|---|
S001 | BH001 | 1.0 | 1.5 |
S002 | BH001 | 3.2 | 3.7 |
S003 | BH001 | 5.8 | 6.3 |
This creates one-to-many relationships: one project has many locations, one location has many samples, etc.
Relational database advantages
Section titled “Relational database advantages”Query power: SQL enables complex queries across multiple tables:
-- Find all samples from sandy soils deeper than 5mSELECT s.*, l.location_nameFROM samples sJOIN locations l ON s.location_uid = l.location_uidJOIN geological_descriptions g ON s.location_uid = g.location_uidWHERE g.geology LIKE '%sand%' AND s.depth_top > 5.0
Data integrity: Enforces relationships through foreign key constraints, you can’t add a sample without a valid location
Concurrent access: Multiple users can query and update data simultaneously
Extensibility: Easy to add new test types or custom fields without breaking existing structure
Integration: Direct connectivity with GIS software, analysis tools, and web applications