Plot Validation

The /validate_plots and /validate_plot API endpoints perform comprehensive integrity checks on agricultural plot geometries to ensure they meet quality standards for sustainability analysis. These endpoints validate geometric properties, detect common issues, and provide detailed feedback for data quality improvement.

Validation Checks

These checks ensure that plot geometries are suitable for:

  • Sustainability Analysis: Accurate area calculations and boundary definitions
  • Satellite Analysis: Proper geometric structure for Earth Engine processing
  • Compliance Reporting: Valid geometries for regulatory submissions
  • Data Quality: Clean, consistent data for downstream processing

Endpoints

POST /validate_plots

Validates multiple plot geometries from an uploaded GeoJSON file.

Parameters:

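Parameter       Type    Required  Default  Description
file            File    Yes       -        GeoJSON file containing plot geometries
country         string  No        -        Country for commodity validation
commodity_type  string  No        -        Commodity type for validation

The request examples in this section also pass optional threshold fields (min_area, distance, angle, min_precision, min_length, overlap_threshold, max_distance) that tune the individual validation checks.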

Response:

Returns a streaming GeoJSON FeatureCollection with:

  • Collection Properties: Aggregate integrity statistics
  • Feature Properties: Individual validation results for each plot

Example Request:

curl -X POST "https://api.epoch.eco/validate_plots" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@plots.geojson" \
  -F "country=Thailand" \
  -F "commodity_type=rubber" \
  -F "min_area=0.5" \
  -F "distance=1000" \
  -F "angle=30"

Usage Examples

Python

import requests

# Set up the request
url = "https://api.epoch.eco/validate_plots"
headers = {
    "Authorization": "Bearer <your_firebase_token>"
}

# Optional form fields sent alongside the file
data = {
    'country': 'Thailand',  # Optional: for commodity validation
    'commodity_type': 'rubber',  # Optional: for commodity validation
    'min_area': 0.5,  # Minimum area threshold
    'distance': 1000,  # Maximum distance between vertices
    'angle': 30,  # Minimum angle threshold
    'min_precision': 6,  # Minimum coordinate precision
    'min_length': 10,  # Minimum boundary segment length
    'overlap_threshold': 0.5,  # Overlap threshold
    'max_distance': 250000  # Maximum distance to nearest neighbor
}

# Open the file in a context manager so the handle is closed after the upload
with open('path/to/your/file.geojson', 'rb') as geojson_file:
    files = {
        'file': ('my_plots.geojson', geojson_file, 'application/geo+json')
    }

    # Make the request
    response = requests.post(url, headers=headers, files=files, data=data)

# Check if request was successful
if response.status_code == 200:
    result = response.json()

    # Access the summary statistics
    properties = result['properties']
    print(f"Total plots: {properties.get('total_plots', 0)}")
    print(f"Valid plots: {properties.get('valid_plots', 0)}")
    print(f"Invalid plots: {properties.get('invalid_plots', 0)}")
    print(f"Area issues: {properties.get('area_issues', 0)}")
    print(f"Self-intersection issues: {properties.get('self_intersection_issues', 0)}")

    # Access individual features with validation results
    for feature in result['features']:
        props = feature['properties']
        print(f"Plot {props.get('plot_id', 'unknown')}: Valid={props.get('overall_valid', False)}")
else:
    print(f"Error: {response.status_code} - {response.text}")

JavaScript

const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

async function validatePlots(filePath, token, options = {}) {
    try {
        const formData = new FormData();

        // Add the file
        formData.append('file', fs.createReadStream(filePath));

        // Add optional parameters (explicit undefined check so a value of 0 is still sent)
        const optionalKeys = ['country', 'commodity_type', 'min_area', 'distance', 'angle'];
        for (const key of optionalKeys) {
            if (options[key] !== undefined) formData.append(key, String(options[key]));
        }

        const response = await axios({
            method: 'post',
            url: 'https://api.epoch.eco/validate_plots',
            headers: {
                'Authorization': `Bearer ${token}`,
                ...formData.getHeaders()
            },
            data: formData
        });

        // Process the response
        const result = response.data;

        // Access the summary statistics
        const properties = result.properties;
        console.log(`Total plots: ${properties.total_plots || 0}`);
        console.log(`Valid plots: ${properties.valid_plots || 0}`);
        console.log(`Invalid plots: ${properties.invalid_plots || 0}`);
        console.log(`Area issues: ${properties.area_issues || 0}`);
        console.log(`Self-intersection issues: ${properties.self_intersection_issues || 0}`);

        // Access individual features with validation results
        result.features.forEach(feature => {
            const props = feature.properties;
            console.log(`Plot ${props.plot_id || 'unknown'}: Valid=${props.overall_valid || false}`);
        });

    } catch (error) {
        console.error('Error:', error.response?.data || error.message);
    }
}

// Usage example
validatePlots('./my_plots.geojson', 'your_token', {
    country: 'Thailand',
    commodity_type: 'rubber',
    min_area: 0.5,
    distance: 1000,
    angle: 30
});

cURL

curl -X POST "https://api.epoch.eco/validate_plots" \
  -H "Authorization: Bearer <your_firebase_token>" \
  -H "Accept: application/json" \
  -F "file=@path/to/your/file.geojson" \
  -F "country=Thailand" \
  -F "commodity_type=rubber" \
  -F "min_area=0.5" \
  -F "distance=1000" \
  -F "angle=30" \
  -F "min_precision=6" \
  -F "min_length=10" \
  -F "overlap_threshold=0.5" \
  -F "max_distance=250000" \
  --output validation_results.json

GET /validate_plot

Validates a single plot geometry using a WKT geometry string.

Parameters:

Parameter       Type    Required  Default  Description
geometry        string  Yes       -        WKT geometry (Polygon, MultiPolygon, etc.)
country         string  No        -        Country for commodity validation
commodity_type  string  No        -        Commodity type for validation

Response:

Returns a GeoJSON FeatureCollection with a single feature containing validation results.

Example Request:

curl -G "https://api.epoch.eco/validate_plot" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  --data-urlencode "geometry=POLYGON((100.5018 13.7563, 100.5028 13.7563, 100.5028 13.7573, 100.5018 13.7573, 100.5018 13.7563))" \
  --data-urlencode "country=Thailand"
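
WKT contains spaces, commas, and parentheses, so the geometry value has to be percent-encoded before it goes into the query string. The following is a minimal sketch of building the request URL with the Python standard library only. The helper name build_validate_plot_url is illustrative and not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.epoch.eco/validate_plot"


def build_validate_plot_url(geometry_wkt, country=None, commodity_type=None):
    """Return the GET URL with the WKT geometry percent-encoded (hypothetical helper)."""
    params = {"geometry": geometry_wkt}
    if country is not None:
        params["country"] = country
    if commodity_type is not None:
        params["commodity_type"] = commodity_type
    # urlencode escapes spaces, commas, and parentheses in the WKT string
    return f"{BASE_URL}?{urlencode(params)}"


wkt = (
    "POLYGON((100.5018 13.7563, 100.5028 13.7563, 100.5028 13.7573, "
    "100.5018 13.7573, 100.5018 13.7563))"
)
url = build_validate_plot_url(wkt, country="Thailand")
print(url)
```

The resulting URL can be passed to any HTTP client together with the Authorization header shown above.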

Validation Checks

Geometric Integrity Checks

1. Area Validation

  • Purpose: Ensures plots meet minimum size requirements
  • Check: area >= min_area (default: 0.1 hectares)
  • Issue: Plots too small for meaningful analysis
  • Fix: Combine small adjacent plots or lower the min_area threshold

2. Self-Intersection Detection

  • Purpose: Identifies invalid polygon geometries
  • Check: No self-intersecting boundaries
  • Issue: Invalid geometry that can cause analysis errors
  • Fix: Use GIS tools to fix self-intersections
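
To catch self-intersections before uploading, a client-side check along these lines may help. It is a sketch of a brute-force pairwise edge test in planar coordinates, not the service's own algorithm, and the function names are illustrative. It reports only proper crossings; edges that merely touch are not flagged.

```python
def _orient(a, b, c):
    """Cross product sign: >0 if a->b->c turns left, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, q1, q2):
    """True when segments p1p2 and q1q2 properly cross each other."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_self_intersects(ring):
    """ring: closed list of (x, y) with first == last. O(n^2) edge comparison."""
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    for i in range(n):
        # start at i + 2 so edges sharing a vertex are never compared
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge meet at the closing vertex
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bowtie = [(0, 0), (1, 1), (1, 0), (0, 1), (0, 0)]
print(ring_self_intersects(square), ring_self_intersects(bowtie))
```

For large rings a sweep-line algorithm or a GIS library scales better than this quadratic loop.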

3. Duplicate Vertices

  • Purpose: Removes redundant coordinate points
  • Check: No consecutive identical coordinates
  • Issue: Inefficient geometry storage and potential analysis issues
  • Fix: Remove duplicate vertices using GIS tools
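
Where no GIS tool is at hand, consecutive duplicates can be stripped with a few lines of code. This sketch assumes a ring is a list of coordinate pairs; the function names are illustrative. The closing vertex equals the first one by design and is kept, because the two are not adjacent in the list.

```python
def find_consecutive_duplicates(ring):
    """Return the indices i where ring[i] repeats ring[i - 1]."""
    return [i for i in range(1, len(ring)) if tuple(ring[i]) == tuple(ring[i - 1])]


def remove_consecutive_duplicates(ring):
    """Return a copy of ring without consecutive repeated vertices."""
    if not ring:
        return []
    cleaned = [ring[0]]
    for point in ring[1:]:
        if tuple(point) != tuple(cleaned[-1]):
            cleaned.append(point)
    return cleaned


ring = [(0, 0), (1, 0), (1, 0), (1, 1), (0, 0)]
print(find_consecutive_duplicates(ring))
print(remove_consecutive_duplicates(ring))
```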

4. Angle Validation

  • Purpose: Ensures reasonable boundary angles
  • Check: All angles >= min_angle (default: 1.0 degrees)
  • Issue: Extremely sharp angles may indicate digitization errors
  • Fix: Smooth boundaries or adjust digitization precision
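
One possible client-side version of the angle check is sketched below. It measures the angle between the two edges meeting at each vertex using planar geometry, which is an approximation for small plots in longitude/latitude coordinates; how the service itself measures angles is not specified here. The function names are illustrative.

```python
import math


def vertex_angles(ring):
    """Angle in degrees (0-180) between the two edges at each vertex of a closed ring."""
    points = ring[:-1]  # drop the repeated closing vertex
    n = len(points)
    angles = []
    for i in range(n):
        prev_pt, pt, next_pt = points[i - 1], points[i], points[(i + 1) % n]
        v1 = (prev_pt[0] - pt[0], prev_pt[1] - pt[1])
        v2 = (next_pt[0] - pt[0], next_pt[1] - pt[1])
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            continue  # zero-length edge (duplicate vertex): no angle defined
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding outside [-1, 1]
        angles.append(math.degrees(math.acos(cos_a)))
    return angles


def has_sharp_angle(ring, min_angle=1.0):
    """True if any vertex angle is below min_angle degrees."""
    return any(angle < min_angle for angle in vertex_angles(ring))


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
spike = [(0, 0), (10, 0), (0, 0.01), (0, 0)]
print(has_sharp_angle(square), has_sharp_angle(spike))
```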

5. Distance Validation

  • Purpose: Ensures reasonable distances between vertices
  • Check: All vertex distances <= max_distance (default: 1000m)
  • Issue: Extremely long segments may indicate missing vertices
  • Fix: Add intermediate vertices along long boundaries
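
The segment lengths behind this check can be estimated locally with the haversine formula, as in the sketch below. It assumes coordinates are (longitude, latitude) in degrees and a spherical Earth, so results may differ slightly from the service's own measurement. The function names are illustrative.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_lengths_m(ring):
    """Length in meters of every edge of a closed ring."""
    return [haversine_m(p, q) for p, q in zip(ring, ring[1:])]


def long_segments(ring, max_distance=1000.0):
    """Return (edge index, length) for edges longer than max_distance meters."""
    return [(i, d) for i, d in enumerate(segment_lengths_m(ring)) if d > max_distance]


plot = [
    (100.5018, 13.7563), (100.5028, 13.7563), (100.5028, 13.7573),
    (100.5018, 13.7573), (100.5018, 13.7563),
]
stretched = [(100.5, 13.75), (100.6, 13.75), (100.5, 13.7501), (100.5, 13.75)]
print(long_segments(plot))
print(long_segments(stretched))
```

The same segment_lengths_m output can be compared against a minimum length to spot very short segments.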

6. Precision Validation

  • Purpose: Ensures adequate coordinate precision
  • Check: Coordinates have >= min_precision decimal places (default: 6)
  • Issue: Low precision may cause analysis inaccuracies
  • Fix: Increase coordinate precision in source data
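
A rough local precision check can count the decimal places of each coordinate, as sketched below. Two assumptions apply: once GeoJSON is parsed into floats, trailing zeros are lost (100.500000 counts as one decimal place), and a vertex is flagged when either of its coordinates falls short. The service may apply different rules. The function names are illustrative.

```python
from decimal import Decimal


def decimal_places(value):
    """Digits after the decimal point in the shortest text form of a float."""
    exponent = Decimal(str(value)).as_tuple().exponent
    return max(0, -exponent)


def low_precision_vertices(ring, min_precision=6):
    """Indices of vertices where lon or lat has fewer than min_precision decimals."""
    return [
        i
        for i, point in enumerate(ring)
        if min(decimal_places(point[0]), decimal_places(point[1])) < min_precision
    ]


ring = [(100.5018, 13.7563), (100.501812, 13.756301)]
print(decimal_places(100.5018))
print(low_precision_vertices(ring))
```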

7. Boundary Segment Length

  • Purpose: Ensures reasonable boundary segment lengths
  • Check: All segments >= min_length (default: 10m)
  • Issue: Very short segments may indicate digitization noise
  • Fix: Simplify geometry or adjust digitization settings

Spatial Relationship Checks

8. Overlap Detection

  • Purpose: Identifies overlapping plots within the dataset
  • Check: Overlap area <= overlap_threshold (default: 1% of plot area)
  • Issue: Overlapping plots can cause double-counting in analysis
  • Fix: Resolve overlaps by adjusting boundaries or removing duplicates

9. Nearest Neighbor Distance

  • Purpose: Identifies isolated plots that may be errors
  • Check: Distance to nearest neighbor <= max_neighbor_distance (default: 1000m)
  • Issue: Isolated plots may be digitization errors or require separate handling
  • Fix: Verify plot location or adjust analysis parameters

Geometry Type Handling

10. Geometry Collection Processing

  • Purpose: Handles complex geometry collections
  • Check: Separates and validates individual geometry components
  • Issue: Mixed geometry types in single features
  • Fix: Split into separate features by geometry type

11. MultiPolygon Flattening

  • Purpose: Processes MultiPolygon geometries
  • Check: Separates MultiPolygons into individual Polygons
  • Issue: Complex MultiPolygon structures
  • Fix: Flatten to individual Polygon features
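
Flattening can also be done before upload. The sketch below works on plain GeoJSON dictionaries and splits MultiPolygon and (possibly nested) GeometryCollection geometries into individual Polygon features. Copying the parent's properties onto every part is an assumption about what is wanted, and the function names are illustrative.

```python
def iter_polygons(geometry):
    """Yield every Polygon contained in a GeoJSON geometry dictionary."""
    gtype = geometry["type"]
    if gtype == "Polygon":
        yield geometry
    elif gtype == "MultiPolygon":
        for coords in geometry["coordinates"]:
            yield {"type": "Polygon", "coordinates": coords}
    elif gtype == "GeometryCollection":
        for member in geometry["geometries"]:
            yield from iter_polygons(member)  # handles nested collections
    # Point, LineString, and similar types are skipped


def flatten_feature(feature):
    """Return one Polygon feature per polygon found in the input feature."""
    properties = feature.get("properties") or {}
    return [
        {"type": "Feature", "geometry": polygon, "properties": dict(properties)}
        for polygon in iter_polygons(feature["geometry"])
    ]


square_a = [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
square_b = [[[2, 2], [3, 2], [3, 3], [2, 3], [2, 2]]]
multi = {
    "type": "Feature",
    "properties": {"plot_id": "plot_001"},
    "geometry": {"type": "MultiPolygon", "coordinates": [square_a, square_b]},
}
parts = flatten_feature(multi)
print(len(parts), [p["geometry"]["type"] for p in parts])
```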

Confidence Scoring

The validation system calculates confidence scores for both individual plots and the overall dataset based on the severity and frequency of detected issues.

Confidence Score Calculation

The system uses a weighted penalty system that starts with a perfect score (1.0) and deducts points based on detected issues:

Issue Weights (Penalty System)

Issue Type                          Weight  Description
intersects_another_polygon_count    0.30    Overlaps are serious - can cause double-counting
self_intersection_count             0.25    Self-intersections are serious - invalid geometry
distance_nearest_neighbor_count     0.20    Isolated plots are concerning - may be errors
commodity_absence_count             0.15    Wrong commodity area is concerning
area_too_small_count                0.10    Small areas are less serious
max_distance_gt_min_distance_count  0.10    Long edges are less serious
max_angle_lt_min_angle_count        0.10    Sharp angles are less serious
bad_precision_count                 0.05    Precision issues are minor
short_boundary_segments_count       0.05    Short segments are minor
triangular_geometry_count           0.05    Triangular shapes are minor
duplicate_vertices_count            0.05    Duplicate vertices are minor
nested_geometry_collection_count    0.05    Nested collections are minor
simplifiable_collections_count      0.05    Simplifiable collections are minor

Confidence Level Thresholds

  • High (score ≥ 0.8): Excellent data quality with minimal issues
  • Medium (0.5 ≤ score < 0.8): Good data quality with some issues requiring attention
  • Low (score < 0.5): Poor data quality with significant issues requiring correction

Calculation Process

  1. Start with perfect score: 1.0
  2. Calculate penalty ratio: min(issue_count / total_plots, 1.0)
  3. Apply weighted penalty: penalty_ratio × issue_weight
  4. Final score: max(0.0, 1.0 - total_penalty)
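
The four steps above can be sketched as follows. The weights are copied from the Issue Weights table. Summing the weighted penalties across issue types into one total is an assumption drawn from the wording of step 4, and the function names are illustrative rather than the service's actual code.

```python
ISSUE_WEIGHTS = {
    "intersects_another_polygon_count": 0.30,
    "self_intersection_count": 0.25,
    "distance_nearest_neighbor_count": 0.20,
    "commodity_absence_count": 0.15,
    "area_too_small_count": 0.10,
    "max_distance_gt_min_distance_count": 0.10,
    "max_angle_lt_min_angle_count": 0.10,
    "bad_precision_count": 0.05,
    "short_boundary_segments_count": 0.05,
    "triangular_geometry_count": 0.05,
    "duplicate_vertices_count": 0.05,
    "nested_geometry_collection_count": 0.05,
    "simplifiable_collections_count": 0.05,
}


def confidence_score(issue_counts, total_plots, weights=ISSUE_WEIGHTS):
    """Start at 1.0 and subtract penalty_ratio * weight for each issue type."""
    if total_plots <= 0:
        raise ValueError("total_plots must be positive")
    total_penalty = 0.0
    for issue, count in issue_counts.items():
        penalty_ratio = min(count / total_plots, 1.0)
        total_penalty += penalty_ratio * weights.get(issue, 0.0)
    return max(0.0, 1.0 - total_penalty)


def confidence_level(score):
    """Map a score to the documented high / medium / low levels."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


# 100 plots: 10 overlaps (0.1 * 0.30) and 20 small areas (0.2 * 0.10)
counts = {"intersects_another_polygon_count": 10, "area_too_small_count": 20}
score = confidence_score(counts, total_plots=100)
print(round(score, 3), confidence_level(score))
```

In this worked case the total penalty is 0.03 + 0.02 = 0.05, so the score is roughly 0.95 and falls in the high band.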

Individual Plot Confidence

Each plot receives its own confidence score based on the specific issues detected for that plot:

  • Plot-specific penalties: Applied based on individual plot issues
  • Same thresholds: High (≥0.8), Medium (≥0.5), Low (<0.5)
  • Individual scoring: Each plot's confidence is independent of others

Response Format

Collection Properties (Aggregate Statistics)

{
  "type": "FeatureCollection",
  "properties": {
    "total_plots": 150,
    "valid_plots": 142,
    "invalid_plots": 8,
    "area_issues": 3,
    "self_intersection_issues": 2,
    "duplicate_vertex_issues": 5,
    "angle_issues": 1,
    "distance_issues": 0,
    "precision_issues": 2,
    "segment_length_issues": 1,
    "overlap_issues": 4,
    "neighbor_distance_issues": 2,
    "total_area_hectares": 1250.5,
    "average_plot_area": 8.34,
    "confidence_score": 0.756,
    "confidence_level": "medium"
  },
  "features": [...]
}

Feature Properties (Individual Results)

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [[[100.5018, 13.7563], [100.5028, 13.7563], [100.5028, 13.7573], [100.5018, 13.7573], [100.5018, 13.7563]]]
  },
  "properties": {
    "plot_id": "plot_001",
    "area_hectares": 1.25,
    "area_valid": true,
    "self_intersection": false,
    "duplicate_vertices": false,
    "angle_valid": true,
    "distance_valid": true,
    "precision_valid": true,
    "segment_length_valid": true,
    "overlap_detected": false,
    "neighbor_distance_valid": true,
    "overall_valid": true,
    "validation_issues": [],
    "validation_warnings": [],
    "confidence_score": 0.850,
    "confidence_level": "high"
  }
}
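
When post-processing a response, it can be convenient to reduce each feature to the list of checks it failed. The sketch below relies only on the property names shown in the example above; the mapping of each flag to its passing value and the helper name failed_checks are illustrative assumptions.

```python
# Property name -> value that means the check passed (taken from the example feature)
PASSING_VALUES = {
    "area_valid": True,
    "self_intersection": False,
    "duplicate_vertices": False,
    "angle_valid": True,
    "distance_valid": True,
    "precision_valid": True,
    "segment_length_valid": True,
    "overlap_detected": False,
    "neighbor_distance_valid": True,
}


def failed_checks(feature):
    """Return the names of the per-plot flags that indicate a problem."""
    props = feature.get("properties") or {}
    return [
        name
        for name, passing in PASSING_VALUES.items()
        if name in props and props[name] != passing
    ]


feature = {
    "type": "Feature",
    "geometry": None,
    "properties": {
        "plot_id": "plot_002",
        "area_valid": False,
        "self_intersection": True,
        "angle_valid": True,
        "overlap_detected": False,
    },
}
print(feature["properties"]["plot_id"], failed_checks(feature))
```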

Validation Results

High Confidence Plot (0.8-1.0)

  • Excellent data quality with minimal issues
  • All critical geometric checks pass
  • Suitable for sustainability analysis
  • No significant issues requiring attention

Medium Confidence Plot (0.5-0.8)

  • Good data quality with some issues
  • Most geometric checks pass
  • Suitable for analysis with minor caveats
  • Some issues may require attention for optimal results

Low Confidence Plot (0.0-0.5)

  • Poor data quality with significant issues
  • One or more critical geometric checks fail
  • Requires correction before analysis
  • Specific issues identified in response

Confidence Score Interpretation

The confidence score provides a quantitative measure of data quality:

  • 0.9-1.0: Exceptional quality, ready for analysis
  • 0.8-0.9: High quality, minor issues only
  • 0.7-0.8: Good quality, some attention needed
  • 0.5-0.7: Moderate quality, several issues present
  • 0.3-0.5: Poor quality, significant issues
  • 0.0-0.3: Very poor quality, major corrections needed

Common Issues and Solutions

1. Small Plot Areas

Issue: Plots smaller than minimum threshold
Solution:

  • Lower the min_area parameter if appropriate
  • Combine adjacent small plots
  • Use different analysis approach for small plots

2. Self-Intersections

Issue: Invalid polygon boundaries
Solution:

  • Use GIS software to fix geometries
  • Re-digitize problematic boundaries
  • Use buffer operations to clean boundaries

3. Overlapping Plots

Issue: Multiple plots occupy same area
Solution:

  • Adjust plot boundaries to eliminate overlaps
  • Remove duplicate plots
  • Use spatial analysis to resolve conflicts

4. Low Coordinate Precision

Issue: Insufficient decimal places in coordinates
Solution:

  • Increase precision in source data
  • Adjust min_precision parameter if appropriate
  • Re-digitize with higher precision

5. Isolated Plots

Issue: Plots far from nearest neighbors
Solution:

  • Verify plot location accuracy
  • Adjust max_neighbor_distance parameter
  • Handle isolated plots separately in analysis

Performance Considerations

  • Large Files: For files with more than 1000 plots, process the streamed response incrementally instead of loading it into memory at once
  • Complex Geometries: MultiPolygons and GeometryCollections require more processing time
  • Precision: Higher precision requirements increase processing time
  • Spatial Checks: Overlap and neighbor distance checks are computationally intensive

Best Practices

1. Parameter Tuning

  • Start with default values and adjust based on your data characteristics
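  • Tune the angle threshold with care: the check flags angles below the threshold, so lower values (0.5-1.0°) flag only the sharpest spikes and higher values flag more vertices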
  • Lower distance thresholds (500-800m) for more sensitive edge detection
  • Adjust area thresholds based on your expected plot sizes
  • Increase precision requirements for high-accuracy applications

2. Data Preparation

  • Clean geometries before uploading using GIS software like QGIS
  • Fix self-intersections and topological issues
  • Ensure adequate coordinate precision (6+ decimal places)
  • Remove duplicate vertices and simplify complex geometries
  • Validate coordinate systems and ensure proper projection
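
Some of these preparation steps can be automated before upload. The sketch below checks two GeoJSON requirements from RFC 7946, namely that a polygon ring is closed with at least four positions and that coordinates are longitude/latitude in degrees. Out-of-range values often point to projected coordinates or swapped latitude and longitude. The function name ring_problems is illustrative.

```python
def ring_problems(ring):
    """Return a list of human-readable problems found in one polygon ring."""
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 positions")
    if ring and tuple(ring[0]) != tuple(ring[-1]):
        problems.append("ring is not closed")
    for lon, lat, *_ in ring:  # ignore an optional elevation value
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            problems.append("coordinate outside lon/lat range")
            break
    return problems


good = [
    (100.5018, 13.7563), (100.5028, 13.7563), (100.5028, 13.7573),
    (100.5018, 13.7573), (100.5018, 13.7563),
]
swapped = [(lat, lon) for lon, lat in good]  # latitude and longitude exchanged
print(ring_problems(good))
print(ring_problems(swapped))
```

A range check cannot catch every swap (for example near the equator and prime meridian), so it complements rather than replaces a visual check in GIS software.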