Facility Validation

The /validate_facilities and /validate_facility API endpoints assess whether locations are suitable for commodity aggregation facilities. These endpoints analyze geographic, infrastructure, and business factors to determine the likelihood that a location serves as a first-mile aggregation point for commodities.

Validation Metrics

The system combines multiple data sources and AI analysis to evaluate:

  • Commodity Presence: Whether the target commodity is present in the area using commodity maps
  • Infrastructure Access: Road, water, and port accessibility using Overture Maps and Google Places API
  • Building/Industrial Presence: Industrial facility and building analysis within 100m using Overture Places dataset with graduated scoring (large = +0.23, medium = +0.12, none = +0.0). The AI receives binary values (0/1) but scoring uses graduated values.
  • Business Context: Nearby POIs using multiple data sources
  • Geographic Suitability: Building density and latitude constraints for commodity production
  • Producer Verification: Cross-referencing facility information with external business databases

Supported Commodities

The system has dedicated detection models for these commodities:

Supported: rubber, palm, cocoa, coffee, soy, timber, cattle

Unsupported Commodities: If you provide a commodity not in the list above, the system will:

  • Use a plantation forest mask as an umbrella commodity detection model (appropriate for most agro-forestry commodities)
  • Include a warning in the response indicating the fallback behavior
  • Still perform all infrastructure and location checks normally

This allows validation to proceed for unlisted commodities while making it transparent that a generic plantation detection model is being used.
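
The fallback rule can be illustrated with a minimal sketch. The helper name `resolve_detection_model`, the label `plantation_forest_mask`, the warning wording, and the case-insensitive matching are all assumptions made for illustration; the source only states that a plantation forest mask is used and that a warning is returned.

```python
# Commodities with dedicated detection models, as listed above.
SUPPORTED_COMMODITIES = {"rubber", "palm", "cocoa", "coffee", "soy", "timber", "cattle"}


def resolve_detection_model(commodity_type):
    """Return (model_name, warning) for a requested commodity.

    Hypothetical helper: the model label and the warning text are
    illustrative and are not taken from the API.
    """
    name = commodity_type.strip().lower()  # assumption: matching ignores case
    if name in SUPPORTED_COMMODITIES:
        return name, None
    warning = (
        f"No dedicated model for '{commodity_type}'; "
        "falling back to the plantation forest mask."
    )
    return "plantation_forest_mask", warning


model, warning = resolve_detection_model("tea")
```

A client can apply the same check before submitting a request to anticipate whether the response will carry a fallback warning.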

Endpoints

POST /validate_facilities

Validates multiple facility locations from an uploaded GeoJSON file.

Parameters:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | File | Yes | - | GeoJSON file containing facility geometries |
| commodity_type | string | No | - | Commodity type for validation (rubber, palm, cocoa, coffee, soy, timber, cattle) |
| commodity_radius | float | No | 5000 | Radius in meters to check for commodity presence |
| check_water_access | boolean | No | False | Whether to check water access proximity |
| check_port_proximity | boolean | No | False | Whether to check port proximity |
| check_road_access | boolean | No | True | Whether to check road access (fast Overture Places query) |

Response:

Returns a streaming GeoJSON FeatureCollection with:

  • Collection Properties: Aggregate statistics across all facilities
  • Feature Properties: Individual validation results for each facility

Example Request:

curl -X POST "https://api.epoch.eco/validate_facilities" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@facilities.geojson" \
  -F "commodity_type=rubber" \
  -F "commodity_radius=3000" \
  -F "check_water_access=true"
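
The same request can be prepared from Python. This is a sketch under stated assumptions: the input property name `facility_name` is borrowed from the response example and may not be required on input, and `send` relies on the third-party `requests` package, which the source does not mention.

```python
import json


def build_facilities_geojson(points):
    """Build a GeoJSON FeatureCollection from (name, lon, lat) tuples."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # Assumption: the input property name mirrors the response field.
            "properties": {"facility_name": name},
        }
        for name, lon, lat in points
    ]
    return {"type": "FeatureCollection", "features": features}


def build_form_fields(commodity_type="rubber", commodity_radius=3000,
                      check_water_access=True):
    """Form fields matching the curl example; booleans become 'true'/'false'."""
    return {
        "commodity_type": commodity_type,
        "commodity_radius": str(commodity_radius),
        "check_water_access": str(check_water_access).lower(),
    }


def send(geojson, fields, token):
    """Upload the collection (needs the third-party `requests` package)."""
    import requests

    files = {"file": ("facilities.geojson", json.dumps(geojson), "application/geo+json")}
    return requests.post(
        "https://api.epoch.eco/validate_facilities",
        headers={"Authorization": f"Bearer {token}"},
        files=files,
        data=fields,
        stream=True,   # the endpoint streams its response
        timeout=300,   # illustrative value, not taken from the source
    )


collection = build_facilities_geojson([("Rubber Processing Mill", 100.5018, 13.7563)])
fields = build_form_fields()
```

Calling `send(collection, fields, "YOUR_TOKEN")` would perform the upload; the two builder functions have no network dependency and can be checked offline.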

GET /validate_facility

Validates a single facility location using a WKT geometry string.

Parameters:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| geometry | string | Yes | - | WKT geometry (Point, Polygon, etc.) |
| commodity_type | string | No | - | Commodity type for validation |
| commodity_radius | float | No | 5000 | Radius in meters to check for commodity presence |
| check_water_access | boolean | No | False | Whether to check water access proximity |
| check_port_proximity | boolean | No | False | Whether to check port proximity |
| check_road_access | boolean | No | True | Whether to check road access |

Response:

Returns a GeoJSON FeatureCollection with a single feature containing validation results.

Example Request:

curl -X GET "https://api.epoch.eco/validate_facility" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -G \
  --data-urlencode "geometry=POINT(100.5018 13.7563)" \
  -d "commodity_type=rubber" \
  -d "check_road_access=true"
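
Because the WKT string contains a space and parentheses, it must be URL-encoded in the query string. A minimal Python sketch of building the request URL with the standard library follows; the function name `build_validate_facility_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.epoch.eco/validate_facility"


def build_validate_facility_url(lon, lat, commodity_type=None, check_road_access=True):
    """Build the GET URL for a WKT point; urlencode escapes the space and parentheses."""
    params = {
        "geometry": f"POINT({lon} {lat})",  # WKT order is longitude, then latitude
        "check_road_access": str(check_road_access).lower(),
    }
    if commodity_type:
        params["commodity_type"] = commodity_type
    return f"{BASE_URL}?{urlencode(params)}"


url = build_validate_facility_url(100.5018, 13.7563, "rubber")
```

The resulting URL can be passed to any HTTP client together with the `Authorization: Bearer` header shown in the curl example.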

Related endpoints:

  • POST /validate_locations - Universal validation; auto-detects facility vs plot and routes accordingly. Use for mixed or unknown input types.
  • POST /normalize_locations - Normalize/geocode data without full validation (e.g., prepare CSV/Excel before validation).

Validation Criteria

Confidence Score Calculation (Updated)

The system uses a penalty-based scoring model that starts at perfect confidence (1.0) and deducts points for missing critical attributes. This ensures that facilities with missing infrastructure or commodity presence are appropriately penalized.

Scoring Algorithm:

confidence_score = 1.0  # Start at maximum

# Penalties for missing attributes:
- 0.50 if commodity_presence = 0 (major - outside production region)
- 0.25 if commodity_presence < 1.0 (moderate - low commodity presence)
- 0.20 if poi_presence = 0 (no business activity nearby)
- 0.15 if building_presence = 0 (no industrial presence)
- 0.15 if road_access = 0 (no infrastructure)
- 0.10 if high_density = 1 (urban area - less suitable)

# NOT SCORED: water_access, port_proximity (optional checks, ignored)

# Result: confidence_score = max(0.0, min(1.0, score))
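
The penalty rules above can be written as a small runnable function. This is a sketch, not the service's implementation. It assumes the two commodity penalties are mutually exclusive (0.50 when presence is exactly 0, otherwise 0.25 when presence is below 1.0), which the pseudocode does not state explicitly.

```python
def confidence_score(commodity_presence, poi_presence, building_presence,
                     road_access, high_density):
    """Penalty-based score: start at 1.0 and deduct for each missing attribute."""
    score = 1.0

    # Assumption: the two commodity penalties do not stack.
    if commodity_presence == 0:
        score -= 0.50  # outside production region
    elif commodity_presence < 1.0:
        score -= 0.25  # low commodity presence

    if poi_presence == 0:
        score -= 0.20  # no business activity nearby
    if building_presence == 0:
        score -= 0.15  # no industrial presence
    if road_access == 0:
        score -= 0.15  # no road infrastructure
    if high_density == 1:
        score -= 0.10  # urban area

    # water_access and port_proximity are intentionally not scored.
    return max(0.0, min(1.0, score))
```

For instance, a location with partial commodity presence and no nearby POIs loses 0.25 and 0.20, leaving roughly 0.55, while a location failing every check is clamped to 0.0.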

Confidence Level Thresholds (implementation uses 0.7 / 0.45):

  • High (≥ 0.7): Strong commodity and infrastructure indicators, minimal gaps
  • Medium (0.45 to < 0.7): Good core indicators but some important gaps
  • Low (< 0.45): Poor core indicators or critical gaps

Commodity Presence Scoring

Commodity presence is a critical factor. The system checks:

  • Present in area: Facility is in a commodity-producing region
  • Country-level check: Facility is in a country that produces the target commodity
  • Latitude-level check: Facility is within viable latitude range for the commodity (if applicable)

Infrastructure & Location Factors

The system evaluates these factors to determine facility suitability. Absence of critical infrastructure reduces confidence:

  • Commodity Presence: Critical factor in scoring
  • Building/Industrial Presence: Evidence of operational facility
  • POI Nearby: Relevant businesses (warehouses, processors, agricultural companies)
  • Road Access: Essential for logistics
  • High Building Density: Urban areas are less suitable for commodity facilities

Scoring Impact:

  • Water access and port proximity: Not scored (optional, ignored)
  • All others: Penalize if absent (see Confidence Score Calculation above)

Latitude Constraints by Commodity

| Commodity | Optimal Range | Viable Range | Unlikely Beyond |
| --- | --- | --- | --- |
| Rubber | 10°N-10°S | 20°N/S | 30°N/S |
| Palm Oil | 10°N-10°S | 15°N/S | 20°N/S |
| Cocoa | 20°N-20°S | 25°N/S | 30°N/S |
| Coffee | 25°N-25°S | 35°N/S | 40°N/S |
| Soy | 30°N-30°S | 45°N/S | 50°N/S |
| Timber | 30°N-60°N, 30°S-60°S | 20°N-70°N, 20°S-70°S | Extreme polar regions |
| Cattle | Global | Global | None |
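
The latitude table can be turned into a lookup for the commodities with symmetric limits. This sketch makes several assumptions that the source does not confirm: limits are inclusive, the zone between the viable range and the "unlikely beyond" limit is labelled `marginal`, and the band names themselves are illustrative. Timber's two-band ranges are not modelled here.

```python
# (optimal, viable, unlikely_beyond) limits in degrees of absolute latitude,
# copied from the table above for commodities with symmetric ranges.
LATITUDE_LIMITS = {
    "rubber": (10, 20, 30),
    "palm": (10, 15, 20),
    "cocoa": (20, 25, 30),
    "coffee": (25, 35, 40),
    "soy": (30, 45, 50),
}


def latitude_band(commodity, latitude):
    """Classify a latitude for a commodity; returns None when not modelled."""
    if commodity == "cattle":
        return "optimal"  # the table lists cattle as global
    limits = LATITUDE_LIMITS.get(commodity)
    if limits is None:
        return None  # timber (two bands) and unlisted commodities
    optimal, viable, unlikely_beyond = limits
    abs_lat = abs(latitude)
    if abs_lat <= optimal:
        return "optimal"
    if abs_lat <= viable:
        return "viable"
    if abs_lat <= unlikely_beyond:
        return "marginal"  # assumption: label for the zone past the viable range
    return "unlikely"
```

With these limits, the Bangkok coordinate used in the request examples (latitude 13.7563) falls in the viable band for rubber.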

Commodity-Producing Countries

  • Rubber: Thailand, Vietnam, Indonesia, Malaysia, Ivory Coast, China, Ghana, Philippines, Laos, Cambodia, Myanmar, India, Bangladesh, Nepal, Bhutan, Liberia
  • Palm Oil: Thailand, Indonesia, Malaysia, Ivory Coast, Ghana, Nigeria, Ecuador, Honduras, Brazil, Colombia, Venezuela, Peru, Bolivia, Panama, Nicaragua, Guatemala, Belize, Liberia, Togo, Benin, Cameroon
  • Cocoa: Ghana, Ivory Coast, Ecuador, Colombia, Peru, Togo, Benin, Nigeria, Cameroon, Venezuela, Panama, Brazil, Bolivia, Dominican Republic, Liberia
  • Coffee: Peru, Brazil, Ecuador, Colombia, Nicaragua, Honduras, El Salvador, Uganda, Ethiopia, Indonesia, Vietnam, Argentina, Bolivia, Venezuela, Panama, Costa Rica, Guatemala, Belize, India, Sri Lanka, Malaysia, Myanmar, Cambodia, Thailand, Laos, Papua New Guinea
  • Soy: Argentina, Brazil, Paraguay, Uruguay, Bolivia
  • Timber: Global (all countries with forest cover)
  • Cattle: Global (all countries)

Response Format

Feature properties use flat top-level keys (no nested validation object); location_type identifies output as "facility". All metrics (commodity_presence, road_access, etc.) and scores are at the top level.

Collection Properties (Aggregate Statistics)

{
  "type": "FeatureCollection",
  "properties": {
    "total_facilities": 382,
    "commodity_presence": 315,
    "poi_presence": 298,
    "road_access": 245,
    "water_access": 89,
    "port_proximity": 23,
    "high_density": 67,
    "confidence_score": 0.456,
    "confidence_level": {
      "high": 89,
      "medium": 149,
      "low": 144
    },
    "producer_confirmed": {
      "true": 156,
      "false": 226
    }
  },
  "features": [...]
}
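
As an illustration of reading the aggregate block, the following sketch converts the `confidence_level` counts into shares of `total_facilities`. The helper name `confidence_shares` is hypothetical, and returning zeros for an empty collection is an assumption.

```python
def confidence_shares(collection_properties):
    """Return the fraction of facilities at each confidence level."""
    total = collection_properties["total_facilities"]
    counts = collection_properties["confidence_level"]
    if total == 0:
        # Assumption: an empty collection yields zero shares rather than an error.
        return {level: 0.0 for level in counts}
    return {level: count / total for level, count in counts.items()}


# Values taken from the example response above.
properties = {
    "total_facilities": 382,
    "confidence_level": {"high": 89, "medium": 149, "low": 144},
}
shares = confidence_shares(properties)
```

In the example response the three counts sum to 382, so the shares sum to 1.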

Feature Properties (Individual Results)

Feature properties use flat top-level keys (no nested validation object):

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [100.5018, 13.7563]
  },
  "properties": {
    "location_type": "facility",
    "facility_name": "Rubber Processing Mill",
    "country": "Thailand",
    "commodity_type": "rubber",
    "commodity_presence": 1.0,
    "poi_presence": 1.0,
    "high_density": 0.0,
    "building_presence": 1.0,
    "road_access": 1.0,
    "water_access": 0.0,
    "port_proximity": 0.0,
    "producer_confirmation_score": 0.85,
    "producer_confirmed": true,
    "producer_confirmation_comment": "Rule-based analysis: 0.85 score",
    "aggregation_comment": "Rule-based analysis: 0.85 score",
    "confidence_score": 0.85,
    "confidence_level": "high",
    "primary_warning": null
  }
}

  • location_type: "facility"; identifies this as facility validation output
  • primary_warning: Optional string when issues exist (e.g., outside commodity zone, no infrastructure)
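
Since the metrics sit at the top level of `properties`, a client can inspect them directly. The sketch below lists which scored attributes are absent for a feature; the helper name `missing_attributes` is hypothetical, and treating a missing key as 0 is an assumption.

```python
# Attributes that the scoring model penalizes when they equal 0.
SCORED_ATTRIBUTES = ("commodity_presence", "poi_presence", "building_presence", "road_access")


def missing_attributes(feature):
    """Return the scored attributes whose value is 0 for a GeoJSON feature."""
    properties = feature["properties"]
    # Assumption: a key that is absent is treated the same as a value of 0.
    return [name for name in SCORED_ATTRIBUTES if properties.get(name, 0) == 0]


feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [100.5018, 13.7563]},
    "properties": {
        "location_type": "facility",
        "commodity_presence": 1.0,
        "poi_presence": 1.0,
        "building_presence": 1.0,
        "road_access": 0.0,
    },
}
gaps = missing_attributes(feature)
```

A list like this can be shown next to `primary_warning` when reviewing medium or low confidence results.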

Confidence Levels

The confidence score reflects the facility's suitability as a commodity aggregation point based on critical infrastructure and commodity presence:

  • High (≥ 0.7): Strong evidence of suitable commodity facility with good infrastructure
  • Medium (0.45 to < 0.7): Moderate evidence with some infrastructure gaps
  • Low (< 0.45): Weak evidence or significant infrastructure/commodity gaps
  • Invalid: Geometry missing or invalid; no scoring performed

Interpretation Guide:

  • 0.9-1.0: Exceptional facility characteristics, very strong commodity aggregation point
  • 0.7-0.9: High quality facility characteristics, strong commodity aggregation point
  • 0.45-0.7: Moderate facility characteristics, requires verification
  • 0.3-0.45: Poor facility characteristics, significant issues
  • 0.0-0.3: Very poor facility characteristics, unlikely commodity aggregation point
  • Invalid geometry: Feature skipped; confidence_level is "invalid" and confidence_comment explains the error


Possible confidence_level values (per facility)

These are the only values returned in each feature’s properties.confidence_level by validate_facilities / validate_facility / validate_locations (facility path):

| Value | Meaning |
| --- | --- |
| high | Confidence score ≥ 0.7. Strong commodity and infrastructure indicators. |
| medium | Confidence score ≥ 0.45 and < 0.7. Some important gaps. |
| low | Confidence score < 0.45, or facility failed country/latitude/commodity checks. |
| invalid | Geometry is null or empty. No scoring; confidence_comment describes the error. |
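
The four values can be expressed as a single classification function. This is a sketch: the parameter names `geometry_valid` and `passed_region_checks` are hypothetical stand-ins for the geometry check and the country/latitude/commodity checks described in the table.

```python
def confidence_level(score, geometry_valid=True, passed_region_checks=True):
    """Map a confidence score to 'high', 'medium', 'low', or 'invalid'."""
    if not geometry_valid:
        return "invalid"  # null or empty geometry: no scoring performed
    if not passed_region_checks:
        return "low"      # failed country/latitude/commodity checks
    if score >= 0.7:
        return "high"
    if score >= 0.45:
        return "medium"
    return "low"
```

Both thresholds are inclusive on the lower bound, so a score of exactly 0.7 is `high` and exactly 0.45 is `medium`.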

Aggregate collection properties.confidence_level is a counts object (e.g. {"high": 89, "medium": 149, "low": 144}). When this result is used in batch supply shed, the collection-level value written to Firestore is a single string: "low" if any facility is low, otherwise the most common level (so "high", "medium", "low", or "invalid" can appear).
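
The collection-level rule described above ("low" if any facility is low, otherwise the most common level) can be sketched as follows. The source does not say how ties or empty collections are handled; here ties resolve to the level encountered first and an empty list returns `None`, both of which are assumptions.

```python
from collections import Counter


def collection_confidence_level(levels):
    """Aggregate per-facility levels into one collection-level string."""
    if not levels:
        return None  # assumption: behavior for an empty collection is unspecified
    if "low" in levels:
        return "low"  # any low facility makes the whole collection low
    # most_common keeps first-seen order among equal counts (assumed tie-break).
    return Counter(levels).most_common(1)[0][0]
```

Under this rule a single low facility outweighs any number of high ones, which is why `invalid` can only surface when no facility is low.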


How validation is used by batch supply shed

Facility validation produces the assessment (per-facility and aggregated confidence_level). That assessment is what batch supply shed uses; it is not generated by the batch supply shed endpoint itself.

  1. validate_facilities / validate_facility / validate_locations (facility path) compute per-facility confidence_level and optional confidence_comment.
  2. When batch supply shed stages a facility collection with validate_locations=true, it runs the same validation logic, aggregates to one collection-level confidence_level, and writes it to Firestore under facilityValidation.
  3. fetch_deforestation_check and stat_supply_shed do not recompute confidence. They read the stored assessment (or a skip reason) and expose it as locations_confidence. Allowed values are: high, medium, low, invalid, or (when processing was skipped) not processed: validation confidence too low, not processed: geocoding failed, not processed: geocoding too imprecise. No custom messages are returned. See fetch_deforestation_check for the full list and meanings.
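
Because `locations_confidence` is limited to a fixed set of strings, a client can distinguish assessed collections from skipped ones with a simple lookup. The function name `parse_locations_confidence` and the returned tuple shape are hypothetical; the string values are those listed in step 3.

```python
CONFIDENCE_VALUES = {"high", "medium", "low", "invalid"}
SKIP_PREFIX = "not processed: "
SKIP_VALUES = {
    SKIP_PREFIX + "validation confidence too low",
    SKIP_PREFIX + "geocoding failed",
    SKIP_PREFIX + "geocoding too imprecise",
}


def parse_locations_confidence(value):
    """Return ('assessed', level) or ('skipped', reason) for a locations_confidence value."""
    if value in CONFIDENCE_VALUES:
        return ("assessed", value)
    if value in SKIP_VALUES:
        return ("skipped", value[len(SKIP_PREFIX):])
    raise ValueError(f"Unexpected locations_confidence value: {value!r}")
```

Raising on unknown values reflects the statement that no custom messages are returned; a more lenient client could log and continue instead.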

Data Sources

  • Earth Engine: Commodity presence detection and building density analysis using satellite imagery
  • Overture Maps: Road access, water features, building footprints, and POI data
  • Google Places API: Business listings and POI verification
  • Gemini AI: Combined analysis and scoring based on multiple factors

Performance Notes

  • Road Access: Uses fast Overture Places queries (default enabled)
  • Water Access: Uses Overture Places for water features (optional)
  • POI Detection: Combines Overture Maps and Google Places API
  • Commodity Detection: Earth Engine satellite analysis
  • Parallel Processing: All validation checks run concurrently for optimal performance

Error Handling

The API handles various error conditions:

  • Invalid geometries: Scores are set to 0 and an explanatory error message is returned
  • API timeouts: Graceful fallback with partial results
  • Missing data: Values default to 0, with explanatory comments
  • Rate limiting: Automatic retry with exponential backoff

Use Cases

  • Supply Chain Verification: Validate claimed aggregation facilities
  • Due Diligence: Assess facility suitability for commodity sourcing
  • Risk Assessment: Identify potentially fraudulent facility claims
  • Compliance: Support EUDR and other deforestation regulations
  • Site Selection: Evaluate potential locations for new facilities