🛡️ Exception Handling & Resilience

The system implements a centralised exception handling strategy to ensure service stability, especially during inter-service communication between the Java API Gateway and the Python AI Services.

🏗️ Architecture

The project uses Spring Boot's @ControllerAdvice to intercept exceptions globally. This prevents raw stack traces from being exposed to the end-user, which is a key security best practice.

Key Components

RlInferenceException: A custom runtime exception used specifically for failures during Reinforcement Learning inference requests.
GlobalExceptionHandler: Intercepts all exceptions and maps them to standardised ErrorResponse DTOs.

🚦 Handling Inter-Service Failures

When the Gateway communicates with the rl-inference-service or lstm-predictor-service, several things can go wrong (timeouts, connection refused, or invalid model state).

Exception	HTTP Status	Mitigation Strategy
`RlInferenceException`	502 Bad Gateway	Logs the specific AI service error and returns a clean JSON response.
`MethodArgumentNotValidException`	400 Bad Request	Validates input telemetry vectors before they reach the AI models.
`Exception` (General)	500 Internal Error	Generic fallback to prevent internal logic leakage.

📝 Standardised Error Response

All exceptions return a consistent JSON contract, allowing the frontend or simulation scripts to handle errors gracefully:

{
  "message": "Error communicating with RL service",
  "details": "Connection timed out",
  "timestamp": "2026-04-27T10:00:00Z"
}