Package com.linkedin.davinci.storage
Class DiskHealthCheckService
- java.lang.Object
  - com.linkedin.venice.service.AbstractVeniceService
    - com.linkedin.davinci.storage.DiskHealthCheckService
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class DiskHealthCheckService extends AbstractVeniceService
DiskHealthCheckService wakes up every 10 seconds by default and runs a disk health check: it writes 64 KB of random data, reads it back, and verifies the content. If any error occurs during this process, the in-memory state variable "diskHealthy" is set to false; otherwise it stays true.

If an SSD fails, a disk operation can hang forever. To report this kind of failure, the health status polling API applies a timeout, which is computed once at the beginning:

totalTimeout = Math.max(30 seconds, health check interval + disk operation timeout)

The service tracks the last time the in-memory health status variable was updated. If the status has not been updated for longer than totalTimeout, the disk operation is assumed to have hung because of a disk failure, and the server starts reporting unhealthy.
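The write, read back, and verify step described above can be illustrated with a minimal sketch. This is not the DiskHealthCheckService implementation: the class name `DiskProbeSketch`, the method `probe`, and the file name `disk_probe.tmp` are hypothetical, and only standard `java.nio.file` calls are used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of a write/read/verify disk probe; names are illustrative only. */
class DiskProbeSketch {
  static final int PROBE_SIZE_BYTES = 64 * 1024; // 64 KB, as in the class description

  /** Returns null if the probe passed, otherwise a short error message. */
  static String probe(Path directory) {
    Path probeFile = directory.resolve("disk_probe.tmp"); // hypothetical file name
    byte[] written = new byte[PROBE_SIZE_BYTES];
    new Random().nextBytes(written);
    try {
      Files.write(probeFile, written);                 // write random data
      byte[] readBack = Files.readAllBytes(probeFile); // read it back
      if (!Arrays.equals(written, readBack)) {         // verify the content
        return "Content read back from " + probeFile + " does not match what was written";
      }
      return null;
    } catch (IOException e) {
      return "Disk operation failed: " + e;
    } finally {
      try {
        Files.deleteIfExists(probeFile);
      } catch (IOException ignored) {
        // Best-effort cleanup; the probe result has already been decided.
      }
    }
  }
}
```

A caller could map a null result to "diskHealthy" = true and keep any non-null message as the error to report. Note that a read issued right after a write may be served from the operating system's page cache, so a probe like this mainly detects I/O errors and hangs rather than guaranteeing a physical read.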
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.linkedin.venice.service.AbstractVeniceService
AbstractVeniceService.ServiceState
-
-
Field Summary
-
Fields inherited from class com.linkedin.venice.service.AbstractVeniceService
logger, serviceState
-
-
Constructor Summary
Constructors
DiskHealthCheckService(boolean serviceEnabled, long healthCheckIntervalMs, long diskOperationTimeout, java.lang.String databasePath, long diskFailServerShutdownTimeMs)
-
Method Summary
Modifier and Type    Method
java.lang.String     getErrorMessage()
boolean              isDiskHealthy()
boolean              startInner()
void                 stopInner()
-
-
-
Method Detail
-
startInner
public boolean startInner()
- Specified by:
startInner in class AbstractVeniceService
- Returns:
true if the service is completely started, false if it is still starting asynchronously (in this case, it is the implementer's responsibility to set AbstractVeniceService.serviceState to AbstractVeniceService.ServiceState.STARTED upon completion of the async work).
-
isDiskHealthy
public boolean isDiskHealthy()
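The class description states that the health status polling API reports unhealthy once the in-memory status has gone stale for longer than totalTimeout. Assuming isDiskHealthy() is that polling API, the rule could be sketched as follows. The class `HealthStatusSketch`, its method names, and the explicit `nowMs` parameters are hypothetical (time is passed in only to keep the example deterministic), and the disk operation timeout is assumed to be in milliseconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the staleness rule; not the actual DiskHealthCheckService code. */
class HealthStatusSketch {
  private final long totalTimeoutMs;
  private volatile boolean diskHealthy = true;
  private volatile long lastStatusUpdateTimeMs;

  HealthStatusSketch(long healthCheckIntervalMs, long diskOperationTimeoutMs, long nowMs) {
    // totalTimeout = Math.max(30 seconds, health check interval + disk operation timeout)
    this.totalTimeoutMs =
        Math.max(TimeUnit.SECONDS.toMillis(30), healthCheckIntervalMs + diskOperationTimeoutMs);
    this.lastStatusUpdateTimeMs = nowMs;
  }

  long getTotalTimeoutMs() {
    return totalTimeoutMs;
  }

  /** Called by the periodic check after each completed probe. */
  void recordCheckResult(boolean healthy, long nowMs) {
    this.diskHealthy = healthy;
    this.lastStatusUpdateTimeMs = nowMs;
  }

  /** Polling side: a stale status is treated as a hung disk operation. */
  boolean isHealthy(long nowMs) {
    if (nowMs - lastStatusUpdateTimeMs > totalTimeoutMs) {
      return false;
    }
    return diskHealthy;
  }
}
```

With a 10-second interval and a 10-second operation timeout, the 30-second floor applies, so a status last updated at time 0 is still trusted at 30,000 ms and treated as unhealthy from 30,001 ms onward.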
-
getErrorMessage
public java.lang.String getErrorMessage()
-
stopInner
public void stopInner()
- Specified by:
stopInner in class AbstractVeniceService