My job runs out of memory (OOM)
Problem
My job runs out of memory (OOM) and crashes or restarts.
Possible cause
Jobs uses gVisor as a sandboxing solution to isolate client containers that share the same host. In this sandbox, directories such as /tmp and /dev are stored in memory, so any data written to /tmp counts directly against the container's memory. For example, writing a 100 MB file to /tmp increases the container's memory usage by the same amount.
Excessive writes to /tmp can fill up the provisioned memory, leading to out-of-memory (OOM) errors, potentially triggering a job restart.
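To check whether temporary files are a plausible cause, you can measure how much data currently sits in /tmp from inside the job. The following is a minimal Python sketch, not an official tool: the function name `directory_size_bytes` is illustrative, and it only sums file sizes, so it approximates rather than exactly reports the memory consumed.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Return the total size in bytes of the files found under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat counts a symlink by its own size, not its target's.
                total += os.lstat(path).st_size
            except OSError:
                # The file disappeared or cannot be read; skip it.
                continue
    return total


if __name__ == "__main__":
    used = directory_size_bytes("/tmp")
    print(f"/tmp currently holds {used / (1024 * 1024):.1f} MiB")
```

Logging this value at a few points during the job shows whether /tmp usage grows toward the provisioned memory limit.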
Solutions
- Avoid writing temporary files in /tmp.
- Use alternative storage paths within your job.
- Increase the job's memory allocation to accommodate temporary file usage.
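As an illustration of the first two points, the Python sketch below creates temporary files under a directory you choose instead of the default /tmp. It rests on assumptions: the `JOB_SCRATCH_DIR` environment variable and the `scratch` fallback directory are hypothetical names, not platform settings, and you should verify that the path you pick is not itself memory-backed in your job.

```python
import os
import tempfile
from contextlib import contextmanager

# Hypothetical setting: point this at a path that is not stored in memory.
SCRATCH_ROOT = os.environ.get(
    "JOB_SCRATCH_DIR", os.path.join(os.getcwd(), "scratch")
)


@contextmanager
def scratch_dir(root: str = SCRATCH_ROOT):
    """Yield a temporary directory under root and delete it on exit."""
    os.makedirs(root, exist_ok=True)
    # dir= overrides tempfile's default location, which is usually /tmp.
    with tempfile.TemporaryDirectory(dir=root) as path:
        yield path


if __name__ == "__main__":
    with scratch_dir() as workdir:
        target = os.path.join(workdir, "intermediate.bin")
        with open(target, "wb") as fh:
            fh.write(b"\0" * 1024)
        print("wrote", os.path.getsize(target), "bytes to", target)
```

Because the directory is removed when the `with` block ends, intermediate files do not accumulate over the lifetime of the job.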
Still need help? Create a support ticket