Fair Use Policy¶
Jobs on the Slices AI infrastructure are subject to a fair usage policy to ensure that resources are used fairly and efficiently.
General rules¶
Jobs must request only the resources they require. Wasting resources on the Slices AI infrastructure may result in your jobs being cancelled and future jobs being deprioritized.
Jobs must be relevant to the project in which they are being run. Request a new project if necessary.
Good practices¶
Jobs should start their computation automatically, without the need for manual intervention.
When multiple GPUs are requested, the amount of non-GPU preprocessing should be minimized: computations on the GPUs should start within the first hour after the job starts running.
Jobs should exit once the computation has ended, so that the allocated resources are released.
Long-running jobs should checkpoint their computations: hardware failures do occur, so make sure that you don’t lose days’ worth of work.
Split your work into multiple smaller jobs: this allows them to run in parallel and reduces job run time.
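Checkpointing can be as simple as periodically serializing your progress to the mounted storage and resuming from the latest checkpoint at startup. A minimal sketch in plain Python (the checkpoint filename and checkpoint interval below are illustrative assumptions, not part of the policy):

```python
import os
import pickle

# In practice, place the checkpoint file on the mounted storage,
# not on the node's local disk.
CHECKPOINT = "checkpoint.pkl"

def load_state():
    """Resume from the latest checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}

def save_state(state):
    """Write atomically: a crash mid-write must not corrupt the checkpoint."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 1000):
    state["total"] += step          # stand-in for one unit of real work
    state["step"] = step + 1
    if state["step"] % 100 == 0:    # checkpoint every 100 steps (illustrative)
        save_state(state)
save_state(state)
```

If the job is killed and restarted, it picks up from the last saved step instead of recomputing everything; the atomic rename ensures a partially written checkpoint never replaces a good one.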
Don’t flood the standard output. The system captures logs written to your job’s stdout/stderr, but drops messages when too many are produced in a short amount of time (a leaky bucket algorithm is used for this).
If you need to log a lot, write your logs to a file on the mounted storage instead.
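One simple way to do this from Python is the standard logging module with a FileHandler pointing at the mounted storage. A minimal sketch (the logger name and file path are illustrative assumptions; adjust the path to your project’s mount point):

```python
import logging

# Send verbose logs to a file instead of stdout/stderr; in practice the
# file should live on the mounted storage, e.g. under your project's mount.
logger = logging.getLogger("myjob")          # "myjob" is a placeholder name
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler("job.log")     # placeholder path
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("job started")
for i in range(10000):
    logger.debug("step %d", i)  # high-volume output goes to the file, not stdout
logger.info("job finished")
```

Only the occasional high-level status message, if anything, should still go to stdout; the detailed per-step output stays in the file, where it is not subject to the stdout rate limit.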
Maximum simultaneous jobs¶
To ensure a fair distribution of resources, there is a cap on the number of concurrent jobs:
10 jobs per user
20 jobs per project
On some smaller clusters the limit may be further reduced to ensure fair access for everyone.
These limits can be temporarily adjusted with a motivated request to the Slices AI infrastructure admins.