AI Infrastructure service
The Slices AI infrastructure is a distributed system for running jobs in GPU-enabled Docker containers. It consists of a set of heterogeneous clusters, each with its own characteristics (GPU model, CPU speed, memory, bus speed, …), so you can select the hardware best suited to your workload. Each job runs in isolation inside a Docker container with dedicated CPUs, GPUs and memory for maximum performance.
This documentation explains what the Slices AI infrastructure is and how to use it.
Hint
Looking for a quick introduction? Have a look at our 'JupyterHub introduction for the Slices AI infrastructure' slide deck.
For bug reports, questions and feedback:
E-mail us at gpulab@ilabt.imec.be
Table of Contents
- Overview
- Key Concepts
- Getting Started
  - Step 1 - Get a project
  - Step 2 - Explore the Slices AI infrastructure interactively with JupyterHub
  - Step 3 - Write your training script
  - Step 4 - Install and configure the CLI
  - Step 5 - Write the job definition
  - Step 6 - Submit the job
  - Step 7 - Monitor and debug the job
  - Step 8 - Retrieve results
  - Next steps
- JupyterHub
- CLI
- The Job Request
- Storage
- Reservations
- Fair Use Policy
- Build and use a custom Docker image
- Get a large dataset into a job
- The Slices AI infrastructure API
- Software compatibility by GPU architecture