My AI inference server:

This is my AI server project. It has 6 GPUs and is capable of running 70B models. Four GPUs are dedicated to LLM inference, and the other two handle image, video, and speech workloads. It runs Open WebUI with Ollama for LLMs and ComfyUI for image and video generation. Everything runs containerized in Docker. A rough sketch of how such a GPU split can be wired up is shown below.
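
The split of GPUs between services can be expressed in a Docker Compose file. The sketch below is an illustration only, not the actual configuration of this server: the ComfyUI image name, the port mappings, and the GPU device IDs are assumptions.

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1", "2", "3"]   # 4 GPUs reserved for LLM inference (assumed IDs)
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                              # assumed host port
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434      # point the UI at the ollama container
    depends_on:
      - ollama

  comfyui:
    image: comfyui:latest                        # placeholder; substitute your own ComfyUI image/build
    ports:
      - "8188:8188"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["4", "5"]             # 2 GPUs for image/video/speech (assumed IDs)
              capabilities: [gpu]

volumes:
  ollama:
```

Pinning each service to specific `device_ids` keeps the LLM and media workloads from competing for the same VRAM; this requires the NVIDIA Container Toolkit on the host.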