Fix: Build resource limits #328

Closed
opened 2025-03-09 23:39:57 +00:00 by jamie · 2 comments
jamie commented 2025-03-09 23:39:57 +00:00 (Migrated from git.hazaar.io)

Problem Statement

The Kubernetes GitLab runner currently lacks resource limits, which allows unit tests and build processes to consume excessive CPU and RAM. This can result in nodes becoming unresponsive or crashing due to resource exhaustion.

Who will benefit?

  • Developers running builds and unit tests in GitLab CI/CD.
  • System administrators managing Kubernetes cluster stability.
  • Other applications running on the same nodes that could be affected by resource spikes.

Expected Behavior

  • The GitLab runner should enforce CPU and memory limits on jobs.
  • Unit tests and builds should not be able to exhaust node resources.
  • Jobs exceeding resource limits should be throttled or terminated gracefully.

Actual Behavior

  • Some unit tests and build jobs consume excessive CPU and RAM.
  • High resource usage can take down entire nodes.
  • Other applications running on the same cluster experience degraded performance.

Proposed Solution

  • Define resource limits and requests in the GitLab runner Kubernetes configuration.
  • Set appropriate CPU and memory limits for build and test jobs.
  • Monitor and adjust limits based on actual usage patterns.

Priority/Severity

  • [x] High (This will bring a huge increase in performance/productivity/usability/legislative cover)
  • [ ] Medium (This will bring a good increase in performance/productivity/usability)
  • [ ] Low (anything else, e.g., trivial, minor improvements)
jamie commented 2025-03-09 23:39:58 +00:00 (Migrated from git.hazaar.io)

assigned to @jamie
jamie commented 2025-03-10 08:00:31 +00:00 (Migrated from git.hazaar.io)

I have added the following pod resource limits to the GitLab runner:

      [runners.kubernetes]
        namespace = "devops"
        image = "ubuntu:24.04"
        privileged = true
        pull_policy = ['if-not-present']
        allowed_pull_policies = ['always', 'if-not-present']
        cpu_limit = "2"
        helper_cpu_limit = "1"
        service_cpu_limit = "2"
        memory_limit = "2Gi"
        helper_memory_limit = "1Gi"
        service_memory_limit = "2Gi"
        cpu_request = "500m"
        helper_cpu_request = "250m"
        service_cpu_request = "250m"
        memory_request = "500Mi"
        helper_memory_request = "500Mi"
        service_memory_request = "1Gi"
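
As a quick sanity check on settings like these, the sketch below parses Kubernetes-style quantity strings and flags any value with an unrecognised suffix or any request that exceeds its matching limit. It is only an illustration: `parse_quantity` and `find_problems` are hypothetical helper names, not part of GitLab Runner or Kubernetes, and the suffix table covers only a common subset (no `P`, `E`, `Pi`, `Ei` or exponent notation).

```python
import re
from decimal import Decimal

# Subset of the suffixes Kubernetes accepts for resource quantities.
# "m" is milli (used for CPU), "Ki"/"Mi"/"Gi"/"Ti" are powers of two.
SUFFIXES = {
    "": Decimal(1),
    "m": Decimal(1) / Decimal(1000),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
}
QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def parse_quantity(text):
    """Convert a quantity such as "500m" or "2Gi" to a Decimal base value."""
    match = QUANTITY.match(text)
    if match is None or match.group(2) not in SUFFIXES:
        raise ValueError(f"invalid quantity: {text!r}")
    return Decimal(match.group(1)) * SUFFIXES[match.group(2)]


def find_problems(settings):
    """Return human-readable problems found in *_limit / *_request settings."""
    problems = []
    parsed = {}
    for key, value in settings.items():
        try:
            parsed[key] = parse_quantity(value)
        except ValueError as exc:
            problems.append(f"{key}: {exc}")
    for key, amount in parsed.items():
        if not key.endswith("_request"):
            continue
        limit_key = key[: -len("_request")] + "_limit"
        if limit_key in parsed and amount > parsed[limit_key]:
            problems.append(f"{key} exceeds {limit_key}")
    return problems


if __name__ == "__main__":
    # A few of the values from the runner configuration above.
    settings = {
        "cpu_limit": "2",
        "cpu_request": "500m",
        "service_memory_limit": "2Gi",
        "service_memory_request": "1Gi",
    }
    print(find_problems(settings))
```

Comparing parsed values rather than raw strings matters because the same resource can be written with different suffixes, for example `500m` CPU against a limit of `2`.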
jamie (Migrated from git.hazaar.io) closed this issue 2025-03-10 08:00:33 +00:00
jamie self-assigned this 2025-09-04 01:13:36 +00:00
jamie removed their assignment 2025-09-04 01:13:44 +00:00
jamie self-assigned this 2025-09-04 01:13:48 +00:00
jamie removed their assignment 2025-09-04 01:13:54 +00:00
jamie self-assigned this 2025-09-04 01:14:00 +00:00
jamie removed their assignment 2025-09-04 01:14:06 +00:00
jamie self-assigned this 2025-09-04 01:14:21 +00:00
Reference: hazaar/framework#328