gainforest - Notice history

All systems operational

climateai.org ( Dev PDS ) - Operational

100% - uptime
Apr 2026 · 100.0%
May 2026 · 100.0%
Jun 2026 · 100.0%

gainforest.id ( prod PDS ) - Operational

100% - uptime
Apr 2026 · 99.96%
May 2026 · 100.0%
Jun 2026 · 99.96%

auth.gainforest.id ( auth service for gainforest.id ) - Operational

100% - uptime
Apr 2026 · 99.96%
May 2026 · 100.0%
Jun 2026 · 99.99%

auth.climateai.org ( auth service for climateai.org ) - Operational

99% - uptime
Apr 2026 · 99.97%
May 2026 · 98.55%
Jun 2026 · 99.56%

beta.fund.gainforest.app ( prod application ) - Operational

100% - uptime
Apr 2026 · 100.0%
May 2026 · 99.94%
Jun 2026 · 100.0%

api.hi.gainforest.app ( indexer ) - Operational

98% - uptime
Apr 2026 · 94.08%
May 2026 · 99.42%
Jun 2026 · 99.99%

hyperlabel-production.up.railway.app (labeller) - Operational

100% - uptime
Apr 2026 · 100.0%
May 2026 · 100.0%
Jun 2026 · 99.97%

dev-api-hi.gainforest.app ( dev indexer ) - Operational

100% - uptime
Apr 2026 · 100.0%
May 2026 · 100.0%
Jun 2026 · 99.99%

Notice history

Jun 2026

May 2026

auth.climateai.org ( auth service for climateai.org ) is back up
  • Postmortem

    Postmortem: climateai.org PDS outage

    Summary

    The climateai.org PDS went down after the Caddy reverse proxy was repeatedly killed by the Linux OOM killer. Caddy memory usage grew to roughly 2.8–2.9 GB RSS, exhausting available VM memory and causing the host to become unstable.

    Root Cause

    Before /tls-check was added, Caddy’s on-demand TLS could issue certificates for invalid or nested subdomains. This left a large number of stale certificates in Caddy’s certificate storage.

    Although /tls-check has been enabled for quite some time and now correctly rejects invalid nested domains, the certificates issued before that protection existed remained in Caddy’s storage.

    During Caddy certificate maintenance or renewal activity, Caddy processed this large stale certificate store. That caused memory usage to spike high enough for the kernel to OOM-kill the Caddy process.
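
    As an illustration of the kind of rule /tls-check enforces, here is a minimal sketch of an on-demand TLS permission check. Caddy's on-demand TLS can call an "ask" endpoint with a ?domain= query parameter and only issues a certificate when the endpoint answers with a 2xx status. The function name, the label regex, and the 403 response below are assumptions for illustration, not the PDS's actual implementation; a real check would likely also confirm that the handle belongs to an existing account.

```python
import re
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlparse

BASE_DOMAIN = "climateai.org"  # base domain named in the postmortem

# Assumed label syntax: lowercase letters, digits, hyphens, 1 to 63 characters.
LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def is_allowed_domain(domain: str, base: str = BASE_DOMAIN) -> bool:
    """Allow the base domain and names with exactly one label in front of it."""
    name = domain.strip().lower().rstrip(".")
    if name == base:
        return True
    suffix = "." + base
    if not name.endswith(suffix):
        return False
    prefix = name[: -len(suffix)]
    # A dot in the prefix means a nested subdomain such as a.b.climateai.org.
    return "." not in prefix and LABEL_RE.match(prefix) is not None


class TlsCheckHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: 200 permits issuance, any non-2xx status denies it."""

    def do_GET(self):
        url = urlparse(self.path)
        domain = parse_qs(url.query).get("domain", [""])[0]
        allowed = url.path == "/tls-check" and is_allowed_domain(domain)
        self.send_response(200 if allowed else 403)
        self.end_headers()
```

    With this rule, climateai.org, auth.climateai.org, and one-label handles such as alice.climateai.org pass, while a.b.climateai.org is rejected. A check like this only blocks new issuance; it does not clean up certificates that were stored before the check existed.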

    Impact

    • Public HTTPS access to the PDS became unavailable.

    • SSH access also became unreliable while the VM was under memory pressure.

    • PDS account data, DIDs, repositories, records, and handles were not deleted or corrupted.

    • The issue was limited to Caddy/TLS handling and VM memory exhaustion.

    Why it kept restarting

    The Caddy container was configured with Docker’s unless-stopped restart policy. After each OOM kill, Docker restarted Caddy automatically. Because the stale certificate storage was still present, Caddy kept hitting the same memory pressure and was killed again. This created a repeated restart/OOM loop and eventually left Docker/containerd in a noisy cleanup state.
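
    A loop like this can often be spotted from container metadata. The sketch below reads the JSON printed by `docker inspect <container>` and flags containers that were OOM-killed, keep being restarted by policy, or run without a memory limit. The function name and the restart threshold are assumptions; the fields used (State.OOMKilled, RestartCount, HostConfig.Memory, HostConfig.RestartPolicy.Name) are part of Docker's inspect output. State.OOMKilled may not be set when the kernel kills a process under host-wide memory pressure, so kernel logs remain the more reliable signal.

```python
import json


def diagnose(inspect_json: str, restart_threshold: int = 3) -> list[str]:
    """Return human-readable findings for each container in `docker inspect` output."""
    findings = []
    for container in json.loads(inspect_json):
        name = container.get("Name", "?").lstrip("/")
        state = container.get("State", {})
        host = container.get("HostConfig", {})
        policy = host.get("RestartPolicy", {}).get("Name", "")
        restarts = container.get("RestartCount", 0)

        # OOM-killed, restarted automatically, and already restarted several times.
        if (
            state.get("OOMKilled")
            and policy in ("always", "unless-stopped")
            and restarts >= restart_threshold
        ):
            findings.append(
                f"{name}: OOM restart loop suspected "
                f"({restarts} restarts, policy {policy})"
            )

        # HostConfig.Memory is 0 when no memory limit is configured.
        if host.get("Memory", 0) == 0:
            findings.append(f"{name}: no memory limit set")
    return findings
```

    For example, `diagnose(subprocess.check_output(["docker", "inspect", "caddy"], text=True))` would report on a container named caddy; that container name is hypothetical.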

    Remediation

    Precautions have now been taken to prevent recurrence:

    • Removed stale invalid/nested-domain certificate entries from Caddy storage.

    • Kept valid certificates for:

      • climateai.org

      • auth.climateai.org

      • valid one-label handle subdomains

    • Confirmed current /tls-check rejects invalid nested domains.

    • Added, or are in the process of adding, a memory limit for the Caddy container so that it cannot consume enough RAM to destabilize the whole VM.
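
    The certificate cleanup step can be sketched as a dry run that sorts stored certificates into "keep" and "remove" lists using the criteria above. This assumes Caddy's file storage keeps one directory per certificate name under certificates/<issuer>/; the function names and the storage path in the usage note are hypothetical, and the sketch deletes nothing.

```python
from pathlib import Path

BASE_DOMAIN = "climateai.org"  # base domain named in the postmortem


def is_kept(name: str, base: str = BASE_DOMAIN) -> bool:
    """Keep the base domain and names with exactly one label in front of it."""
    name = name.lower()
    if name == base:
        return True
    prefix, sep, rest = name.partition(".")
    return bool(prefix) and sep == "." and rest == base


def plan_cleanup(cert_root: Path, base: str = BASE_DOMAIN):
    """Dry run: return (keep, remove) lists of certificate directories."""
    keep, remove = [], []
    for issuer_dir in sorted(p for p in cert_root.iterdir() if p.is_dir()):
        for cert_dir in sorted(p for p in issuer_dir.iterdir() if p.is_dir()):
            (keep if is_kept(cert_dir.name, base) else remove).append(cert_dir)
    return keep, remove
```

    Calling `plan_cleanup(Path("/data/caddy/certificates"))` would list the candidates for review before anything is deleted. Reviewing the "remove" list by hand first is the safer design choice, since a wrongly deleted valid certificate has to be reissued.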

  • Resolved
    auth.climateai.org ( auth service for climateai.org ) is back up. This incident was automatically resolved by Instatus monitoring.
  • Investigating
    auth.climateai.org ( auth service for climateai.org ) is down at the moment. This incident was automatically created by Instatus monitoring.

Apr 2026
