The noVNC view hung on "Connecting" forever then timed out.

Root cause: x11vnc sweeps the entire fd table (fcntl per fd) on every client connection, and containerd grants pods RLIMIT_NOFILE=2^31, so the RFB handshake never completes. websockify accepts the WS and dials localhost:5900, but x11vnc never sends its banner. Verified: the handshake timed out at 8s, and x11vnc had burned 1h41m of CPU spinning. Same bug + fix the android-emulator stack already carries.

Cap nofile before x11vnc starts, in two places:
- files/novnc/entrypoint.sh: `ulimit -n 65536` (root fix, makes the image correct)
- main.tf novnc container: `command = ["bash","-c","ulimit -n 65536; exec /entrypoint.sh"]` so the cap applies deterministically on rollout even though the image is :latest/IfNotPresent (a rebuilt entrypoint isn't guaranteed to be re-pulled).

Also documents the gotcha + diagnosis in docs/architecture/chrome-service.md and notes the black-when-idle behaviour + the autoconnect URL.

(A live x11vnc relaunch with the cap already unblocked the running pod; this makes it survive restarts.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
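The diagnosis described above (no RFB banner within 8s) could be reproduced from inside the pod with something like the sketch below. The helper names `is_rfb_banner` and `probe_rfb` are hypothetical and not part of this repo; the only protocol fact relied on is that an RFB server speaks first, sending a one-line ProtocolVersion banner such as `RFB 003.008`.

```bash
#!/usr/bin/env bash
# Hypothetical diagnostic helpers; the names are illustrative only.

# Succeeds if $1 looks like an RFB ProtocolVersion banner, e.g. "RFB 003.008".
is_rfb_banner() {
  local re='^RFB [0-9]{3}\.[0-9]{3}$'
  [[ $1 =~ $re ]]
}

# Connect to host:port and wait up to timeout_s seconds for the banner.
# The body is a subshell (parentheses), so fd 3 is closed on return and a
# failed connect cannot affect the calling shell.
probe_rfb() (
  host=$1 port=$2 timeout_s=${3:-8}
  exec 3<>"/dev/tcp/$host/$port" || exit 1
  IFS= read -r -t "$timeout_s" -u 3 banner || exit 1
  is_rfb_banner "$banner" && printf '%s\n' "$banner"
) 2>/dev/null

# Example use inside the novnc container:
#   echo "nofile soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
#   probe_rfb 127.0.0.1 5900 8 || echo "no RFB banner within 8s"
```

A very large nofile value together with a failing probe would be consistent with the fd-sweep hang, though it does not prove it on its own.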
46 lines
1.6 KiB
Bash
#!/usr/bin/env bash
# Connect to the chrome-service container's Xvfb (shared pod network, TCP)
# and serve the noVNC HTML5 client + websockify bridge on :6080.
set -e

# Containerd grants pods an effectively unbounded RLIMIT_NOFILE (2^31). x11vnc
# sweeps the WHOLE fd table with fcntl on every client connection, so each VNC
# connect hangs for ~forever and the noVNC client sits on "Connecting" until it
# times out. Cap it before launching x11vnc. (Same fix as the android-emulator
# stack; see docs/architecture/chrome-service.md "noVNC fd-sweep".)
ulimit -n 65536

for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do
  if echo > /dev/tcp/127.0.0.1/6099 2>/dev/null; then
    echo "Xvfb TCP up after attempt $i"
    break
  fi
  echo "waiting for Xvfb TCP 6099 attempt=$i"
  sleep 2
done

# websockify runs as PID 1; x11vnc is a child so its logs land on container stdout
# `-noshm` skips MIT-SHM probes that fail across container boundaries (each
# container has its own /dev/shm); `-noxdamage` skips XDAMAGE which Xvfb
# doesn't expose; `-quiet` keeps the polling chatter out of pod logs.
echo "starting x11vnc -> :5900"
x11vnc -display localhost:99 -nopw -listen 0.0.0.0 -rfbport 5900 \
  -forever -shared -noshm -noxdamage -quiet 2>&1 &
X11VNC_PID=$!

for i in 1 2 3 4 5 6 7 8 9 10; do
  if echo > /dev/tcp/127.0.0.1/5900 2>/dev/null; then
    echo "x11vnc bound 5900 after attempt $i"
    break
  fi
  echo "waiting for x11vnc :5900 attempt=$i"
  sleep 2
done

if ! echo > /dev/tcp/127.0.0.1/5900 2>/dev/null; then
  echo "ERROR: x11vnc did not bind 5900"
  exit 1
fi

echo "starting websockify -> :6080"
exec websockify --web=/usr/share/novnc 6080 localhost:5900
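One edge case with an unconditional `ulimit -n 65536` under `set -e`: if a runtime ever hands the container a hard limit below 65536, an unprivileged process cannot raise it, the builtin fails, and the entrypoint exits. A minimal sketch of a guarded variant follows; `cap_nofile` is a hypothetical helper, not something in this repo, and it assumes the soft limit is the value that matters for the fd sweep.

```bash
#!/usr/bin/env bash
# Hypothetical helper: lower RLIMIT_NOFILE to a cap, but never try to raise it.

cap_nofile() {
  local cap=${1:-65536} cur
  cur=$(ulimit -Sn)   # current soft limit, or the word "unlimited"
  if [[ $cur == unlimited ]] || (( cur > cap )); then
    # Without -S or -H, bash sets both the soft and the hard limit.
    ulimit -n "$cap"
  fi
}

# Example use near the top of an entrypoint, before x11vnc is launched:
#   cap_nofile 65536
```

Lowering a limit is always permitted, so the guard only skips the call in the case where it could fail. The trade-off is that a limit already below the cap is left untouched rather than normalised to 65536.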