[dns] pfSense: Unbound replaces dnsmasq (WS D)
Replace pfSense dnsmasq (DNS Forwarder) with Unbound (DNS Resolver) so LAN-side .viktorbarzin.lan resolution survives a full Kubernetes outage.

Out-of-band pfSense changes (not in Terraform; pfSense config.xml is VM-managed). Backup at /cf/conf/config.xml.2026-04-19-pre-unbound on-box + /mnt/backup/pfsense/ nightly.

- <unbound> enabled; listens on lan, opt1, wan, lo0
- <forwarding> on + <forward_tls_upstream> → DoT to Cloudflare (1.1.1.1 / 1.0.0.1 port 853, SNI cloudflare-dns.com)
- <dnssec>, <prefetch>, <prefetchkey>, <dnsrecordcache> (serve-expired)
- msgcachesize=256MB, cache_max_ttl=7d, cache_min_ttl=60s
- custom_options: auth-zone viktorbarzin.lan master=10.0.20.201 fallback-enabled=yes for-upstream=yes + serve-expired-ttl=259200
- <dnsmasq><enable> removed; dnsmasq stopped
- NAT rdr WAN UDP 53 → 10.0.20.201 removed (Unbound listens on WAN now)
- Technitium zone viktorbarzin.lan: zoneTransferNetworkACL set to 10.0.20.1, 10.0.10.1, 192.168.1.2 (pfSense source IPs)

Verified:

- unbound-control list_auth_zones: viktorbarzin.lan serial 49367
- dig @127.0.0.1 idrac.viktorbarzin.lan returns 192.168.1.4 with aa flag (served from auth-zone, not forwarded)
- dig @127.0.0.1 example.com +dnssec returns ad flag (DoT + validated)
- /var/unbound/viktorbarzin.lan.zone has ~114 records
- K8s outage drill passed: scale technitium=0 → dig still returns via WAN/LAN/OPT1 interfaces → scale restored
- LAN/management/K8s VLAN clients all resolve via pfSense 192.168.1.2 / 10.0.10.1 / 10.0.20.1 respectively

Trade-off: Technitium Split Horizon hairpin for 192.168.1.x → *.viktorbarzin.me (non-proxied) no longer runs via pfSense (Unbound answers locally). Fix if it bites: switch service to proxied or add Unbound Host Override.

Documented in docs/runbooks/pfsense-unbound.md.

Closes: code-k0d
parent bc866d53fa
commit 33d934c32f
2 changed files with 253 additions and 28 deletions
|
|
@ -1,6 +1,6 @@
|
|||
# DNS Architecture
|
||||
|
||||
Last updated: 2026-04-19 (NodeLocal DNSCache deployed — Workstream C)
|
||||
Last updated: 2026-04-19 (WS C — NodeLocal DNSCache deployed; WS D — pfSense Unbound replaces dnsmasq)
|
||||
|
||||
## Overview
|
||||
|
||||
|
|
@ -22,10 +22,9 @@ graph TB
|
|||
end
|
||||
|
||||
subgraph "pfSense (10.0.20.1)"
|
||||
pf_dnsmasq[dnsmasq<br/>Forwarder]
|
||||
pf_unbound[Unbound<br/>Resolver<br/>auth-zone AXFR]
|
||||
pf_kea[Kea DHCP4<br/>3 subnets, 53 reservations]
|
||||
pf_ddns[Kea DHCP-DDNS<br/>RFC 2136]
|
||||
pf_nat[NAT rdr<br/>UDP 53 → Technitium]
|
||||
end
|
||||
|
||||
subgraph "Kubernetes Cluster"
|
||||
|
|
@ -53,18 +52,17 @@ graph TB
|
|||
|
||||
Internet -->|DNS query| CF
|
||||
CF -->|CNAME to tunnel| CFTunnel
|
||||
LAN -->|DNS query UDP 53| pf_nat
|
||||
pf_nat -->|forward| LB_DNS
|
||||
LAN -->|DNS query UDP 53| pf_unbound
|
||||
pf_kea -->|lease event| pf_ddns
|
||||
pf_ddns -->|A + PTR| LB_DNS
|
||||
|
||||
pf_dnsmasq -->|.viktorbarzin.lan| LB_DNS
|
||||
pf_dnsmasq -->|public queries| CF
|
||||
pf_unbound -->|AXFR viktorbarzin.lan| LB_DNS
|
||||
pf_unbound -->|public queries DoT :853| CF
|
||||
|
||||
NodeLocalDNS -->|cache miss| KubeDNSUpstream
|
||||
KubeDNSUpstream --> CoreDNS
|
||||
CoreDNS -->|.viktorbarzin.lan| ClusterIP
|
||||
CoreDNS -->|public queries| pf_dnsmasq
|
||||
CoreDNS -->|public queries| pf_unbound
|
||||
|
||||
LB_DNS --> Primary
|
||||
LB_DNS --> Secondary
|
||||
|
|
@ -86,7 +84,7 @@ graph TB
|
|||
| CoreDNS | K8s `kube-system` | Cluster default | K8s service discovery + forwarding to Technitium |
|
||||
| NodeLocal DNSCache | K8s `kube-system` (DaemonSet) | `k8s-dns-node-cache:1.23.1` | Per-node DNS cache, transparent interception on 10.96.0.10 + 169.254.20.10. Insulates pods from CoreDNS/Technitium/pfSense disruption. |
|
||||
| Cloudflare DNS | SaaS | N/A | Public domain management (~50 domains) |
|
||||
| pfSense dnsmasq | 10.0.20.1 | pfSense 2.7.x | DNS forwarder for management VLAN |
|
||||
| pfSense Unbound | 10.0.20.1 | pfSense 2.7.2 (Unbound 1.19) | DNS resolver on LAN/OPT1/WAN; AXFR-slaves `viktorbarzin.lan` from Technitium; DoT upstream to Cloudflare |
|
||||
| Kea DHCP-DDNS | 10.0.20.1 | pfSense 2.7.x | Automatic DNS registration on DHCP lease |
|
||||
| phpIPAM | K8s namespace `phpipam` | v1.7.0 | IPAM ↔ DNS bidirectional sync |
|
||||
|
||||
|
|
@ -98,7 +96,7 @@ graph TB
|
|||
| NodeLocal DNSCache | `stacks/nodelocal-dns/` | DaemonSet (5 pods), ConfigMap, kube-dns-upstream Service, headless metrics Service |
|
||||
| Cloudflared | `stacks/cloudflared/` | Cloudflare DNS records (A, AAAA, CNAME, MX, TXT), tunnel config |
|
||||
| phpIPAM | `stacks/phpipam/` | dns-sync CronJob, pfsense-import CronJob |
|
||||
| pfSense | `stacks/pfsense/` | VM config (DNS config is via pfSense web UI) |
|
||||
| pfSense | `stacks/pfsense/` | VM config only (Unbound config is managed out-of-band via pfSense web UI / direct config.xml edits; see `docs/runbooks/pfsense-unbound.md`) |
|
||||
|
||||
## DNS Resolution Paths
|
||||
|
||||
|
|
@ -122,39 +120,50 @@ Pod → NodeLocal DNSCache (intercepts on kube-dns:10.96.0.10)
|
|||
→ cache hit: serve locally
|
||||
→ cache miss: forward to kube-dns-upstream (selects CoreDNS pods directly)
|
||||
→ CoreDNS: forward to pfSense (10.0.20.1), fallback 8.8.8.8, 1.1.1.1
|
||||
→ pfSense dnsmasq → Cloudflare (1.1.1.1)
|
||||
→ pfSense Unbound:
|
||||
- .viktorbarzin.lan → local auth-zone (AXFR-cached from Technitium)
|
||||
- public → DoT to Cloudflare (1.1.1.1 / 1.0.0.1 port 853)
|
||||
```
|
||||
|
||||
### LAN Client (192.168.1.x) → Any Domain
|
||||
|
||||
```
|
||||
Client gets DNS=192.168.1.2 (pfSense WAN) from DHCP
|
||||
→ pfSense NAT rdr on WAN interface → Technitium LB (10.0.20.201)
|
||||
→ Technitium resolves:
|
||||
- .viktorbarzin.lan → local zone
|
||||
- .viktorbarzin.me (non-proxied) → recursive, then Split Horizon translates
|
||||
176.12.22.76 → 10.0.20.200 for 192.168.1.0/24 clients
|
||||
- other → recursive to Cloudflare DoH (1.1.1.1)
|
||||
→ pfSense Unbound listens on 192.168.1.2:53 directly (no NAT rdr)
|
||||
- .viktorbarzin.lan → auth-zone (AXFR-cached from Technitium 10.0.20.201)
|
||||
Survives full Technitium/K8s outage — auth-zone keeps serving from
|
||||
/var/unbound/viktorbarzin.lan.zone with `fallback-enabled: yes`.
|
||||
- .viktorbarzin.me (non-proxied) and other public → DoT to Cloudflare
|
||||
(1.1.1.1 / 1.0.0.1 on port 853, SNI cloudflare-dns.com)
|
||||
```
|
||||
|
||||
Client source IPs are preserved (no SNAT on 192.168.1.x → 10.0.20.x path) — Technitium logs show real per-device IPs.
|
||||
**Trade-off vs. prior NAT rdr**: Split Horizon hairpin translation
|
||||
(`176.12.22.76 → 10.0.20.200` for 192.168.1.x clients) was only applied
|
||||
when queries reached Technitium via the NAT rdr. With Unbound answering
|
||||
on 192.168.1.2:53 directly, non-proxied `*.viktorbarzin.me` queries on the
|
||||
192.168.1.x LAN return the public IP, which the TP-Link AP can't hairpin.
|
||||
If hairpin is broken on LAN for a given non-proxied service, the fix is
|
||||
either (a) switch the service to proxied (via `dns_type = "proxied"`)
|
||||
or (b) add a local-data override on pfSense Unbound. The pre-Unbound
|
||||
state is documented in the `docs/runbooks/pfsense-unbound.md` rollback
|
||||
section.
|
||||
|
||||
### Management VLAN (10.0.10.x) → Any Domain
|
||||
|
||||
```
|
||||
Client gets DNS from Kea DHCP → pfSense (10.0.10.1)
|
||||
→ pfSense dnsmasq:
|
||||
- .viktorbarzin.lan → forward to Technitium (10.0.20.201)
|
||||
- other → forward to Cloudflare (1.1.1.1)
|
||||
→ pfSense Unbound:
|
||||
- .viktorbarzin.lan → auth-zone (local)
|
||||
- other → DoT to Cloudflare (1.1.1.1 / 1.0.0.1 port 853)
|
||||
```
|
||||
|
||||
### K8s VLAN (10.0.20.x) → Any Domain
|
||||
|
||||
```
|
||||
Client gets DNS from Kea DHCP → pfSense (10.0.20.1)
|
||||
→ pfSense dnsmasq:
|
||||
- .viktorbarzin.lan → forward to Technitium (10.0.20.201)
|
||||
- other → forward to Cloudflare (1.1.1.1)
|
||||
→ pfSense Unbound:
|
||||
- .viktorbarzin.lan → auth-zone (local)
|
||||
- other → DoT to Cloudflare (1.1.1.1 / 1.0.0.1 port 853)
|
||||
```
|
||||
|
||||
## Technitium DNS — Internal DNS Server
|
||||
|
|
@ -438,13 +447,26 @@ The zone-sync CronJob (runs every 30min) pushes the following to the Prometheus
|
|||
|
||||
### LAN Clients Can't Resolve
|
||||
|
||||
1. Verify pfSense NAT rule redirects UDP 53 on WAN to 10.0.20.201
|
||||
2. Check Technitium LB service: `kubectl get svc -n technitium technitium-dns`
|
||||
3. Test from LAN: `dig @192.168.1.2 idrac.viktorbarzin.lan`
|
||||
4. Check `externalTrafficPolicy: Local` — if no Technitium pod runs on the node receiving traffic, it drops
|
||||
1. Verify pfSense Unbound is running: `ssh admin@10.0.20.1 "sockstat -l -4 -p 53 | grep unbound"` — expect listeners on `192.168.1.2:53`, `10.0.10.1:53`, `10.0.20.1:53`, `127.0.0.1:53`
|
||||
2. Verify the auth-zone is loaded: `ssh admin@10.0.20.1 "unbound-control -c /var/unbound/unbound.conf list_auth_zones"` — expect `viktorbarzin.lan. serial N`
|
||||
3. Test from LAN: `dig @192.168.1.2 idrac.viktorbarzin.lan` (should return with `aa` flag)
|
||||
4. Test public upstream: `dig @192.168.1.2 example.com +dnssec` (should have `ad` flag — DoT via Cloudflare working)
|
||||
5. If auth-zone can't AXFR: check Technitium `viktorbarzin.lan` zone options → `zoneTransferNetworkACL` contains `10.0.20.1, 10.0.10.1, 192.168.1.2`
|
||||
6. See `docs/runbooks/pfsense-unbound.md` for full Unbound runbook and rollback instructions
|
||||
|
||||
### Hairpin NAT Not Working (LAN → *.viktorbarzin.me Fails)
|
||||
|
||||
Since 2026-04-19 (Workstream D), pfSense Unbound answers LAN DNS queries
directly instead of redirecting them to Technitium via the former NAT rdr,
so the Technitium Split Horizon post-processing no longer runs for
192.168.1.x clients. Hairpin for non-proxied services is therefore broken
again on LAN clients. Options:
|
||||
|
||||
1. **Switch service to proxied Cloudflare** (preferred) — set `dns_type = "proxied"` in the `ingress_factory` module call; DNS now resolves to Cloudflare edge, hairpin-independent.
|
||||
2. **Add a local-data override on pfSense Unbound** — under `Services → DNS Resolver → Host Overrides`, set `<service>.viktorbarzin.me → 10.0.20.200` (Traefik LB IP). This is equivalent to what Split Horizon did, applied at the resolver.
|
||||
3. **Revert to prior NAT rdr + Technitium Split Horizon** — documented in `docs/runbooks/pfsense-unbound.md` rollback section.
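For option 2, a Host Override corresponds roughly to an Unbound `local-data` record. The sketch below only prints such a line so its shape is visible; the helper name `local_data_line` and the hostname `app.viktorbarzin.me` are hypothetical, and the exact text pfSense generates from the web UI may differ.

```bash
# Print an Unbound local-data line mapping a hostname to an IPv4 address.
# Hypothetical helper for illustration; pfSense normally writes this for you
# when a Host Override is added in the web UI.
local_data_line() {
    printf 'local-data: "%s. IN A %s"\n' "$1" "$2"
}

# Example: point a (hypothetical) non-proxied service at the Traefik LB IP.
local_data_line app.viktorbarzin.me 10.0.20.200
```

Prefer the web UI over hand-editing: `/var/unbound/unbound.conf` is regenerated from `config.xml`, so a line added there by hand would not persist.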
|
||||
|
||||
K8s-side Split Horizon is still configured and applies whenever `*.viktorbarzin.me` queries do reach Technitium (e.g., from pods whose `.viktorbarzin.me` queries are forwarded to Technitium through CoreDNS). To verify the Technitium Split Horizon app:
|
||||
|
||||
1. Verify Split Horizon app is installed on all instances
|
||||
2. Check CronJob status: `kubectl get cronjob -n technitium technitium-split-horizon-sync`
|
||||
3. Run the job manually: `kubectl create job --from=cronjob/technitium-split-horizon-sync test-sh -n technitium`
|
||||
|
|
@ -479,6 +501,7 @@ For external `.viktorbarzin.me` records:
|
|||
## Incident History
|
||||
|
||||
- **2026-04-14 (SEV1)**: NFS `fsid=0` caused Technitium primary data loss on restart. Fixed by migrating all 3 instances to `proxmox-lvm-encrypted`, adding zone-sync CronJob (30min AXFR). See [post-mortem](../post-mortems/2026-04-14-nfs-fsid0-dns-vault-outage.md).
|
||||
- **2026-04-19 (hardening, not outage)**: Workstream D — pfSense Unbound replaces dnsmasq as the pfSense DNS service. Unbound AXFR-slaves `viktorbarzin.lan` from Technitium so LAN-side resolution survives a full K8s outage. WAN NAT rdr `192.168.1.2:53 → 10.0.20.201` removed (Unbound listens on WAN directly). DoT upstream via Cloudflare. See `docs/runbooks/pfsense-unbound.md` and bd `code-k0d`.
|
||||
|
||||
## Related
|
||||
|
||||
|
|
|
|||
202
docs/runbooks/pfsense-unbound.md
Normal file
|
|
@ -0,0 +1,202 @@
|
|||
# pfSense Unbound DNS Resolver
|
||||
|
||||
Last updated: 2026-04-19
|
||||
|
||||
## Overview
|
||||
|
||||
pfSense runs **Unbound** (DNS Resolver) as its sole DNS service, replacing
|
||||
dnsmasq (DNS Forwarder) as of 2026-04-19 (DNS hardening Workstream D,
|
||||
bd `code-k0d`).
|
||||
|
||||
Unbound AXFR-slaves the `viktorbarzin.lan` zone from the Technitium primary
|
||||
via the `10.0.20.201` LoadBalancer, so LAN-side `.lan` resolution survives
|
||||
a full Kubernetes outage. Public queries go to Cloudflare via DNS-over-TLS
|
||||
(`1.1.1.1` + `1.0.0.1` on port 853, SNI `cloudflare-dns.com`).
|
||||
|
||||
## Listeners
|
||||
|
||||
Unbound binds on:
|
||||
|
||||
| Interface | IP | Purpose |
|-----------|-----|---------|
| WAN | `192.168.1.2:53` | LAN (192.168.1.0/24) clients querying via pfSense WAN |
| LAN | `10.0.10.1:53` | Management VLAN clients |
| OPT1 | `10.0.20.1:53` | K8s VLAN clients (CoreDNS upstream) |
| lo0 | `127.0.0.1:53` | pfSense itself |
|
||||
|
||||
The prior WAN NAT `rdr` rule (`192.168.1.2:53 → 10.0.20.201`) was removed in
|
||||
the same change — Unbound now answers directly on WAN.
|
||||
|
||||
## Config Summary
|
||||
|
||||
Relevant `<unbound>` keys in `/cf/conf/config.xml`:
|
||||
|
||||
| Key | Value | Meaning |
|-----|-------|---------|
| `enable` | flag | Enable Unbound |
| `dnssec` | flag | DNSSEC validation on |
| `forwarding` | flag | Forwarding mode (send recursive queries to upstream) |
| `forward_tls_upstream` | flag | Use DoT for upstream forwarders |
| `prefetch` | flag | Prefetch records near expiry |
| `prefetchkey` | flag | Prefetch DNSKEY records |
| `dnsrecordcache` | flag | `serve-expired: yes` |
| `active_interface` | `lan,opt1,wan,lo0` | Listen interfaces |
| `msgcachesize` | `256` (MB) | Message cache (rrset-cache auto-doubles to 512MB) |
| `cache_max_ttl` | `604800` | 7 days |
| `cache_min_ttl` | `60` | 60 seconds |
| `custom_options` | base64 | Contains `serve-expired-ttl: 259200` + `auth-zone:` block |
|
||||
|
||||
Upstream DoT forwarders live in `<system>`:
|
||||
|
||||
- `dnsserver[0] = 1.1.1.1`
- `dnsserver[1] = 1.0.0.1`
- `dns1host = cloudflare-dns.com`
- `dns2host = cloudflare-dns.com`
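With these four values set, each generated `forward-addr` line should carry a `#cloudflare-dns.com` suffix. The following is a minimal sketch for spotting lines that lack it. The helper name `forward_addrs_missing_sni` and the sample text are illustrative assumptions, not copied from the real generated config.

```bash
# Print every forward-addr line that has no "#<tls-hostname>" suffix.
# Reads unbound.conf-style text on stdin; prints nothing if all lines are fine.
forward_addrs_missing_sni() {
    grep -E '^[[:space:]]*forward-addr:' | grep -v '#' || true
}

# Illustrative input only: the second forwarder is deliberately missing its hostname.
sample_conf='forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853'

printf '%s\n' "$sample_conf" | forward_addrs_missing_sni
```

On the firewall itself, the same helper could be fed the live file with `forward_addrs_missing_sni < /var/unbound/unbound.conf`.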
|
||||
|
||||
## Auth-Zone for viktorbarzin.lan
|
||||
|
||||
The custom_options block declares:
|
||||
|
||||
```
server:
  serve-expired-ttl: 259200

auth-zone:
  name: "viktorbarzin.lan"
  master: 10.0.20.201
  fallback-enabled: yes
  for-downstream: yes
  for-upstream: yes
  zonefile: "viktorbarzin.lan.zone"
  allow-notify: 10.0.20.201
```
|
||||
|
||||
- `master: 10.0.20.201` — AXFR source (Technitium LoadBalancer)
|
||||
- `fallback-enabled: yes` — if the zone can't refresh from master, fall back to normal recursion for this name (prevents hard-fail if AXFR breaks)
|
||||
- `for-downstream: yes` — answer queries for this zone with AA flag
|
||||
- `for-upstream: yes` — Unbound's internal iterator also uses this zone
|
||||
- `zonefile` is relative to the chroot (`/var/unbound/viktorbarzin.lan.zone`)
|
||||
- `allow-notify: 10.0.20.201` — accept NOTIFY from Technitium
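One way to confirm that the transferred copy is current is to compare the serial Unbound reports with the serial in Technitium's SOA record. This is a minimal sketch, assuming `list_auth_zones` prints one `<zone>. serial <N>` line per zone. The helper name `auth_zone_serial` and the sample line are illustrative.

```bash
# Extract the serial for one zone from `unbound-control list_auth_zones` output
# supplied on stdin. Assumes whitespace-separated "<zone>. serial <N>" lines.
auth_zone_serial() {
    awk -v zone="$1." '$1 == zone && $2 == "serial" { print $3 }'
}

# Illustrative sample line; real input would come from unbound-control.
sample_output='viktorbarzin.lan. serial 49367'
printf '%s\n' "$sample_output" | auth_zone_serial viktorbarzin.lan
```

On pfSense the input would come from `unbound-control -c /var/unbound/unbound.conf list_auth_zones`. The value can then be compared with the third field of `dig +short SOA viktorbarzin.lan @10.0.20.201`.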
|
||||
|
||||
## Technitium-side ACL
|
||||
|
||||
Zone `viktorbarzin.lan` on Technitium has `zoneTransfer = UseSpecifiedNetworkACL`
|
||||
with ACL entries:
|
||||
|
||||
- `10.0.20.1` (pfSense OPT1)
|
||||
- `10.0.10.1` (pfSense LAN)
|
||||
- `192.168.1.2` (pfSense WAN)
|
||||
|
||||
Verify via the Technitium API:
|
||||
|
||||
```
curl -sk "http://127.0.0.1:5380/api/zones/options/get?token=$TOK&zone=viktorbarzin.lan" | jq .response.zoneTransfer
```
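To check the ACL entries themselves rather than just the transfer mode, the same response can be scanned for the three pfSense source IPs. The sketch below assumes the response carries the ACL as a JSON array of quoted strings under `zoneTransferNetworkACL`. That response shape, the helper name `missing_acl_entries`, and the sample JSON are assumptions for illustration.

```bash
# Print each expected IP that does not appear as a quoted JSON string on stdin.
# Matching on the quotes keeps 10.0.20.1 from matching inside 10.0.20.201.
missing_acl_entries() {
    json=$(cat)
    for ip in "$@"; do
        printf '%s' "$json" | grep -qF "\"$ip\"" || printf '%s\n' "$ip"
    done
}

# Hypothetical response body with the WAN address deliberately left out.
sample_json='{"response":{"zoneTransfer":"UseSpecifiedNetworkACL","zoneTransferNetworkACL":["10.0.20.1","10.0.10.1"]}}'
printf '%s' "$sample_json" | missing_acl_entries 10.0.20.1 10.0.10.1 192.168.1.2
```

An empty result means every expected address is present. Any printed address still needs to be added to the zone's transfer ACL.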
|
||||
|
||||
## Operational Checks
|
||||
|
||||
```bash
# Is Unbound listening?
ssh admin@10.0.20.1 "sockstat -l -4 -p 53"

# Auth-zone loaded?
ssh admin@10.0.20.1 "unbound-control -c /var/unbound/unbound.conf list_auth_zones"
# Expected: viktorbarzin.lan. serial NNNNN

# LAN record via auth-zone? (aa flag = authoritative / from auth-zone)
dig @192.168.1.2 idrac.viktorbarzin.lan +norec

# Public record via DoT? (ad flag = DNSSEC validated, via 1.1.1.1/1.0.0.1)
dig @192.168.1.2 example.com +dnssec

# Zonefile has all records?
ssh admin@10.0.20.1 "wc -l /var/unbound/viktorbarzin.lan.zone"
```
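The `aa` and `ad` checks can be scripted instead of read by eye. This is a minimal sketch, assuming the standard `;; flags: ...;` header line that `dig` prints. The helper name `dig_has_flag` and the sample header are illustrative.

```bash
# Succeed if the given flag appears in the ";; flags:" header of dig output on stdin.
dig_has_flag() {
    # Keep only the flag list between "flags:" and the following ";",
    # then match the requested flag as a whole word.
    sed -n 's/^;; flags: \([^;]*\);.*/\1/p' | grep -qw -- "$1"
}

# Illustrative header line; real input would be piped from dig.
sample=';; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1'
if printf '%s\n' "$sample" | dig_has_flag aa; then
    echo "aa flag present"
else
    echo "aa flag missing"
fi
```

Against the live resolver this becomes `dig @192.168.1.2 idrac.viktorbarzin.lan +norec | dig_has_flag aa` for the auth-zone check, and the same pattern with `+dnssec` and `ad` for the DoT check.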
|
||||
|
||||
## K8s Outage Drill
|
||||
|
||||
Tests that `.lan` resolution keeps working while the Technitium primary is down:
|
||||
|
||||
```bash
# Scale Technitium primary to 0
kubectl -n technitium scale deploy/technitium --replicas=0

# Wait ~5 seconds, then test from a LAN client
ssh devvm.viktorbarzin.lan "dig @192.168.1.2 idrac.viktorbarzin.lan +short"
# Expected: 192.168.1.4 (served from Unbound's cached auth-zone)

# Restore immediately
kubectl -n technitium scale deploy/technitium --replicas=1
```
|
||||
|
||||
Completed successfully on 2026-04-19 during the initial deployment.
|
||||
|
||||
Note: secondary/tertiary Technitium pods remain up and continue to serve
|
||||
queries via the `10.0.20.201` LoadBalancer even when the primary is down —
|
||||
so the strongest proof that Unbound's auth-zone serves locally is to also
|
||||
scale those down (optional, not part of the routine drill).
|
||||
|
||||
## Backup & Rollback
|
||||
|
||||
### Backups
|
||||
|
||||
- **On-box**: `/cf/conf/config.xml.2026-04-19-pre-unbound` (created before this
|
||||
workstream ran — keep for 30 days, then delete)
|
||||
- **Daily**: PVE `daily-backup` script copies `/cf/conf/config.xml` and a full
|
||||
pfSense config tar to `/mnt/backup/pfsense/` on the Proxmox host at 05:00
|
||||
- **Offsite**: Synology `pve-backup/pfsense/` (synced daily by
|
||||
`offsite-sync-backup`)
|
||||
|
||||
### Rollback to dnsmasq
|
||||
|
||||
If Unbound misbehaves, revert to dnsmasq + NAT rdr:
|
||||
|
||||
```bash
# On pfSense
cp /cf/conf/config.xml.2026-04-19-pre-unbound /cf/conf/config.xml

# Tell pfSense to re-read config and reload services
php -r 'require_once("config.inc"); require_once("config.lib.inc"); disable_path_cache();'
/etc/rc.restart_webgui # reloads PHP config caches
# Restart services
php -r 'require_once("config.inc"); require_once("services.inc"); services_dnsmasq_configure(); services_unbound_configure(); filter_configure();'
/etc/rc.filter_configure # re-applies NAT rules (brings back rdr)
```
|
||||
|
||||
Verify:
|
||||
|
||||
```bash
sockstat -l -4 -p 53 | grep dnsmasq # expect dnsmasq on 10.0.10.1 and 10.0.20.1
pfctl -sn | grep '53' # expect rdr on wan UDP 53 → 10.0.20.201
```
|
||||
|
||||
### Rollback without wiping new changes
|
||||
|
||||
If you only want to stop Unbound without restoring the whole config, edit
|
||||
config.xml and remove `<enable/>` from `<unbound>` + add it back to `<dnsmasq>`,
|
||||
then re-run `services_unbound_configure()` + `services_dnsmasq_configure()`.
|
||||
You also need to re-add the WAN NAT rdr in `<nat><rule>` (see the backup XML
|
||||
for the exact shape — tracker `1775670025`).
|
||||
|
||||
## Known Gotchas
|
||||
|
||||
1. **pfSense regenerates `/var/unbound/unbound.conf`** on every service reload
|
||||
from `<unbound>` in `config.xml`. Edits to unbound.conf are NOT durable.
|
||||
2. **`unbound-control` default config path is wrong**. Always use
|
||||
`unbound-control -c /var/unbound/unbound.conf <cmd>`.
|
||||
3. **`custom_options` is base64-encoded** in config.xml. Use `base64 -d` to
|
||||
decode in a shell, or `base64_decode()` in PHP.
|
||||
4. **`interface-automatic: yes` is NOT used** when `active_interface` is
|
||||
explicitly set to a list — pfSense emits explicit `interface: <ip>` lines.
|
||||
5. **`auth-zone`'s `zonefile` path is relative to the Unbound chroot**
|
||||
(`/var/unbound`), NOT absolute. Using an absolute path silently fails.
|
||||
6. **DoT forwarders need `forward_tls_upstream`** flag AND `dns1host` /
   `dns2host` set in `<system>` for SNI. Without the hostname, pfSense emits
   `forward-addr: 1.1.1.1@853` (no `#cloudflare-dns.com` suffix), which
   Cloudflare rejects with a certificate hostname mismatch.
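For gotcha 3, the decode step can be sketched as follows. It assumes the `<custom_options>` element sits on a single line. The helper name `decode_custom_options`, the temp-file template, and the sample payload are illustrative and do not reproduce the real `config.xml`.

```bash
# Extract the base64 payload between <custom_options> tags in the given file
# and decode it. Assumes the element is on one line, as in the sample below.
decode_custom_options() {
    sed -n 's:.*<custom_options>\(.*\)</custom_options>.*:\1:p' "$1" | base64 -d
}

# Build a throwaway sample file (hypothetical content, not the real config.xml).
sample_xml=$(mktemp /tmp/custom_options.XXXXXX)
payload=$(printf 'server:\nserve-expired-ttl: 259200\n' | base64 | tr -d '\n')
printf '<unbound>\n<custom_options>%s</custom_options>\n</unbound>\n' "$payload" > "$sample_xml"

decode_custom_options "$sample_xml"
rm -f "$sample_xml"
```

On the firewall the argument would be `/cf/conf/config.xml`. Work on a copy when experimenting, because that file is the live configuration.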
|
||||
|
||||
## Related Docs
|
||||
|
||||
- `docs/architecture/dns.md` — overall DNS architecture (K8s side, Technitium, CoreDNS)
|
||||
- `docs/architecture/networking.md` — VLAN layout, pfSense interface mapping
|
||||
- `.claude/skills/pfsense/skill.md` — SSH / CLI patterns for pfSense management
|
||||