[{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"This is a custom welcome message script for Nushell, designed to replace the default startup banner and provide an information-rich dashboard about the system\u0026rsquo;s status each time the shell starts.\nFeatures Overview # System Information: Displays the hostname, kernel version, and the current logged-in user. Last Login: Shows the user information, source IP, and time of the last login to the system. Resource Status: CPU: Displays the CPU model and the real-time 1-minute, 5-minute, and 15-minute system load averages. Memory: Shows used and total memory in GB, along with the usage percentage. Disk: Shows used and total disk space in GB, along with the usage percentage. Container Status: Podman Containers: Summarizes the total number of discovered containers, as well as the number of running and exited containers. Podman Pods: Summarizes the total number of discovered pods and the number of currently running pods. Security Status: Fail2ban: Presents four key metrics for Jails in a table: \u0026ldquo;Current failed,\u0026rdquo; \u0026ldquo;Total failed,\u0026rdquo; \u0026ldquo;Current banned,\u0026rdquo; and \u0026ldquo;Total banned.\u0026rdquo; Dependencies # Core Utilities: uname, last, df (these are fundamental components of all modern Linux distributions). Optional Utilities: podman: If it is not installed or the service is not running, the relevant section will display a friendly warning message. fail2ban-client: If it is not installed, the relevant section will display a warning. sudo: To allow a non-root user to query the Fail2ban status, a sudo rule must be pre-configured. Installation # Back up your old configuration: Before making any changes, please back up your existing config.nu file. 
cp ~/.config/nushell/config.nu ~/.config/nushell/config.nu.bak Replace the configuration: Copy the complete code below and entirely replace the contents of your ~/.config/nushell/config.nu file with it. (Optional, Not Recommended) Configure Sudo for Fail2ban: To allow a non-root user to see the Fail2ban status, use the sudo visudo command to add the following rule at the end of the file (please replace your_username with your actual username): your_username ALL=(ALL) NOPASSWD: /usr/bin/fail2ban-client status Restart Nushell: Close and reopen your terminal to see the new custom welcome screen. Source Code # https://github.com/yuzjing/devScripts/blob/main/nu_banner\n","date":"16 September 2025","externalUrl":null,"permalink":"/posts/linux/nushell_bootinfo/","section":"Posts","summary":"","title":"Custom welcome message script for Nushell","type":"posts"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/tags/devops/","section":"Tags","summary":"","title":"Devops","type":"tags"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/categories/linux/","section":"Categories","summary":"","title":"Linux","type":"categories"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/tags/linux/","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/posts/","section":"Posts","summary":"","title":"Posts","type":"posts"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"16 September 2025","externalUrl":null,"permalink":"/","section":"涧雨暖云","summary":"","title":"涧雨暖云","type":"page"},{"content":"Last Updated: 2025-09-15\nThis document catalogs my personal collection of essential tools and software across different platforms. 
It\u0026rsquo;s used for quick environment setups and as a personal reference.\nIcon Legend:\n🐧 - Linux 🪟 - Windows 🤖 - Android 🌐 - Web/Cross-Platform 🐧 System Deployment \u0026amp; Maintenance # Arch Linux Install Script - 🐧 The official archinstall script, which greatly simplifies the initial setup process. reinstall.sh - 🐧 A popular one-click script to reinstall (or \u0026ldquo;DD\u0026rdquo;) a VPS with a clean Arch Linux image. Reflector - 🐧 Automatically fetches and ranks the fastest pacman mirrors, a crucial first step after installation. Virtualization \u0026amp; Subsystems # Tools for managing virtual environments, containers, and core OS components.\nRolling LTS Kernel for WSL2 - 🪟 Provides a rolling-release LTS kernel for WSL2 to get the latest features and performance optimizations. Podman - 🐧 A daemonless alternative to Docker that is more secure, lightweight, and can run containers as a regular user. ⌨️ Terminal \u0026amp; Shell Environment # Nushell - 🐧🪟 My primary shell, which treats everything as structured data for powerful pipelines. Oh My Zsh - 🐧 The ultimate configuration framework for Zsh, with an unparalleled ecosystem of plugins and themes. Konsole - 🐧 My go-to terminal emulator in the KDE Plasma environment; powerful and highly configurable. Windows Terminal - 🪟 The modern, feature-rich terminal application for Windows. Termux - 🤖 Provides a powerful Linux terminal environment on Android, an essential tool for geeks. 🎨 Customization \u0026amp; Theming # Tools and resources for beautifying my work environment.\nOh My Posh - 🐧🪟🌐 A cross-platform prompt theme engine for a unified, beautiful prompt across all shells. Windows Terminal Themes - 🌐 A website dedicated to providing beautiful color schemes for Windows Terminal. Catppuccin for Fcitx5 - 🐧 The popular Catppuccin color scheme applied to the Fcitx5 input method framework. 🛠️ Core Command-Line Tools # Paru - 🐧 An AUR helper for Arch Linux and the spiritual successor to yay. 
Bat - 🐧🪟 A modern replacement for cat with syntax highlighting and Git integration. Bottom - 🐧🪟 A graphical system monitor and a modern replacement for top/htop. Aria2 - 🐧🪟 A command-line download powerhouse that supports multi-connection and BitTorrent, often run as a background service. dust - 🐧 A more intuitive version of du, written in Rust. 🌐 Browser Extensions \u0026amp; Userscripts # Core plugins that enhance the browsing experience.\nViolentmonkey - 🌐 An open-source userscript manager for running custom JavaScript on web pages. Kiss Translator - 🌐 An immersive web page translation extension that is simple, efficient, and supports multiple translation engines. ZeroOmega - 🌐 A fork of SwitchyOmega, used for easily managing and switching browser proxy settings. 📝 Text, Notes \u0026amp; Development # Helix - 🐧🪟 A modern, Vim-like text editor written in Rust with out-of-the-box LSP support. Acode editor - 🤖 A lightweight yet powerful code editor for Android, great for quick edits on the go. NoteGen - 🤖 A clean and simple note-taking app with a Material Design 3 interface. Hugo - 🐧🪟🌐 A blazing-fast static site generator for building blogs and documentation from Markdown files. 🖼️ Media Processing \u0026amp; Capture # Haruna Video Player - 🐧 An excellent mpv-based video player for the KDE ecosystem. MPV-KT - 🤖 A clean, mpv-based video player for Android with powerful decoding capabilities. ShareX - 🪟 An incredibly powerful screenshot, screen recording, and workflow automation tool for Windows. Seal - 🤖 A video and audio downloader for Android, powered by yt-dlp. 📦 Self-Hosting \u0026amp; NAS # Services that I run on my own server or NAS.\nImmich - 🐧🌐 A self-hosted alternative to Google Photos for backing up and managing personal photos and videos. Aria2NG - 🐧🌐 A beautiful and easy-to-use Web UI for remotely managing Aria2 download tasks. 
🌐 Networking Tools # GUI for Sing-box - 🪟 A graphical client for the Sing-box core, providing a convenient way to manage proxy rules. 📖 Reading # Legado (阅读) - 🤖 An open-source e-reader app for novels, supporting custom content sources. ⌨️ Input Methods # Fcitx5 - 🐧 My primary input method framework on Linux; modular, stable, and modern. Rime-ice - 🐧 A popular configuration scheme for the Rime input method engine, providing a powerful Pinyin input experience. Gboard - 🤖 Google\u0026rsquo;s keyboard; stable, smooth, and feature-rich. ✨ System Enhancement \u0026amp; Automation # GKD (搞快点) - 🤖 An accessibility service-based tool for automatically skipping ads in apps. Shizuku - 🤖 A bridge that allows regular apps to easily use system-level APIs, acting as a foundation for many power-user tools. Hail (雹) - 🤖 Uses Shizuku or root to freeze apps, preventing them from running in the background to save battery. 🎓 Education \u0026amp; Learning # z2h字帖 - 🌐 z2h calligraphy copybook. ","date":"15 September 2025","externalUrl":null,"permalink":"/posts/notes/toolkit/","section":"Posts","summary":"","title":"🚀 My Tools","type":"posts"},{"content":"","date":"15 September 2025","externalUrl":null,"permalink":"/categories/notes/","section":"Categories","summary":"","title":"Notes","type":"categories"},{"content":"","date":"15 September 2025","externalUrl":null,"permalink":"/tags/software/","section":"Tags","summary":"","title":"Software","type":"tags"},{"content":"","date":"15 September 2025","externalUrl":null,"permalink":"/tags/theme/","section":"Tags","summary":"","title":"Theme","type":"tags"},{"content":"","date":"15 September 2025","externalUrl":null,"permalink":"/tags/tool/","section":"Tags","summary":"","title":"Tool","type":"tags"},{"content":"Before analyzing the two modes, a core principle must be established: WSL2 (running a full Linux kernel) and Windows (running the NT kernel) are two separate operating systems. 
The loopback interface (localhost, lo) is, by definition, a kernel-level internal affair, with traffic designed to never leave the local OS\u0026rsquo;s networking stack.\nTherefore, WSL2\u0026rsquo;s ::1 points to the Linux kernel itself, while Windows\u0026rsquo; ::1 points to the Windows kernel itself. They are two isolated, non-communicating addresses. When you attempt to connect to ::1 from within WSL2, the request will never leave WSL2 to reach Windows. The connection failure is the expected behavior according to networking fundamentals.\nThe reason 127.0.0.1 magically works is due to a special cross-OS-boundary forwarding/relay mechanism that Microsoft has implemented for IPv4. How this is implemented differs between networking modes.\n1. Default NAT Mode: The Isolated Virtual Network # In this mode, WSL2 communicates with the host through a virtual switch, and its network environment is relatively independent.\nInterface Observation # WSL2 ip a Output (NAT Mode):\n# ... 2: eth0: \u0026lt;BROADCAST,MULTICAST,UP,LOWER_UP\u0026gt; mtu 1500 link/ether 00:15:5d:xx:xx:xx # A typical Hyper-V virtual MAC address inet 172.20.x.x/20 # A virtual address in a private IP range # ... Windows ipconfig /all Output:\n# ... Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller Physical Address. . . . . . . . . : AA-BB-CC-DD-EE-FF IPv4 Address. . . . . . . . . . . : 192.168.1.100 # ... Analysis \u0026amp; Conclusion:\nInterface Mismatch: The MAC address (00:15:5d:...) and IP address (172.20.x.x) of eth0 inside WSL2 are completely different from the Windows physical NIC. This serves as physical proof that WSL2 operates on a virtual network, managed by Hyper-V, that is isolated from the host. 
Explanation of localhost Behavior: 127.0.0.1 (IPv4): Traffic successfully reaches Windows because the WSL2 networking service applies a special Network Address Translation (NAT) rule at the egress point, routing traffic for this destination to the host\u0026rsquo;s gateway address on the virtual network. ::1 (IPv6): This special NAT mechanism is not implemented for IPv6. Therefore, traffic to ::1 follows standard routing to WSL2\u0026rsquo;s own lo interface, which is isolated from Windows, resulting in connection failure. 2. Mirrored Mode: The Shared Physical Network \u0026amp; The Special Internal Channel # This mode is designed to eliminate network isolation, making WSL2 a \u0026ldquo;peer\u0026rdquo; to the host on the network.\nInterface Observation # WSL2 ip a Output (Mirrored Mode):\n# ... 2: eth0: \u0026lt;BROADCAST,MULTICAST,UP,LOWER_UP\u0026gt; mtu 1500 link/ether aa:bb:cc:dd:ee:ff # \u0026lt;-- Identical to Windows Physical MAC inet 192.168.1.100/24 # \u0026lt;-- Identical to Windows IP inet6 fe80::xxxx:xxxx:xxxx:xxxx/64 # \u0026lt;-- Identical to Windows Link-local IPv6 3: loopback0: \u0026lt;BROADCAST,MULTICAST,UP,LOWER_UP\u0026gt; mtu 1500 link/ether 00:15:5d:yy:yy:yy # \u0026lt;-- A separate, virtual MAC address # ... Analysis \u0026amp; Conclusion:\nThe Mirroring of eth0: The MAC and IP addresses of WSL2\u0026rsquo;s eth0 are identical to the Windows physical NIC. This proves that WSL2\u0026rsquo;s external network connection is a direct mirror of the host\u0026rsquo;s adapter; they share the same network identity. Explanation of localhost Behavior: 127.0.0.1 (IPv4): Although the external network is shared, the loopback interfaces (lo) of the WSL2 (Linux kernel) and Windows (NT kernel) remain isolated from each other. The successful communication relies on a mechanism known as the \u0026ldquo;localhost relay\u0026rdquo;. ::1 (IPv6): The \u0026ldquo;localhost relay\u0026rdquo; mechanism has a known limitation of being IPv4-only. 
Therefore, traffic to ::1 is not relayed and terminates at WSL2\u0026rsquo;s own isolated lo interface. Discussion of the loopback0 Interface # The loopback0 interface, which appears in Mirrored mode, is identified by its name and virtual MAC address (00:15:5d:...) as a special-purpose virtual interface provided by Hyper-V. This shows that even in mirrored mode, some virtualization components remain active.\nIts role can be inferred as: A dedicated channel for kernel-level internal communication between Windows and WSL2.\nVehicle for the localhost Relay: The \u0026ldquo;localhost relay\u0026rdquo; mechanism described above is very likely implemented over this loopback0 interface. When WSL2 sends traffic to 127.0.0.1, it is specially routed to loopback0, which is then received by a corresponding virtual interface on the Windows side and injected into the host\u0026rsquo;s networking stack. Other Management Traffic: In addition to the localhost relay, this interface may also carry other management and control traffic between WSL2 and Windows, providing a stable, internal communication path that bypasses the external physical network (eth0). Summary of Findings # The Isolation of localhost is Fundamental: Regardless of the networking mode, as two separate kernels, the loopback interfaces (lo) of WSL2 and Windows are always isolated. The failure of ::1 faithfully reflects this underlying truth and is the expected behavior based on networking principles. The availability of 127.0.0.1 is a Special Exception: Its successful communication is an IPv4-only \u0026ldquo;convenience feature\u0026rdquo; implemented by Microsoft through different technologies (NAT in one mode, a relay channel in the other). Interface Differences are Key Evidence: The ip a comparison clearly reveals the architectural difference between the modes: an isolated virtual eth0 in NAT mode versus a shared physical eth0 in Mirrored mode. 
loopback0 is a Key Component of Mirrored Mode: Its presence provides an independent, internal communication bridge that enables special host-guest interactions, such as the localhost relay, to function even while the main network is shared. ","date":"8 September 2025","externalUrl":null,"permalink":"/posts/devops/wsl2_ipv6/","section":"Posts","summary":"","title":"::1 in WSL2","type":"posts"},{"content":"","date":"8 September 2025","externalUrl":null,"permalink":"/categories/devops/","section":"Categories","summary":"","title":"DevOps","type":"categories"},{"content":"","date":"8 September 2025","externalUrl":null,"permalink":"/tags/network/","section":"Tags","summary":"","title":"Network","type":"tags"},{"content":"","date":"8 September 2025","externalUrl":null,"permalink":"/tags/wsl/","section":"Tags","summary":"","title":"WSL","type":"tags"},{"content":"This post documents a time when using a forward proxy within a Caddy reverse proxy led me down a rabbit hole involving directive evolution, behavioral differences between versions, and even made me question the accuracy of the official documentation.\nThe forward_proxy_url Fog and the Version Mystery # Our core requirement was to connect to a final backend service through an internal forward proxy, which itself was behind a Caddy reverse proxy. This created a classic \u0026ldquo;proxy chaining\u0026rdquo; scenario.\nTarget Architecture: User -\u0026gt; Caddy (Reverse Proxy) -\u0026gt; Internal Proxy -\u0026gt; Backend App\nFirst Attempt: forward_proxy_url # Based on experience, we wrote the following configuration in our Caddyfile:\napp.example.com { # ... reverse_proxy backend-service:8188 { transport http { # Use forward_proxy_url to specify the next-hop proxy forward_proxy_url http://internal-proxy:1055 } } } However, when reloading the Caddy configuration, I received the first confusing message:\n{\u0026quot;level\u0026quot;:\u0026quot;warn\u0026quot;, ... 
\u0026quot;msg\u0026quot;:\u0026quot;The 'forward_proxy_url' field is deprecated. Use 'network_proxy \u0026lt;url\u0026gt;' instead.\u0026quot;}\nThis was a clear deprecation warning. It told us that forward_proxy_url was obsolete and should be replaced with the new network_proxy directive.\nSecond Attempt: The network_proxy Frustration # Following the warning\u0026rsquo;s advice, we modified the configuration:\n# Plan A: Place network_proxy inside transport http reverse_proxy backend-service:8188 { transport http { network_proxy http://internal-proxy:1055 } } # Plan B: Place network_proxy as a direct subdirective of reverse_proxy reverse_proxy backend-service:8188 { network_proxy http://internal-proxy:1055 } However, regardless of which approach we tried, Caddy returned an error: unrecognized subdirective network_proxy.\nThis left me confused: Caddy was warning me that the old directive was deprecated, yet it didn\u0026rsquo;t recognize the new one it recommended. Furthermore, the official Caddy documentation mentioned forward_proxy_url with no mention of network_proxy.\nTurns Out, It Was a Version \u0026ldquo;Bug\u0026rdquo; # After repeatedly confirming that the Caddy version I was using (v2.10.0) was relatively new and didn\u0026rsquo;t require any custom-built plugins, I finally turned my attention to Caddy itself.\nEventually, I painstakingly found the cause: https://github.com/caddyserver/caddy/pull/6978\nIt was a bug in version 2.10.0, which was fixed in 2.10.1\u0026hellip; what a pain.\nThe Solution:\nI ended up rebuilding my Caddy image using version 2.10.2. 
I simply ignored the deprecation warning, as the network_proxy parameter did not work correctly, and continued using forward_proxy_url.\n# Final working configuration: Ignore the warning and carry on reverse_proxy backend-service:8188 { transport http { forward_proxy_url http://internal-proxy:1055 } } This experience taught me that even stable software releases can have minor disconnects between documentation, warnings, and actual behavior. When faced with such a situation, trust the actual runtime results.\nHost Header Causing a Browser Redirect Loop # After solving the proxy directive issue, I ran into a second problem. Testing the proxy chain from within the Caddy container using curl was completely successful:\ncurl -x http://internal-proxy:1055 http://backend-service:8188\ncurl could fetch the backend application\u0026rsquo;s page perfectly. However, accessing https://app.example.com in a browser resulted in an infinite redirect loop.\nSymptoms and Root Cause # After analyzing Caddy logs and researching the differences between curl and Caddy\u0026rsquo;s behavior, I found the culprit:\ncurl was successful because the Host header it sent to the backend was the backend\u0026rsquo;s own address (backend-service:8188).\nIn contrast, when a browser accessed the site via Caddy, the Host header being passed along was the public domain name (app.example.com). Many backend applications, upon receiving a Host header that doesn\u0026rsquo;t match their own listening address, will initiate a redirect for security or normalization reasons. 
This redirect often conflicts with Caddy\u0026rsquo;s automatic HTTPS feature, leading to a loop.\nSolution: Spoofing the Host Header # To make Caddy\u0026rsquo;s behavior consistent with the successful curl command, we used header_up to forcibly modify the Host header being sent to the backend.\nreverse_proxy backend-service:8188 { # Forcibly change the Host header to the backend\u0026#39;s own address header_up Host {http.reverse_proxy.upstream.hostport} transport http { forward_proxy_url http://internal-proxy:1055 } } With this simple line of configuration, we made the backend application receive the Host header it expected, which stopped the unnecessary redirects, and browser access returned to normal.\nSummary # Treat Warnings Rationally: While deprecation warnings are important, when the new solution doesn\u0026rsquo;t work, trust the old solution that is still effective in your current version. The Host Header is the Devil in the Details: In a reverse proxy environment, the Host header is the prime suspect for issues where \u0026ldquo;curl works, but the browser doesn\u0026rsquo;t.\u0026rdquo; Control it with header_up. ","date":"29 August 2025","externalUrl":null,"permalink":"/posts/devops/caddy_forward_proxy/","section":"Posts","summary":"","title":"The Caddy Proxy Mystery: From a Deprecated Directive to a Version \"Bug\"","type":"posts"},{"content":"This post documents the pitfalls encountered while deploying Warpgate as a rootless container using Podman.\nTarget Environment:\nContainer Engine: Podman (in rootless mode) Service Management: Systemd / Quadlet Reverse Proxy: Caddy (also as a rootless container) Network: Caddy and Warpgate are in the same custom Podman network. Pitfall 1: Service Fails to Start, Citing Missing warpgate.yaml Configuration File # This was the first obstacle during deployment. 
After starting the service with systemctl, the logs repeatedly showed that Warpgate was exiting because it couldn\u0026rsquo;t find the /data/warpgate.yaml configuration file.\nSolution: Execute the one-time setup command using podman run # Warpgate is designed to generate its configuration file via an interactive setup command on its first run, rather than having it created manually. Since Quadlet is used for managing long-running services, this one-time setup task must be completed manually first using podman run.\nExecute the setup command: This command starts a temporary container, runs the interactive setup wizard, and stores the generated configuration in a Podman volume.\npodman run -it --rm --name warpgate-setup -v warpgate-data:/data --network=caddy ghcr.io/warp-tech/warpgate:latest setup -it: Enables an interactive terminal to answer the setup wizard\u0026rsquo;s questions. --rm: Automatically removes this temporary container after the command finishes. -v warpgate-data:/data: Uses a named volume warpgate-data to persist the configuration. This same volume will be used by the Quadlet service later. Run the service with Quadlet: After the setup is complete, the warpgate-data volume now contains the configuration file. The service started by systemctl can now find the configuration and run normally.\nPitfall 2: Service Runs Normally, but Caddy Reverse Proxy Reports a 502 Bad Gateway # The Warpgate container started successfully and was listening on its port, but when accessed through the Caddy reverse proxy, the browser showed a 502 error.\nSolution: Change the Caddy Reverse Proxy Protocol from HTTP to HTTPS # After investigation, the root of the problem was that Warpgate\u0026rsquo;s internal port 8888 was listening for HTTPS traffic, not standard HTTP. 
This was likely because, during the interactive setup, a URL starting with https:// was entered when prompted for the Public URL, causing Warpgate to automatically enable internal TLS.\nCaddy defaults to using HTTP to connect to backends, so the connection was being rejected by Warpgate, resulting in the 502 error.\nThe solution was to modify the Caddyfile to use HTTPS when connecting to Warpgate and to ignore verification errors from its internal self-signed certificate.\n\u0026lt;your-domain\u0026gt; { # ... other configurations ... # Reverse proxy to warpgate using https and skip TLS verification for internal traffic reverse_proxy https://warpgate:8888 { transport http { tls_insecure_skip_verify } } } https://warpgate:8888: Tells Caddy that the upstream service is HTTPS. tls_insecure_skip_verify: Tells Caddy to ignore certificate verification errors from the upstream, which is a common workaround when proxying to internal services that use self-signed certificates. Final Configuration Summary # ~/.config/containers/systemd/warpgate.container # [Unit] Description=Warpgate Secure Access Gateway [Container] ContainerName=warpgate Image=ghcr.io/warp-tech/warpgate:latest AutoUpdate=image PodmanArgs=--network=caddy Volume=warpgate-data:/data [Service] Restart=on-failure [Install] WantedBy=default.target Relevant configuration in Caddyfile # \u0026lt;your-domain\u0026gt; { encode zstd gzip reverse_proxy https://warpgate:8888 { transport http { tls_insecure_skip_verify } } } ","date":"27 August 2025","externalUrl":null,"permalink":"/posts/container/warpgate/","section":"Posts","summary":"","title":"A Summary of Pitfalls Encountered When Deploying Warpgate as a Rootless Container with Quadlet","type":"posts"},{"content":"","date":"27 August 2025","externalUrl":null,"permalink":"/tags/container/","section":"Tags","summary":"","title":"Container","type":"tags"},{"content":"","date":"27 August 
2025","externalUrl":null,"permalink":"/categories/container--virtualization/","section":"Categories","summary":"","title":"Container \u0026 Virtualization","type":"categories"},{"content":"","date":"27 August 2025","externalUrl":null,"permalink":"/tags/podman/","section":"Tags","summary":"","title":"Podman","type":"tags"},{"content":"","date":"27 August 2025","externalUrl":null,"permalink":"/tags/virtualization/","section":"Tags","summary":"","title":"Virtualization","type":"tags"},{"content":"","date":"27 August 2025","externalUrl":null,"permalink":"/zh-cn/tags/%E5%AE%B9%E5%99%A8/","section":"Tags","summary":"","title":"容器","type":"tags"},{"content":"","date":"27 August 2025","externalUrl":null,"permalink":"/zh-cn/tags/%E8%99%9A%E6%8B%9F%E5%8C%96/","section":"Tags","summary":"","title":"虚拟化","type":"tags"},{"content":" Container Runtimes Explained: Podman vs. Docker vs. Containerd # In the world of cloud-native and containerization, Podman, Docker, and Containerd are three core technologies for building and managing containers. Although they can all run containers that comply with the OCI (Open Container Initiative) specification, their architectural designs, core philosophies, and ideal use cases are fundamentally different.\nCore Architecture: Daemonless vs. Client-Server # The root of all their differences lies in their distinct architectural models.\nArchitectural Model Podman Docker / Containerd Paradigm Daemonless Client-Server Process Model The podman CLI tool directly creates and manages containers using the traditional fork/exec model. It starts a lightweight container monitor called conmon, which acts as the direct parent process for the container, responsible for log streaming, TTY interaction, and reporting exit codes. The podman command itself can exit after the container is started. The docker or nerdctl CLI acts as a client that communicates via a UNIX socket or TCP with a long-running, stateful daemon (dockerd or containerd) in the background. 
This daemon is the central manager and parent process for all containers. Failure Domain Distributed. Each container is monitored by its own independent conmon process. The failure of one container or its monitor does not affect any other container. Centralized. The daemon is a Single Point of Failure (SPOF) for all containers. If dockerd crashes or needs to be restarted, it will, by default, terminate all the running containers it manages (Docker\u0026rsquo;s live-restore option and containerd\u0026rsquo;s per-container shim processes mitigate this). System Integration Native Integration. Due to its daemonless nature, Podman can be managed by systemd just like any other normal system process, enabling seamless integration. This has given rise to declarative container management tools like Quadlet. Adapted Integration. dockerd is a long-running service that can be managed by systemd. However, the lifecycle of the containers it manages is decoupled from systemd\u0026rsquo;s service model, requiring extra adaptation. Security Superior. It eliminates the centralized, often high-privilege attack surface of a daemon. Its architecture naturally supports and encourages rootless mode, significantly reducing the risk of container escapes. Inherent Risks. The daemon typically runs with root privileges and controls all containers on the system, making it a high-value attack target. Access to the Docker socket is nearly equivalent to root access on the system. Technical Stack and the OCI Runtime # While their high-level architectures differ, they all converge on the OCI runtime specification at the lowest level.\nHigh-level Runtime: Responsible for complex lifecycle tasks such as image management (pulling, storing, distributing), volume management, and network configuration.\nDocker: The dockerd daemon integrates these functions internally and delegates to Containerd. Containerd: The containerd daemon is itself a pure high-level runtime. Podman: The podman CLI tool implements these high-level management functions itself. 
Low-level / OCI Runtime: Responsible for using kernel features (Namespaces, Cgroups) to create and run an isolated container process according to the OCI specification.\nrunc: The reference implementation developed by Docker and donated to the OCI. It is the default choice for Containerd and Docker. crun: Developed by Red Hat and written in C, it offers higher performance and lower memory usage. It is the default choice for Podman. Key Insight: They share the same industry standard (OCI) but implement different high-level management logic. Podman\u0026rsquo;s architecture is more direct (Podman -\u0026gt; conmon -\u0026gt; crun), whereas Docker/Containerd uses a layered delegation model (CLI -\u0026gt; Daemon -\u0026gt; OCI Runtime).\nComparison of Pros, Cons, and Professional Use Cases # Tool Core Advantages Core Disadvantages Professional Recommendation Docker Unparalleled Ecosystem: Has the most extensive third-party tooling, documentation, and community support. Cross-platform Consistency: Docker Desktop provides the most seamless development experience on Windows/macOS. Inherent Flaws of the Daemon Architecture: Security risks, a single point of failure, and less native integration with modern Linux system management (systemd). Use Cases: In teams that require deep integration with the vast and mature Docker ecosystem toolchain; when a top-priority, cross-platform development experience is needed; for legacy systems or during the initial learning phase where tutorials are abundant. Podman Superior Security \u0026amp; System Integration: The daemonless architecture and native rootless mode are its biggest highlights. Perfect Fusion with systemd: Achieves declarative \u0026ldquo;Infrastructure as Code\u0026rdquo; through Quadlet. Lightweight: No background service means lower resource consumption. Relatively New Ecosystem: Although the CLI is Docker-compatible, some third-party tools that depend on the Docker socket may require adaptation. 
Non-Linux Experience: Relies on a VM on Windows/macOS, making the experience less polished than Docker Desktop. Use Cases: The top choice for all modern Linux server environments. For building secure, predictable, and easily automated production systems with declarative management. For a lighter and more secure build environment in CI/CD pipelines. Containerd Stable, Efficient, Standards-Compliant: Designed as a cornerstone for cloud-native platforms and has been battle-tested by large-scale systems like Kubernetes. Componentized: It does one thing—being a container runtime—and does it exceptionally well. Not End-User-Facing: It is an underlying component, not an \u0026ldquo;all-in-one\u0026rdquo; tool. It lacks a user-friendly CLI (nerdctl is bridging this gap) and out-of-the-box networking and storage solutions. Use Cases: As the underlying runtime for container orchestration platforms like Kubernetes. When you need to build your own containerized platform or PaaS, Containerd is the ideal, pluggable core engine. General developers and operators rarely need to interact with it directly. Tips # Podman represents the evolutionary direction of container technology, especially in its fusion with modern Linux operating system philosophy. For professionals pursuing security, predictability, and declarative management, it is the undisputed future in the Linux environment. Docker, with its first-mover advantage and vast ecosystem, will remain a major industry player for the foreseeable future, particularly in cross-platform development and legacy systems. Containerd is the unsung hero behind it all—an industrial-grade, standard component that keeps the entire cloud-native world running stably. 
As a professional, the choice of which tool to use depends on a trade-off between the architecture, security, and operational model for a specific scenario, rather than a simple feature-list comparison.\n","date":"14 August 2025","externalUrl":null,"permalink":"/posts/container/oci_compare/","section":"Posts","summary":"","title":"Container Runtimes Explained: Podman vs. Docker vs. Containerd","type":"posts"},{"content":" Podman Container Auto-Updates: Native Integration vs. Watchtower # In both production and development environments, ensuring container images are up-to-date is key to maintaining system security and functionality. This article will explore the two main approaches for automatically updating Podman containers: Watchtower and Podman\u0026rsquo;s native podman-auto-update mechanism.\nA Comparison of Approaches: Architectural Philosophy and Design # Choosing an approach is essentially choosing between two different management philosophies.\nComparison Aspect Podman Native Approach (podman-auto-update) Watchtower (External Management) Core Architecture Declarative \u0026amp; System-Integrated. You simply declare \u0026ldquo;this container should be updated,\u0026rdquo; and the rest of the work is handled entirely by the operating system\u0026rsquo;s core systemd timers. Imperative \u0026amp; External Polling. You run a separate manager container that actively and continuously polls the Podman API to check for and perform updates. Design Philosophy \u0026ldquo;Let the system manage me.\u0026rdquo; It leverages the host\u0026rsquo;s existing, extremely reliable init system to achieve zero-overhead management. \u0026ldquo;I will manage you.\u0026rdquo; It introduces an external, stateful manager to monitor and operate other containers. Reliability Extremely High. Its reliability is equivalent to that of systemd itself, one of the most robust components of a Linux system. Moderate. 
It depends on the stability of the Watchtower container itself, which is another piece of software that needs to be managed and could crash or be misconfigured. Overhead Almost Zero. There are no running processes or memory usage at rest. It only spawns a short-lived process at the moment the check is triggered. Continuous Overhead. It requires a long-running container that constantly consumes a small amount of CPU and memory. Security Superior. There are no extra running daemons, which reduces the attack surface. It aligns perfectly with Podman\u0026rsquo;s rootless philosophy. Introduces Additional Risk. It requires mounting the Podman socket into the container, which is a privileged operation that needs to be handled with care. Features Focuses on the core update functionality. Its purpose is pure, with no built-in extras like notifications. Feature-rich. It supports sending update notifications (Webhook, Email), finer-grained filtering, and more control. Tip: The Podman native approach embodies the modern DevOps philosophy of deep integration with the operating system. It is more secure, more reliable, and more resource-efficient. While Watchtower is a versatile general-purpose solution, it introduces unnecessary complexity and management overhead into the Podman ecosystem. 
For professional users who value robustness and simplicity, the native approach is the clear first choice.\nThe Native Approach: podman-auto-update # This method works for both rootless and rootful modes.\nStep 1: Declare the \u0026ldquo;Auto-Update\u0026rdquo; Intent for the Container # Using Podman\u0026rsquo;s Quadlet tool, we only need to add a single line to the container\u0026rsquo;s .container configuration file.\nLocate or create your Quadlet file.\nRootless Mode (Recommended): Files are in ~/.config/containers/systemd/ Rootful Mode: Files are in /etc/containers/systemd/ Add AutoUpdate=registry to the [Container] section (older guides use AutoUpdate=image, which Podman still accepts as a legacy alias).\nHere is an example of a typical caddy.container file:\n[Unit]\nDescription=Caddy web server\nAfter=network-online.target\nWants=network-online.target\n\n[Container]\nContainerName=caddy\nImage=docker.io/library/caddy:latest\n# Core: Declare that this container should be auto-updated\nAutoUpdate=registry\n# Quadlet publishes ports with PublishPort= (there is no Port= key)\nPublishPort=80:80\nPublishPort=443:443\n\n[Install]\nWantedBy=default.target\nApply the changes. After modifying or creating the file, reload systemd and restart your container service to apply the new label.\nRootless Mode: systemctl --user daemon-reload \u0026amp;\u0026amp; systemctl --user restart caddy.service Rootful Mode: sudo systemctl daemon-reload \u0026amp;\u0026amp; sudo systemctl restart caddy.service Step 2: Activate the Global Update Timer # Podman ships with a systemd timer that we just need to activate.\nRootless Mode: systemctl --user enable --now podman-auto-update.timer\nRootful Mode: sudo systemctl enable --now podman-auto-update.timer\nenable ensures it starts on boot, and --now starts the timer immediately. At this point, your auto-update is configured and will run on its default schedule.\nStep 3: (Optional) Customize the Update Frequency # The default schedule is OnCalendar=daily, i.e. once a day at midnight (plus a short randomized delay). To modify this safely, it\u0026rsquo;s recommended to use systemd\u0026rsquo;s override mechanism.\nOpen the edit interface. 
This command will automatically create an override file for you to edit.\nRootless Mode: systemctl --user edit podman-auto-update.timer Rootful Mode: sudo systemctl edit podman-auto-update.timer Enter your new schedule in the editor. The OnCalendar= field follows the systemd.time calendar event format.\n# Rootful override path: /etc/systemd/system/podman-auto-update.timer.d/override.conf\n# Rootless override path: ~/.config/systemd/user/podman-auto-update.timer.d/override.conf\n[Timer]\n# Clear the inherited schedule from the original file to ensure our setting is the only one\nOnCalendar=\n# Set the new schedule, for example: every day at 3:30 AM\nOnCalendar=*-*-* 03:30:00\nCommon OnCalendar examples:\nRun once per hour: hourly Every Monday at 4:00 AM: Mon *-*-* 04:00:00 Every 15 minutes: *:0/15 Apply and verify your changes.\nRootless Mode: systemctl --user daemon-reload \u0026amp;\u0026amp; systemctl --user restart podman-auto-update.timer Rootful Mode: sudo systemctl daemon-reload \u0026amp;\u0026amp; sudo systemctl restart podman-auto-update.timer You can view the new trigger time with systemctl --user list-timers (or sudo systemctl list-timers) to confirm the configuration has been applied.\n","date":"14 August 2025","externalUrl":null,"permalink":"/posts/container/podman_autoupdate/","section":"Posts","summary":"","title":"Podman Container Auto-Updates: Native Integration vs. Watchtower","type":"posts"},{"content":"Abstract: This article documents the troubleshooting process for a classic networking issue encountered when migrating services from Docker to Podman (in rootless mode): a container being unable to access a host service via the host\u0026rsquo;s IP address. By analyzing Podman\u0026rsquo;s network model and working through a complex SSH proxy scenario, the issue was ultimately identified and resolved.\n1. Problem Statement # During the migration, a service that relied on the \u0026ldquo;container-accesses-host\u0026rdquo; pattern failed. 
In the Docker environment, this service could communicate directly using the host\u0026rsquo;s LAN or public IP address, but in a Podman rootless container, the connection timed out.\n2. Principle Analysis: Docker vs. Podman Rootless Network Models # Initial analysis revealed that the problem stemmed from fundamental differences in their network models.\nDocker (Rootful): The Docker daemon runs with root privileges, creating kernel-level virtual bridges (like docker0) and actively modifying the host\u0026rsquo;s iptables/nftables rules to support \u0026ldquo;Hairpin NAT\u0026rdquo;. This allows traffic from a container destined for the host\u0026rsquo;s real IP to be correctly routed back to the host. Podman (Rootless): For security reasons, Podman in rootless mode connects containers to the outside world through a user-space network stack (slirp4netns or pasta, depending on the Podman version) and lacks the permission to modify system-level iptables/nftables. Consequently, when a container tries to access the host\u0026rsquo;s real IP, the resulting Hairpin NAT traffic is dropped by the kernel\u0026rsquo;s default policy, causing the connection to fail. 3. Solution: The Correct Internal Communication Mechanism # Since the external loopback path is blocked, the internal communication mechanisms provided by Podman must be used.\n3.1. The Abstraction Layer: host.containers.internal # Podman provides a special DNS name, host.containers.internal, which is automatically resolved inside the container to the host\u0026rsquo;s address within the current container network. This is the preferred way to access the host, as it decouples the dependency on a specific IP.\nTesting connectivity from within the container:\n$ podman exec -it my_container /bin/sh\nping host.containers.internal\nPING host.containers.internal (169.254.1.2): 56 data bytes\n64 bytes from 169.254.1.2: seq=0 ttl=42 time=0.285 ms\n...\nThe test result confirms that Layer 3 connectivity from the container to the host is working.\n3.2. 
Firewall Policy # Traffic must be allowed through the host\u0026rsquo;s firewall. The core of the rule is to permit inbound traffic from the Podman subnet.\nGet the Podman network subnet:\n$ podman network inspect podman | grep subnet\n \u0026#34;subnet\u0026#34;: \u0026#34;10.89.0.0/24\u0026#34;,\nAdd a rule to nftables (using target port TARGET_PORT/tcp as an example):\n# Add a rule to the input chain of the default inet filter table\nsudo nft add rule inet filter input ip saddr 10.89.0.0/24 tcp dport TARGET_PORT accept\nPersist nftables rules: Rules added via the nft command will be lost on reboot. To make them permanent, save the current ruleset to the configuration file and enable the service.\n# Write the current ruleset to the configuration file.\n# tee runs as root here; a plain \u0026gt; redirect would be opened by the unprivileged shell and fail.\nsudo nft list ruleset | sudo tee /etc/nftables.conf \u0026gt; /dev/null\n\n# Ensure the nftables service starts on boot to load the rules\nsudo systemctl enable nftables.service\n4. Real-World Case Study: Accessing the Host\u0026rsquo;s SSH via a sing-box Proxy # After verifying the theory, I applied it to a complex real-world scenario for testing.\nScenario: Remote SSH client (WindTerm) -\u0026gt; Local sing-box client -\u0026gt; Server\u0026rsquo;s sing-box container -\u0026gt; Host\u0026rsquo;s SSH service (listening on TARGET_PORT). Client Configuration: WindTerm\u0026rsquo;s proxy was set to the local sing-box (listening on LOCAL_PROXY_PORT), with \u0026ldquo;Remote DNS resolution\u0026rdquo; enabled. The SSH hostname was set to host.containers.internal. Despite this setup, the connection still failed. This indicated the problem was not with the underlying network but with the logic of the proxy chain.\n4.1. 
Server-Side Validation # To rule out server-side issues, I initiated an SSH connection directly from within the sing-box container.\n# Install and run ssh inside the sing-box container\n$ podman exec -it sing-box /bin/sh\n/ # apk add openssh-client\n/ # ssh my_username@host.containers.internal -p TARGET_PORT\nThe authenticity of host ... can\u0026#39;t be established.\n...\nAre you sure you want to continue connecting (yes/no)? yes\n# Successfully received the password prompt\nmy_username@host.containers.internal\u0026#39;s password:\nConclusion: The server-side configuration was completely correct. The successful ssh connection proved that the network path from the container to the host, the firewall, and the SSH service itself were all working properly.\n4.2. Client-Side Log Analysis and Final Diagnosis # Since the server side was fine, the problem had to be on the client\u0026rsquo;s proxy chain. I checked the logs of the local sing-box client and found the following critical entries:\nerror [timestamp] connection: open outbound connection: NXDOMAIN\ninfo [timestamp] outbound/direct[]: outbound connection to host.containers.internal:TARGET_PORT\nThe logs clearly pointed out the problem:\nThe client received a request destined for host.containers.internal. However, its routing rules incorrectly matched this request to the direct (direct connection) outbound. The client then attempted to resolve host.containers.internal locally, resulting in an NXDOMAIN (domain does not exist) error. 4.3. Solution # Correct the routing configuration (the route section) in the client-side sing-box, adding a high-priority rule to forcibly route traffic for host.containers.internal to the server proxy outbound (tagged here as proxy-out).\n{\n  \u0026#34;route\u0026#34;: {\n    \u0026#34;rules\u0026#34;: [\n      {\n        \u0026#34;domain\u0026#34;: [\u0026#34;host.containers.internal\u0026#34;],\n        \u0026#34;outbound\u0026#34;: \u0026#34;proxy-out\u0026#34;\n      },\n      // ... other rules\n    ]\n  }\n}\nAfter applying this rule and restarting the client, the SSH connection succeeded.\n5. Summary # The isolation of Podman\u0026#39;s rootless networking is fundamental to its security but leads to different network behavior compared to Docker, especially in Hairpin NAT scenarios. host.containers.internal is the standard abstraction layer for a Podman container to access its host and should be used preferentially. You must configure corresponding firewall inbound rules for the Podman subnet. In complex proxy or network chains, end-to-end log analysis and testing with the native protocol at key nodes are the most efficient and reliable troubleshooting methods. ","date":"12 August 2025","externalUrl":null,"permalink":"/posts/container/podman_network/","section":"Posts","summary":"","title":"Troubleshooting Podman Container Access to the Host Network","type":"posts"},{"content":"","date":"6 August 2025","externalUrl":null,"permalink":"/tags/ai/","section":"Tags","summary":"","title":"AI","type":"tags"},{"content":"","date":"6 August 2025","externalUrl":null,"permalink":"/categories/python/","section":"Categories","summary":"","title":"Python","type":"categories"},{"content":" 🧠 Project Goal # Use the transformers library to load a Hugging Face reranker model and expose it as a REST API using FastAPI.\n🧪 Code Example #\nfrom fastapi import FastAPI, HTTPException\nfrom pydantic import BaseModel\nfrom transformers import AutoTokenizer, AutoModelForSequenceClassification\nimport torch\n\n# Load the model and tokenizer.\n# bge-reranker is a cross-encoder, so it is loaded with its classification head.\ntokenizer = AutoTokenizer.from_pretrained(\u0026#34;BAAI/bge-reranker-large\u0026#34;)\nmodel = AutoModelForSequenceClassification.from_pretrained(\u0026#34;BAAI/bge-reranker-large\u0026#34;)\nmodel.eval()\n\n# Move to GPU if available (optional)\ndevice = \u0026#34;cuda\u0026#34; if torch.cuda.is_available() else \u0026#34;cpu\u0026#34;\nmodel = model.to(device)\n\napp = FastAPI(title=\u0026#34;BGE Reranker API\u0026#34;, version=\u0026#34;1.0\u0026#34;)\n\n# Define the request body model\nclass RerankRequest(BaseModel):\n    query: str\n    documents: list[str]\n\n@app.post(\u0026#34;/rerank\u0026#34;)\nasync def rerank(request: RerankRequest):\n    try:\n        # Tokenize each (query, document) pair together so the model scores them jointly\n        pairs = [[request.query, doc] for doc in request.documents]\n        inputs = tokenizer(pairs, padding=True, truncation=True, max_length=512, return_tensors=\u0026#34;pt\u0026#34;)\n        inputs = {k: v.to(device) for k, v in inputs.items()}\n\n        with torch.no_grad():\n            # One relevance logit per pair; a higher value means more relevant\n            scores = model(**inputs).logits.view(-1).float().tolist()\n\n        # Sort by score (the second tuple element), highest first\n        ranked_docs = sorted(\n            zip(request.documents, scores),\n            key=lambda x: x[1],\n            reverse=True\n        )\n\n        return {\u0026#34;results\u0026#34;: [{\u0026#34;document\u0026#34;: doc, \u0026#34;score\u0026#34;: score} for doc, score in ranked_docs]}\n    except Exception as e:\n        raise HTTPException(status_code=500, detail=str(e))\n\nif __name__ == \u0026#34;__main__\u0026#34;:\n    import uvicorn\n    uvicorn.run(app, host=\u0026#34;0.0.0.0\u0026#34;, port=58222)\n","date":"6 August 2025","externalUrl":null,"permalink":"/posts/python/hgface/","section":"Posts","summary":"","title":"Python: Implementing a Hugging Face Model API","type":"posts"},{"content":"","date":"5 August 2025","externalUrl":null,"permalink":"/categories/go/","section":"Categories","summary":"","title":"Go","type":"categories"},{"content":" 🌐 Project Goal # Build a simple web service using the Gin framework.\n🛠️ 
Project Structure #\ntodo-api/\n├── main.go # Entrypoint file\n├── routes/\n│   └── todo_routes.go # Route definitions\n├── models/\n│   └── todo.go # Data structure\n├── middleware/\n│   └── logging.go # Custom middleware\n├── config/\n│   └── config.go # Configuration management\n└── go.mod # Go module\n📦 Install Dependencies #\ngo mod init todo-api\ngo get -u github.com/gin-gonic/gin\ngo get -u github.com/jackc/pgx/v4\n# config/config.go imports godotenv, so it has to be fetched as well\ngo get -u github.com/joho/godotenv\n🧪 Code Examples # 1. Configuration Management (config/config.go) #\npackage config\n\nimport \u0026#34;github.com/joho/godotenv\u0026#34;\n\n// LoadEnv reads the .env file in the working directory into the process environment.\nfunc LoadEnv() {\n\terr := godotenv.Load()\n\tif err != nil {\n\t\tpanic(\u0026#34;Error loading .env file\u0026#34;)\n\t}\n}\n2. Database Connection (main.go) #\npackage main\n\nimport (\n\t\u0026#34;context\u0026#34;\n\t\u0026#34;fmt\u0026#34;\n\t\u0026#34;log\u0026#34;\n\t\u0026#34;os\u0026#34;\n\n\t\u0026#34;github.com/gin-gonic/gin\u0026#34;\n\t\u0026#34;github.com/jackc/pgx/v4\u0026#34;\n\n\t\u0026#34;todo-api/config\u0026#34;\n\t\u0026#34;todo-api/routes\u0026#34;\n)\n\ntype Todo struct {\n\tID int `json:\u0026#34;id\u0026#34;`\n\tTitle string `json:\u0026#34;title\u0026#34;`\n}\n\nfunc main() {\n\t// 1. Load environment variables\n\tconfig.LoadEnv()\n\n\t// 2. Connect to PostgreSQL\n\tconnStr := fmt.Sprintf(\n\t\t\u0026#34;postgres://%s:%s@%s:%s/%s?sslmode=disable\u0026#34;,\n\t\tos.Getenv(\u0026#34;DB_USER\u0026#34;),\n\t\tos.Getenv(\u0026#34;DB_PASSWORD\u0026#34;),\n\t\tos.Getenv(\u0026#34;DB_HOST\u0026#34;),\n\t\tos.Getenv(\u0026#34;DB_PORT\u0026#34;),\n\t\tos.Getenv(\u0026#34;DB_NAME\u0026#34;),\n\t)\n\n\t// Note: a single *pgx.Conn is not safe for concurrent use;\n\t// a real service would typically use pgxpool instead.\n\tconn, err := pgx.Connect(context.Background(), connStr)\n\tif err != nil {\n\t\tlog.Fatal(\u0026#34;Unable to connect to database:\u0026#34;, err)\n\t}\n\tdefer conn.Close(context.Background())\n\n\t// 3. Create Gin application\n\tr := gin.Default()\n\n\t// 4. Register routes\n\ttodoRoutes := routes.TodoRoutes{DB: conn}\n\tr.POST(\u0026#34;/todos\u0026#34;, todoRoutes.CreateTodo)\n\tr.GET(\u0026#34;/todos\u0026#34;, todoRoutes.GetAllTodos)\n\n\t// 5. Start the server\n\tif err := r.Run(\u0026#34;:8080\u0026#34;); err != nil {\n\t\tlog.Fatal(\u0026#34;Server stopped:\u0026#34;, err)\n\t}\n}\n3. Route Implementation (routes/todo_routes.go) #\npackage routes\n\nimport (\n\t\u0026#34;context\u0026#34;\n\n\t\u0026#34;github.com/gin-gonic/gin\u0026#34;\n\t\u0026#34;github.com/jackc/pgx/v4\u0026#34;\n\n\t\u0026#34;todo-api/models\u0026#34;\n)\n\n// TodoRoutes holds the database handle shared by the handlers.\ntype TodoRoutes struct {\n\tDB *pgx.Conn\n}\n\n// CreateTodo inserts a new todo from the JSON request body.\nfunc (tr *TodoRoutes) CreateTodo(c *gin.Context) {\n\tvar todo models.Todo\n\tif err := c.ShouldBindJSON(\u0026amp;todo); err != nil {\n\t\tc.JSON(400, gin.H{\u0026#34;error\u0026#34;: err.Error()})\n\t\treturn\n\t}\n\n\t_, err := tr.DB.Exec(context.Background(), \u0026#34;INSERT INTO todos (title) VALUES ($1)\u0026#34;, todo.Title)\n\tif err != nil {\n\t\tc.JSON(500, gin.H{\u0026#34;error\u0026#34;: \u0026#34;Database operation failed\u0026#34;})\n\t\treturn\n\t}\n\n\tc.JSON(201, gin.H{\u0026#34;message\u0026#34;: \u0026#34;Successfully created task\u0026#34;})\n}\n\n// GetAllTodos returns every row of the todos table.\nfunc (tr *TodoRoutes) GetAllTodos(c *gin.Context) {\n\trows, err := tr.DB.Query(context.Background(), \u0026#34;SELECT id, title FROM todos\u0026#34;)\n\tif err != nil {\n\t\tc.JSON(500, gin.H{\u0026#34;error\u0026#34;: \u0026#34;Query failed\u0026#34;})\n\t\treturn\n\t}\n\tdefer rows.Close() // Good practice to close rows\n\n\tvar todos []models.Todo\n\tfor rows.Next() {\n\t\tvar t models.Todo\n\t\tif err := rows.Scan(\u0026amp;t.ID, \u0026amp;t.Title); err != nil {\n\t\t\tc.JSON(500, gin.H{\u0026#34;error\u0026#34;: \u0026#34;Failed to parse results\u0026#34;})\n\t\t\treturn\n\t\t}\n\t\ttodos = append(todos, t)\n\t}\n\t// rows.Err reports any error that ended the iteration early\n\tif err := rows.Err(); err != nil {\n\t\tc.JSON(500, gin.H{\u0026#34;error\u0026#34;: \u0026#34;Query failed\u0026#34;})\n\t\treturn\n\t}\n\n\tc.JSON(200, todos)\n}\n","date":"5 August 2025","externalUrl":null,"permalink":"/posts/go/simpleweb/","section":"Posts","summary":"","title":"Go: Building a Simple Web 
Backend","type":"posts"},{"content":"","date":"2 August 2025","externalUrl":null,"permalink":"/tags/k8s/","section":"Tags","summary":"","title":"K8s","type":"tags"},{"content":" Check for Leftover Resources: #\nkubectl get all -n efk -l app=kibana\nkubectl get configmaps,secrets -n efk | grep kibana\nkubectl get pvc -n efk | grep kibana\nCleanup #\nkubectl delete configmap kibana-kibana-helm-scripts -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete serviceaccount pre-install-kibana-kibana -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete role pre-install-kibana-kibana -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete rolebinding pre-install-kibana-kibana -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete job pre-install-kibana-kibana -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete secret sh.helm.release.v1.kibana.v1 -n efk --force --grace-period=0 \u0026amp;\u0026amp; \\\nkubectl delete secret kibana-kibana-es-token -n efk\n","date":"2 August 2025","externalUrl":null,"permalink":"/posts/devops/kubectl/","section":"Posts","summary":"","title":"kubectl Commands for Resource Cleanup","type":"posts"},{"content":"Hi, I\u0026rsquo;m Yuzjing 👋\nWelcome to my personal corner of the web. I\u0026rsquo;m a DevOps engineer with a constant curiosity for new technologies. I believe they are engines of change, offering elegant solutions to old problems and unlocking new possibilities, ultimately helping us make the most of our time.\nThis website is my space to document sparks of inspiration, lessons learned from challenges, and some occasional thoughts.\nBeyond the code, I\u0026rsquo;m also exploring\u0026hellip;\n🌲 Into the Wild: On weekends, I love hiking in the nearby mountains. 
The fresh air is the perfect way to clear my mind and reboot.\n🎵 A World of Music: My playlist is a melting pot, spanning from the melodies of Pop and the rhythms of House to the energy of Hardcore. For me, music is the perfect background for building mood and focus.\n🚀 Exploring Fictional Worlds: Whether it\u0026rsquo;s the distant galaxies of sci-fi novels or the complex strategies of video games, I\u0026rsquo;m fascinated by exploring the grand worlds built from imagination.\nMy Projects # Here\u0026rsquo;s a small tool I built:\nGitHub Actions IP Firewall Auto-Updater # The Problem: Updating servers that use an IP allowlist (like Nftables) via GitHub Actions often failed because the runner IP addresses are dynamic. My Solution: I wrote a small tool in Go that automatically fetches the latest IP ranges used by GitHub Actions and updates the firewall rules. Combined with a Cron job, it creates a true \u0026ldquo;set it and forget it\u0026rdquo; solution. Find the code: You can check out the source code on GitHub. Get in Touch # If you\u0026rsquo;re interested in my projects, want to talk about new tech, or have a great song to recommend, feel free to connect with me on GitHub.\n","date":"1 August 2025","externalUrl":null,"permalink":"/about/","section":"涧雨暖云","summary":"","title":"About Me","type":"page"},{"content":"Abstract: This article documents the exploration of achieving service persistence for Podman in rootless mode. Starting with a common failure case of manually creating a systemd service, it analyzes the root cause and ultimately transitions to using Quadlet, the officially recommended tool from Podman. Through a deep dive into Quadlet\u0026rsquo;s mechanics, this article explains its declarative service management approach and presents the resulting best practices.\n1. 
The Goal and the Failure of the Initial Attempt # After migrating from Docker to Podman, one of the core goals was to achieve persistence and auto-start for containerized services.\nInitially, I attempted to write a traditional systemd user service for a podman-compose project. However, this approach failed because systemd\u0026rsquo;s lifecycle management for background processes (like podman-compose up -d) was incompatible with expectations, leading the service into a \u0026ldquo;start-stop\u0026rdquo; infinite loop. This prompted me to turn to Podman\u0026rsquo;s official solution: Quadlet.\n2. Initial Quadlet Exploration and a New Confusion # The core idea behind Quadlet is that a user only needs to write a simple .container file, which is then automatically \u0026ldquo;translated\u0026rdquo; into a complex .service file by a systemd generator.\nFollowing the compose.yml file, I created a corresponding Quadlet file ~/.config/containers/systemd/sing-box.container for the sing-box container, ensuring it included an [Install] section to define its auto-start behavior.\nHowever, after running systemctl --user daemon-reload, my attempt to use systemctl --user enable sing-box.service was repeatedly met with the error: Failed to enable unit: Unit ... is transient or generated. Interestingly, systemctl --user start sing-box.service was able to start the container successfully.\nThis indicated that the Quadlet file itself was valid, but there was an interaction between it and systemd\u0026rsquo;s enable mechanism that was beyond conventional understanding.\n3. 
Root Cause Analysis: The Inner Workings of the Quadlet Generator and [Install] # Through research and repeated experimentation, the root of the problem became clear: Quadlet\u0026rsquo;s autostart capability is not granted by the systemctl enable command but is automatically handled when its generator reads the [Install] section in the .container file.\nThe systemd workflow is as follows:\nWrite: The user creates a .container file with an [Install] section in the specified directory¹. This is the \u0026ldquo;autostart directive\u0026rdquo; given to systemd.\ndaemon-reload triggers: When systemctl daemon-reload² is executed, the quadlet-generator is activated. It scans the directory, finds the .container file, and performs two key actions:\nGenerates a transient service file: In a temporary runtime directory (like /run/systemd/generator/), it creates a .service file that does not include the [Install] section. This file is used for the actual start and stop operations. Automatically creates symbolic links: It reads the [Install] section from the \u0026ldquo;blueprint\u0026rdquo; and directly performs the core task of enable on its behalf—creating symbolic links from the appropriate .wants/ directory (based on WantedBy) to the transient .service file. The enable command issue: After this, when the user manually runs systemctl enable, systemd sees that the service has already been \u0026ldquo;installed\u0026rdquo; by a generator and points to a transient file. According to its design principles, a user should not directly enable a transient unit managed by a generator, so it returns the Unit is transient or generated error. This error is actually a hint: \u0026ldquo;I\u0026rsquo;ve already taken care of this for you.\u0026rdquo;\n¹ Quadlet File Paths: For user services (Rootless), the path is ~/.config/containers/systemd/. For system services (Rootful), the path is /etc/containers/systemd/. 
² Command Note: For user services, the command is systemctl --user daemon-reload. For system services, it is sudo systemctl daemon-reload.\n4. [Install] Configuration and the Final Workflow # 4.1. Correctly Configuring the [Install] Section # The value of WantedBy= depends on the mode you are running Podman in:\nRootless Mode: The service belongs to the user session and should start with the user session (or at boot if linger is enabled for the user with loginctl enable-linger).\n[Install]\nWantedBy=default.target\nRootful Mode: The service belongs to the system and should start when the system enters the multi-user state.\n[Install]\nWantedBy=multi-user.target\n4.2. The Final, Simplified Workflow # Write or Modify the Quadlet source file: Ensure the file syntax is correct and includes the appropriate [Install] section for your runtime mode.\nApply Changes: After every modification to the .container file, execute the following command to notify systemd and trigger the generator. This is the only necessary management command.\n# For user services\nsystemctl --user daemon-reload\n# For system services\nsudo systemctl daemon-reload\nStart and Verify: Once daemon-reload is complete, the service is already in an \u0026ldquo;installed\u0026rdquo; and \u0026ldquo;enabled\u0026rdquo; state.\n# Start the service (for a system service, use sudo and drop --user)\nsystemctl --user start your-service.service\n\n# Verify that it is enabled\nsystemctl --user is-enabled your-service.service\n# Expected output: generated\n5. Summary # This troubleshooting journey, which started from a seemingly simple need for service persistence, led to a deep dive into the complex interaction between systemd and the Quadlet generator. 
It revealed the powerful automation behind declarative tools and clarified the changing role of traditional systemctl commands when interacting with generators.\nThe final conclusion is that Quadlet achieves highly automated and declarative persistence management by reading the [Install] configuration in the .container file and automatically completing the service \u0026ldquo;installation\u0026rdquo; (the core work of enable) during a daemon-reload. Understanding the difference in [Install] configuration between rootful and rootless modes and following the \u0026ldquo;edit source file -\u0026gt; daemon-reload -\u0026gt; start\u0026rdquo; workflow is key to mastering Podman container deployment.\n","date":"31 July 2025","externalUrl":null,"permalink":"/posts/container/quadlets/","section":"Posts","summary":"","title":"Exploring Persistence for Podman Rootless Containers: A Deep Dive into Quadlets","type":"posts"},{"content":"","date":"30 July 2025","externalUrl":null,"permalink":"/categories/k8s/","section":"Categories","summary":"","title":"K8s","type":"categories"},{"content":" 1. General Deployment Considerations # Version Compatibility: Various K8s components have version compatibility issues. Therefore, it\u0026rsquo;s crucial to select versions carefully during deployment, especially for components like containerd, kubernetes-dashboard, and etcd. Calico Network Plugin: Deployment may require bypassing network restrictions (e.g., a VPN). If using a domestic mirror, you\u0026rsquo;ll need to modify the containerd config.toml file. If deployment still fails, you can try manually pulling the images first. Helm Domestic Mirrors: Helm charts are just templates; the actual image pulling happens on the cluster nodes. Therefore, you must ensure the containerd registry on the cluster is accessible. 
Local Management Tools: By installing helm and kubectl on your personal computer and configuring your environment variables, you can directly manage remote K8s clusters using these commands. 2. Dashboard Showing No Data # If you find the dashboard shows no data after installation, it could be due to the following reasons:\nIncorrect Namespace Selected: Check if you have selected the correct namespace. The default default namespace has no monitoring data. Metrics Server Not Installed: You need to install the metrics-server component to collect metrics. 1kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml Kubelet Certificate Issue: If the Kubelet certificate is not signed by a CA trusted by the Metrics Server, the TLS handshake will fail, preventing metrics collection. To fix this, you need to add the following argument to the metrics-server deployment configuration to trust insecure Kubelet certificates: 1- --kubelet-insecure-tls=true 3. K8s Cluster High Availability Process # Fault Detection: Kubernetes (via the Node Controller or Liveness Probes) detects an issue with a Pod or its host Node. Recreation Triggered: The replica controller (Deployment/StatefulSet) notices that the number of running Pods is below the desired count. New Pod Scheduled: The Kubernetes scheduler selects a healthy Node to create a new Pod. Use of Persistent Configuration (if enabled): The new Pod requests the same PersistentVolumeClaim (PVC) it was using before. For network storage (like NFS), the new Pod mounts it directly. For block storage (like EBS), Kubernetes and the CSI driver ensure the volume is safely detached from the failed Node and then attached to the new Node before being mounted by the new Pod. Service Discovery: The Service automatically routes traffic to this new, healthy Pod instance. 4. 
How to Check the Desired Replica Count # For a Deployment:\nkubectl get deployment \u0026lt;deployment-name\u0026gt; -n \u0026lt;namespace\u0026gt; -o wide Observe the READY column: current kubectl versions print it as ready/desired, so the denominator is the desired replica count (older versions printed a separate DESIRED column).\nFor a StatefulSet:\nkubectl get statefulset \u0026lt;statefulset-name\u0026gt; -n \u0026lt;namespace\u0026gt; -o wide For a ReplicaSet:\nkubectl get replicaset \u0026lt;replicaset-name\u0026gt; -n \u0026lt;namespace\u0026gt; -o wide Here the DESIRED column shows the desired replica count. 5. Understanding the Pod\u0026rsquo;s READY Column and Replica Count # In the output of kubectl get pods, the denominator in the READY column (e.g., the second 1 in 1/1) represents the number of containers defined within a single Pod.\nThis is a different concept from the controller\u0026rsquo;s desired replica count. The desired replica count refers to how many instances of that Pod the controller should be running.\nA Pod can contain multiple containers, which Kubernetes schedules and manages together as a single unit; the READY column of kubectl get pods counts how many of those containers are ready.\nThe controller\u0026rsquo;s desired replica count must be checked using kubectl get deployment/statefulset.\n6. About Persistent Storage # If your Pod is not configured with persistent storage (i.e., it uses the Pod\u0026rsquo;s ephemeral storage like emptyDir, or the container\u0026rsquo;s own temporary filesystem layer), then all data written by the containers to that ephemeral storage is lost whenever the Pod is deleted and recreated, for any reason.\nThe new Pod instance will start in a completely new, clean state.\n7. Understanding Ports in a Service # The flow of a request is typically as follows: External Request -\u0026gt; Node(NodeIP:nodePort) -\u0026gt; Service(ClusterIP:port) -\u0026gt; Pod(PodIP:targetPort)\ntargetPort\nDefinition: The port on the backend Pod\u0026rsquo;s container to which the Service forwards traffic. Scope: Inside the Pod. 
Value: Can be a number (8080) or a port name defined in the Pod spec (http-api). nodePort\nDefinition: A static port exposed on each Node\u0026rsquo;s IP address when the Service type is NodePort or LoadBalancer. Scope: Reachable from outside the cluster, on every Node\u0026rsquo;s IP. Value: The default range is 30000-32767. It allows access from outside the cluster via http://\u0026lt;Node-IP\u0026gt;:\u0026lt;nodePort\u0026gt;. port\nDefinition: The port that the Service exposes on its own internal ClusterIP. Scope: Inside the cluster. Purpose: Other Pods within the cluster can access this service via http://\u0026lt;ServiceName\u0026gt;.\u0026lt;Namespace\u0026gt;.svc.cluster.local:\u0026lt;port\u0026gt;. 8. Pitfalls Encountered When Deploying Fluentd # When deploying bitnami/fluentd, I encountered the following issues (the successful values.yaml is in Fluentd.yaml):\nMissing CRI Plugin: The official chart was missing the cri plugin for the forward input, defaulting to drop all, which prevented container logs from being correctly processed and parsed. Missing Elasticsearch Plugin: The Aggregator configuration also lacked the Elasticsearch plugin, making it impossible to send logs to ES. Protocol Error: The default scheme for connecting to ES was http, but the ES port 9200 is typically https, requiring a change. Authentication Failure: Connecting to ES requires specifying a username and password (usually the Kibana login credentials); otherwise, it results in a 401 error. Namespace Label Issue: I discovered that all logs from the efk namespace were being rejected by ES. An AI suggestion pointed to labels containing . and / as a possible cause. Upon inspection, I found the namespace had an extra name: efk label. # Incorrect namespace labels ❯ kubectl get ns efk -o yaml apiVersion: v1 kind: Namespace metadata: creationTimestamp: \u0026#34;2025-06-06T06:56:03Z\u0026#34; labels: kubernetes.io/metadata.name: efk name: efk # \u0026lt;-- This line is redundant ... 
# Normal namespace labels ❯ kubectl get ns monitoring -o yaml apiVersion: v1 kind: Namespace metadata: creationTimestamp: \u0026#34;2025-05-22T05:45:59Z\u0026#34; labels: kubernetes.io/metadata.name: monitoring name: monitoring ... Strangely, neither deleting the extra label nor using a Ruby script to replace special characters worked right away. However, the issue resolved itself the next day, which suggests that the name label was indeed the problem and that the fix simply took time to take effect. 9. Ingress and MetalLB: Solving External Access # Ingress operates at Layer 7 (Application Layer) and can implement internal reverse proxying and load balancing. Services are accessed externally by hitting the Ingress Controller\u0026rsquo;s nodePort. Drawback: nodePort uses a high-numbered port by default (30000-32767), which is inconvenient and inelegant for external access. Solution: Install the MetalLB plugin. Purpose: When a Service is of type LoadBalancer, MetalLB automatically assigns it an IP from a predefined external address pool, which becomes its EXTERNAL-IP. Advantages: Eliminates the need to use high-numbered ports for access; you can use the IP directly. Provides a high-availability solution, avoiding the single point of failure that can occur with iptables-based traffic forwarding. ","date":"30 July 2025","externalUrl":null,"permalink":"/posts/devops/k8s_deployproblems/","section":"Posts","summary":"","title":"Pitfalls Encountered During K8s Cluster Deployment","type":"posts"},{"content":"Background: While deploying a service, I noticed that the container\u0026rsquo;s network behavior was consistently problematic. 
After some research, I discovered that the \u0026ldquo;local delivery mechanism\u0026rdquo; was the culprit, so I decided to take some notes.\nWhen a data packet\u0026rsquo;s destination IP is one of the local machine\u0026rsquo;s IP addresses (regardless of which interface it arrives on), the operating system will deliver the packet directly to a local application for processing, rather than forwarding it or dropping it. This behavior is called:\nLocal delivery Or, more descriptively: the IP belongs to the local machine (a local delivery decision) Delivering packets addressed to a local IP is standard behavior in every TCP/IP stack. Accepting them on an interface that does not own the destination address is the weak host model described in RFC 1122: Linux uses it by default, while Windows has defaulted to the strong host model since Windows Vista.\n🔍 An Example # Imagine you have a Linux host with two network interfaces:\neth0: 192.168.1.4/24 eth1: 192.168.5.4/24 You are running a web service on this host, listening on 0.0.0.0:80.\nNow, another machine at 192.168.5.30 sends an HTTP request to 192.168.1.4:80. However, due to a routing error or an incorrect ARP response, this packet actually arrives at the host\u0026rsquo;s eth1 interface (192.168.5.4).\nThe result: Linux can still process this request normally. It inspects the packet\u0026rsquo;s destination IP, 192.168.1.4, recognizes it as a local IP address, and therefore delivers the packet directly to the listening web service.\nThis is a classic example of \u0026ldquo;local delivery\u0026rdquo;.\n⚙️ How Do Linux and Windows Differ? # Feature Windows Linux Local Delivery Enabled by Default ✅ Yes ✅ Yes IP Forwarding Enabled by Default ❌ No ❌ No (by default) Configuration Method GUI / Registry / PowerShell sysctl parameters (e.g., net.ipv4.ip_forward) Allows Accessing Local IPs Across Interfaces ❌ Not by default since Windows Vista (strong host model, configurable per interface) ✅ Yes As you can see, both systems deliver packets addressed to a local IP by default; they differ in whether such a packet may arrive on an interface other than the one that owns the address.\n🛠️ Linux Kernel Parameters Affecting Local Delivery # Although local delivery is the default behavior, you can modify it in specific scenarios by adjusting kernel parameters.\n1. 
rp_filter (Reverse Path Filtering) # If strict rp_filter is enabled, Linux might drop packets that \u0026ldquo;arrive on the wrong interface.\u0026rdquo; This is a security mechanism to prevent IP source spoofing.\nnet.ipv4.conf.all.rp_filter = 1 rp_filter has three modes:\nstrict (1): Requires that the packet\u0026rsquo;s inbound interface is the best path to its source address according to the system\u0026rsquo;s routing table. In our example the packet passes this check, because eth1 is also the route back to 192.168.5.30; a packet arriving on eth1 whose source address is routed via eth0 would be dropped. loose (2): Only requires that the source address is reachable in the routing table, regardless of which interface it arrives on. disabled (0): Completely disables rp_filter. You can check your current setting with the following command:\nsysctl net.ipv4.conf.all.rp_filter 2. accept_local # net.ipv4.conf.all.accept_local = 1 This parameter controls whether the kernel accepts packets whose source IP is one of the machine\u0026rsquo;s own addresses. It is disabled (0) by default and is only needed in special setups, such as sending traffic between two local interfaces over the wire. Ordinary local delivery does not depend on it: a packet is delivered locally whenever its destination IP matches a local address.\n","date":"30 July 2025","externalUrl":null,"permalink":"/posts/devops/ip_delivert/","section":"Posts","summary":"","title":"The Local Delivery Mechanism in Networking","type":"posts"},{"content":" 1. Introduction # In Kubernetes, dnsPolicy is a critical field within a Pod\u0026rsquo;s spec that precisely defines how DNS resolution should be handled inside the Pod. When an application within your Pod attempts to resolve a domain name (whether it\u0026rsquo;s a cluster-internal service like my-service or an external domain like google.com), dnsPolicy determines which resolution process the system should follow.\n2. The Four DNS Policies # Kubernetes provides four distinct DNS policies to accommodate different application scenarios.\nPolicy (dnsPolicy) Core Behavior Primary Use Cases ClusterFirst (Default Policy) Cluster Priority: DNS requests are first sent to the cluster\u0026rsquo;s internal DNS service (e.g., CoreDNS). 
If the domain is a cluster service, it\u0026rsquo;s resolved directly; otherwise, the request is forwarded to the node\u0026rsquo;s upstream DNS server. The vast majority of standard applications. This is the most common and recommended configuration because it seamlessly supports both internal and external domain resolution. Default Inherit from Node: The Pod completely ignores the cluster\u0026rsquo;s DNS service and directly inherits the /etc/resolv.conf file configuration from its host node. 1. When a Pod does not need to access other services within the cluster.\n2. To resolve specific compatibility issues with the cluster\u0026rsquo;s DNS. None No Policy / Fully Custom: Kubernetes applies no DNS configuration to the Pod. You must provide a complete DNS setup manually using the dnsConfig field. When you need to use a completely separate, custom DNS resolution scheme, suitable for advanced network configurations. ClusterFirstWithHostNet hostNetwork version of ClusterFirst: Specifically designed for Pods with hostNetwork: true. Its behavior is similar to ClusterFirst, but its configuration is adapted for the host\u0026rsquo;s network. When a Pod needs to use the host\u0026rsquo;s network directly but still needs to resolve services within the cluster. 3. 
How to Configure # Both dnsPolicy and dnsConfig are configured in the spec section of a Pod\u0026rsquo;s YAML file.\nExample 1: Using ClusterFirst (Default) # If you don\u0026rsquo;t explicitly set dnsPolicy, it will automatically default to ClusterFirst.\napiVersion: v1 kind: Pod metadata: name: my-pod-default spec: containers: - name: my-app image: my-image # dnsPolicy: ClusterFirst \u0026lt;-- This line can be omitted as it\u0026#39;s the default Example 2: Using the Default Policy # This Pod will use its host node\u0026rsquo;s DNS configuration to resolve all domain names.\napiVersion: v1 kind: Pod metadata: name: my-pod-node-dns spec: containers: - name: my-app image: my-image # Set the DNS policy here dnsPolicy: Default Example 3: Using None with dnsConfig for Full Customization # In this example, we completely bypass Kubernetes DNS and configure Google\u0026rsquo;s DNS servers directly for the Pod.\napiVersion: v1 kind: Pod metadata: name: my-pod-custom-dns spec: containers: - name: my-app image: my-image # 1. Set the policy to \u0026#34;None\u0026#34; to enable full customization dnsPolicy: \u0026#34;None\u0026#34; # 2. Manually provide the complete DNS configuration dnsConfig: # Specify the IP addresses of the DNS servers nameservers: - 8.8.8.8 - 8.8.4.4 # Specify DNS search domains (for resolving short domain names) searches: - my-namespace.svc.cluster.local - svc.cluster.local - cluster.local # DNS resolver options options: - name: ndots value: \u0026#34;5\u0026#34; 4. Deep Dive into the dnsConfig Field # The dnsConfig field allows for more granular control over DNS and can be used in conjunction with dnsPolicy.\nnameservers: A list of IP addresses to be used as DNS servers for the Pod. A maximum of 3 can be specified. searches: A list of DNS search domains. When resolving a short domain name that doesn\u0026rsquo;t contain a dot (.) 
(e.g., my-service), the system will search under these domains in order. For example, it will attempt to resolve my-service.my-namespace.svc.cluster.local, then my-service.svc.cluster.local, and so on. options: A list of objects for setting DNS resolver options. name: The name of the option (e.g., ndots, timeout). value: The value for the option. Note: The behavior of dnsConfig differs depending on the dnsPolicy:\nIf dnsPolicy is ClusterFirst, the nameservers provided in dnsConfig are added after the cluster\u0026rsquo;s CoreDNS as fallbacks. The searches and options are merged with the default values. If dnsPolicy is None, dnsConfig must provide all necessary information, as it will be the Pod\u0026rsquo;s sole DNS configuration. 5. Summary # For everyday use: Stick with the default ClusterFirst policy; it satisfies 99% of use cases. For special requirements: Use Default when you need to bypass the cluster DNS. For advanced customization: Use None in combination with dnsConfig when you need to integrate with a completely different DNS system. ","date":"2 July 2025","externalUrl":null,"permalink":"/posts/devops/k8s_dnspolicy/","section":"Posts","summary":"","title":"A Deep Dive into Kubernetes dnsPolicy","type":"posts"},{"content":"When troubleshooting K8s issues, there are three core commands:\nkubectl describe pod/node \u0026lt;name\u0026gt;: To check resource Events and identify the root cause. kubectl logs \u0026lt;pod-name\u0026gt;: To check application logs and resolve program issues. kubectl get \u0026lt;resource-type\u0026gt;: To check the status of resources. Layer 1: Pod Status Codes # Status Code (Status) Core Reason Core Troubleshooting Steps Pending Cannot be scheduled: The scheduler cannot find a suitable node. 1. 
kubectl describe pod \u0026lt;name\u0026gt;, check Events to find the specific reason:\n- Insufficient cpu/memory (Not enough resources).\n- Taints/Tolerations (Mismatch between taints and tolerations).\n- Affinity rules (Mismatch in affinity/anti-affinity rules).\n- PVC not bound (PersistentVolumeClaim is not ready). ImagePullBackOff / ErrImagePull Image pull failed: The Kubelet cannot pull the container image from the registry. 1. kubectl describe pod \u0026lt;name\u0026gt;, check Events to find the specific reason:\n- Incorrect image name or tag (Check the YAML).\n- Private registry authentication failed (Check imagePullSecrets).\n- Network issue (Log in to the node and test with docker/crictl pull). CrashLoopBackOff Container is crashing repeatedly: The container exits immediately after starting, and the Kubelet keeps restarting it. 1. kubectl logs \u0026lt;pod-name\u0026gt; --previous (Check the logs of the previous crash, extremely important).\n2. kubectl logs \u0026lt;pod-name\u0026gt; (Check the current logs).\n3. Investigate application bugs, configuration errors, or out-of-memory issues based on the logs. RunContainerError Container runtime error: The configuration is correct, but the underlying container runtime (e.g., containerd) cannot start the container. 1. kubectl describe pod \u0026lt;name\u0026gt;, Events will show RunContainerError.\n2. SSH into the node and use journalctl -u containerd (or docker) to check the runtime logs for more low-level error messages. CreateContainerConfigError Container configuration error: There is an issue with the configuration required to create the container (e.g., a ConfigMap or Secret). 1. kubectl describe pod \u0026lt;name\u0026gt;, Events will clearly state which resource is missing or has a format error. Running (but Ready is 0/1) Readiness Probe failed: The Pod is running, but it is not ready to receive traffic. 1. kubectl describe pod \u0026lt;name\u0026gt;, Events will record Readiness probe failed.\n2. 
Check the ReadinessProbe configuration (initial delay, timeout) or see if a downstream service the application depends on is failing. Terminating (Stuck) Pod cannot terminate properly: Usually due to a finalizer preventing its deletion, or a volume that cannot be unmounted. 1. kubectl describe pod \u0026lt;name\u0026gt;, check Events for storage-related errors like FailedDetachVolume.\n2. kubectl edit pod \u0026lt;name\u0026gt;, check the metadata.finalizers field; a finalizer added by a controller may not have been cleaned up. Unknown Status is unknown: Typically means the node controller cannot communicate with the Kubelet on the Pod\u0026rsquo;s node. 1. This is almost equivalent to a node being NotReady. Immediately check the health of the Pod\u0026rsquo;s host node (see Layer 4). Job Failed: BackoffLimitExceeded Job retry limit exceeded: The Pods created by the Job failed, and after reaching the retry limit, the Job is marked as failed. 1. kubectl get pods -l job-name=\u0026lt;job-name\u0026gt; to find the failed Pods created by the Job.\n2. kubectl logs \u0026lt;failed-pod-name\u0026gt; to view the logs and identify the root cause of the task\u0026rsquo;s failure. Layer 2: Container Exit Codes # Exit Code Meaning Core Troubleshooting Steps 1 General Application Error 1. Check application logs: kubectl logs \u0026lt;pod-name\u0026gt; --previous. 126 / 127 Command not executable / Command not found 1. Check the Dockerfile (chmod +x) and the command path in your YAML. 137 OOMKilled (Out of Memory) 1. kubectl describe pod \u0026lt;name\u0026gt; to confirm Reason: OOMKilled.\n2. Increase resources.limits.memory. 139 Segmentation Fault (SIGSEGV): Code Bug. 1. Notify the developers to debug the code. 143 Graceful Termination (SIGTERM): Normal behavior. 1. Occurs during Pod deletion or updates; no action needed. 
Layer 3: Network Status Codes and Errors # Error/Status Core Reason Core Troubleshooting Steps Endpoints are empty The Service Selector does not match any Pods. 1. kubectl describe svc \u0026lt;name\u0026gt; to check the Selector.\n2. kubectl get pods --show-labels to compare with the Pod\u0026rsquo;s Labels. HTTP 502/503/504 Ingress Gateway Error / Service Unavailable / Timeout. 1. A comprehensive check of Endpoints and Pod health (CrashLoopBackOff, 0/1 Ready).\n2. For 504: Check Pod logs and resource usage (kubectl top pod) to determine if the application is slow to respond. HTTP 499 Client Closed Request. A non-standard Nginx status code. Simply put, the backend service took too long to respond. 1. Check backend service response time:\nUse kubectl logs \u0026lt;ingress-controller-pod\u0026gt; to check logs and identify which endpoint (URL) frequently returns 499, and confirm if its request_time is too long.\n2. Check client timeout settings:\nConfirm if the client calling the service (browser, app, or another microservice) has set a very short request timeout.\n3. Investigate application performance bottlenecks:\nAnalyze the code of the corresponding service for issues like slow database queries or slow calls to third-party services. Connection refused Connection was refused: The network path is clear, but no process is listening on the target Pod\u0026rsquo;s port. 1. kubectl exec -it \u0026lt;pod-name\u0026gt; -- netstat -tulnp to confirm if the application is listening on the correct port.\n2. Check the application\u0026rsquo;s startup logs for any port binding errors. Connection timed out Connection timed out: Packets are being lost in the network, usually due to a NetworkPolicy or firewall issue. 1. Check NetworkPolicies: kubectl get networkpolicy -A to confirm if a policy is blocking this traffic.\n2. Check node security groups or the underlying network firewall. No route to host No route to host: Typically an issue with the inter-node network (CNI). 1. 
Check if the CNI plugin\u0026rsquo;s Pods (calico-node, flannel-ds, etc.) are running correctly on all nodes. Layer 4: Node Status Codes # Status Code (Status) Core Reason Core Troubleshooting Steps NotReady Node lost contact: Communication between the Kubelet and the API Server is interrupted. 1. SSH into the node, and check kubelet, containerd, df -h, and free -m in order. SchedulingDisabled Scheduling is disabled: The node has been cordoned, and no new Pods will be scheduled on it. 1. This is an administrative action, not a failure. Use kubectl uncordon \u0026lt;node-name\u0026gt; to resume scheduling. MemoryPressure Memory Pressure: The available memory on the node is too low. 1. The node may start evicting Pods. Log in to the node and use top to find the memory hogs. DiskPressure Disk Pressure: The disk space on the node is insufficient. 1. Log in to the node, use df -h to locate the partition, and clean up images, containers, and logs. PIDPressure PID Pressure: The node has run out of Process IDs. 1. Log in to the node and check for any process fork bombs or applications creating too many threads/processes. Layer 5: Storage Status Codes # Status Code / Event Core Reason Core Troubleshooting Steps PVC: Pending The PVC cannot bind to a PV. 1. kubectl describe pvc \u0026lt;name\u0026gt;, check Events to see if it\u0026rsquo;s a PV mismatch or a StorageClass issue. Pod Event: FailedMount Volume mount failed. 1. kubectl describe pod \u0026lt;name\u0026gt;, Events will provide detailed reasons, such as NFS permissions or cloud disk status. Pod Event: FailedDetachVolume Volume detach failed: Usually, the underlying storage (e.g., a cloud disk) is busy or has an issue. 1. This issue will cause a Pod to get stuck in the Terminating state.\n2. Check the CSI plugin logs or the cloud provider\u0026rsquo;s console to see the status of the volume. App Log: Read-only file system The file system is read-only: The Pod encounters an error when writing to a PV. 1. 
kubectl exec -it \u0026lt;pod-name\u0026gt; -- mount to view mount information and confirm if the mount option is ro (read-only).\n2. The storage backend itself may have encountered a problem and entered a read-only protective mode. ","date":"2 July 2025","externalUrl":null,"permalink":"/posts/devops/k8s_error_code/","section":"Posts","summary":"","title":"Common Error Codes in Kubernetes Operations","type":"posts"},{"content":"","externalUrl":null,"permalink":"/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"}]