index | ~dongdigua

三权分立的家里云方案

$Id: homelab.org,v 26.6 2026/05/25 13:54:07 dongdigua Exp $

想搞个家里云。
不想 all in boom,不想太复杂的虚拟化,想搞存算分离。
在几晚上的失眠之后想出了这个方案:

1. Network

搞了个软路由,十分标准的操作,没啥可讲的
https://openwrt.org/inbox/toh/xiaomi/ax3000t
SB 小米路由器用 sysupgrade 镜像升级就变砖,换货回来又一个不同程度的砖。
换眼馋很久的 NanoPi R5C 了,就是有点烫。

看到个好玩的:在 OpenWrt 上打造 Rickroll 访客 Wi-Fi

1.1. 显示主机名

1.1.1. dhcpcd

fqdn both

1.1.2. NetworkManager

???
为什么我的 wifi 连接显示了但以太网没有

2. Storage

接着上一篇:一次大备份
CM3588 NAS Kit
还是熟悉的 FreeBSD,还是熟悉的 ZFS,
FreeBSD 完全起不来,UART 口似乎也不好使(即使 OMV 也只输出乱码),没法用 UART 调试,所以只能 fallback 到 OpenMediaVault 了。
(我是对 FreeBSD 比较有情怀,但情怀毕竟不能当饭吃,还是务实一点,OMV 的网页端多好啊, 我机器全用 btrfs 也减少学习负担)
硬盘随便买了两个牌子的 PCIE 3.0 盘,因为这个板子 4 盘位的话只能各 PCIE 3.0 单通道,买 4.0 的纯属浪费钱。
不过这回又起了个 Postgres,因为我不想让 Postgres 再走 NFS,使用 NFS 为 PC 和容器提供存储, 还是应该对系统进行解耦,让容器用本地的存储。

zfs 参考 https://gist.github.com/CodeBradley/6acef34563323f8c2a11b72900c20092


3. Runner

其他服务全放在 docker 容器,宿主我选择 Gentoo on Raspberry Pi 5
https://wiki.gentoo.org/wiki/How_to_install_Gentoo_on_Raspberry_Pi_5
然后这两个带风扇的板子我还糊了一层 HEPA 滤网用来防尘,inspiration,然后编译时建议揭开点否则可能过热 确实会过热并且噪声增加,DS 推荐我用正经的防尘网)

3.1. 第一次尝试

刚放假,树莓派还没邮到,现在宿主机 qemu-aarch64-static 上交叉编译。
看着屏幕上缓缓滚动的编译进度,90 多度的 CPU,在两天一夜之后,我发主了以下感慨:

我可能真的是有什么大病,花了十几个小时交叉编译 gcc(因为 podman 的一个依赖项依赖 rust,rust 依赖 libgcc,后来 gcc 实在编译不动了,换 llvm-runtimes/libgcc 然后折腾好一阵 unmask),
大夏天的烫烫烫,然后所有服务还都是容器化的,跟宿主除了内核其他半毛钱关系没有。
哦,然后 rust-bin 还不能用 musl, Error relocating /lib/libgcc_s.so.1.0: __getauxval: symbol not found ,还得编译 rust,然后循环依赖……
rust 坏 go 好。
最后只能切换到不依赖 rust 的 docker
哦对然后 podman-compose 您一个 python 脚本还不支持 arm

go 的 bootstrap 挺有意思

Building Go toolchain1 using /usr/lib/go.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/arm64.

3.2. 第二次尝试

树莓派到了,结果之前编译的 rootfs 启动不了,还忘邮 micro-HDMI 线了,怎么办,不能盲目调试啊。
UART,启动!(但凡搞过嵌入式的手头都有 UART 调试器吧,如果没有可以用 Arduino Nano CH340 代替,有树莓派而无 Arduino 者,未之闻也)
https://www.jeffgeerling.com/blog/2021/attaching-raspberry-pis-serial-console-uart-debugging
对于树莓派5,config.txt 里加上:

dtparam=uart0=on
dtparam=uart0_console=on
enable_uart=1
enable_rp1_uart=1

发现树莓派官方内核不支持 xfs,直接 panic,只能滚回去用 ext4 了

然后能启动了,但是 readonly filesystem 噔噔咚!

3.2.1. on #gentoo(TL;DR 文件损坏)


chat log

<dongdigua> hello, I'm installing gentoo on raspberry pi 5 following (nearly)
            https://wiki.gentoo.org/wiki/How_to_install_Gentoo_on_Raspberry_Pi_5
<dongdigua> but I got [    3.005840] EXT4-fs (mmcblk0p2): orphan cleanup on readonly fs
<dongdigua> [    3.012155] EXT4-fs (mmcblk0p2): mounted filesystem 5f0ea1b9-2fb6-4a8f-a8f8-9baa389fa047 ro with ordered data mode. Quota mode: none.
<dongdigua> [    3.024169] VFS: Mounted root (ext4 filesystem) readonly on device 179:2.
<sam_> that looks okay
<sam_> it'll get remounted rw later
<kgdrenefort> dongdigua: also FYI there is also #gentoo-arm for specific issue
              with ARM device, if it helps later :).
<dongdigua> sam_: it's not remounted rw, even with 'mount -o remount,rw /'
<sam_> that's a later problem though, not in the lines you showed
<sam_> tell us more about what happens please [18:08]
<Randname_> fstab errors ?
<dongdigua> fstab is 1:1 copy of the wiki article
<dongdigua> later dmesg is here https://paste.debian.net/1379518/
<NeddySeagoon> dongdigua: fsck can't fix all errors on the root fs. You need
               to check it offline [18:11]
<dongdigua> yes, I unplugged and fscked it on my host machine [18:12]
<NeddySeagoon> dongdigua: good. Does it mount there?
<dongdigua> yes, and fsck showed no error [18:13]
<dongdigua> really weird
<NeddySeagoon> That's a good sign too. What are you using for a PSU? [18:14]
<dongdigua> NeddySeagoon: something from 亚博智能 capable of outputing 5V5A
<NeddySeagoon> dongdigua: with an attached cable?
<dongdigua> yes, usbC
<dongdigua> NeddySeagoon: and the powermeter says it's only 0.5A
<NeddySeagoon> dongdigua: that sounds good. Check dmesg for undervolt events
               if you can
<dongdigua> NeddySeagoon: none ( [18:17]
<NeddySeagoon> dongdigua: ripple voltage matters a great deal. That's not easy
               to measure.
<dongdigua> NeddySeagoon: so I tried another power from HUAWEI, 5V2A, and the
            same, readonly [18:19]
<NeddySeagoon> dongdigua: you need to fix the fs before you test with another
               PSU [18:20]
<NeddySeagoon> rootfsck can't do it when the fs is mounted ro. It must be
               unmounted completely [18:22]
<dongdigua> NeddySeagoon: I 'fsck -yf' on my host machine, stil readonly :|
<NeddySeagoon> But it mounts on the host still?
<dongdigua> y [18:25]
<dongdigua> host kernel 6.14, pi 6.12, is this a point?
<dongdigua> wait, I will try to format the sd card using raspbian over usb
                                                                        [18:26]
<NeddySeagoon> dongdigua: it all sounds OK, it just doesn't work. I use last
               weekends foundation 6.12.y but it does not sound like a kernel
               issue
\* NeddySeagoon goes for more coffee [18:27]
\* dongdigua rebuilds from stage3 [18:32]
<dongdigua> NeddySeagoon: I rebuilt the sd card from a clean stage3 (I
            previously installed some package) [18:37]
<dongdigua> and it works [18:38]
<dongdigua> probably corrupted files
<dongdigua> NeddySeagoon: thank you a lot for your patience
<NeddySeagoon> dongdigua: Enjoy your Gentoo. A Pi5 should not do that. [18:41]


于是乎拿崭新的 stage3 从头再来
事实证明树莓派性能比 qemu 高很多,发热还小,风扇声音也小,所以丢失进度并没有对我造成太大打击

3.2.2. kernel

  1. MC

    事实证明树莓派5跑 MC 性能足够,1.12.2 vanilla 静置大概 15 mspt
    别用 azul prime JDK树莓派官方内核没启用 hugepages ,会直接 coredump

    pi ~ # docker run -it docker.1ms.run/azul/prime
    root@22923e93ec02:/# java -version
    
    ##### addr(0xc40000000000)sz(0x40000000000), msg(mmap() error.  Failed to reserve thread stacks region)
    Aborted (core dumped)
    root@22923e93ec02:/# 
    
    CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
    CONFIG_HAVE_ARCH_HUGE_VMAP=y
    CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
    CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
    # CONFIG_TRANSPARENT_HUGEPAGE is not set
    CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
    # CONFIG_HUGETLBFS is not set
    
  2. memory cgroup
    pi ~ # cat /proc/cmdline
    reboot=w coherent_pool=1M 8250.nr_uarts=1 pci=pcie_bus_safe cgroup_disable=memory numa_policy=interleave nvme.max_host_mem_size_mb=0  smsc95xx.macaddr=2C:CF:67:F0:B8:06 vc_mem.mem_base=0x3fc00000 vc_mem.mem_size=0x40000000  dwc_otg.lpm_enable=0 console=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 fsck.repair=yes rootwait
    pi ~ # cat /boot/cmdline.txt
    dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 fsck.repair=yes rootwait
    

    固件会改我的 cmdline,不让我用 memory cgroup, docker stats 看不见内存占用

  3. raspberrypi-sources 启动!

    https://www.raspberrypi.com/documentation/computers/linux_kernel.html#natively-build-a-kernel
    基于上述问题,看起来有必要编译个内核了。 gentoo 就是爽!

    export KERNEL=kernel_2712
    export LLVM=1 # llvm profile
    make bcm2712_defconfig
    make menuconfig
    time make -j3 Image.gz modules dtbs
    make -j3 modules_install
    # install
    cp /boot/$KERNEL.img /boot/$KERNEL-backup.img
    cp arch/arm64/boot/Image.gz /boot/$KERNEL.img
    cp arch/arm64/boot/dts/broadcom/*.dtb /boot/
    cp arch/arm64/boot/dts/overlays/*.dtb* /boot/overlays/
    cp arch/arm64/boot/dts/overlays/README /boot/overlays/
    reboot
    

    用不了多长时间,差不多 100 分钟(只要你不往内核里掺 rust)
    哦对了 llvm 和 clang 别忘了启用 USE="ARM",因为内核里有一些 32 位的部分,否则会 No available targets are compatible with triple "thumbv8a-unknown-linux-gnueabi" (废话)

    1. config patch:
      --- .config_def 2025-06-16 12:44:48.588234025 +0000
      +++ .config     2025-06-16 23:52:04.266680435 +0000
      @@ -27,7 +27,7 @@
       CONFIG_INIT_ENV_ARG_LIMIT=32
       # CONFIG_COMPILE_TEST is not set
       # CONFIG_WERROR is not set
      -CONFIG_LOCALVERSION="-v8-16k"
      +CONFIG_LOCALVERSION="-digua"
       # CONFIG_LOCALVERSION_AUTO is not set
       CONFIG_BUILD_SALT=""
       CONFIG_DEFAULT_INIT=""
      @@ -143,7 +143,7 @@
       CONFIG_RCU_NEED_SEGCBLIST=y
       # end of RCU Subsystem
      
      -CONFIG_IKCONFIG=m
      +CONFIG_IKCONFIG=y
       CONFIG_IKCONFIG_PROC=y
       # CONFIG_IKHEADERS is not set
       CONFIG_LOG_BUF_SHIFT=17
      @@ -178,6 +178,7 @@
       CONFIG_CGROUP_PIDS=y
       # CONFIG_CGROUP_RDMA is not set
       CONFIG_CGROUP_FREEZER=y
      +CONFIG_CGROUP_HUGETLB=y
       CONFIG_CPUSETS=y
       CONFIG_PROC_PID_CPUSET=y
       CONFIG_CGROUP_DEVICE=y
      @@ -199,6 +200,7 @@
       CONFIG_RELAY=y
       CONFIG_BLK_DEV_INITRD=y
       CONFIG_INITRAMFS_SOURCE=""
      +# CONFIG_INITRAMFS_FORCE is not set
       CONFIG_RD_GZIP=y
       CONFIG_RD_BZIP2=y
       CONFIG_RD_LZMA=y
      @@ -522,9 +524,9 @@
       #
       # Boot options
       #
      -CONFIG_CMDLINE="console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
      -CONFIG_CMDLINE_FROM_BOOTLOADER=y
      -# CONFIG_CMDLINE_FORCE is not set
      +CONFIG_CMDLINE="consreboot=w coherent_pool=1M 8250.nr_uarts=1 pci=pcie_bus_safe numa_policy=interleave nvme.max_host_mem_size_mb=0 smsc95xx.macaddr=2C:CF:67:F0:B8:06 vc_mem.mem_base=0x3fc00000 vc_mem.mem_size=0x40000000 dwc_otg.lpm_enable=0 console=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 fsck.repair=yes rootwait cgroup_enable=memory"
      +# CONFIG_CMDLINE_FROM_BOOTLOADER is not set
      +CONFIG_CMDLINE_FORCE=y
       CONFIG_EFI_STUB=y
       CONFIG_EFI=y
       CONFIG_DMI=y
      @@ -679,12 +681,14 @@
       CONFIG_STACKPROTECTOR_STRONG=y
       CONFIG_ARCH_SUPPORTS_SHADOW_CALL_STACK=y
       # CONFIG_SHADOW_CALL_STACK is not set
      +CONFIG_LTO=y
      +CONFIG_LTO_CLANG=y
       CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
       CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
       CONFIG_HAS_LTO_CLANG=y
      -CONFIG_LTO_NONE=y
      +# CONFIG_LTO_NONE is not set
       # CONFIG_LTO_CLANG_FULL is not set
      -# CONFIG_LTO_CLANG_THIN is not set
      +CONFIG_LTO_CLANG_THIN=y
       CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
       # CONFIG_CFI_CLANG is not set
       CONFIG_HAVE_CONTEXT_TRACKING_USER=y
      @@ -696,6 +700,7 @@
       CONFIG_HAVE_ARCH_HUGE_VMAP=y
       CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
       CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
      +CONFIG_ARCH_WANT_PMD_MKWRITE=y
       CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
       CONFIG_MODULES_USE_ELF_RELA=y
       CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
      @@ -917,6 +922,8 @@
       CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
       # CONFIG_PAGE_REPORTING is not set
       CONFIG_MIGRATION=y
      +CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
      +CONFIG_ARCH_ENABLE_THP_MIGRATION=y
       CONFIG_CONTIG_ALLOC=y
       CONFIG_PCP_BATCH_SCALE_MAX=5
       CONFIG_PHYS_ADDR_T_64BIT=y
      @@ -925,7 +932,10 @@
       CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
       CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
       # CONFIG_MEMORY_FAILURE is not set
      -# CONFIG_TRANSPARENT_HUGEPAGE is not set
      +CONFIG_TRANSPARENT_HUGEPAGE=y
      +CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
      +# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
      +# CONFIG_READ_ONLY_THP_FOR_FS is not set
       CONFIG_CMA=y
       # CONFIG_CMA_DEBUG is not set
       # CONFIG_CMA_DEBUGFS is not set
      @@ -8057,7 +8067,7 @@
       CONFIG_OCFS2_FS_STATS=y
       CONFIG_OCFS2_DEBUG_MASKLOG=y
       # CONFIG_OCFS2_DEBUG_FS is not set
      -CONFIG_BTRFS_FS=m
      +CONFIG_BTRFS_FS=y
       CONFIG_BTRFS_FS_POSIX_ACL=y
       # CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
       # CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
      @@ -8165,7 +8175,8 @@
       # CONFIG_TMPFS_INODE64 is not set
       # CONFIG_TMPFS_QUOTA is not set
       CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
      -# CONFIG_HUGETLBFS is not set
      +CONFIG_HUGETLBFS=y
      +CONFIG_HUGETLB_PAGE=y
       CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
       CONFIG_CONFIGFS_FS=y
       CONFIG_EFIVAR_FS=m
      @@ -8440,7 +8451,7 @@
       # end of Kernel hardening options
       # end of Security options
      
      -CONFIG_XOR_BLOCKS=m
      +CONFIG_XOR_BLOCKS=y
       CONFIG_ASYNC_CORE=m
       CONFIG_ASYNC_MEMCPY=m
       CONFIG_ASYNC_XOR=m
      @@ -8555,7 +8566,7 @@
       #
       # Hashes, digests, and MACs
       #
      -CONFIG_CRYPTO_BLAKE2B=m
      +CONFIG_CRYPTO_BLAKE2B=y
       CONFIG_CRYPTO_CMAC=m
       CONFIG_CRYPTO_GHASH=m
       CONFIG_CRYPTO_HMAC=y
      @@ -8574,7 +8585,7 @@
       # CONFIG_CRYPTO_VMAC is not set
       CONFIG_CRYPTO_WP512=m
       CONFIG_CRYPTO_XCBC=m
      -CONFIG_CRYPTO_XXHASH=m
      +CONFIG_CRYPTO_XXHASH=y
       # end of Hashes, digests, and MACs
      
       #
      @@ -8681,7 +8692,7 @@
       #
       # Library routines
       #
      -CONFIG_RAID6_PQ=m
      +CONFIG_RAID6_PQ=y
       CONFIG_RAID6_PQ_BENCHMARK=y
       CONFIG_LINEAR_RANGES=y
       # CONFIG_PACKING is not set
      

3.2.3. cannot open crtbeginS.o: No such file or directory

TL;DR emerge --oneshot --nodeps clang-runtime

configure:4530: $? = 1                                                                                                                  
configure:4550: checking whether the C compiler works                                                                                   
configure:4572: clang -O2 -pipe -march=native  -Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,--as-needed conftest.c  >&5      
aarch64-unknown-linux-musl-ld: error: cannot open crtbeginS.o: No such file or directory                                                
aarch64-unknown-linux-musl-ld: error: unable to find library -lgcc                                                                      
aarch64-unknown-linux-musl-ld: error: unable to find library -lgcc_s                                                                    
aarch64-unknown-linux-musl-ld: error: unable to find library -lgcc                                                                      
aarch64-unknown-linux-musl-ld: error: unable to find library -lgcc_s                                                                    
aarch64-unknown-linux-musl-ld: error: cannot open crtendS.o: No such file or directory                                                  
clang: error: linker command failed with exit code 1 (use -v to see invocation)       

sam_ 提到了 https://bugs.gentoo.org/951445 (怎么这些大佬随口就能说出 bug 号啊)

<vimproved>  so there's a period where between merging clang and clang-runtime, the toolchain is not using any config file and therefor could be broken

3.2.4. S3

minio 背刺开源社区,目前用的 garage 替代,其他方案见 https://wener.me/notes/service/storage/s3

3.3. 第三次尝试

md 期中俩U盘的 btrfs 都炸了,期末周刚结束我 emerge 一下,系统寄了,回家发现是 SD 卡写爆了。为了不半年换张卡只能告别 Gentoo 了。
然后 btrfs 拔下来恢复数据,告诉我说 btrfs sectorsize 16384 not yet supported for page size 4096
因为树莓派我编译那个针对 pi5 的内核启用了 16k 页面,这 nm 一点兼容性都不考虑啊,彻底对 btrfs 失望了。

一气之下换 Alpine 了,文件系统也换成 zfs,感觉良好。
zed 也让我省去了手动写脚本监控文件系统的麻烦,关于如何测试 zed,在这看到了一个小妙招
还有一个细节是 zfs 的快照目录 .zfs 用 ls -a 看不见但 cd 可以进入

dongdigua CC BY-NC-SA 禁止转载到私域(公众号,非自己托管的博客等)

Email me to add comment

Proudly made with Emacs Org mode

Date: 2025-06-12 Thu 00:00 Size: 34K (≈ 5.0984 mg CO2e)