
Preface

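I had heard of eBPF long ago, but my own skill and understanding were limited, and I never managed to produce anything that could be called a Hello World...

Material on eBPF has become more plentiful lately, and this time I finally got a working Hello World.

One concept first: on Android, the eBPF program runs in kernel space, and its results have to be fetched by a user-space program.

An eBPF program can also print logs from kernel space, but that is inefficient and makes custom output formats awkward.

I tried logging from inside the eBPF program and did not get it to work... I have not figured out why.

Since the system partition is normally not writable, Magisk is also needed to mount the custom eBPF program into the required directory.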

Environment

  • Android 11
  • Pixel 4XL
  • Root access
  • Magisk 25.0

Notes

First, check the kernel configuration:

coral:/ # zcat /proc/config.gz | grep PROBE
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_GENERIC_IRQ_PROBE=y
# CONFIG_KPROBES is not set
CONFIG_UPROBES=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
# CONFIG_BUILTINS_ASYNC_PROBE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_TIMER_PROBE=y
CONFIG_UPROBE_EVENTS=y
CONFIG_PROBE_EVENTS=y

On newer kernels CONFIG_KPROBES is usually enabled. On this device it is not, but this post only uses TRACEPOINT, so all that is required is CONFIG_TRACEPOINTS=y.

coral:/ # zcat /proc/config.gz | grep TRACEPOINT
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
# CONFIG_TRACEPOINT_BENCHMARK is not set
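When reading this output by eye it is easy to mistake `CONFIG_HAVE_KPROBES=y` or `# CONFIG_KPROBES is not set` for an enabled `CONFIG_KPROBES`. Below is a small sketch of an exact-match check; `configEnabled` is a hypothetical helper name of mine, and it assumes the decompressed text of /proc/config.gz has already been read into a string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true only if the config text contains the exact line "<symbol>=y".
// "# CONFIG_KPROBES is not set" and "CONFIG_HAVE_KPROBES=y" do not count
// as CONFIG_KPROBES being enabled.
bool configEnabled(const std::string& configText, const std::string& symbol) {
    assert(!symbol.empty());
    const std::string wanted = symbol + "=y";
    std::istringstream in(configText);
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings, e.g. from output captured over adb.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line == wanted) return true;
    }
    return false;
}
```

For the output above, `configEnabled(text, "CONFIG_TRACEPOINTS")` would be true and `configEnabled(text, "CONFIG_KPROBES")` would be false.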

For background on KPROBE, UPROBE, and TRACEPOINT, see


This post traces an event under TRACEPOINT.

The eBPF program source, src/example.c, is as follows:

#include <bpf_helpers.h>

DEFINE_BPF_MAP(cpu_pid_map, ARRAY, int, uint32_t, 1024);

struct switch_args {
    unsigned long long ignore;
    char prev_comm[16];
    int prev_pid;
    int prev_prio;
    long long prev_state;
    char next_comm[16];
    int next_pid;
    int next_prio;
};

SEC("tracepoint/sched/sched_switch")
int tp_sched_switch(struct switch_args* args) {
    int key;
    uint32_t val;

    key = bpf_get_smp_processor_id();
    val = args->next_pid;
    char fmt[] = "syscall sched_switch";
    bpf_trace_printk(fmt, sizeof(fmt));
    bpf_cpu_pid_map_update_elem(&key, &val, BPF_ANY);
    return 0;
}

char _license[] SEC("license") = "GPL";

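For the underlying principles and constraints, see

Android中eBPF的编写基础介绍 (an introduction to the basics of writing eBPF on Android)

DEFINE_BPF_MAP is a macro, not a function. Its first argument determines the names of the generated map helper functions:

  • Lookup: bpf_cpu_pid_map_lookup_elem
  • Update: bpf_cpu_pid_map_update_elem
  • Delete: bpf_cpu_pid_map_delete_elem

Finally, the system creates the corresponding map and prog files under /sys/fs/bpf: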

coral:/ # ls -al /sys/fs/bpf | grep example
-rw-------  1 root root 0 2022-06-19 10:08 map_example_cpu_pid_map
-r--r-----  1 root root 0 2022-06-19 10:08 prog_example_tracepoint_sched_sched_switch

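My compiled eBPF object file is named example.o, so the map file name is composed as map_{example}_{cpu_pid_map}, that is, map_{object name}_{map name}.

Keep this in mind, because the user-space program written later needs these paths.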
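The naming pattern can be written down as two small helpers. This is only a sketch inferred from the two file names listed above; `pinnedMapPath` and `pinnedProgPath` are hypothetical names of mine, and the rule that `/` in the section name becomes `_` is read off the prog file name rather than taken from the loader source.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// /sys/fs/bpf/map_<object name>_<map name>
std::string pinnedMapPath(const std::string& objName, const std::string& mapName) {
    assert(!objName.empty() && !mapName.empty());
    return "/sys/fs/bpf/map_" + objName + "_" + mapName;
}

// /sys/fs/bpf/prog_<object name>_<section name with '/' replaced by '_'>
std::string pinnedProgPath(const std::string& objName, std::string section) {
    assert(!objName.empty() && !section.empty());
    std::replace(section.begin(), section.end(), '/', '_');
    return "/sys/fs/bpf/prog_" + objName + "_" + section;
}
```

With `objName` set to `example`, the map `cpu_pid_map` and the section `tracepoint/sched/sched_switch` give exactly the two paths in the listing.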

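The layout of the switch_args struct comes from /sys/kernel/debug/tracing/events/sched/sched_switch/format. The leading 8-byte ignore member covers the four common_* fields, and the remaining members follow the listed offsets and sizes.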

coral:/ # cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 88
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:char prev_comm[16];       offset:8;       size:16;        signed:0;
        field:pid_t prev_pid;   offset:24;      size:4; signed:1;
        field:int prev_prio;    offset:28;      size:4; signed:1;
        field:long prev_state;  offset:32;      size:8; signed:1;
        field:char next_comm[16];       offset:40;      size:16;        signed:0;
        field:pid_t next_pid;   offset:56;      size:4; signed:1;
        field:int next_prio;    offset:60;      size:4; signed:1;

print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) - 1), "|", { 0x0001, "S" }, { 0x0002, "D" }, { 0x0004, "T" }, { 0x0008, "t" }, { 0x0010, "X" }, { 0x0020, "Z" }, { 0x0040, "P" }, { 0x0080, "I" }) : "R", REC->prev_state & (((0x0000 | 0x0001 | 0x0002 | 0x0004 | 0x0008 | 0x0010 | 0x0020 | 0x0040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio
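One way to catch a mismatch between the struct and the format file early is to let the compiler check the offsets. The sketch below repeats the switch_args definition from example.c and asserts the offsets printed above; it is a host-side check I added for illustration, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as switch_args in src/example.c.
struct switch_args {
    unsigned long long ignore;  // common_type..common_pid, offsets 0..7
    char prev_comm[16];
    int prev_pid;
    int prev_prio;
    long long prev_state;
    char next_comm[16];
    int next_pid;
    int next_prio;
};

// Offsets taken from the sched_switch format file shown above.
static_assert(offsetof(switch_args, prev_comm) == 8, "prev_comm offset");
static_assert(offsetof(switch_args, prev_pid) == 24, "prev_pid offset");
static_assert(offsetof(switch_args, prev_prio) == 28, "prev_prio offset");
static_assert(offsetof(switch_args, prev_state) == 32, "prev_state offset");
static_assert(offsetof(switch_args, next_comm) == 40, "next_comm offset");
static_assert(offsetof(switch_args, next_pid) == 56, "next_pid offset");
static_assert(offsetof(switch_args, next_prio) == 60, "next_prio offset");
static_assert(sizeof(switch_args) == 64, "total size");
```

If a kernel version adds or reorders fields in the format file, the assertions have to be updated together with the struct.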

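What this code does is hook tracepoint/sched/sched_switch and, on every context switch, store the PID of the incoming task in the map via bpf_cpu_pid_map_update_elem, keyed by the current CPU number.

The user-space program can then obtain the eBPF program's results through /sys/fs/bpf/map_example_cpu_pid_map.

This is not an ordinary file read, though: the path has to be turned into a file descriptor with bpf_obj_get first, and the map is accessed through that descriptor.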
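To make "not an ordinary file read" more concrete: as far as I understand, wrappers such as bpf_obj_get boil down to the bpf(2) system call with the BPF_OBJ_GET command. The sketch below shows that call on a Linux host; `objGetSketch` is a hypothetical name of mine, not the library function, and it needs the kernel UAPI header linux/bpf.h.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <linux/bpf.h>
#include <sys/syscall.h>
#include <unistd.h>

// Asks the kernel for a file descriptor referring to the BPF object
// (map or program) pinned at `path`.
// Returns the fd on success, or -1 with errno set on failure.
int objGetSketch(const char* path) {
    assert(path != nullptr);
    union bpf_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    // The kernel expects the user-space pointer as a 64-bit integer.
    attr.pathname = static_cast<__u64>(reinterpret_cast<uintptr_t>(path));
    return static_cast<int>(syscall(__NR_bpf, BPF_OBJ_GET, &attr, sizeof(attr)));
}
```

The returned descriptor is what later lookup or update calls on the map operate on; reading the pinned path with `cat` does not give the map contents.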

With the code in place, the program can now be compiled with the NDK as follows.

Preparing the sources

mkdir ~/ebpfdemo
cd ~/ebpfdemo
git clone -b android11-gsi https://android.googlesource.com/platform/bionic
mkdir system && cd system
git clone -b android11-gsi https://android.googlesource.com/platform/system/core/
git clone -b android11-gsi https://android.googlesource.com/platform/system/bpf/
cd ~/ebpfdemo

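The Makefile, assuming the NDK is unpacked at /home/kali/android-ndk-r23b.

Note that the recipe line under a Makefile target must start with a TAB (\t), not four spaces, otherwise make fails (I only just learned this; until now I had always modified existing Makefiles..)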

.PHONY: ebpf-build

# The recipe below must be indented with a TAB, not spaces
ebpf-build:
	/home/kali/android-ndk-r23b/toolchains/llvm/prebuilt/linux-x86_64/bin/clang \
	--target=bpf \
	-c \
	-nostdlibinc -no-canonical-prefixes -O2 \
	-isystem bionic/libc/include \
	-isystem bionic/libc/kernel/uapi \
	-isystem bionic/libc/kernel/uapi/asm-arm64 \
	-isystem bionic/libc/kernel/android/uapi \
	-I       system/core/libcutils/include \
	-I       system/bpf/progs/include \
	-MD -MF example.d -o example.o src/example.c

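Run make, and example.o is generated in the current directory.

Note that an eBPF program is not an executable. The final processing is done by the kernel, so what gets compiled here is a .o object file.


Next, build a Magisk module that mounts example.o at /system/etc/bpf, so that the system loads it automatically at boot.

I based mine on the HttpCanary System CA Mounter module.

Specifically, change the REPLACE entry in customize.sh to /system/etc/bpf.

Then create a system/etc/bpf folder inside the module folder and put example.o in it.

Pack the whole folder as a zip, push it to the phone, and flash the module in Magisk.

After a reboot, Magisk mounts example.o at /system/etc/bpf.

How to tell whether it loaded: method one is to run the command below after boot and check that these two files exist. If they do, it worked.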

coral:/ # ls -al /sys/fs/bpf | grep example
-rw-------  1 root root 0 2022-06-19 10:08 map_example_cpu_pid_map
-r--r-----  1 root root 0 2022-06-19 10:08 prog_example_tracepoint_sched_sched_switch

Method two is to run the command below repeatedly during boot until the log lines about loading the eBPF program show up.

adb shell "logcat | grep -i bpf"

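It has to be run repeatedly because the logcat buffer is small, so the entries may already be gone if you only look after boot has finished.

If everything is fine, the log looks like the following. Note that the section and map names in this capture (tracepoint_raw_syscalls_sys_exit, pid_syscall_map) differ from the sched_switch example above; the lines to look for are the same.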

06-18 11:57:35.313   860   860 D LibBpfLoader: Loading optional ELF object /system/etc/bpf/example.o with license GPL
06-18 11:57:35.313   860   860 E LibBpfLoader: No progs section could be found in elf object
06-18 11:57:35.313   860   860 D LibBpfLoader: Loaded code section 3 (tracepoint_raw_syscalls_sys_exit)
06-18 11:57:35.313   860   860 D LibBpfLoader: Adding section 3 to cs list
06-18 11:57:35.313   860   860 D LibBpfLoader: bpf_create_map name pid_syscall_map, ret: 6
06-18 11:57:35.313   860   860 D LibBpfLoader: map_fd found at 0 is 6 in /system/etc/bpf/example.o
06-18 11:57:35.318   860   860 D LibBpfLoader: bpf_prog_load lib call for /system/etc/bpf/example.o (tracepoint_raw_syscalls_sys_exit) returned fd: 7 (no error)
06-18 11:57:35.318   860   860 I bpfloader: Loaded object: /system/etc/bpf/example.o
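If the logcat output is saved to a file, the check can be automated. The sketch below looks for the final `bpfloader: Loaded object: <path>` line seen in the capture above; `loaderReportedObject` is a hypothetical helper of mine, and it assumes the message text is the same as in this Android 11 log.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if some line of `logText` ends with
// "bpfloader: Loaded object: <objPath>".
bool loaderReportedObject(const std::string& logText, const std::string& objPath) {
    assert(!objPath.empty());
    const std::string needle = "bpfloader: Loaded object: " + objPath;
    std::istringstream in(logText);
    std::string line;
    while (std::getline(in, line)) {
        // Drop trailing whitespace (including '\r') before comparing.
        while (!line.empty() &&
               (line.back() == ' ' || line.back() == '\t' || line.back() == '\r')) {
            line.pop_back();
        }
        if (line.size() >= needle.size() &&
            line.compare(line.size() - needle.size(), needle.size(), needle) == 0) {
            return true;
        }
    }
    return false;
}
```

Matching on the end of the line avoids accepting a different object whose path merely starts with the same characters.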

How to confirm that the eBPF program itself is working? As you can see, I added the following to the code:

    char fmt[] = "syscall sched_switch";
    bpf_trace_printk(fmt, sizeof(fmt));

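According to the material I found, this log output should in theory show up in /sys/kernel/debug/tracing/trace.

Unfortunately it did not, and I still have not figured out why. I probably did something wrong somewhere, or the output may only be there during boot.

Still, it is possible to write a user-space program that reads the data. If data comes back, that also shows the program is working.

The user-space program source, src/trace.cpp, is below. This is where the two paths from earlier are used.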

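#include <inttypes.h>
#include <unistd.h>  // sleep()

#include <iostream>
#include <string>
#include <unordered_map>

#include <android-base/stringprintf.h>  // android::base::StringPrintf

#include "bpf/BpfMap.h"
#include "bpf/BpfUtils.h"
#include "libbpf_android.h"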


constexpr const char tp_prog_path[] = "/sys/fs/bpf/prog_example_tracepoint_sched_sched_switch";
constexpr const char tp_map_path[] = "/sys/fs/bpf/map_example_cpu_pid_map";

using namespace android::bpf;
using android::base::StringPrintf;

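bool setup() {
    // bpf_obj_get() returns a negative value if the pinned program cannot be opened
    int mProgFd = bpf_obj_get(tp_prog_path);
    if (mProgFd < 0) return false;

    // bpf_attach_tracepoint() returns a perf event fd on success
    // and a negative value on failure
    int ret = bpf_attach_tracepoint(mProgFd, "sched", "sched_switch");
    if (ret < 0) return false;

    return true;
}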

void showMapDetail(std::unordered_map<uint32_t, uint32_t> *sysCallMap) {

    BpfMap<uint32_t, uint32_t> m(tp_map_path);

    sleep(1);

    const auto iterFunc = [sysCallMap](const uint32_t& key, const uint32_t& val, const BpfMap<uint32_t, uint32_t>&) {
        if (val) {
            std::string tmp = StringPrintf("%d\t%" PRIu32, key, val);
            std::cout << tmp << std::endl;
            (*sysCallMap)[key] = val;
        }
        return android::base::Result<void>();
    };

    m.iterateWithValue(iterFunc);
}


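int main()
{
    std::unordered_map<uint32_t, uint32_t> sysCallMap;
    // Stop early if the program could not be attached to the tracepoint
    if (!setup()) {
        std::cerr << "failed to attach " << tp_prog_path << std::endl;
        return 1;
    }
    sleep(1);
    showMapDetail(&sysCallMap);
    return 0;
}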
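The `if (val)` filter in iterFunc matters because cpu_pid_map is an ARRAY map: all 1024 slots exist from the start and are zero-initialized, so iterating visits every slot, not just the CPUs that were written. The sketch below imitates that behavior on the host with a plain array. `FakeArrayMap` and `nonZeroEntries` are stand-ins I made up for illustration and have nothing to do with the real BpfMap class.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>

// Host-side stand-in for a BPF ARRAY map with 1024 uint32_t slots.
class FakeArrayMap {
  public:
    static constexpr uint32_t kMaxEntries = 1024;

    // Like an update on an ARRAY map: keys outside [0, kMaxEntries) are rejected.
    bool update(uint32_t key, uint32_t value) {
        if (key >= kMaxEntries) return false;
        slots_[key] = value;
        return true;
    }

    // Visits every slot in key order, whether or not it was ever written.
    void forEach(const std::function<void(uint32_t, uint32_t)>& visit) const {
        for (uint32_t key = 0; key < kMaxEntries; ++key) visit(key, slots_[key]);
    }

  private:
    std::array<uint32_t, kMaxEntries> slots_{};  // all slots start at zero
};

// Keeps only non-zero values, the same filter as `if (val)` in trace.cpp.
std::map<uint32_t, uint32_t> nonZeroEntries(const FakeArrayMap& m) {
    std::map<uint32_t, uint32_t> out;
    m.forEach([&out](uint32_t key, uint32_t val) {
        if (val) out[key] = val;
    });
    return out;
}
```

One side effect of this filter is that a CPU whose last switch was to PID 0 (the idle task) looks the same as a CPU that was never recorded.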

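One pitfall first: the line BpfMap<uint32_t, uint32_t> m(tp_map_path);

Because I was following the article below, my first version passed an fd to the constructor, but it would not compile no matter what I tried.

I later realized this is because the code and headers I was using are all from Android 11, where this API is different...

Compiling this program was also a winding road. At first I wanted to reuse the earlier approach and point the NDK at the headers directly, but the set of headers involved turned out to be too tangled, and I had to give up on writing my own Makefile for it.

So I wrote the following Android.bp, assuming it lives in a testbpf folder under the aosp tree: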

cc_defaults {
  name: "my-defaults",

  local_include_dirs: [
    "include",
  ],

  cflags: [
    "-Wall",
    "-Werror",
    "-Wuninitialized",
    "-Wno-error=unused-variable",
    "-fno-common",
    "-fPIC",
    "-D__STDC_FORMAT_MACROS",
  ],

  target: {
    android_arm64: {
      cflags: [
        "-D__ANDROID__",
      ],
    },
  },
}

cc_binary {
  name: "bpftracer",
  // static_executable: true,
  
  defaults: [
    "my-defaults",
  ],

  local_include_dirs: [
    "include",
  ],

  srcs: [
    "src/trace.cpp",
  ],

  shared_libs: [
    "libbpf",
    "libbase",
    "libutils",
  ],

  static_libs: [
    "libbpf",
    "libbpf_android",
  ],
}

This is a full AOSP checkout. I think a GSI tree would work as well and the process should be the same (I did not bother to test it).

cd ~/aosp11
export LC_ALL=C && . build/envsetup.sh
lunch aosp_arm64-eng
mmm testbpf

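If static_executable: true is enabled here, the link fails with a duplicate symbol error. Without it the build succeeds, so I left it at that.

(It is generally not needed anyway.) The failing link step: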

FAILED: out/soong/.intermediates/testbpf/bpftracer/android_arm64_armv8-a/unstripped/bpftracer
prebuilts/clang/host/linux-x86/clang-r383902b1/bin/clang++ out/soong/.intermediates/bionic/libc/crtbegin_static/android_arm64_armv8-a/crtbegin_static.o @out/soong/.intermediates/testbpf/bpftracer/android_arm64_armv8-a/unstripped/bpftracer.rsp out/soong/.intermediates/system/bpf/libbpf_android/libbpf_android/android_arm64_armv8-a_static/libbpf_android.a out/soong/.intermediates/bionic/libm/libm/android_arm64_armv8-a_static/libm.a out/soong/.intermediates/bionic/libc/libc/android_arm64_armv8-a_static/libc.a out/soong/.intermediates/build/soong/libgcc_stripped/android_arm64_armv8-a_static/libgcc_stripped.a out/soong/.intermediates/external/bcc/libbpf/android_arm64_armv8-a_static/libbpf.a out/soong/.intermediates/external/libcxx/libc++_static/android_arm64_armv8-a_static/libc++_static.a out/soong/.intermediates/external/libcxxabi/libc++demangle/android_arm64_armv8-a_static/libc++demangle.a -Wl,--start-group out/soong/.intermediates/bionic/libc/libc/android_arm64_armv8-a_static/libc.a prebuilts/clang/host/linux-x86/clang-r383902b1/lib64/clang/11.0.2/lib/linux/libclang_rt.builtins-aarch64-android.a prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/aarch64-linux-android/lib64/libatomic.a -Wl,--end-group out/soong/.intermediates/bionic/libc/crtend_android/android_arm64_armv8-a/obj/bionic/libc/arch-common/bionic/crtend.o -o out/soong/.intermediates/testbpf/bpftracer/android_arm64_armv8-a/unstripped/bpftracer -target aarch64-linux-android10000 -Bprebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/aarch64-linux-android/bin -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--build-id=md5 -Wl,--warn-shared-textrel -Wl,--fatal-warnings -Wl,--no-undefined-version -Wl,--exclude-libs,libgcc.a -Wl,--exclude-libs,libgcc_stripped.a -Wl,--exclude-libs,libunwind_llvm.a -fuse-ld=lld -Wl,--pack-dyn-relocs=android+relr -Wl,--use-android-relr-tags -Wl,--no-undefined -Wl,--hash-style=gnu -Wl,-z,separate-code -Wl,--icf=safe -Wl,-z,max-page-size=4096  
-Wl,--exclude-libs=libclang_rt.builtins-aarch64-android.a  -static -nostdlib -Bstatic -Wl,--gc-sections prebuilts/clang/host/linux-x86/clang-r383902b1/lib64/clang/11.0.2/lib/linux/libclang_rt.ubsan_minimal-aarch64-android.a -Wl,--exclude-libs,libclang_rt.ubsan_minimal-aarch64-android.a 
ld.lld: error: duplicate symbol: std::nothrow
>>> defined at new.cpp:24 (bionic/libc/bionic/new.cpp:24)
>>>            new.o:(std::nothrow) in archive out/soong/.intermediates/bionic/libc/libc/android_arm64_armv8-a_static/libc.a
>>> defined at new.cpp:38 (external/libcxx/src/new.cpp:38)
>>>            new.o:(.rodata._ZSt7nothrow+0x0) in archive out/soong/.intermediates/external/libcxx/libc++_static/android_arm64_armv8-a_static/libc++_static.a
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
12:27:12 ninja failed with: exit status 1

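What a successful build looks like

Then push the binary to the phone, make it executable, switch to root, and run it!

It works: the corresponding information comes back.

The output here is fairly bare, but at least the whole path of a user-space program fetching data from an eBPF program is complete.

With that, the Hello World is done.

Building on this, printing detailed information for the sys_enter and sys_exit events is entirely feasible.

On balance, though, this process is clearly too complicated. Mounting a Debian-like system and compiling programs directly with the existing bcc framework is more convenient.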


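One more way to debug an eBPF program: manually reset the bpf.progs_loaded property, restart the bpfloader service, and then check the logcat output to see whether the eBPF programs loaded properly.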

setprop bpf.progs_loaded 0
stop bpfloader
start bpfloader
logcat | grep -i bpfloader

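In my testing, however, this crashed the system... After some analysis, the cause turned out to be that the loader checks whether the programs have already been loaded.

According to the source code, the maps and progs under /sys/fs/bpf should be deleted before running these commands.

Otherwise the failure happens in createMaps in system/bpf/libbpf_android/Loader.cpp.

The existing pinned files are found and reused; what exactly goes wrong after that I am not sure.

To reload eBPF programs while the system is running, run the commands as follows.

In one shell: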

logcat | grep -i bpfloader

In a second shell:

rm /sys/fs/bpf/*
setprop bpf.progs_loaded 0
stop bpfloader
start bpfloader

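Only this way can the loading be debugged and checked properly. I have to say, some articles really do gloss over this...

Also note that this reloads all eBPF programs at once, not just the one being tested.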
