这篇文章上次修改于 520 天前,可能其部分内容已经发生变化,如有疑问可询问作者。

前言

抓包看到wepkg的请求,咋一看是明文,但是格式又很奇怪,因为和wxapkg格式也不一样

结果搜索了一圈,似乎没有人有写过wepkg的解包,那么我来浅浅分析一下吧

微信版本:8.32

分析

hook下java.io.File.<init>,看看和wepkg相关的路径,调用栈如下:

Backtrace:
    java.io.File.<init>(Native Method)
    com.tencent.mm.vfs.NativeFileSystem$g.q(SourceFile:12)
    com.tencent.mm.vfs.m.q(SourceFile:10)
    com.tencent.mm.vfs.n1.q(SourceFile:18)
    com.tencent.mm.plugin.wepkg.model.l.c(SourceFile:67)
    m73.k.c(SourceFile:371)
    m73.k.b(SourceFile:150)
    m73.l.f(SourceFile:19)
    m73.d.a(SourceFile:14)
    xz.d.je0(SourceFile:24)
    ow0.c.invokeSuspend(SourceFile:311)
    ot3.a.resumeWith(SourceFile:9)
    kotlinx.coroutines.DispatchedTask.run(SourceFile:123)
    hp3.f.run(Unknown Source:2)
    java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:463)
    java.util.concurrent.FutureTask.run(FutureTask.java:264)
    sp3.j.run(SourceFile:246)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
    lp3.c.run(SourceFile:3)
    java.lang.Thread.run(Thread.java:1012)

查看com.tencent.mm.plugin.wepkg.model.l的反编译代码

立刻定位到com.tencent.mm.plugin.wepkg.model.j,关键词是readCacheBigPackageMicroMsg.Wepkg.WePkgReader

其中的关键代码如下:

fileChannel0.position(0L);
ByteBuffer byteBuffer0 = ByteBuffer.allocate(4);
byteBuffer0.order(WePkgReader.byteOrder);
fileChannel0.read(byteBuffer0);
this.next_item_size = byteBuffer0.getInt(0);
z = this.dealProtoData(fileChannel0);
fileChannel0.position(4L);
ByteBuffer byteBuffer0 = ByteBuffer.allocate(this.next_item_size);
byteBuffer0.order(WePkgReader.byteOrder);
fileChannel0.read(byteBuffer0);
byte[] arr_b = byteBuffer0.array();
if(arr_b != null && arr_b.length != 0) {
    this.e = new iu4();
    this.e.parseFrom(arr_b);
    this.f = this.e.resList;
    this.d = 4 + this.next_item_size;
    return true;
}

结合进一步交叉引用和分析,确定了一个名为resList的字段

而根据经验,parseFrom这里是在解protobuf数据

具体分析一下代码逻辑,结合实际的protobuf数据,可以知道wepkg文件的构成大致如下:

  1. 前4字节表示protobuf数据大小
  2. protobuf数据中实际上包含了多个文件的基本信息,比如:

    • 归属链接
    • 起始偏移(以protobuf数据末尾为起始)
    • 文件实际大小
    • 文件类型
    • 跨域限制/或者是请求头信息
    • 其他信息
  3. 多个连续的文件数据

目的是为了获取分离各个文件内容,于是在以上信息基础之上可以写出解包脚本,效果如下:

脚本

pip install blackboxprotobuf
import shutil
import blackboxprotobuf
from pathlib import Path

def main():
    def write_file(dump_path: Path, url: str, data: bytes):
        next_path = dump_path
        item_names = url.split('/')
        for dir_name in item_names[:-1]:
            next_path = next_path / dir_name
            if not next_path.exists():
                next_path.mkdir()
        next_path = next_path / item_names[-1]
        next_path.write_bytes(data)
        print(f'[+] dump {url} end')

    pkg = Path(r'pkg_p101944_v10253.wepkg')
    data = pkg.read_bytes()
    print(f'[+] file size:{len(data)}')

    next_item_size = int.from_bytes(data[:4], byteorder='big')
    print(f'[*] next_item_size:{next_item_size}')

    proto_data = data[4:next_item_size + 4]
    message, typedef = blackboxprotobuf.decode_message(proto_data)

    print('------protobuf------')
    print(message)
    print('------protobuf------')

    payload_offset = next_item_size + 4

    # for k, v in message.items():
    #     print(k, v)

    res_list = message['1']

    if len(res_list) > 0:
        dump_path = Path(pkg.parent) / 'wepkg_dump'
        if dump_path.exists():
            shutil.rmtree(dump_path.resolve().as_posix())
            print('[+] delete wepkg_dump')
        dump_path.mkdir()
        json_data, typedef = blackboxprotobuf.protobuf_to_json(proto_data)
        info_path = dump_path / 'info.json'
        info_path.write_text(json_data, encoding='utf-8')
        print('[+] save wepkg info')

    for res in res_list:
        url = res['1'].decode('utf-8')
        offset = res['2']
        size = res['3']
        # mime = res['4'].decode('utf-8')
        # headers = res['5']
        print(f'url:{url} offset:{offset} size:{size}')
        start = payload_offset + offset
        end = start + size
        payload = data[start:end]
        write_file(dump_path, url, payload)
    print(f'[+] unpack wepkg file end')

if __name__ == '__main__':
    main()