This article was last modified 520 days ago, so some of its content may be out of date. If in doubt, ask the author.
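Preface
While capturing traffic I noticed requests for wepkg files. At first glance the content looks like plaintext, but the format is odd, and it does not match the wxapkg format either.
After searching around, it seems nobody has written about unpacking wepkg, so here is a brief analysis.
WeChat version: 8.32
Analysis
Hook java.io.File.<init> and watch for wepkg-related paths. The call stack is as follows: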
Backtrace:
java.io.File.<init>(Native Method)
com.tencent.mm.vfs.NativeFileSystem$g.q(SourceFile:12)
com.tencent.mm.vfs.m.q(SourceFile:10)
com.tencent.mm.vfs.n1.q(SourceFile:18)
com.tencent.mm.plugin.wepkg.model.l.c(SourceFile:67)
m73.k.c(SourceFile:371)
m73.k.b(SourceFile:150)
m73.l.f(SourceFile:19)
m73.d.a(SourceFile:14)
xz.d.je0(SourceFile:24)
ow0.c.invokeSuspend(SourceFile:311)
ot3.a.resumeWith(SourceFile:9)
kotlinx.coroutines.DispatchedTask.run(SourceFile:123)
hp3.f.run(Unknown Source:2)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:463)
java.util.concurrent.FutureTask.run(FutureTask.java:264)
sp3.j.run(SourceFile:246)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
lp3.c.run(SourceFile:3)
java.lang.Thread.run(Thread.java:1012)
Looking at the decompiled code of com.tencent.mm.plugin.wepkg.model.l leads straight to com.tencent.mm.plugin.wepkg.model.j, where the keywords are readCacheBigPackage and MicroMsg.Wepkg.WePkgReader.
The key code is as follows:
fileChannel0.position(0L);
ByteBuffer byteBuffer0 = ByteBuffer.allocate(4);
byteBuffer0.order(WePkgReader.byteOrder);
fileChannel0.read(byteBuffer0);
this.next_item_size = byteBuffer0.getInt(0);
z = this.dealProtoData(fileChannel0);
fileChannel0.position(4L);
ByteBuffer byteBuffer0 = ByteBuffer.allocate(this.next_item_size);
byteBuffer0.order(WePkgReader.byteOrder);
fileChannel0.read(byteBuffer0);
byte[] arr_b = byteBuffer0.array();
if(arr_b != null && arr_b.length != 0) {
this.e = new iu4();
this.e.parseFrom(arr_b);
this.f = this.e.resList;
this.d = 4 + this.next_item_size;
return true;
}
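The two fragments above amount to reading a 4-byte length and then that many bytes of protobuf data. Below is a minimal Python sketch of the same header read. It assumes big-endian byte order, which is what the unpacking script in this post uses; the actual value of WePkgReader.byteOrder is not visible in the decompiled snippet. The helper name read_wepkg_header and the sample bytes are hypothetical.

```python
import struct


def read_wepkg_header(data: bytes):
    """Return (protobuf_blob, payload_offset) for a wepkg-style buffer."""
    if len(data) < 4:
        raise ValueError("buffer too short for the 4-byte length prefix")
    # Assumption: the length prefix is a big-endian 32-bit integer.
    (proto_size,) = struct.unpack(">I", data[:4])
    end = 4 + proto_size
    if end > len(data):
        raise ValueError("declared protobuf size exceeds the buffer length")
    # File payloads start right after the protobuf blob.
    return data[4:end], end


# Synthetic demo data, not a real wepkg file.
blob = b"\x0a\x03abc"
sample = struct.pack(">I", len(blob)) + blob + b"PAYLOAD"
proto, payload_offset = read_wepkg_header(sample)
print(proto, payload_offset, sample[payload_offset:])
```

The bounds check is not in the decompiled code; it is added here so that a truncated download fails loudly instead of producing a short slice.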
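Further cross-referencing and analysis identified a field named resList.
From experience, parseFrom here is parsing protobuf data.
Working through the code logic together with the actual protobuf data shows that a wepkg file is laid out roughly as follows: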
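- The first 4 bytes give the size of the protobuf data
- The protobuf data, which holds the basic information for multiple files, such as:
  - Owning URL
  - Start offset (measured from the end of the protobuf data)
  - Actual file size
  - File type
  - Cross-origin restrictions, or possibly request header information
  - Other information
- Multiple contiguous blocks of file data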
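To make the layout concrete, here is a standard-library-only sketch that builds a tiny synthetic file in this shape and splits it back apart. The field numbers (top-level field 1 for each resource entry; inside it, 1 = URL, 2 = offset, 3 = size) follow what the unpacking script in this post reads. Encoding offset and size as varints, and the big-endian length prefix, are assumptions. All function names are hypothetical, and the decoder handles only the two protobuf wire types needed here.

```python
import struct


def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)  # more bytes follow
        else:
            out.append(b)
            return bytes(out)


def read_varint(buf: bytes, pos: int):
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf: bytes):
    """Yield (field_number, value) for varint and length-delimited fields."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 7
        if wire == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire == 2:  # length-delimited (bytes, strings, sub-messages)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire} not handled in this sketch")
        yield field, value


def ld(field: int, payload: bytes) -> bytes:
    return encode_varint(field << 3 | 2) + encode_varint(len(payload)) + payload


def vi(field: int, n: int) -> bytes:
    return encode_varint(field << 3) + encode_varint(n)


def build_sample(files) -> bytes:
    index, body = b"", b""
    for url, content in files:
        # Offsets are relative to the end of the protobuf index.
        entry = ld(1, url.encode()) + vi(2, len(body)) + vi(3, len(content))
        index += ld(1, entry)
        body += content
    return struct.pack(">I", len(index)) + index + body


def unpack(data: bytes) -> dict:
    (proto_size,) = struct.unpack(">I", data[:4])
    payload_offset = 4 + proto_size
    out = {}
    for field, entry in iter_fields(data[4:payload_offset]):
        if field != 1:
            continue
        info = dict(iter_fields(entry))
        start = payload_offset + info.get(2, 0)
        out[info[1].decode()] = data[start:start + info.get(3, 0)]
    return out


files = [("res/a.js", b"console.log(1)"), ("res/b.css", b"body{}")]
result = unpack(build_sample(files))
print(result)
```

A real wepkg index carries more fields per entry (type, headers, and so on), so a schema-less decoder such as blackboxprotobuf is the more practical choice for actual files.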
The goal is to extract each file's contents separately, and the information above is enough to write an unpacking script.
Script
pip install blackboxprotobuf
import shutil
from pathlib import Path

import blackboxprotobuf


def main():

    def write_file(dump_path: Path, url: str, data: bytes):
        # Recreate the URL's directory structure under dump_path
        next_path = dump_path
        item_names = url.split('/')
        for dir_name in item_names[:-1]:
            next_path = next_path / dir_name
            next_path.mkdir(exist_ok=True)
        next_path = next_path / item_names[-1]
        next_path.write_bytes(data)
        print(f'[+] dump {url} end')

    pkg = Path(r'pkg_p101944_v10253.wepkg')
    data = pkg.read_bytes()
    print(f'[+] file size:{len(data)}')
    # The first 4 bytes hold the size of the protobuf data
    next_item_size = int.from_bytes(data[:4], byteorder='big')
    print(f'[*] next_item_size:{next_item_size}')
    proto_data = data[4:next_item_size + 4]
    message, typedef = blackboxprotobuf.decode_message(proto_data)
    print('------protobuf------')
    print(message)
    print('------protobuf------')
    # File contents start right after the protobuf data
    payload_offset = next_item_size + 4
    # for k, v in message.items():
    #     print(k, v)
    res_list = message['1']
    # blackboxprotobuf may return a single dict instead of a list
    # when the repeated field occurs only once
    if isinstance(res_list, dict):
        res_list = [res_list]
    if len(res_list) > 0:
        dump_path = pkg.parent / 'wepkg_dump'
        if dump_path.exists():
            shutil.rmtree(dump_path)
            print('[+] delete wepkg_dump')
        dump_path.mkdir()
        # Save the decoded index next to the extracted files
        json_data, typedef = blackboxprotobuf.protobuf_to_json(proto_data)
        info_path = dump_path / 'info.json'
        info_path.write_text(json_data, encoding='utf-8')
        print('[+] save wepkg info')
        for res in res_list:
            url = res['1'].decode('utf-8')
            offset = res['2']
            size = res['3']
            # mime = res['4'].decode('utf-8')
            # headers = res['5']
            print(f'url:{url} offset:{offset} size:{size}')
            # Offsets are relative to the end of the protobuf data
            start = payload_offset + offset
            end = start + size
            payload = data[start:end]
            write_file(dump_path, url, payload)
    print('[+] unpack wepkg file end')


if __name__ == '__main__':
    main()
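The script splits each URL on '/' and uses the pieces directly as directory names. If an entry's URL carries a scheme, a query string, a trailing slash, or ".." segments, that can produce invalid or surprising paths, particularly on Windows. Below is a hedged sketch of a more defensive mapping. Whether real wepkg entries contain such URLs is not established in this post, and url_to_relative_path is a hypothetical helper, not part of the original script.

```python
import re
from pathlib import Path
from urllib.parse import urlsplit


def url_to_relative_path(url: str) -> Path:
    """Map a URL to a relative path that is safe to create under a dump dir."""
    parts = urlsplit(url)  # drops the scheme, query string and fragment
    segments = [parts.netloc] + parts.path.split("/")
    # Drop empty segments and anything that could escape the dump directory.
    safe = [s for s in segments if s and s not in (".", "..")]
    if not safe or parts.path.endswith("/"):
        safe.append("index")  # placeholder name for directory-like URLs
    # Replace characters that are not allowed in Windows file names.
    safe = [re.sub(r'[<>:"|?*\\]', "_", s) for s in safe]
    return Path(*safe)


print(url_to_relative_path("https://example.com/a/b.js?v=1").as_posix())
print(url_to_relative_path("example.com/a/b.js").as_posix())
print(url_to_relative_path("https://h/../../etc/passwd").as_posix())
```

With a helper like this, write_file could create the parent directory in one call with mkdir(parents=True, exist_ok=True) and then write the bytes.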