The knowledge needed below does not require proficiency; an understanding of the basic syntax is enough.
Main text
About Tencent Video cKey
- Tencent Video's cKey is the core parameter for resolving a video's direct address: the parsing request only succeeds if it carries a cKey produced by this algorithm.
* As far as I know, Tencent currently has two versions of the algorithm, 8.1 and 9.1. Version 8.1 is served to browsers that do not support WebAssembly, so on the PC web side the two versions do essentially the same job. Analyzing the 8.1 cKey is much like analyzing part of iQIYI's algorithm, so I will not cover it here (though the Tencent version may be somewhat more tedious).
* Therefore, this article analyzes and implements version 9.1.
About Tools
- This article does not use Fiddler for packet capture; the whole analysis is done with the Chromium developer tools.
- When Chromium is mentioned, many people think of Google Chrome first. Unfortunately we cannot use Google Chrome here, because it is served cKey version 8.1. (My Chrome version is 74.0.3729.131; later versions may be served 9.1.)
- Besides Google Chrome, many browsers are built on the Chromium engine, such as 360 Browser (version unknown) and Sogou Browser (cKey 8.1). Here I use Baidu Browser, which is served cKey 9.1 on Tencent Video. That is the version we are going to analyze.
- The following will be used as an example for analysis: https://v.qq.com/x/cover/bzfkv5se8qaqel2/j002024w2wg.html
First Analysis
- Press F12 to open the developer tools.
- Open the link: https://v.qq.com/x/cover/bzfkv5se8qaqel2/j002024w2wg.html
- Switch to Network to view the packet capture.
- Search for `proxyhttp`; two requests to https://vd.l.qq.com/proxyhttp turn up. (We search for proxyhttp directly because the steps for tracing from the video page to the parsing request were skipped above. If you do not know how to do that, first read the prerequisites suggested earlier.)
- Since the request that fetches the video goes to proxyhttp, how do we pause execution before that request is sent?
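* Go to the `Sources` panel and click `+` under `XHR/fetch Breakpoints` to add a breakpoint on `proxyhttp`, meaning that execution pauses whenever a request URL contains the string proxyhttp.
* \[Figure 1.1\]![[Figure 1.1]](https://i-blog.csdnimg.cn/blog_migrate/183be0e9c80cd214b3b9a01bc78ba1a1.png)
* Press F5 to refresh and wait for the breakpoint to hit.
* \[Figure 1.2\]![[Figure 1.2]](https://i-blog.csdnimg.cn/blog_migrate/dd554f45c7318437620d85a84ba127b2.png)
* Then look at the `Call Stack` on the right: next to each function in the stack is the JS file that contains it.
* \[Figure 1.3\]![[Figure 1.3]](https://i-blog.csdnimg.cn/blog_migrate/f8ddfd0c25bfe66591d14c17da51d580.png)
* Why look at this? For a video site with a huge number of JS scripts, pinning down which script holds the encryption algorithm is the first step. Before the POST to `https://vd.l.qq.com/proxyhttp`, the page must first collect the data to send, so a function that gathers that data has to be called, and that gathering code is bound to touch the encryption code. That is how we can find the encryption part.
* (In fact you can search for `proxyhttp` directly in the Network panel to locate the target script, although this does not always work. Since I used that approach in the iQIYI analysis, I use a different one here.)
* From \[Figure 1.3\] we can see that `tvx.core.js` is what sends the request. So this file is probably a collection of request functions, and since we are already at the sending stage, the data must already be fully assembled.
* Click the second JS file, `pecker.js`, and scroll down to `Scope`: the entries e and f hold all the data of the request about to be sent. Expanding them shows that cKey already exists in data, so move up the `Call Stack` (to the caller one level up).
* \[Figure 1.4\]![[Figure 1.4]](https://i-blog.csdnimg.cn/blog_migrate/f425a09bd70e963208881d32dd6c522f.png)\[Figure 1.5\]![[Figure 1.5]](https://i-blog.csdnimg.cn/blog_migrate/ef93da6f5ddba17b662d9e5dc8335d8a.png)\[Figure 1.6\]![[Figure 1.6]](https://i-blog.csdnimg.cn/blog_migrate/4866f8f2877e0d4e4e7b078a7947bf79.png)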
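* This brings us to `e.requestPostCgi`, located in `htmlframe.......` (see Figure 1.3 for the Call Stack). Judging by its name, it looks like the place where the data to submit is gathered, so dig into it.
* After entering `e.requestPostCgi`, scroll down to `Scope` (figure below). The local variable `c` is the data to be submitted, and the red box in the middle of Figure 1.7 is where `c` is built: `vinfoparam` is produced at `line 62455` by `f.param(b.vinfoparam)`. That function receives `b.vinfoparam` as its argument, and hovering over the argument already shows cKey. So the thing to chase is `b.vinfoparam`, not the function `f.param`.
* \[Figure 1.7\]![[Figure 1.7]](https://i-blog.csdnimg.cn/blog_migrate/eba8f7894948cfe2e64ce504c445cb2b.png)
* The `b` in `b.vinfoparam` is the argument passed in when `e.requestPostCgi` is called (at `62446`).
* So look at the Call Stack in 【Figure 1.3】 again and go up one more level, into the frame `c`.
* \[Figure 1.8\]![[Figure 1.8]](https://i-blog.csdnimg.cn/blog_migrate/642cc2cf6d5b1747fe5b59d3d87cb5a0.png)
* What is passed in is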
{ vinfoparam: g, adparam: e, domain: v, method: w
}
* What we care about is `vinfoparam: g`, so look backwards for the code that produces g. At `62742` in 【Figure 1.8】, step into the function `f.getInfoConfig`. There is no trace of `cKey` there, though. Since reading the code does not tell us directly, set a breakpoint and step through.
* \[Figure 1.9\]![[Figure 1.9]](https://i-blog.csdnimg.cn/blog_migrate/d2b8b9b58befcdb4159718d3526734fb.png)\[Figure 1.10\]![[Figure 1.10]](https://i-blog.csdnimg.cn/blog_migrate/bffef21c1de62b06a391e0588194b96a.png)\[Figure 1.11\]![[Figure 1.11]](https://i-blog.csdnimg.cn/blog_migrate/1d2ba6e1a7391d4d1a229e101a026526.png)
* As Figure 1.11 above shows, we are now debugging inside `getInfoConfig`.
* \[Figure 1.12\]![[Figure 1.12]](https://i-blog.csdnimg.cn/blog_migrate/a06f974d703786292a728a11674a0426.png)\[Figure 1.13\]![[Figure 1.13]](https://i-blog.csdnimg.cn/blog_migrate/2c9f6175355e78280378eb8078330ac8.png)
* Keep stepping (see Figures 1.12 and 1.13): cKey has still not been produced, all the way until `e(h)`. 【Figure 1.14】【Figure 1.15】
* \[Figure 1.14\]![[Figure 1.14]](https://i-blog.csdnimg.cn/blog_migrate/c0c141705065e50a88d305865c77eaf6.png)\[Figure 1.15\]![[Figure 1.15]](https://i-blog.csdnimg.cn/blog_migrate/4657a0fbdc645f19c571edd28b9e42ca.png)
* `a.cKey = b || ""` is where cKey is assigned. Its value is the variable `b`, that is
f ? (a.encryptVer = "9.1", b = f(a.platform, a.appVer, a.vids || a.vid, "", a.guid, a.tm)) : (a.encryptVer = "8.1", b = i(a.vids || a.vid, a.tm, a.appVer, a.guid, a.platform)), a.cKey = b || ""
* From this we can see that the choice between versions 8.1 and 9.1 depends on whether `f` is defined. That is not our focus, though. Since we are analyzing version 9.1, step into the function `f()`.
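The comma-and-ternary expression above is easier to follow when unrolled. The following is a minimal sketch rather than the site's code: `selectCkey`, `wasmGetCkey` and `legacyGetCkey` are hypothetical names standing in for the surrounding code and the minified `f` and `i`, and the stub functions and parameter values in the usage lines are made up purely to make the example runnable.

```javascript
// Unrolled reading of the minified ternary. wasmGetCkey plays the role of `f`
// (only defined when the WebAssembly path is available) and legacyGetCkey
// plays the role of `i`. Both names are hypothetical.
function selectCkey(a, wasmGetCkey, legacyGetCkey) {
  const vid = a.vids || a.vid;
  let b;
  if (wasmGetCkey) {
    a.encryptVer = "9.1";
    b = wasmGetCkey(a.platform, a.appVer, vid, "", a.guid, a.tm);
  } else {
    a.encryptVer = "8.1";
    b = legacyGetCkey(vid, a.tm, a.appVer, a.guid, a.platform);
  }
  a.cKey = b || "";
  return a;
}

// Usage with stand-in functions and placeholder values (not real cKeys).
const params = { platform: "p", appVer: "v", vid: "some-vid", guid: "g", tm: 0 };
const withWasm = selectCkey({ ...params }, () => "from-wasm", () => "from-js");
const withoutWasm = selectCkey({ ...params }, null, () => "from-js");
console.log(withWasm.encryptVer, withoutWasm.encryptVer);
```

Note that the two branches pass the same fields in a different order, which is worth keeping in mind when comparing captured 8.1 and 9.1 requests.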
Key Analysis
function i(a, b, c, d, e) { function k(a, b) { if (0 === b || !a) return ""; for (var c, d = 0, e = 0; ; ) { if (g(a + e < db), c = Ga[a + e >> 0], d |= c, 0 == c && !b) break; if (e++, b && e == b) break } b || (b = e); var f = ""; if (d < 128) { for (var h, i = 1024; b > 0; ) h = String.fromCharCode.apply(String, Ga.subarray(a, a + Math.min(b, i))), f = f ? f + h : h, a += i, b -= i; return f } return m(a) } function f(a) { return "string" === b ? k(a) : "boolean" === b ? Boolean(a) : a } var i = h(a) , j = [] , l = 0; if (g("array" !== b, 'Return type should not be "array".'), d) for (var m = 0; m < d.length; m++) { var n = $a[c[m]]; n ? (0 === l && (l = Ub()), j[m] = n(d[m])) : j[m] = d[m] } var o = i.apply(null, j); return o = f(o), 0 !== l && Tb(l), o }
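- Start from 【Figure 1.15】 found above.
- Keep stepping from the breakpoint into 【Figure 2.1】 and 【Figure 2.2】.
- [Figure 2.1][Figure 2.2][Figure 2.3]
- The return value is the variable `o`, so focus on it. Step to `o` at line `64084` and step in. 【Figure 2.3】 shows `ua._getckey`, so it looks like we are in the right place.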
ua._getckey = function() { return g(ib, "you need to wait for the runtime to be ready (e.g. wait for main() to be called)"), g(!jb, "the runtime was exited (use NO_EXIT_RUNTIME to keep it alive after main() exits)"), ua.asm._getckey.apply(null, arguments)
}
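The wrapper above follows a pattern that recurs throughout this glue code: assert that the runtime is ready and has not exited, then forward the call unchanged to the compiled export. A minimal sketch of that pattern, where `runtime`, `guarded`, `assertOk` and the stand-in export are all hypothetical names (the minified originals are `ib`, `jb` and `g`):

```javascript
// Guard-then-forward pattern. `runtime.ready` and `runtime.exited` stand in
// for the minified flags `ib` and `jb`; `assertOk` stands in for `g`.
function assertOk(condition, message) {
  if (!condition) throw new Error("Assertion failed: " + message);
}

function guarded(runtime, target) {
  return function (...args) {
    assertOk(runtime.ready, "you need to wait for the runtime to be ready");
    assertOk(!runtime.exited, "the runtime was exited");
    return target.apply(null, args);
  };
}

// Usage: a stand-in function replaces the real wasm export.
const runtime = { ready: false, exited: false };
const getckey = guarded(runtime, (x) => x * 2);
let failedEarly = false;
try {
  getckey(21);
} catch (e) {
  failedEarly = true; // the runtime is not ready yet, so the guard throws
}
runtime.ready = true;
console.log(failedEarly, getckey(21));
```

When re-implementing the algorithm outside the page, these guards can simply be dropped as long as the module is fully instantiated before the first call.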
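- Step into `ua.asm._getckey.apply(null, arguments)` and... what on earth is this? 【Figure 2.4】
- [Figure 2.4][Figure 2.5]
- Why is the function name a number? It also cannot be stepped into, and the tooltip says `native code`, which means this is not ordinary JS code and is probably implemented in another language.
- In fact this is `WebAssembly`, which can be thought of as a way of mixing other languages into JS in order to run faster: code written in C or another language is compiled into a *.wasm file, which the WebAssembly runtime then loads and runs.
- 【Figure 2.5】 shows the loaded wasm file. The function named 29 corresponds to `wasm-0005098e-29`, and clicking it shows the disassembled instructions.
- That covers the basics of this technique; search Baidu or Google if you want to know more.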
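- The important thing now is to find the `wasm` file that gets loaded.
- The simplest way is to search for `wasm` in the `Sources` panel, which turns up the loaded wasm file.
- [Figure 2.6]
- There are other ways to locate the wasm, but since it is fetched with a GET request it naturally shows up in the capture, so I take the lazy route here and skip the code analysis (otherwise this article would get very long).
- Keep in mind that although we now have the wasm file, anything that mixes languages like this needs interfaces, and those interfaces have to be supplied by the host. So we also need to find the wasm interface part; set that aside for now and come back to it later.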
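- 【Figure 2.4】 shows that `arguments` is passed in. We have the wasm, but we still need to know what `arguments` contains before we can reproduce the algorithm.
- `arguments` is the parameter `j` passed in 【Figure 2.3】 earlier, so we need to obtain `j`.
- In 【Figure 2.2】, execution enters the functions `Ub()` and `n()`, where `n()` comes from `var n = $a[c[m]];`. So press F5 to refresh the page and break again at 【Figure 2.2】, in order to single-step and find what we need.
- [Figure 2.7]
- Single-stepping from 【Figure 2.7】, you will find that `n` takes two forms: one is `undefined`, and the other is the following: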
stringToC: function(a) { var b = 0; if (null !== a && void 0 !== a && 0 !== a) { var c = (a.length << 2) + 1; b = Sb(c), o(a, b, c) } return b
}
- Keep digging until you have found `Sb()`, `o()` and `n()`, along with `Ub` used in the loop and `k()` used in the `f()` function. Then you can put them together.
Ub = function() { return g(ib, "you need to wait for the runtime to be ready (e.g. wait for main() to be called)"), g(!jb, "the runtime was exited (use NO_EXIT_RUNTIME to keep it alive after main() exits)"), ua.asm.stackSave.apply(null, arguments)
} Sb = function() { return g(ib, "you need to wait for the runtime to be ready (e.g. wait for main() to be called)"), g(!jb, "the runtime was exited (use NO_EXIT_RUNTIME to keep it alive after main() exits)"), ua.asm.stackAlloc.apply(null, arguments) } function o(a, b, c) { return g("number" == typeof c, "stringToUTF8(str, outPtr, maxBytesToWrite) is missing the third parameter that specifies the length of the output buffer!"), n(a, Ga, b, c)
} function n(a, b, c, d) { if (!(d > 0)) return 0; for (var e = c, f = c + d - 1, g = 0; g < a.length; ++g) { var h = a.charCodeAt(g); if (h >= 55296 && h <= 57343) { var i = a.charCodeAt(++g); h = 65536 + ((1023 & h) << 10) | 1023 & i } if (h <= 127) { if (c >= f) break; b[c++] = h } else if (h <= 2047) { if (c + 1 >= f) break; b[c++] = 192 | h >> 6, b[c++] = 128 | 63 & h } else if (h <= 65535) { if (c + 2 >= f) break; b[c++] = 224 | h >> 12, b[c++] = 128 | h >> 6 & 63, b[c++] = 128 | 63 & h } else if (h <= 2097151) { if (c + 3 >= f) break; b[c++] = 240 | h >> 18, b[c++] = 128 | h >> 12 & 63, b[c++] = 128 | h >> 6 & 63, b[c++] = 128 | 63 & h } else if (h <= 67108863) { if (c + 4 >= f) break; b[c++] = 248 | h >> 24, b[c++] = 128 | h >> 18 & 63, b[c++] = 128 | h >> 12 & 63, b[c++] = 128 | h >> 6 & 63, b[c++] = 128 | 63 & h } else { if (c + 5 >= f) break; b[c++] = 252 | h >> 30, b[c++] = 128 | h >> 24 & 63, b[c++] = 128 | h >> 18 & 63, b[c++] = 128 | h >> 12 & 63, b[c++] = 128 | h >> 6 & 63, b[c++] = 128 | 63 & h } } return b[c] = 0, c - e
}
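To make the roles of these helpers concrete, here is a hedged sketch of the same string marshalling on a plain `Uint8Array`. It is not the page's code: `heap` stands in for `Ga`, `stackAlloc` is a toy bump allocator standing in for the wasm export behind `Sb()`, and the UTF-8 encoding is delegated to the standard `TextEncoder` instead of the hand-written loop in `n()`.

```javascript
// Sketch of stringToC-style marshalling on a plain byte array.
const heap = new Uint8Array(1024); // stands in for Ga
let stackTop = 16;                 // arbitrary start so that 0 can mean "null"

// Toy replacement for the wasm's stackAlloc: bump a pointer, keep 16-byte alignment.
function stackAlloc(size) {
  const ptr = stackTop;
  stackTop = (stackTop + size + 15) & -16;
  if (stackTop > heap.length) throw new RangeError("toy stack overflow");
  return ptr;
}

// Write `str` as NUL-terminated UTF-8 at heap[ptr..], using at most maxBytes
// (terminator included). Returns the bytes written, excluding the NUL.
// Unlike n(), truncation here could cut a multi-byte character in half.
function stringToUTF8(str, ptr, maxBytes) {
  if (!(maxBytes > 0)) return 0;
  const bytes = new TextEncoder().encode(str);
  const count = Math.min(bytes.length, maxBytes - 1);
  heap.set(bytes.subarray(0, count), ptr);
  heap[ptr + count] = 0;
  return count;
}

function stringToC(str) {
  if (str === null || str === undefined || str === 0) return 0;
  const size = (str.length << 2) + 1; // generous bound: 4 bytes per UTF-16 unit, plus NUL
  const ptr = stackAlloc(size);
  stringToUTF8(str, ptr, size);
  return ptr;
}

// Read a NUL-terminated UTF-8 string back, which is the direction k() handles.
function readCString(ptr) {
  let end = ptr;
  while (end < heap.length && heap[end] !== 0) end++;
  return new TextDecoder().decode(heap.subarray(ptr, end));
}

const p = stringToC("cKey 9.1 é");
console.log(p, readCString(p));
```

Because `stringToC` reserves four bytes per UTF-16 code unit, the truncation branch is never reached on that path; it only matters if `stringToUTF8` is called directly with a tight buffer.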
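- You will have noticed that the function `o(a, b, c)` above calls `n(a, Ga, b, c)`. We know `a`, `b` and `c`, but what is `Ga`?
- Since it is not among the Local variables, look one scope up. See Figure 2.8 below.
- [Figure 2.8][Figure 2.9]
- The enclosing scope does contain `Ga`, so we have found it; see 【Figure 2.9】.
- Now that we know why we need `Ga`, tie together everything that assigns to `Ga`.
- This will be a long process.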
function w() { Fa = new Int8Array(Ea), Ha = new Int16Array(Ea), Ja = new Int32Array(Ea), Ga = new Uint8Array(Ea), Ia = new Uint16Array(Ea), Ka = new Uint32Array(Ea), La = new Float32Array(Ea), Ma = new Float64Array(Ea);
} function d(a) { var b = Oa; return Oa = Oa + a + 15 & -16, b
}
function e(a, b) { b || (b = Da); var c = a = Math.ceil(a / b) * b; return c
} var Da = 16; var Ea, Fa, Ga, Ha, Ia, Ja, Ka, La, Ma, Na, Oa, Pa, Qa, Ra, Sa, Ta, Ua, Va = { "f64-rem": function(a, b) { return a % b }, "debugger": function() {}
}, Wa = (new Array(0), 1024) ; Na = Oa = Qa = Ra = Sa = Ta = Ua = 0, Pa = !1;
var cb = 5242880 , db = 16777216, ab = 65536; var wasmMemory = new WebAssembly.Memory({ initial: db / ab, maximum: db / ab
});
Ea = wasmMemory.buffer; w();
Ja[0] = 1668509029;
Ha[1] = 25459; var eb = [] , fb = [] , gb = [] , hb = [] , ib = !1 , jb = !1; Na = Wa, Oa = Na + 6928, fb.push(); Oa += 16; Ua = d(4),
Qa = Ra = e(Oa),
Sa = Qa + cb,
Ta = e(Sa),
Ja[Ua >> 2] = Ta,
Pa = !0;
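The pointer arithmetic in this initialization code can be replayed with readable names to see where the memory regions end up. This is a sketch under assumptions: the long names are mine, chosen after the usual Emscripten conventions, and `tempDoublePtr` is assumed to be taken from the static top just before the `Oa += 16` bump, since the excerpt does not show the assignment to `rb`.

```javascript
// Replays the layout computation with descriptive names.
// The numbers come from the excerpt above; the names are assumptions.
const WASM_PAGE_SIZE = 65536;  // ab
const TOTAL_MEMORY = 16777216; // db
const TOTAL_STACK = 5242880;   // cb
const STATIC_BASE = 1024;      // Wa

const memory = new WebAssembly.Memory({
  initial: TOTAL_MEMORY / WASM_PAGE_SIZE,
  maximum: TOTAL_MEMORY / WASM_PAGE_SIZE,
});
const HEAP32 = new Int32Array(memory.buffer); // Ja

let staticTop = STATIC_BASE + 6928; // Oa = Na + 6928
const tempDoublePtr = staticTop;    // assumption: rb is captured here
staticTop += 16;                    // Oa += 16

// d(a): hand out `size` bytes of static memory, then re-align the top to 16.
function staticAlloc(size) {
  const ptr = staticTop;
  staticTop = (staticTop + size + 15) & -16;
  return ptr;
}

// e(a, b): round up to a multiple of `factor` (16 by default).
function alignUp(x, factor = 16) {
  return Math.ceil(x / factor) * factor;
}

const DYNAMICTOP_PTR = staticAlloc(4);      // Ua
const STACKTOP = alignUp(staticTop);        // Qa = Ra
const STACK_MAX = STACKTOP + TOTAL_STACK;   // Sa
const DYNAMIC_BASE = alignUp(STACK_MAX);    // Ta
HEAP32[DYNAMICTOP_PTR >> 2] = DYNAMIC_BASE; // Ja[Ua >> 2] = Ta

console.log({ tempDoublePtr, DYNAMICTOP_PTR, STACKTOP, STACK_MAX });
```

The four values come out as 7952, 7968, 7984 and 5250864, which are exactly the literals used for `tempDoublePtr`, `DYNAMICTOP_PTR`, `STACKTOP` and `STACK_MAX` in the import object at the end of this article.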
- That takes care of the initialization of `Ga`.
- At this point the loop part is resolved:
for (var m = 0; m < d.length; m++) { var n = $a[c[m]]; n ? (0 === l && (l = Ub()), j[m] = n(d[m])) : j[m] = d[m]
}
var o = i.apply(null, j); return o = f(o), 0 !== l && Tb(l),
o
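Read as a whole, the wrapper does three things: convert each argument (strings become pointers), call the compiled function, then convert the result and restore the stack. The sketch below restates that flow with descriptive names. Everything in it is hypothetical scaffolding: `converters` stands in for `$a`, `exportsTable` for the compiled exports, `stackSave`/`stackRestore` for `Ub()`/`Tb()`, and the toy environment in the usage lines replaces real wasm memory with an array of strings.

```javascript
// Readable sketch of the call wrapper: convert arguments, call, convert result.
function makeCcall({ exportsTable, converters, stackSave, stackRestore, readCString }) {
  return function ccall(name, returnType, argTypes, args) {
    const fn = exportsTable[name];
    const converted = [];
    let savedStack = 0;
    if (args) {
      for (let m = 0; m < args.length; m++) {
        const convert = converters[argTypes[m]];
        if (convert) {
          if (savedStack === 0) savedStack = stackSave(); // save once, lazily
          converted[m] = convert(args[m]);
        } else {
          converted[m] = args[m]; // no converter registered: pass through
        }
      }
    }
    let result = fn.apply(null, converted);
    if (returnType === "string") result = readCString(result);
    else if (returnType === "boolean") result = Boolean(result);
    if (savedStack !== 0) stackRestore(savedStack);
    return result;
  };
}

// Toy environment: "pointers" are indexes into an array of strings.
const strings = [null]; // index 0 is reserved as the "null pointer"
let restoreCalls = 0;
const ccall = makeCcall({
  exportsTable: { _echo: (ptr) => ptr, _add: (x, y) => x + y },
  converters: { string: (s) => strings.push(s) - 1 },
  stackSave: () => 1,
  stackRestore: () => { restoreCalls++; },
  readCString: (ptr) => strings[ptr],
});
console.log(ccall("_echo", "string", ["string"], ["hello"]));
console.log(ccall("_add", "number", ["number", "number"], [2, 3]));
```

The stack is only saved and restored when at least one argument needed conversion, which matches the `0 === l` and `0 !== l` checks in the minified loop.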
- `i.apply(null, j);` has already been mentioned; the function it calls lives in the wasm.
- So what we need now is to load the wasm correctly. Once that is done, all the functions can be wired together to produce the cKey.
- First take a look at the following code:
var ub = ua.asm(ua.asmGlobalArg, ua.asmLibraryArg, Ea) var Cb = ub._getckey; ub._getckey = function() { return g(ib, "you need to wait for the runtime to be ready (e.g. wait for main() to be called)"), g(!jb, "the runtime was exited (use NO_EXIT_RUNTIME to keep it alive after main() exits)"), Cb.apply(null, arguments) }
- So the first thing we learn is that `ub` is `ua.asm(ua.asmGlobalArg, ua.asmLibraryArg, Ea)`.
- Stepping in with the debugger, we find the following code.
ua.asm = function(a, b, c) { if (!b.table) { var d = ua.wasmTableSize; void 0 === d && (d = 1024); var f = ua.wasmMaxTableSize; "object" == typeof WebAssembly && "function" == typeof WebAssembly.Table ? void 0 !== f ? b.table = new WebAssembly.Table({ initial: d, maximum: f, element: "anyfunc" }) : b.table = new WebAssembly.Table({ initial: d, element: "anyfunc" }) : b.table = new Array(d), ua.wasmTable = b.table } b.memoryBase || (b.memoryBase = ua.STATIC_BASE), b.tableBase || (b.tableBase = 0); var h; return h = e(a, b, c), g(h, "no binaryen method succeeded. consider enabling more options, like interpreting, if you want that: http://kripken.github.io/emscripten-site/docs/compiling/WebAssembly.html#binaryen-methods"), h
}
- This is where the wasm is loaded. Loading depends on the parameters `a`, `b` and `c`, so go back to `ua.asm(ua.asmGlobalArg, ua.asmLibraryArg, Ea)`.
- The three arguments are `ua.asmGlobalArg`, `ua.asmLibraryArg` and `Ea`. `Ea` was covered earlier, in connection with `Ga`.
- The rest is easy to find:
ua.wasmTableSize = 99, ua.wasmMaxTableSize = 99, ua.asmGlobalArg = {},
ua.asmLibraryArg = { abort: sa, assert: g, enlargeMemory: B, getTotalMemory: C, abortOnCannotGrowMemory: A, abortStackOverflow: z, nullFunc_ii: ca, nullFunc_iiii: da, nullFunc_v: ea, nullFunc_vi: fa, nullFunc_viiii: ga, nullFunc_viiiii: ha, nullFunc_viiiiii: ia, invoke_ii: ja, invoke_iiii: ka, invoke_v: la, invoke_vi: ma, invoke_viiii: na, invoke_viiiii: oa, invoke_viiiiii: pa, __ZSt18uncaught_exceptionv: Q, ___cxa_find_matching_catch: S, ___gxx_personality_v0: T, ___lock: U, ___resumeException: R, ___setErrNo: ba, ___syscall140: V, ___syscall146: X, ___syscall54: Y, ___syscall6: Z, ___unlock: $, _abort: _, _emscripten_memcpy_big: aa, _get_unicode_str: P, flush_NO_FILESYSTEM: W, DYNAMICTOP_PTR: Ua, tempDoublePtr: rb, STACKTOP: Ra, STACK_MAX: Sa
}; var ub = ua.asm(ua.asmGlobalArg, ua.asmLibraryArg, Ea)
- As you can see, loading the wasm links in many interfaces. Here I only discuss one of the more important ones, `P`, the `P` in `_get_unicode_str: P,`, which is defined as follows:
function P() { function a(a) { return a ? a.length > 48 ? a.substr(0, 48) : a : "" } function b() { var b = document.URL , c = window.navigator.userAgent.toLowerCase() , d = ""; document.referrer.length > 0 && (d = document.referrer); try { 0 == d.length && opener.location.href.length > 0 && (d = opener.location.href) } catch (e) {} var f = window.navigator.appCodeName , g = window.navigator.appName , h = window.navigator.platform; return b = a(b), d = a(d), c = a(c), b + "|" + c + "|" + d + "|" + f + "|" + g + "|" + h } var c = b() , d = p(c) + 1 , e = Pb(d); return o(c, e, d + 1), e
}
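The string that `P()` assembles is a small browser fingerprint: URL, lower-cased user agent and referrer (each clipped to 48 characters), followed by `appCodeName`, `appName` and `platform`, joined with `|`. The sketch below rebuilds only that string as a pure function so it can run outside a browser. The `env` parameter and its field names are hypothetical, and the user agent and navigator values in the usage example are made up; copying the result into wasm memory is left out.

```javascript
// Pure-function sketch of the string built inside P().
function clip(s) {
  return s ? (s.length > 48 ? s.slice(0, 48) : s) : "";
}

// `env` is a hypothetical plain object replacing document/navigator/opener.
function buildUnicodeStr(env) {
  const url = clip(env.url);                          // document.URL
  const agent = clip(env.userAgent.toLowerCase());    // navigator.userAgent
  // Fall back to the opener's URL only when there is no referrer.
  const referrer = clip(env.referrer || env.openerHref || "");
  return [url, agent, referrer, env.appCodeName, env.appName, env.platform].join("|");
}

// Usage with placeholder values.
const fingerprint = buildUnicodeStr({
  url: "https://v.qq.com/x/cover/bzfkv5se8qaqel2/j002024w2wg.html",
  userAgent: "Mozilla/5.0 (Example)",
  referrer: "",
  openerHref: "",
  appCodeName: "Mozilla",
  appName: "Netscape",
  platform: "Win32",
});
console.log(fingerprint);
```

Since these values feed into the cKey computation, a re-implementation running outside the browser has to supply plausible replacements for them rather than empty strings.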
- Why is this important? If you single-step through the function we focused on at the start, you will find that executing `_getckey()` performs a `call 20`, that is, a call to `function number 20` of the wasm file. Yet if you look closely at 【Figure 2.5】, `function number 20` is missing. That is because the function `P()` was linked in through the imports, and `P()` is `function number 20`.
- The other imported functions are of little use to us, so empty functions can be linked in their place.
- Accordingly, I set up the interface linking and the wasm environment as follows.
var fun_ = function(){}; wasm_env = { abort: fun_, assert: fun_, enlargeMemory: fun_, getTotalMemory: C, abortOnCannotGrowMemory: fun_, abortStackOverflow: fun_, nullFunc_ii: fun_, nullFunc_iiii: fun_, nullFunc_v: fun_, nullFunc_vi: fun_, nullFunc_viiii: fun_, nullFunc_viiiii: fun_, nullFunc_viiiiii: fun_, invoke_ii: fun_, invoke_iiii: fun_, invoke_v: fun_, invoke_vi: fun_, invoke_viiii: fun_, invoke_viiiii: fun_, invoke_viiiiii: fun_, __ZSt18uncaught_exceptionv: fun_, ___cxa_find_matching_catch: fun_, ___gxx_personality_v0: fun_, ___lock: fun_, ___resumeException: fun_, ___setErrNo: fun_, ___syscall140: fun_, ___syscall146: fun_, ___syscall54: fun_, ___syscall6: fun_, ___unlock: fun_, _abort: fun_, _emscripten_memcpy_big: fun_, _get_unicode_str: P, flush_NO_FILESYSTEM: fun_, DYNAMICTOP_PTR: 7968, tempDoublePtr: 7952, STACKTOP: 7984, STACK_MAX: 5250864, memoryBase: 1024, tableBase: 0, memory: wasmMemory, table: new WebAssembly.Table({ initial: 99, maximum: 99, element: "anyfunc" }) }; var importObject = { 'env': wasm_env, 'asm2wasm': { "f64-rem": function(a, b) { return a % b }, "debugger": function() {} }, 'global': { NaN: NaN, Infinity: 1 / 0 }, "global.Math": Math, };
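The import object above was typed out by hand. As a hedged alternative, the list of imports can be read from the module itself with the standard `WebAssembly.Module.imports`, stubbing every function import with a no-op and overriding only the ones that matter. The sketch below demonstrates the idea on a tiny hand-assembled module that stands in for the real .wasm file; `stubImports` is a hypothetical helper, and the module's `env.get` import and `run` export are invented for the demonstration.

```javascript
// Tiny stand-in module: imports env.get (() -> i32), exports run() = env.get().
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic, version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,             // type 0: () -> i32
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,             // import module "env"
  0x03, 0x67, 0x65, 0x74, 0x00, 0x00,                   //   field "get", func, type 0
  0x03, 0x02, 0x01, 0x00,                               // one local function, type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = func 1
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b,       // body: call 0; end
]);

// Build an import object with a no-op for every function import, then apply
// overrides. Memory, table and global imports are NOT stubbed here and must
// be supplied through `overrides`.
function stubImports(module, overrides) {
  const imports = {};
  for (const { module: ns, name, kind } of WebAssembly.Module.imports(module)) {
    imports[ns] = imports[ns] || {};
    if (kind === "function") imports[ns][name] = function () {};
  }
  for (const ns of Object.keys(overrides)) {
    imports[ns] = Object.assign(imports[ns] || {}, overrides[ns]);
  }
  return imports;
}

const wasmModule = new WebAssembly.Module(bytes);
const imports = stubImports(wasmModule, { env: { get: () => 20 } });
const instance = new WebAssembly.Instance(wasmModule, imports);
console.log(instance.exports.run());
```

With the real binary, the overrides would carry `_get_unicode_str`, `getTotalMemory`, the memory, the table and the numeric globals from the hand-written object. In a browser, the asynchronous `WebAssembly.instantiate` is usually preferable to the synchronous constructors used here.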