简介

FWT 是用来求位运算卷积的工具,对于长度为 2n2^n 的数组 A,B,CA,B,C,它们的元素类型通常是 Zp\mathbb Z_pZ\mathbb Z,若:

0k2n1,Ck=i=02n1j=02n1AiBj[ij=k]\forall 0\le k\le 2^n-1, C_k=\sum_{i=0}^{2^n-1}\sum_{j=0}^{2^n-1} A_iB_j[i\circ j=k]

其中 [][-] 是艾弗森括号,那么称 CCA,BA,B 关于 \circ 运算的卷积,记作 C=ABC=A*B

我们希望找到一个可逆的变换 F\mathcal F,使得:

F(C)=F(A)F(B)\mathcal F(C)=\mathcal F(A)\cdot \mathcal F(B)

其中 \cdot 是逐元素的乘法。

其次,我们也希望这是一个线性变换,即:

F(kA+lB)=kF(A)+lF(B)\mathcal F(kA+lB)=\mathcal kF(A)+\mathcal lF(B)

其中 k,lk,l 是任意常数,加法是逐元素加法,乘法是数量乘法。

变换

下面的问题就是,怎么找到这样的变换?

ci,jc_{i,j}AjA_jFi(A)\mathcal F_i(A) 的贡献系数,我们可以先构造出这样的一种变换:

Fi(A)=j=02n1ci,jAj\mathcal F_i(A)=\sum_{j=0}^{2^n-1} c_{i,j} A_j

其中 ci,jc_{i,j} 满足这样的性质:

  1. ii 的二进制表示是 in1i1i0i_{n-1}\cdots i_1i_0jj 的二进制表示是 jn1j1j0j_{n-1}\cdots j_1j_0,那么 ci,j=k=0n1cik,jkc_{i,j}=\prod_{k=0}^{n-1} c_{i_k,j_k}
  2. 对于任意的 i,ji,jci,j{1,0,1}c_{i,j}\in \{-1,0,1\}

为什么这么设?因为这是一个易于处理的假设,因为只要知道了 c0/1,0/1=±1/0c_{0/1,0/1}=\pm1/0,就能求出所有的 ci,jc_{i,j},而前者一共只有 343^4 种可能性。

根据我们的需求,展开 Fi(C)=Fi(A)Fi(B)\mathcal F_i(C)=\mathcal F_i(A)\mathcal F_i(B),可以得到:

Fi(C)=p=02n1Cpci,p=p=02n1j=02n1k=02n1AjBkci,p[jk=p]=j=02n1k=02n1AjBk(p=02n1ci,p[jk=p])=j=02n1k=02n1AjBkci,jkFi(A)Fi(B)=j=02n1k=02n1AjBkci,jci,k\begin{aligned} \mathcal F_i(C)&=\sum_{p=0}^{2^n-1} C_pc_{i,p}\\ &=\sum_{p=0}^{2^n-1}\sum_{j=0}^{2^n-1}\sum_{k=0}^{2^n-1} A_jB_k c_{i,p}[j\circ k=p]\\ &=\sum_{j=0}^{2^n-1}\sum_{k=0}^{2^n-1} A_jB_k \left(\sum_{p=0}^{2^n-1}c_{i,p}[j\circ k=p]\right)\\ &=\sum_{j=0}^{2^n-1}\sum_{k=0}^{2^n-1} A_jB_k c_{i,j\circ k}\\ \mathcal F_i(A)\mathcal F_i(B)&=\sum_{j=0}^{2^n-1}\sum_{k=0}^{2^n-1} A_jB_kc_{i,j}c_{i,k} \end{aligned}

因此 ci,jci,k=ci,jkc_{i,j}c_{i,k}=c_{i,j\circ k} 是满足 F(C)=F(A)F(B)\mathcal F(C)=\mathcal F(A)\cdot \mathcal F(B) 的充要条件。

当我们在寻找逆变换之前,我们来看如何快速地求出 Fi(A)\mathcal F_i(A)

由于这和二进制相关,所以我们按照 02n110\sim 2^{n-1}-12n12n12^{n-1}\sim 2^n-1 分组,设 ii 在二进制从低到高的第 nn 位是 i0i_0,那么:

Fi(A)=j=02n11ci,jAj+j=2n12n1ci,jAj=ci0,0(j=02n11ci2n1i0,jAj)+ci0,1(j=02n11ci2n1i0,jAj+2n1)=ci0,0Fi2n1i0(AL)+ci0,1Fi2n1i0(AR)\begin{aligned} \mathcal F_i(A)&=\sum_{j=0}^{2^{n-1}-1}c_{i,j}A_j+\sum_{j=2^{n-1}}^{2^n-1}c_{i,j}A_j\\ &=c_{i_0,0}\left(\sum_{j=0}^{2^{n-1}-1} c_{i-2^{n-1}i_0,j} A_{j}\right)+c_{i_0,1}\left(\sum_{j=0}^{2^{n-1}-1}c_{i-2^{n-1}i_0,j}A_{j+2^{n-1}}\right)\\ &=c_{i_0,0}\mathcal F_{i-2^{n-1}i_0}(A^L)+c_{i_0,1}\mathcal F_{i-2^{n-1}i_0} (A^R) \end{aligned}

其中 F(AL)\mathcal F(A^L)F(AR)\mathcal F(A^{R}) 代表将 A0A2n11A_0\sim A_{2^{n-1}-1}A2n1A2n1A_{2^{n-1}}\sim A_{2^n-1} 看作两个独立的数组解出的子问题的答案。

满足 i2n1i0=mi-2^{n-1}i_0=mmm 是某个常数)的 ii 有且仅有两个 iL,iRi_L,i_R,它们的 i0i_0 对应着 0011

也就是说:

FiL(A)=c0,0Fm(AL)+c0,1Fm(AR)FiR(A)=c1,0Fm(AL)+c1,1Fm(AR)\begin{aligned} \mathcal F_{i_L}(A)&=c_{0,0} \mathcal F_m(A^L)+c_{0,1}\mathcal F_m(A^R)\\ \mathcal F_{i_R}(A)&=c_{1,0}\mathcal F_m(A^L)+c_{1,1}\mathcal F_m(A^R) \end{aligned}

如果矩阵 [c0,0c0,1c1,0c1,1]\begin{bmatrix}c_{0,0}&c_{0,1}\\c_{1,0}&c_{1,1}\end{bmatrix} 可逆,那么这个变换自然就可逆。

  • \circ 是按位与运算时,c0,0,c0,1,c1,0,c1,1=1,1,0,1c_{0,0},c_{0,1},c_{1,0},c_{1,1}=1,1,0,1 是满足要求的;
  • \circ 是按位或运算时,1,0,1,11,0,1,1 是满足要求的;
  • \circ 是按位异或运算时,1,1,1,11,1,1,-1 是满足要求的。

逆变换

接下来我们看,具体如何进行逆变换。

对于变换 F\mathcal F,我们想找到一个 G\mathcal G 满足:

Gi(F(A))=Ai\mathcal G_i(\mathcal F(A))= A_i

G\mathcal G 对应的贡献系数是 di,jd_{i,j},那么展开可以得到:

Gi(F(A))=j=02n1di,jFj(A)=j=02n1di,j(k=02n1cj,kAk)=k=02n1Ak(j=02n1di,jcj,k)\begin{aligned} \mathcal G_i(\mathcal F(A))&=\sum_{j=0}^{2^n-1} d_{i,j}\mathcal F_j(A)\\ &=\sum_{j=0}^{2^n-1} d_{i,j}\left(\sum_{k=0}^{2^n-1} c_{j,k} A_k\right)\\ &=\sum_{k=0}^{2^n-1} A_k\left(\sum_{j=0}^{2^n-1} d_{i,j}c_{j,k}\right) \end{aligned}

于是我们希望 j=02n1di,jcj,k=[i=k]\sum_{j=0}^{2^n-1} d_{i,j}c_{j,k}=[i=k],这显然是一个矩阵的形式,于是这个式子成立的充要条件是 dd 的矩阵与 cc 的矩阵之积是单位阵 II

进一步地,如果下面的式子成立:

[c0,0c0,1c1,0c1,1][d0,0d0,1d1,0d1,1]=I2\begin{bmatrix}c_{0,0}&c_{0,1}\\c_{1,0}&c_{1,1}\end{bmatrix} \begin{bmatrix}d_{0,0}&d_{0,1}\\d_{1,0}&d_{1,1}\end{bmatrix}=I_2

令左边的矩阵是 FF,右边的矩阵是 GG,就会有:

[c0,0c0,1c0,2c0,3c1,0c1,1c1,2c1,3c2,0c2,1c2,2c2,3c3,0c3,1c3,2c3,3]=[c0,0Fc0,1Fc1,0Fc1,1F],[d0,0d0,1d0,2d0,3d1,0d1,1d1,2d1,3d2,0d2,1d2,2d2,3d3,0d3,1d3,2d3,3]=[d0,0Gd0,1Gd1,0Gd1,1G]\begin{bmatrix} c_{0,0}&c_{0,1}&c_{0,2}&c_{0,3}\\ c_{1,0}&c_{1,1}&c_{1,2}&c_{1,3}\\ c_{2,0}&c_{2,1}&c_{2,2}&c_{2,3}\\ c_{3,0}&c_{3,1}&c_{3,2}&c_{3,3}\\ \end{bmatrix}= \begin{bmatrix} c_{0,0} F&c_{0,1}F\\ c_{1,0} F&c_{1,1}F \end{bmatrix}, \begin{bmatrix} d_{0,0}&d_{0,1}&d_{0,2}&d_{0,3}\\ d_{1,0}&d_{1,1}&d_{1,2}&d_{1,3}\\ d_{2,0}&d_{2,1}&d_{2,2}&d_{2,3}\\ d_{3,0}&d_{3,1}&d_{3,2}&d_{3,3}\\ \end{bmatrix}= \begin{bmatrix} d_{0,0} G&d_{0,1}G\\ d_{1,0} G&d_{1,1}G \end{bmatrix}

于是,这两个矩阵按照分块矩阵的乘法乘起来也是单位阵 I4I_4,以此类推,最大的那个 F,GF,G 乘起来就是 I2nI_{2^n}

注意,由 2×22\times 24×44\times 4 的过程中,以 c0,0c_{0,0} 为例,它其实发生了一个微妙的变化:原来 2×22\times 2 的矩阵中,这个 0/10/1 是长度为 11 的二进制数,扩张之后是长度为 22 的二进制数。这在当前讨论的与或异或卷积的体现不是很明显,但是在下面的一般化的公式中这就有区别了。

下面是洛谷模板题的代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
void OR(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1) {
for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
// i: block start, j: block pointer
A[i+j+half] = (A[i+j+half] + A[i+j]) % P;
}
}
}
}

void IOR(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1) {
for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
A[i+j+half] = (A[i+j+half] - A[i+j] + P) % P;
}
}
}
}

void AND(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1) {
for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
A[i+j] = (A[i+j] + A[i+j+half]) % P;
}
}
}
}

void IAND(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1) {
for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
A[i+j] = (A[i+j] - A[i+j+half] + P) % P;
}
}
}
}

void XOR(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1)
for (int i = 0; i < 1 << n; i += len)
for (int j = 0; j < half; j++) {
int A0 = A[i+j], A1 = A[i+j+half];
A[i+j] = (A0 + A1) % P;
A[i+j+half] = (A0 - A1) % P;
if (A[i+j+half] < 0) A[i+j+half] += P;
}
}

void IXOR(int A[], int n) {
for (int len = 2, half = 1; len <= 1 << n; len <<= 1, half <<= 1)
for (int i = 0; i < 1 << n; i += len)
for (int j = 0; j < half; j++) {
int A0 = A[i+j], A1 = A[i+j+half];
A[i+j] = 1ll * (A0 + A1) * INV2 % P;
A[i+j+half] = 1ll * (A0 - A1) * INV2 % P;
if (A[i+j+half] < 0) A[i+j+half] += P;
}
}

一般化

这里给出一道题目:Belarusian State University

对于任意给定真值表的二进制位独立的运算 \circ,我们还能导出上面的结论吗?答案是否定的。

例如,如果 jk1j\circ k\equiv 1(恒等于 11),那么可以推出 ci,j1c_{i,j}\equiv 1,此时的矩阵显然就是不可逆的了。

此时我们退而求其次,可以尝试找三种变换 F,G,H\mathcal F,\mathcal G,\mathcal H,满足:

H(AB)=F(A)G(B)\mathcal H(A*B)=\mathcal F(A)\cdot \mathcal G(B)

其中 H\mathcal H 是可逆的变换,F,G\mathcal F,\mathcal G 可以不可逆。

H\mathcal H 对应的系数是 h0/1,0/1h_{0/1,0/1},由上一节的推导同理可得 hi,jk=fi,jgi,kh_{i,j\circ k}=f_{i,j}g_{i,k}

由于运算一共有 242^4 种,然后 f,g,hf,g,h 的取值一共有 (34)3(3^4)^3 种,这是可以暴力寻找的。

下面是打表程序:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
#include <bits/stdc++.h>
using namespace std;

int pow3[5] = {
1, 3, 9, 27, 81,
};

int calc(int rule, int a, int b) {
return rule >> ((a << 1) | b) & 1;
}

bool validate(int rule, int f, int g, int h) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
for (int k = 0; k < 2; k++) {
// f(i, j) g(i, k) = h(i, j op k)
int ij = (i << 1) | j;
int ik = (i << 1) | k;
int ijk = (i << 1) | calc(rule, j, k);
if (
(f / pow3[ij] % 3 - 1) * (g / pow3[ik] % 3 - 1) !=
(h / pow3[ijk] % 3 - 1)
) {
return false;
}
}
}
}
return true;
}

array<tuple<int,int,int>, 16> init() {
array<tuple<int,int,int>, 16> res{};
for (int rule = 0; rule < 16; rule++) {
for (int h = 0; h < pow3[4]; h++) {
// validate whether h is invertible
int h00 = h % 3 - 1;
int h01 = h / 3 % 3 - 1;
int h10 = h / 9 % 3 - 1;
int h11 = h / 27 % 3 - 1;
if (h00 * h11 - h01 * h10 == 0) continue;

for (int f = 0; f < pow3[4]; f++) {
for (int g = 0; g < pow3[4]; g++) {
if (validate(rule, f, g, h)) {
res[rule] = {f, g, h};
}
}
}
}
}
return res;
}

signed main() {
ios::sync_with_stdio(false); cin.tie(0); cout.tie(0);
auto tab = init();
for (auto [f, g, h]: tab) {
cout << '{' << f << ',' << g << ',' << h << "},";
}
return 0;
}

用这个打出来的表,就可以来做自定义运算的 FWT 了。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
#include <bits/stdc++.h>
using namespace std;

#define int long long
tuple<int,int,int> fgh[] = {{80,76,79},{77,77,79},{77,79,79},{77,80,79},{79,77,79},{80,77,79},{78,74,78},{79,79,77},{79,79,79},{78,78,78},{80,79,79},{79,77,77},{79,80,79},{77,79,77},{77,77,77},{80,80,79}};
int rule[20];
int A[1 << 18], B[1 << 18];

template<size_t F>
void FWT(int n, int A[]) {
for (int len = 2, half = 1, k = 1; len <= 1 << n; len <<= 1, half <<= 1, k++) {
int f = get<F>(fgh[rule[k]]);
int f00 = f % 3 - 1;
int f01 = f / 3 % 3 - 1;
int f10 = f / 9 % 3 - 1;
int f11 = f / 27 % 3 - 1;
for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
int A0 = A[i+j], A1 = A[i+j+half];
A[i+j] = f00 * A0 + f01 * A1;
A[i+j+half] = f10 * A0 + f11 * A1;
}
}
}
}

void IFWT(int n, int A[]) {
for (int len = 2, half = 1, k = 1; len <= 1 << n; len <<= 1, half <<= 1, k++) {
int h = get<2>(fgh[rule[k]]);
int h00 = h % 3 - 1;
int h01 = h / 3 % 3 - 1;
int h10 = h / 9 % 3 - 1;
int h11 = h / 27 % 3 - 1;

int div = h00 * h11 - h01 * h10;
int r00 = h11, r01 = -h01, r10 = -h10, r11 = h00;

for (int i = 0; i < 1 << n; i += len) {
for (int j = 0; j < half; j++) {
int A0 = A[i+j], A1 = A[i+j+half];
A[i+j] = r00 * A0 + r01 * A1;
A[i+j+half] = r10 * A0 + r11 * A1;
assert(A[i+j] % div == 0);
A[i+j] /= div;
assert(A[i+j+half] % div == 0);
A[i+j+half] /= div;
}
}
}
}

signed main() {
ios::sync_with_stdio(false); cin.tie(0); cout.tie(0);
int n;
cin >> n;
for (int k = 1; k <= n; k++) {
string s;
cin >> s;
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
rule[k] ^= (s[(i << 1) | j] - '0') << ((i << 1) | j);
}
}
}
for (int i = 0; i < 1 << n; i++) cin >> A[i];
FWT<0>(n, A);
for (int i = 0; i < 1 << n; i++) cin >> B[i];
FWT<1>(n, B);

for (int i = 0; i < 1 << n; i++) A[i] = A[i] * B[i];
IFWT(n, A);
for (int i = 0; i < 1 << n; i++) {
cout << A[i] << " \n"[i == (1 << n) - 1];
}

return 0;
}