311073: CF1930D2. Sum over all Substrings (Hard Version)
Description
This is the hard version of the problem. The only difference between the two versions is the constraint on $t$ and $n$. You can make hacks only if both versions of the problem are solved.
For a binary$^\dagger$ pattern $p$ and a binary string $q$, both of length $m$, $q$ is called $p$-good if for every $i$ ($1 \leq i \leq m$), there exist indices $l$ and $r$ such that:
- $1 \leq l \leq i \leq r \leq m$, and
- $p_i$ is a mode$^\ddagger$ of the string $q_l q_{l+1} \ldots q_{r}$.
For a pattern $p$, let $f(p)$ be the minimum possible number of $\mathtt{1}$s in a $p$-good binary string (of the same length as the pattern).
You are given a binary string $s$ of size $n$. Find $$\sum_{i=1}^{n} \sum_{j=i}^{n} f(s_i s_{i+1} \ldots s_j).$$ In other words, you need to sum the values of $f$ over all $\frac{n(n+1)}{2}$ substrings of $s$.
$^\dagger$ A binary pattern is a string that only consists of characters $\mathtt{0}$ and $\mathtt{1}$.
$^\ddagger$ Character $c$ is a mode of string $t$ of length $m$ if the number of occurrences of $c$ in $t$ is at least $\lceil \frac{m}{2} \rceil$. For example, $\mathtt{0}$ is a mode of $\mathtt{010}$, $\mathtt{1}$ is not a mode of $\mathtt{010}$, and both $\mathtt{0}$ and $\mathtt{1}$ are modes of $\mathtt{011010}$.
InputEach test contains multiple test cases. The first line contains the number of test cases $t$ ($1 \le t \le 10^5$) — the number of test cases. The description of the test cases follows.
The first line of each test case contains a single integer $n$ ($1 \le n \le 10^6$) — the length of the binary string $s$.
The second line of each test case contains a binary string $s$ of length $n$ consisting of only characters $\mathtt{0}$ and $\mathtt{1}$.
It is guaranteed that the sum of $n$ over all test cases does not exceed $10^6$.
OutputFor each test case, output the sum of values of $f$ over all substrings of $s$.
ExampleInput4 1 1 2 10 5 00000 20 11110110000000111111Output
1 2 0 346Note
In the first test case, the only $\mathtt{1}$-good string is $\mathtt{1}$. Thus, $f(\mathtt{1})=1$.
In the second test case, $f(\mathtt{10})=1$ because $\mathtt{01}$ is $\mathtt{10}$-good, and $\mathtt{00}$ is not $\mathtt{10}$-good. Thus, the answer is $f(\mathtt{1})+f(\mathtt{10})+f(\mathtt{0}) = 1 + 1 + 0 = 2$.
In the third test case, $f$ equals to $0$ for all $1 \leq i \leq j \leq 5$. Thus, the answer is $0$.
Output
这是一个难题的较难版本。两个版本之间的唯一区别是关于t和n的限制。只有当两个版本的问题都解决时,才能进行破解。
对于一个二进制模式p和长度为m的二进制字符串q,如果对于任意的i(1≤i≤m),存在索引l和r使得:
1. 1≤l≤i≤r≤m,并且
2. pi是字符串qlql+1…qr的众数。
那么称q是p-好的。
对于一个模式p,设fp为在p-好的二进制字符串(与模式长度相同)中最少可能的1的数量。
给你一个长度为n的二进制字符串s。求∑i=1n∑j=if(sisj)。换句话说,你需要对所有n(n+1)/2个s的子字符串的f值求和。
输入输出数据格式:
输入:
每个测试包含多个测试用例。第一行包含测试用例数t(1≤t≤10^5)。
每个测试用例的第一行包含一个整数n(1≤n≤10^6)。
每个测试用例的第二行包含一个长度为n,仅由字符0和1组成的二进制字符串s。
保证所有测试用例的n之和不超过10^6。
输出:
对于每个测试用例,输出s的所有子字符串的f值之和。题目大意: 这是一个难题的较难版本。两个版本之间的唯一区别是关于t和n的限制。只有当两个版本的问题都解决时,才能进行破解。 对于一个二进制模式p和长度为m的二进制字符串q,如果对于任意的i(1≤i≤m),存在索引l和r使得: 1. 1≤l≤i≤r≤m,并且 2. pi是字符串qlql+1…qr的众数。 那么称q是p-好的。 对于一个模式p,设fp为在p-好的二进制字符串(与模式长度相同)中最少可能的1的数量。 给你一个长度为n的二进制字符串s。求∑i=1n∑j=if(sisj)。换句话说,你需要对所有n(n+1)/2个s的子字符串的f值求和。 输入输出数据格式: 输入: 每个测试包含多个测试用例。第一行包含测试用例数t(1≤t≤10^5)。 每个测试用例的第一行包含一个整数n(1≤n≤10^6)。 每个测试用例的第二行包含一个长度为n,仅由字符0和1组成的二进制字符串s。 保证所有测试用例的n之和不超过10^6。 输出: 对于每个测试用例,输出s的所有子字符串的f值之和。