Proof Complexity Meets Algebra

Albert Atserias
Universitat Politècnica de Catalunya
Barcelona, Catalonia
  Joanna Ochremiak
Université Paris Diderot
Paris, France
Abstract

We analyse how the standard reductions between constraint satisfaction problems affect their proof complexity. We show that, for the most studied propositional, algebraic, and semi-algebraic proof systems, the classical constructions of pp-interpretability, homomorphic equivalence and addition of constants to a core preserve the proof complexity of the CSP. As a result, for those proof systems, the classes of constraint languages for which small unsatisfiability certificates exist can be characterised algebraically. We illustrate our results by a gap theorem saying that a constraint language either has resolution refutations of constant width, or does not have bounded-depth Frege refutations of subexponential size. The former holds exactly for the widely studied class of constraint languages of bounded width. This class is also known to coincide with the class of languages with refutations of sublinear degree in Sums-of-Squares and Polynomial Calculus over the real-field, for which we provide alternative proofs. We then ask for the existence of a natural proof system with good behaviour with respect to reductions and simultaneously small size refutations beyond bounded width. We give an example of such a proof system by showing that bounded-degree Lovász-Schrijver satisfies both requirements. Finally, building on the known lower bounds, we demonstrate the applicability of the method of reducibilities and construct new explicit hard instances of the graph 3-coloring problem for all studied proof systems.

1 Introduction

The notion of an efficient reduction lies at the heart of computational complexity. However, in some of its subareas such as proof complexity, even though the concept exists, it is much less developed. The study of the lengths of proofs has developed mostly by studying combinatorial statements, each somewhat in isolation. There is little theory, for instance, explaining why the best studied families of propositional tautologies are encodings of the pigeonhole principle or those derived from systems of linear equations over the 2-element field. Whether there is any connection between the two is an even less explored mystery.

Luckily this fact is subject to revision, especially if proof complexity exports its methods to the study of problems beyond universal combinatorial statements. Consider the NP-hard optimization problem called MAX-CUT. The objective is to find a partition of the vertices of a given graph which maximizes the number of edges that cross the partition. The best efficient approximation algorithm known for this problem relies on certifying a bound on the optimum of its semidefinite programming relaxation. Once the certificate for the relaxation is in place, a rounding procedure gives an approximate integral solution: at worst 87% of the optimum in this case [27].

In the example of the previous paragraph, the problem that is subject to proof complexity analysis is that of certifying a bound on the optimum of an arbitrary MAX-CUT instance. The celebrated Unique Games Conjecture (UGC) can be understood as a successful approach to explaining why current algorithms and proof complexity analyses stop being successful where they do, and reductions play an important role there [49]. One of the interesting open problems in this area is whether the analysis of the Sums-of-Squares semidefinite programming hierarchy of proof systems (SOS) could be used to improve over the 87% approximation ratio for MAX-CUT. Any improvement on this would improve the approximation status of all problems that reduce to it, and refute the UGC [34]. For the constraint satisfaction problem, in which all constraints must be satisfied, as well as for its optimisation version, the analogous question was resolved recently, also by exploiting the theory of reducibility: in that arena, low-degree SOS unsatisfiability proofs exist only for problems of bounded width [47, 25].

The goal of this paper is to develop the standard theory of reductions between constraint satisfaction problems in a way that it applies to many of the proof systems from the literature, including but not limited to Sums-of-Squares. Doing this requires a good amount of tedious work, but at the same time has some surprises to offer that we discuss next.

Consider a constraint language $B$ given by a finite domain of values, and relations over that domain. The instances of the constraint satisfaction problem (CSP) over $B$ are given by a set of variables and a set of constraints, each of which binds some tuple of the variables to take values in one of the relations of $B$. The literature on CSPs has focussed on three different types of conditions that, if met by two constraint languages, give a reduction from the CSP of one language to the CSP of the other. These conditions are a) pp-interpretability, b) homomorphic equivalence, and c) addition of constants to the core (see [21, 14]). What makes these three types of reductions important is that they correspond to classical algebraic constructions at the level of the algebras of polymorphisms of the constraint languages. Indeed, pp-interpretations correspond to taking homomorphic images, subalgebras and powers. The other two types of reductions put together ensure that the algebra of the constraint language is idempotent. Thus, for any fixed algorithm, heuristic, or method $\mathcal{M}$ for deciding the satisfiability of CSPs, if the class of constraint languages that are solvable by $\mathcal{M}$ is closed under these notions of reducibility, then this class admits a purely algebraic characterization in terms of identities.

Our first result is that, for most proof systems $P$ in the literature, each of these methods of reduction preserves the proof complexity of the problem with respect to proofs in $P$. Technically, what this means is that if $B'$ is obtained from $B$ by a finite number of constructions a), b) and c), then, for any appropriate encoding scheme of the statement that an instance is unsatisfiable, efficient proofs of unsatisfiability in $P$ for instances of $B$ translate into efficient proofs of unsatisfiability in $P$ for instances of $B'$. Our results hold for a very general definition of an appropriate encoding scheme that we call local. The propositional proof systems for which we prove these results include DNF-resolution with terms of bounded size, Bounded-Depth Frege, and (unrestricted) Frege. The algebraic and semi-algebraic proof systems for which we prove it include Polynomial Calculus (PC) over any field, Sherali-Adams (SA), Lasserre/SOS, and Lovász-Schrijver (LS) of bounded and unbounded degree. This is the object of Section 4.

Our second main result is an application: we obtain unconditional gap theorems for the proof complexity of CSPs. Building on the bounded-width theorem for CSPs [12, 19], the known correspondence between local consistency algorithms, existential pebble games and bounded width resolution [35, 7], the lower bounds for propositional, algebraic and semi-algebraic proof systems [1, 37, 16, 17, 28, 22, 23], and a modest amount of additional work to fill in the gaps, we prove the following strong gap theorem:

Theorem 1.

Let $B$ be a finite constraint language. Then, exactly one of the following holds:

  1. $B$ has resolution refutations of constant width,

  2. $B$ has neither bounded-depth Frege refutations of subexponential size, nor PC over the reals, nor SOS refutations of sublinear degree.

In Theorem 1 and below, the statement that the constraint language $B$ has efficient proofs in proof system $P$ means that, for some and hence every local encoding scheme, all unsatisfiable instances of $B$ have efficient refutations in $P$. Also, here and below, sublinear means $o(n)$, sublinear-exponential means $2^{o(n)}$, and subexponential means $2^{n^{o(1)}}$, where $n$ is the number of variables of the instance.

The proof of Theorem 1 actually shows that case 1 happens precisely if $B$ has bounded width. As noted earlier, the collapse of Lasserre/SOS to bounded width was already known; here we give a different proof. By a very recent result on the simulation of Polynomial Calculus over the real-field by Lasserre/SOS [18], the collapse of Lasserre/SOS implies the collapse of Polynomial Calculus. The proof we present does not depend on that. Instead we exploit directly the theory of reducibility.

As an immediate corollary we get that resolution is also captured by algebra, despite the fact that our methods fall short of proving that it is closed under reductions.

Corollary 1.

Let $B$ be a finite constraint language. The following are equivalent:

  1. $B$ has bounded width,

  2. $B$ has resolution refutations of constant width,

  3. $B$ has resolution refutations of sublinear width,

  4. $B$ has resolution refutations of polynomial size,

  5. $B$ has resolution refutations of sublinear-exponential size,

  6. $B$ has Frege refutations of bounded depth and polynomial size,

  7. $B$ has Frege refutations of bounded depth and subexponential size,

  8. $B$ has SA, SOS, and PC refutations over the reals of constant degree,

  9. $B$ has SA, SOS, and PC refutations over the reals of sublinear degree.

The proof of this is the object of Sections 5 and 6.

Section 7 is about proof systems that operate with polynomial inequalities and that are stronger than Lasserre/SOS. Theorem 1 raises the question of identifying a proof system that is closed under reducibilities and that can surpass bounded width. In other words: is there a natural proof system for which the class of languages that have efficient unsatisfiability proofs is closed under the standard reducibility methods for CSPs, and that at the same time has efficient unsatisfiability proofs beyond bounded width? By the bounded-width theorem for CSPs, one way, and indeed the only way, of surpassing bounded width is by having efficient proofs of unsatisfiability for systems of linear equations over some finite Abelian group. A straightforward answer to our question is thus the following: Polynomial Calculus over a field of non-zero characteristic $p$ has efficient unsatisfiability proofs for systems of linear equations over $\mathbb{Z}_p$. On the other hand, in view of the limitations of Polynomial Calculus over the real-field, and of certain semi-algebraic proof systems that are imposed by Theorem 1, it is perhaps a surprise that, as we show, bounded degree Lovász-Schrijver also satisfies both requirements.

Theorem 2.

Unsatisfiable systems of linear equations over the 2-element group have LS refutations of bounded degree and polynomial size.

Proving this amounts to showing that Gaussian elimination over $\mathbb{Z}_2$ can be simulated by reasoning with low-degree polynomial inequalities over $\mathbb{R}$. The proof of this counter-intuitive fact relies on earlier work in proof complexity for reasoning about gaps of the type $(-\infty,c]\cup[c+1,+\infty)$, for $c\in\mathbb{Z}$, through quadratic polynomial inequalities [30].
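For orientation, the procedure being simulated can be sketched in a few lines of Python. This is only plain Gaussian elimination over the 2-element field, not the LS simulation itself; the function name `unsatisfiable_over_z2` and the encoding of an equation as a pair (set of variable indices, right-hand side bit) are assumptions made for this illustration.

```python
def unsatisfiable_over_z2(equations):
    """Decide whether a system of linear equations over Z_2 is unsatisfiable.

    Each equation is a pair (vars, rhs): the sum of the variables with
    indices in `vars` must equal the bit `rhs` modulo 2.
    (Hypothetical encoding, chosen for this sketch.)
    """
    pivots = {}  # pivot index -> stored equation whose largest variable is that index
    for vars_, rhs in equations:
        vars_ = set(vars_)
        while vars_:
            p = max(vars_)
            if p not in pivots:
                pivots[p] = (frozenset(vars_), rhs)
                break
            pivot_vars, pivot_rhs = pivots[p]
            # Adding two equations modulo 2 is a symmetric difference
            # on the variable sets and an XOR on the right-hand sides.
            vars_ ^= pivot_vars
            rhs ^= pivot_rhs
        else:
            # All variables cancelled: the equation reads 0 = rhs.
            if rhs == 1:
                return True
    return False


# x0 + x1 = 1, x1 + x2 = 1, x0 + x2 = 1 sums to 0 = 1, so it has no solution.
odd_cycle = [({0, 1}, 1), ({1, 2}, 1), ({0, 2}, 1)]
# Changing the last right-hand side makes the system satisfiable.
consistent = [({0, 1}, 1), ({1, 2}, 1), ({0, 2}, 0)]
```

Each derived equation is a sum of input equations, which is the step an LS refutation would have to mirror with polynomial inequalities.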

It should be pointed out that another proof system that can efficiently solve CSPs of bounded width, and that at the same time goes beyond bounded width, is the proof system that operates with ordered binary decision diagrams from [8]. Although it looks unlikely that our methods could be used for this proof system, whether it is closed under the standard CSP reductions is something that was not checked, neither in [8], nor here.

In Section 8 we demonstrate the applicability of our results. Consider the graph 3-coloring problem seen as the CSP of a finite constraint language on a 3-element domain in the standard way. Since it is known that 3-coloring has unbounded width, Corollary 1 applies to it, and we get 3-coloring instances that are hard for all indicated proof systems. We open the box of the method, and elaborate on that, in order to get explicit 3-coloring instances that are hard for Polynomial Calculus over all fields simultaneously. This gives a new proof of the main result in [40]. Indeed, the same analysis applies to all CSPs that are NP-complete and all proof systems that are closed under reducibilities. This way we solve Open Problem 5.3 in [40] that asks for explicit 3-coloring instances that are hard for Lasserre/SOS.

This article is an extended version of [10]. Besides providing full proof details, we generalise the main gap theorem to cover Polynomial Calculus over the reals and apply our results to the 3-coloring problem, as explained in the paragraph above.

2 Preliminaries

2.1 Propositional logic and propositional proofs

Formulas.

Fix a set of propositional variables taking values true or false. A literal is a variable $X$ or the negation of a variable $\overline{X}$. We write propositional formulas out of literals using conjunctions $\wedge$, disjunctions $\vee$, and parentheses, with the usual conventions on parentheses. Also we implicitly think of $\wedge$ and $\vee$ as commutative, associative and idempotent. Thus the formula $A\wedge A$ is viewed literally the same as $A$, the formula $A\wedge B$ is viewed literally the same as $B\wedge A$, and the formula $(A\wedge B)\wedge C$ is viewed literally the same as $A\wedge(B\wedge C)$. The same applies to disjunctions. Negation is allowed only at the level of literals, so our formulas are written in negation normal form. If $A$ is a formula, we define its complement $\overline{A}$ inductively: if $A$ is a variable $X$, then $\overline{A}=\overline{X}$; if $A$ is a negated variable $\overline{X}$, then $\overline{A}=X$; if $A$ is a conjunction $C\wedge D$, then $\overline{A}=\overline{C}\vee\overline{D}$; if $A$ is a disjunction $C\vee D$, then $\overline{A}=\overline{C}\wedge\overline{D}$. The empty formula is denoted $0$ and is always false by convention. Its complement $\overline{0}$ is denoted $1$, and is always true by convention. We think of $0$ and $1$ as the neutral elements of $\vee$ and $\wedge$, respectively, and the absorbing elements of $\wedge$ and $\vee$, respectively. Thus we view the formulas $0\vee A$ and $1\wedge A$ as literally the same as $A$, and $0\wedge A$ and $1\vee A$ as literally the same as $0$ and $1$, respectively. The size $s(A)$ of a formula $A$ is defined inductively: if $A$ is $0$ or $1$, then $s(A)=0$; if $A$ is a literal, then $s(A)=1$; if $A$ is a conjunction $C\wedge D$ or a disjunction $C\vee D$ with non-absorbing and non-neutral $C$ and $D$, then $s(A)=s(C)+s(D)+1$.
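The inductive definitions of complement and size can be made concrete with a small Python sketch. The class names (`Const`, `Lit`, `And`, `Or`) and the helpers `conj` and `disj` are assumptions introduced for illustration. The sketch applies the neutral and absorbing conventions for 0 and 1, but it does not identify formulas up to commutativity, associativity or idempotence.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int  # 0 is the empty (false) formula, 1 is its complement


@dataclass(frozen=True)
class Lit:
    name: str
    positive: bool = True  # False encodes the negated variable


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


Formula = Union[Const, Lit, And, Or]
ZERO, ONE = Const(0), Const(1)


def conj(c, d):
    """Conjunction with 1 as neutral element and 0 as absorbing element."""
    if c == ZERO or d == ZERO:
        return ZERO
    if c == ONE:
        return d
    if d == ONE:
        return c
    return And(c, d)


def disj(c, d):
    """Disjunction with 0 as neutral element and 1 as absorbing element."""
    if c == ONE or d == ONE:
        return ONE
    if c == ZERO:
        return d
    if d == ZERO:
        return c
    return Or(c, d)


def complement(a):
    """Complement of a formula in negation normal form (De Morgan at each node)."""
    if isinstance(a, Const):
        return Const(1 - a.value)
    if isinstance(a, Lit):
        return Lit(a.name, not a.positive)
    if isinstance(a, And):
        return disj(complement(a.left), complement(a.right))
    return conj(complement(a.left), complement(a.right))


def size(a):
    """Size s(A): 0 for constants, 1 for literals, s(C) + s(D) + 1 otherwise."""
    if isinstance(a, Const):
        return 0
    if isinstance(a, Lit):
        return 1
    return size(a.left) + size(a.right) + 1


x, y = Lit("X"), Lit("Y")
a = conj(x, disj(y, ZERO))  # the neutral 0 disappears, leaving X AND Y
```

Complementation is an involution here and leaves the size unchanged, since it only swaps connectives and flips literals.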

Propositional proof systems.

We work with a Tait-style proof system for propositional logic that we call Frege. The system manipulates formulas in negation normal form and has the following four rules of inference called axiom, cut, introduction of conjunction, and weakening:

$$\frac{}{A\vee\overline{A}}\qquad\frac{C\vee A\quad D\vee\overline{A}}{C\vee D}\qquad\frac{C\vee A\quad D\vee B}{C\vee D\vee(A\wedge B)}\qquad\frac{C}{C\vee D}. \tag{1}$$

In these rules, $C$ and $D$ could be the empty formula $0$ or its complement $1$. In particular $1$ is an instance of an axiom rule. A Frege proof is called cut-free if it does not use the cut rule. A Frege proof from a set of formulas $F$ is a proof in which the formulas in $F$ are allowed as additional axioms. In case such a proof ends with the empty formula we call it a Frege refutation of $F$. As a proof system, Frege is sound and implicationally complete, which means that if $A$ is a logical consequence of $A_1,\ldots,A_m$, then there is a Frege proof of $A$ from $A_1,\ldots,A_m$. We will give a proof of this in Section 2.2 that will apply also to certain subsystems of Frege. If $\mathcal{C}$ is a class of formulas, a $\mathcal{C}$-Frege proof is one that has all its formulas in the class $\mathcal{C}$. The size of a proof is the sum of the sizes of the formulas in it. The length of a proof is the number of formulas in it.

Resolution, $k$-DNF Frege and Bounded Depth Frege.

A term is a conjunction of literals and a clause is a disjunction of literals. A $k$-term or a $k$-clause is one with at most $k$ literals. A $k$-DNF is a disjunction of $k$-terms and a $k$-CNF is a conjunction of $k$-clauses.

We define the classes of $\Sigma_{t,k}$- and $\Pi_{t,k}$-formulas inductively. For $t=1$, these are just the classes of $k$-DNF and $k$-CNF formulas, respectively. For $t\geq 2$, a formula is $\Sigma_{t,k}$ if it is a disjunction of $\Pi_{t-1,k}$-formulas, and it is $\Pi_{t,k}$ if it is a conjunction of $\Sigma_{t-1,k}$-formulas.

In this paper, we use the expression Frege proof of depth $t$ and bottom fan-in $k$ to mean a $\Sigma_{t,k}$-Frege proof. Bounded-depth Frege means $\Sigma_{t,k}$-Frege for some fixed $t$ and $k$. This coincides with other definitions in the literature. Frege of depth $t$ and bottom fan-in $k$, as a proof system, is sound and implicationally complete for proving $\Sigma_{t,k}$-formulas from $\Sigma_{t,k}$-formulas. A proof of this will follow from the general completeness theorem below.

$\Sigma_{1,1}$ is the class of clauses. It is well-known that $\Sigma_{1,1}$-Frege and resolution proofs are basically the same thing (the difference is that in $\Sigma_{1,1}$-Frege proofs we allow clause axioms and weakening, but these can always be removed at no cost). A resolution proof which uses only $l$-clauses is called a proof of width $l$.
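To make the notions of resolution step and width concrete, here is a minimal Python sketch. It assumes a clause is encoded as a frozenset of (variable name, polarity) pairs; the helper names `resolve` and `width` are hypothetical and chosen only for this illustration. The resolution step is the cut rule from (1) restricted to clauses.

```python
def resolve(c1, c2, var):
    """Cut on `var`: from C or var and D or not-var, derive C or D."""
    pos, neg = (var, True), (var, False)
    if pos not in c1 or neg not in c2:
        raise ValueError("c1 must contain var and c2 must contain its negation")
    return (c1 - {pos}) | (c2 - {neg})


def width(proof):
    """Largest number of literals in any clause of the proof."""
    return max(len(clause) for clause in proof)


# Refutation of the clauses X, (not X or Y), (not Y).
c1 = frozenset({("X", True)})
c2 = frozenset({("X", False), ("Y", True)})
c3 = frozenset({("Y", False)})
r1 = resolve(c1, c2, "X")  # the unit clause Y
r2 = resolve(r1, c3, "Y")  # the empty clause, i.e. the empty formula 0
refutation = [c1, c2, c3, r1, r2]
```

In this toy refutation every clause has at most two literals, so it is a resolution refutation of width 2 in the sense defined above.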

$\Sigma_{1,k}$-Frege, for $k\geq 2$, is the system $\mathrm{R}(k)$ introduced by Krajíček [36], also known as $\mathrm{Res}(k)$, $k$-DNF resolution, and $k$-DNF Frege. This family of proof systems is important for us because, by letting $k$ range over all constants (i.e., by considering $\mathrm{R}(\mathrm{const})$), it is the weakest for which we can prove closure under reductions.

2.2 Completeness of Frege and its subsystems

The proof that Frege is implicationally complete is rather standard. We give a detailed proof nonetheless because we want to have concrete bounds.

Theorem 3 (Quantitative Completeness).

Let $\mathcal{C}$ be a class of formulas that is closed under subformulas and complementation, and let $\mathcal{C}'$ be the closure of $\mathcal{C}$ under disjunctions. Let $A_1,\ldots,A_m$ and $A$ be formulas in $\mathcal{C}'$. If $A_1,\ldots,A_m$ logically imply $A$, then there is a $\mathcal{C}'$-Frege proof of $A$ from $A_1,\ldots,A_m$. Moreover, if the formulas $A_1,\ldots,A_m$ and $A$ have $n$ variables and size at most $s$, then the size of the proof is at most polynomial in $n$, $s$, $m$, $2^n$ and $s^m$.

Proof.

Let $X_1,\ldots,X_n$ be the variables in $A_1,\ldots,A_m$ and $A$. For $c\in\{0,1\}$, let $X_i^{(c)}$ denote the negative literal $\overline{X_i}$ if $c=0$, and the positive literal $X_i$ if $c=1$. For a truth assignment $b=(b_1,\ldots,b_n)\in\{0,1\}^n$, let $S_b$ be the formula $\bigvee_{i=1}^{n}X_i^{(1-b_i)}$. First we show that, for each formula $B$ on the variables $X_1,\ldots,X_n$ and each truth assignment $b=(b_1,\ldots,b_n)\in\{0,1\}^n$, if $b$ satisfies $B$, then there is a cut-free proof of $S_b\vee B$ from no assumptions. This is proved by induction on the size of $B$. If $B$ is a literal, say $B=X_i$ or $B=\overline{X_i}$, then $S_b\vee B$ is obtained as the weakening of the axiom $X_i^{(1-b_i)}\vee X_i^{(b_i)}$. If $B$ is a conjunction, say $B=C\wedge D$, then $b$ satisfies both $C$ and $D$, and by induction hypothesis there are cut-free proofs of $S_b\vee C$ and $S_b\vee D$. A cut-free proof of $S_b\vee B$ then follows from applying introduction of conjunction. If $B$ is a disjunction, say $B=C\vee D$, then $b$ satisfies either $C$ or $D$, and by induction hypothesis there is a cut-free proof of either $S_b\vee C$ or $S_b\vee D$. A cut-free proof of $S_b\vee B$ then follows from applying weakening. Note that the length of the proof constructed this way is bounded by $s(B)$, and since all the formulas in the proof have sizes bounded by $n+s(B)$, the size of the proof is bounded by $(n+s(B))s(B)$. Note also for later use that, as a consequence of the assumption that $\mathcal{C}$ is closed under subformulas, the following holds: if $B$ is a disjunction of formulas in $\mathcal{C}$, say $B=\bigvee_i B_i$, then each formula in this proof is a disjunction of formulas in $\mathcal{C}$, and if $B$ is a conjunction of formulas in $\mathcal{C}$, say $B=\bigwedge_i B_i$, then the construction gives a cut-free proof of $S_b\vee B_i$ for each $B_i$, and each formula in the proof of $S_b\vee B_i$ is again a disjunction of formulas in $\mathcal{C}$.

Now we assume that $A$ is a logical consequence of $A_1,\ldots,A_m$ and we build a proof of $A$ from $A_1,\ldots,A_m$. This proof will not yet be guaranteed to have all its formulas in $\mathcal{C}'$. We will deal with this issue later. For each truth assignment $b\in\{0,1\}^n$, the following hold: 1) if $b$ satisfies $A$, then the previous paragraph gives a proof of $S_b\vee A$, and 2) if $b$ falsifies $A$, then it also falsifies $A_j$ for some $j\in[m]$, and the previous paragraph gives a proof of $S_b\vee\overline{A_j}$. From these $2^n$ proofs, a sequence of $2^n-1$ cuts followed by one weakening gives a proof of $A\vee\overline{A_1}\vee\cdots\vee\overline{A_m}$. From there a sequence of $m$ cuts with the $m$ hypotheses $A_1,\ldots,A_m$ gives a proof of $A$. Finally we argue how to turn this proof into one that uses only formulas in $\mathcal{C}'$. For the proofs of the type $S_b\vee A$ there is no issue because $A$ is a disjunction of formulas in $\mathcal{C}$ and the previous paragraph argues that such proofs have all their formulas in $\mathcal{C}'$. The problem comes from the proofs of the type $S_b\vee\overline{A_j}$. However, since each $A_j$ is a disjunction of formulas in $\mathcal{C}$, say $A_j=\bigvee_{k\in I_j}A_{jk}$, its negation $\overline{A_j}$ is a conjunction of formulas in $\mathcal{C}$, because $\mathcal{C}$ is closed under complementation. This means that instead of using the proof of $S_b\vee\overline{A_j}$, we could have used the proof of $S_b\vee\overline{A_{jk}}$ for each $k\in I_j$. We do this for each choice of $(k_1,\ldots,k_m)\in I_1\times\cdots\times I_m$, and what we get are proofs of $A\vee\overline{A_{1k_1}}\vee\cdots\vee\overline{A_{mk_m}}$. These proofs now have all their formulas in $\mathcal{C}'$. Combining these at most $s^m$ many proofs with the hypotheses $A_1,\ldots,A_m$ in a sequence of at most $s^m$ cuts, we get a proof of $A$ from $A_1,\ldots,A_m$, and all the formulas in this proof are in $\mathcal{C}'$. The size is polynomial in $n$, $s$, $m$, $2^n$ and $s^m$, and the proof is complete. ∎

The quantitative completeness theorem applies to $\Sigma_{1,k}$-Frege ($k$-DNF Frege and resolution) because if $\mathcal{C}$ is the class of $k$-terms and $k$-clauses, then $\mathcal{C}$ is closed under subformulas and complementation, and the closure of $\mathcal{C}$ under disjunctions is the class of $k$-DNFs. It also applies to $\Sigma_{t,k}$-Frege, for $t\geq 2$, because the class $\Sigma_{t-1,k}\cup\Pi_{t-1,k}$ is closed under subformulas and complementation, and its closure under disjunctions is precisely $\Sigma_{t,k}$.

2.3 Polynomials and algebraic proofs

Polynomials.

We define everything for the real field $\mathbb{R}$ for simplicity. For algebraic proofs the field would not matter, but for semi-algebraic proofs we need an ordered field such as $\mathbb{R}$. Let $X_1,\ldots,X_n$ be $n$ algebraic commuting variables ranging over $\mathbb{R}$. We want to define proof systems that manipulate equations of the form $P=0$ and inequalities of the form $P\geq 0$, where $P$ is a polynomial in $\mathbb{R}[X_1,\ldots,X_n]$, the ring of polynomials with commuting variables $X_1,\ldots,X_n$ and coefficients in $\mathbb{R}$. For our purposes it will suffice to assume that the variables range over $\{0,1\}$. Accordingly, it will also be convenient to introduce twin variables $\bar{X}_1,\ldots,\bar{X}_n$ with the intended meaning that $\bar{X}_i=1-X_i$ for $i=1,\ldots,n$. In all proof systems of this section, the following axioms will be imposed on the variables:

$$X_i^2-X_i=0\qquad \bar{X}_i^2-\bar{X}_i=0\qquad X_i+\bar{X}_i-1=0. \tag{2}$$

Observe that $X_i\bar{X}_i=0$ follows from these axioms: multiply $X_i+\bar{X}_i-1=0$ by $X_i$ and subtract $X_i^2-X_i=0$. This sort of reasoning is captured by the proof systems we are about to define.
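The two-step derivation above can be checked mechanically. The following Python sketch assumes a polynomial is stored as a dictionary from monomials (sorted tuples of variable names, with `"Xb"` standing for the twin variable) to integer coefficients; this representation and the helper names `add` and `mul` are hypothetical choices made for the illustration.

```python
from collections import defaultdict


def add(p, q, scale=1):
    """Return p + scale * q, dropping monomials whose coefficient is zero."""
    result = defaultdict(int)
    for monomial, coeff in p.items():
        result[monomial] += coeff
    for monomial, coeff in q.items():
        result[monomial] += scale * coeff
    return {m: c for m, c in result.items() if c != 0}


def mul(p, q):
    """Return the product p * q, with commuting variables."""
    result = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            result[tuple(sorted(m1 + m2))] += c1 * c2
    return {m: c for m, c in result.items() if c != 0}


X = {("X",): 1}
axiom_boolean = {("X", "X"): 1, ("X",): -1}    # X^2 - X
axiom_twin = {("X",): 1, ("Xb",): 1, (): -1}   # X + Xb - 1

step1 = mul(axiom_twin, X)                     # X^2 + X*Xb - X
step2 = add(step1, axiom_boolean, scale=-1)    # subtract X^2 - X, leaving X*Xb
```

Subtraction is expressed here as adding a scalar multiple with coefficient -1, so only multiplication by a variable or scalar and addition of previously derived polynomials are used.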

Algebraic and semi-algebraic proof systems.

Let $P$ and $Q$ denote polynomials. In addition to the axioms in (2), consider the following inference rules called addition and multiplication:

$$\frac{P=0\quad Q=0}{P+Q=0}\qquad\frac{P=0}{PQ=0}. \tag{3}$$

Clearly, these rules are sound: any assignment $f:\{X_1,\ldots,X_n,\bar{X}_1,\ldots,\bar{X}_n\}\rightarrow\mathbb{R}$ that satisfies the equations in the premises also satisfies the equation in the conclusion. For semi-algebraic proofs we add the following axioms:

$$X_i\geq 0\qquad \bar{X}_i\geq 0\qquad 1-X_i\geq 0\qquad 1-\bar{X}_i\geq 0\qquad 1\geq 0. \tag{4}$$

and the following inference rules for polynomial inequalities:

$$\frac{P\geq 0\quad Q\geq 0}{P+Q\geq 0}\qquad\frac{P\geq 0\quad Q\geq 0}{PQ\geq 0}\qquad\frac{}{P^2\geq 0}. \tag{5}$$

These rules are called addition, multiplication and positivity of squares and are also sound for assignments $f:\{X_1,\ldots,X_n,\bar{X}_1,\ldots,\bar{X}_n\}\rightarrow\mathbb{R}$. One could also consider additional rules that link equalities with inequalities, such as deriving $P\geq 0$ from $P=0$, or deriving $P=0$ from $P\geq 0$ and $-P\geq 0$, but if we think of an equality as two inequalities, then they are not strictly necessary. On the other hand, some of the axioms are redundant, such as $1\geq 0$ which can be obtained from adding $X_i\geq 0$ and $1-X_i\geq 0$, but for the sake of clarity in writing proofs we prefer to keep them.

If $H$ denotes a system of polynomial equations $P_1=0,\ldots,P_r=0$ and $P=0$ is a further equation, an algebraic proof of $P=0$ from $H$ is a sequence of polynomial equations ending with $P=0$ where each equation in the proof is either a hypothesis equation from $H$, or an axiom equation as in (2), or follows from previous equations in the sequence by one of the inference rules in (3). If $H$ in addition includes a system of polynomial inequalities $Q_1\geq 0,\ldots,Q_s\geq 0$, then a semi-algebraic proof of $Q\geq 0$ from $H$ is defined analogously except that we think of each equation as two inequalities, we use additionally the axioms in (4), and we use additionally the rules in (5). Note that by writing $Q=Q^{+}-Q^{-}$, where $Q^{+}$ and $Q^{-}$ have only positive coefficients, the rules in (3) are actually easily simulated by the rules in (5) (for the multiplication rule, this uses also the axioms in (4)). If an algebraic proof ends with the equation $1=0$, or similarly if a semi-algebraic proof ends with the inequality $-1\geq 0$, we call it a refutation of $H$.

As proof systems for deriving new polynomial equations or inequalities that follow from old ones on all evaluations of their variables in $\{0,1\}$, both systems are sound and implicationally complete (we note, however, that without some restrictions on the domain of evaluation, such as $\{0,1\}$ in our case, the completeness claim is not true). In Section 2.4 below we will prove implicational completeness for two subsystems of algebraic and semi-algebraic proofs, and hence for algebraic and semi-algebraic proofs themselves.

The main complexity measures for algebraic and semi-algebraic proofs are size and degree. Size is measured by the number of symbols it takes to write the representations of the polynomials in the proofs, and degree is the maximum of the total degrees of the polynomials in the proofs. Polynomials are typically represented as explicit sums of monomials, or as algebraic formulas or circuits. Using formulas or circuits as representations requires some additional technicalities in the definitions of the rules, that we want to avoid (see [42, 29]). For all our examples below, we use the representation of an explicit sum of monomials.

Some proof systems from the literature.

The proofs in the Polynomial Calculus (PC) are algebraic proofs restricted in such a way that the polynomial $Q$ in the multiplication rule in (3) is either a scalar or a variable [24]. In the literature, this has been called PCR for PC with resolution (see [2]), due to the presence of twin variables, but in recent works the shorter original name PC is used. As pointed out earlier, algebraic proofs can be defined over arbitrary scalar-fields $F$ beyond the real-field $\mathbb{R}$. A claim about algebraic proofs in which the field is omitted is meant to hold for all fields simultaneously. Whenever we need to specify the field $F$, we speak of algebraic and PC proofs over $F$.

The proofs in the Lovász-Schrijver (LS) proof system are semi-algebraic proofs for which the following restrictions apply: 1) the polynomial $Q$ in the multiplication rule in (5) is either a positive scalar or a variable, and 2) the positivity-of-squares rule in (5) is not allowed. When the positivity-of-squares rule is also allowed, the system is called Positive Semidefinite Lovász-Schrijver and is denoted LS+. Originally the Lovász-Schrijver proof system was defined to manipulate quadratic polynomials only (see [41, 43]). We follow [30] and consider the extension to arbitrary degree. For the original Lovász-Schrijver proof systems we use LS$_2$ and LS$_2^+$. Degree-$d$ Lovász-Schrijver and degree-$d$ Positive Semidefinite Lovász-Schrijver are denoted LS$_d$ and LS$_d^+$, respectively. For LS and LS+ proofs, an important complexity measure originally studied by Lovász and Schrijver is their rank, which is the maximum nesting depth of multiplication by a variable in the proof. Note that, due to possible cancellations, the degree of an LS proof could in principle be much smaller than its rank.

We define four additional proof systems called Nullstellensatz (NS), Sherali-Adams (SA), Positive Semidefinite Sherali-Adams (SA+), and Lasserre/Sums-of-Squares (SOS). For NS, SA and SA+, we define them as the subsystems of PC, LS and LS+, respectively, in which all applications of the multiplication rule must precede all applications of the addition rule. Due to the structural restriction in which multiplications precede additions, we can think of a proof from a set $H$ of hypotheses as a static polynomial identity of the form

$$\sum_{i=1}^{r} P_i \cdot c_i \prod_{j \in J_i} X_j \prod_{k \in K_i} \bar{X}_k = P, \tag{6}$$

where $P_1, \ldots, P_r$ are polynomials that either come from the set $H$ of hypotheses, or they are axiom polynomials from the lists in (2) and (4) as appropriate (i.e., from (2) for NS, and from both (2) and (4) for SA and SA+), or are squares of polynomials when they are allowed (i.e., for SA+), and $c_1, \ldots, c_r$ are scalars of the appropriate type (i.e., arbitrary when the $P_i$ they multiply comes from an equation, or positive when the $P_i$ they multiply comes from an inequality). Finally we define the Lasserre/Sums-of-Squares proof system as the subsystem of semi-algebraic proofs to which the following restrictions apply: 1) the polynomial $Q$ is arbitrary in the multiplication rule in (3) and it is a square polynomial in the multiplication rule in (5), and 2) all multiplications precede all additions. Thus, in terms of static identities, these are proofs of the form

$$\sum_{i=1}^{r} P_i \cdot S_i = P, \tag{7}$$

where $P_1, \ldots, P_r$ are polynomials that either come from the set $H$ of hypotheses, or they are axiom polynomials from the lists (2) and (4), or they are squares, and $S_1, \ldots, S_r$ are arbitrary polynomials or square polynomials as appropriate (i.e., arbitrary if the $P_i$ they multiply comes from an equation, and squares if the $P_i$ they multiply comes from an inequality). Note that the size of an NS, SA, SA+ or SOS proof is polynomially related to the sum of the sizes of the non-zero $c_i$'s and $S_i$'s in the corresponding static identities (6) and (7). Non-static proofs are sometimes called dynamic [30]. We will avoid using this term here.
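To make the static view concrete, here is a minimal sketch that checks a candidate NS refutation as a polynomial identity of the form (6). The toy system $\{XY - 1 = 0,\ X = 0\}$, the dictionary representation of polynomials, and all helper names are our own illustrative assumptions; the example happens not to need any axiom polynomials.

```python
from fractions import Fraction

def poly(terms):
    """Normalise {monomial: coefficient}; a monomial is a tuple of variable names."""
    out = {}
    for mono, coeff in terms.items():
        key = tuple(sorted(mono))
        out[key] = out.get(key, Fraction(0)) + Fraction(coeff)
    return {m: c for m, c in out.items() if c != 0}

def add(p, q):
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, Fraction(0)) + coeff
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            key = tuple(sorted(m1 + m2))
            out[key] = out.get(key, Fraction(0)) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def degree(p):
    return max((len(m) for m in p), default=0)

# Hypothetical toy hypotheses: X*Y - 1 = 0 and X = 0 (jointly unsatisfiable).
hypotheses = [poly({("X", "Y"): 1, (): -1}), poly({("X",): 1})]
# Multipliers c_i * (product of variables), one per hypothesis, as in (6).
multipliers = [poly({(): -1}), poly({("Y",): 1})]

lifted = [mul(P, M) for P, M in zip(hypotheses, multipliers)]
lhs = {}
for term in lifted:
    lhs = add(lhs, term)

is_refutation = lhs == poly({(): 1})          # the identity derives 1 = 0
proof_degree = max(degree(t) for t in lifted)  # degree of the static proof
```

Here the identity reads $-(XY - 1) + X \cdot Y = 1$, so the two lifted hypotheses sum to the constant polynomial $1$.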

We close this section by noting the relationships between these proof systems. Clearly, every NS proof of degree $d$ is also a PC proof of degree $d$. The converse is certainly not true, but what is true is that every PC proof of degree $d$ and rank $k$ can be converted into an NS proof of degree $d + k$, where the rank of a PC proof is the analogue of the rank measure for LS proofs that we defined earlier. The same relationships hold between SA and LS, and SA+ and LS+. In all three cases, the conversions go by swapping the order in which the addition and the multiplication rules are applied, when they appear in the wrong order. Also, every NS proof over the reals is an SA proof, which is an SA+ proof. Finally, thanks to the axioms (2), each SA+ proof can be easily converted to an SOS proof of twice the degree: replace each multiplication by a variable $X$ by a multiplication by $X^2$, and subtract the appropriate multiple of the axiom $X^2 - X = 0$ to effectively simulate the multiplication by $X$. See [39] for a related discussion.

Discussion on variants of NS, SA, SA+ and SOS.

The polynomial identity interpretations of NS, SA, SA+ and SOS, cf. (6) and (7), are closely related to the original definitions by Beame et al. [15] for NS, and the settings of Sherali and Adams [46] and Lasserre [38] for SA and SOS, respectively. In most incarnations of these proof systems the twin variables are not present; in some others they are (e.g., [9]). If we care only about degree, the presence of twin variables makes no difference at all for Nullstellensatz since we can always simulate a multiplication by $\bar{X}_i$ by subtracting a multiplication by $X_i$. Note, however, that this blows up the size exponentially in the degree. In order to make sense of Sherali-Adams without twin variables, we need to extend the definition to allow $Q$ in the multiplication rule to be, besides a positive scalar or a variable $X_i$, a linear polynomial of the form $1 - X_i$. The static form of such a proof is an identity such as

$$\sum_{i=1}^{r} P'_i \cdot c_i \prod_{j \in J_i} X_j \prod_{k \in K_i} (1 - X_k) = P', \tag{8}$$

where $P'_1, \ldots, P'_r$ and $P'$ are polynomials as in (6), but without twin variables. If $P'_1, \ldots, P'_r$ and $P'$ denote the polynomials over $X_1, \ldots, X_n$ that result from the polynomials $P_1, \ldots, P_r$ and $P$ over $X_1, \ldots, X_n, \bar{X}_1, \ldots, \bar{X}_n$ when each twin variable $\bar{X}_i$ is replaced by $1 - X_i$, then any valid proof with twin variables as in (6) transforms into a valid proof without twin variables as in (8). Thus, if we care only about degree, the versions of Sherali-Adams and Positive Semidefinite Sherali-Adams without twin variables simulate the versions with twin variables, for polynomials without twin variables. As for Nullstellensatz, the size could blow up exponentially in the degree. The same facts are true for Sums-of-Squares.
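The exponential size blow-up mentioned above can be seen directly when polynomials are written as explicit sums of monomials: replacing each twin variable $\bar{X}_k$ by $1 - X_k$ in a product over $K$ yields $2^{|K|}$ monomials. The following small sketch illustrates this; the function name and the representation of a multilinear monomial as a frozenset of indices are our own choices.

```python
def expand_complements(K):
    """Expand prod_{k in K} (1 - X_k) into {monomial: coefficient}.

    A multilinear monomial is represented by the frozenset of its variable indices.
    """
    p = {frozenset(): 1}
    for k in K:
        q = dict(p)                      # the part multiplied by 1
        for mono, coeff in p.items():
            q[mono | {k}] = -coeff       # the part multiplied by -X_k
        p = q
    return p

# A single monomial of degree 4 in twin variables becomes 2^4 explicit monomials.
expansion = expand_complements(range(4))
num_monomials = len(expansion)
```

Each subset $S \subseteq K$ contributes the monomial $\prod_{k \in S} X_k$ with coefficient $(-1)^{|S|}$, so no cancellation occurs.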

Two further comments are in order. For Nullstellensatz, one could consider an alternative definition in which proofs are polynomial identities of the form $\sum_i P_i \cdot R_i = P$, where the $P_i$ are hypotheses or axiom polynomials, and the $R_i$ are arbitrary polynomials. However this difference is minor since we can always write each $R_i$ as a combination of monomials $\sum_j c_{ij} M_{ij}$ and split $P_i \cdot R_i$ into $\sum_j P_i \cdot c_{ij} M_{ij}$. Second, one could consider the version of Sums-of-Squares in which in addition to squares $S_i$ as in (7), one is also allowed multiplication by variables. As noted earlier, such multiplications by a variable $X$ can be simulated by multiplications by their squares $X^2$, thanks to the axioms $X^2 - X = 0$ from (2), at the cost of at most doubling the degree, and blowing up the size at most polynomially.

2.4 Completeness of Nullstellensatz and Sherali-Adams

In this section we prove the implicational completeness of Nullstellensatz and Sherali-Adams with quantitative bounds. We start with two technical lemmas that will be used to justify the elimination of twin variables.

Lemma 1.

For every polynomial $P$ of degree $d$ and every variable $Y$, there are NS and SA proofs of $P \cdot (1 - Y - \bar{Y}) = 0$ and $P \cdot (Y^2 - Y) = 0$ of degree $d + 1$ and $d + 2$, respectively, and size polynomial in the size of $P$.

Proof.

Split $P$ into a sum of monomials $\sum_j c_j M_j$, lift the axiom $1 - Y - \bar{Y} = 0$ by $c_j M_j$, and add up together to get $P \cdot (1 - Y - \bar{Y}) = 0$. The proof of $P \cdot (Y^2 - Y) = 0$ is the same, lifting the axiom $Y^2 - Y = 0$ instead. ∎

The second technical lemma that we need formalizes the elimination of twin variables.

Lemma 2.

For every polynomial $P$ of degree $d$, every scalar $c$ and every two subsets $J$ and $K$ of $[n]$, with $|J| + |K| = \ell$, there are NS and SA proofs of the equation

$$P \cdot c \prod_{j \in J} X_j \prod_{k \in K} (1 - X_k) - P \cdot c \prod_{j \in J} X_j \prod_{k \in K} \bar{X}_k = 0 \tag{9}$$

of degree $d + \ell$ and size polynomial in $2^{\ell}$ and the size of $c$ and $P$.

Proof.

Assume without loss of generality that $K = [t]$ where $t \leq \ell$. Let $Q = cP \prod_{j \in J} X_j$. Define $R_j = Q \prod_{k=1}^{j} (1 - X_k) \prod_{k=j+1}^{t} \bar{X}_k$ for all $j \in [t] \cup \{0\}$. Observe that the goal equation is $R_t - R_0 = 0$. For each $j \in [t]$, let $T_j = Q \prod_{k=1}^{j-1} (1 - X_k) \prod_{k=j+1}^{t} \bar{X}_k$. Now:

$$(1 - X_j - \bar{X}_j) T_j = R_j - R_{j-1} \tag{10}$$

for each $j \in [t]$. Lemma 1 gives proofs of $(1 - X_j - \bar{X}_j) T_j = 0$ for every $j \in [t]$. Adding them all together gives $R_t - R_0 = 0$ by (10) and we are done. ∎
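As a sanity check of the telescoping argument, the following worked instance spells out the case $J = \emptyset$ and $K = \{1, 2\}$, so that $t = 2$ and $Q = cP$. The choice of this instance is ours, purely for illustration.

```latex
\begin{align*}
R_0 &= Q\,\bar{X}_1 \bar{X}_2, \qquad
R_1 = Q\,(1 - X_1)\,\bar{X}_2, \qquad
R_2 = Q\,(1 - X_1)(1 - X_2), \\
T_1 &= Q\,\bar{X}_2, \qquad
T_2 = Q\,(1 - X_1), \\
(1 - X_1 - \bar{X}_1)\,T_1 &= Q\,(1 - X_1)\,\bar{X}_2 - Q\,\bar{X}_1 \bar{X}_2 = R_1 - R_0, \\
(1 - X_2 - \bar{X}_2)\,T_2 &= Q\,(1 - X_1)(1 - X_2) - Q\,(1 - X_1)\,\bar{X}_2 = R_2 - R_1, \\
(1 - X_1 - \bar{X}_1)\,T_1 + (1 - X_2 - \bar{X}_2)\,T_2 &= R_2 - R_0 .
\end{align*}
```

The last line is exactly the left-hand side of (9) for this instance, and each summand has degree $\deg(Q) + 2 = d + \ell$.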

We will need the following definitions. For every assignment $a : \{X_1, \ldots, X_n\} \rightarrow \{0,1\}$, define

$$\begin{aligned}
J_a &= \{ i \in [n] : a(X_i) = 1 \}, \\
K_a &= \{ i \in [n] : a(X_i) = 0 \}.
\end{aligned}$$

Define its indicator polynomial:

$$I_a(X_1, \ldots, X_n) := \prod_{j \in J_a} X_j \prod_{k \in K_a} (1 - X_k). \tag{11}$$

For every polynomial $P$, let $P(a)$ denote the evaluation of $P$ when $X_i$ is assigned $a(X_i)$.

For a polynomial $P$ on the variables $X_1, \ldots, X_n$, its multilinearization is the unique multilinear polynomial that agrees with $P$ on all assignments of values in $\{0,1\}$ to its variables. The uniqueness of the multilinearization follows from the fact that the collection of multilinear polynomials in $\mathbb{R}[X_1, \ldots, X_n]$ forms a vector space of dimension $2^n$ for which the multilinear monomials form a basis. Note that this holds for any field, not just $\mathbb{R}$.

Lemma 3.

For every polynomial $P$ on the variables $X_1, \ldots, X_n$, there are polynomials $Q_1, \ldots, Q_n$ such that the following identity holds:

$$P + \sum_{i=1}^{n} Q_i \cdot (X_i^2 - X_i) = P^*, \tag{12}$$

where $P^*$ denotes the multilinearization of $P$. Moreover, each $Q_i$ has size polynomial in the size of $P$.

Proof.

Observe that it is enough to prove the lemma for the special case of monomials. Indeed, if $P$ is an arbitrary polynomial, we get the identity (12) by splitting $P$ into a sum of monomials, applying the lemma to each monomial, and adding up the obtained identities.

Let $P$ be a monomial. We proceed by induction on the sum of the individual degrees of the variables. If all variables have individual degree one, there is nothing to prove. Otherwise, some variable must have individual degree at least two. Say this variable is $X_j$ and let $P'$ and $P''$ be such that $P = X_j P'$ and $P' = X_j P''$. Note that the multilinearizations of $P$ and $P'$ are the same, and in both $P'$ and $P''$ the sum of the individual degrees is strictly smaller. The induction hypothesis applied to $P'$ gives polynomials $Q'_1, \ldots, Q'_n$ such that

$$P' + \sum_{i=1}^{n} Q'_i \cdot (X_i^2 - X_i) = {P'}^* = P^*. \tag{13}$$

Now the identity we want is obtained by defining $Q_i = Q'_i$ for $i \neq j$, and $Q_j = Q'_j - P''$. Indeed, using $P = P'' \cdot X_j^2$ and $P' = P'' \cdot X_j$ in the last step:

$$\begin{aligned}
P + \sum_{i} Q_i \cdot (X_i^2 - X_i)
&= P + (Q'_j - P'') \cdot (X_j^2 - X_j) + \sum_{i \neq j} Q'_i \cdot (X_i^2 - X_i) && (14) \\
&= P + \sum_{i} Q'_i \cdot (X_i^2 - X_i) - P'' \cdot X_j^2 + P'' \cdot X_j && (15) \\
&= P' + \sum_{i} Q'_i \cdot (X_i^2 - X_i), && (16)
\end{aligned}$$

and we already proved in (13) that this last expression is $P^*$. ∎
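The effect of Lemma 3 on an explicit sum of monomials is simply to cap every positive exponent at one. The following sketch, under our own assumption that a polynomial is stored as a dictionary from exponent tuples to coefficients, computes the multilinearization this way and checks that it agrees with the original polynomial on all of $\{0,1\}^n$; the example polynomial is hypothetical.

```python
from fractions import Fraction
from itertools import product

def multilinearize(p):
    """Cap every exponent at 1, merging monomials that become equal."""
    out = {}
    for exps, coeff in p.items():
        key = tuple(min(e, 1) for e in exps)
        out[key] = out.get(key, Fraction(0)) + coeff
    return {m: c for m, c in out.items() if c != 0}

def evaluate(p, point):
    """Evaluate p at a point given as a tuple of numbers."""
    total = Fraction(0)
    for exps, coeff in p.items():
        term = Fraction(coeff)
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

# Hypothetical example over X1, X2:  P = X1^3 X2 + 2 X1^2 - X1.
P = {(3, 1): 1, (2, 0): 2, (1, 0): -1}
P_star = multilinearize(P)          # expected shape: X1 X2 + X1

agrees_on_cube = all(
    evaluate(P, a) == evaluate(P_star, a) for a in product((0, 1), repeat=2)
)
```

Capping exponents is sound on $\{0,1\}$ because $x^e = x$ for every $x \in \{0,1\}$ and $e \geq 1$, which is what the axioms $X_i^2 - X_i = 0$ express.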

Theorem 4.

Let $H$ be a system of polynomial equations, let $I$ be a system of polynomial inequalities, and let $P$ be a polynomial, all over the same $n$ variables. If $P = 0$ follows from $H$ on all evaluations of its variables in $\{0,1\}$, then there is an NS proof of $P = 0$ from $H$. Similarly, if $P \geq 0$ follows from $H \cup I$ on all evaluations of its variables in $\{0,1\}$, then there is an SA proof of $P \geq 0$ from $H \cup I$. Moreover, in both cases the degree of the proof is at most $n + 1$, and the size is polynomial in $2^n$ and in the size of $H$ and $H \cup I$, respectively.

Proof.

Both proofs are essentially the same; first we give the proof for Sherali-Adams and then indicate how to adapt it to Nullstellensatz. We prove the theorem when $P$ is multilinear and then we adapt it to the general case. Assume $P$ is multilinear and let $H \cup I = \{P_1 \geq 0, \ldots, P_m \geq 0\}$, where we have written each equation in $H$ as two inequalities. For every assignment $a : \{X_1, \ldots, X_n\} \rightarrow \{0,1\}$, let $c_{a,0}, c_{a,1}, \ldots, c_{a,m}$ be the real numbers defined by cases as follows. If $P(a) \geq 0$, let $c_{a,i} = P(a)$ for $i = 0$ and $c_{a,i} = 0$ for $i \in [m]$. If $P(a) < 0$, let $i^*$ be the smallest element in $[m]$ such that $P_{i^*}(a) < 0$, which must exist by the hypothesis, and define $c_{a,i} = P(a)/P_i(a)$ for $i = i^*$ and $c_{a,i} = 0$ for each $i \in ([m] \cup \{0\}) \setminus \{i^*\}$. Observe that in all cases $c_{a,i}$ is non-negative. In the first case because $P(a)$ was non-negative, and in the second case because both $P_{i^*}(a)$ and $P(a)$ were negative, so their ratio is positive. The choice of these reals guarantees that

$$c_{a,0} + \sum_{i=1}^{m} c_{a,i} P_i(a) = P(a). \tag{17}$$

We need the following claim.

Claim 1.

For every assignment $a$ and every $i \in [m]$, the polynomial $P_i(a) \cdot I_a$ is the multilinearization of $P_i \cdot I_a$. In addition, $\sum_{a} \big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot P_i(a) \cdot I_a \big) = P$.

Proof.

Since the multilinearization is unique and the polynomial $P_i(a) \cdot I_a$ is multilinear, it suffices to show that $P_i(a) \cdot I_a$ and $P_i \cdot I_a$ agree on all assignments of values in $\{0,1\}$ to their variables. But this is easy: they both evaluate to $P_i(a)$, or both evaluate to $0$, depending on whether the assignment is $a$, or different from $a$, respectively. For the second claim we use the same argument, and add the additional fact that $P$ is itself multilinear: the big sum over $a$ is a multilinear polynomial and, by (17), it agrees with $P$ on all assignments of values in $\{0,1\}$ to its variables. Hence, by the uniqueness of the multilinearization, and since $P$ is multilinear, it is $P$ itself. ∎

Back to the proof, by the first part of Claim 1, for every assignment $a$ and every $i \in [m]$, there exist polynomials $Q_{a,i}^{1}, \ldots, Q_{a,i}^{n}$ according to Lemma 3 that make the following identities hold:

$$P_i \cdot I_a + \sum_{j=1}^{n} Q_{a,i}^{j} \cdot (X_j^2 - X_j) = P_i(a) \cdot I_a. \tag{18}$$

We are ready to build up the proof of $P \geq 0$ from $P_1 \geq 0, \ldots, P_m \geq 0$. We claim that the following identity holds:

$$\sum_{a} \Big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot \Big( P_i \cdot I_a + \sum_{j=1}^{n} Q_{a,i}^{j} \cdot (X_j^2 - X_j) \Big) \Big) = P. \tag{19}$$

First we claim that the left-hand side can be converted into a valid SA proof (with multiplications by $X_j$'s and $1 - X_j$'s, which can be simulated in our definition of Sherali-Adams as discussed in Lemma 2). To see this, just reorder the terms and apply Lemma 1 to replace $Q_{a,i}^{j} \cdot (X_j^2 - X_j)$ by proper SA proofs. It remains to see that the identity (19) holds; this will show that it is an SA proof of $P \geq 0$ from $P_1 \geq 0, \ldots, P_m \geq 0$.

In order to see that (19) holds, first use equation (18) to rewrite its left-hand side:

$$\sum_{a} \Big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot P_i(a) \cdot I_a \Big). \tag{20}$$

And now use the second part of Claim 1 to complete the proof when $P$ is multilinear.

When $P$ is not multilinear, it suffices to apply the above argument to get its multilinearization $P^*$, and then apply the reverse identity in Lemma 3. Indeed,

$$P^* - \sum_{i=1}^{n} Q_i \cdot (X_i^2 - X_i) = P. \tag{21}$$

To turn this into a proper SA proof we need to use Lemma 1 again.

For Nullstellensatz, the argument is the same except that, in order to handle arbitrary fields besides the real field $\mathbb{R}$, the coefficients $c_{a,i}$ need to be redefined. Let $H = \{P_1 = 0, \ldots, P_m = 0\}$. If $P(a) = 0$, define $c_{a,i} = 0$ for all $i \in [m] \cup \{0\}$. If $P(a) \neq 0$, let $i^*$ be the smallest element in $[m]$ such that $P_{i^*}(a) \neq 0$, which must exist by hypothesis, and define $c_{a,i} = P(a)/P_i(a)$ for $i = i^*$ and $c_{a,i} = 0$ for $i \in ([m] \cup \{0\}) \setminus \{i^*\}$. This choice is well-defined over any field and guarantees (17). The rest of the proof is the same. ∎
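The Sherali-Adams part of the construction can be replayed on a tiny instance. The sketch below, restricted to multilinear polynomials and using a hypothetical hypothesis $X_0 - X_1 \geq 0$ with goal $X_0 - X_1 + X_0 X_1 \geq 0$, computes the coefficients $c_{a,i}$ by the case distinction of the proof and checks the second part of Claim 1, namely that the weighted sum of indicator polynomials equals $P$. The representation and all function names are our own assumptions.

```python
from fractions import Fraction
from itertools import product

def evaluate(p, a):
    """Evaluate a multilinear polynomial {frozenset of indices: coeff} at a 0/1 tuple."""
    return sum((c for mono, c in p.items() if all(a[v] == 1 for v in mono)),
               Fraction(0))

def add_scaled(p, q, s):
    """Return p + s*q, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, Fraction(0)) + s * c
    return {m: c for m, c in out.items() if c != 0}

def indicator(a):
    """Expand I_a = prod_{a_j = 1} X_j * prod_{a_k = 0} (1 - X_k)."""
    p = {frozenset(): Fraction(1)}
    for v, bit in enumerate(a):
        shifted = {m | {v}: c for m, c in p.items()}   # p * X_v
        p = shifted if bit == 1 else add_scaled(p, shifted, -1)
    return p

def coefficients(P, hyps, a):
    """The reals c_{a,0}, ..., c_{a,m} chosen as in the Sherali-Adams case."""
    value = evaluate(P, a)
    c = [Fraction(0)] * (len(hyps) + 1)
    if value >= 0:
        c[0] = value
    else:
        i_star = next(i for i, Pi in enumerate(hyps, start=1)
                      if evaluate(Pi, a) < 0)
        c[i_star] = value / evaluate(hyps[i_star - 1], a)
    return c

# Hypothetical instance: hypothesis X0 - X1 >= 0, goal X0 - X1 + X0*X1 >= 0.
hyps = [{frozenset({0}): Fraction(1), frozenset({1}): Fraction(-1)}]
P = {frozenset({0}): Fraction(1), frozenset({1}): Fraction(-1),
     frozenset({0, 1}): Fraction(1)}

total, all_nonnegative = {}, True
for a in product((0, 1), repeat=2):
    c = coefficients(P, hyps, a)
    all_nonnegative = all_nonnegative and all(x >= 0 for x in c)
    weight = c[0] + sum(c[i] * evaluate(hyps[i - 1], a) for i in range(1, len(c)))
    total = add_scaled(total, indicator(a), weight)
```

On this instance the only assignment with $P(a) < 0$ is $a = (0,1)$, where the hypothesis is violated and $c_{a,1} = (-1)/(-1) = 1$.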

2.5 Constraint satisfaction problem

There are many equivalent definitions of the constraint satisfaction problem. Here we use the definition in terms of homomorphisms. Below we introduce the necessary terminology. A concrete example will be developed in Section 8 where we apply the method of reducibilities to the graph $k$-coloring problem for $k \geq 3$.

CSPs and homomorphisms.

A relational vocabulary $L$ is a set of symbols; each symbol has an associated natural number called its arity. A relational structure $\mathbb{B}$ over $L$ (or an $L$-structure) is a set $B$, called the domain, together with a set of relations over $B$. For each natural number $r$ and each relation symbol $R \in L$ of arity $r$, there is a relation in $\mathbb{B}$ of arity $r$ denoted $R(\mathbb{B})$, i.e., $R(\mathbb{B}) \subseteq B^r$. Sometimes we call it the interpretation of $R$ in $\mathbb{B}$. We say that a relational structure is finite if its domain is finite and it has finitely many non-empty relations.

Let $\mathbb{B}$ and $\mathbb{B}'$ be $L$-structures, for some relational vocabulary $L$. A homomorphism from $\mathbb{B}$ to $\mathbb{B}'$ is a function $h \colon B \rightarrow B'$ which preserves all the relations, that is, for every natural number $r$ and each relation symbol $R \in L$ of arity $r$, if $(b_1, \ldots, b_r) \in R(\mathbb{B})$, then $(h(b_1), \ldots, h(b_r)) \in R(\mathbb{B}')$.

For a fixed $L$-structure $\mathbb{B}$, the constraint satisfaction problem of $\mathbb{B}$, denoted CSP($\mathbb{B}$), is the following computational problem: given a finite $L$-structure $\mathbb{A}$, decide whether there exists a homomorphism from $\mathbb{A}$ to $\mathbb{B}$. If the answer is positive we call the instance $\mathbb{A}$ satisfiable; otherwise we call it unsatisfiable. The size of an instance $\mathbb{A}$ is the number of elements in its domain plus the number of tuples in all its relations. Note that if the vocabulary $L$ is fixed and finite, then the size of $\mathbb{A}$ is polynomial in the number of elements of its domain, which we denote by $|A|$. In the context of CSP the structure $\mathbb{B}$ is often called a constraint language or a template. We usually assume that the constraint language $\mathbb{B}$ is finite.
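The definition can be turned into a brute-force decision procedure, which may help fix the terminology. In the sketch below a structure is a pair (domain, relations), with relations given as a dictionary from symbol names to sets of tuples; this representation, the helper names, and the use of graph colouring as the example template are our own illustrative choices. The search is exponential in $|A|$ and is meant only to illustrate the definition.

```python
from itertools import product

def is_homomorphism(h, A_rels, B_rels):
    """Check that h maps every tuple of every relation of A into the same relation of B."""
    return all(tuple(h[x] for x in t) in B_rels[R]
               for R, tuples in A_rels.items() for t in tuples)

def find_homomorphism(A, B):
    """Return some homomorphism from A to B as a dict, or None if the instance is unsatisfiable."""
    (A_dom, A_rels), (B_dom, B_rels) = A, B
    elements = sorted(A_dom)
    for image in product(sorted(B_dom), repeat=len(elements)):
        h = dict(zip(elements, image))
        if is_homomorphism(h, A_rels, B_rels):
            return h
    return None

def complete_graph(n):
    dom = set(range(n))
    return dom, {"E": {(u, v) for u in dom for v in dom if u != v}}

def cycle(n):
    dom = set(range(n))
    edges = {(i, (i + 1) % n) for i in dom} | {((i + 1) % n, i) for i in dom}
    return dom, {"E": edges}

# Instances of CSP(K3), i.e. graph 3-coloring, and of CSP(K2), i.e. 2-coloring.
five_cycle_is_3_colorable = find_homomorphism(cycle(5), complete_graph(3)) is not None
k4_is_3_colorable = find_homomorphism(complete_graph(4), complete_graph(3)) is not None
five_cycle_is_2_colorable = find_homomorphism(cycle(5), complete_graph(2)) is not None
```

With the template $\mathbb{B}$ equal to the complete graph on $k$ vertices, a homomorphism from a graph $\mathbb{A}$ is the same thing as a proper $k$-coloring of $\mathbb{A}$.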

Bounded-width.

The existential $k$-pebble game is played on two relational structures $\mathbb{A}$ and $\mathbb{B}$ over the same vocabulary by two players called Spoiler and Duplicator. The players are given two corresponding sets of pebbles $\{a_1, \ldots, a_k\}$ and $\{b_1, \ldots, b_k\}$. In each round Spoiler picks one of the $k$ pebbles $a_1, \ldots, a_k$, say $a_i$, and puts it on an element of the structure $\mathbb{A}$. Duplicator responds by picking the corresponding pebble $b_i$ and placing it on some element of the structure $\mathbb{B}$. For simplicity, in any given configuration of the game let us identify a pebble with the element of the structure that it is placed on. Spoiler wins if at any point during the game the partial function $f : A \rightarrow B$ defined by $f(a_i) = b_i$, for each pebbled element $a_i$ of $\mathbb{A}$, is either not well defined (because there exist indices $i, j \in [k]$ of two pebbled elements such that $a_i = a_j$ but $b_i \neq b_j$), or is not a partial homomorphism. Otherwise, Duplicator wins.

We say that a finite relational structure $\mathbb{B}$ has width $k$ if, for every finite structure $\mathbb{A}$ of the same vocabulary as $\mathbb{B}$, if there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$, then Spoiler wins the existential $k$-pebble game on $\mathbb{A}$ and $\mathbb{B}$. The structure $\mathbb{B}$ has bounded width if it has width $k$ for some $k$. Structures of bounded width are exactly those structures for which CSP($\mathbb{B}$) can be solved by a local consistency algorithm [35].
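For small structures the winner of the existential $k$-pebble game can be computed by a fixed-point iteration. The sketch below relies on the standard characterisation, which we assume here rather than derive from the text: Duplicator wins if and only if there is a non-empty family of partial homomorphisms with domains of size at most $k$ that is closed under restrictions and in which every member with fewer than $k$ elements can be extended to any further element of $A$. The representation of structures and all names are our own.

```python
from itertools import combinations, product

def partial_homs(A, B, k):
    """All partial homomorphisms A -> B with at most k elements, as frozensets of pairs."""
    (A_dom, A_rels), (B_dom, B_rels) = A, B
    result = set()
    for size in range(k + 1):
        for dom in combinations(sorted(A_dom), size):
            for image in product(sorted(B_dom), repeat=size):
                f = dict(zip(dom, image))
                if all(tuple(f[x] for x in t) in B_rels[R]
                       for R, tuples in A_rels.items()
                       for t in tuples if all(x in f for x in t)):
                    result.add(frozenset(f.items()))
    return result

def duplicator_wins(A, B, k):
    """Compute the largest family with the restriction and extension properties."""
    A_dom, B_dom = A[0], B[0]
    H = partial_homs(A, B, k)
    changed = True
    while changed:
        changed = False
        for f in list(H):
            pebbled = {a for a, _ in f}
            restrictions_ok = all(f - {pair} in H for pair in f)
            extends = len(pebbled) == k or all(
                any(f | {(a, b)} in H for b in B_dom)
                for a in A_dom if a not in pebbled)
            if not (restrictions_ok and extends):
                H.discard(f)
                changed = True
    return frozenset() in H   # the empty map survives iff Duplicator can play forever

def complete_graph(n):
    dom = set(range(n))
    return dom, {"E": {(u, v) for u in dom for v in dom if u != v}}

# The triangle has no homomorphism to a single edge (it is not 2-colorable).
two_pebbles = duplicator_wins(complete_graph(3), complete_graph(2), 2)
three_pebbles = duplicator_wins(complete_graph(3), complete_graph(2), 3)
```

On this instance two pebbles are not enough for Spoiler, since any two vertices of the triangle can be consistently 2-coloured, whereas with three pebbles Spoiler covers the whole triangle and wins.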

2.6 Propositional and polynomial encodings

To reason about proof systems for CSPs we encode the fact that a finite structure $\mathbb{A}$ maps homomorphically to a finite structure $\mathbb{B}$, over the same vocabulary, as a CNF or as a system of polynomial inequalities and/or equations. In the proofs we will use concrete fixed encodings but our results hold for a whole class of encodings which we call local.

Local encodings.

First let us fix some notation. In the context of propositional proof systems, for any sets $A$ and $B$ we denote by $V(A,B)$ a set of propositional variables: for every $a \in A$ and every $b \in B$ there is a variable $X(a,b)$ in the set $V(A,B)$. Truth valuations of the variables in $V(A,B)$ and relations on $A \times B$ have a natural one-to-one correspondence: a variable $X(a,b)$ is assigned the truth value $1$ if and only if the pair $(a,b)$ belongs to the relation. Recall that a function $f$ from $A$ to $B$ is a relation $\{(a, f(a)) : a \in A\}$ on $A \times B$. Hence, a homomorphism from an $L$-structure $\mathbb{A}$ to an $L$-structure $\mathbb{B}$ is a relation on $A \times B$.

Fix a finite relational vocabulary $L$ and a finite structure $\mathbb{B}$ over $L$.

A propositional encoding scheme $E$ for $\mathrm{CSP}(\mathbb{B})$ is a mapping which assigns to every $L$-structure $\mathbb{A}$ a set of clauses $E(\mathbb{A})$ over the variables in $V(A,B)$ in such a way that there is a one-to-one correspondence between the truth valuations of the variables in $V(A,B)$ satisfying $E(\mathbb{A})$ and the homomorphisms from $\mathbb{A}$ to $\mathbb{B}$.
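As an illustration of the definition, the sketch below builds one natural candidate for a propositional encoding scheme and checks the one-to-one correspondence on a tiny instance by exhaustive enumeration. This particular set of clauses is our own assumption and is not claimed to be the concrete encoding fixed in the proofs: each element of $A$ gets exactly one image, and each tuple of a relation of $\mathbb{A}$ is forbidden from mapping to a non-tuple of $\mathbb{B}$.

```python
from itertools import combinations, product

def encode(A, B):
    """Clauses over variables (a, b); a literal is ((a, b), polarity)."""
    (A_dom, A_rels), (B_dom, B_rels) = A, B
    B_list = sorted(B_dom)
    clauses = []
    for a in sorted(A_dom):
        # a has at least one image ...
        clauses.append(frozenset(((a, b), True) for b in B_list))
        # ... and at most one image.
        for b1, b2 in combinations(B_list, 2):
            clauses.append(frozenset({((a, b1), False), ((a, b2), False)}))
    for R, tuples in A_rels.items():
        for t in tuples:
            # A tuple of R(A) must not be sent to a tuple outside R(B).
            for image in product(B_list, repeat=len(t)):
                if image not in B_rels[R]:
                    clauses.append(frozenset(((x, y), False) for x, y in zip(t, image)))
    return clauses

def count_models(clauses, variables):
    """Number of truth valuations of the given variables satisfying all clauses."""
    count = 0
    for bits in product((False, True), repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[v] == pol for v, pol in clause) for clause in clauses):
            count += 1
    return count

# Hypothetical instance: A is a single directed edge, B is the complete graph K3.
A = ({0, 1}, {"E": {(0, 1)}})
B = ({0, 1, 2}, {"E": {(u, v) for u in range(3) for v in range(3) if u != v}})
variables = [(a, b) for a in sorted(A[0]) for b in sorted(B[0])]
num_models = count_models(encode(A, B), variables)
```

The homomorphisms from a single edge to $K_3$ are the six ways of sending its endpoints to distinct vertices, and the encoding has the same number of satisfying valuations.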

In the context of algebraic and semi-algebraic proof systems we additionally assume the presence of twin variables. For every $a \in A$ and every $b \in B$ there is both the algebraic variable $X(a,b)$ and the algebraic variable $\bar{X}(a,b)$ in the set $V(A,B)$, and an analogous bijective correspondence holds between relations on $A \times B$ and those evaluations of the variables from $V(A,B)$ in $\{0,1\}$ which satisfy the axioms from (2): a pair $(a,b)$ belongs to the relation if and only if the variable $X(a,b)$ is assigned the value $1$ if and only if the variable $\bar{X}(a,b)$ is assigned the value $0$.

An algebraic encoding scheme $E$ over a field $F$ for $\mathrm{CSP}(\mathbb{B})$ is a mapping which assigns to every $L$-structure $\mathbb{A}$ a set of polynomial equations $E(\mathbb{A})$ over the variables in $V(A,B)$ in such a way that there is a one-to-one correspondence between the evaluations of the variables from $V(A,B)$ in $\{0,1\}$ satisfying $E(\mathbb{A})$ and the axioms from (2) over $F$, and the homomorphisms from $\mathbb{A}$ to $\mathbb{B}$. Finally, a semi-algebraic encoding scheme $E$ for $\mathrm{CSP}(\mathbb{B})$ is a mapping which assigns to every $L$-structure $\mathbb{A}$ a set of polynomial inequalities $E(\mathbb{A})$ over the variables in $V(A,B)$ in such a way that there is a one-to-one correspondence between the evaluations of the variables from $V(A,B)$ in $\{0,1\}$ satisfying $E(\mathbb{A})$ and the axioms from (2) and (4), and the homomorphisms from $\mathbb{A}$ to $\mathbb{B}$. Observe that every algebraic encoding scheme over the real-field is also a semi-algebraic encoding scheme.

An encoding scheme $E$ is invariant under isomorphisms if, whenever $f : A \rightarrow A'$ is an isomorphism from an $L$-structure $\mathbb{A}$ to an $L$-structure $\mathbb{A}'$, it holds that $E(\mathbb{A}') = f(E(\mathbb{A}))$, where $f(E(\mathbb{A}))$ is obtained from $E(\mathbb{A})$ by substituting each variable $X(a,b)$ by $X(f(a),b)$ (and each variable $\bar{X}(a,b)$ by $\bar{X}(f(a),b)$ if necessary).

Next we define the key notion of local encoding scheme. We need two pieces of notation. If the structure $\mathbb{A}$ has a single element $a$ and each of its relations is empty, we denote the encoding $E(\mathbb{A})$ by $E(a)$. If the structure $\mathbb{A}$ has a single non-empty relation $R(\mathbb{A})$ with a single tuple $(a_1, \ldots, a_r)$ in it, and its domain is $\{a_1, \ldots, a_r\}$, then we denote $E(\mathbb{A})$ by $E(R(a_1, \ldots, a_r))$. Since the vocabulary $L$ is finite, up to isomorphism there are only finitely many structures of one of the above-mentioned two kinds. Therefore, for any relational structure $\mathbb{B}$ over a finite vocabulary $L$ and any encoding scheme $E$ that is invariant under isomorphisms, the size of encodings of the form $E(a)$ or $E(R(a_1, \ldots, a_r))$ is bounded by a constant. We call it the local bound of the encoding scheme.

An encoding scheme $E$ is local if it is invariant under isomorphisms and, for every $L$-structure $\mathbb{A}$, the encoding $E(\mathbb{A})$ is the union of $E(a)$ over all $a \in A$ and $E(R(a_1, \ldots, a_r))$ over all $R \in L$ and $(a_1, \ldots, a_r) \in R(\mathbb{A})$. For our purposes all local encodings of the same kind (i.e., propositional, algebraic or semi-algebraic) are essentially equivalent, as formalized by the following result.

Lemma 4.

Let $\mathbb{B}$ be a finite structure over a finite vocabulary $L$, and let $(E,E')$, $(F,F')$ and $(G,G')$ be pairs of local encoding schemes for $\mathbb{B}$ that are propositional, algebraic and semi-algebraic, respectively. There exists a positive integer $p$ such that for every finite $L$-structure $\mathbb{A}$ it holds that:

  1. every clause in $E'(\mathbb{A})$ has a resolution proof from $E(\mathbb{A})$ of size bounded by $p$,
  2. every equation in $F'(\mathbb{A})$ has an NS proof from $F(\mathbb{A})$ of size and degree bounded by $p$,
  3. every inequality in $G'(\mathbb{A})$ has an SA proof from $G(\mathbb{A})$ of size and degree bounded by $p$.

Proof.

For 1, let $s$ and $s'$ be the local bounds of $E$ and $E'$, respectively. Take a clause $C$ from $E'(\mathbb{A})$. The clause $C$ belongs to a subset of $E'(\mathbb{A})$ of the form $E'(a)$ or $E'(R(a_1,\ldots,a_r))$, so the size of $C$ is bounded by $s'$. Without loss of generality suppose that $C$ belongs to a set $E'(R(a_1,\ldots,a_r))$. The corresponding subset $E(R(a_1,\ldots,a_r))$ of $E(\mathbb{A})$ has size at most $s$. The satisfying truth valuations for $E(R(a_1,\ldots,a_r))$ and $E'(R(a_1,\ldots,a_r))$ are the same. Therefore, since $C$ is an element of $E'(R(a_1,\ldots,a_r))$, we have that $E(R(a_1,\ldots,a_r))$ logically implies $C$. It follows from the quantitative completeness theorem for resolution (cf. Theorem 3) that the clause $C$ has a resolution derivation from $E(R(a_1,\ldots,a_r))$ of size bounded by a function of $s$ and $s'$.

The proofs of 2 and 3 are analogous. The completeness theorem for Nullstellensatz and Sherali-Adams (cf. Theorem 4) needs to be used instead of Theorem 3. ∎

Three specific examples.

The results of this paper hold for arbitrary local encoding schemes. However, in the proofs it is often convenient to be specific. We now introduce three concrete encoding schemes that, in addition, are defined uniformly with respect to the template $\mathbb{B}$.

For all structures $\mathbb{A}$ and $\mathbb{B}$ over the same vocabulary $L$, let $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ be the set of clauses consisting of:

  1. a clause $\bigvee_{b \in B} X(a,b)$ for each $a \in A$,
  2. a clause $\overline{X(a,b_0)} \vee \overline{X(a,b_1)}$ for each $a \in A$ and $(b_0,b_1) \in B^2$ with $b_0 \neq b_1$,
  3. a clause $\bigvee_{i \in [r]} \overline{X(a_i,b_i)}$ for each natural number $r$, each $R \in L$ of arity $r$, each $(a_1,\ldots,a_r) \in R(\mathbb{A})$, and each $(b_1,\ldots,b_r) \in B^r \setminus R(\mathbb{B})$.

Note that the mapping that to an $L$-structure $\mathbb{A}$ assigns $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ is a local encoding scheme for $\mathrm{CSP}(\mathbb{B})$. Since this definition is uniform with respect to $\mathbb{B}$ we call it simply the $\mathrm{CNF}$ encoding scheme. We use it to reason about propositional proof systems for $\mathrm{CSP}(\mathbb{B})$.
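As an illustration only, the three clause types of $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ can be generated mechanically. The following Python sketch assumes a structure is given as a pair (domain, relations); this representation, the literal triples `(a, b, sign)`, and the toy structures `K2`, `triangle` and `path` are hypothetical choices made for the example and are not part of the paper. The brute-force satisfiability test is meant for tiny instances only.

```python
from itertools import product

def cnf_encoding(A, B):
    """Return CNF(A, B) as a list of clauses.

    A and B are structures given as (domain, relations), where relations
    maps a relation symbol to a set of tuples. A literal is a triple
    (a, b, sign): sign True is X(a, b), sign False is its negation.
    """
    dom_a, rel_a = A
    dom_b, rel_b = B
    clauses = []
    # Type 1: every a is mapped to at least one b.
    for a in dom_a:
        clauses.append([(a, b, True) for b in dom_b])
    # Type 2: no a is mapped to two distinct elements b0 != b1.
    for a in dom_a:
        for b0, b1 in product(dom_b, repeat=2):
            if b0 != b1:
                clauses.append([(a, b0, False), (a, b1, False)])
    # Type 3: a tuple of R(A) is never mapped to a tuple outside R(B).
    for name, tuples_a in rel_a.items():
        for ta in tuples_a:
            for tb in product(dom_b, repeat=len(ta)):
                if tb not in rel_b[name]:
                    clauses.append([(a, b, False) for a, b in zip(ta, tb)])
    return clauses

def satisfiable(clauses, dom_a, dom_b):
    """Brute-force satisfiability test, only meant for tiny instances."""
    variables = [(a, b) for a in dom_a for b in dom_b]
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[(a, b)] == s for a, b, s in c) for c in clauses):
            return True
    return False

# Hypothetical toy data: the template K2 (2-colouring) and two instances.
K2 = ([0, 1], {"E": {(0, 1), (1, 0)}})
triangle = (["u", "v", "w"], {"E": {("u", "v"), ("v", "w"), ("w", "u")}})
path = (["u", "v", "w"], {"E": {("u", "v"), ("v", "w")}})
```

Each clause is produced from a single element or a single tuple of $\mathbb{A}$, which is the locality property in action: the encoding of the triangle is just the union of the encodings of its three vertices and three edges.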

There are two standard ways of encoding a clause as a polynomial equation or inequality: multiplicatively and additively. These give rise to two local encoding schemes which we use to reason about algebraic and semi-algebraic proof systems in the context of CSP. Specifically, the multiplicative and additive encodings of a clause $C = \overline{X_1} \vee \cdots \vee \overline{X_\ell} \vee X_{\ell+1} \vee \cdots \vee X_k$ are the following equation and inequality, respectively:

\[ X_1 \cdots X_\ell \bar{X}_{\ell+1} \cdots \bar{X}_k = 0 \qquad \text{and} \qquad \bar{X}_1 + \cdots + \bar{X}_\ell + X_{\ell+1} + \cdots + X_k - 1 \geq 0. \]

Let $\mathrm{EQ}(\mathbb{A},\mathbb{B})$ be the system of polynomial equations that are multiplicative encodings of the clauses in $\mathrm{CNF}(\mathbb{A},\mathbb{B})$, that is:

  1. $\prod_{b \in B} \bar{X}(a,b) = 0$ for each $a \in A$,
  2. $X(a,b_0) X(a,b_1) = 0$ for each $a \in A$ and $(b_0,b_1) \in B^2$ with $b_0 \neq b_1$,
  3. $\prod_{i=1}^{r} X(a_i,b_i) = 0$ for each natural number $r$, each $R \in L$ of arity $r$, each $(a_1,\ldots,a_r) \in R(\mathbb{A})$, and each $(b_1,\ldots,b_r) \in B^r \setminus R(\mathbb{B})$.

The mapping that to an $L$-structure $\mathbb{A}$ assigns $\mathrm{EQ}(\mathbb{A},\mathbb{B})$ is a local encoding scheme for $\mathrm{CSP}(\mathbb{B})$. Note that this scheme makes sense over any field. We call it the $\mathrm{EQ}$ encoding scheme. It is used in Section 4 to reason both about algebraic and semi-algebraic proof systems, and in Section 6 while discussing lower bounds for SOS.

Similarly, let $\mathrm{INEQ}(\mathbb{A},\mathbb{B})$ be the system of linear inequalities that are additive encodings of the clauses in $\mathrm{CNF}(\mathbb{A},\mathbb{B})$, that is:

  1. $\sum_{b \in B} X(a,b) - 1 \geq 0$ for each $a \in A$,
  2. $\bar{X}(a,b_0) + \bar{X}(a,b_1) - 1 \geq 0$ for each $a \in A$ and $(b_0,b_1) \in B^2$ with $b_0 \neq b_1$,
  3. $\sum_{i=1}^{r} \bar{X}(a_i,b_i) - 1 \geq 0$ for each natural number $r$, each $R \in L$ of arity $r$, each $(a_1,\ldots,a_r) \in R(\mathbb{A})$, and each $(b_1,\ldots,b_r) \in B^r \setminus R(\mathbb{B})$.

The mapping that to an $L$-structure $\mathbb{A}$ assigns $\mathrm{INEQ}(\mathbb{A},\mathbb{B})$ is a local encoding scheme for $\mathrm{CSP}(\mathbb{B})$. We call it the $\mathrm{INEQ}$ encoding scheme. It is used in Section 7 to reason about semi-algebraic proof systems.
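To see why the multiplicative and additive encodings are interchangeable on Boolean points, one can check exhaustively that each holds exactly when the clause does. The sketch below is a sanity check under the assumption that $\bar{X}$ is evaluated as $1 - X$, as enforced by the axioms; the function names and the representation of a clause by two index lists `neg` and `pos` are hypothetical.

```python
from itertools import product

def clause_holds(neg, pos, val):
    """Truth value of the clause  OR_{i in neg} not X_i  OR  OR_{j in pos} X_j."""
    return any(val[i] == 0 for i in neg) or any(val[j] == 1 for j in pos)

def multiplicative(neg, pos, val):
    """Left-hand side of  prod_{i in neg} X_i * prod_{j in pos} (1 - X_j) = 0."""
    result = 1
    for i in neg:
        result *= val[i]
    for j in pos:
        result *= 1 - val[j]
    return result

def additive(neg, pos, val):
    """Left-hand side of  sum_{i in neg} (1 - X_i) + sum_{j in pos} X_j - 1 >= 0."""
    return sum(1 - val[i] for i in neg) + sum(val[j] for j in pos) - 1

def encodings_agree(neg, pos):
    """Check on all 0/1 points that both encodings hold exactly when the clause does."""
    variables = sorted(set(neg) | set(pos))
    for bits in product([0, 1], repeat=len(variables)):
        val = dict(zip(variables, bits))
        truth = clause_holds(neg, pos, val)
        if (multiplicative(neg, pos, val) == 0) != truth:
            return False
        if (additive(neg, pos, val) >= 0) != truth:
            return False
    return True
```

For instance, the clause $\overline{X_1} \vee X_2$ is falsified only at $X_1 = 1$, $X_2 = 0$, where the monomial $X_1 \bar{X}_2$ evaluates to $1$ and the linear form $\bar{X}_1 + X_2 - 1$ evaluates to $-1$.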

In Section 8 we will discuss one more local semi-algebraic encoding scheme that was used in [40] to prove PC lower bounds for graph coloring.

3 General proof complexity facts

Substitutions will play a central role in showing that certain propositional and semi-algebraic proof systems behave well with respect to the classical CSP reductions. In the case of propositional proof systems we will consider substitutions of variables by bounded-DNF formulas with a bounded number of terms, and in the case of algebraic and semi-algebraic proof systems we will use substitutions by polynomials with bounded degree and a bounded number of monomials. We now prove some key technical lemmas regarding such substitutions.

3.1 Substitutions in Frege

In the case of propositional proof systems, a substitution is a mapping from variables to formulas. Applying a substitution to a formula means replacing all variables by the corresponding formulas simultaneously. Since our formulas are in negation normal form, it is implicit that the result of applying the substitution $X \mapsto F$ to a negative literal $\overline{X}$ is the formula dual to $F$, i.e., $\overline{F}$.

Lemma 5.

Let $k$, $d$ and $m$ be positive integers, let $A$ be a $k$-term and let $A^+$ be the result of replacing each variable in $A$ by a (possibly different) $d$-DNF with at most $m$ many terms. Then $A^+$ is logically equivalent to a $k(d+m)$-DNF with at most $m^k d^{km}$ many terms.

Proof.

Let $p$ and $n$ be the numbers of positive and negative literals in $A$, respectively. After applying the substitution, the $k$-term becomes a conjunction of $p$ many $d$-DNFs and $n$ many negations of $d$-DNFs, where each $d$-DNF has at most $m$ many terms. Applying the De Morgan rules to the negated $d$-DNFs, what we get is a formula of the following schematic form:

\[ \Bigl(\bigwedge^{p} \bigvee^{m} \bigwedge^{d}\Bigr) \wedge \Bigl(\bigwedge^{n} \bigwedge^{m} \bigvee^{d}\Bigr). \tag{22} \]

In the left subformula in (22), distributing the outer conjunction over the disjunction gives a disjunction of at most $m^p$ many $pd$-terms. In the right subformula in (22), distributing the two outer conjunctions over the disjunction gives a disjunction of at most $d^{nm}$ many $nm$-terms. Schematically:

\[ \Bigl(\bigvee^{m^p} \bigwedge^{p} \bigwedge^{d}\Bigr) \wedge \Bigl(\bigvee^{d^{nm}} \bigwedge^{nm}\Bigr). \tag{23} \]

Finally, in formula (23), distributing the outer conjunction over the disjunctions gives a disjunction of $m^p d^{nm}$ many $(pd+mn)$-terms:

\[ \bigvee^{m^p d^{nm}} \bigwedge^{pd+mn}. \tag{24} \]

Using $p + n \leq k$ we get $pd + mn \leq k(d+m)$ and $m^p d^{nm} \leq m^k d^{km}$, which gives the result. ∎
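The distribution argument in this proof can be replayed on small examples. The following Python sketch is only an illustration: the representation of literals as `(variable, sign)` pairs, the function names, and the toy substitution `sigma` are assumptions made here, not notation from the paper. It expands a substituted term into a DNF exactly as in (22) to (24) and then checks, by brute force, logical equivalence together with the width and size bounds of Lemma 5.

```python
from itertools import product

def substitute_term(term, sigma):
    """Expand a term under the substitution sigma into an equivalent DNF.

    term  : list of (variable, sign) literals, sign False meaning negated.
    sigma : dict sending each variable to a DNF, i.e. a list of terms.
    """
    factors = []  # each factor is a list of alternatives (sets of literals)
    for var, sign in term:
        dnf = sigma[var]
        if sign:
            # Positive literal: choose one of the at most m terms of the DNF.
            factors.append([frozenset(t) for t in dnf])
        else:
            # Negative literal: De Morgan gives a CNF; choose one negated
            # literal from each term of the DNF (at most d^m choices).
            choices = product(*[[(v, not s) for v, s in t] for t in dnf])
            factors.append([frozenset(c) for c in choices])
    # Distribute the outer conjunction over the disjunctions.
    return [frozenset().union(*combo) for combo in product(*factors)]

def eval_term(term, val):
    return all(val[v] == s for v, s in term)

def eval_dnf(dnf, val):
    return any(eval_term(t, val) for t in dnf)

def check_lemma5_bounds(term, sigma, d, m):
    """Brute-force check of equivalence and of the two bounds of Lemma 5."""
    k = len(term)
    dnf = substitute_term(term, sigma)
    variables = sorted({v for g in sigma.values() for t in g for v, _ in t})
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        direct = all(eval_dnf(sigma[v], val) == s for v, s in term)
        if direct != eval_dnf(dnf, val):
            return False
    width_ok = all(len(t) <= k * (d + m) for t in dnf)
    count_ok = len(dnf) <= m ** k * d ** (k * m)
    return width_ok and count_ok

# Hypothetical example with k = 2, d = 2, m = 2:
# X -> (u and v) or w,   Y -> (u and w) or (not v),   term = X and not Y.
sigma = {"X": [[("u", True), ("v", True)], [("w", True)]],
         "Y": [[("u", True), ("w", True)], [("v", False)]]}
term = [("X", True), ("Y", False)]
dnf = substitute_term(term, sigma)
```

The sketch keeps contradictory terms (those containing a literal and its negation) instead of pruning them, since they do not affect equivalence and the lemma only needs upper bounds.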

Lemma 6.

Fix any positive integers $q$, $d$, $m$ and $p$. Let $F$ and $G$ be sets of clauses with at most $q$ variables each, and let $\sigma$ be a substitution of the variables of $F$ into $d$-DNFs with at most $m$ many terms on the variables of $G$. For any positive integers $k$, $s$ and any $t \geq 2$, if $F$ has a Frege refutation of depth $t$, bottom fan-in $k$, and size $s$, and for each formula in $F$ its substitution is a logical consequence of at most $p$ many clauses from $G$, then $G$ has a Frege refutation of depth $t$, bottom fan-in $k(d+m)$, and size polynomial in $2^k$ and $s$.

Proof.

Fix some positive integers $q$, $d$, $m$ and $p$. Assume that $F$ has a $\Sigma_{t,k}$-Frege refutation, for some $k$ and $t \geq 2$. Let $\ell = k(d+m)$.

We now define an operator which maps formulas in $\Sigma_{t,k}$ to formulas in $\Sigma_{t,\ell}$. If a formula $D$ is a variable or the negation of a variable, then we define $D^+$ simply as the $d$-DNF or $d$-CNF obtained by applying the substitution $\sigma$ to $D$. For a $k$-DNF $D$, we put $D^+$ to be the $\ell$-DNF that one gets from applying Lemma 5 to each $k$-term in $D$ with the substitution $\sigma$ and then taking the disjunction of the resulting DNFs. For a $k$-CNF $D$, we define $D^+$ as the complement of $(\overline{D})^+$. In this case $D^+$ is an $\ell$-CNF. Clauses and terms are treated as $1$-DNFs and $1$-CNFs, respectively. Finally, if $D$ is a formula from $\Sigma_{t,k}$ of depth at least $3$, then we define $D^+$ to be the formula constructed by replacing each maximal subformula $E$ of $D$ of depth at most $2$ by $E^+$. By Lemma 5, the size of $D^+$ is at most polynomial in $2^k$ and the size of $D$.

If $D$ and $E$ are both formulas in $\Sigma_{t,k}$, then $(D \vee E)^+ = D^+ \vee E^+$. Moreover, for any $D$ it holds that $(\overline{D})^+ = \overline{(D^+)}$. Hence also $(D \wedge E)^+ = D^+ \wedge E^+$. This means that the result of applying our operator to the premises and conclusion of any of the rules of Frege is an instance of the same rule.

Let $D_1, D_2, \ldots, D_t$ be a $\Sigma_{t,k}$-Frege refutation of $F$ of size $s$. In order to transform the sequence of formulas $D_1^+, D_2^+, \ldots, D_t^+$ into a valid $\Sigma_{t,\ell}$-Frege refutation of $G$ we need to prove that for each non-logical axiom $D_i$, the formula $D_i^+$ has a constant-size $\Sigma_{t,\ell}$-Frege proof from $G$.

Each non-logical axiom $D_i$ is a $q$-clause $C$ from $F$. By assumption, the substitution $\sigma(C)$ and hence also $C^+$ is a logical consequence of at most $p$ many $q$-clauses of $G$. Moreover, the size of $C^+$ is bounded by a function of $d$, $m$ and $q$, and the total size of the $p$ many $q$-clauses of $G$ that imply $C^+$ is bounded by a function of $p$ and $q$. The quantitative completeness theorem for $\Sigma_{t,\ell}$-Frege does the rest: $D_i^+$ has a $\Sigma_{t,\ell}$-Frege derivation from $G$ of size bounded by a function of $d$, $m$, $p$ and $q$. ∎

3.2 Substitutions in algebraic and semi-algebraic proof systems

In the case of algebraic and semi-algebraic proof systems, a substitution is a mapping from variables to polynomials. Applying a substitution to an equation or inequality means replacing all variables by the corresponding polynomials, simultaneously all at once.

For every set of polynomial equations $F$, by $\mathrm{Eq}(F)$ we denote the union of $F$ and all the axiom polynomial equations from (2) for the variables in $F$, i.e., for each variable $X$ or $\bar{X}$ appearing in one of the equations from $F$, we add to $F$ the polynomial equations $X^2 - X = 0$, $\bar{X}^2 - \bar{X} = 0$ and $X + \bar{X} - 1 = 0$.

Lemma 7.

Fix any positive integers $d$, $m$, $p$, and $q$. Let $F$ and $G$ be sets of polynomial equations of the form $P = 0$, where $P$ is a monomial of degree at most $q$ with coefficient $1$, and let $\sigma$ be a substitution of the variables of $F$ into polynomials on the variables of $G$ of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$. For $\mathcal{P}$ being the Nullstellensatz or Polynomial Calculus proof system over any field, and for any positive integers $k$ and $s$, if $F$ has a $\mathcal{P}$ refutation of degree $k$ and size $s$, and for each equation in $\mathrm{Eq}(F)$ its substitution follows from at most $p$ many equations from $G$ on all evaluations of its variables in $\{0,1\}$ over the underlying field, then $G$ has a $\mathcal{P}$ refutation of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let us fix some positive integers $d$, $m$, $p$, and $q$. Let $F$ and $G$ be sets of polynomial equations of the form $P = 0$, where $P$ is a monomial of degree at most $q$ with coefficient $1$, and let $\sigma$ be a substitution of the variables of $F$ into polynomials on the variables of $G$ of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$. If for each equation in $\mathrm{Eq}(F)$ its substitution follows from at most $p$ many equations from $G$ on all evaluations of its variables in $\{0,1\}$, then by Theorem 4 for every equation in $\mathrm{Eq}(F)$ its substitution has an NS derivation from $G$. The size and degree of this derivation are bounded by some constants which depend on $d$, $m$, $p$, and $q$.

Suppose that $\mathcal{P}$ is the Nullstellensatz proof system and assume that for some positive integers $k$ and $s$, the set of equations $F$ has an NS refutation of degree $k$ and size $s$. The refutation of $F$ is of the form

\[ \sum_{i=1}^{r} P_i \cdot c_i \prod_{j \in J_i} X_j \prod_{k \in K_i} \bar{X}_k = -1, \tag{25} \]

where $P_1, \ldots, P_r$ are polynomials such that the equation $P_i = 0$ is in the set $\mathrm{Eq}(F)$, and $c_1, \ldots, c_r$ are scalars. We substitute the variables in the above equality according to $\sigma$ and substitute the polynomials from the set $\mathrm{Eq}(F)$ by their NS derivations. This way we obtain an NS refutation of $G$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Suppose that $\mathcal{P}$ is the Polynomial Calculus proof system and assume that for some positive integers $k$ and $s$, the set of equations $F$ has a PC refutation of degree $k$ and size $s$. The PC refutation of $G$ goes as follows: first for each equation in $\mathrm{Eq}(F)$ we derive its substitution in the Nullstellensatz proof system, and then we simulate the subsequent steps of the refutation of $F$. Applications of addition and multiplication by scalars remain as they were, and applications of multiplication by variables are simulated in several steps. Since after applying the substitution to the variables they become polynomials of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$, we can simulate multiplication by a variable by at most $md$ multiplication steps and at most $m-1$ additions. The substitution of variables causes a blow-up in size which is polynomial in $2^k$, and the simulation additionally increases the size by a constant factor. Altogether, the degree of the PC refutation of $G$ described above is linear in $k$ and its size is polynomial in $2^k$ and $s$. ∎

For every set of polynomial inequalities $F$, by $\mathrm{Ineq}(F)$ we denote the union of $F$ and all the axiom polynomial inequalities and equations from (4) and (2) for the variables in $F$, i.e., for each variable $X$ or $\bar{X}$ appearing in $F$, we add to $F$ the polynomial equations $X^2 - X = 0$, $\bar{X}^2 - \bar{X} = 0$, $X + \bar{X} - 1 = 0$, and the inequalities $X \geq 0$, $\bar{X} \geq 0$, $1 - X \geq 0$, $1 - \bar{X} \geq 0$, $1 \geq 0$.

Lemma 8.

Fix any positive integers $d$, $m$, $p$, and $q$. Let $F$ and $G$ be sets of polynomial equations of the form $P = 0$, where $P$ is a monomial of degree at most $q$ with coefficient $1$, and let $\sigma$ be a substitution of the variables of $F$ into polynomials on the variables of $G$ of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$. For $\mathcal{P}$ being the Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver proof system, and for any positive integers $k$ and $s$, if $F$ has a $\mathcal{P}$ refutation of degree $k$ and size $s$, and for each inequality and equation in $\mathrm{Ineq}(F)$ its substitution follows from at most $p$ many equations from $G$ on all evaluations of its variables in $\{0,1\}$, then $G$ has a $\mathcal{P}$ refutation of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let us fix some positive integers $d$, $m$, $p$, and $q$. Let $F$ and $G$ be sets of polynomial equations of the form $P = 0$, where $P$ is a monomial of degree at most $q$ with coefficient $1$, and let $\sigma$ be a substitution of the variables of $F$ into polynomials on the variables of $G$ of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$. If for an inequality or equation in $\mathrm{Ineq}(F)$ its substitution follows from at most $p$ many equations from $G$ on all evaluations of its variables in $\{0,1\}$, then by Theorem 4 such a substitution has an SA derivation from $G$. Moreover, the size and degree of those derivations are bounded by some constants which depend on $d$, $m$, $p$, and $q$.

Suppose that $\mathcal{P}$ is the Sherali-Adams or Positive Semidefinite Sherali-Adams proof system and assume that for some positive integers $k$ and $s$, the set of equations $F$ has an SA (or SA+) refutation of degree $k$ and size $s$. The refutation of $F$ is of the form

\[ \sum_{i=1}^{r} P_i \cdot c_i \prod_{j \in J_i} X_j \prod_{k \in K_i} \bar{X}_k = -1, \tag{26} \]

where $c_1, \ldots, c_r$ are reals and $P_1, \ldots, P_r$ are polynomials such that the equation $P_i = 0$ or the inequality $P_i \geq 0$ is in the set $\mathrm{Ineq}(F)$, or they are squares of polynomials when they are allowed (i.e., for SA+). We substitute the variables in the above equality according to $\sigma$ and substitute the polynomials from the set $\mathrm{Ineq}(F)$ by their SA derivations. This way we obtain an SA (or SA+) refutation of $G$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Suppose that $\mathcal{P}$ is the Sums-of-Squares proof system and assume that for some positive integers $k$ and $s$, the set of equations $F$ has an SOS refutation of degree $k$ and size $s$. The refutation of $F$ is of the form

\[ \sum_{i=1}^{r} P_i \cdot S_i = -1, \tag{27} \]

where $P_1, \ldots, P_r$ are polynomials such that either the equation $P_i = 0$ or the inequality $P_i \geq 0$ is in the set $\mathrm{Ineq}(F)$, or they are squares, and $S_1, \ldots, S_r$ are arbitrary polynomials. We substitute the variables in the above equality according to $\sigma$ and substitute the polynomials from the set $\mathrm{Ineq}(F)$ by their SA derivations. This way we obtain an SOS refutation of $G$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Suppose that $\mathcal{P}$ is the Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver proof system and assume that for some positive integers $k$ and $s$, the set of equations $F$ has an LS (or LS+) refutation of degree $k$ and size $s$. The refutation of $G$ goes as follows: first for each equation and inequality in $\mathrm{Ineq}(F)$ we derive its substitution in the Sherali-Adams proof system, and then we simulate the subsequent steps of the refutation of $F$. Applications of addition and multiplication by positive reals, as well as positivity-of-squares when it is allowed (i.e., for LS+), remain as they were, and applications of multiplication by variables are simulated in several steps. Since after applying the substitution to the variables they become polynomials of degree at most $d$, with at most $m$ many monomials and every coefficient equal to $1$, we can simulate multiplication by a variable by at most $md$ multiplication steps and at most $m-1$ additions. The substitution of variables causes a blow-up in size which is polynomial in $2^k$, and the simulation additionally increases the size by a constant factor. Altogether, the degree of the LS (or LS+) refutation of $G$ described above is linear in $k$ and its size is polynomial in $2^k$ and $s$. ∎

3.3 Simulations

In a later section we will need the known fact that both Polynomial Calculus and Sherali-Adams efficiently simulate resolution. If $C$ is the clause $\bigvee_{i \in I} \overline{X_i} \vee \bigvee_{j \in J} X_j$, let

\[ M(C) := \prod_{i \in I} X_i \prod_{j \in J} \bar{X}_j. \tag{28} \]

Note that, under the axioms (2), the clause $C$ is encoded by the equation $M(C) = 0$ or, in the context of semi-algebraic proofs, by the pair of inequalities $M(C) \geq 0$ and $-M(C) \geq 0$. In Section 2.6 we called this the multiplicative encoding of $C$.

Lemma 9.

If $C$ is a clause that has a resolution derivation of width $k$ and size $s$ from clauses $C_1, \ldots, C_m$, then the equation $M(C) = 0$ has both a PC proof over any field and an SA proof from $M(C_1) = 0, \ldots, M(C_m) = 0$, each of degree linear in $k$ and size polynomial in $s$ and $k$.

Proof.

Assume that $C$ has a resolution derivation of width $k$ and size $s$. Before we describe the conversions we need to apply a light pre-processing to the resolution derivation. Convert each resolution step deriving $D \vee E$ from $D \vee X$ and $E \vee \overline{X}$ into a symmetric resolution step in which first $D \vee E \vee X$ and $D \vee E \vee \overline{X}$ are derived by weakenings from $D \vee X$ and $E \vee \overline{X}$, respectively, and then $D \vee E$ is derived from these by resolving on $X$. Let $D_1, D_2, \ldots, D_t$ be the resulting resolution derivation. The proofs for Polynomial Calculus and for Sherali-Adams are quite different because the latter one is a static proof system while the former one is not.

For Polynomial Calculus, we derive the equation $M(D_i) = 0$ for $i = 1, \ldots, t$, by induction on $i$. When $D_i$ is a clause from the set $\{C_1, \ldots, C_m\}$, there is nothing to do. Assume now that $D_i$ is derived by a symmetric resolution step from $D_j = D_i \vee X$ and $D_k = D_i \vee \overline{X}$, where $j, k < i$. By induction hypothesis the equations $M(D_i)\bar{X} = 0$ and $M(D_i)X = 0$ have already been derived. Add these equations to the lift of the axiom $1 - X - \bar{X} = 0$ by $M(D_i)$ to get the equation $M(D_i) = 0$. Next assume that $D_i$ is derived by a weakening step from $D_j$, say $D_i = D_j \vee X$ or $D_i = D_j \vee \overline{X}$. By induction hypothesis the equation $M(D_j) = 0$ has already been derived. Lift this equation by $\bar{X}$ or $X$, respectively, to get $M(D_i) = 0$. Clearly, the degree of this proof is linear in $k$ and the size is polynomial in $s$ and $k$.

For Sherali-Adams the proof is quite different. For each $D_i$ in the resolution derivation we produce an inequality $Q_i \geq 0$ as follows. If $D_i$ is an initial clause, let $Q_i := -M(D_i)$ so that $Q_i \geq 0$ is one of the given inequalities. If $D_i$ is obtained as a one-variable weakening step deriving $D_j \vee X$ from $D_j$, let $Q_i := M(D_j) - M(D_j \vee X)$ and derive $Q_i \geq 0$ by lifting the axiom $1 - \bar{X} \geq 0$. If $D_i$ is obtained as a one-variable weakening step deriving $D_j \vee \overline{X}$ from $D_j$, let $Q_i := M(D_j) - M(D_j \vee \overline{X})$ and derive $Q_i \geq 0$ by lifting the axiom $1 - X \geq 0$. If $D_i$ is obtained as a symmetric resolution step deriving $D_i$ from $D_i \vee X$ and $D_i \vee \overline{X}$, let $Q_i := M(D_i \vee X) + M(D_i \vee \overline{X}) - M(D_i)$ and derive $Q_i \geq 0$ by lifting the axiom $\bar{X} + X - 1 \geq 0$. Next consider the DAG of the resolution derivation oriented from the initial clauses towards the conclusion $D_t$. We assign a weight $c_i$ to each $D_i$ in this DAG inductively: the conclusion $D_t$ gets weight $1$, and if all immediate successors of $D_i$ have already been assigned weights, then $D_i$ gets as weight the sum of the weights of its immediate successors. Next multiply each inequality $Q_i \geq 0$ by its weight $c_i$ and add them together. This could cause the coefficients in the SA proof to become exponentially large, but their bit size is still polynomial. The result is an SA proof of $-M(C) \geq 0$ since the only monomial that survives is the conclusion. The reverse inequality $M(C) \geq 0$ follows from lifting the axiom $1 \geq 0$. This gives an SA proof of $M(C) = 0$ as required. The degree of this proof is linear in $k$ and its size is polynomial in $s$ and $k$. ∎
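As a sanity check of both constructions, consider the smallest non-trivial case, chosen here purely for illustration: the clause $Y$ is derived from $C_1 = X \vee Y$ and $C_2 = \overline{X} \vee Y$ by a single resolution step, which is already symmetric. Writing the derivation as $D_1 = C_1$, $D_2 = C_2$, $D_3 = Y$, the PC and SA proofs of the lemma specialise as follows.

```latex
% Toy instance of Lemma 9: derive Y from C_1 = X \vee Y and C_2 = \overline{X} \vee Y.
\begin{align*}
M(C_1) &= \bar{X}\bar{Y}, \qquad M(C_2) = X\bar{Y}, \qquad M(Y) = \bar{Y}. \\
\text{PC:}\quad & \bar{X}\bar{Y} + X\bar{Y} + \bar{Y}\,(1 - X - \bar{X}) = \bar{Y}, \\
\text{SA:}\quad & Q_1 = -\bar{X}\bar{Y}, \quad Q_2 = -X\bar{Y}, \quad
  Q_3 = \bar{X}\bar{Y} + X\bar{Y} - \bar{Y} = \bar{Y}\,(\bar{X} + X - 1), \\
& c_1 = c_2 = c_3 = 1, \qquad c_1 Q_1 + c_2 Q_2 + c_3 Q_3 = -\bar{Y}.
\end{align*}
```

In the PC line, the first two summands are the hypotheses and the third is the lift of the axiom $1 - X - \bar{X} = 0$ by $\bar{Y}$. In the SA lines, $Q_1 \geq 0$ and $Q_2 \geq 0$ are given inequalities, $Q_3 \geq 0$ is a lift of $\bar{X} + X - 1 \geq 0$, and all monomials except the conclusion cancel in the weighted sum, which yields $-M(Y) \geq 0$.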

4 Closure under reductions

Three types of reductions are often considered in the context of constraint satisfaction problems: a) pp-interpretability, b) homomorphic equivalence, c) addition of constants to a core. In this section we give their precise definitions and show that many proof systems behave well with respect to those types of reductions.

4.1 Reductions.

Let $\mathbb{B}$ and $\mathbb{B}'$ be finite relational structures over finite vocabularies $L$ and $L'$, respectively. We say that the structure $\mathbb{B}'$ is pp-definable in the structure $\mathbb{B}$ if it has the same domain and for every relation symbol $T \in L'$ the relation $T(\mathbb{B}')$ is definable in $\mathbb{B}$ by a pp-formula. Recall that a primitive positive formula over $L$, or pp-formula, is a first-order formula which uses only symbols from $L$, equality, conjunction, and first-order existential quantification. A relation $T \subseteq B^r$ is definable in $\mathbb{B}$ by a pp-formula, or pp-definable in $\mathbb{B}$, if there exists a pp-formula $\phi(x_1, \ldots, x_r)$ over $L$, with free variables $x_1, \ldots, x_r$, such that

\[ T = \{(b_1, \ldots, b_r) \in B^r : \mathbb{B} \models \phi(x_1/b_1, \ldots, x_r/b_r)\}. \]
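For finite structures, the relation defined by a pp-formula can be computed directly from this definition. The following Python sketch does so by brute force; the encoding of a pp-formula as a list of atoms together with tuples of free and bound variables, as well as the example structure (a directed 3-cycle), are assumptions made for this illustration only.

```python
from itertools import product

def pp_define(domain, relations, atoms, free, bound):
    """Relation defined by the pp-formula  EXISTS bound . AND atoms.

    atoms : list of (symbol, variables); the symbol "=" denotes equality.
    free  : tuple of free variables, fixing the order of coordinates.
    bound : tuple of existentially quantified variables.
    """
    defined = set()
    for values in product(domain, repeat=len(free) + len(bound)):
        env = dict(zip(free + bound, values))
        ok = True
        for symbol, variables in atoms:
            image = tuple(env[x] for x in variables)
            if symbol == "=":
                ok = image[0] == image[1]
            else:
                ok = image in relations[symbol]
            if not ok:
                break
        if ok:
            # Project the satisfying assignment onto the free variables.
            defined.add(tuple(env[x] for x in free))
    return defined

# Hypothetical example: the directed 3-cycle and the formula
# phi(x, y) = EXISTS z . E(x, z) AND E(z, y), i.e. "two steps along E".
domain = [0, 1, 2]
relations = {"E": {(0, 1), (1, 2), (2, 0)}}
two_steps = pp_define(domain, relations,
                      [("E", ("x", "z")), ("E", ("z", "y"))],
                      ("x", "y"), ("z",))
```

On the directed 3-cycle, two steps forward coincide with one step backward, so the defined relation is the reverse of $E$.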

Pp-interpretability is a generalization of pp-definability which allows for changing the domain of a CSP language. Given two relational structures $\mathbb{B}$ and $\mathbb{B}'$ in finite vocabularies $L$ and $L'$, respectively, we say that $\mathbb{B}'$ is pp-interpretable in $\mathbb{B}$ if there exist a positive integer $n$ and a surjective partial function $f \colon B^n \rightarrow B'$ such that the preimages of all relations in $\mathbb{B}'$ (including the equality relation) and the domain of $f$ are pp-definable in $\mathbb{B}$. The fact that a CSP over a language $\mathbb{B}'$ pp-interpretable in the language $\mathbb{B}$ is not harder than the CSP of the language $\mathbb{B}$ itself [21] is one of the fundamental results of the so-called algebraic approach to constraint satisfaction problems, which led to many breakthrough results in the area.

Probably the simplest of all the constructions is homomorphic equivalence. Structures $\mathbb{B}$ and $\mathbb{B}'$ over a vocabulary $L$ are homomorphically equivalent if there exist a homomorphism from $\mathbb{B}$ to $\mathbb{B}'$ and a homomorphism from $\mathbb{B}'$ to $\mathbb{B}$. Since homomorphisms compose, if $L$-structures $\mathbb{B}$ and $\mathbb{B}'$ are homomorphically equivalent, then any $L$-structure $\mathbb{A}$ maps homomorphically to $\mathbb{B}$ if and only if it maps homomorphically to $\mathbb{B}'$. So the CSPs over both languages are the same.
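As a small illustration, the sketch below checks homomorphic equivalence by exhaustive search and confirms on a toy example that two equivalent templates accept the same instances. The function names and the choice of graphs are assumptions made for this example only.

```python
from itertools import product

def homomorphisms(dom_a, rels_a, dom_b, rels_b):
    """Yield every homomorphism from A to B as a dict (brute force)."""
    for values in product(dom_b, repeat=len(dom_a)):
        h = dict(zip(dom_a, values))
        if all(tuple(h[a] for a in tup) in rels_b[name]
               for name, tuples in rels_a.items() for tup in tuples):
            yield h

def maps_to(dom_a, rels_a, dom_b, rels_b):
    """True if at least one homomorphism from A to B exists."""
    return next(homomorphisms(dom_a, rels_a, dom_b, rels_b), None) is not None

def symmetric(edges):
    """Close a set of directed edges under reversal."""
    return set(edges) | {(v, u) for u, v in edges}

# An edge K_2 and the 4-cycle C_4 are homomorphically equivalent templates.
k2 = ([0, 1], {"E": symmetric({(0, 1)})})
c4 = ([0, 1, 2, 3], {"E": symmetric({(0, 1), (1, 2), (2, 3), (3, 0)})})
# Two instances: a triangle (unsatisfiable) and a path (satisfiable).
k3 = ([0, 1, 2], {"E": symmetric({(0, 1), (1, 2), (2, 0)})})
p3 = ([0, 1, 2], {"E": symmetric({(0, 1), (1, 2)})})

equivalent = maps_to(*k2, *c4) and maps_to(*c4, *k2)
```

The search is exponential in the size of the instance, so it is meant only to spell out the definition on very small structures.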

Homomorphic equivalence allows us to focus on studying constraint satisfaction problems of well-behaved structures which in this context turn out to be those exhibiting little symmetry. A finite relational structure is called a core if all its endomorphisms are surjective. It is known that every relational structure has a homomorphically equivalent substructure that is a core. Core structures can be extended by one-element unary relations which we refer to as constants, without increasing the complexity of the language [21].

The importance of the constructions a), b) and c) follows from the fact that classes of constraint languages closed under those constructions can be studied via the corresponding algebras of polymorphisms, that is algebras of operations which preserve all the relations in the language (for details see e.g. the survey [13]). Here we show that bounded-DNF Frege, bounded-depth Frege, Frege, Polynomial Calculus, Sherali-Adams, Sums-of-Squares and Lovász-Schrijver of bounded and unbounded degree behave well with respect to those three types of reductions. This allows us to apply (in Section 6) strong results based on the algebraic approach to CSP.

4.2 Results

Let us fix relational structures $\mathbb{B}$ and $\mathbb{B}'$ over finite vocabularies $L$ and $L'$, respectively, such that $\mathbb{B}'$ is obtained from $\mathbb{B}$ by a finite sequence of constructions a), b) and c). In the following we recall the known polynomial-time computable transformation that maps instances $\mathbb{A}$ of $\mathrm{CSP}(\mathbb{B}')$ to instances $\mathbb{A}'$ of $\mathrm{CSP}(\mathbb{B})$ such that $\mathbb{A}'$ is satisfiable if and only if $\mathbb{A}$ is satisfiable, and the size of $\mathbb{A}'$ is linear in the size of $\mathbb{A}$. The notation is supposed to remind the reader that once a template $\mathbb{B}'$ is constructed from a template $\mathbb{B}$, the transformation of instances goes in the other direction: from an instance $\mathbb{A}$ of $\mathrm{CSP}(\mathbb{B}')$ we build an instance $\mathbb{A}'$ of $\mathrm{CSP}(\mathbb{B})$ satisfying the above-mentioned conditions.

We prove that if $E$ and $E'$ are any local propositional encoding schemes for $\mathrm{CSP}(\mathbb{B})$ and $\mathrm{CSP}(\mathbb{B}')$, respectively, then this transformation satisfies the following:

Theorem 5.

For any positive integers $t$, $k$ and $s$, and any $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $E(\mathbb{A}')$ of depth $t$, bottom fan-in $k$, and size $s$, then there is a Frege refutation of $E'(\mathbb{A})$ of depth $t$, bottom fan-in polynomial in $k$, and size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$.

As a special case of the above theorem, by taking $t=1$ we obtain that bounded-DNF Frege behaves well with respect to the classical CSP reductions:

Corollary 2.

For any positive integers $k$ and $s$, and any $L'$-structure $\mathbb{A}$, if there is a $k$-DNF Frege refutation of $E(\mathbb{A}')$ of size $s$, then there is an $\ell$-DNF Frege refutation of $E'(\mathbb{A})$ of size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$, where $\ell$ is polynomial in $k$.

Notice also that a Frege refutation of depth $t$ and bottom fan-in $k$ can be seen as a Frege refutation of depth $t+1$ and bottom fan-in $1$. Therefore, Theorem 5 implies the following statement, which will be crucial for obtaining lower bounds in Section 6:

Corollary 3.

For any positive integers $t$ and $s$, and any $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $E(\mathbb{A}')$ of depth $t$ and size $s$, then there is a Frege refutation of $E'(\mathbb{A})$ of depth $t+1$ and size polynomial in $s$ and the size of $\mathbb{A}$.

One more consequence of Theorem 5 concerns proofs in the Frege proof system without any bounds on the depth. Corollary 3 above immediately implies that Frege is well-behaved with respect to the classical CSP reductions, that is:

Corollary 4.

For any positive integer $s$, and any $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $E(\mathbb{A}')$ of size $s$, then there is a Frege refutation of $E'(\mathbb{A})$ of size polynomial in $s$ and the size of $\mathbb{A}$.

In the case of algebraic proof systems, if $E$ and $E'$ are any local algebraic encoding schemes over a field $F$ for $\mathrm{CSP}(\mathbb{B})$ and $\mathrm{CSP}(\mathbb{B}')$, respectively, we show that:

Theorem 6.

For any positive integers $k$ and $s$, and any $L'$-structure $\mathbb{A}$, if there is a PC refutation over $F$ of $E(\mathbb{A}')$ of degree $k$ and size $s$, then there is a PC refutation over $F$ of $E'(\mathbb{A})$ of degree linear in $k$ and size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$.

Finally, if $E$ and $E'$ are any local semi-algebraic encoding schemes for $\mathrm{CSP}(\mathbb{B})$ and $\mathrm{CSP}(\mathbb{B}')$, respectively, then:

Theorem 7.

For any positive integers $k$ and $s$, and any $L'$-structure $\mathbb{A}$, if there is an SA, SA+, SOS, LS or LS+ refutation of $E(\mathbb{A}')$ of degree $k$ and size $s$, then there is, respectively, an SA, SA+, SOS, LS or LS+ refutation of $E'(\mathbb{A})$ of degree linear in $k$ and size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$.

We point out that Theorem 7 in the case of the Sherali-Adams and Sums-of-Squares proof systems and the $\mathrm{EQ}$ encoding scheme can be extracted from [48] and [47].

The main idea in proving the above theorems for all the proof systems under consideration is the same. The refutation for an instance $\mathbb{A}'$ of $\mathrm{CSP}(\mathbb{B})$ is transformed into a refutation for an instance $\mathbb{A}$ of $\mathrm{CSP}(\mathbb{B}')$ by substituting the variables of $E(\mathbb{A}')$ by DNFs with a bounded number of terms and a bounded number of literals in each term, or by polynomials with bounded degree, a bounded number of monomials and all coefficients equal to $1$. The additional condition we need to ensure is that each element of $E(\mathbb{A}')$ after applying the substitution is a logical consequence of a subset of $E'(\mathbb{A})$ of bounded size. This way we can use Lemmas 6, 7 and 8 from Section 3 to control the growth of the size and depth/degree of the refutations. This argument, however, fails if one of the steps in constructing $\mathbb{B}'$ from $\mathbb{B}$ is adding the equality relation (which is a special case of a pp-definition). We deal with this by showing that equality propagation can be done in bounded-width resolution.

We prove Theorems 5, 6 and 7 for CNF and EQ encoding schemes in a series of lemmas below. It follows from Lemma 4 that this suffices to obtain the theorems in full generality. Let us see how to argue this for propositional proof systems. The reasoning in the case of algebraic and semi-algebraic proof systems is analogous.

Proof of Theorem 5.

Assume that the statement of the theorem holds for $E$ and $E'$ being the CNF encoding scheme. Let now $E$ and $E'$ be arbitrary local propositional encoding schemes for $\mathrm{CSP}(\mathbb{B})$ and $\mathrm{CSP}(\mathbb{B}')$, respectively.

By Lemma 4 there exist positive integers $p$ and $p'$ such that for each $L$-structure $\mathbb{A}'$, every clause in $E(\mathbb{A}')$ has a resolution proof from $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of size bounded by $p$, and for each $L'$-structure $\mathbb{A}$, every clause in $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ has a resolution proof from $E'(\mathbb{A})$ of size bounded by $p'$.

Take an $L'$-structure $\mathbb{A}$, and assume that there is a Frege refutation of $E(\mathbb{A}')$ of depth $t$, bottom fan-in $k$, and size $s$. Since every clause in $E(\mathbb{A}')$ has a resolution proof from $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of size bounded by $p$, it follows that $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ has a Frege refutation of depth $t$, bottom fan-in $k$, and size linear in $s$. The statement of the theorem holds for the CNF encoding schemes, so $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ has a Frege refutation of depth $t$, bottom fan-in polynomial in $k$, and size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$. Since every clause in $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ has a resolution proof from $E'(\mathbb{A})$ of size bounded by $p'$, it follows that $E'(\mathbb{A})$ has a Frege refutation of depth $t$, bottom fan-in polynomial in $k$, and size polynomial in $2^{k}$, $s$ and the size of $\mathbb{A}$. ∎

In the subsequent sections we consider one by one the cases when $\mathbb{B}'$ is constructed from $\mathbb{B}$ using a), b) and c). We begin with pp-definability, which we handle in three steps: the equality relation, pp-formulas using conjunction only, and pp-formulas using existential quantification only.

4.3 Equality

Suppose that none of the relation symbols in $L$ interprets in $\mathbb{B}$ as the equality relation. For a binary relation symbol $E$ not in $L$, let $L'=L\cup\{E\}$. Assume that $\mathbb{B}'$ is the $L'$-structure with domain $B$, all relation symbols from $L$ interpreted as in $\mathbb{B}$, i.e., $R(\mathbb{B}')=R(\mathbb{B})$ for every $R\in L$, and the relation symbol $E$ interpreted as the equality relation over $B$, i.e., $E(\mathbb{B}')=\{(b,b):b\in B\}$.

For every instance of the CSP of the language $\mathbb{B}'$, that is for every finite $L'$-structure $\mathbb{A}$, there is a natural corresponding instance $\mathbb{A}'$ of the CSP over the language $\mathbb{B}$. If $\equiv$ is the smallest equivalence relation on $A$ which contains $E(\mathbb{A})$, then define $\mathbb{A}'$ to be the $L$-structure whose domain $A'$ is the set of the equivalence classes of the relation $\equiv$ and every relation symbol $R\in L$ is interpreted as $\{([a_{1}]_{\equiv},\ldots,[a_{r}]_{\equiv}):(a_{1},\ldots,a_{r})\in R(\mathbb{A})\}$, where $r$ is the arity of $R$. It is not difficult to see that $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if $\mathbb{A}'$ maps homomorphically to $\mathbb{B}$.
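The quotient construction can be sketched with a union-find structure. This is only an illustration under stated assumptions: the function name `quotient_instance` is hypothetical, relations are stored as sets of tuples, and each equivalence class is represented by one of its elements rather than by the class itself.

```python
def quotient_instance(domain, relations, eq_name="E"):
    """Build A' from A by contracting the classes of the smallest
    equivalence relation containing the pairs in relations[eq_name].

    Returns the new domain (one representative per class), the relations
    over the representatives, and the map sending each element of A to
    the representative of its class.
    """
    parent = {a: a for a in domain}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in relations[eq_name]:
        parent[find(a)] = find(b)

    rep = {a: find(a) for a in domain}
    new_domain = set(rep.values())
    new_relations = {
        name: {tuple(rep[a] for a in tup) for tup in tuples}
        for name, tuples in relations.items() if name != eq_name
    }
    return new_domain, new_relations, rep

# Toy instance: R = {(1, 2), (3, 4)} and the equality constraint E(2, 3).
new_domain, new_relations, rep = quotient_instance(
    [1, 2, 3, 4], {"R": {(1, 2), (3, 4)}, "E": {(2, 3)}})
```

In this toy instance the elements 2 and 3 are merged, so the two tuples of R become a path through the merged class, and the new instance is no larger than the original one.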

Lemma 10.

There exists a positive integer $c$ such that the following holds. For any positive integers $t$, $k$ and $s$ and every finite $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in $k$ and size at most $(c|A|+1)s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$. For each $[a]_{\equiv}\in A'$, we choose one element of $[a]_{\equiv}$ denoted by $a^{*}$, and consider the substitution $\sigma$ of variables in $F$ defined by:

$$X([a]_{\equiv},b):=X(a^{*},b),$$

for every $[a]_{\equiv}\in A'$ and $b\in B$. We show that, for a constant $c$ to be determined later, for every clause $C$ from $F$, the substituted formula $\sigma(C)$ has a resolution proof from $G$ of size at most $c|A|$. It follows that $G$ has a Frege refutation of depth $t$, bottom fan-in $k$ and size at most $(c|A|+1)s$.

Let $C$ be any of the clauses in $F$. Note that if $C$ is of type 1 or 2 then by applying the substitution we obtain a clause in $G$, so there is nothing to be proved. Now, let us assume that $C$ is a clause of type 3, i.e., $C=\overline{X([a_{1}]_{\equiv},b_{1})}\vee\cdots\vee\overline{X([a_{r}]_{\equiv},b_{r})}$ for some $R\in L$ of arity $r$, $([a_{1}]_{\equiv},\ldots,[a_{r}]_{\equiv})\in R(\mathbb{A}')$ and $(b_{1},\ldots,b_{r})\in B^{r}\setminus R(\mathbb{B})$. The following claim will finish the proof. We state the bound on width because it will be useful later. By $q$ we denote the number of elements in $B$.

Claim 2.

There are constants $c$ and $d$ such that the clause $\sigma(C)$ has a resolution derivation from $G$ of width at most $d$ and size at most $c|A|$.

To prove this claim the following observation will be helpful.

Claim 3.

There is a constant $e$ such that for every $a,a'\in A$ such that $a\equiv a'$, and every $b'\in B$, there is a resolution proof of $\overline{X(a,b')}$ from $\overline{X(a',b')}$ and clauses in $G$ of width at most $e$, length at most $(2q+1)|A|$ and size at most $(2q+1)^{2}|A|$.

We use Claim 3 to prove Claim 2. Note that for every $i\in[r]$ there exists $a'_{i}\in[a_{i}]_{\equiv}$ such that $(a'_{1},\ldots,a'_{r})\in R(\mathbb{A})$. Therefore, the clause $\overline{X(a'_{1},b_{1})}\vee\cdots\vee\overline{X(a'_{r},b_{r})}$ belongs to $G$. Now, since $a^{*}_{1}\equiv a'_{1}$, it follows from Claim 3 that there is a resolution derivation of width at most $e$, length at most $(2q+1)|A|$ and size bounded by $(2q+1)^{2}|A|$ of $\overline{X(a^{*}_{1},b_{1})}$ from $\overline{X(a'_{1},b_{1})}$ and clauses in $G$. If we reproduce exactly the same derivation starting with the clause $\overline{X(a'_{1},b_{1})}\vee\cdots\vee\overline{X(a'_{r},b_{r})}$ instead of $\overline{X(a'_{1},b_{1})}$, what we get is a valid resolution derivation of $\overline{X(a^{*}_{1},b_{1})}\vee\overline{X(a'_{2},b_{2})}\vee\cdots\vee\overline{X(a'_{r},b_{r})}$ of width at most $e+r$ and size at most $(2q+1)^{2}|A|+(2q+1)(2r-2)|A|$.
We repeat the same construction $r-1$ more times starting with the last clause derived and get a resolution derivation of $\sigma(C)$ whose width is bounded by $re$ and whose size is bounded by $((2q+1)^{2}r+(2q+1)(2r-2)r)|A|$. It remains to prove Claim 3.

Proof of Claim 3. First let us show that for every $a,a'\in A$ such that $(a,a')\in E(\mathbb{A})$ or $(a',a)\in E(\mathbb{A})$, and every $b'\in B$ there is a resolution proof of width at most $q$, length at most $2q+1$ and size bounded by $(2q+1)^{2}$ of $\overline{X(a,b')}$ from $\overline{X(a',b')}$ and the clauses in $G$. Indeed, the cut rule applied to $\overline{X(a',b')}$ and the formula $\bigvee_{b\in B}X(a',b)$ from $G$ gives $\bigvee_{b\in B'}X(a',b)$, where $B'=B\setminus\{b'\}$. Then by a sequence of $q-1$ cuts with formulas $\overline{X(a,b')}\vee\overline{X(a',b)}$, for $b\in B'$, we derive $\overline{X(a,b')}$. The total number of formulas in this sequence is $2q+1$, and each has width at most $q$ and size at most $2q+1$.

Now, let $a=a_{1},\ldots,a_{m}=a'$ be a sequence of elements of $A$ such that $(a_{i},a_{i+1})\in E(\mathbb{A})$ or $(a_{i+1},a_{i})\in E(\mathbb{A})$, and let us assume that this is one of the shortest sequences with this property. The statement of the claim then follows from the fact that $m\leq|A|$. ∎
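The single propagation step used in the proof of Claim 3 can be replayed mechanically. The sketch below is an illustration only: the representation of literals as triples and the function names `resolve` and `one_step_derivation` are assumptions made here. It builds the derivation for one pair of elements related by $E$ and lets one check the stated length and width bounds on a small domain.

```python
def resolve(c1, c2, literal):
    """Cut rule: c1 contains the negation of literal, c2 contains literal."""
    positive, x, b = literal
    negated = (not positive, x, b)
    assert negated in c1 and literal in c2
    return (c1 - {negated}) | (c2 - {literal})

def one_step_derivation(a, a2, b_prime, B):
    """Derive not-X(a, b') from not-X(a2, b') when a and a2 are E-related.

    Literals are triples (positive, element, value) and clauses are
    frozensets of literals. Returns every clause used or derived, in order.
    """
    def neg(x, b):
        return (False, x, b)

    def pos(x, b):
        return (True, x, b)

    start = frozenset({neg(a2, b_prime)})
    total = frozenset(pos(a2, b) for b in B)   # type 1 clause for a2
    current = resolve(start, total, pos(a2, b_prime))
    proof = [start, total, current]
    for b in B:
        if b == b_prime:
            continue
        # Type 3 clause of the equality constraint: a and a2 cannot take
        # the distinct values b' and b.
        axiom = frozenset({neg(a, b_prime), neg(a2, b)})
        current = resolve(axiom, current, pos(a2, b))
        proof += [axiom, current]
    return proof

# Domain B = {0, 1, 2}, so q = 3.
proof = one_step_derivation("a", "a2", 0, [0, 1, 2])
```

For $q=3$ the list contains $2q+1=7$ clauses, none wider than $q$, and ends with the unit clause $\overline{X(a,b')}$, in line with the counts in the proof.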

Lemma 11.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For any positive integers $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree linear in $k$ and size polynomial in $|A|$ and $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. As in the proof of Lemma 10 above, for each $[a]_{\equiv}\in A'$, we choose one element of $[a]_{\equiv}$ denoted by $a^{*}$, and consider the substitution $\sigma$ of the variables in $F$ defined by:

$$X([a]_{\equiv},b):=X(a^{*},b),\qquad\bar{X}([a]_{\equiv},b):=\bar{X}(a^{*},b),$$

for every $[a]_{\equiv}\in A'$ and $b\in B$.

We show that every equation from $\mathrm{Eq}(F)$ after applying the substitution $\sigma$ has a PC derivation from $\mathrm{Eq}(G)$ of constant degree and size polynomial in $|A|$. Once we have this, the proof for $\mathcal{P}$ being the Polynomial Calculus proof system follows the same lines as the proof of Lemma 7. Similarly, for (Positive Semidefinite) Sherali-Adams, Sums-of-Squares or (Positive Semidefinite) Lovász-Schrijver proof systems, the proof follows the same lines as the proof of Lemma 8 once we show that every inequality from $\mathrm{Ineq}(F)$ after applying the substitution $\sigma$ has an SA derivation from $\mathrm{Ineq}(G)$ of constant degree and size polynomial in $|A|$.

Note that by applying the substitution to equations of type 1 and 2, and to the axiom equations and inequalities we obtain equations and inequalities from $\mathrm{Eq}(G)$ and $\mathrm{Ineq}(G)$ so there is nothing to be proved. Now, consider an equation of type 3 from $F$, i.e., $X([a_{1}]_{\equiv},b_{1})\cdot\ldots\cdot X([a_{r}]_{\equiv},b_{r})=0$ for some $R\in L$ of arity $r$, $([a_{1}]_{\equiv},\ldots,[a_{r}]_{\equiv})\in R(\mathbb{A}')$ and $(b_{1},\ldots,b_{r})\in B^{r}\setminus R(\mathbb{B})$. In the terminology of Section 3.3, this equality is the multiplicative encoding $M(C)=0$ of the corresponding clause $C$ from $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$. Note that $\sigma(C)$ is again a clause because $\sigma$ is a substitution by variables. Let us call this clause $D$. The substitution of the equation we are considering is the multiplicative encoding $M(D)=0$ of the clause $D$. Now, by Claim 2 that we proved inside the proof of Lemma 10, there is a resolution derivation of $D$ from $G$ of some constant width $d$ and size linear in $|A|$. Since $G$ is the set of multiplicative encodings of the clauses in $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$, Lemma 9 applies and we get both PC and SA proofs of $M(D)=0$ from $G$ of degree linear in $d$ and size polynomial in $|A|$. This is a constant degree, and the proof is complete. ∎

4.4 Conjunction

We now consider the case when the structure $\mathbb{B}'$ is pp-definable from $\mathbb{B}$ by adding a single relation pp-definable using conjunction only. Let $S$ and $P$ be relation symbols in $L$, let $T$ be a relation symbol not in $L$, let $L'=L\cup\{T\}$, and assume that $\mathbb{B}'$ is the expansion of $\mathbb{B}$ with the relation $T(\mathbb{B}')$ defined using a pp-formula $\phi(x_{1},\ldots,x_{r})$, where $r$ is the arity of $T$, that is made of a conjunction of one atom on $S$ and one atom on $P$. That is, $R(\mathbb{B}')=R(\mathbb{B})$ for every $R\in L$, and $T(\mathbb{B}')=\{(b_{1},\ldots,b_{r})\in B^{r}:\mathbb{B}\models\phi(x_{1}/b_{1},\ldots,x_{r}/b_{r})\}$. To focus our attention let us assume that $S$ and $P$ are binary, $T$ is ternary, and that the pp-formula that defines $T$ is $\phi(x_{1},x_{2},x_{3})=S(x_{1},x_{2})\wedge P(x_{2},x_{3})$. The proof of the general case will be the same.

For a finite $L'$-structure $\mathbb{A}$ the corresponding $L$-structure $\mathbb{A}'$ has the same domain and all the relation symbols except for $S$ and $P$ interpreted the same as in $\mathbb{A}$. Moreover, $S(\mathbb{A}')=S(\mathbb{A})\cup\{(a_{1},a_{2}):(a_{1},a_{2},a_{3})\in T(\mathbb{A})\text{ for some }a_{3}\}$, and $P(\mathbb{A}')=P(\mathbb{A})\cup\{(a_{2},a_{3}):(a_{1},a_{2},a_{3})\in T(\mathbb{A})\text{ for some }a_{1}\}$. It is easy to see that $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if $\mathbb{A}'$ maps homomorphically to $\mathbb{B}$.
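The transformation of instances for this special case is short enough to write out. The following Python sketch assumes relations are stored as sets of tuples under the names "S", "P" and "T"; the function name `conjunction_instance` is hypothetical.

```python
def conjunction_instance(relations):
    """Map an instance A over L' = L + {T} to the instance A' over L,
    for T(x1, x2, x3) defined as S(x1, x2) AND P(x2, x3).

    The domain is unchanged, so only the relations are returned.
    """
    new_relations = {name: set(tuples)
                     for name, tuples in relations.items() if name != "T"}
    new_relations.setdefault("S", set())
    new_relations.setdefault("P", set())
    # Each T-constraint is replaced by one S-constraint and one P-constraint.
    for a1, a2, a3 in relations.get("T", ()):
        new_relations["S"].add((a1, a2))
        new_relations["P"].add((a2, a3))
    return new_relations

# Toy instance with one S-constraint and one T-constraint.
result = conjunction_instance({"S": {(1, 2)}, "P": set(), "T": {(2, 3, 4)}})
```

Each tuple of $T(\mathbb{A})$ contributes at most two new tuples, which is why the size of $\mathbb{A}'$ stays linear in the size of $\mathbb{A}$.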

Lemma 12.

For any positive integers $t$, $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in $k$ and size linear in $s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$. Observe that the variables of $F$ and $G$ as well as the clauses of type 1 and 2 are the same. Below we show that every clause $C$ of type 3 in $F$ is a logical consequence of a bounded number of clauses of $G$. It follows that every clause $C$ of type 3 in $F$ has a resolution derivation from $G$ of size bounded by some constant $c$, hence $G$ has a Frege refutation of depth $t$, bottom fan-in $k$ and size at most $cs+s$.

Let $C$ be a clause $\overline{X(a_{1},b_{1})}\vee\cdots\vee\overline{X(a_{r},b_{r})}$ for some natural number $r$, $R\in L$ of arity $r$, $(a_{1},\ldots,a_{r})\in R(\mathbb{A}')$, and $(b_{1},\ldots,b_{r})\in B^{r}\setminus R(\mathbb{B})$. If $R\not\in\{S,P\}$ then $C$ is also a clause of $G$ and there is nothing to be proved. Without loss of generality let us assume that $R=S$ and hence $C$ is of the form $\overline{X(a_{1},b_{1})}\vee\overline{X(a_{2},b_{2})}$ where $(a_{1},a_{2})\in S(\mathbb{A}')$, and $(b_{1},b_{2})\in B^{2}\setminus S(\mathbb{B})$. Now if $(a_{1},a_{2})\in S(\mathbb{A})$ then $C$ is a clause of $G$ and we are done. Otherwise, there exists $a_{3}\in A'$ such that $(a_{1},a_{2},a_{3})\in T(\mathbb{A})$ and for every $b_{3}\in B$ there is a clause $\overline{X(a_{1},b_{1})}\vee\overline{X(a_{2},b_{2})}\vee\overline{X(a_{3},b_{3})}$ in $G$. Indeed, since $(b_{1},b_{2})\in B^{2}\setminus S(\mathbb{B})$ we have that $(b_{1},b_{2},b_{3})\in B^{3}\setminus T(\mathbb{B}')$. The number of such clauses is bounded by $q^{\ell}$, where $q$ is the number of elements in $B$ and $\ell$ is the arity of $T$. Those clauses together with the clause of type 1 for $a_{3}$ logically imply $C$. ∎

Lemma 13.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For any positive integers $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree linear in $k$ and size linear in $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. Observe that the variables of $F$ and $G$ as well as the equations of type 1 and 2, and the axiom equations and inequalities are the same. Below we show that each equation of type 3 in $F$ follows from a bounded number of equations in $\mathrm{Eq}(G)$ on all evaluations of its variables in $\{0,1\}$. The argument is analogous to the one in Lemma 12 above. It follows that every equation in $F$ has NS and SA derivations from $G$ of degrees and sizes bounded by some constants. As in the proofs of Lemmas 7 and 8, this implies that $G$ has a $\mathcal{P}$ refutation of degree linear in $k$ and size linear in $s$.

Let $P=0$ be an equation $X(a_{1},b_{1})\cdot\ldots\cdot X(a_{r},b_{r})=0$ for some natural number $r$, $R\in L$ of arity $r$, $(a_{1},\ldots,a_{r})\in R(\mathbb{A}')$, and $(b_{1},\ldots,b_{r})\in B^{r}\setminus R(\mathbb{B})$. If $R\not\in\{S,P\}$ then the equation $P=0$ is also in $G$ and there is nothing to be proved. Without loss of generality let us assume that $R=S$ and hence $P=0$ is of the form $X(a_{1},b_{1})X(a_{2},b_{2})=0$ where $(a_{1},a_{2})\in S(\mathbb{A}')$, and $(b_{1},b_{2})\in B^{2}\setminus S(\mathbb{B})$. Now if $(a_{1},a_{2})\in S(\mathbb{A})$ then the equation $P=0$ is in $G$ and we are done. Otherwise, there exists $a_{3}\in A'$ such that $(a_{1},a_{2},a_{3})\in T(\mathbb{A})$ and for every $b_{3}\in B$ there is an equation $X(a_{1},b_{1})X(a_{2},b_{2})X(a_{3},b_{3})=0$ in $G$. The number of such equations is bounded by $q^{\ell}$, where $q$ is the number of elements in the domain of $\mathbb{B}$ and $\ell$ is the arity of $T$. The equation in question follows on all evaluations of its variables in $\{0,1\}$ from the set $\mathrm{Eq}(G')$, where $G'$ is the set of those at most $q^{\ell}$ equations together with the equation of type 1 for $a_{3}$. ∎

4.5 Existential quantification

We now consider the case when the structure $\mathbb{B}'$ is pp-definable from $\mathbb{B}$ by adding a single relation definable using existential quantification only. Let $S$ be a relation symbol in $L$, let $T$ be a relation symbol not in $L$, let $L'=L\cup\{T\}$, and assume that $\mathbb{B}'$ is the expansion of $\mathbb{B}$ with the relation $T(\mathbb{B}')$ defined using a pp-formula $\phi(x_{1},\ldots,x_{r})$, where $r$ is the arity of $T$, that is made of the existential quantification of one variable over an atom on $S$. That is, $R(\mathbb{B}')=R(\mathbb{B})$ for every $R\in L$, and $T(\mathbb{B}')=\{(b_{1},\ldots,b_{r})\in B^{r}:\mathbb{B}\models\phi(x_{1}/b_{1},\ldots,x_{r}/b_{r})\}$. To focus our attention let us assume that $S$ is ternary, $T$ is binary, and that the pp-formula that defines $T$ is $\phi(x_{1},x_{2})=\exists y\,S(x_{1},x_{2},y)$.

For a finite $L'$-structure $\mathbb{A}$ the corresponding $L$-structure $\mathbb{A}'$ has domain $A$ extended by a set of witnesses for $S$. For each $(a_1,a_2)\in T(\mathbb{A})$, we add to $A$ a new point $y(a_1,a_2)$ so the domain $A'$ is equal to $A\cup\{y(a_1,a_2):(a_1,a_2)\in T(\mathbb{A})\}$. All the relation symbols from $L$ except for $S$ are interpreted in $\mathbb{A}'$ the same as in $\mathbb{A}$, and $S(\mathbb{A}')=S(\mathbb{A})\cup\{(a_1,a_2,y(a_1,a_2)):(a_1,a_2)\in T(\mathbb{A})\}$. It is not difficult to see that $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if $\mathbb{A}'$ maps homomorphically to $\mathbb{B}$.
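The witness construction can be checked by brute force on small instances. The following is a minimal sketch, assuming $S$ is ternary and $T$ is binary as above; the function names `add_witnesses` and `has_hom`, the encoding of a witness $y(a_1,a_2)$ as the tuple `("y", a1, a2)`, and the example structures are illustrative choices that do not come from the paper.

```python
from itertools import product

def add_witnesses(A_dom, S_A, T_A):
    """Return the domain and the S-relation of A' built from A (hypothetical helper)."""
    # One fresh point y(a1, a2) per tuple of T(A), encoded as ("y", a1, a2).
    witnesses = {(a1, a2): ("y", a1, a2) for (a1, a2) in T_A}
    dom = list(A_dom) + list(witnesses.values())
    S_new = set(S_A) | {(a1, a2, w) for (a1, a2), w in witnesses.items()}
    return dom, S_new

def has_hom(dom, rels_A, B_dom, rels_B):
    """Exhaustive search for a homomorphism; relations are dicts symbol -> set of tuples."""
    dom = list(dom)
    for values in product(B_dom, repeat=len(dom)):
        h = dict(zip(dom, values))
        if all(tuple(h[a] for a in t) in rels_B[R]
               for R, ts in rels_A.items() for t in ts):
            return True
    return False

# Example target: B = {1, 2} with S(B) = {(1,2,1), (2,1,2)}, so T(B') is disequality.
B_dom = [1, 2]
S_B = {(1, 2, 1), (2, 1, 2)}
T_B = {(b1, b2) for (b1, b2, _) in S_B}

def both_sides(T_A, A_dom=(0, 1, 2)):
    """Compare 'A maps to B-prime' with 'A-prime maps to B' for an instance with S(A) empty."""
    dom2, S2 = add_witnesses(A_dom, set(), T_A)
    left = has_hom(A_dom, {"S": set(), "T": T_A}, B_dom, {"S": S_B, "T": T_B})
    right = has_hom(dom2, {"S": S2}, B_dom, {"S": S_B})
    return left, right
```

On this example target, `both_sides` returns the same truth value on both sides for a triangle and for a path in $T(\mathbb{A})$, in agreement with the equivalence stated above.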

Lemma 14.

For any positive integers $t$, $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in polynomial in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$. Assume for this proof that $B=[q]$. For each $b\in[q]$ we define a subset $F(b)$ of $T(\mathbb{B}')$ inductively as follows:

$$F(b)=\big\{(i,j)\in[q]^2:(i,j,b)\in S(\mathbb{B}')\big\}\setminus(F(1)\cup\cdots\cup F(b-1)).$$

Note that $F(1),\ldots,F(q)$ cover $T(\mathbb{B}')$ and are pairwise disjoint. In other words, they partition $T(\mathbb{B}')$; note however that some of the sets $F(b)$ may be empty.
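The inductive definition of the sets $F(b)$ can be sketched as follows. This is only an illustration under the assumption that $B=[q]$ and that $S(\mathbb{B})$ is given as a set of triples; the function name `witness_partition` and the example relation are hypothetical.

```python
def witness_partition(q, S_B):
    """Compute F(1), ..., F(q): F(b) collects the pairs (i, j) whose least witness is b."""
    F, seen = {}, set()
    for b in range(1, q + 1):
        # Pairs witnessed by b, minus those already placed in F(1), ..., F(b-1).
        F[b] = {(i, j) for (i, j, c) in S_B if c == b} - seen
        seen |= F[b]
    return F

# Example with q = 3; the pair (1, 1) has witnesses 1 and 2 and lands in F(1).
S_B = {(1, 1, 1), (1, 1, 2), (1, 2, 2), (2, 3, 2)}
F = witness_partition(3, S_B)
T_B = {(i, j) for (i, j, _) in S_B}
```

In this example the three sets are pairwise disjoint, their union is `T_B`, and `F[3]` is empty, which illustrates the remark that some of the sets may be empty.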

Consider the substitution $\sigma$ defined by the identity on all variables of $G$ and defined as follows for every variable in $F$ that is not in $G$:

$$X(y(a_1,a_2),b):=\bigvee_{(b_1,b_2)\in F(b)}X(a_1,b_1)\wedge X(a_2,b_2),$$

for every $(a_1,a_2)\in T(\mathbb{A})$ and $b\in[q]$. Note that this is an $\ell$-DNF with at most $q^{\ell}$ many terms, where $\ell$ is the arity of $T$. By Lemma 6 it suffices to check that, for each clause $C$ of $F$, the substituted formula $\sigma(C)$ is a logical consequence of a bounded number of clauses of $G$.

To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b\in[q]}X(a,b)$ for $a$ in the domain of $\mathbb{A}'$. If $a$ is not of the form $y(a_1,a_2)$, then the clause is left untouched by the substitution. Since the same clause is also in $G$, there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. The substituted formula is then the following:

$$\bigvee_{b\in[q]}\bigvee_{(b_1,b_2)\in F(b)}X(a_1,b_1)\wedge X(a_2,b_2).$$

Since the sets $F(1),\ldots,F(q)$ cover $T(\mathbb{B}')$, this is indeed equivalent to

$$\bigvee_{(b_1,b_2)\in T(\mathbb{B}')}X(a_1,b_1)\wedge X(a_2,b_2).$$

Note now that this formula is a logical consequence of the following clauses of $G$: those of type 1 for $a_1$ and $a_2$, and all those of type 3 for $(a_1,a_2)$ and the relation symbol $T$. These are at most $q^{\ell}+2$ many clauses, where $\ell$ is the arity of $T$, and we are done for this case.

Suppose now that $C$ is the clause $\overline{X(a,b)}\vee\overline{X(a,b')}$ for $a$ in the domain of $\mathbb{A}'$ and $(b,b')\in B^2$ with $b\not=b'$. As in the previous case, if $a$ is not of the form $y(a_1,a_2)$, then the clause is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. By applying the substitution $\sigma$ and converting the resulting formula into negation normal form we obtain the following:

$$\Big(\bigwedge_{(b_1,b_2)\in F(b)}\overline{X(a_1,b_1)}\vee\overline{X(a_2,b_2)}\Big)\vee\Big(\bigwedge_{(b_1,b_2)\in F(b')}\overline{X(a_1,b_1)}\vee\overline{X(a_2,b_2)}\Big).$$

This formula says that the tuple $(a_1,a_2)$ is either not mapped to any tuple in $F(b)$ or not mapped to any tuple in $F(b')$. Since the sets $F(b)$ and $F(b')$ are disjoint, this is a logical consequence of at most $2q^2$ many clauses of $G$: those of type 2 for $a_1$ and $a_2$. Indeed, those clauses imply that the tuple $(a_1,a_2)$ can be mapped to at most one tuple from $B^2$.

Now, let $C$ be the clause $\overline{X(a_1,b_1)}\vee\cdots\vee\overline{X(a_r,b_r)}$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A}')$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. If $(a_1,\ldots,a_r)\in A^r$ then the same argument as above shows that there is nothing to be proved. Observe that the only other case is when $R=S$ and $C$ is of the form $\overline{X(a_1,b_1)}\vee\overline{X(a_2,b_2)}\vee\overline{X(y(a_1,a_2),b_3)}$, where $(a_1,a_2)\in T(\mathbb{A})$ and $(b_1,b_2,b_3)\in B^3\setminus S(\mathbb{B})$. The substituted formula (after converting to negation normal form) is then the following:

$$\overline{X(a_1,b_1)}\vee\overline{X(a_2,b_2)}\vee\bigwedge_{(b,b')\in F(b_3)}\overline{X(a_1,b)}\vee\overline{X(a_2,b')}.$$

There are two possibilities. If $(b_1,b_2)\in B^2\setminus T(\mathbb{B}')$, then the formula above is a logical consequence of the clause $\overline{X(a_1,b_1)}\vee\overline{X(a_2,b_2)}$ from $G$. Otherwise, we have that $(b_1,b_2)\in T(\mathbb{B}')$, but $(b_1,b_2,b_3)\in B^3\setminus S(\mathbb{B})$, which means that $(b_1,b_2)\not\in F(b_3)$. Observe that the substituted formula says that the tuple $(a_1,a_2)$ is not mapped to the tuple $(b_1,b_2)$ or it is not mapped to any tuple from $F(b_3)$. Similarly to the previous case, this is a logical consequence of at most $2q^2$ many clauses of $G$: those of type 2 for $a_1$ and $a_2$. This is because those clauses imply that the tuple $(a_1,a_2)$ can be mapped to at most one tuple from $B^2$. ∎

Lemma 15.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For any positive integers $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. Assume that $B=[q]$. For each $b\in[q]$ define a subset $F(b)$ of $T(\mathbb{B}')$ as in Lemma 14 above. Consider the substitution $\sigma$ defined by the identity on all variables of $G$ and defined as follows for every variable in $F$ that is not in $G$:

$$\begin{aligned}
X(y(a_1,a_2),b) &:= \sum_{(b_1,b_2)\in F(b)}X(a_1,b_1)X(a_2,b_2),\\
\bar{X}(y(a_1,a_2),b) &:= \sum_{(b_1,b_2)\in B^2\setminus F(b)}X(a_1,b_1)X(a_2,b_2),
\end{aligned}$$

for every $(a_1,a_2)\in T(\mathbb{A})$ and $b\in[q]$. Note that those are polynomials of degree $m$ with at most $q^m$ many monomials and all coefficients equal to $1$, where $m$ is the arity of $T$. We will show that for each equation in $F$ and for each axiom inequality and equation, its substitution follows on all evaluations of its variables in $\{0,1\}$ from a bounded number of equations in $\mathrm{Eq}(G)$. By Lemmas 7 and 8 this implies the statement of the lemma. The argument is analogous to that of Lemma 14 above.

Let $P=0$ be any of the equations in $F$, say $\prod_{b\in B}\bar{X}(a,b)=0$ for $a$ in the domain of $\mathbb{A}'$. If $a$ is not of the form $y(a_1,a_2)$, then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. The substituted equation is then the following:

$$\prod_{b\in B}\Big(\sum_{(b_1,b_2)\in B^2\setminus F(b)}X(a_1,b_1)X(a_2,b_2)\Big)=0.$$

This equation follows on all evaluations of its variables in $\{0,1\}$ from the set $\mathrm{Eq}(G')$, where $G'$ contains the following equations of $G$: those of type 1 and 2 for $a_1$ and $a_2$, and all those of type 3 for $(a_1,a_2)$ and the relation symbol $T$. Indeed, take any evaluation satisfying $\mathrm{Eq}(G')$. It corresponds to a mapping from $\{a_1,a_2\}$ to $B$, where $(a_1,a_2)$ is mapped to a pair $(b_1,b_2)$ in $T(\mathbb{B}')$. Since the sets $F(1),\ldots,F(q)$ form a partition of $T(\mathbb{B}')$, there is $b\in B$ such that $(b_1,b_2)\in F(b)$. For such $b$ it holds that $X(a_1,b_1)X(a_2,b_2)=0$ whenever $(b_1,b_2)\in B^2\setminus F(b)$. There are at most $q^{\ell}+2q^2+2$ many equations in $G'$, where $\ell$ is the arity of $T$, so we are done for this case.

Suppose now that $P=0$ is the equation $X(a,b)X(a,b')=0$ for $a$ in the domain of $\mathbb{A}'$ and $(b,b')\in B^2$ with $b\not=b'$. As in the previous case, if $a$ is not of the form $y(a_1,a_2)$, then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. By applying the substitution $\sigma$ we obtain the following:

$$\Big(\sum_{(b_1,b_2)\in F(b)}X(a_1,b_1)X(a_2,b_2)\Big)\cdot\Big(\sum_{(b_1,b_2)\in F(b')}X(a_1,b_1)X(a_2,b_2)\Big)=0.$$

Since the sets $F(b)$ and $F(b')$ are disjoint, this equation follows on all evaluations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a_1$ and $a_2$. Indeed, those equations imply that for at most one of the pairs $(b_1,b_2)\in B^2$ the product $X(a_1,b_1)X(a_2,b_2)$ is $1$.

Now, let $P=0$ be the equation $X(a_1,b_1)\cdot\ldots\cdot X(a_r,b_r)=0$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A}')$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. If $(a_1,\ldots,a_r)\in A^r$ then the same argument as above shows that there is nothing to be proved. Observe that the only other case is when $R=S$ and the equation is of the form

$$X(a_1,b_1)X(a_2,b_2)X(y(a_1,a_2),b_3)=0,$$

where $(a_1,a_2)\in T(\mathbb{A})$ and $(b_1,b_2,b_3)\in B^3\setminus S(\mathbb{B})$. The substituted equation is then the following:

$$X(a_1,b_1)X(a_2,b_2)\cdot\Big(\sum_{(b,b')\in F(b_3)}X(a_1,b)X(a_2,b')\Big)=0.$$

There are two possibilities. If $(b_1,b_2)\in B^2\setminus T(\mathbb{B}')$, then the equation above follows on all evaluations of its variables in $\{0,1\}$ from the equation $X(a_1,b_1)X(a_2,b_2)=0$ from $G$. Otherwise, we have that $(b_1,b_2)\in T(\mathbb{B}')$, but $(b_1,b_2,b_3)\in B^3\setminus S(\mathbb{B})$, which means that $(b_1,b_2)\in B^2\setminus F(b_3)$. In this case, the substituted equation follows on all evaluations of its variables in $\{0,1\}$ from the set of all equations of type 2 for $a_1$ and $a_2$, which imply that for at most one of the pairs $(b_1,b_2)\in B^2$ the product $X(a_1,b_1)X(a_2,b_2)$ is $1$.

Let us consider the axiom equation $X(a,b)^2-X(a,b)=0$ for $a$ in the domain of $\mathbb{A}'$ and $b\in B$. If $a$ is not of the form $y(a_1,a_2)$, then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. By applying the substitution $\sigma$ we obtain the following:

$$\Big(\sum_{(b_1,b_2)\in F(b)}X(a_1,b_1)X(a_2,b_2)\Big)^2-\sum_{(b_1,b_2)\in F(b)}X(a_1,b_1)X(a_2,b_2)=0.$$

This equation follows on all evaluations of its variables in $\{0,1\}$ from $\mathrm{Eq}(G')$ where $G'$ is the set of equations of type 1 and 2 for $a_1$ and $a_2$.

Let us consider the axiom equation $X(a,b)+\bar{X}(a,b)-1=0$ for $a$ in the domain of $\mathbb{A}'$ and $b\in B$. If $a$ is not of the form $y(a_1,a_2)$, then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. By applying the substitution $\sigma$ we obtain the following:

$$\sum_{(b_1,b_2)\in B^2}X(a_1,b_1)X(a_2,b_2)-1=0.$$

This equation follows on all evaluations of its variables in $\{0,1\}$ from $\mathrm{Eq}(G')$ where $G'$ is the set of equations of type 1 and 2 for $a_1$ and $a_2$.

Let us consider the axiom inequality $1-X(a,b)\geq 0$, for $a$ in the domain of $\mathbb{A}'$ and $b\in B$. If $a$ is not of the form $y(a_1,a_2)$, then the inequality is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1,a_2)$. By applying the substitution $\sigma$ we obtain the following:

$$1-\sum_{(b_1,b_2)\in F(b)}X(a_1,b_1)X(a_2,b_2)\geq 0.$$

This inequality follows on all evaluations of its variables in $\{0,1\}$ from $\mathrm{Eq}(G')$ where $G'$ is the set of equations of type 1 and 2 for $a_1$ and $a_2$. They imply that at most one of the products $X(a_1,b_1)X(a_2,b_2)$ for $(b_1,b_2)\in F(b)$ is equal to $1$. We deal in the same way with the case in which the inequality in question is the axiom inequality $1-\bar{X}(a,b)\geq 0$, for $a$ in the domain of $\mathbb{A}'$ and $b\in B$.

Finally, the axiom inequalities $X(a,b)\geq 0$ and $\bar{X}(a,b)\geq 0$, for $a$ in the domain of $\mathbb{A}'$ and $b\in B$, after applying the substitution $\sigma$ are always satisfied on evaluations of their variables in $\{0,1\}$. ∎

4.6 All together: pp-interpretations

Let $\mathbb{B}'$ be a finite $L'$-structure pp-interpretable in $\mathbb{B}$, and let $f\colon B^n\rightarrow B'$ be a surjective partial function such that the domain of $f$ is defined by a pp-formula $\delta(x_1,\ldots,x_n)$ in the language $L$, i.e., $f^{-1}(B')=\{(b_1,\ldots,b_n)\in B^n:\mathbb{B}\models\delta(x_1/b_1,\ldots,x_n/b_n)\}$, the preimage of the equality relation on $B'$ is defined by a pp-formula $\epsilon(x_1,\ldots,x_{2n})$ in the language $L$, i.e., $f^{-1}(\{(b',b'):b'\in B'\})=\{((b_1,\ldots,b_n),(b_{n+1},\ldots,b_{2n}))\in(B^n)^2:\mathbb{B}\models\epsilon(x_1/b_1,\ldots,x_{2n}/b_{2n})\}$, and for every relation symbol $R\in L'$ of arity $r$, the preimage of the relation $R(\mathbb{B}')$ is defined by a pp-formula $\varphi_R(x_1,\ldots,x_{rn})$ in the vocabulary $L$, i.e.,

$$f^{-1}(R(\mathbb{B}'))=\{((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))\in(B^n)^r:\mathbb{B}\models\varphi_R(x_1/b_1,\ldots,x_{rn}/b_{rn})\}.$$

Consider the set $f^{-1}(B')\subseteq B^n$ quotiented by the equivalence relation $f^{-1}(\{(b',b'):b'\in B'\})\subseteq(B^n)^2$. For every equivalence class $[(b_1,\ldots,b_n)]$ we choose a representative $(b_1,\ldots,b_n)^*$. The $L'$-structure whose domain is the set of all representatives and for each $R\in L'$ of arity $r$ the relation $R$ interpreted as $\{((b_1,\ldots,b_n)^*,\ldots,(b_{nr-n+1},\ldots,b_{nr})^*):\mathbb{B}\models\varphi_R(x_1/b_1,\ldots,x_{rn}/b_{rn})\}$ is isomorphic to $\mathbb{B}'$. From now on whenever we talk about the structure $\mathbb{B}'$ we mean the structure that we have just defined.

We now define a structure $\mathbb{B}''$ pp-definable in $\mathbb{B}$ and show intuitively that small refutations for $\mathbb{B}''$ imply small refutations for $\mathbb{B}'$. By the results of previous sections it follows that small refutations for $\mathbb{B}$ imply small refutations for $\mathbb{B}'$. To this end, for every relation symbol $R\in L'$ of arity $r$, let $\hat{R}$ be a relation symbol of arity $nr$, and let $L''=\{\hat{R}:R\in L'\}$. We define $\mathbb{B}''$ to be the finite $L''$-structure with domain $B$ and relations defined by $\hat{R}(\mathbb{B}'')=\{(b_1,\ldots,b_{rn})\in B^{rn}:\mathbb{B}\models\varphi_R(x_1/b_1,\ldots,x_{rn}/b_{rn})\}$, for each $\hat{R}\in L''$ of arity $rn$.

For every instance $\mathbb{A}$ of the CSP of the language $\mathbb{B}'$, that is, for every finite $L'$-structure $\mathbb{A}$, the corresponding instance of the CSP of the language $\mathbb{B}''$ is the $L''$-structure $\mathbb{A}''$ whose domain $A''$ is $A\times[n]$ and whose relations are defined by

$$\hat{R}(\mathbb{A}'')=\{((a_1,1),\ldots,(a_1,n),(a_2,1),\ldots,(a_r,n)):(a_1,\ldots,a_r)\in R(\mathbb{A})\},$$

for each $\hat{R}\in L''$ of arity $rn$. It is not difficult to see that $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if $\mathbb{A}''$ maps homomorphically to $\mathbb{B}''$.
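The passage from $\mathbb{A}$ to $\mathbb{A}''$ is a purely syntactic blow-up of each element into $n$ copies. A minimal sketch, assuming relations are stored as a dictionary from symbol names to sets of tuples; the function name `blow_up` and the naming convention `R_hat` for $\hat{R}$ are illustrative and not taken from the paper.

```python
def blow_up(A_dom, rels_A, n):
    """Build the domain A x [n] and the relations of A'' from an L'-structure A."""
    dom = [(a, i) for a in A_dom for i in range(1, n + 1)]
    rels = {
        # Each tuple (a_1, ..., a_r) becomes ((a_1,1), ..., (a_1,n), ..., (a_r,n)).
        R + "_hat": {tuple((a, i) for a in t for i in range(1, n + 1)) for t in ts}
        for R, ts in rels_A.items()
    }
    return dom, rels

# Example: one binary tuple (u, v) and n = 2 gives one tuple of arity 4.
dom, rels = blow_up(["u", "v"], {"R": {("u", "v")}}, 2)
```

Each relation of arity $r$ thus yields a relation of arity $nr$ with the same number of tuples, so the size of the instance grows only by a factor depending on $n$.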

Lemma 16.

For any positive integers $t$, $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A}'',\mathbb{B}'')$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in polynomial in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A}'',\mathbb{B}'')$, let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$, and let $q$ denote the number of elements in $B$. For each $i\in[n]$ and each $b\in B$ we define $F(b,i)$ to be the set of those tuples in $B'\subseteq B^n$ which have $b$ on their $i$-th coordinate, i.e.,

$$F(b,i)=\{(b_1,\ldots,b_n)\in B':b_i=b\}.$$

Observe that for a fixed $i$ the sets $F(b,i)$ are disjoint subsets of $B'$ and they cover the whole $B'$. In other words, they partition $B'$; note however that some of the sets $F(b,i)$ may be empty.
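The sets $F(b,i)$ can be tabulated directly from the chosen representatives. The sketch below assumes $B'$ is given as a set of $n$-tuples over $B$; the function name `coordinate_classes` and the example data are hypothetical.

```python
def coordinate_classes(B_dom, B_prime, n):
    """Return F[(b, i)]: the tuples of B' whose i-th coordinate (1-based) equals b."""
    return {
        (b, i): {t for t in B_prime if t[i - 1] == b}
        for b in B_dom
        for i in range(1, n + 1)
    }

# Example: B = {0, 1}, n = 2, and B' consists of the representatives (0,0) and (0,1).
B_dom = [0, 1]
B_prime = {(0, 0), (0, 1)}
F = coordinate_classes(B_dom, B_prime, 2)
```

In this example, for $i=1$ the class `F[(1, 1)]` is empty while `F[(0, 1)]` is all of `B_prime`, and for $i=2$ the two classes are singletons; for each fixed $i$ the classes partition `B_prime`.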

Consider the following substitution $\sigma$ of the variables of $F$:

$$X((a,i),b):=\bigvee_{(b_1,\ldots,b_n)\in F(b,i)}X(a,(b_1,\ldots,b_n)).$$

Note that this is a clause with at most $q^{n-1}$ many literals, and hence a $1$-DNF with at most $q^{n-1}$ many terms. By Lemma 6 it suffices to check that, for each clause $C$ of $F$, the substituted formula $\sigma(C)$ is a logical consequence of a bounded number of clauses of $G$.

To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b\in B}X((a,i),b)$ for $(a,i)\in A\times[n]$. The substituted formula is then the following:

$$\bigvee_{b\in B}\ \bigvee_{(b_1,\ldots,b_n)\in F(b,i)}X(a,(b_1,\ldots,b_n)).$$

Since for each $i\in[n]$ the sets $F(b,i)$ partition $B'$, this is equivalent to

$$\bigvee_{(b_1,\ldots,b_n)\in B'}X(a,(b_1,\ldots,b_n)),$$

which is the clause of type 1 for $a$ in $G$. Hence, we are done for this case.

Suppose now that $C$ is the clause $\overline{X((a,i),b)}\vee\overline{X((a,i),b')}$ for $(a,i)\in A\times[n]$ and $(b,b')\in B^2$ with $b\not=b'$. If either of the sets $F(b,i)$ or $F(b',i)$ is empty, then $X((a,i),b)$ (or $X((a,i),b')$ respectively) is substituted by the empty formula and $\sigma(C)$ is true so there is nothing to be proved. Otherwise, the substituted formula (after converting to negation normal form) is the following:

$$\Big(\bigwedge_{(b_1,\ldots,b_n)\in F(b,i)}\overline{X(a,(b_1,\ldots,b_n))}\Big)\vee\Big(\bigwedge_{(b_1,\ldots,b_n)\in F(b',i)}\overline{X(a,(b_1,\ldots,b_n))}\Big),$$

and it says that either $a$ is not mapped to any of the elements in $F(b,i)$, or it is not mapped to any of the elements in $F(b',i)$. Since the sets $F(b,i)$ and $F(b',i)$ are disjoint, this formula is a logical consequence of at most $q^n(q^n-1)/2$ clauses of $G$: those of type 2 for $a$. Indeed, those clauses imply that the element $a$ can be mapped to at most one tuple from $B'$.

Now, let C𝐶C be the clause X((a1,1),b1)¯X((ar,n),bnr)¯¯𝑋subscript𝑎11subscript𝑏1¯𝑋subscript𝑎𝑟𝑛subscript𝑏𝑛𝑟\overline{X((a_{1},1),b_{1})}\vee\cdots\vee\overline{X((a_{r},n),b_{nr})} for some R^L′′^𝑅superscript𝐿′′\hat{R}\in L^{\prime\prime} of arity nr𝑛𝑟nr, (a1,,ar)R(𝔸)subscript𝑎1subscript𝑎𝑟𝑅𝔸(a_{1},\ldots,a_{r})\in R({\mathbb{A}}), and (b1,,bnr)BnrR^(𝔹′′)subscript𝑏1subscript𝑏𝑛𝑟superscript𝐵𝑛𝑟^𝑅superscript𝔹′′(b_{1},\ldots,b_{nr})\in B^{nr}\setminus\hat{R}({\mathbb{B}^{\prime\prime}}). If for some j[r]𝑗delimited-[]𝑟j\in[r] and some i[n]𝑖delimited-[]𝑛i\in[n] the set F(bnjn+i,i)𝐹subscript𝑏𝑛𝑗𝑛𝑖𝑖F(b_{nj-n+i},i) is empty, then the variable X((aj,i),bnjn+i)𝑋subscript𝑎𝑗𝑖subscript𝑏𝑛𝑗𝑛𝑖X((a_{j},i),b_{nj-n+i}) is substituted by the empty formula, in which case σ(C)𝜎𝐶\sigma(C) is true, and there is nothing to be proved. Otherwise, the substituted formula is a qn1superscript𝑞𝑛1q^{n-1}-DNF: for each j[r]𝑗delimited-[]𝑟j\in[r] and each i[n]𝑖delimited-[]𝑛i\in[n] there is a term which says that ajsubscript𝑎𝑗a_{j} is not mapped to any tuple from F(bnjn+i,i)𝐹subscript𝑏𝑛𝑗𝑛𝑖𝑖F(b_{nj-n+i},i), that is, ajsubscript𝑎𝑗a_{j} is not mapped to any tuple in Bsuperscript𝐵B^{\prime} which has bnjn+isubscript𝑏𝑛𝑗𝑛𝑖b_{nj-n+i} on the i𝑖i-th coordinate. There are two cases: either the tuple ((b1,,bn),,(bnrn+1,,bnr))subscript𝑏1subscript𝑏𝑛subscript𝑏𝑛𝑟𝑛1subscript𝑏𝑛𝑟((b_{1},\ldots,b_{n}),\ldots,(b_{nr-n+1},\ldots,b_{nr})) belongs to (B)rsuperscriptsuperscript𝐵𝑟(B^{\prime})^{r} or not. In the second case without loss of generality let us assume that (b1,,bn)Bsubscript𝑏1subscript𝑏𝑛superscript𝐵(b_{1},\ldots,b_{n})\not\in B^{\prime}. In particular this means that i[n]F(bi,i)=subscript𝑖delimited-[]𝑛𝐹subscript𝑏𝑖𝑖\bigcap_{i\in[n]}F(b_{i},i)=\emptyset. Then we argue that the formula

\[\Big(\bigwedge_{(b'_1,\ldots,b'_n)\in F(b_1,1)}\overline{X(a_1,(b'_1,\ldots,b'_n))}\Big)\vee\cdots\vee\Big(\bigwedge_{(b'_1,\ldots,b'_n)\in F(b_n,n)}\overline{X(a_1,(b'_1,\ldots,b'_n))}\Big),\]

and hence also the substituted formula $\sigma(C)$ is a logical consequence of $q^n(q^n-1)/2+1$ clauses of $G$: those of type 1 and 2 for $a_1$. Indeed, those $q^n(q^n-1)/2+1$ clauses imply that $a_1$ is mapped to exactly one element from $B'$. Since $\bigcap_{i\in[n]}F(b_i,i)=\emptyset$, this in turn implies that there exists $i$ such that $a_1$ is not mapped to any tuple from $F(b_i,i)$ and we are done. Otherwise, if the tuple $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))$ belongs to $(B')^r$, then the substituted formula is a logical consequence of at most $rq^n(q^n-1)/2+1$ clauses of $G$: the clauses of type 2 for $a_1,\ldots,a_r$ and the clause of type 3 for $R\in L'$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$, and $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))\in(B')^r\setminus R(\mathbb{B}')$. To see this, note that those $rq^n(q^n-1)/2+1$ clauses imply that the tuple $(a_1,\ldots,a_r)$ is mapped to at most one tuple from $(B')^r$ and is not mapped to $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))$. This in turn implies that for some $j\in[r]$ and some $i\in[n]$, $a_j$ is not mapped to any tuple in $B'$ which has $b_{nj-n+i}$ on the $i$-th coordinate, and we are done. More formally, the $rq^n(q^n-1)/2+1$ clauses in question imply that for every $j\in[r]$ at most one of the variables in

\[\bigcup_{i\in[n]}\{X(a_j,(b'_1,\ldots,b'_n)):(b'_1,\ldots,b'_n)\in F(b_{nj-n+i},i)\}\]

is true. Since for every $j\in[r]$, we have that $\bigcap_{i\in[n]}F(b_{nj-n+i},i)=\{(b_{nj-n+1},\ldots,b_{nj})\}$, and at least one of the variables $X(a_1,(b_1,\ldots,b_n)),\ldots,X(a_r,(b_{nr-n+1},\ldots,b_{nr}))$ is false, this implies that there exists $j\in[r]$ such that for some $i\in[n]$ each of the variables in $\{X(a_j,(b'_1,\ldots,b'_n)):(b'_1,\ldots,b'_n)\in F(b_{nj-n+i},i)\}$ is false, which finishes the proof in this case. ∎

Lemma 17.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For any positive integers $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A}'',\mathbb{B}'')$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A}'',\mathbb{B}'')$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. For each $i\in[n]$ and each $b\in B$ we define $F(b,i)$ as in the proof of Lemma 16 above, and we consider the following substitution $\sigma$ of the variables of $F$:

\begin{align*}
X((a,i),b) &:= \sum_{(b_1,\ldots,b_n)\in F(b,i)} X(a,(b_1,\ldots,b_n)),\\
\bar{X}((a,i),b) &:= \sum_{(b_1,\ldots,b_n)\in B'\setminus F(b,i)} X(a,(b_1,\ldots,b_n)).
\end{align*}

Those are polynomials of degree $1$ with at most $q^n$ monomials and all coefficients equal to $1$. We will show that for each equation in $F$ and for each axiom inequality and equation, its substitution follows on all evaluations of its variables in $\{0,1\}$ from a bounded number of equations in $\mathrm{Eq}(G)$. By Lemmas 7 and 8 this implies the statement of the lemma.

Let $P=0$ be any of the equations in $F$, say $\prod_{b\in B}\bar{X}((a,i),b)=0$ for $(a,i)\in A\times[n]$. The substituted equation is then the following:

\[\prod_{b\in B}\Big(\sum_{(b_1,\ldots,b_n)\in B'\setminus F(b,i)}X(a,(b_1,\ldots,b_n))\Big)=0.\]

Since for each $i\in[n]$ the sets $F(b,i)$ partition $B'$, this equation follows on all evaluations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a$. This set of equations implies that $X(a,(b_1,\ldots,b_n))$ is $1$ for at most one element of $B'$.

Suppose now that $P=0$ is the equation $X((a,i),b)\,X((a,i),b')=0$ for $(a,i)\in A\times[n]$ and $(b,b')\in B^2$ with $b\neq b'$. Since the sets $F(b,i)$ and $F(b',i)$ are disjoint, the substituted equation follows on all evaluations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a$ in $G$.

Now, let $P=0$ be the equation $X((a_1,1),b_1)\cdot\ldots\cdot X((a_r,n),b_{nr})=0$ for some $\hat{R}\in L''$ of arity $nr$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$, and $(b_1,\ldots,b_{nr})\in B^{nr}\setminus\hat{R}(\mathbb{B}'')$. If for some $j\in[r]$ and some $i\in[n]$ the set $F(b_{nj-n+i},i)$ is empty, then the variable $X((a_j,i),b_{nj-n+i})$ is substituted by $0$ and the substituted equation is always satisfied. Otherwise, there are two cases: either the tuple $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))$ belongs to $(B')^r$ or not. In the second case without loss of generality let us assume that $(b_1,\ldots,b_n)\notin B'$. In particular this means that $\bigcap_{i\in[n]}F(b_i,i)=\emptyset$. Hence, the set of equations of type 2 for $a_1$ in $G$ implies

\[\Big(\sum_{(b'_1,\ldots,b'_n)\in F(b_1,1)}X(a_1,(b'_1,\ldots,b'_n))\Big)\cdot\ldots\cdot\Big(\sum_{(b'_1,\ldots,b'_n)\in F(b_n,n)}X(a_1,(b'_1,\ldots,b'_n))\Big)=0.\]

Otherwise, if the tuple $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))$ belongs to $(B')^r$, then the substituted equation follows on all evaluations of its variables in $\{0,1\}$ from the equations of type 2 for $a_1,\ldots,a_r$, the equation of type 3 for $R\in L'$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$, and $((b_1,\ldots,b_n),\ldots,(b_{nr-n+1},\ldots,b_{nr}))\in(B')^r\setminus R(\mathbb{B}')$.

For the axiom inequalities and equations the proof is the same as in Lemma 15. ∎

4.7 Homomorphic equivalence

Now let $\mathbb{B}'$ be a finite $L$-structure homomorphically equivalent to $\mathbb{B}$. Any $L$-structure $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if it maps homomorphically to $\mathbb{B}$. We fix some homomorphism from $\mathbb{B}'$ to $\mathbb{B}$ and denote it by $h$.

Lemma 18.

For any positive integers $k$, $t$ and $s$, and every finite $L$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in polynomial in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ and let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$. Consider the substitution $\sigma$ defined as follows for every variable in $F$:

\[X(a,b):=\bigvee_{b'\in h^{-1}(b)}X(a,b'),\]

for every $a\in A$ and $b\in B$. By Lemma 6 it suffices to check that, for each clause $C$ of $F$, the substituted formula $\sigma(C)$ is a logical consequence of a bounded number of clauses of $G$.

To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b\in B}X(a,b)$ for $a$ in the domain of $\mathbb{A}$. Observe that $\sigma(C)$ is $\bigvee_{b\in B'}X(a,b)$, which is a clause that belongs to $G$.

Suppose now that $C$ is the clause $\overline{X(a,b)}\vee\overline{X(a,b')}$ for $a\in A$ and $(b,b')\in B^2$ with $b\neq b'$. The substituted clause says that $a$ is either not mapped to any of the elements in $h^{-1}(b)$ or it is not mapped to any of the elements in $h^{-1}(b')$. If any of those sets is empty, then $\sigma(C)$ is true. Otherwise, since the sets $h^{-1}(b)$ and $h^{-1}(b')$ are disjoint, $\sigma(C)$ is a consequence of the clauses of type 2 for $a$, which imply that $a$ can be mapped to at most one element in $B'$.

Now, let $C$ be the clause $\overline{X(a_1,b_1)}\vee\cdots\vee\overline{X(a_r,b_r)}$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. Since $h$ is a homomorphism, all the tuples in $h^{-1}(b_1)\times\ldots\times h^{-1}(b_r)$ belong to $(B')^r\setminus R(\mathbb{B}')$. Therefore, $\sigma(C)$ is a logical consequence of the clauses of type 3 in $G$ for the relation symbol $R$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$ and all tuples $(b'_1,\ldots,b'_r)\in h^{-1}(b_1)\times\ldots\times h^{-1}(b_r)$. ∎
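As an illustration only, the substitution used in the proof of Lemma 18 can be checked by brute force on a toy instance. The sketch below is not part of the proof: it assumes a vocabulary with a single relation symbol, builds the clauses of types 1, 2 and 3 as they are described in this section, and verifies the weaker global statement that every assignment satisfying $G=\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ is sent by $\sigma$ to an assignment satisfying $F=\mathrm{CNF}(\mathbb{A},\mathbb{B})$. All function names and the example structures are hypothetical.

```python
from itertools import product

def cnf(dom_a, dom_b, rel_a, rel_b, arity):
    """Clauses of CNF(A, B) for a single relation symbol (an assumption).
    A literal is (polarity, (a, b)), standing for X(a, b) or its negation."""
    clauses = []
    for a in dom_a:
        clauses.append([(True, (a, b)) for b in dom_b])               # type 1
        for i, b in enumerate(dom_b):
            for b2 in dom_b[i + 1:]:
                clauses.append([(False, (a, b)), (False, (a, b2))])   # type 2
    for ta in rel_a:
        for tb in product(dom_b, repeat=arity):
            if tb not in rel_b:
                clauses.append([(False, (x, y)) for x, y in zip(ta, tb)])  # type 3
    return clauses

def satisfies(assignment, clauses):
    return all(any(assignment[v] == pol for pol, v in c) for c in clauses)

def substitute(beta, dom_a, dom_b, h):
    """sigma: X(a, b) becomes the disjunction of X(a, b') over b' in h^{-1}(b)."""
    return {(a, b): any(beta[(a, b2)] for b2 in h if h[b2] == b)
            for a in dom_a for b in dom_b}

def check_substitution(dom_a, rel_a, dom_b, rel_b, dom_b2, rel_b2, h, arity):
    """Return (number of models of G, whether sigma maps each one to a model of F)."""
    f_clauses = cnf(dom_a, dom_b, rel_a, rel_b, arity)
    g_clauses = cnf(dom_a, dom_b2, rel_a, rel_b2, arity)
    g_vars = [(a, b2) for a in dom_a for b2 in dom_b2]
    models, ok = 0, True
    for bits in product((False, True), repeat=len(g_vars)):
        beta = dict(zip(g_vars, bits))
        if satisfies(beta, g_clauses):
            models += 1
            ok = ok and satisfies(substitute(beta, dom_a, dom_b, h), f_clauses)
    return models, ok

# Toy instance: A is one undirected edge, B is K2, B' is a path on three vertices.
EDGE = {("u", "v"), ("v", "u")}
K2 = {(0, 1), (1, 0)}
PATH = {(0, 1), (1, 0), (1, 2), (2, 1)}
H = {0: 0, 1: 1, 2: 0}  # a homomorphism from B' to B

print(check_substitution(["u", "v"], EDGE, [0, 1], K2, [0, 1, 2], PATH, H, 2))
```

The check is semantic and global, whereas the lemma needs the finer local statement that each $\sigma(C)$ follows from a bounded number of clauses of $G$; the sketch only makes the shape of $\sigma$ concrete.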

Lemma 19.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For any positive integers $k$ and $s$, and every finite $L$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B})$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree linear in $k$ and size polynomial in $2^k$ and $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B})$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. Consider the substitution $\sigma$ defined as follows for every variable in $F$:

\begin{align*}
X(a,b) &:= \sum_{b'\in h^{-1}(b)} X(a,b'),\\
\bar{X}(a,b) &:= \sum_{b'\in B'\setminus h^{-1}(b)} X(a,b'),
\end{align*}

for every $a\in A$ and $b\in B$. By Lemmas 7 and 8 it suffices to check that for each equation in $F$ and for each axiom inequality and equation, its substitution follows on all evaluations of its variables in $\{0,1\}$ from a bounded number of equations in $\mathrm{Eq}(G)$.

Let $P=0$ be the equation $\prod_{b\in B}\bar{X}(a,b)=0$ for $a$ in the domain of $\mathbb{A}$. Since the union of the sets $h^{-1}(b)$ for $b\in B$ is $B'$, the substituted equality follows on all valuations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a$ in $G$.

Suppose now that $P=0$ is the equation $X(a,b)\,X(a,b')=0$ for $a\in A$ and $(b,b')\in B^2$ with $b\neq b'$. Since the sets $h^{-1}(b)$ and $h^{-1}(b')$ are disjoint, the substituted equation follows on all valuations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a$.

Now, let $P=0$ be the equation $X(a_1,b_1)\cdot\ldots\cdot X(a_r,b_r)=0$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. Since $h$ is a homomorphism, all the tuples in $h^{-1}(b_1)\times\ldots\times h^{-1}(b_r)$ belong to $(B')^r\setminus R(\mathbb{B}')$. Therefore, the substituted equation follows on all valuations of its variables in $\{0,1\}$ from the set of equations of type 2 for $a_1,\ldots,a_r$ and of type 3 for the relation symbol $R$, $(a_1,\ldots,a_r)\in R(\mathbb{A})$ and all tuples $(b'_1,\ldots,b'_r)\in h^{-1}(b_1)\times\ldots\times h^{-1}(b_r)$.

The argument for the axiom equation and inequalities is the same as in the proof of Lemma 15. ∎

4.8 Adding constants

Finally we consider the extension by unary one-element relations under the assumption of $\mathbb{B}$ being a core. For each $b\in B$, let $R_b$ be a unary relation symbol, not in $L$, and let $L'=L\cup\{R_b:b\in B\}$. We assume that $\mathbb{B}$ is a core and $\mathbb{B}'$ is the $L'$-structure with domain $B$, each relation symbol from $L$ interpreted as in $\mathbb{B}$, and $R_b(\mathbb{B}')=\{b\}$, for every $b\in B$.

For every finite $L'$-structure $\mathbb{A}$ the corresponding $L$-structure $\mathbb{A}'$ has domain $A'=A\cup B$ (we assume that the sets $A$ and $B$ are disjoint), and every relation symbol $R\in L$ interpreted as $\bigcup\{R(\mathbb{B})_{b:=a}:b\in B,a\in R_b(\mathbb{A})\}\cup R(\mathbb{A})\cup R(\mathbb{B})$, where $R(\mathbb{B})_{b:=a}$ is the relation $R(\mathbb{B})$ with every occurrence of $b$ in a tuple substituted by $a$. It follows from the proof of Lemma 23 in [6] that $\mathbb{A}$ maps homomorphically to $\mathbb{B}'$ if and only if $\mathbb{A}'$ maps homomorphically to $\mathbb{B}$.
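The construction of $\mathbb{A}'$ can be made concrete with a small sketch. It is only an illustration under stated assumptions: structures are represented as dictionaries from relation symbols to sets of tuples, the unary relations $R_b(\mathbb{A})$ are passed separately as a dictionary `unary_a` (read here as subsets of $A$, since $\mathbb{A}$ is the $L'$-structure), homomorphisms are found by exhaustive search, and the template is the core $K_2$. All names are hypothetical.

```python
from itertools import product

def add_constants_instance(dom_a, rels_a, unary_a, dom_b, rels_b):
    """Build the L-structure A' from an L'-structure A.
    rels_a, rels_b: dict symbol -> set of tuples (symbols of L only).
    unary_a: dict b -> set of elements of A lying in R_b(A)."""
    dom = list(dom_a) + list(dom_b)
    rels = {}
    for name, tuples_b in rels_b.items():
        new = set(rels_a.get(name, set())) | set(tuples_b)
        for b, marked in unary_a.items():
            for a in marked:
                # R(B)_{b:=a}: replace every occurrence of b by a
                new |= {tuple(a if z == b else z for z in t) for t in tuples_b}
        rels[name] = new
    return dom, rels

def with_constants(rels, unary):
    """Add the unary relations R_b, keyed as ("R", b), to a dict of relations."""
    out = dict(rels)
    for b, elems in unary.items():
        out[("R", b)] = {(e,) for e in elems}
    return out

def has_hom(dom_a, rels_a, dom_b, rels_b):
    """Exhaustive search for a homomorphism; only sensible for tiny structures."""
    for image in product(dom_b, repeat=len(dom_a)):
        f = dict(zip(dom_a, image))
        if all(tuple(f[x] for x in t) in rels_b[name]
               for name, tuples in rels_a.items() for t in tuples):
            return True
    return False

def both_sides(dom_a, rels_a, unary_a, dom_b, rels_b):
    """Return (A -> B' exists, A' -> B exists); the two values should agree."""
    b_prime = with_constants(rels_b, {b: {b} for b in dom_b})
    left = has_hom(dom_a, with_constants(rels_a, unary_a), dom_b, b_prime)
    dom_ap, rels_ap = add_constants_instance(dom_a, rels_a, unary_a, dom_b, rels_b)
    right = has_hom(dom_ap, rels_ap, dom_b, rels_b)
    return left, right

K2_DOM, K2 = [0, 1], {"E": {(0, 1), (1, 0)}}
# x is forced to both constants: no homomorphism on either side.
print(both_sides(["x"], {"E": set()}, {0: {"x"}, 1: {"x"}}, K2_DOM, K2))
# an edge x-y with x pinned to 0: homomorphisms exist on both sides.
print(both_sides(["x", "y"], {"E": {("x", "y"), ("y", "x")}}, {0: {"x"}}, K2_DOM, K2))
```

In the first example $\mathbb{A}'$ is a triangle on $\{x,0,1\}$, which does not map to $K_2$, matching the fact that $x$ cannot be sent to both constants at once.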

Lemma 20.

For any positive integers $k$, $t$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ of depth $t$, bottom fan-in $k$ and size $s$, then there is a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$ of depth $t$, bottom fan-in $k$ and size at most $s$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B}')$. Consider the substitution $\sigma$ defined by the identity on all variables of $G$ and defined as follows for every variable in $F$ that is not in $G$:

\[X(b,b'):=\begin{cases}0,&\text{if }b\neq b',\\ 1,&\text{otherwise},\end{cases}\]

for every $(b,b')\in B^2$. By Lemma 6 it suffices to check that, for each clause $C$ of $F$, the substituted formula $\sigma(C)$ is a logical consequence of a bounded number of clauses of $G$.

To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b\in B}X(a,b)$ for $a$ in the domain of $\mathbb{A}'$. If $a\in A$, then the clause is left untouched by the substitution. Since the same clause is also in $G$, there is nothing to prove. Suppose now that $a=b'\in B$. One of the variables in $C$ is then $X(b',b')$. This variable is substituted by the true formula so $\sigma(C)$ is true, which finishes the proof in this case.

Suppose now that $C$ is the clause $\overline{X(a,b)}\vee\overline{X(a,b')}$ for $a$ in the domain of $\mathbb{A}'$ and $(b,b')\in B^2$ with $b\neq b'$. As in the previous case, if $a\in A$, then the clause is left untouched by the substitution and there is nothing to prove. Suppose now that $a=b''\in B$. Then either $b\neq b''$ or $b'\neq b''$. Therefore, either the variable $X(b'',b)$ or the variable $X(b'',b')$ gets substituted by the empty formula and $\sigma(C)$ is true.

Now, let $C$ be the clause $\overline{X(a_1,b_1)}\vee\cdots\vee\overline{X(a_r,b_r)}$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A}')$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. If $(a_1,\ldots,a_r)\in A^r$ then the same argument as above shows that there is nothing to be proved. If $(a_1,\ldots,a_r)\in B^r$ then $C$ is of the form $\overline{X(b'_1,b_1)}\vee\ldots\vee\overline{X(b'_r,b_r)}$, where $(b'_1,\ldots,b'_r)\in R(\mathbb{B})$ and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. Then there exists $i\in[r]$ such that $b'_i\neq b_i$ and the variable $X(b'_i,b_i)$ is substituted by the empty formula, so once again $\sigma(C)$ is true. The only remaining case is when $(a_1,\ldots,a_r)\in R(\mathbb{B})_{b:=a}$, where $a\in R_b(\mathbb{A})$ and $a_j=a$ for some (possibly more than one) $j\in[r]$. If there exists $i\in[r]$ such that $a_i=b'_i\in B$ and $b'_i\neq b_i$ then the variable $X(b'_i,b_i)$ is substituted by the empty formula, so once again $\sigma(C)$ is true. Otherwise, there exists $j\in[r]$ such that $a_j=a$ and $b_j\neq b$. Then the substituted formula is (possibly a weakening of) the formula $\overline{X(a,b_j)}$ where $b_j\neq b$. This formula belongs to $G$: it is the clause of type 3 for $a\in R_b(\mathbb{A})$ and $b_j\in B\setminus R_b(\mathbb{B}')$. ∎

Lemma 21.

Let $\mathcal{P}$ be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For every two positive integers $k$ and $s$, and every finite $L'$-structure $\mathbb{A}$, if there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ of degree $k$ and size $s$, then there is a $\mathcal{P}$ refutation of $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$ of degree $k$ and size at most $s$.

Proof.

Let $F$ denote $\mathrm{EQ}(\mathbb{A}',\mathbb{B})$ and let $G$ denote $\mathrm{EQ}(\mathbb{A},\mathbb{B}')$. Consider the substitution $\sigma$ defined by the identity on all variables of $G$ and defined as follows for every variable in $F$ that is not in $G$:

\[X(b,b'):=\begin{cases}0,&\text{if }b\neq b',\\ 1,&\text{otherwise},\end{cases}\qquad\bar{X}(b,b'):=\begin{cases}1,&\text{if }b\neq b',\\ 0,&\text{otherwise},\end{cases}\]

for every $(b,b')\in B^2$. By Lemmas 7 and 8 it suffices to check that for each equation in $F$ and for each axiom inequality and equation, its substitution follows on all evaluations of its variables in $\{0,1\}$ from a bounded number of equations in $\mathrm{Eq}(G)$. We show this analogously to the proof of Lemma 20.

Let $P=0$ be any of the equations in $F$, say $\prod_{b\in B}\bar{X}(a,b)=0$ for $a$ in the domain of $\mathbb{A}'$. If $a\in A$, then the equation is left untouched by the substitution. Since the same equation is also in $G$, there is nothing to prove. Suppose now that $a=b'\in B$. One of the variables in $\prod_{b\in B}\bar{X}(a,b)$ is then $\bar{X}(b',b')$. This variable is substituted by $0$ so the substituted equation holds for all valuations of its variables in $\{0,1\}$.

Suppose now that $P=0$ is the equation $X(a,b)\,X(a,b')=0$ for $a$ in the domain of $\mathbb{A}'$ and $(b,b')\in B^2$ with $b\neq b'$. As in the previous case, if $a\in A$, then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that $a=b''\in B$. Then either $b\neq b''$ or $b'\neq b''$. Therefore, either the variable $X(b'',b)$ or the variable $X(b'',b')$ gets substituted by $0$ so the substituted equation holds for all valuations of its variables in $\{0,1\}$.

Now, let $P=0$ be the equation $X(a_1,b_1)\cdot\ldots\cdot X(a_r,b_r)=0$ for some natural number $r$, $R\in L$ of arity $r$, $(a_1,\ldots,a_r)\in R(\mathbb{A}')$, and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. If $(a_1,\ldots,a_r)\in A^r$ then the same argument as above shows that there is nothing to be proved. Otherwise, if $(a_1,\ldots,a_r)\in B^r$ then $P=0$ is of the form $X(b'_1,b_1)\cdot\ldots\cdot X(b'_r,b_r)=0$, where $(b'_1,\ldots,b'_r)\in R(\mathbb{B})$ and $(b_1,\ldots,b_r)\in B^r\setminus R(\mathbb{B})$. Hence, there exists $i\in[r]$ such that $b'_i\neq b_i$ and the variable $X(b'_i,b_i)$ is substituted by $0$, so once again the substituted equation holds for all valuations of its variables in $\{0,1\}$. The only remaining case is when $(a_1,\ldots,a_r)\in R(\mathbb{B})_{b:=a}$, where $a\in R_b(\mathbb{A})$ and $a_j=a$ for some (possibly more than one) $j\in[r]$. If there exists $i\in[r]$ such that $a_i=b'_i\in B$ and $b'_i\neq b_i$ then the variable $X(b'_i,b_i)$ is substituted by $0$, so once again the substituted equation holds for all valuations of its variables in $\{0,1\}$. Otherwise, there exists $j\in[r]$ such that $a_j=a$ and $b_j\neq b$. Then the substituted equation follows on all valuations of its variables in $\{0,1\}$ from the equation $X(a,b_j)=0$ where $b_j\neq b$. This equation belongs to $G$: it is the equation of type 3 for $a\in R_b(\mathbb{A})$ and $b_j\in B\setminus R_b(\mathbb{B}')$.

All the axiom equations and inequalities from $\mathrm{Ineq}(F)$ after applying the substitution $\sigma$ either become true or are axiom equations and inequalities for the variables of $G$. ∎

5 Upper bound

In this section we show that templates of bounded width (cf. Section 2.5) admit efficient refutations in resolution. It immediately follows that the bounded width property ensures efficient refutations in bounded depth Frege, as well as in Polynomial Calculus over the reals, Sherali-Adams and Sums-of-Squares proof systems (cf. Lemma 9). Together with matching lower bounds obtained in the next section, this will complete the proof of Theorem 1.

Let $k(n)$ be a function. Let $\mathbb{B}$ be a finite relational structure over a finite vocabulary and let $E$ be a propositional encoding scheme for $\mathrm{CSP}(\mathbb{B})$. We say that $\mathbb{B}$ has resolution refutations of width $k(n)$ with respect to the encoding scheme $E$ if, for every finite structure $\mathbb{A}$ with $n$ elements over the same vocabulary as $\mathbb{B}$ such that there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$, the formula $E(\mathbb{A})$ has a resolution refutation of width $k(n)$. We say that $\mathbb{B}$ has resolution refutations of constant width if there exist a local encoding $E$ and a function $k(n)=O(1)$ such that $\mathbb{B}$ has resolution refutations of width $k(n)$ with respect to $E$. Lemma 4 implies that a structure $\mathbb{B}$ has resolution refutations of constant width if and only if it has resolution refutations of constant width with respect to any local encoding scheme. In this section we use the CNF encoding scheme. The goal is to prove the following:

Theorem 8.

Let $\mathbb{B}$ be a finite relational structure. The following are equivalent:

  1. $\mathbb{B}$ has bounded width,

  2. $\mathbb{B}$ has resolution refutations of constant width.

In preparation for the proof we revisit the characterization of resolution width in terms of existential pebble games from [7].

Let $L=\{R_0,\ldots,R_q\}$ be a finite relational vocabulary consisting of $q+1$ symbols of arity $q$. Let $\mathbb{S}_q$ be an $L$-structure with two-element domain $\{0,1\}$, where each relation $R_i(\mathbb{S}_q)$ encodes the set of valuations that satisfy a $q$-clause with $i$ negated variables. More precisely, for $0\leq i\leq q$, let $R_i(\mathbb{S}_q)=\{0,1\}^q\setminus\{(x_1,\ldots,x_q)\}$ where $(x_1,\ldots,x_q)\in\{0,1\}^q$ is the vector defined by $x_j=0$ for $j>i$ and $x_j=1$ otherwise. Now for every $q$-CNF $F$, we define an $L$-structure $\mathbb{A}_F$. Its domain is the set of variables in $F$, and the relation $R_i(\mathbb{A}_F)$ is the set of all tuples $(X_1,\ldots,X_q)$ such that the clause $\overline{X_1}\vee\ldots\vee\overline{X_i}\vee X_{i+1}\vee\ldots\vee X_q$ belongs to $F$. We allow the variables in the clauses to repeat, so the definition covers clauses with fewer than $q$ literals. Observe that partial homomorphisms from $\mathbb{A}_F$ to $\mathbb{S}_q$ correspond to partial truth assignments to the variables of $F$ that do not falsify any clause from $F$. Hence, for every $q$-CNF $F$, it holds that $F$ is satisfiable if and only if there is a homomorphism from $\mathbb{A}_F$ to $\mathbb{S}_q$.
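The encoding of a $q$-CNF as a structure can be illustrated by a minimal sketch. It assumes a hypothetical representation of a clause as a pair `(negated_vars, positive_vars)` of lists, pads short clauses by repeating their last literal (one of several admissible ways of repeating variables), and compares a brute-force homomorphism test into $\mathbb{S}_q$ with a direct satisfiability test. Empty clauses are not handled.

```python
from itertools import product

def s_q(q):
    """Relations R_0..R_q of S_q: R_i excludes exactly the tuple 1^i 0^(q-i)."""
    rels = {}
    for i in range(q + 1):
        falsifying = tuple([1] * i + [0] * (q - i))
        rels[i] = {t for t in product((0, 1), repeat=q) if t != falsifying}
    return rels

def structure_of_cnf(clauses, q):
    """Relations of A_F for non-empty clauses with at most q literals."""
    rels = {i: set() for i in range(q + 1)}
    for neg, pos in clauses:
        pad = q - len(neg) - len(pos)
        if pos:
            # pad by repeating the last positive literal; i = number of negations
            rels[len(neg)].add(tuple(neg) + tuple(pos) + (pos[-1],) * pad)
        else:
            # only negative literals: pad with the last one, so i = q
            rels[q].add(tuple(neg) + (neg[-1],) * pad)
    return rels

def has_hom_to_s_q(variables, rels, q):
    """Brute-force search for a homomorphism from A_F to S_q."""
    target = s_q(q)
    for bits in product((0, 1), repeat=len(variables)):
        f = dict(zip(variables, bits))
        if all(tuple(f[x] for x in t) in target[i]
               for i, tuples in rels.items() for t in tuples):
            return True
    return False

def satisfiable(variables, clauses):
    """Direct brute-force satisfiability test, used only for comparison."""
    for bits in product((0, 1), repeat=len(variables)):
        f = dict(zip(variables, bits))
        if all(any(f[x] == 0 for x in neg) or any(f[x] == 1 for x in pos)
               for neg, pos in clauses):
            return True
    return False

VARS = ["X", "Y"]
UNSAT = [([], ["X", "Y"]), (["X"], ["Y"]), (["Y"], [])]  # (X or Y), (not X or Y), (not Y)
SAT = [([], ["X", "Y"]), (["X"], [])]                    # (X or Y), (not X)

for formula in (UNSAT, SAT):
    print(has_hom_to_s_q(VARS, structure_of_cnf(formula, 2), 2),
          satisfiable(VARS, formula))
```

On both formulas the two tests agree, as the observation above predicts.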

Theorem 9 ([7]).

Let $k$ and $q$ be positive integers such that $k\geq q$ and let $F$ be a $q$-CNF. Then $F$ has a resolution refutation of width $k$ if and only if Spoiler wins the existential $(k+1)$-pebble game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$.

In this section we use the above theorem to establish a similar correspondence between existential pebble games on arbitrary structures $\mathbb{A}$ and $\mathbb{B}$ and bounded width resolution refutations of $\mathrm{CNF}(\mathbb{A},\mathbb{B})$. In the following, let the notation $\mathbb{A}\leq^{k}\mathbb{B}$ mean that Duplicator wins the existential $k$-pebble game on $\mathbb{A}$ and $\mathbb{B}$.
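For small structures the relation $\mathbb{A}\leq^{k}\mathbb{B}$ can be decided mechanically. The sketch below assumes the standard reformulation of the game: Duplicator wins if and only if there is a non-empty family of partial homomorphisms with domains of size at most $k$ that is closed under restrictions and has the forth (extension) property for maps with fewer than $k$ elements. The function names and the dictionary representation of structures are illustrative assumptions, and the procedure is exponential in $k$.

```python
from itertools import combinations, product

def is_partial_hom(f, A_rels, B_rels):
    """Check that f preserves every tuple of A lying entirely inside dom(f)."""
    for name, tuples in A_rels.items():
        for t in tuples:
            if all(x in f for x in t) and tuple(f[x] for x in t) not in B_rels[name]:
                return False
    return True

def duplicator_wins(A_dom, A_rels, B_dom, B_rels, k):
    """Decide A <=^k B by computing the largest winning family for Duplicator."""
    family = set()
    for size in range(k + 1):
        for dom in combinations(A_dom, size):
            for img in product(B_dom, repeat=size):
                f = dict(zip(dom, img))
                if is_partial_hom(f, A_rels, B_rels):
                    family.add(frozenset(f.items()))
    changed = True
    while changed:
        changed = False
        for f in list(family):
            dom = {a for a, _ in f}
            # Closure under restriction: dropping any one pebble stays in the family.
            ok = all(frozenset(p for p in f if p[0] != a) in family for a in dom)
            # Forth property: every new element of A can be answered.
            if ok and len(dom) < k:
                for a in A_dom:
                    if a not in dom and not any(f | {(a, b)} in family for b in B_dom):
                        ok = False
                        break
            if not ok:
                family.discard(f)
                changed = True
    return frozenset() in family

# Example: the triangle K3 against the single edge K2 (both with a symmetric edge relation E).
K3 = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
K2 = {"E": {(0, 1), (1, 0)}}
two_pebbles = duplicator_wins([0, 1, 2], K3, [0, 1], K2, 2)
three_pebbles = duplicator_wins([0, 1, 2], K3, [0, 1], K2, 3)
```

With two pebbles Duplicator survives on this example, while three pebbles let Spoiler expose the absence of a homomorphism from the triangle to a single edge.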

Lemma 22.

Let $\mathbb{A}$ and $\mathbb{B}$ be relational structures over the same vocabulary of maximum arity $r$, and let $k$ be an integer such that $k\geq|B|$ and $k\geq r$. Then:

  1. if $\mathbb{A}\not\leq^{k+2}\mathbb{B}$, then $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ has a resolution refutation of width $k+|B|$,

  2. if $\mathbb{A}\leq^{k+2}\mathbb{B}$, then $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ does not have a resolution refutation of width $k+1$.

Proof.

Let $F$ denote $\mathrm{CNF}(\mathbb{A},\mathbb{B})$. Let $q$ be the maximum of the number of elements in $B$ and the arity of relation symbols in the vocabulary of $\mathbb{A}$ and $\mathbb{B}$. Observe that $F$ is a $q$-CNF. Lemma 22 follows from Theorem 9 together with the following facts.

Claim 4.

If $\mathbb{A}_{F}\leq^{k+|B|}\mathbb{S}_{q}$, then $\mathbb{A}\leq^{k+1}\mathbb{B}$.

Claim 5.

If $\mathbb{A}\leq^{k}\mathbb{B}$, then $\mathbb{A}_{F}\leq^{k}\mathbb{S}_{q}$.

Indeed, if Spoiler wins the existential $(k+2)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$ then, by Claim 4 applied with $k+1$ in place of $k$, Spoiler wins the existential $(k+|B|+1)$-pebble game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ and, by Theorem 9, $F$ has a resolution refutation of width $k+|B|$. On the other hand, if Duplicator wins the existential $(k+2)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$ then, by Claim 5 applied with $k+2$ in place of $k$, Duplicator wins the existential $(k+2)$-pebble game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ and, by Theorem 9, $F$ does not have a resolution refutation of width $k+1$. It remains to prove Claims 4 and 5.

Proof of Claim 4. We prove the contrapositive. Suppose that Spoiler wins the existential $(k+1)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$. We give a winning strategy for Spoiler in the existential $(k+|B|)$-pebble game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$. We simulate each move of Spoiler in the game on $\mathbb{A}$ and $\mathbb{B}$ by $|B|$ moves in the game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$. Suppose that Spoiler puts the $i$-th pebble on an element $a$ of $\mathbb{A}$. We simulate this by pebbling the elements $X(a,b)$ of $\mathbb{A}_{F}$, for each $b\in B$. There are two possibilities. If the answer of Duplicator falsifies any of the clauses of types 1 or 2 in $F$ then Spoiler wins immediately. Otherwise, the answer of Duplicator is $1$ for exactly one element $X(a,b^{\prime})$. We simulate this by putting a pebble on the element $b^{\prime}$ of $\mathbb{B}$ in the game on $\mathbb{A}$ and $\mathbb{B}$. Now, in the game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ the pebble which lies on the element $X(a,b^{\prime})$ stays there until Spoiler picks up the $i$-th pebble from the element $a$ in $\mathbb{A}$. The other $|B|-1$ pebbles, which lie on the elements $X(a,b)$ for $b\neq b^{\prime}$, can be reused to simulate subsequent moves. Therefore, to simulate the existential $(k+1)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$ we need only $|B|-1$ extra pebbles.

If during the course of the game Spoiler does not win by falsifying any of the clauses of types 1 or 2 then the simulation of the game on $\mathbb{A}$ and $\mathbb{B}$ continues. Since in the simulated game Spoiler has a winning strategy, after a finite number of rounds the partial assignment $f:A\rightarrow B$ defined by $f(a_{i})=b_{i}$ is not a partial homomorphism. This means that there exist a natural number $r$, a relation symbol $R\in L$ of arity $r$, a tuple $(a^{\prime}_{1},\ldots,a^{\prime}_{r})\in R(\mathbb{A})$ and a tuple $(b^{\prime}_{1},\ldots,b^{\prime}_{r})\in B^{r}\setminus R(\mathbb{B})$, such that for every $i\in[r]$ the pairs of elements $(a^{\prime}_{i},b^{\prime}_{i})$ are pebbled by pairs of corresponding pebbles. It follows from the construction that in the simulation game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ the pairs of elements $(X(a^{\prime}_{i},b^{\prime}_{i}),1)$ are pebbled by pairs of corresponding pebbles. This means that the partial assignment defined by the current configuration of the game falsifies one of the clauses of type 3 in $F$ and Spoiler wins.

Proof of Claim 5. We prove the contrapositive. Suppose that Spoiler wins the existential $k$-pebble game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$. We give a winning strategy for Spoiler in the existential $k$-pebble game on $\mathbb{A}$ and $\mathbb{B}$. We simulate each move of Spoiler in the game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ by a single move in the game on $\mathbb{A}$ and $\mathbb{B}$. Suppose that Spoiler puts a pebble on an element $X(a,b)$ of $\mathbb{A}_{F}$. We simulate this by pebbling the element $a$ of $\mathbb{A}$. If Duplicator responds by putting the corresponding pebble on the element $b$ of $\mathbb{B}$, then we simulate this by pebbling $1$ in $\mathbb{S}_{q}$, otherwise we pebble $0$ in $\mathbb{S}_{q}$.

It is not difficult to see that this is indeed a winning strategy for Spoiler in the existential $k$-pebble game on $\mathbb{A}$ and $\mathbb{B}$. Since in the simulated game Spoiler has a winning strategy, after a finite number of rounds the partial assignment $f:A_{F}\rightarrow\{0,1\}$ corresponding to the current configuration of the game on $\mathbb{A}_{F}$ and $\mathbb{S}_{q}$ is not a partial homomorphism. Observe that it is not possible to falsify any of the clauses of type 1 in $F$. If for some $a\in A$ and some $(b,b^{\prime})\in B^{2}$ such that $b\neq b^{\prime}$, the partial assignment $f$ falsifies $\overline{X(a,b)}\vee\overline{X(a,b^{\prime})}$, it means that pairs of corresponding pebbles lie on pairs of elements $(X(a,b),1)$ and $(X(a,b^{\prime}),1)$. Hence, in the simulation game on $\mathbb{A}$ and $\mathbb{B}$ pairs of corresponding pebbles lie on pairs of elements $(a,b)$ and $(a,b^{\prime})$ and the partial assignment is not well defined.
Finally, if for some natural number $r$, a relation symbol $R\in L$ of arity $r$, a tuple $(a^{\prime}_{1},\ldots,a^{\prime}_{r})\in R(\mathbb{A})$ and a tuple $(b^{\prime}_{1},\ldots,b^{\prime}_{r})\in B^{r}\setminus R(\mathbb{B})$, the partial assignment $f$ falsifies the clause $\overline{X(a^{\prime}_{1},b^{\prime}_{1})}\vee\cdots\vee\overline{X(a^{\prime}_{r},b^{\prime}_{r})}$, it means that in the game on $\mathbb{A}$ and $\mathbb{B}$ for every $i\in[r]$, the pairs of elements $(a^{\prime}_{i},b^{\prime}_{i})$ are pebbled by pairs of corresponding pebbles, and the partial assignment given by the current configuration of the game is also not a partial homomorphism, which ends the proof. ∎
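The three clause types referred to in the proofs of Claims 4 and 5 can be generated explicitly for small structures. The sketch below is an illustration under assumptions: types 2 and 3 are written exactly as they appear in the proofs, while type 1 is assumed to be the clause $X(a,b_{1})\vee\cdots\vee X(a,b_{|B|})$ stating that every $a\in A$ receives some value. The function name `cnf_encoding` and the literal representation `((a, b), polarity)` are hypothetical.

```python
from itertools import combinations, product

def cnf_encoding(A_dom, A_rels, B_dom, B_rels):
    """Clauses over variables X(a, b); a literal is ((a, b), True/False)."""
    clauses = []
    for a in A_dom:
        # Type 1 (assumed form): a is mapped to at least one element of B.
        clauses.append([((a, b), True) for b in B_dom])
        # Type 2: a is mapped to at most one element of B.
        for b1, b2 in combinations(B_dom, 2):
            clauses.append([((a, b1), False), ((a, b2), False)])
    for name, tuples in A_rels.items():
        for t in sorted(tuples):
            for img in product(B_dom, repeat=len(t)):
                # Type 3: a tuple of R(A) is not mapped to a tuple outside R(B).
                if img not in B_rels[name]:
                    clauses.append([((x, y), False) for x, y in zip(t, img)])
    return clauses

# Example: the triangle K3 as the instance, the single edge K2 as the template.
K3 = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
K2 = {"E": {(0, 1), (1, 0)}}
F = cnf_encoding([0, 1, 2], K3, [0, 1], K2)
width = max(len(clause) for clause in F)
```

In this example `width` equals $\max\{|B|,r\}=2$, in line with the observation that $F$ is a $q$-CNF for $q$ the maximum of $|B|$ and the maximum arity.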

We are ready to wrap up:

Proof of Theorem 8.

For the implication 1 to 2, assume that $\mathbb{B}$ has bounded width, say width $l$, and let $k=\max\{|B|,r,l\}$, where $r$ is the maximum arity of the vocabulary of $\mathbb{B}$. Let $\mathbb{A}$ be a structure over the same vocabulary as $\mathbb{B}$ and assume that there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$. Then Spoiler wins the existential $l$-pebble game on $\mathbb{A}$ and $\mathbb{B}$, and hence also the existential $(k+2)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$, since $k+2\geq l$. The hypotheses of Lemma 22 hold, so by part 1 of that lemma, $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ has a resolution refutation of width $k+|B|$. This shows that $\mathbb{B}$ has resolution refutations of width $k+|B|$, and hence resolution refutations of constant width.

For the implication 2 to 1, assume that $\mathbb{B}$ has resolution refutations of width $l$. Again let $k=\max\{|B|,r,l\}$ where $r$ is the maximum arity of the relations in the vocabulary of $\mathbb{B}$. Let $\mathbb{A}$ be a structure over the same vocabulary as $\mathbb{B}$ and assume that there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$. Then $\mathrm{CNF}(\mathbb{A},\mathbb{B})$ has a resolution refutation of width $l$, and hence of width $k+1$ since $k+1\geq l$. The hypotheses of Lemma 22 hold, so by the contrapositive of part 2 of that lemma, Spoiler wins the existential $(k+2)$-pebble game on $\mathbb{A}$ and $\mathbb{B}$. This shows that $\mathbb{B}$ has width $k+2$, and hence bounded width. ∎

6 Lower bounds

Let $d(n)$, $k(n)$ and $s(n)$ be functions. Let $\mathbb{B}$ be a finite relational structure over a finite vocabulary and let $E$ be a propositional encoding scheme for $\mathrm{CSP}(\mathbb{B})$. We say that the structure $\mathbb{B}$ has Frege refutations of depth $d(n)$, bottom fan-in $k(n)$, and size $s(n)$ with respect to the encoding scheme $E$ if, for every finite structure $\mathbb{A}$ over the same vocabulary as $\mathbb{B}$ with $n$ elements, if there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$, then $E(\mathbb{A})$ has a Frege refutation of depth $d(n)$, bottom fan-in $k(n)$, and size $s(n)$. We say that $\mathbb{B}$ has bounded-depth Frege refutations of subexponential size if there exist a local encoding scheme $E$ and functions $d(n)=O(1)$, $k(n)=O(1)$ and $s(n)=2^{n^{o(1)}}$ such that the structure $\mathbb{B}$ has Frege refutations of depth $d(n)$, bottom fan-in $k(n)$, and size $s(n)$ with respect to $E$. Due to Lemma 4 the structure $\mathbb{B}$ has bounded-depth Frege refutations of subexponential size if and only if it has bounded-depth Frege refutations of subexponential size with respect to any local propositional encoding scheme.

Similarly, for any field $F$, if $E$ is an algebraic encoding scheme over $F$, we say that the structure $\mathbb{B}$ has PC refutations over $F$ of degree $d(n)$ with respect to the encoding scheme $E$ if, for every finite structure $\mathbb{A}$ over the same vocabulary as $\mathbb{B}$ with $n$ elements, if there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$, then $E(\mathbb{A})$ has a PC refutation over $F$ of degree $d(n)$. We say that $\mathbb{B}$ has PC refutations over $F$ of sublinear degree if there exist a local encoding scheme $E$ over $F$ and a function $d(n)=o(n)$ such that the structure $\mathbb{B}$ has PC refutations over $F$ of degree $d(n)$ with respect to $E$. Due to Lemma 4 the structure $\mathbb{B}$ has PC refutations over $F$ of sublinear degree if and only if it has PC refutations over $F$ of sublinear degree with respect to any local algebraic encoding scheme.

Finally, if $E$ is a semi-algebraic encoding scheme, we say that the structure $\mathbb{B}$ has SOS refutations of degree $d(n)$ with respect to the encoding scheme $E$ if, for every finite structure $\mathbb{A}$ over the same vocabulary as $\mathbb{B}$ with $n$ elements, if there is no homomorphism from $\mathbb{A}$ to $\mathbb{B}$, then $E(\mathbb{A})$ has an SOS refutation of degree $d(n)$. We say that $\mathbb{B}$ has SOS refutations of sublinear degree if there exist a local encoding scheme $E$ and a function $d(n)=o(n)$ such that the structure $\mathbb{B}$ has SOS refutations of degree $d(n)$ with respect to $E$. Due to Lemma 4 the structure $\mathbb{B}$ has SOS refutations of sublinear degree if and only if it has SOS refutations of sublinear degree with respect to any local semi-algebraic encoding scheme.

The goal of this section is to prove the following:

Theorem 10.

Let $\mathbb{B}$ be a finite relational structure. The following are equivalent:

  1. $\mathbb{B}$ has bounded width,

  2. $\mathbb{B}$ has bounded-depth Frege refutations of subexponential size,

  3. $\mathbb{B}$ has PC refutations over the reals of sublinear degree,

  4. $\mathbb{B}$ has SOS refutations of sublinear degree.

The equivalence of 1 and 4 is known [47]. Here we provide an alternative proof. The implication 1 to 2 follows from Theorem 8: every resolution refutation is a Frege refutation of depth one, and if the refutation has bounded width, then it has polynomial size and hence subexponential size. The implications 1 to 3 and 1 to 4 follow from Theorem 8 via the fact that bounded-degree Polynomial Calculus and bounded-degree Sherali-Adams simulate bounded-width resolution (cf. Lemma 9); note that the simulation by bounded-degree Sherali-Adams implies also the simulation by bounded-degree Sums-of-Squares and, for both Polynomial Calculus and Sums-of-Squares, bounded-degree implies constant, and hence sublinear, degree. For implications 2 to 1, 3 to 1 and 4 to 1 we use an algebraic characterization of unbounded width. We begin with some definitions.

6.1 Algebraic characterization of unbounded width

Let $G=(G,+,0)$ be a finite Abelian group. For every positive integer $n$, each $g\in G$ and every $(z_{1},\ldots,z_{n})\in\mathbb{Z}^{n}$, we define a relation $R_{(g,z_{1},\ldots,z_{n})}=\{(g_{1},\ldots,g_{n})\in G^{n}:z_{1}g_{1}+\ldots+z_{n}g_{n}=g\}$, where $z_{i}g_{i}$ is shorthand for the sum of $|z_{i}|$ copies of $\operatorname{sign}(z_{i})g_{i}$. Let $\sim$ be the equivalence relation on the set $\bigcup_{n>0}G\times\mathbb{Z}^{n}$ that identifies tuples defining the same relation, i.e., $(g,z_{1},\ldots,z_{n})\sim(g^{\prime},z^{\prime}_{1},\ldots,z^{\prime}_{n^{\prime}})$ if and only if $n=n^{\prime}$ and $R_{(g,z_{1},\ldots,z_{n})}=R_{(g^{\prime},z^{\prime}_{1},\ldots,z^{\prime}_{n^{\prime}})}$. Let $L(G)$ be the infinite relational vocabulary that for every equivalence class $[(g,z_{1},\ldots,z_{n})]$ has one $n$-ary relation symbol $E_{[(g,z_{1},\ldots,z_{n})]}$, and let $\mathbb{B}(G)$ be the $L(G)$-structure that has domain $G$ and where each relation symbol $E_{[(g,z_{1},\ldots,z_{n})]}$ is interpreted as $R_{(g,z_{1},\ldots,z_{n})}$. The CSP of $\mathbb{B}(G)$ is called $\mathrm{LIN}(G)$.
One should think about instances of $\mathrm{LIN}(G)$ as systems of linear equations over the group $G$. For simplicity, for any instance $\mathbb{A}$ of $\mathrm{LIN}(G)$ we denote the fact that a tuple $(a_{1},\ldots,a_{n})\in A^{n}$ belongs to the relation $E_{[(g,z_{1},\ldots,z_{n})]}(\mathbb{A})$ by $z_{1}a_{1}+\ldots+z_{n}a_{n}=g$.
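To make the equivalence relation $\sim$ concrete, the following sketch computes the relations $R_{(g,z_{1},\ldots,z_{n})}$ in the special case where $G$ is the cyclic group $\mathbb{Z}_{q}$, represented by the integers $0,\ldots,q-1$. The function name `relation` is a hypothetical helper; the restriction to cyclic groups is an assumption made only to keep the example short.

```python
from itertools import product

def relation(q, g, zs):
    """All tuples (g_1, ..., g_n) over Z_q with z_1*g_1 + ... + z_n*g_n = g (mod q)."""
    return frozenset(
        t for t in product(range(q), repeat=len(zs))
        if sum(z * x for z, x in zip(zs, t)) % q == g % q
    )

# Over Z_3 the coefficient tuples (1, -1, 4) and (1, 2, 1) agree modulo 3,
# and multiplying an equation by the unit 2 does not change its solution set.
r1 = relation(3, 1, (1, -1, 4))
r2 = relation(3, 1, (1, 2, 1))
r3 = relation(3, 2, (2, 1, 2))

# Number of distinct ternary relations over Z_2, i.e. symbols of arity 3 in L(Z_2);
# reducing coefficients modulo 2 is enough to enumerate every class.
ternary_over_z2 = {relation(2, g, zs) for g in range(2) for zs in product(range(2), repeat=3)}
```

Here `r1`, `r2` and `r3` are the same set, so the three tuples lie in one equivalence class and give rise to a single relation symbol.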

Observe that, since there are only finitely many relations of a fixed arity $k$ on the finite set $G$, the equivalence relation $\sim$ restricted to $G\times\mathbb{Z}^{k}$ has finitely many equivalence classes. For every positive integer $k$, by $L(G,k)$ we denote the finite relational vocabulary which is the subset of $L(G)$ containing all symbols of arity $k$, and by $\mathbb{B}(G,k)$ we denote the $L(G,k)$-structure obtained from $\mathbb{B}(G)$ by removing all relations of arity different from $k$. The CSP over $\mathbb{B}(G,k)$ is called $k\mathrm{LIN}(G)$. Instances of $k\mathrm{LIN}(G)$ correspond to systems of linear equations over the group $G$ with $k$ variables per equation.

Theorem 11 ([12, 19]).

Let $\mathbb{B}$ be a finite relational structure. The following are equivalent:

  1. $\mathbb{B}$ does not have bounded width,

  2. there exists a non-trivial finite Abelian group $G$ such that $\mathbb{B}(G,3)$ is pp-interpretable in $\mathbb{B}^{+}$, where $\mathbb{B}^{+}$ is the expansion of the core of $\mathbb{B}$ with all constants.

Thus, in view of Theorems 5 and 7, in order to prove that 2 implies 1, 3 implies 1, and 4 implies 1 in Theorem 10, it suffices to prove lower bounds for $3\mathrm{LIN}(G)$, for every non-trivial finite Abelian group $G$.

6.2 Lower bound for bounded-depth Frege

In [17], an exponential lower bound on the size of bounded-depth Frege proofs of the so-called Tseitin formulas was obtained by reduction from the pigeonhole principle formulas; the latter are known to be hard for bounded-depth Frege by the so-called Jewel Theorem of Proof Complexity [1, 16, 37]. The Tseitin formulas encode certain systems of linear equations over $\mathbb{Z}_{2}$ that are derived from expander graphs. Here we adapt the formulas to encode systems of linear equations over arbitrary finite Abelian groups, and then show that the reduction in [17] can be generalised to our formulas. We use the CNF encoding scheme.

Theorem 12.

For every integer $d$ and every non-trivial finite Abelian group $G$ there exists a positive constant $\delta$ and a family of unsatisfiable instances $(\mathbb{A}_{n})_{n\geq 1}$ of $3\mathrm{LIN}(G)$, where $\mathbb{A}_{n}$ has $\Theta(n)$ variables and $\Theta(n)$ equations, such that for every sufficiently large integer $n$ every Frege refutation of $\mathrm{CNF}(\mathbb{A}_{n},\mathbb{B}(G,3))$ of depth $d$ has size at least $2^{n^{\delta}}$.

The rest of this section is devoted to the proof of Theorem 12. We provide a proof for the special case when $G$ is the cyclic group $\mathbb{Z}_{q}$ of integers under addition modulo $q$, for some $q\geq 2$. Lemma 25 at the end of this section shows that, thanks to the Fundamental Theorem of Finite Abelian Groups, the special case of $G=\mathbb{Z}_{q}$ implies Theorem 12 in full generality. The proof of the general case would actually be the same; however, we believe that by focusing on simpler groups we make the arguments easier to follow.

Linear equations over Abelian groups.

For the rest of this section, let us fix $G$ to be the cyclic group $\mathbb{Z}_{q}$ of integers under addition modulo $q$, for some $q\geq 2$. Whenever we talk about an element $z$ of the group $G$, where $z$ is some integer, we mean the unique element corresponding to $z$ modulo $q$. The instances of $3\mathrm{LIN}(G)$ that we show to be hard for bounded-depth Frege are special cases of so-called Tseitin graph tautologies for $\mathbb{Z}_{q}$ as defined in [22]. Before defining them we need to introduce some terminology.

For a graph $H=(V(H),E(H))$ (directed or undirected) and a set of vertices $W\subseteq V(H)$, by $\partial(W)$ we denote the boundary of $W$, which is the set of all edges incident with a vertex in $W$ and with a vertex in $V(H)\setminus W$. If the graph $H$ is directed, then by $\partial_{-}(W)$ we denote the set of edges with the head in $W$ and the tail in $V(H)\setminus W$, and by $\partial_{+}(W)$ we denote the set of edges with the head in $V(H)\setminus W$ and the tail in $W$. For single vertices $v$ we write $\partial(v)$ instead of $\partial(\{v\})$. The same convention explains the notation $\partial_{+}(v)$ and $\partial_{-}(v)$.

Consider a directed graph $H=(V(H),E(H))$ and a labelling $\sigma\colon V(H)\rightarrow G$ of the vertices of the graph $H$ by elements of $G$. The Tseitin graph tautology $\mathbb{A}(H,\sigma)$ is the following system of linear equations over the group $G$:

  • the set of variables is the set $E(H)$ of the edges of the graph;

  • for every vertex $v\in V(H)$ there is an equation

    $$\sum_{e\in\partial_{+}(v)}e-\sum_{e\in\partial_{-}(v)}e=\sigma(v).$$

The system $\mathbb{A}(H,\sigma)$ can be seen as an instance of $\mathrm{LIN}(G)$. The formula $\mathrm{CNF}(\mathbb{A}(H,\sigma),\mathbb{B}(G))$ is called a Tseitin formula. If the graph $H$ is obtained from directing the edges of a $k$-regular undirected graph, i.e., a graph in which each vertex has degree $k$, then $\mathbb{A}(H,\sigma)$ is an instance of $k\mathrm{LIN}(G)$.

It is easy to see that if $\sum_{v\in V(H)}\sigma(v)\neq 0$, then the instance $\mathbb{A}(H,\sigma)$ is unsatisfiable. Indeed, since every variable $e$ appears positively on the left-hand side of exactly one equation and negatively on the left-hand side of exactly one equation, by summing up all the equations we get $0$ on the left-hand side and $\sum_{v\in V(H)}\sigma(v)$ on the right-hand side. If $\sum_{v\in V(H)}\sigma(v)\neq 0$ we obtain a contradiction. It is not difficult to show that for a connected graph $H$, the converse statement holds as well.
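The summation argument can be observed on a tiny instance. The sketch below builds the system $\mathbb{A}(H,\sigma)$ over $\mathbb{Z}_{q}$ for a directed graph given as a list of `(tail, head)` pairs and enumerates its solutions by brute force; the function names and the representation are assumptions made for illustration, and self-loops are assumed to be absent.

```python
from itertools import product

def tseitin_equations(vertices, edges, sigma):
    """For each vertex v: coefficients (+1 on edges leaving v, -1 on edges entering v) and sigma(v)."""
    equations = {}
    for v in vertices:
        coeffs = {}
        for idx, (tail, head) in enumerate(edges):
            if tail == v:
                coeffs[idx] = coeffs.get(idx, 0) + 1
            if head == v:
                coeffs[idx] = coeffs.get(idx, 0) - 1
        equations[v] = (coeffs, sigma[v])
    return equations

def solutions(vertices, edges, sigma, q):
    """All assignments of elements of Z_q to the edges that satisfy every vertex equation."""
    equations = tseitin_equations(vertices, edges, sigma)
    found = []
    for values in product(range(q), repeat=len(edges)):
        if all(sum(c * values[e] for e, c in coeffs.items()) % q == rhs % q
               for coeffs, rhs in equations.values()):
            found.append(values)
    return found

# Directed triangle 0 -> 1 -> 2 -> 0 over Z_3.
triangle_vertices = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (2, 0)]
odd_labelling = {0: 1, 1: 0, 2: 0}    # labels sum to 1, which is non-zero in Z_3
even_labelling = {0: 1, 1: 2, 2: 0}   # labels sum to 3, which is 0 in Z_3
no_solutions = solutions(triangle_vertices, triangle_edges, odd_labelling, 3)
some_solutions = solutions(triangle_vertices, triangle_edges, even_labelling, 3)
```

The first labelling admits no solution, while the second, whose labels sum to $0$ in $\mathbb{Z}_{3}$, admits three.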

Lemma 23.

If $H=(V(H),E(H))$ is a connected directed graph and $\sigma\colon V(H)\rightarrow G$ is a labelling, then the system $\mathbb{A}(H,\sigma)$ is satisfiable if and only if $\sum_{v\in V(H)}\sigma(v)=0$.

Proof.

The left-to-right direction is clear. For the opposite direction we define a solution to $\mathbb{A}(H,\sigma)$ by assigning values to edges of the graph $H$ one by one while keeping two invariants: none of the equations gets falsified and the graph induced by unassigned edges is connected. Below we formalize this intuition.

Since $H$ is connected, we can enumerate its edges $e_{1},\ldots,e_{m}$ in such a way that for every $i\in[m]$ the graph $H_{i+1}$ obtained from $H$ by removing the edges $e_{1},\ldots,e_{i}$ and then deleting all isolated vertices, is connected. Let us denote the system $\mathbb{A}(H,\sigma)$ by $\mathbb{A}_{1}$. We assign values to the edges of $H$ in the order specified above. Additionally, for each $i\in[m-1]$, after assigning a value to $e_{i}$ we substitute the variable $e_{i}$ in the system $\mathbb{A}_{i}$ with this value, and next move all constants to the right-hand side of the equations. We denote the obtained system by $\mathbb{A}_{i+1}$. The variables of $\mathbb{A}_{i+1}$ are $e_{j}$, for $j>i$. Observe that for every $i\in[m]$, the sum of group elements that appear on the right-hand side of all the equations in $\mathbb{A}_{i}$ is $0$.

Assume that we have already assigned values to the edges $e_{j}$ for $j<i$ without falsifying any of the equations in $\mathbb{A}(H,\sigma)$ (this is true for $i=1$). The variable $e_{i}$ appears in exactly two equations in $\mathbb{A}_{i}$. There are two possibilities:

  • if $i\neq m$ then at least one of the two equations has at least one more variable whose value has not yet been assigned. This is because the graph $H_{i}$ is connected. Then we can assign a value to $e_{i}$ in such a way that none of the equations in $\mathbb{A}_{i}$ gets falsified: the value is either forced by the other equation, or can be assigned arbitrarily.

  • if $i=m$ then the two equations which mention the variable $e_{m}$ are of the form $e_{m}=g$ and $-e_{m}=h$, for some elements $g$ and $h$ of the group. All the other equations in the system $\mathbb{A}_{m}$ are of the form $0=0$. Since the sum of the group elements on the right-hand side of the equations in $\mathbb{A}_{m}$ is $0$, we have that $g=-h$ and we can assign the value $g$ to $e_{m}$, satisfying the last two equations.

This finishes the construction of a solution to $\mathbb{A}(H,\sigma)$. ∎
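A solution can also be computed directly. The sketch below does not follow the edge ordering used in the proof; it swaps in a spanning-tree argument instead: every non-tree edge is set to $0$, and the vertices are then processed from the leaves of a breadth-first tree towards the root, each vertex fixing the value of the edge to its parent. It assumes that $G=\mathbb{Z}_{q}$, that the underlying undirected graph is connected, and that the labels sum to $0$; all names are hypothetical.

```python
from collections import deque

def vertex_sum(v, edges, value):
    """Left-hand side of the equation at v: edges leaving v minus edges entering v."""
    total = 0
    for idx, (tail, head) in enumerate(edges):
        if tail == v:
            total += value[idx]
        if head == v:
            total -= value[idx]
    return total

def solve_tseitin(vertices, edges, sigma, q):
    """Return edge values over Z_q solving A(H, sigma), under the assumptions stated above."""
    adjacent = {v: [] for v in vertices}
    for idx, (tail, head) in enumerate(edges):
        adjacent[tail].append((head, idx))
        adjacent[head].append((tail, idx))
    root = vertices[0]
    parent_edge = {root: None}
    order = []
    queue = deque([root])
    while queue:                      # breadth-first search builds the spanning tree
        v = queue.popleft()
        order.append(v)
        for w, idx in adjacent[v]:
            if w not in parent_edge:
                parent_edge[w] = idx
                queue.append(w)
    value = [0] * len(edges)          # non-tree edges keep the value 0
    for v in reversed(order):         # children are always handled before their parent
        idx = parent_edge[v]
        if idx is None:
            continue                  # the root equation holds because the labels sum to 0
        residual = (sigma[v] - vertex_sum(v, edges, value)) % q
        tail, _ = edges[idx]
        value[idx] = residual if tail == v else (-residual) % q
    return value

def satisfies(vertices, edges, sigma, q, value):
    """Check every vertex equation modulo q."""
    return all(vertex_sum(v, edges, value) % q == sigma[v] % q for v in vertices)

# Complete graph on four vertices, oriented from smaller to larger index, over Z_4.
k4_vertices = [0, 1, 2, 3]
k4_edges = [(u, v) for u in k4_vertices for v in k4_vertices if u < v]
k4_sigma = {0: 1, 1: 1, 2: 3, 3: 3}   # labels sum to 8, which is 0 in Z_4
k4_solution = solve_tseitin(k4_vertices, k4_edges, k4_sigma, 4)
```

Calling `satisfies(k4_vertices, k4_edges, k4_sigma, 4, k4_solution)` confirms that the computed values solve every vertex equation of the example.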

Sometimes we want to consider subsystems of $\mathbb{A}(H,\sigma)$ induced by some subset of vertices. Let $W\subseteq V(H)$. By $\mathbb{A}(W,\sigma)$ we denote the system of linear equations obtained from $\mathbb{A}(H,\sigma)$ by removing all equations corresponding to vertices in $V(H)\setminus W$. In particular, $\mathbb{A}(V(H),\sigma)=\mathbb{A}(H,\sigma)$. Moreover, by $\mathbb{A}(\partial(W),\sigma)$ we denote the system of linear equations consisting of the single equation $\sum_{e\in\partial_{+}(W)}e-\sum_{e\in\partial_{-}(W)}e=\sum_{v\in W}\sigma(v)$, which is the sum of all the equations in $\mathbb{A}(W,\sigma)$. It turns out that whenever the subgraph induced by $W$ is connected, $\mathbb{A}(\partial(W),\sigma)$ carries the essential information about the satisfiability of $\mathbb{A}(W,\sigma)$. The following is an easy generalization of well-known properties of Tseitin tautologies over $\mathbb{Z}_{2}$.

Lemma 24.

Let $H=(V(H),E(H))$ be a directed graph, let $W\subseteq V(H)$ be a subset of its vertices such that the subgraph induced by $W$ is connected, and let $\sigma\colon W\rightarrow G$ be a labelling of the vertices in $W$. An assignment $f:\partial(W)\rightarrow G$ extends to a solution to $\mathbb{A}(W,\sigma)$ if and only if it satisfies $\mathbb{A}(\partial(W),\sigma)$.

Proof.

Since $\mathbb{A}(\partial(W),\sigma)$ is the sum of the equations in $\mathbb{A}(W,\sigma)$, the left-to-right direction is obvious. For the opposite direction, let $f:\partial(W)\rightarrow G$ be an assignment satisfying $\mathbb{A}(\partial(W),\sigma)$. Let us denote by $H^{\prime}$ the graph induced by $W$. Observe that by assigning values to the variables in $\partial(W)$ according to $f$ and moving all the constants in the system $\mathbb{A}(W,\sigma)$ to the right we obtain a system $\mathbb{A}(H^{\prime},\sigma^{\prime})$ for some labelling $\sigma^{\prime}\colon W\rightarrow G$ of the vertices in $H^{\prime}$ which satisfies $\sum_{v\in W}\sigma^{\prime}(v)=0$. By Lemma 23 there exists a solution $g$ to the system $\mathbb{A}(H^{\prime},\sigma^{\prime})$. By extending $f$ with $g$ we obtain a solution to $\mathbb{A}(W,\sigma)$. ∎

Hard Tseitin graph tautologies are usually based on graphs that are expanders. For an undirected graph $H=(V(H),E(H))$ the expansion constant is:

$$e(H)=\min\left\{\frac{|\partial(W)|}{|W|}\colon W\subseteq V(H),\ |W|\leq\frac{|V(H)|}{2}\right\}.$$

We call a family $\mathcal{H}$ of undirected graphs a family of expander graphs if it is infinite and there exists a positive constant $e$ such that $e(H)\geq e$ for every graph $H$ in $\mathcal{H}$. For more information on expanders see e.g. the survey [31]. Here we only need the well-known fact that expander families exist.

Fact 1.

For every integer $l\geq 3$ there exists a family of connected $l$-regular undirected expander graphs $(H_{n})_{n\geq 1}$, where the graph $H_{n}$ has $\Theta(n)$ vertices.
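For very small graphs the expansion constant can be evaluated directly from its definition. The sketch below enumerates all non-empty vertex sets $W$ with $|W|\leq|V(H)|/2$, so it is exponential in the number of vertices and only meant as an illustration; the function name and the edge-list representation are assumptions.

```python
from fractions import Fraction
from itertools import combinations

def expansion_constant(vertices, edges):
    """Minimum of |boundary(W)| / |W| over non-empty W with |W| <= |V| / 2."""
    best = None
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            inside = set(subset)
            # An edge is in the boundary when exactly one endpoint lies in W.
            boundary = sum(1 for u, v in edges if (u in inside) != (v in inside))
            ratio = Fraction(boundary, size)
            if best is None or ratio < best:
                best = ratio
    return best

# Two small 3-regular graphs: the complete graph K4 and the 3-dimensional cube.
k4_nodes = list(range(4))
k4_links = [(u, v) for u in k4_nodes for v in k4_nodes if u < v]
cube_nodes = list(range(8))
cube_links = [(u, u ^ (1 << bit)) for u in cube_nodes for bit in range(3) if u < u ^ (1 << bit)]
e_k4 = expansion_constant(k4_nodes, k4_links)
e_cube = expansion_constant(cube_nodes, cube_links)
```

The minimum for $K_{4}$ is attained by a pair of vertices and for the cube by a face, giving the values $2$ and $1$ respectively.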

Our proof strategy is now the following. We take a family $(\bar{H}_{n})_{n\geq 1}$ of connected $3$-regular undirected expander graphs and show that for every sufficiently large integer $n$, one can specify edge directions in $\bar{H}_{n}$ obtaining a directed graph $H_{n}$, and choose an appropriate labelling $\sigma_{n}$ of the vertices of $H_{n}$ with elements of $G$, such that every Frege refutation of $\mathrm{CNF}(\mathbb{A}(H_{n},\sigma_{n}),\mathbb{B}(G,3))$ of depth $d$ has size at least $2^{n^{\delta}}$. To this end, following the lines of [17], we reduce the onto-pigeonhole principle, which states that there is no bijection between sets of size $m$ and $m+1$, to a Tseitin formula over a complete bipartite graph and further reduce the latter to the Tseitin formula over $H_{n}$. Let us begin with the second reduction.

Reducing Tseitin formulas over a complete bipartite graph.

We now define the graphs $H_{n}$ together with labellings $\sigma_{n}$ and show a reduction of a Tseitin formula over a complete bipartite graph to the Tseitin formula over $H_{n}$.

The following is a special case of Theorem 4.2 in [17].

Theorem 13 ([17]).

If $\mathcal{H}$ is a family of connected $3$-regular expander graphs, then there exists a positive constant $c$ such that for every graph $H$ in $\mathcal{H}$, the set of its vertices $V(H)$ can be partitioned into $V_{1},\ldots,V_{h}$ where $h\geq c|V(H)|^{1/3}$ and:

  • For every $i\in[h]$ the subgraph induced by $V_{i}$ is connected.

  • For any $1\leq i<j\leq h$, there is at least one edge incident to some vertex in $V_{i}$ and to some vertex in $V_{j}$.

Let us fix a family $(\bar{H}_{n})_{n\geq 1}$ of connected $3$-regular undirected expander graphs, where the graph $\bar{H}_{n}$ has $\Theta(n)$ vertices. Let $c$ be the constant whose existence follows from the theorem above. Consider the graph $\bar{H}_{n}$, for some $n\geq(5/c)^{3}$, and take a partition of the set of vertices $V(\bar{H}_{n})$ into $h\geq cn^{1/3}\geq 5$ subsets satisfying the conditions given in Theorem 13. Let us call them bubbles. Without loss of generality we can assume that the number of bubbles is odd, i.e., $h=2m+1$; otherwise we remove the bubbles $V_{h-1}$ and $V_{h}$ and replace them by the single bubble $V_{h-1}\cup V_{h}$. The set of bubbles is denoted $\mathcal{W}$.

Based on the undirected expander graph $\bar{H}_n$, together with the partition of the set of its vertices into $2m+1$ bubbles, we define a directed graph $H_n$ and a labelling $\sigma_n$. The idea is to simulate the complete bipartite graph $K_{m,m+1}$. To this end, let us fix a partition of the set of bubbles into two disjoint sets: $\{V_1,\ldots,V_m\}$ and $\{W_1,\ldots,W_{m+1}\}$. For each $i\in[m]$ and each $j\in[m+1]$ let us fix an edge $e_{i,j}$ incident to some vertex in $V_i$ and to some vertex in $W_j$. For future reference, let us say that we paint those edges red. We fix the direction of each of those edges from $W_j$ to $V_i$. The directions of the rest of the edges in the graph $H_n$ are set arbitrarily. Now, for each $i\in[m]$ we fix one vertex $v_i$ in the bubble $V_i$, paint it blue and label it with $-1$; similarly, for each $j\in[m+1]$ we fix one vertex $w_j$ in the bubble $W_j$, paint it green and label it with $1$. The rest of the vertices of the graph $H_n$ are labelled with $0$. This finishes the definition of the directed graph $H_n$ and the labelling $\sigma_n$.

Observe that the Tseitin tautology $\mathbb{A}(H_n,\sigma_n)$ is unsatisfiable. Indeed, the sum of all labels of the vertices of $H_n$ is $(m+1)\cdot 1-m\cdot 1=1\neq 0$.
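The counting argument behind this observation can be spelled out as follows. This is only a sketch, and it assumes (as in the definition of the Tseitin system, where the equation at a vertex $v$ reads $\sum_{e\in\partial_+(v)}e-\sum_{e\in\partial_-(v)}e=\sigma_n(v)$) that every directed edge of $H_n$ belongs to $\partial_+(v)$ for exactly one of its endpoints and to $\partial_-(v)$ for the other:

```latex
\sum_{v\in V(H_n)}\Big(\sum_{e\in\partial_+(v)} e-\sum_{e\in\partial_-(v)} e\Big)
  \;=\; \sum_{e\in E(H_n)} (e-e) \;=\; 0,
\qquad\text{whereas}\qquad
\sum_{v\in V(H_n)}\sigma_n(v) \;=\; (m+1)\cdot 1-m\cdot 1 \;=\; 1 \;\neq\; 0 .
```

Adding up all the equations of the system would therefore give $0=1$ in the group, which no assignment of group elements to the edges can satisfy.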

We now show that by assigning truth values to some variables in $\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))$ we obtain an encoding of $\mathrm{CNF}(\mathbb{A}(K_{m,m+1},\sigma),\mathbb{B}(G))$, where $K_{m,m+1}$ is a complete bipartite graph, and $\sigma$ is some labelling of its vertices with elements of the group $G$. Let us first make some general comments about partial assignments for instances of $\mathrm{LIN}(G)$ and variable substitutions for corresponding CNFs.

Let $\mathbb{A}$ be any system of linear equations over the group $G$. For a partial assignment $\rho$ which maps some of the variables of $\mathbb{A}$ to elements of the group, there is a natural corresponding substitution of the variables in $\mathrm{CNF}(\mathbb{A},\mathbb{B}(G))$: if the partial assignment $\rho$ maps a variable $a$ to $g$, then the variable $X(a,g)$ is substituted by $1$ and the variables $X(a,g')$, for $g'\neq g$, are substituted by $0$; if the partial assignment $\rho$ leaves the value of some variable $a$ unassigned, then for all the variables $X(a,g)$, the substitution is defined by the identity. It is not difficult to see that the result of applying this substitution to $\mathrm{CNF}(\mathbb{A},\mathbb{B}(G))$ is $\mathrm{CNF}(\mathbb{A}|_{\rho},\mathbb{B}(G))$, where $\mathbb{A}|_{\rho}$ is the system of linear equations obtained by applying $\rho$ to the variables of $\mathbb{A}$. For simplicity, we denote the above defined substitution by $\rho$. Hence, $\mathrm{CNF}(\mathbb{A},\mathbb{B}(G))|_{\rho}=\mathrm{CNF}(\mathbb{A}|_{\rho},\mathbb{B}(G))$.
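The correspondence between a partial assignment and the induced substitution can be illustrated with a small sketch. It is restricted to the cyclic group $\mathbb{Z}_q$, and the function names and the representation of an equation as a list of (coefficient, variable) pairs with a right-hand side are assumptions made only for this illustration, not notation from the paper:

```python
def substitution_from_partial_assignment(rho, group):
    """For each variable a assigned by rho, send X(a, g) to 1 if g == rho[a], else to 0.

    Variables not in rho are simply absent: the substitution is the identity on them.
    """
    return {(a, g): int(g == value) for a, value in rho.items() for g in group}


def restrict_equation(equation, rho, q):
    """Apply the partial assignment rho to one linear equation over Z_q.

    `equation` is (terms, rhs), where terms is a list of (coefficient, variable)
    pairs, meaning sum(coefficient * variable) = rhs (mod q).
    """
    terms, rhs = equation
    remaining = []
    for coeff, var in terms:
        if var in rho:
            # Move the assigned term to the right-hand side.
            rhs = (rhs - coeff * rho[var]) % q
        else:
            remaining.append((coeff, var))
    return remaining, rhs


if __name__ == "__main__":
    # x + y - z = 1 over Z_3, with z mapped to 2.
    print(restrict_equation(([(1, "x"), (1, "y"), (-1, "z")], 1), {"z": 2}, 3))
    print(substitution_from_partial_assignment({"z": 2}, range(3)))
```

In this toy representation, restricting the equation first and encoding it afterwards mirrors the identity $\mathrm{CNF}(\mathbb{A},\mathbb{B}(G))|_{\rho}=\mathrm{CNF}(\mathbb{A}|_{\rho},\mathbb{B}(G))$ stated above.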

Coming back to the graph $H_n$, let us consider a partial assignment $\rho$ which, for each of the bubbles $V\in\mathcal{W}$, maps the non-red edges in $\partial(V)$ to the group element $0$, and leaves the value of the rest of the edges in $H_n$ unassigned. Observe that $\rho$ does not falsify any of the equations in $\mathbb{A}(H_n,\sigma_n)$. Indeed, since every subgraph induced by a single bubble is connected, for every vertex $v$, the value of at least one variable that appears in the equation $\sum_{e\in\partial_+(v)}e-\sum_{e\in\partial_-(v)}e=\sigma_n(v)$ is left unassigned. Moreover, for every bubble $V\in\mathcal{W}$ the equation $\mathbb{A}(\partial(V),\sigma_n)|_{\rho}$ says that the sum of the red edges in $\partial(V)$ is $1$. This is clear for the bubbles in $\{W_1,\ldots,W_{m+1}\}$, and for the bubbles in $\{V_1,\ldots,V_m\}$ one only needs to multiply the corresponding equations by $-1$.

Consider a complete bipartite graph $K_{m,m+1}$ with $m$ blue vertices, $m+1$ green vertices and a directed red edge from every green vertex to every blue vertex. Let the labelling $\sigma$ assign $-1$ to the blue vertices and $1$ to the green ones. The Tseitin tautology $\mathbb{A}(K_{m,m+1},\sigma)$ is, up to renaming of variables, the same as the set of equations in:

$$\bigcup_{i\in[m]}\mathbb{A}(\partial(V_i),\sigma_n)|_{\rho}\ \cup\bigcup_{j\in[m+1]}\mathbb{A}(\partial(W_j),\sigma_n)|_{\rho}\ =\bigcup_{V\in\mathcal{W}}\mathbb{A}(\partial(V),\sigma_n)|_{\rho}.$$

Therefore, from now on we denote the above system of linear equations by $\mathbb{A}(K_{m,m+1},\sigma)$. Note that, for each of the vertices $v$ of the graph $K_{m,m+1}$, the corresponding equation says that the sum of the variables in $\partial(v)$ is $1$.

Let $r=\max(3,|G|)$. For $m\leq l$, an $r$-CNF $F$ over variables $X_1,\ldots,X_l$ is called an implicit encoding [17] of a propositional formula $\psi$ over variables $X_1,\ldots,X_m$ if the following holds: a truth assignment to the variables of $\psi$ satisfies $\psi$ if and only if it can be extended to a truth assignment to the variables of $F$ which satisfies $F$. The variables $X_{m+1},\ldots,X_l$ are called auxiliary variables.

It follows from Lemma 24 that, for each bubble $V\in\mathcal{W}$, the formula $\mathrm{CNF}(\mathbb{A}(V,\sigma_n),\mathbb{B}(G))$ is an implicit encoding of the formula $\mathrm{CNF}(\mathbb{A}(\partial(V),\sigma_n),\mathbb{B}(G))$, with the set of auxiliary variables being the set of edges of the subgraph induced by $V$. Since on this set of edges the substitution $\rho$ is defined as the identity, it is not difficult to see that, for each bubble $V\in\mathcal{W}$, the substituted formula $\mathrm{CNF}(\mathbb{A}(V,\sigma_n),\mathbb{B}(G))|_{\rho}$ is an implicit encoding of the substituted formula $\mathrm{CNF}(\mathbb{A}(\partial(V),\sigma_n),\mathbb{B}(G))|_{\rho}$. Moreover, the sets of auxiliary variables in those implicit encodings are pairwise disjoint, hence the formula

$$\bigcup_{V\in\mathcal{W}}\mathrm{CNF}(\mathbb{A}(V,\sigma_n),\mathbb{B}(G,3))|_{\rho}=\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))|_{\rho}$$

is an implicit encoding of the formula

$$\bigcup_{V\in\mathcal{W}}\mathrm{CNF}(\mathbb{A}(\partial(V),\sigma_n),\mathbb{B}(G))|_{\rho}=\mathrm{CNF}(\mathbb{A}(K_{m,m+1},\sigma),\mathbb{B}(G)).$$

This way we have reduced an implicit encoding of a Tseitin formula over a complete bipartite graph $K_{m,m+1}$ to the Tseitin formula over the expander graph $H_n$, where $m>Cn^{1/3}$, and $C$ is a constant which does not depend on $n$.

Reducing the pigeonhole principle.

We now use the technique for removing auxiliary variables without significantly increasing the proof size introduced in [17] to reduce the onto-pigeonhole principle formula $\mathrm{OPHP}(m,m+1)$, as defined below, to the formula $\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))|_{\rho}$.

For a positive integer $l$ and a set of variables $X_1,\ldots,X_l$, we denote by $\mathcal{U}(X_1,\ldots,X_l)$ the CNF which has a clause $\bigvee_{i\in[l]}X_i$, and for every $1\leq i<i'\leq l$, a clause $\overline{X_i}\vee\overline{X_{i'}}$. For a complete bipartite graph $K_{l,l+1}$, the onto-pigeonhole principle $\mathrm{OPHP}(l,l+1)$ is the CNF which is the union of $\mathcal{U}(\partial(v))$ over the set of all vertices $v$ of the graph.
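As an illustration of this definition, the following sketch builds $\mathcal{U}$ and $\mathrm{OPHP}(l,l+1)$ explicitly and checks unsatisfiability by brute force for very small $l$. The representation of a literal as a (variable, polarity) pair, the naming of the edge variables as pairs $(i,j)$, and the function names are assumptions made only for this example:

```python
from itertools import combinations, product


def exactly_one(variables):
    """U(X_1, ..., X_l): one clause 'some X_i holds' plus all clauses 'not X_i or not X_i2'."""
    clauses = [[(x, True) for x in variables]]
    clauses += [[(x, False), (y, False)] for x, y in combinations(variables, 2)]
    return clauses


def ophp(l):
    """OPHP(l, l+1): union of U(boundary(v)) over all vertices v of K_{l,l+1}.

    The variable (i, j) stands for the edge between vertex i on the side of
    size l and vertex j on the side of size l + 1.
    """
    clauses = []
    for i in range(l):
        clauses += exactly_one([(i, j) for j in range(l + 1)])
    for j in range(l + 1):
        clauses += exactly_one([(i, j) for i in range(l)])
    return clauses


def satisfiable(clauses):
    """Brute-force satisfiability test; only feasible for a handful of variables."""
    variables = sorted({x for clause in clauses for x, _ in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[x] == pol for x, pol in clause) for clause in clauses):
            return True
    return False


if __name__ == "__main__":
    for l in (1, 2):
        print(l, len(ophp(l)), satisfiable(ophp(l)))
```

A satisfying assignment would select a perfect matching of $K_{l,l+1}$, that is, a bijection between sets of sizes $l$ and $l+1$, which is why the brute-force search finds none.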

Let us consider the following substitution of the variables in $\mathrm{CNF}(\mathbb{A}(K_{m,m+1},\sigma),\mathbb{B}(G))$ and its implicit encoding $\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))|_{\rho}$: for every red edge $e$ the variable $X(e,1)$ is substituted by $e$, the variable $X(e,0)$ is substituted by $\overline{e}$, and for every $g\in G$ such that $g\notin\{0,1\}$, the variable $X(e,g)$ is substituted by $0$. On the auxiliary variables of the implicit encoding the substitution is defined by the identity. For simplicity, let us consider this substitution as an extension of the substitution $\rho$, and let us denote it by $\rho'$. Intuitively, the substituted formula $\mathrm{CNF}(\mathbb{A}(K_{m,m+1},\sigma),\mathbb{B}(G))|_{\rho'}$ encodes those assignments to the variables of $\mathbb{A}(K_{m,m+1},\sigma)$ that map each variable either to the group element $0$ or to the group element $1$. Setting the truth value of the variable $e$ to $1$ corresponds to mapping $e$ to $1$, and setting it to $0$ corresponds to mapping $e$ to $0$.

Observe that for every vertex $v\in K_{m,m+1}$, we have $\mathcal{U}(\partial(v))\models\mathrm{CNF}(\mathbb{A}(v,\sigma),\mathbb{B}(G))|_{\rho'}$. Indeed, $\mathcal{U}(\partial(v))$ is satisfied if and only if exactly one of the variables in $\partial(v)$ is assigned the truth value $1$. This truth assignment corresponds to mapping exactly one of the red edges incident to $v$ to the group element $1$ and mapping the rest of the red edges incident to $v$ to the identity element $0$. It is not difficult to see that such an assignment satisfies the equation $\mathbb{A}(v,\sigma)$.

For a CNF $F$ with variables $X_1,\ldots,X_l$, by $\mathrm{DNF}(F)$ we denote the $l$-DNF formula which, for every truth assignment satisfying $F$, has an $l$-term representing this assignment, i.e., the unique $l$-term which is satisfied by this assignment and no other.

We now have all the ingredients necessary to remove the auxiliary variables using the technique from [17]. We remark that the Frege system studied therein differs from the system considered in the present paper: its formulas are formed from variables using negation and disjunction only, and it has no conjunction-introduction rule. However, it follows from the theorem of [44] that those two Frege systems polynomially simulate each other up to a constant factor loss in depth. Therefore, since the lower bound we aim at is exponential and for all constant depths, we can apply the technique from [17].

We have the following:

  • $\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))|_{\rho'}$ is an implicit encoding of $\mathrm{CNF}(\mathbb{A}(K_{m,m+1},\sigma),\mathbb{B}(G))|_{\rho'}$;

  • for every vertex $v\in K_{m,m+1}$, we have that $\mathcal{U}(\partial(v))\models\mathrm{CNF}(\mathbb{A}(v,\sigma),\mathbb{B}(G))|_{\rho'}$;

  • it follows from Lemma 5.7 in [17] that, for every vertex $v\in K_{m,m+1}$, the size of a depth $4$ Frege derivation of the formula $\overline{\mathcal{U}(\partial(v))}\vee\mathrm{DNF}(\mathcal{U}(\partial(v)))$ is bounded by $O(m^2)$.

Hence, by Theorem 5.5 of [17], if $\mathrm{CNF}(\mathbb{A}(H_n,\sigma_n),\mathbb{B}(G,3))|_{\rho'}$ has a Frege refutation of depth $d$ and size $s$, then there exists a Frege refutation of $\mathrm{OPHP}(m,m+1)=\bigcup_{v\in V(K_{m,m+1})}\mathcal{U}(\partial(v))$ of depth $d+10$ and size at most polynomial in $m^4s$, that is, of size at most polynomial in $n^{4/3}s$.

To complete the proof in the case of $G=\mathbb{Z}_q$ it suffices to refer to the following theorem proved independently in [16] and [37] as an exponential improvement over [1].

Theorem 14 (The Jewel Theorem of Proof Complexity [16, 37, 1]).

For every integer $d$ there exists a constant $\delta$ such that for every sufficiently large integer $m$ every Frege refutation of $\mathrm{OPHP}(m,m+1)$ of depth $d$ has size at least $2^{m^{\delta}}$.

It remains to show that, thanks to the Fundamental Theorem of Finite Abelian Groups, the special case of $G=\mathbb{Z}_q$ implies Theorem 12 in full generality.

Lemma 25.

Let $G=\bigoplus_{q\in\mathcal{Q}}\mathbb{Z}_q$ be a finite Abelian group, and let $n$, $d$ and $s$ be positive integers. If for some $q\in\mathcal{Q}$ there is an unsatisfiable instance $\mathbb{A}$ of $3\mathrm{LIN}(\mathbb{Z}_q)$ with $n$ variables such that every Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}(\mathbb{Z}_q,3))$ of depth $d$ has size at least $s$, then there is an unsatisfiable instance $\mathbb{A}'$ of $3\mathrm{LIN}(G)$ with $n$ variables such that every Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B}(G,3))$ of depth $d$ has size at least $s$.

Proof.

Let $\mathbb{A}$ be an unsatisfiable instance of $3\mathrm{LIN}(\mathbb{Z}_q)$ with $n$ variables, and assume that every Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}(\mathbb{Z}_q,3))$ of depth $d$ has size at least $s$. The instance $\mathbb{A}$ is a system of linear equations over the group $\mathbb{Z}_q$. Since $\mathbb{Z}_q$ is a subgroup of $G$, we can think of the same system of linear equations as a system of linear equations over the group $G$. Let $\mathbb{A}'$ be the corresponding instance of $3\mathrm{LIN}(G)$. It is not difficult to see that it is unsatisfiable. Moreover, every Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B}(G,3))$ of depth $d$ has size at least $s$. Indeed, by applying a substitution $\rho$ which for every $a\in A'$ and every $g\in G\setminus\mathbb{Z}_q$ substitutes the variable $X(a,g)$ with $0$ and on all other variables is defined by the identity, we transform a Frege refutation of $\mathrm{CNF}(\mathbb{A}',\mathbb{B}(G,3))$ of depth $d$ into a Frege refutation of $\mathrm{CNF}(\mathbb{A},\mathbb{B}(\mathbb{Z}_q,3))$ of depth $d$. ∎

6.3 Lower bound for Polynomial Calculus

The original motivation in [22] for defining the Tseitin graph tautologies for Abelian groups beyond $\mathbb{Z}_2$ was to compare the strength of Polynomial Calculus over different fields. Here we use their results with the different purpose of getting lower bounds for Polynomial Calculus over the real-field for all CSPs of unbounded width. Along the lines of the previous section for bounded-depth Frege, this will be a consequence of Theorem 6, Theorem 11, and the following lower bound (for which we use the EQ encoding scheme).

Theorem 15.

For every non-trivial finite Abelian group $G$ there exists a positive constant $\delta$ and a family of unsatisfiable instances $(\mathbb{A}_n)_{n\geq 1}$ of $3\mathrm{LIN}(G)$, where $\mathbb{A}_n$ has $\Theta(n)$ variables and $\Theta(n)$ equations, such that for every sufficiently large $n$ every PC refutation over the reals of $\mathrm{EQ}(\mathbb{A}_n,\mathbb{B}(G,3))$ has degree at least $\delta n$.

By the same argument as in the previous section, Theorem 15 will follow from the special case for Abelian groups of the form $\mathbb{Z}_m$ proved in [22]. Let us note that the statement in [22] is made only for fields of prime characteristic and for prime $m$, but the same proof goes through for arbitrary fields whose characteristic does not divide $m$.

Strictly speaking, the form of the Tseitin system of equations that we defined in the previous section is slightly more general than the original one from [22]. In [22], the definition starts with an undirected graph $H$ and, given a labelling $\sigma:V(H)\rightarrow\mathbb{Z}_m$, the system of equations $\hat{\mathbb{A}}(H,\sigma)$ over $\mathbb{Z}_m$ is defined as follows:

  • there is a pair of variables $(u,v)$ and $(v,u)$ for each edge $\{u,v\}$ in $E(H)$,

  • for every edge $\{u,v\}$ of $E(H)$ there is an equation $(u,v)+(v,u)=0$,

  • for every vertex $u$ of $V(H)$ there is an equation

    $$\sum_{\substack{v\in V(H):\\ \{u,v\}\in E(H)}}(u,v)=\sigma(u).$$

Let us see that $\hat{\mathbb{A}}(H,\sigma)$ is isomorphic to the system $\mathbb{A}(\hat{H},\hat{\sigma})$ for an appropriately defined directed graph $\hat{H}$ and an appropriately defined labelling $\hat{\sigma}$. The set of vertices $V(\hat{H})$ of $\hat{H}$ is $V(H)\cup E(H)$. The set of edges $E(\hat{H})$ of $\hat{H}$ has two directed edges $(u,e)$ and $(v,e)$ for each undirected edge $e=\{u,v\}$ of $H$. The labelling $\hat{\sigma}$ is the extension of $\sigma$ to $V(\hat{H})\supseteq V(H)$ defined by $\hat{\sigma}(e)=0$ for each $e\in E(H)$. It is not hard to see that the mapping $(u,e)\mapsto(u,v)$, for $e=\{u,v\}$, is an isomorphism from $\mathbb{A}(\hat{H},\hat{\sigma})$ to $\hat{\mathbb{A}}(H,\sigma)$. This justifies the claim that the definition of the Tseitin system from the previous section is a generalization of the definition in [22]. Another sense in which the definition of the Tseitin system from the previous section is more general is that the original definition requires $m$ in $\mathbb{Z}_m$ to be a prime number; however, going through their proof it is readily seen that this is not essential. Finally, the original definition also requires the condition that the sum of the labels $\sigma(u)$ is $1$ (mod $m$), but again this is not essential in their proof as long as the sum is non-zero.
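The construction of $\hat{H}$ and $\hat{\sigma}$ can be sketched in a few lines. The function name `subdivide` and the representation of graphs as Python lists are assumptions made for this illustration only; the sketch builds the graph and the labelling and checks their shape, and it does not verify the isomorphism of the two equation systems:

```python
def subdivide(vertices, edges, sigma):
    """Build the directed graph H-hat and the labelling sigma-hat from an undirected H.

    vertices: list of vertex names; edges: list of 2-element tuples;
    sigma: dict mapping each vertex to its label.
    """
    edge_nodes = [frozenset(e) for e in edges]
    # V(H-hat) = V(H) together with E(H).
    hat_vertices = list(vertices) + edge_nodes
    # Two directed edges (u, e) and (v, e) for each undirected edge e = {u, v}.
    hat_edges = [(u, e) for e in edge_nodes for u in sorted(e)]
    # sigma-hat extends sigma with label 0 on the new vertices.
    hat_sigma = dict(sigma)
    hat_sigma.update({e: 0 for e in edge_nodes})
    return hat_vertices, hat_edges, hat_sigma


if __name__ == "__main__":
    # A triangle with total label 1.
    V, E, S = subdivide([0, 1, 2], [(0, 1), (1, 2), (0, 2)], {0: 1, 1: 0, 2: 0})
    print(len(V), len(E), sum(S.values()))
```

Since the new vertices all receive the label $0$, the total sum of the labels is unchanged by the construction.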

Let $\mathbb{B}'(\mathbb{Z}_m,3)$ be the template $\mathbb{B}(\mathbb{Z}_m,3)$ extended with the binary relation $\{(g,g')\in\mathbb{Z}_m^2:g+g'=0\}$, and let $\mathrm{EQ}'$ denote the modification of the encoding scheme $\mathrm{EQ}$ in which each twin variable $\bar{X}(a,b)$ is replaced by $1-X(a,b)$. It turns out that the system of polynomial equations $\mathrm{EQ}'(\hat{\mathbb{A}}(H_n,\sigma_n),\mathbb{B}'(\mathbb{Z}_m,3))$ for a fixed family of $3$-regular expander graphs $(H_n)_{n\geq 1}$ and a labelling $\sigma_n:V(H_n)\rightarrow\mathbb{Z}_m$ of total sum $1$ mod $m$ is literally the same as the system of polynomial equations that [22] calls $\mathrm{BTS}_{n,m}$. Note that $\mathrm{BTS}_{n,m}$ has $\Theta(n)$ variables. We have the following:

Theorem 16 (see Corollary 21 in [22]).

For every integer $m\geq 2$ and every field $F$ of a characteristic that does not divide $m$ there exists a positive $\delta$ such that for every sufficiently large $n$ every PC refutation over $F$ of $\mathrm{BTS}_{n,m}$ has degree at least $\delta n$.

This gives us a family of instances of $\mathbb{B}'(\mathbb{Z}_m,3)$ that are hard for Polynomial Calculus over the real-field. Since the template $\mathbb{B}'(\mathbb{Z}_m,3)$ is pp-definable in $\mathbb{B}(\mathbb{Z}_m,3)$, Theorem 6 implies the existence of such a family for $3\mathrm{LIN}(\mathbb{Z}_m)$.

In order to complete the proof of Theorem 15 from Theorem 16 it suffices to invoke a version of Lemma 25 for Polynomial Calculus, whose statement and proof are virtually identical to those of Lemma 25, and are thus omitted.

6.4 Lower bound for Sums-of-Squares

In the case of Sums-of-Squares, similarly as for Polynomial Calculus, we do not need to adapt an existing lower bound proof from the literature for $\mathbb{Z}_2$ to all finite Abelian groups because this was already done. The lower bound that we need to complete the proof of Theorem 10 is the following:

Theorem 17 ([23]).

For every non-trivial finite Abelian group $G$ there exists a positive $\delta$ and a family of unsatisfiable instances $(\mathbb{A}_n)_{n\geq 1}$ of $3\mathrm{LIN}(G)$, where $\mathbb{A}_n$ has $\Theta(n)$ variables and $\Theta(n)$ equations, such that for every sufficiently large $n$ every SOS refutation of $\mathrm{EQ}(\mathbb{A}_n,\mathbb{B}(G,3))$ has degree at least $\delta n$.

The exact statement that we are referring to is Theorem G.8 from Appendix G in [23]. In order to be able to state the theorem and compare it to the statement of Theorem 17 we need to introduce some definitions.

Let $G$ be a finite Abelian group and let $C$ be a subgroup of $G^k$, where $k\geq 3$. The problem Additive-CSP$(C)$, as defined in [23], is the constraint satisfaction problem that has constraint relations of the form $\{(c_1,\ldots,c_k):(c_1-b_1,\ldots,c_k-b_k)\in C\}$, for all $(b_1,\ldots,b_k)\in G^k$. Note that if the set of variables is $V$, then the set of all possible constraints can be identified with the set $V^k\times G^k$. The instances are presented as distributions $\pi$ over $V^k\times G^k$. This amounts to assigning weights to the constraints. The value of an instance is the maximum over all assignments of values to variables of the probability that a random constraint chosen from $\pi$ is satisfied by the assignment. We say that $C\subseteq G^k$ is balanced pairwise independent if for every pair $i,j\in[k]$ with $i\neq j$, and every two elements $a,b\in G$, the number of $k$-tuples $(c_1,\ldots,c_k)$ from $C$ such that $c_i=a$ and $c_j=b$ is $|C|/|G|^2$. For example, any $C$ of the form $\{(c_1,\ldots,c_k):c_1+\cdots+c_k=0\}$ is balanced pairwise independent, and it is a subgroup of $G^k$.
Chan’s Theorem G.8 in [23] states that if $C$ is any balanced pairwise independent subgroup of $G^k$ and $\epsilon$ is an arbitrary positive constant, then for every sufficiently large $n$, there is an instance $M$ of Additive-CSP$(C)$ with $n$ variables, whose value is bounded by $|C|/|G|^k+\epsilon$, and that has a Lasserre solution of value $1$ for $cn$ rounds, where $c=c_{G,k,\epsilon}$ is a constant that depends only on the group $G$, the arity $k$, and the tolerance parameter $\epsilon$. Moreover, it follows from the proof in [23] (see Theorem G.7) that the instance $M$ can be chosen to have $e_{G,k,\epsilon}n$ constraints, where $e_{G,k,\epsilon}$ is a constant that depends only on the group $G$, the arity $k$, and the parameter $\epsilon$. We next discuss what a Lasserre solution is and how it relates to SOS proofs.

Before we do that we fix some of the parameters. We want to build an unsatisfiable instance $\mathbb{A}$, and we do so by choosing the parameters to make the value of $M$ in Chan’s Theorem strictly smaller than $1$. Fix $k=3$ and $C=\{(c_1,c_2,c_3):c_1+c_2+c_3=0\}$, and take $\epsilon=1/4$. Then the value of the instance $M$ is bounded by $|C|/|G|^3+\epsilon=1/|G|+1/4\leq 1/2+1/4<1$. This means that the collection of constraints of $M$ that have non-zero probability in $\pi$ is unsatisfiable; i.e., not all constraints can be satisfied at the same time by a single assignment. Thus, our unsatisfiable instance $\mathbb{A}$ will just be the set of all constraints with non-zero probability in $\pi$. Now we are ready to define what a Lasserre solution of value $1$ is.
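The two facts used in this choice of parameters, namely that $C$ is balanced pairwise independent and that $|C|/|G|^3+\epsilon<1$, can be checked mechanically for small groups. The sketch below does so only for cyclic groups $\mathbb{Z}_q$ (the text allows any finite Abelian group), and the function names are assumptions made for this example:

```python
from fractions import Fraction
from itertools import product


def zero_sum_triples(q):
    """C = {(c1, c2, c3) in Z_q^3 : c1 + c2 + c3 = 0}."""
    return [c for c in product(range(q), repeat=3) if sum(c) % q == 0]


def is_balanced_pairwise_independent(C, q):
    """Check that every pair of coordinates takes each value pair exactly |C| / q^2 times."""
    k = len(C[0])
    target = Fraction(len(C), q * q)
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            for a, b in product(range(q), repeat=2):
                count = sum(1 for c in C if c[i] == a and c[j] == b)
                if count != target:
                    return False
    return True


def value_bound(q, eps=Fraction(1, 4)):
    """The upper bound |C| / |G|^3 + eps on the value of the instance, for G = Z_q."""
    return Fraction(len(zero_sum_triples(q)), q ** 3) + eps


if __name__ == "__main__":
    for q in range(2, 6):
        print(q, is_balanced_pairwise_independent(zero_sum_triples(q), q), value_bound(q))
```

For $k=3$, fixing two coordinates determines the third, so each pair of values occurs exactly once and $|C|=|G|^2$, which is where the term $1/|G|$ in the bound comes from.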

According to Definition G.3 from Appendix G in [23], a Lasserre solution of value $1$ for $t$ rounds is a collection $u=\{u_f:f\in G^S,\;S\subseteq V,\;|S|\leq t\}$ of vectors in Euclidean space $\mathbb{R}^d$, of some finite dimension $d$, such that for every $S\subseteq V$ with $|S|\leq 2t$ there exists a probability distribution $\mu_S$ over $G^S$ with the following properties: for every $R,S,T\subseteq V$ with $|S|,|T|\leq t$ and $R=S\cup T$, and every $f\in G^S$ and $g\in G^T$, it holds that

$$\Pr_{h\in\mu_R}[\;h|_S=f\text{ and }h|_T=g\;]=\langle u_f,u_g\rangle, \tag{29}$$

and for every constraint with variables $S$ in the support of $\pi$ and every $f\in G^S$ that does not satisfy this constraint we have

$$\Pr_{h\in\mu_S}[\;h=f\;]=0. \tag{30}$$

At this point we have all the necessary material to argue that $\mathbb{A}$, or more precisely, $\mathrm{EQ}(\mathbb{A},\mathbb{B}(G,3))$, does not have SOS refutations of degree $\delta n$, where $\delta=2c_{G,k,\epsilon}$. Let $\mathrm{EQ}'$ be the result of replacing each twin variable $\bar{X}(a,b)$ in $\mathrm{EQ}(\mathbb{A},\mathbb{B}(G,3))$ by $1-X(a,b)$. By the remarks at the end of Section 2.3, it suffices to show that $\mathrm{EQ}'$ does not have SOS refutations of degree $\delta n$ for the definition of Sums-of-Squares without twin variables. Assume, for the sake of contradiction, that $\mathrm{EQ}'$ does have such an SOS refutation of degree at most $2t$, where $t:=c_{G,k,\epsilon}n$ is the number of rounds for which there exists a Lasserre solution of value $1$ for the instance $M$. The refutation has the form

$$\sum_{i=1}^{r}P_i\cdot S_i=-1 \tag{31}$$

where $P_1,\ldots,P_r$ are polynomials that either come from $\mathrm{EQ}'$, or they are axiom polynomials from the lists (2) and (4) without twin variables, or they are squares, $S_1,\ldots,S_r$ are arbitrary or square polynomials without twin variables as appropriate (i.e., arbitrary if the $P_i$ they multiply come from an equation, and squares if the $P_i$ they multiply come from an inequality), and the total degree of each product $P_i\cdot S_i$ is at most $2t$. Multiplications by $X$ and $1-X$ can be simulated by multiplications by their squares, thanks to the axioms $X^2-X=0$ from (2), so we can assume that the refutation has the form

$$\sum_{i=1}^{m}P_i\cdot S_i+\sum_{i=1}^{\ell}Q_i^2=-1, \tag{32}$$

where $P_1,\ldots,P_m$ are polynomials that either come from $\mathrm{EQ}'$, or they are one of the axiom polynomials of the form $X^2-X$ from (2), and $S_1,\ldots,S_m,Q_1,\ldots,Q_\ell$ are arbitrary polynomials.

Recall that the variables of $\mathrm{EQ}'$ have the form $X(a,b)$ where $(a,b)\in V\times G$. We say that the element $a$ is mentioned in $X(a,b)$, and that it is mentioned in any monomial that contains this variable. Now we define a linear functional $E:\mathcal{P}_{2t}\rightarrow\mathbb{R}$, where $\mathcal{P}_{2t}$ denotes the vector space of polynomials of degree at most $2t$ on the $X(a,b)$-variables, as follows.

For each monomial $M$ of degree at most $2t$ on the $X(a,b)$-variables, with all mentioned elements in $S\subseteq V$, define

$$E(M)=\mathbb{E}_{h\in\mu_S}[\;h(M)\;], \tag{33}$$

where the notation $h(M)$ stands for the evaluation of the monomial $M$ by the partial assignment given by $h$; i.e., all variables $X(a,h(a))$ with $a\in S$ are set to $1$, all variables $X(a,b)$ with $a\in S$ and $b\neq h(a)$ are set to $0$, and all other variables are left unset. Note that (29) ensures that (33) is a well-defined quantity that does not depend on the choice of $S$, as long as $S$ contains all the elements that are mentioned in $M$. Once $E$ is defined for all monomials of degree at most $2t$, we extend it to $\mathcal{P}_{2t}$ by linearity.

The final step in the argument is to show that $E$ evaluates the left-hand side in (32) to some non-negative quantity; this will imply that the identity in (32) does not hold, and finish the proof. In order to prove this, the following matrix $(A_{M,N})_{M,N}$ will be instrumental. The indices are monomials $M$ of degree at most $t$ on the $X(a,b)$-variables. The entry $A_{M,N}$ of $A$ is defined to be $E(MN)$. For later use, observe that if $S$ denotes the set of elements that are mentioned in $M$ and there exists $f\in G^S$ such that $f(M)=1$, then this partial assignment $f$ with domain $S$ is uniquely determined by $M$. We let $f_M\in G^S\cup\{\bot\}$ denote this unique partial assignment $f$ that makes $f(M)=1$, when it exists, or the default value $\bot$ when it does not exist. We argue that Equation (29) ensures that $A$ is a positive semi-definite matrix. First, extend the collection of vectors $u$ to a new collection of vectors $u^*=\{u^*_f:f\in G^S\cup\{\bot\},\;S\subseteq V,\;|S|\leq t\}$ by defining $u^*_f=u_f$ for $f\in G^S$, and $u^*_f=0$ for $f=\bot$. Fix indices $M$ and $N$, and let $S$ and $T$ be the sets of elements mentioned in $M$ and $N$, respectively. Let $R=S\cup T$. Then $E(MN)$, according to its definition (33), is the probability of the event that $h\in\mu_R$ makes $h(MN)=1$, or equivalently, that $h\in\mu_R$ makes $h|_S(M)=1$ and $h|_T(N)=1$, or equivalently, that $h\in\mu_R$ makes $h|_S=f_M$ and $h|_T=f_N$.
Thus, equation (29) and the definition of the extended collection of vectors $u^*$ ensure that $A_{M,N}=\langle u^*_{f_M},u^*_{f_N}\rangle$ and hence $A$ is a Gram matrix. Thus $A$ is positive semi-definite.

Now we use the positive semi-definiteness of $A$ to show that, for squares $Q_{i}^{2}$ in (32), we have $E(Q_{i}^{2})\geq 0$. Indeed, if $Q_{i}=\sum_{M}a_{M}M$ where the sum extends over all monomials of degree at most $t$ and $a=(a_{M})_{M}$ is the corresponding vector of coefficients, then

\[
E(Q_{i}^{2})=E\Big(\sum_{M}\sum_{N}a_{M}a_{N}MN\Big)=\sum_{M}\sum_{N}a_{M}a_{N}A_{M,N}=a^{\mathrm{T}}Aa, \tag{34}
\]

which is non-negative because $A$ is a positive semi-definite matrix.
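The step from a Gram matrix to non-negativity in (34) can be checked on a toy example. The following is a minimal sketch, assuming a small hypothetical collection of rational vectors in place of the $u^{*}_{f}$ (the function names and the sample data are illustrative and do not come from the paper). It confirms that $a^{\mathrm{T}}Aa$ equals the squared norm of $\sum_{M}a_{M}u_{M}$, which cannot be negative.

```python
from fractions import Fraction

def gram_matrix(vectors):
    """Return the Gram matrix A[i][j] = <u_i, u_j> of a list of vectors."""
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]

def quadratic_form(A, a):
    """Evaluate a^T A a."""
    n = len(a)
    return sum(a[i] * A[i][j] * a[j] for i in range(n) for j in range(n))

def squared_norm_of_combination(vectors, a):
    """Evaluate || sum_i a_i u_i ||^2, which equals a^T A a for the Gram matrix A."""
    dim = len(vectors[0])
    combo = [sum(a[i] * vectors[i][d] for i in range(len(vectors))) for d in range(dim)]
    return sum(c * c for c in combo)

# Hypothetical vectors standing in for the u*_f; the zero vector plays the role of u*_bot.
U = [[Fraction(1), Fraction(0), Fraction(2)],
     [Fraction(-1), Fraction(3), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(0)]]
A = gram_matrix(U)
coeffs = [Fraction(2), Fraction(-1), Fraction(5)]  # hypothetical coefficient vector a
value = quadratic_form(A, coeffs)
```

Exact rational arithmetic is used so that the comparison between the two expressions is an equality rather than a floating-point approximation.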

For terms in (32) that are liftings of equations from $\mathrm{EQ}'$, the evaluation through $E$ is $0$. This is clear for equations of type 2, since every monomial which contains a pair of variables $X(a,b)$ and $X(a,b')$ with $b\neq b'$ evaluates to $0$ by (33). For the same reason, if we take any equation of type 1 in $\mathrm{EQ}'$, i.e., $P=0$ with $P:=\prod_{b\in G}(1-X(a,b))$ for some $a\in V$, and an arbitrary monomial $M$ on the $X(a,b)$-variables such that $P\cdot M$ has total degree at most $2t$, it holds that

\[
E\Big(\prod_{b\in G}(1-X(a,b))\cdot M\Big)=E(M)-\sum_{b\in G}E(X(a,b)\cdot M). \tag{35}
\]

By (33) again, we have $E(M)=\sum_{b\in G}E(X(a,b)\cdot M)$ and the right-hand side vanishes too. Finally, liftings of equations of type $3$ from $\mathrm{EQ}'$ evaluate to $0$ thanks to equation (30).

For terms in (32) that are liftings of axioms in (2), the evaluation through $E$ is also $0$ since the partial assignments on the $X(a,b)$-variables take Boolean values. All in all, the evaluation of the left-hand side of (32) through $E$ is non-negative, and the proof is complete.

7 Upper bounds in Lovász-Schrijver

In this section we show that all unsatisfiable instances of 3LIN($\mathbb{Z}_2$) have LS refutations of degree $6$ and size polynomial in the number of variables. Indeed, the argument to get a polynomial-size upper bound in constant degree works equally well for 3LIN($\mathbb{Z}_p$), when $p$ is prime, with some inessential complications [5]. We focus on $\mathbb{Z}_2$ for simplicity.

Initial remarks on the encoding.

We identify the elements of the two-element field $\mathbb{Z}_2$ with $\{0,1\}$. Let $\mathbb{E}$ be an instance of $k$LIN($\mathbb{Z}_2$) with $n$ variables. In the encoding $\mathrm{INEQ}(\mathbb{E},\mathbb{B}(\mathbb{Z}_2,k))$ of $\mathbb{E}$ as a system of linear inequalities, there are four variables $X(a,0)$, $X(a,1)$, $\bar{X}(a,0)$, $\bar{X}(a,1)$ for each variable $a$ in $\mathbb{E}$. Note, however, that they are restricted to satisfy $X(a,0)=\bar{X}(a,1)$ and $\bar{X}(a,0)=X(a,1)$ by the inequality $X(a,0)+X(a,1)-1\geq 0$ from $\mathrm{INEQ}$ and the axiom equations in (2), which in this case read $X(a,0)^{2}-X(a,0)=X(a,1)^{2}-X(a,1)=0$ and $X(a,0)+\bar{X}(a,0)-1=X(a,1)+\bar{X}(a,1)-1=0$. Consequently, in the following we will ignore the variables of the type $X(a,0)$ and their twins and keep only the variables $X(a,1)$ and $\bar{X}(a,1)$. In order to simplify the notation even further, we will assume that the variables of $\mathbb{E}$ are called $X_1,\ldots,X_n$, and that those of $\mathrm{INEQ}$ are called $X_1,\ldots,X_n$ and $\bar{X}_1,\ldots,\bar{X}_n$.

We interpret the variables $X_1,\ldots,X_n$ as ranging over $\mathbb{Z}_2$ or $\mathbb{Q}$ depending on the context. Let $E$ be an equation of $\mathbb{E}$, say $E:a_1X_1+\cdots+a_nX_n=b$, where $a_1,\ldots,a_n\in\mathbb{Z}_2$ and $b\in\mathbb{Z}_2$. Without loss of generality we can assume that exactly $k$ of the $a_i$'s are $1$. In $\mathrm{INEQ}$, the encoding of the constraint represented by this equation is given by the following inequalities:

\[
\sum_{i\in T}\bar{X}_i+\sum_{i\in I\setminus T}X_i-1\geq 0\quad\text{for all } T\subseteq I \text{ such that } |T|\equiv 1-b \pmod 2,
\]

where $I=\{i\in[n]:a_i\neq 0\}$. Note that $|I|=k$. We write $\mathcal{S}(E)$ to denote this set of inequalities; it has exactly $2^{k-1}$ inequalities, and all of them are satisfied in $\mathbb{Q}$ by a $\{0,1\}$-assignment if and only if the equation $E$ is satisfied in $\mathbb{Z}_2$ by the same assignment. Let $\mathcal{S}(\mathbb{E})$ be the union of all $\mathcal{S}(E)$ as $E$ ranges over the equations in $\mathbb{E}$. Observe that, except for the small detail that only half of the variables are used, $\mathrm{INEQ}$ is basically the same as $\mathcal{S}(\mathbb{E})$.
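The two claims about $\mathcal{S}(E)$ (its size, and its equivalence with the parity equation on $\{0,1\}$-assignments) can be confirmed by brute force for small $k$. The sketch below is only an illustration under the assumption that the twin variable $\bar{X}_i$ takes the value $1-X_i$; the function names are hypothetical.

```python
from itertools import combinations, product

def clause_sets(I, b):
    """Subsets T of I with |T| = 1 - b (mod 2); S(E) has one inequality per subset."""
    I = sorted(I)
    return [frozenset(T) for r in range(len(I) + 1)
            for T in combinations(I, r) if r % 2 == (1 - b) % 2]

def satisfies_all(I, b, x):
    """Check sum_{i in T} (1 - x_i) + sum_{i in I minus T} x_i - 1 >= 0 for every T."""
    for T in clause_sets(I, b):
        lhs = sum(1 - x[i] for i in T) + sum(x[i] for i in I if i not in T) - 1
        if lhs < 0:
            return False
    return True

def equation_holds(I, b, x):
    """Check the parity equation sum_{i in I} x_i = b over Z_2."""
    return sum(x[i] for i in I) % 2 == b

def check_encoding(k):
    """Compare S(E) with the parity equation on all 0/1 points, for both values of b."""
    I = list(range(k))
    for b in (0, 1):
        if len(clause_sets(I, b)) != 2 ** (k - 1):
            return False
        for x in product((0, 1), repeat=k):
            if satisfies_all(I, b, x) != equation_holds(I, b, x):
                return False
    return True
```

Each inequality rules out exactly one assignment of the wrong parity, which is why the count is $2^{k-1}$.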

Some technical lemmas.

For every linear form $L(X_1,\ldots,X_n)=\sum_{i=1}^{n}a_iX_i$ with rational coefficients $a_1,\ldots,a_n$ and every integer $c$, let $D_c(L)$ be the quadratic polynomial $(L-c)(L-c+1)$. In words, the inequality $D_c(L)\geq 0$ states that $L$ does not fall in the open interval $(c-1,c)$. Such statements have short proofs of low degree:

Lemma 26 ([30]).

For every integer $c$ and for every linear form $L(X_1,\ldots,X_n)=\sum_{i=1}^{n}a_iX_i$ with integer coefficients $a_1,\ldots,a_n$, there is an LS proof of the inequality $D_c(L)\geq 0$ (from nothing) of degree at most $3$ and size polynomial in $\max\{|a_i|:i=1,\ldots,n\}$, $|c|$ and $n$.
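The semantic content of Lemma 26 is easy to test numerically: an integer linear form takes integer values on $\{0,1\}$-points, so it never lands strictly between $c-1$ and $c$. The following sketch checks only this semantic fact, not the existence of the LS proof; the helper names and the sample coefficients are hypothetical.

```python
from fractions import Fraction
from itertools import product

def D(c, value):
    """D_c evaluated at a number: (value - c)(value - c + 1)."""
    return (value - c) * (value - c + 1)

def holds_on_boolean_points(coeffs, c):
    """Check D_c(L) >= 0 at every 0/1 point, for L = sum_i coeffs[i] * X_i."""
    return all(D(c, sum(a * x for a, x in zip(coeffs, point))) >= 0
               for point in product((0, 1), repeat=len(coeffs)))

# Integer coefficients: the inequality holds for every integer c tried.
integer_case = all(holds_on_boolean_points([2, -1, 3], c) for c in range(-3, 7))
# A half-integer value falls inside the forbidden interval (0, 1), so D_1 is negative there.
half_integer_value = D(1, Fraction(1, 2))
```

The half-integer case is the one exploited at the end of the proof of Theorem 18.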

In the following, for $I\subseteq[n]$ and $T\subseteq I$, let $M^{I}_{T}(X_1,\ldots,X_n):=\prod_{i\in T}X_i\prod_{i\in I\setminus T}\bar{X}_i$. As usual, the empty product is $1$, so $M^{\emptyset}_{\emptyset}(X_1,\ldots,X_n)=1$. Such polynomials are called extended monomials.

Lemma 27.

For every $I\subseteq[n]$, there is an LS proof of $\sum_{T\subseteq I}M^{I}_{T}-1=0$ (from nothing) of degree $|I|$ and size polynomial in $2^{|I|}$.

Proof.

For simplicity, let $q=|I|$ and assume $I=\{1,\ldots,q\}$. We build the proof inductively on $q$. For $q=0$, what we need is trivial since the left-hand side is $0$. Assume now $q\in\{1,\ldots,n\}$ and that we have $\sum_{T\subseteq[q-1]}M^{[q-1]}_{T}-1=0$. Multiply this once by $X_q$ and once by $\bar{X}_q$. Adding the results we get $\sum_{T\subseteq[q]}M^{[q]}_{T}-X_q-\bar{X}_q=0$, from which $\sum_{T\subseteq[q]}M^{[q]}_{T}-1=0$ follows by adding the axiom $X_q+\bar{X}_q-1=0$ to it. The size is exponential in $|I|$ because each step of the induction uses the previous identity twice. ∎

The next lemma is as technical as it is useful.

Lemma 28.

Let $T\subseteq I\subseteq[n]$. Then there is an LS proof of $\big(\sum_{i\in I}X_i-|T|\big)M^{I}_{T}=0$ (from nothing) of degree at most $|I|+1$ and size linear in $|I|$.

Proof.

Write $M$ for $M^{I}_{T}$. For every $i\in I\setminus T$, using $X_i\bar{X}_i=0$ we get $X_iM=0$. For every $i\in T$, using $X_i^{2}-X_i=0$ we get $X_iM=M$. Adding up we get $\sum_{i\in I}X_iM=|T|M$. ∎
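The identities of Lemmas 27 and 28 can be sanity-checked on Boolean points. The sketch below assumes that each twin variable $\bar{X}_i$ is evaluated as $1-X_i$ and verifies the two identities semantically for small $|I|$; it says nothing about degree or size of the LS proofs, and the function names are illustrative only.

```python
from itertools import combinations, product

def extended_monomial(I, T, x):
    """Value of M^I_T at a 0/1 point x, with the twin variable set to 1 - x_i."""
    value = 1
    for i in I:
        value *= x[i] if i in T else 1 - x[i]
    return value

def subsets(I):
    """All subsets of I, as Python sets."""
    I = list(I)
    return [set(T) for r in range(len(I) + 1) for T in combinations(I, r)]

def check_lemmas(q):
    """Check both identities for I = {0, ..., q-1} at every 0/1 point."""
    I = range(q)
    for x in product((0, 1), repeat=q):
        # Lemma 27: the extended monomials over I sum to 1.
        if sum(extended_monomial(I, T, x) for T in subsets(I)) != 1:
            return False
        # Lemma 28: (sum_{i in I} x_i - |T|) * M^I_T vanishes.
        for T in subsets(I):
            if (sum(x) - len(T)) * extended_monomial(I, T, x) != 0:
                return False
    return True
```

At a Boolean point exactly one extended monomial is non-zero, namely the one whose $T$ is the set of variables set to $1$, which explains both identities at once.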

Simulating Gaussian elimination.

We use these lemmas to prove the main result of this section.

Theorem 18.

Let $\mathbb{E}$ be an instance of $3\mathrm{LIN}(\mathbb{Z}_2)$ with $n$ variables and $m$ equations. If $\mathbb{E}$ is unsatisfiable, then $\mathcal{S}(\mathbb{E})$ has an LS refutation of degree $6$ and size polynomial in $n$ and $m$.

Proof.

Write $\mathbb{E}$ in matrix form $Ax=b$, where $x$ is a column vector of $n$ variables, $A$ is a matrix in $\mathbb{Z}_2^{m\times n}$, and $b$ is a vector in $\mathbb{Z}_2^{m}$. Let $a_{j,1},\ldots,a_{j,n}$ be the $j$-th row of $A$, so the $j$-th equation of $\mathbb{E}$ is $E_j:a_{j,1}X_1+\cdots+a_{j,n}X_n=b_j$. Assume $\mathbb{E}$ is unsatisfiable over $\mathbb{Z}_2$. Then $b$ cannot be expressed as a $\mathbb{Z}_2$-linear combination of the columns of $A$, so the $\mathbb{Z}_2$-rank of the matrix $[\;A\;|\;b\;]$ exceeds the $\mathbb{Z}_2$-rank of $A$. Since the rank of $A$ is at most $n$, this means that there exists a subset $J$ of at most $n$ rows such that, with arithmetic in $\mathbb{Z}_2$, we have $\sum_{j\in J}a_{j,i}=0$ for every $i\in[n]$, and at the same time $\sum_{j\in J}b_j=1$. In order to simplify the notation, we assume without loss of generality that $J=\{1,\ldots,|J|\}$.
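A set $J$ of rows with these two properties can be found by Gaussian elimination over $\mathbb{Z}_2$ while remembering which original rows were combined. The following is a minimal sketch under that assumption; the function name `find_refuting_rows` is hypothetical, rows are indexed from $0$, and no attempt is made to minimise $|J|$.

```python
def find_refuting_rows(A, b):
    """Return a set J of row indices with sum_{j in J} A[j] = 0 and
    sum_{j in J} b[j] = 1 over Z_2, or None if Ax = b is satisfiable over Z_2."""
    m = len(A)
    n = len(A[0]) if m else 0
    # Each working row carries (coefficients, right-hand side, original rows combined).
    rows = [([a % 2 for a in A[j]], b[j] % 2, {j}) for j in range(m)]
    pivot_row = 0
    for col in range(n):
        pivot = next((r for r in range(pivot_row, m) if rows[r][0][col] == 1), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        p_coeffs, p_rhs, p_used = rows[pivot_row]
        for r in range(m):
            if r != pivot_row and rows[r][0][col] == 1:
                coeffs, rhs, used = rows[r]
                # Adding two rows over Z_2 is XOR; the index sets combine by symmetric difference.
                rows[r] = ([x ^ y for x, y in zip(coeffs, p_coeffs)],
                           rhs ^ p_rhs,
                           used ^ p_used)
        pivot_row += 1
    for coeffs, rhs, used in rows:
        if not any(coeffs) and rhs == 1:
            return used
    return None

# Hypothetical unsatisfiable 3LIN(Z_2) instance on four variables.
A_example = [[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 1, 1]]
b_example = [1, 0, 0]
J_example = find_refuting_rows(A_example, b_example)
```

In the example the three equations sum to $0=1$ over $\mathbb{Z}_2$, so all three rows are returned.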

For every $k\in\{0,\ldots,|J|\}$, define the linear form

\[
L_k(X_1,\ldots,X_n):=\frac{1}{2}\left(\sum_{j=1}^{k}\sum_{i=1}^{n}a_{j,i}X_i+\sum_{j=k+1}^{|J|}b_j\right).
\]

In this definition of $L_k$, the coefficients $a_{j,i}$ and $b_j$ are interpreted as rationals. We provide proofs of $D_c(L_k)\geq 0$ for every $c\in R_k:=\{0,\ldots,(k+1)n\}$ by reverse induction on $k\in\{0,\ldots,|J|\}$.

The base case $k=|J|$ is a special case of Lemma 26. To see why, note that the condition $\sum_{j\in J}a_{j,i}=0$ over $\mathbb{Z}_2$ means that, if arithmetic were done in $\mathbb{Q}$, then $\sum_{j\in J}a_{j,i}$ would be an even natural number. But then all the coefficients of

\[
L_{|J|}(X_1,\ldots,X_n)=\frac{1}{2}\sum_{j=1}^{|J|}\sum_{i=1}^{n}a_{j,i}X_i=\sum_{i=1}^{n}\left(\frac{1}{2}\sum_{j=1}^{|J|}a_{j,i}\right)X_i
\]

are integers. Hence Lemma 26 applies.

Suppose now that $0\leq k\leq|J|-1$ and that we have a proof of $D_d(L_{k+1})\geq 0$ available for every $d\in R_{k+1}$. Fix $c\in R_k$; our immediate goal is to give a proof of $D_c(L_k)\geq 0$. As $k$ is fixed, write $L$ in place of $L_{k+1}$, and let the $(k+1)$-st equation $E_{k+1}$ be written as $\sum_{i\in I}X_i=b$, where $I=\{i\in[n]:a_{k+1,i}=1\}$. Note that $L=L_k+\ell/2$ where $\ell:=-b+\sum_{i\in I}X_i$. Fix $T\subseteq I$ such that $|T|\equiv b \pmod 2$, and let $d=c+(t-b)/2$ where $t=|T|$. Note that $d\in R_{k+1}$ as $c\in R_k$ and $0\leq t\leq n$ and $0\leq b\leq 1$ are such that $t-b$ is even. Multiplying $D_d(L)\geq 0$ by the extended monomial $M^{I}_{T}$ we get $(L-d)(L-d+1)M^{I}_{T}\geq 0$. Replacing $L=L_k+\ell/2$ in the factor $(L-d)$ and recalling $d=c+(t-b)/2$, this inequality can be written as

\[
(L_k-c)(L-d+1)M^{I}_{T}+(L-d+1)\tfrac{1}{2}A\geq 0 \tag{36}
\]

where $A:=(\ell+b-t)M^{I}_{T}$. By Lemma 28 we have a proof of $A=0$, and hence of $(L-d+1)A/2=0$. Composing with (36) we get a proof of $(L_k-c)(L-d+1)M^{I}_{T}\geq 0$. The same argument applied to the factor $(L-d+1)$ of this inequality gives $(L_k-c)(L_k-c+1)M^{I}_{T}\geq 0$. This is precisely $D_c(L_k)M^{I}_{T}\geq 0$. Adding up over all $T\subseteq I$ with $|T|\equiv b \pmod 2$ we get

\[
D_c(L_k)\sum_{\substack{T\subseteq I\\ |T|\equiv b}}M^{I}_{T}\geq 0. \tag{37}
\]

Now note that for each $T\subseteq I$ such that $|T|\equiv 1-b \pmod 2$, the inequality $-M^{I}_{T}\geq 0$ is the multiplicative encoding of one of the clauses in $\mathcal{S}(E)$. Thus, by Lemma 4, we get constant-size proofs of $-M^{I}_{T}\geq 0$, and hence of $M^{I}_{T}=0$, for every $T\subseteq I$ such that $|T|\equiv 1-b \pmod 2$. But then we also get proofs of $D_c(L_k)M^{I}_{T}=0$ for every such $T$. Adding up and composing with (37) we get

\[
D_c(L_k)\sum_{T\subseteq I}M^{I}_{T}\geq 0. \tag{38}
\]

From Lemma 27 we get $1-\sum_{T\subseteq I}M^{I}_{T}=0$, and hence $D_c(L_k)-D_c(L_k)\sum_{T\subseteq I}M^{I}_{T}\geq 0$, from which $D_c(L_k)\geq 0$ follows by addition with (38).

We proved $D_c(L_0)\geq 0$ for every $c\in R_0=\{0,\ldots,n\}$. Recall now that $\sum_{j=1}^{|J|}b_j$ is odd, say $2q+1$, and at most $n$. In particular $q+1$ belongs to $R_0$ and $L_0=q+1/2$. Thus we have a proof of $D_{q+1}(L_0)\geq 0$ where $D_{q+1}(L_0)=-(1/2)(1/2)=-1/4$. Multiplying by $4$ we get the contradiction $-1\geq 0$. ∎
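The two ends of the reverse induction can be traced on a concrete system. The sketch below uses a hypothetical unsatisfiable instance with $J$ equal to all of its rows (indexed from $0$), computes $L_{|J|}$ and $L_0$ with exact rationals, and evaluates $D_{q+1}(L_0)$. It illustrates the arithmetic only and does not construct the LS refutation.

```python
from fractions import Fraction

def linear_form(A, b, J, k):
    """Coefficients and constant term of L_k for the rows listed in J (in order):
    half of (the first k rows of J as linear forms, plus the remaining right-hand sides)."""
    n = len(A[0])
    coeffs = [Fraction(sum(A[j][i] for j in J[:k]), 2) for i in range(n)]
    const = Fraction(sum(b[j] for j in J[k:]), 2)
    return coeffs, const

def D(c, value):
    """D_c evaluated at a number: (value - c)(value - c + 1)."""
    return (value - c) * (value - c + 1)

# Hypothetical instance: the three equations sum to 0 = 1 over Z_2.
A = [[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 1, 1]]
b = [1, 0, 0]
J = [0, 1, 2]

top_coeffs, top_const = linear_form(A, b, J, len(J))   # base case L_{|J|}
bottom_coeffs, bottom_const = linear_form(A, b, J, 0)  # final case L_0
q = (sum(b[j] for j in J) - 1) // 2                    # sum of the b_j equals 2q + 1
final_value = D(q + 1, bottom_const)
```

Here $L_{|J|}=X_1+X_2+X_3+X_4$ has integer coefficients, while $L_0$ is the constant $1/2$, so the last inequality reads $-1/4\geq 0$.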

8 Applications to k-coloring

We illustrate the power of the general method of reductions for CSPs by applying it to the graph $k$-coloring problem for $k\geq 3$. This will allow us to rederive one of the results in [40], as well as answer one of their open problems.

8.1 Blackbox application to k-coloring

An undirected graph $G$ is $k$-colorable if and only if it has a homomorphism into the $k$-clique graph $\mathbb{K}_k=([k],[k]^{2}\setminus\{(b,b):b\in[k]\})$. Thus the $k$-coloring problem is a special case of the CSP of the template $\mathbb{K}_k$, which we abbreviate by $k$-COLOR. We say that it is a special case and not exactly the same problem because the inputs to $k$-COLOR need not be undirected graphs; in full generality they are directed graphs that allow loops. Note, however, that a directed graph that has loops would never have a homomorphism into the template, and that a loopless directed graph has a homomorphism into the template if and only if the underlying undirected graph that ignores the directions of the edges is $k$-colorable. Thus, for all practical purposes, the two problems are the same, and proof complexity lower bounds for one version of the problem will give proof complexity lower bounds for the other. We discuss this in due time; for now we focus on the proof complexity of the CSP with template $\mathbb{K}_k$.

It is well-known that $\mathbb{K}_k$ is a template of unbounded width for each $k\geq 3$ (see e.g. [26]). As a consequence of our main result we get the following:

Corollary 5.

For every integer $k\geq 3$, there exist families $(G_n)_{n\geq 1}$ of unsatisfiable instances of $k$-COLOR, where $G_n$ has $\Theta(n)$ vertices and $\Theta(n)$ edges, such that for every positive integer $d$ there exists a positive $\epsilon$ such that, for any local encoding scheme $E$ of the appropriate type, and any sufficiently large $n$, the following hold:

  1. every resolution refutation of $E(G_n)$ has width at least $\epsilon n$ and size at least $2^{\epsilon n}$,

  2. every Frege refutation of $E(G_n)$ of depth $d$ has size at least $2^{n^{\epsilon}}$,

  3. every PC refutation over the reals of $E(G_n)$ has degree at least $\epsilon n$ and size at least $2^{\epsilon n}$,

  4. every SOS refutation of $E(G_n)$ has degree at least $\epsilon n$.

Let us note that the size lower bound claim for Polynomial Calculus follows from its degree lower bound claim in conjunction with the size-degree tradeoff [33]. Also, the claim for resolution follows from the claim for Polynomial Calculus and the fact that the latter efficiently simulates resolution (cf. Lemma 9 in Section 3); it does not follow directly from our main dichotomy theorem.

Next we consider the specific encoding scheme used in [40] and the issue of directed vs. undirected graphs. Let $G$ be a directed graph, with vertex-set $V$ and edge-set $E\subseteq V^{2}$, let $k$ be a positive integer, and consider the following system of polynomial equations:

  1. $\sum_{b\in[k]}X(u,b)-1=0$ for each $u\in V$,

  2. $X(u,b)X(u,c)=0$ for each $u\in V$ and $b,c\in[k]$ with $b\neq c$,

  3. $X(u,b)X(v,b)=0$ for each $u,v\in V$ with $(u,v)\in E$ and $b\in[k]$.

It is easy to see that this is a local encoding scheme for $k$-COLOR in the sense of Section 2.6. Thus, Corollary 5 applies to it and we get a family of instances $(G_n)_{n\geq 1}$ that are hard for Polynomial Calculus in the indicated encoding scheme. Note that, since the instances are hard, they must be loopless graphs. Indeed, if $(u,u)$ is a loop in $G_n$, then $X(u,b)X(u,b)=0$ is an equation in the encoding of the instance $G_n$ for all $b\in[k]$. These equations, together with the axioms $X(u,b)^{2}-X(u,b)=0$ and the equation $\sum_{b\in[k]}X(u,b)-1=0$, would give a PC derivation of $1=0$ in degree $2$ and constant size. Thus the instances in the family are loopless graphs. We may also assume that they are undirected graphs for the simple reason that the equations $X(u,b)X(v,b)=0$ and $X(v,b)X(u,b)=0$ are identical (recall that all our variables commute by assumption). It follows that Corollary 5 has the real-field case of Theorem 1.1 from [40] as a special case, except for the fact that, unlike Theorem 1.1 from [40], Corollary 5 does not state that the family of graphs is explicit. In the next section we show that we can also get an explicit family of graphs with the same properties.
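The three families of equations can be evaluated directly on $\{0,1\}$-assignments to see that their Boolean solutions are exactly the proper $k$-colorings, and that a loop makes the system unsatisfiable. The following brute-force sketch is illustrative only; the function names are hypothetical and the search is exponential, so it is meant for very small graphs.

```python
from itertools import product

def coloring_equations_hold(vertices, edges, k, X):
    """Check the three families of equations on a 0/1 assignment X[(u, b)]."""
    colors = range(1, k + 1)
    for u in vertices:
        if sum(X[(u, b)] for b in colors) - 1 != 0:          # family 1
            return False
        for b in colors:
            for c in colors:
                if b != c and X[(u, b)] * X[(u, c)] != 0:    # family 2
                    return False
    for (u, v) in edges:
        for b in colors:
            if X[(u, b)] * X[(v, b)] != 0:                   # family 3
                return False
    return True

def has_boolean_solution(vertices, edges, k):
    """Search all 0/1 assignments to the variables X(u, b) for a solution."""
    keys = [(u, b) for u in vertices for b in range(1, k + 1)]
    for bits in product((0, 1), repeat=len(keys)):
        if coloring_equations_hold(vertices, edges, k, dict(zip(keys, bits))):
            return True
    return False

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k4 = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
single_loop = ([0], [(0, 0)])
```

The triangle is 3-colorable, while the 4-clique and the single loop are not, which matches the behaviour of the equation system on these inputs.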

8.2 Opening the box

In the rest of this section we open the box of the method that underlies Corollary 5. This will allow us to re-derive Theorem 1.1 from [40] for all fields, and not just for the real-field as is stated in Corollary 5. Moreover, it will suggest a way to apply the method to any other problem that is NP-complete via gadget reductions.

Since $\mathbb{K}_k$ for $k\geq 3$ is a template of unbounded width, Theorem 11 applies to it. It is not difficult to see that $\mathbb{K}_k$ is a core, hence by Theorem 11 there exists a non-trivial finite Abelian group $G$ such that $\mathbb{B}(G,3)$ is pp-interpretable in $\mathbb{K}_k^{+}$, where $\mathbb{K}_k^{+}$ is the expansion of $\mathbb{K}_k$ with all constants; i.e., the expansion with the relations $R_1=\{1\},\ldots,R_k=\{k\}$. Indeed, this is the case even for the group $G=\mathbb{Z}_2$. Concrete such pp-interpretations are well-known and also easy to construct. For the sake of completeness and by way of example, we propose one such pp-interpretation in two steps. First we pp-interpret $\mathbb{B}(\mathbb{Z}_2,3)$ in the template of $3$-SAT, and then we pp-interpret the template of $3$-SAT in $\mathbb{K}_k^{+}$. Since pp-interpretations compose, we get what we want.

Recall that $\mathbb{B}(\mathbb{Z}_2,3)$ has domain $\{0,1\}$ and two relations $E_0=\{(b_1,b_2,b_3)\in\{0,1\}^{3}:b_1+b_2+b_3=0 \bmod 2\}$ and $E_1=\{(b_1,b_2,b_3)\in\{0,1\}^{3}:b_1+b_2+b_3=1 \bmod 2\}$. The template of $3$-SAT also has domain $\{0,1\}$ and the eight arity-3 relations $R_{000}$, $R_{001}$, $R_{010}$, $R_{100}$, $R_{011}$, $R_{101}$, $R_{110}$, and $R_{111}$ defined by the eight possible signings of a 3-clause. A pp-interpretation of $\mathbb{B}(\mathbb{Z}_2,3)$ in the template of $3$-SAT is given by the following formulas:

  1. $\delta(x_1):=x_1=x_1$, i.e., true, so the domain is still $\{0,1\}$,

  2. $\epsilon(x_1,x_2):=x_1=x_2$,

  3. $E_0(x_1,x_2,x_3):=R_{001}(x_1,x_2,x_3)\wedge R_{010}(x_1,x_2,x_3)\wedge R_{100}(x_1,x_2,x_3)\wedge R_{111}(x_1,x_2,x_3)$,

  4. $E_1(x_1,x_2,x_3):=R_{000}(x_1,x_2,x_3)\wedge R_{011}(x_1,x_2,x_3)\wedge R_{101}(x_1,x_2,x_3)\wedge R_{110}(x_1,x_2,x_3)$.
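These four formulas can be checked mechanically. The sketch below assumes one particular reading of the signings, namely that $R_{abc}$ consists of all triples in $\{0,1\}^{3}$ except $(a,b,c)$; the text does not fix this convention explicitly, so it is an assumption of the example, as are the Python names.

```python
from itertools import product

ALL_TRIPLES = set(product((0, 1), repeat=3))

def R(a, b, c):
    """Relation R_abc, read here (by assumption) as all triples except (a, b, c)."""
    return {t for t in ALL_TRIPLES if t != (a, b, c)}

# The conjunctions from items 3 and 4, computed as intersections of relations.
E0_defined = R(0, 0, 1) & R(0, 1, 0) & R(1, 0, 0) & R(1, 1, 1)
E1_defined = R(0, 0, 0) & R(0, 1, 1) & R(1, 0, 1) & R(1, 1, 0)

# The parity relations of B(Z_2, 3).
E0_expected = {t for t in ALL_TRIPLES if sum(t) % 2 == 0}
E1_expected = {t for t in ALL_TRIPLES if sum(t) % 2 == 1}
```

Under this reading each conjunct removes one triple of the wrong parity, so four conjuncts leave exactly the four triples of the right parity.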

Next we give the standard pp-interpretation of the template of $3$-SAT in $\mathbb{K}_k^{+}$ when $k\geq 3$. We use the first two colors $1$ and $2$ to represent $0$ and $1$, respectively:

  1. $\delta(x_1):=\exists y_3\cdots\exists y_k\big(\bigwedge_{b=3}^{k}R_b(y_b)\wedge E(x_1,y_b)\big)$, i.e., the domain is $\{1,2\}$,

  2. $\epsilon(x_1,x_2):=x_1=x_2$,

  3. the relation $R_{abc}$ is given by

\[
\begin{aligned}
R_{abc}(x_1,x_2,x_3):={}&\exists x'_1\exists x'_2\exists x'_3\Big(\bigwedge_{i=1}^{3}\big(\delta(x_i)\wedge\delta(x'_i)\wedge E(x_i,x'_i)\big)\wedge\exists y_1\cdots\exists y_6\Big(\bigwedge_{i=1}^{6}\exists z_4\cdots\exists z_k\big(\bigwedge_{b=4}^{k}R_b(z_b)\wedge E(y_i,z_b)\big)\\
&\wedge E(y_1,y_3)\wedge E(y_2,y_3)\wedge E(y_1,y_2)\wedge E(y_3,y_4)\wedge E(y_4,y_6)\wedge E(y_5,y_6)\wedge E(y_4,y_5)\\
&\wedge R_2(y_6)\wedge E(y_1,x^{(a)}_1)\wedge E(y_2,x^{(b)}_2)\wedge E(y_5,x^{(c)}_3)\Big)\Big),
\end{aligned}
\]

where $x^{(d)}_i$ is shorthand notation for $x_i$ if $d=0$, and $x'_i$ if $d=1$. Checking that these interpretations are correct is a straightforward exercise. Note that, as written, the formulas for $R_{abc}$ are not quite pp-formulas, but they are easily converted into pp-formulas by standard rewriting into prenex normal form. At this point we can apply Corollary 3 of Theorem 5 in conjunction with Theorem 12 to obtain statement 2 in Corollary 5; Theorem 6 in conjunction with Theorem 15 to obtain statement 3; and Theorem 7 in conjunction with Theorem 17 to obtain statement 4 in Corollary 5.
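The correctness check for the formula defining $R_{abc}$ can be delegated to a brute-force search over colorings of the gadget. The sketch below rests on two assumptions that are not spelled out in the text: $R_{abc}$ is read as all triples in $\{0,1\}^{3}$ except $(a,b,c)$, and $E$ is the edge relation of $\mathbb{K}_k$, i.e., disequality of colors. All function names are hypothetical.

```python
from itertools import product

def gadget_satisfiable(l1, l2, l3, k):
    """Can y1..y6 be chosen so that all conjuncts of the gadget hold,
    given the colors l1, l2, l3 of the three literal vertices?"""
    # Adjacency to the constants 4..k restricts each y_i to the colors 1, 2, 3.
    allowed = [y for y in range(1, k + 1) if all(y != b for b in range(4, k + 1))]
    for y1, y2, y3, y4, y5, y6 in product(allowed, repeat=6):
        if (y1 != y3 and y2 != y3 and y1 != y2 and y3 != y4 and y4 != y6
                and y5 != y6 and y4 != y5 and y6 == 2
                and y1 != l1 and y2 != l2 and y5 != l3):
            return True
    return False

def interpreted_relation(a, b, c, k):
    """Triples in {0,1}^3 defined by the formula for R_abc (color 1 is 0, color 2 is 1)."""
    result = set()
    for x in product((1, 2), repeat=3):
        # The primed copy x'_i lies in {1, 2} and is adjacent to x_i, so it is the other color.
        primed = tuple(3 - xi for xi in x)
        lits = [primed[i] if s == 1 else x[i] for i, s in enumerate((a, b, c))]
        if gadget_satisfiable(lits[0], lits[1], lits[2], k):
            result.add(tuple(xi - 1 for xi in x))
    return result

def gadget_is_correct(k):
    """Compare the interpreted relation with the assumed reading of R_abc for all signings."""
    triples = set(product((0, 1), repeat=3))
    return all(interpreted_relation(a, b, c, k) == triples - {(a, b, c)}
               for a, b, c in triples)
```

The gadget is the usual pair of triangles acting as an OR: it is colorable unless all three literal vertices receive color $1$, which is the single excluded triple.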

Our next goal is to extend the PC lower bound for $k$-COLOR to all fields. Before we do so, let us note that exactly the same strategy as in the previous paragraph is not enough. The reason is that $3$LIN($\mathbb{Z}_2$) is easy for Polynomial Calculus over fields of characteristic two. Surely we could start with an instance of $3$LIN($\mathbb{Z}_3$), which is going to be hard for fields of characteristic two, but the result is again not going to be hard for all fields simultaneously as it will fail to be hard for fields of characteristic three. The solution is to start with a problem that has instances that are hard for Polynomial Calculus for all fields simultaneously. Luckily, 3-SAT is such a case:

Theorem 19 (see Theorem 3.13 in [3]).

There exists a positive real $\delta$ and an explicit family $(G_n)_{n\geq 1}$ of unsatisfiable instances of 3-SAT, where $G_n$ has $\Theta(n)$ variables and $\Theta(n)$ clauses, such that, for every field $F$ and every sufficiently large $n$, every PC refutation over $F$ of $G_n$ with respect to the $\mathrm{EQ}$ encoding scheme has degree at least $\delta n$.

Let us note that in order to get Theorem 19 from the exact statement of Theorem 3.13 in [3] one needs explicit families of 3-regular unique-neighbor expanders. Such families were described in [4].

With the lower bound of Theorem 19 in place we can get the version of the PC lower bound of Corollary 5 for all fields: the corresponding explicit instances of $k$-COLOR are obtained by applying the conjunction of Theorem 19 and Theorem 6 on the already noted fact that the template of 3-SAT pp-interprets in $\mathbb{K}^{+}_{k}$. This gives a new proof of Theorem 1.1 from [40].

Let us point out the main differences and similarities between the original proof from [40] and our new proof. At a high level, those proofs are very similar: both are gadget reductions that convert hard CNF formulas into hard instances of $k$-COLOR. In our proof, the gadgets are based on the way the template of 3-SAT is constructed from the template of $k$-COLOR by the addition of constants followed by the pp-interpretation (as presented in Section 4). Hence, the starting hard formulas can be any family of 3-CNF formulas that are hard for Polynomial Calculus. The proof from [40] is also a gadget reduction, but in their case the reduction is specifically tailored to a concrete family of CNF formulas that encode a sparse version of the functional pigeonhole principle. Besides the construction of the special-purpose gadgets, the bulk of their proof is to check that the conversion preserves the hardness for PC. In our proof both these parts are handled automatically by our general results.

We close by noting that, for CSPs, this method is completely general. Take any template $\mathbb{B}$ that is not known to be solvable in polynomial time, i.e., any template that is known to be NP-complete. By the Algebraic Dichotomy Theorem for CSP, any finite structure $\mathbb{C}$ pp-interprets in the core of $\mathbb{B}$ with added constants (see [11, 20, 50]). In particular, by taking $\mathbb{C}$ to be the template of 3-SAT and applying Theorems 19 and 6 we get explicit families of instances of CSP($\mathbb{B}$) that are hard for Polynomial Calculus over all fields. The same applies to all proof systems for which we can prove closure under reductions: explicit lower bounds for any CSP imply explicit lower bounds for all NP-complete CSPs. Since explicit lower bounds for $k$LIN($\mathbb{Z}_2$) are known for Sums-of-Squares [28, 45], this answers the question in Open Problem 5.3 in [40] that asks for explicit hard 3-coloring instances for SOS.

9 Concluding remarks

Theorems 5, 6 and 7 imply that for the proof systems under consideration the class of constraint languages admitting efficient refutations can be characterised algebraically. For most of those proof systems such a characterisation follows from the fact that efficient proofs of unsatisfiability exist exactly for languages of bounded width. However, by Theorem 18 the class of constraint languages admitting efficient refutations in Lovász-Schrijver, and consequently also the class of constraint languages admitting efficient Frege refutations, exceeds bounded width. At the same time both of those classes are shown to admit algebraic characterisations. Providing such characterisations is a natural open problem that arises from our work. In particular, with the Algebraic CSP Dichotomy Conjecture recently confirmed [20, 50], it would be interesting to verify or refute the tempting conjecture that the class of languages admitting polynomial size Frege (or Extended Frege) refutations coincides with the class of all polynomial time solvable constraint languages.

Other proof systems that are shown to be closed under reducibilities and can surpass bounded width are the Polynomial Calculus proof systems over fields of prime characteristic. Finding algebraic characterisations for the classes of constraint languages admitting efficient unsatisfiability proofs in each of those proof systems is another question suggested by our work. Importantly, Polynomial Calculus over a field of non-zero characteristic $p$ has efficient refutations for systems of linear equations over $\mathbb{Z}_p$ and does not have efficient refutations for systems of linear equations over $\mathbb{Z}_m$ if $p$ does not divide $m$ (cf. Theorem 16). Hence, for two fields of distinct prime characteristics, such characterisations will necessarily be different.
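
The dependence on the characteristic can be seen already at the level of plain linear algebra. The following is a minimal sketch, assuming Python 3.8 or later, of Gaussian elimination over $\mathbb{Z}_p$ that returns a linear combination of the input equations deriving $0 = c$ with $c \neq 0$ whenever the system is unsatisfiable. The function name `refute_mod_p` and the input encoding are illustrative choices, not notation from the paper, and the sketch does not construct a Polynomial Calculus refutation; it only shows that the same system can be refutable modulo one prime and satisfiable modulo another.

```python
def refute_mod_p(equations, p):
    """Try to refute a linear system over Z_p, for a prime p.

    equations: list of (coeffs, rhs), read as sum(coeffs[j] * x_j) = rhs (mod p).
    Returns multipliers m such that sum(m[i] * equation_i) is the equation
    0 = c with c != 0 (mod p), or None if the system is satisfiable.
    """
    n_eq = len(equations)
    n_var = len(equations[0][0]) if equations else 0
    # Row layout: [coefficients..., rhs, multipliers...]. The multiplier part
    # records how each row was obtained from the original equations.
    rows = []
    for i, (coeffs, rhs) in enumerate(equations):
        ident = [1 if k == i else 0 for k in range(n_eq)]
        rows.append([c % p for c in coeffs] + [rhs % p] + ident)

    pivot_row = 0
    for col in range(n_var):
        # Look for a row at or below pivot_row with a non-zero entry here.
        hit = next((r for r in range(pivot_row, n_eq) if rows[r][col]), None)
        if hit is None:
            continue
        rows[pivot_row], rows[hit] = rows[hit], rows[pivot_row]
        inv = pow(rows[pivot_row][col], -1, p)  # exists because p is prime
        rows[pivot_row] = [(v * inv) % p for v in rows[pivot_row]]
        for r in range(n_eq):
            if r != pivot_row and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p
                           for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1

    # A row with all-zero coefficients and non-zero right-hand side is 0 = c.
    for row in rows:
        if not any(row[:n_var]) and row[n_var]:
            return row[n_var + 1:]
    return None


# x + y = 1, y + z = 1, x + z = 1
system = [([1, 1, 0], 1), ([0, 1, 1], 1), ([1, 0, 1], 1)]
print(refute_mod_p(system, 2))  # [1, 1, 1]: adding all three gives 0 = 1
print(refute_mod_p(system, 3))  # None: x = y = z = 2 is a solution mod 3
```

The returned multipliers play the role of a certificate: checking them requires only summing the scaled equations modulo $p$.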

Both questions raised so far could lead to the discovery of interesting new families of algebras, as has happened before in the development of the algebraic approach to CSPs (cf. the class of algebras with few subpowers [32]).

A related direction suggested by our work is whether the proof complexity of approximating MAX CSPs is also preserved by reductions. On the one hand, it is known that most of the classical CSP constructions preserve almost satisfiability; e.g., if $\mathbb{B}'$ is pp-definable without equality in $\mathbb{B}$ and $\mathbb{A}$ is an instance of MAX CSP($\mathbb{B}'$) that is almost satisfiable, then its standard transformation into an instance $\mathbb{A}'$ of MAX CSP($\mathbb{B}$) is also almost satisfiable. The question we raise is the following: for which proof systems is it also the case that if there are efficient proofs of the fact that $\mathbb{A}'$ is far from satisfiable, then there are also efficient proofs of the fact that $\mathbb{A}$ is far from satisfiable? Depending on how the terms “almost satisfiable” and “far from satisfiable” are quantified, a positive answer to such questions could lead to an algebraic approach to the proof complexity of approximating MAX CSPs and the Unique Games Conjecture (UGC).
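
To make the two terms concrete, one common way to quantify them, stated here only as an assumption since the text deliberately leaves the quantification open, is through the largest fraction of constraints that a single assignment can satisfy: an instance is almost satisfiable if this value is at least $1 - \varepsilon$, and far from satisfiable if it is at most $1 - \delta$. The sketch below computes that value by brute force for tiny instances; the function name `max_csp_value` and the encoding of constraints as (scope, relation) pairs are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product


def max_csp_value(variables, domain, constraints):
    """Largest fraction of constraints satisfied by a single assignment.

    constraints: list of (scope, relation), where scope is a tuple of
    variables and relation is a set of allowed tuples of domain values.
    Exhaustive search over all assignments, so only suitable for tiny inputs.
    """
    best = 0
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        satisfied = sum(
            tuple(assignment[v] for v in scope) in relation
            for scope, relation in constraints
        )
        best = max(best, satisfied)
    return Fraction(best, len(constraints))


# A triangle with a disequality constraint on every edge.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
neq2 = {(0, 1), (1, 0)}
triangle = [(edge, neq2) for edge in edges]
print(max_csp_value(["a", "b", "c"], [0, 1], triangle))  # 2/3
```

With two colours the triangle has value 2/3, so under this quantification it is far from satisfiable for any $\delta \le 1/3$, whereas with three colours every constraint can be satisfied.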


Acknowledgments. We are grateful to Massimo Lauria for his help in reconstructing the proofs in Section 2.4, and for several other insightful comments during the development of this work. We are also grateful to Massimo Lauria and Jakob Nordström for useful discussions on the applicability of our results to the 3-coloring problem and the connection to their results in [40]. Both authors were partially funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement ERC-2014-CoG 648276 (AUTAR). The first author was partially funded by MINECO through TIN2013-48031-C4-1-P (TASSAT2). Part of this work was done while the authors were in residence at the Simons Institute for the Theory of Computing.

References

  • [1] M. Ajtai. The complexity of the pigeonhole principle. In 29th Annual IEEE Symposium on Foundations of Computer Science, pages 346–355, 1988.
  • [2] M. Alekhnovich, E. Ben-Sasson, A. Razborov, and A. Wigderson. Space complexity in propositional calculus. SIAM Journal on Computing, 31(4):1184–1211, 2002. A preliminary version appeared in STOC’00.
  • [3] M. Alekhnovich and A. Razborov. Lower bounds for polynomial calculus: non-binomial case. Proceedings of the Steklov Institute of Mathematics, 242:18–35, 2003.
  • [4] N. Alon and M. R. Capalbo. Explicit unique-neighbor expanders. In Proceedings of the 43rd Symposium on Foundations of Computer Science, FOCS ’02, pages 73–, Washington, DC, USA, 2002. IEEE Computer Society.
  • [5] A. Atserias. A note on semi-algebraic proofs and Gaussian elimination over prime fields. CoRR, abs/1502.03974, 2015.
  • [6] A. Atserias, A. A. Bulatov, and A. Dawar. Affine systems of equations and counting infinitary logic. Theoretical Computer Science, 410(18):1666–1683, 2009. A preliminary version appeared in ICALP 2007.
  • [7] A. Atserias and V. Dalmau. A combinatorial characterization of resolution width. J. Comput. Syst. Sci., 74(3):323–334, May 2008. A preliminary version appeared in CCC 2003.
  • [8] A. Atserias, Ph. G. Kolaitis, and M. Vardi. Constraint propagation as a proof system. In 10th International Conference on Principles and Practice of Constraint Programming, volume 3258 of Lecture Notes in Computer Science, pages 77–91. Springer-Verlag, 2004.
  • [9] A. Atserias, M. Lauria, and J. Nordström. Narrow proofs may be maximally long. ACM Trans. Comput. Log., 17(3):19:1–19:30, 2016. A preliminary version appeared in CCC 2014.
  • [10] A. Atserias and J. Ochremiak. Proof complexity meets algebra. In 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, pages 110:1–110:14, 2017.
  • [11] L. Barto. The constraint satisfaction problem and universal algebra. The Bulletin of Symbolic Logic, 21(3):319–337, 2015.
  • [12] L. Barto and M. Kozik. Constraint satisfaction problems solvable by local consistency methods. J. ACM, 61(1):3:1–3:19, January 2014.
  • [13] L. Barto, A. A. Krokhin, and R. Willard. Polymorphisms, and how to use them. In The Constraint Satisfaction Problem, 2017.
  • [14] L. Barto, J. Opršal, and M. Pinsker. The wonderland of reflections. CoRR, abs/1510.04521, 2015.
  • [15] P. Beame, R. Impagliazzo, J. Krajíček, T. Pitassi, and P. Pudlák. Lower bounds on Hilbert’s Nullstellensatz and propositional proofs. Proceedings of the London Mathematical Society, 73(3):1–26, 1996.
  • [16] P. Beame, R. Impagliazzo, J. Krajíček, T. Pitassi, P. Pudlák, and A. Woods. Exponential lower bounds for the pigeonhole principle. In 24th Annual ACM Symposium on the Theory of Computing, pages 200–220, 1992.
  • [17] E. Ben-Sasson. Hard examples for bounded depth Frege. In 34th Annual ACM Symposium on the Theory of Computing, pages 563–572, 2002.
  • [18] C. Berkholz. The relation between Polynomial Calculus, Sherali-Adams, and Sum-of-Squares proofs. Electronic Colloquium on Computational Complexity (ECCC), 2017.
  • [19] A. Bulatov. Bounded relational width. Manuscript, 2009.
  • [20] A. Bulatov. A dichotomy theorem for nonuniform CSPs. In 58th Annual IEEE Symposium on Foundations of Computer Science, 2017. to appear.
  • [21] A. Bulatov, P. Jeavons, and A. Krokhin. Classifying the complexity of constraints using finite algebras. SIAM Journal on Computing, 34(3):720–742, 2005.
  • [22] S. R. Buss, D. Grigoriev, R. Impagliazzo, and T. Pitassi. Linear gaps between degrees for the polynomial calculus modulo distinct primes. J. Comput. Syst. Sci., 62(2):267–289, 2001.
  • [23] S. O. Chan. Approximation resistance from pairwise-independent subgroups. J. ACM, 63(3):27:1–27:32, August 2016.
  • [24] M. Clegg, J. Edmonds, and R. Impagliazzo. Using the Groebner basis algorithm to find proofs of unsatisfiability. In 27th Annual ACM Symposium on the Theory of Computing, 1995.
  • [25] A. Dawar and P. Wang. Definability of semidefinite programming and Lasserre lower bounds for CSPs. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, pages 1–12, 2017.
  • [26] T. Feder and M. Y. Vardi. The computational structure of monotone monadic SNP and constraint satisfaction: A study through Datalog and group theory. SIAM Journal on Computing, 28(1):57–104, 1998.
  • [27] M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, November 1995.
  • [28] D. Grigoriev. Linear lower bound on degrees of Positivstellensatz calculus proofs for the parity. Theor. Comput. Sci., 259(1-2):613–622, 2001.
  • [29] D. Grigoriev and E. A. Hirsch. Algebraic proof systems over formulas. Theoretical Computer Science, 303(1):83–102, 2003.
  • [30] D. Grigoriev, E. A. Hirsch, and D. V. Pasechnik. Complexity of semi-algebraic proofs. Moscow Mathematical Journal, 4(2):647–679, 2002.
  • [31] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43(4):439–561, 2006.
  • [32] P. Idziak, P. Marković, R. McKenzie, M. Valeriote, and R. Willard. Tractability and learnability arising from algebras with few subpowers. SIAM Journal on Computing, 39(7):3023–3037, 2010.
  • [33] R. Impagliazzo, P. Pudlák, and J. Sgall. Lower bounds for the polynomial calculus and the Groebner basis algorithm. Comput. Complexity, 8(2):127–144, 1999.
  • [34] S. Khot, G. Kindler, E. Mossel, and R. O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007.
  • [35] Ph. G. Kolaitis and M. Y. Vardi. A game-theoretic approach to constraint satisfaction. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, pages 175–181. AAAI Press, 2000.
  • [36] J. Krajíček. On the weak pigeonhole principle. Fundamenta Mathematicæ, 170(1–3):123–140, 2001.
  • [37] J. Krajíček, P. Pudlák, and A. Woods. Exponential lower bound to the size of bounded depth Frege proofs of the pigeon hole principle. Random Structures and Algorithms, 7(1):15–39, 1995.
  • [38] J. B. Lasserre. Global optimization with polynomials and the problems of moments. SIAM Journal on Optimization, 11:296–317, 2011.
  • [39] M. Laurent. A comparison of the Sherali-Adams, Lovász-Schrijver and Lasserre relaxations for 0-1 programming. Mathematics of Operations Research, 28:470–496, 2001.
  • [40] M. Lauria and J. Nordström. Graph colouring is hard for algorithms based on Hilbert’s Nullstellensatz and Gröbner bases. In 32nd Computational Complexity Conference, CCC 2017, pages 2:1–2:20, 2017.
  • [41] L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0-1 optimization. SIAM Journal on Optimization, 1(2):166–190, 1991.
  • [42] T. Pitassi. Algebraic propositional proof systems. In N. Immerman and Ph. G. Kolaitis, editors, Descriptive Complexity and Finite Models, volume 31 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 68–96. American Mathematical Society, 1997.
  • [43] P. Pudlák. On the complexity of the propositional calculus. In Sets and Proofs, Invited Papers from Logic Colloquium ’97, pages 197–218. Cambridge University Press, 1999.
  • [44] R. A. Reckhow. On the lengths of proofs in the propositional calculus. PhD thesis, Department of Computer Science, University of Toronto, 1976.
  • [45] G. Schoenebeck. Linear level Lasserre lower bounds for certain k-CSPs. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, pages 593–602, 2008.
  • [46] H. D. Sherali and W. P. Adams. A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM Journal on Discrete Mathematics, 3(3):411–430, 1990.
  • [47] J. Thapper and S. Živný. The limits of SDP relaxations for general-valued CSPs. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, pages 1–12, 2017.
  • [48] J. Thapper and S. Živný. The power of Sherali-Adams relaxations for general-valued CSPs. SIAM J. Comput., 46(4):1241–1279, 2017.
  • [49] M. Tulsiani. CSP gaps and reductions in the Lasserre hierarchy. In 41st Annual ACM Symposium on Theory of Computing (STOC), pages 303–312, 2009.
  • [50] D. Zhuk. The proof of CSP dichotomy conjecture. In 58th Annual IEEE Symposium on Foundations of Computer Science, 2017. to appear.