CSE533 -- PCP Theorem & Hardness of Approximation: November 2005

Sunday, November 27, 2005

Homework back

Hi all -- just wanted to let you know that we hope to have graded homework returned and solutions posted by Wednesday.

Wednesday, November 23, 2005

Proof of the PCP Theorem

We concatenated the notes for Lectures 1 through 9 into one giant Proof of the PCP Theorem. Thanks very much to all the scribes involved. Note that although the we did the "easy" new Irit Dinur proof, the PDF file still runs to 64 pages!

Saturday, November 19, 2005

Questions / discussion about the homework

The comments section of this post is for discussion of the homework problems.

Friday, November 18, 2005

A hint for questions homework #6 and #7

If you need some hints for problems 6b), 6d), and 7c), you can look at this sample calculation in which I work through a similar problem in explicit detail.

PCPs, gaps, and hardness factors

I just wanted to go over one more time the connection between PCPs, gaps, and hardness factors. It took a while for me to "get" this myself when I learned it, so I think it bears repeating. Also I hope it will help with understanding questions #1 and #4 on the homework.

1. Let Phi be a class of predicates; e.g., Phi = 3-variable-linear-equations-mod-2, or Phi = Majorities-of-3-literals, or Phi = ORs-of-3-literals. Then the following statements are exactly equivalent:

(a) "Every problem in NP has a PCP with completeness c, soundness s, and a verifier using predicates from Phi."
(b) "3-Colorability has a PCP with completeness c, soundness s, and a verifier using predicates from Phi."
(c) "Gap-Max-Phi_{c, s} is NP-hard."
(d) "There is a polynomial time reduction which, given a graph G, spits out M many tests from Phi over a bunch of "variables", such that (i) if G is 3-colorable, then there is an assignment to the variables satisfying at least cM of the tests; (ii) if G is not 3-colorable, then every assignment to the variables satisfies fewer than sM of the tests."

(a) <=> (b) is just because 3-Colorability is NP-complete, and polynomial-time reductions can be "built into" PCP systems.

(c) <=> (d) is just because (d) is really the definition of (c).

(b) <=> (d) involves some thought about what the definitions mean. "Indices of proof positions" in (b) correspond to "variables" in (d); "proofs" in (b) correspond to "assignments" in (d); and "picking a random test and understand the probability that it passes" in (b) corresponds to "writing down all the tests and counting how many of them are satisfied" in (d).

------------------------------

2. Further, the statement (c) "Gap-Max-Phi_{c,s} is NP-hard" implies (but is *not* equivalent to) the statement
(e) "There is no poly-time (s/c)-approximation algorithm for the Max-Phi problem unless P = NP."

Proof of (c) => (e): Suppose (c) is true, and further suppose that there *is* a polynomial-time algorithm A which, when given as input M tests from Phi over variables, always outputs an assignment satisfying at least an s/c fraction of whatever the optimum assignment satisfies. We will show that the existence of this algorithm A means that 3-Colorability is solvable in polynomial time, and hence P = NP.

So why is 3-Colorability in polynomial time in this case? Here is the polynomial time algorithm B which solves 3-Colorability:

Given a graph G, run the reduction guaranteed by statement (c); this spits out a list of M many tests from Phi.
Run algorithm A on this list of tests, producing some assignment alpha to the variables.
Check if alpha satsifies at least sM of the tests. If so, output "YES, G is 3-colorable." If not, output "NO, G is not 3-colorable."

Since the reduction guaranteed by (c) is polynomial time, and A is polynomial time, the overall algorithm B is polynomial time. Why is it correct? If the input graph G is 3-colorable, then the reduction guaranteed by (c) spits out a list of M tests with the guarantee that there is an assignment satisfying at least cM of them. I.e., the optimal assignment gives at least cM. Thus by the guarantee of the supposed approximation algorithm A, it will return an assignment alpha satisfying at least (s/c) * (cM) = sM tests. Thus B will correctly say "YES". On the other hand, if G was not 3-colorable, then the reduction guaranteed by (c) will spit out a list where *every* assignment to the variables satisfies fewer than sM of the tests. In this case, no matter what assignment A outputs, there's no way it will satisfy as many as sM tests; hence B will correctly output "NO".

Thursday, November 17, 2005

Not all hardness of approximation needs the PCP Theorem

Almost all the best hardness-of-approximation results we know require at least the PCP Theorem. But not all do... Here are three examples:

1. Minimum Metric k-Center: In this problem you are given an undirected graph with distances satisfying the triangle inequality. You are also given an integer k. Your task is to pick k "centers" (vertices in the graph) with the goal of minimizing the maximum distance of any vertex to a center.

As early as 1979, Hsu & Nemhauser showed by a simple NP-completeness reduction that it's hard to tell the difference between instances with value 1 and instances with value 2 (even in graphs where all edges have distances 1 and 2). This was also shown by Plesník in 1980. This immediately implies there is no (2-eps)-approximation algorithm. On the other hand, there is also an easy 2-approximation, first given by Gonzalez in 1985. This gives a sharp hardness-of-approximation result, as noted in 1985 (seven years before the PCP Theorem!) by Hochbaum & Shmoys.

2. In 1993, right after the PCP Theorem had come out, Håstad, Philips, and Safra considered the problem of satisfiability of multilinear degree two equations over GF[p]. (We tangentially touched on his problem, with p = 2, in the construction of Assignment Testers.) In the paper the authors give an easy (1/p - 1/p^2)-approximation algorithm. They also use a cute argument based on epsilon-biased spaces (which we will see on Monday for SET-COVER; I think the argument can also be based on error-correcting codes) to show that it is NP-hard to give a (1/p + eps)-approximation for any eps > 0 (even polynomially small). The main reason why HPS are able to give this essentially-sharp result is that the quadratic equations don't have to be "local", as they do in PCP-type constructions; they many depend on many variables.

3. Since 3-colorability is NP-hard, we know that it is hard to distinguish graphs with chromatic number 3 from graphs with chromatic number 4. Khanna, Linial, and Safra improved on this in 1993, giving an intricate PCP-based proof that it is hard to distinguish chromatic number 3 from chromatic number 5. This is the strongest result we know about coloring 3-colorable graphs. However, it turns out that KLS's use of the PCP Theorem was unnecessary; in 2000, Khanna and Guruswami gave a new proof of the same result using nothing but a classical NP-completeness reduction from Independent-Set.

Monday, November 14, 2005

11/14 -- Feige-Kilian Confuse/Match games

If you have any questions about today's lecture, please ask.

You may also find it helpful to look at the original sources (although I must warn that I've a) sort of combined them, and b) changed some notation): the conference version of Feige-Kilian, and the journal version of Feige-Kilian.

You can also take a gander at Raz's paper on the Parallel Repetition Theorem, although as mentioned, it's somewhat painful.

Friday, November 11, 2005

Questions / discussion about the course

The comments section of this post is for questions, comments, and discussion about the lectures.

Monday, November 07, 2005

Hardness of approximation for counting problems

Someone just asked me about hardness of approximation results for counting problems. Good question. Off the top of my head, I would say the following...

For some counting problems there is a) a straightforward notion of a random input, and b) a polytime algorithm for checking the property. For example, consider #3SAT -- Given a formula, how many satisfying assignments does it have? Or, what fraction of all assignments from {0,1}^n are satisfying?

For such a problem you can efficiently estimate the answer to within any additive 1/poly(n) in polynomial time, by random sampling: Just pick a bunch of assignments at random, check if they are satisfying, and compute the empirical fraction. It will be additively very close, with high probability.

Note that this is not the "usual" sort of approximation-algorithm guarantee we are looking for; we have been looking at multiplicative guarantees. It's easy to see that for #3SAT no multiplicative factor guarantee is possible: For suppose A always returned an estimate p such that p/C <= true fraction of satisfying assignments <= Cp (where C is anything). Then A could be used to solve 3SAT: when the instance is unsatisfiable A must return 0 and when it is satisfiable it must return a number which is at least 2^{-n}/C. Hence we can distinguish.

Other types of counting problems don't have the above-mentioned two properties. A classic example is #PerfectMatchings -- given a bipartite graph, count the number of perfect matchings in it. (This is equivalent to the "Permanent" problem.) A famous result from four years ago by Jerrum, Sinclair, and Vigoda --
A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries
-- gives a PTAS for this problem. (I.e., it can efficiently return a multiplicative (1+eps)-factor estimate for any constant eps > 1/poly(n).)

Naturally, therefore, there is no hardness-of-approximation result for the #PerfectMatching problem!

In fact, I can't think of a counting problem with an interesting hardness of approximation result. Does anybody know one?

11/07 -- parallel repetition

Two quick notes about parallel repetition:

1. In the class I hurried over the claim of the following fact: If you have a 2p1R game whose equivalent bipartite CG satisfaction problem has the projection property, then the parallel-repeated version of this game/CG-problem also has the projection property.

To see this, suppose we have a bipartite CG satisfaction problem with the projection property, so each constraint c_{(u,v)} is actually a function, pi_{(u,v)} : Sigma -> Sigma. The "k-times repeated" bipartite CG problem looks like this: Nodes on the left in the new graph correspond to all k-tuples (u_{i1}, ..., u_{ik}) where the u's are left-vertices from the old graph; nodes on the right in the new graph correspond to all k-tuples (v_{i1}, ..., v_{ik}) where the v's are right-vertices from the old-graph.

There is an edge between new vertices U := (u_{i1}, ..., u_{ik}) and V := (v_{i1}, ..., v_{ik}) precisely if there was an edge between u_{ij} and v_{ij} in the old graph for each j = 1...k. (Hence if the old graph was left-regular, so is this graph.) In the new graph, the vertices are labelled from Sigma^k. The constraint on the new edge (U,V) looks at the label (a_1, ..., a_k) for U and the label (b_1, ..., b_k) for V, and accepts iff (a_j, b_j) satisfies the old (u_{ij}, v_{ij}) constraint for each j = 1...k.

In particular, every new edge constraint has the projection property, because if I tell you a label (a_1, ..., a_k) for U, there is a unique good label for V -- namely, (pi_1(a_1), ..., pi_k(a_k)), where I have written pi_j for pi_{(u_{ij}, v_{ij})}.

2. Feige has a pretty nice survey about parallel repetition:
http://www.wisdom.weizmann.ac.il/~feige/Approx/mip.ps
It's old (from 1995), but that's okay since essentially nothing new has been proven about parallel repetition since 1995!

11/07 lecture -- the Feige NA counterexample

There was a little confusion in class over Feige's "NA" game which shows that omega(NA^k) = omega(NA)^k doesn't strictly hold for parallel repetition of 2P1R games. I just thought I'd repost the example here in case people wanted to stare at it.

The (1 round) Noninteractive Agreement (NA) game:

There are two players, Anna and Venkat.
The verifier, Ryan, flips a random coin and sends it to Anna, and flips another random coin and sends it to Venkat.
Venkat and Anna both have to answer with a claim of the form "So-and-so got sent the coin such-and-such" (where So-and-so in {Anna, Venkat}, such-and-such in {0,1}; so, answer set is of size 4.)
The players win if a) they make the same claim, and b) the claim agrees with reality.

Note 1: If a player decides to claim about themself, then of course they should (and can) be consistent with reality, since they know their own coin.

Note 2: omega(NA) = 1/2. The fact that Anna and Venkat can succeed with probability at least 1/2 is obvious (say, Anna always claims about her own coin, and Venkat always claims "Anna got sent the coin 0"; then they win whenever Anna gets sent the coin 0 -- probability 1/2).

The argument that this is the best they can do is the following: Suppose each player always make a claim about the other's coin. Then of course their claims never agree with each other and they never succeed. So we can assume (without loss of generality) that, say, Anna, sometimes makes a claim about her own coin; (WOLOG, again) suppose Anna's strategy includes claiming "Anna got sent the coin 0" whenever she gets sent the coin 0. In this case, I claim that if Venkat gets sent the coin 0 then no matter what his strategy is, there is failure with probability 1/2; and, I also claim the same is true if Venkat gets sent the coin 1. If I can show both of these, then of course overall the players can succeed with probability at most 1/2.

So suppose Venkat gets sent the coin 0. If his strategy is to claim "Anna got sent the coin 0", then whenever it also holds that Anna gets sent 1 (probability 1/2), the players lose; this is because Venkat's claim will not agree with reality. On the other hand, if Venkat's strategy in this case is to claim anything else, then the players lose whenever it also holds that Anna gets sent 0 (probability 1/2) -- because we assume she claims "Anna got sent the coin 0" in this case, so the players' claims will disagree.

The exact same argument holds for the case that Venkat gets sent the coin 1; literally, just change the first sentence in the above paragraph.

----------------------------------------------------------

So we've shown that omega(NA) = 1/2. Let's now show that omega(NA^2) = 1/2.

The proof that omega(NA^2) is at most 1/2 is trivial: In fact, for any game G and for any k, omega(G^k) is at most omega(G). The reason is, if the players are going to win on the k-repeated version of the game, they certainly have to win in, say, the first coordinate. But the probability that they even win in the first coordinate is just omega(G).

To show that omega(NA^2) is at least 1/2, we have to give a strategy for Anna and Venkat.

Now the NA^2 game is a little confusing; let's recap what it is. In this game, Ryan flips two coins for Anna and two coins for Venkat. Let's call Anna's two coins her "round 1 coin" and her "round 2 coin", and similarly for Venkat. Now this terminology is kind of misleading, since all questions get asked in one shot; i.e., Anna gets asked sent her round 1 coin and her round 2 coin simultaneously, and the same for Venkat. Better terminology might be "coordinate 1 coin" and "coordinate 2 coin" or even "red coin" and "blue coin". But let me stick with "round 1 coin" and "round 2 coin".

Both players have to send back an answer of the following two-part format: "So-and-so's round 1 coin was such-and-such; AND whose-it's round 2 coin was thus-and-such." I.e., both players must make a claim about somebody's round 1 coin and also a claim about somebody's round 2 coin. The players win if their claims totally agree with each other and the claims totally agree with reality.

That's the game NA^2; here is the strategy: Both players always claim about Anna's round 1 coin and they both always claim about Venkat's round 2 coin. So at least their claims are guaranteed to be in agreement in terms of the So-and-sos/whose-its. Naturally, to be consistent with reality, the round 1 part of Anna's claim will correctly be about her round 1 coin, and the round 2 half of Venkat's claim will correctly be about his round 2 coin. So each player has half of a claim left to make. The idea is that the players will operate under the assumption that Anna's round 1 coin equals Venkat's round 2 coin. This lets us finish saying the strategies -- the round 2 half of Anna's claim will be "Venkat's round 2 coin equals [whatever her round 1 coin is]" (which she knows), and the round 1 part of Venkat's claim will be "Anna's round 1 coin equals [whatever his round 2 coin is]" (which he knows).

This properly defines Anna and Venkat's strategy, and indeed they succeed completely (i.e. their claims agree with each other and with reality 100%) whenever Anna's round 1 coin equals Venkat's round 2 coin. This happens with probability 1/2.

CSE533 -- PCP Theorem & Hardness of Approximation