What are some open problems in Computer Science?

Most people know about P =?= NP as that is quite widely publicised. However, what are some other, less famous open problems in CS? I’m mostly interested in theoretical CS, things like algorithmics, computational complexity, intractability and formal language semantics, although open problems in other areas will do fine, too.

What are the implications of solving these problems? I’m not particularly interested in contrived problems, but ones with a real world applicability, like P =?= NP.

Almost everything of interest in Theoretical CS is bulging with open problems.


Some complexity class stuff.

P and NP are just two of dozens of major classes whose relations we know little about. (And there are hundreds of lesser classes that complexity people have defined in recent years in order to generate papers, since they can’t solve the big issues.) The main sequence is:

DLOG, NLOG, P, NP, PSPACE, PEXPTIME, NPEXPTIME, PEXPSPACE, …

(I have used the longer forms of some of the names for clarity: det. log space, non-det. log space, det. poly time, non-det. poly time, poly space [det. and non-det. the same due to Savitch’s Theorem], det. poly-to-an-exponential time, non-det of same, space of same.)

Each encloses the previous, but no proper inclusions are known between adjacent classes. In fact, you generally have to go 3-4 classes apart in the above list to get a proper inclusion. (P is a proper subset of PEXPTIME, and NLOG is a proper subset of PSPACE.)

While DLOG vs P may not seem practical, a lot of “side information” has interesting implications. E.g., if DLOG = P, then all poly time problems can be solved in parallel very efficiently with a reasonable amount of hardware.


Individual problems (that don’t obviously relate to completeness and complexity classes).

We really don’t have tight bounds for any everyday poly time problem, beyond the trivial bounds that follow from input size or from restricting the model of computation.

E.g., matrix multiplication: two n × n matrices have 2n[sup]2[/sup] entries, so n[sup]2[/sup] is the lower bound to within constant factors. The best known exponent for the upper bound is around 2.376.

(Note that there is a complexity class defined by log-reductions to matrix multiplication that occupies an important place relative to “Nick’s Class” (NC) so it does relate to complexity theory too.)
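To make the gap concrete, here’s a minimal Python sketch of Strassen’s algorithm, the first method to beat the naive n[sup]3[/sup] bound (its exponent is log[sub]2[/sub] 7 ≈ 2.807; the ~2.376 bound comes from far more elaborate constructions). It assumes n is a power of 2 and is meant to show the idea, not to be a tuned implementation:

[code]
# Strassen's algorithm: 7 recursive multiplications instead of 8,
# giving O(n^log2(7)) ~ O(n^2.807). Assumes n is a power of 2.

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split each matrix into four h x h quadrants.
    def quad(M, r, c):
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A,0,0), quad(A,0,h), quad(A,h,0), quad(A,h,h)
    B11, B12, B21, B22 = quad(B,0,0), quad(B,0,h), quad(B,h,0), quad(B,h,h)
    # Seven recursive products instead of the naive eight.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    # Reassemble the quadrants.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
[/code]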

Just go down the list of poly time problems. If the upper bound isn’t close to the obvious lower bound, then it’s an open problem.


In formal languages and automata, the big question is det. CSLs vs. non-det. CSLs. (Which is actually det. linear space vs. non-det. linear space, so it’s really a complexity theory question as well.)

The two other Big Open Problems that used to be mentioned in undergrad FL&AT texts have been solved. So that leaves some problems that are generally harder to describe. E.g., CFLs can be accepted by 3-head 2-way FAs, but how about 2-head ones? If they can, then CFL membership is decidable in quadratic time, vs. the matrix multiplication bound given above. We love DCFLs since they can be parsed in linear time, and this significantly shapes the design of computer languages. If general CFLs could be parsed far more efficiently, then computer languages might become “more interesting” in structure.
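For reference, the textbook algorithm for general CFL membership is CYK, which runs in cubic time. Here’s a minimal Python sketch; the grammar below is a made-up toy example, not anything from the thread:

[code]
# CYK: membership for a general CFG (in Chomsky normal form) in O(n^3).

def cyk(word, start, unit_rules, binary_rules):
    # unit_rules: list of (A, terminal); binary_rules: list of (A, B, C)
    n = len(word)
    if n == 0:
        return False  # assume nonempty input for simplicity
    # table[i][j] holds the nonterminals deriving word[i : i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for a, t in unit_rules:
            if t == ch:
                table[i][0].add(a)
    for length in range(2, n + 1):            # substring length
        for i in range(n - length + 1):       # start position
            for split in range(1, length):    # where to cut
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for a, b, c in binary_rules:
                    if b in left and c in right:
                        table[i][length - 1].add(a)
    return start in table[0][n - 1]

# Balanced parentheses, S -> SS | (S) | (), put into normal form:
units = [('L', '('), ('R', ')')]
binaries = [('S', 'S', 'S'), ('S', 'L', 'R'), ('S', 'L', 'P'), ('P', 'S', 'R')]
print(cyk("(())()", 'S', units, binaries))  # True
print(cyk("(()", 'S', units, binaries))     # False
[/code]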

I just wanted to chime in here to say that I have been working with computers since the early 1970s and the only acronym I recognized in either of the above posts was “CS”.

You might consider going to your local university and taking a “Theory of Computation” course. Many folks can lead long and profitable careers in computers without ever needing to know anything about complexity classes, so it’s not odd that you’ve never heard of this stuff if you haven’t been formally trained in CS (as opposed to Software Engineering, Programming, or some other discipline).

Nevertheless, it’s interesting stuff (well, to me, anyway). It’s basically the mildly mathematical study of exactly what we can do with computers; what sorts of problems can be solved, which ones (provably) can’t, and how long it will take to solve them based on the size of the input. Even the idea that there are problems that we can PROVE cannot be solved algorithmically is an interesting idea to many folks.

Anyway, the acronyms the above posts were using are names for complexity classes: they classify problems based on the amount of time it takes to solve them and the capabilities of the “computer” you’re using.

The classic one, mentioned in the OP, is P = NP. Let’s define polynomial time first. Polynomial time means that the amount of time it takes to find an answer for a problem is bounded by a polynomial (any polynomial, these classes are broad) in the number of bits of the input. That is, if it takes “n” bits to describe the input, and the running time is bounded by something like n[sup]2[/sup] or 5n[sup]3[/sup] + 2n, the algorithm is said to run in polynomial time. (I’m trying to avoid equations here, but a quick google will show this nicely.)

“P” is the class of all problems that can return an answer in polynomial time on a “standard” computing model. “NP” is the class of all problems that can return an answer in polynomial time on a “magic” computational model which is capable of always “guessing right” when it needs to make a decision.

There are several interesting problems (called “NP-complete”) which are known to be the hardest problems in the NP class (meaning that if you could solve any one of them in polynomial time, you could solve every problem in NP in polynomial time). The travelling salesman problem, optimal packing of arbitrary sized objects into arbitrary sized bins, and several circuit-board layout problems are all NP-complete.

At the moment, these problems are all known to be in the class NP. They are NOT known to be in the class P - that is, no polynomial time algorithm on a “real” computer is known. I’d wager that most computer science types believe that no such algorithms exist, i.e. that P is not equal to NP. But it’s remarkably hard to actually PROVE that – so much so that it’s the classic “unsolved problem” of computer science. It’s the classic “prove a negative” problem.
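To make the easy-to-check/hard-to-solve asymmetry concrete, here’s a small Python sketch using subset sum, one of the classic NP-complete problems (the particular numbers are just an invented example). Checking a proposed answer - a “certificate” - takes polynomial time, while the only known general approach is essentially to try all 2[sup]n[/sup] subsets:

[code]
# Subset sum: verifying a claimed solution is fast (polynomial time);
# finding one is brute force (exponential time).
from itertools import combinations

def verify(numbers, target, certificate):
    # Polynomial time: add up the claimed subset and compare.
    return sum(numbers[i] for i in certificate) == target

def solve(numbers, target):
    # Exponential time: try all 2^n subsets until one works.
    for r in range(len(numbers) + 1):
        for subset in combinations(range(len(numbers)), r):
            if verify(numbers, target, subset):
                return subset
    return None

nums = [3, 34, 4, 12, 5, 2]
print(solve(nums, 9))           # (2, 4), i.e. 4 + 5 = 9
print(verify(nums, 9, (2, 4)))  # True, checked in polynomial time
[/code]

The NP machine’s “magic guessing” is exactly the ability to jump straight to a certificate that verify() would accept.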

And to actually answer the OP’s question, rather than go off on a side rant:

Strong AI certainly qualifies as a set of open questions in CS:

[ul]
[li]Is it possible to build a human-class intelligence on a computer?[/li]
[li]What do we mean by that, anyway (shared with cognitive science and psych)?[/li]
[li]How can we measure it, or know if we’ve succeeded?[/li]
[/ul]

P, NP and all the other complexity classes are defined relative to computation over {0, 1}. Replace that with a different set–say {0, 1, 2} or N or R–and you have a new model of computation and a new set of complexity classes to go with it. And for every one of those, the question of whether P = NP is still open (as far as I know).

There are also quite a few open problems in combinatorics and the theory of recurrence relations, but I’m not quite sure that those are strictly CS.

And problems like those are solved quite often in mathematics, computer science, and other disciplines that rely on formal logic. The difficulty in deciding P = NP is a bit more subtle than that.

Would proving P = NP over { 0, 1, 2 } have any implications for P = NP over { 0, 1 }? Are any of these complexity classes over a different alphabet linked?

Here is a list of open problems in theoretical computer science. Lots of interesting stuff there.

There’s a book called “Complexity and Real Computation” by Blum et al. about computation over the reals and complex numbers. I haven’t read it, but it has been highly recommended to me.

Shouldn’t any problem over a finite alphabet be polynomially reducible into {0,1}?
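The reduction the question has in mind is the standard one: encode each symbol of a k-symbol alphabet as a fixed-width block of ceil(log[sub]2[/sub] k) bits, so strings grow by only a constant factor and the translation runs in linear time. A quick Python sketch (the ternary alphabet here is just an example):

[code]
# Encode each symbol of a finite alphabet as a fixed-width block of
# bits; the string grows by only a constant factor.
from math import ceil, log2

def to_binary(word, alphabet):
    width = max(1, ceil(log2(len(alphabet))))
    index = {sym: i for i, sym in enumerate(alphabet)}
    return ''.join(format(index[s], '0%db' % width) for s in word)

# Example: a string over {0, 1, 2} becomes a binary string twice as long.
print(to_binary("0211202", "012"))  # -> 00100101100010
[/code]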