Perhaps this becomes more clear by taking a look at how quantum algorithms work.
The most widespread model is the so-called ‘quantum circuit’ (though there are others, such as measurement-based quantum computation). This is essentially the quantum analog to a classical logical circuit. Its elements are wires and gates; each wire carries logical signals (e.g. 1 or 0 in the classical case), which are manipulated by the gates, performing logical operations on the signals.
In the classical case, typical gates are the NOT-gate, which acts on a single input and takes every signal to its negation (1 to 0 and 0 to 1), the AND-gate and the OR-gate, which implement the eponymous logical transformation on their (two) inputs; the classical logic gates are described in this wiki article. An important concept is that of a universal set of logic gates, meaning that with such a set, any conceivable logical transformation can be implemented, and thus, any kind of computation is possible. In the classical case, it turns out that you only need a single gate for universality, the NAND- (or alternatively NOR-) gate, which is just the AND-gate with its output reversed (i.e. it yields a 0 only if both inputs are 1).
The quantum circuit model now generalizes this to the quantum world, where the logical units are no longer classical bits, but quantum bits or qubits. A qubit is a two-level quantum system whose two levels are conventionally denoted |0> or |1> (the funny brackets meaning just ‘this is a quantum object’ for now). The most striking difference to the classical case now is that a qubit can not only be in either of the states |0> and |1>, but also in arbitrary linear superpositions a|0> + b|1>. Operationally, this means that there is, upon measurement, an |a|[sup]2[/sup] chance to find it in the state |0>, and a |b|[sup]2[/sup] chance to find it in the state |1> (which implies |a|[sup]2[/sup] + |b|[sup]2[/sup] = 1, since you’ll find it in some state with certainty).
One hitch I should mention, but won’t explain further is that contrary to the classical case, the operations in the quantum case must be implemented in a reversible fashion—this is just a feature of the evolution of quantum systems (in the classical case, if you only know the output of an AND-gate, you can’t reconstruct its input; in the quantum case, knowing the output—and the mechanism—of the gate always allows you to ‘undo’ its operation and recover the input).
So, a quantum circuit is again given by wires and gates; the wires carry the qubits, and the gates effect transformations upon them (which are mathematically unitary transformations). Because of the reversibility requirement, any quantum gate has as many outputs as there are inputs. A quantum circuit thus typically looks like a set of parallel ‘rails’, some of which enter certain boxes. Despite the reversibility requirement, it turns out that you can find a set of universal gates, and thus, compute anything you want to compute (although often, due to the probabilistic nature of quantum mechanics, your result will only be correct with a certain probability). This is not unique to quantum mechanics: reversible universal computing is also possible in the classical world.
Now we thus have a paradigm of computation, and can see what it can do. Any algorithm in this setting corresponds essentially to a set of gates and wires; the complexity of this algorithm can be measured by the number of transformations you need to implement on the input to produce the desired output. This complexity will typically depend on the size of the input, the instance of the problem. If you have some input of N bits, then processing that input might require N steps, or N[sup]2[/sup] steps, or log(N) steps, or, in particularly bad cases, e[sup]N[/sup] steps.
It turns out that factoring a number, in the classical case, is one of the ‘bad’ cases: no known classical algorithm can do better than using an exponential number of steps (well that’s not quite true, I think the fastest classical algorithm takes larger than polynomial, but smaller than an exponential number of steps). But it turns out that in the paradigm for quantum computation described above, you can create an algorithm, i.e. a quantum circuit, which takes significantly less time (polynomial in the input size, which is generally considered to be ‘efficient’).
So in this sense, we don’t need to know precisely why quantum computation is faster then classical: we know how to do universal computation in both cases, and we can simply compare how long each takes.
There’s a caveat to be attached to this, though: it’s merely the case that the fastest known classical algorithm for prime factoring takes exponentially long; in theory, it could be possible that there is some as yet undiscovered classical algorithm that might equal the quantum algorithm’s performance. This is generally considered unlikely for various indirect reasons having to do with a number of extremely deep problems in complexity theory, but a definite proof is not available at the moment.