What it means to raise something to an imaginary power:
Multiplication: What’s the key property of multiplication? Well, for some purposes, the key property of multiplication is that it distributes over addition; f(x + y + z + w + …) = f(x) + f(y) + f(z) + f(w) + … I’ve used parentheses here suggestively; you can read “f(x)” either as meaning “f times x” or as meaning “a function f, applied to an input value x”. This is intentional; for any operation f with this distributivity property, let us feel free to think of applying f to an input value as a kind of multiplication. Thus, we have a notion of multiplying an “operation” by an input “value”.
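As a tiny illustration of this property in Python (the function name here is my own choice), “multiply by 3” is one such distributive operation, and reading f(x) as “f times x” or as “f applied to x” gives the same answer either way:

```python
def f(x):
    # "multiply by 3" -- a distributive operation
    return 3 * x

# f distributes over addition: f(x + y + z) = f(x) + f(y) + f(z)
print(f(2 + 5 + 10))        # 51
print(f(2) + f(5) + f(10))  # 51
```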
More about multiplication: Alright, now what about an expression like f * g * v, where f and g are “operations” and v is a “value”? Well, one way of reading this expression is as f * (g * v); by our above considerations, this amounts to f applied to (g applied to v); i.e., f(g(v)). Another way of reading this expression is as (f * g) * v; i.e., (f * g) applied to v. Very well; let us set these two equal [(f * g)(v) = f(g(v))] and say this provides us with a definition of what it is to multiply two operations: f * g is the composite function which first applies g to an input, then applies f to the result of that.
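A quick sketch of this “multiplication as composition” idea in Python (the particular operations `double` and `flip_x` are just illustrative choices of mine):

```python
def compose(f, g):
    # f * g: the operation which first applies g, then applies f
    return lambda v: f(g(v))

def double(v):   # the operation "become twice as large"
    return (2 * v[0], 2 * v[1])

def flip_x(v):   # reflect across the vertical axis
    return (-v[0], v[1])

fg = compose(double, flip_x)
print(fg((3, 4)))  # (-6, 8): flipped first, then doubled
```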
Similarly, the equation 1 * v = v gives us a definition of what 1 is as an operation; it’s the identity function [since 1(v) = v for any input value v].
More about addition, then: What about an expression like (f + g) * v? Well, normally, we want multiplication to distribute on both sides; that is, we want (f + g) * v = f * v + g * v. Very well; in our operational terms, this means (f + g)(v) = f(v) + g(v). So, again, we can take this to provide us with a definition of what it is to add two operations; f + g is the function which, in parallel, applies f to some input and applies g to the same input, then adds the results of these together.
Similarly, the equation 0 * v = 0 gives us a definition of what 0 is as an operation; it’s the constantly zero function [since 0(v) = the value 0 for any input value v. Note that there are two different "0"s here: one is an operation, one is a value which is an input to that operation].
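Here is how the sum of two operations, and the operations 1 and 0, might be sketched in Python (vectors as pairs of numbers; all names are my own):

```python
def add_ops(f, g):
    # (f + g)(v) = f(v) + g(v), adding the result vectors componentwise
    return lambda v: (f(v)[0] + g(v)[0], f(v)[1] + g(v)[1])

def one(v):      # the operation 1: the identity function
    return v

def zero(v):     # the operation 0: constantly the zero *vector*
    return (0, 0)

h = add_ops(one, zero)   # 1 + 0 should act just like 1
print(h((7, -2)))        # (7, -2)
```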
Pause for an example (the complex numbers!): So far, I haven’t said what kinds of values and what kinds of operations I want to consider. Well, all the above works very, very generally, but in particular, what I’m interested in right now is the case where the values I’m looking at are 2d vectors. Some distributive operations on 2d vectors include rotation; e.g., consider the function f(x) = x rotated by 35 degrees (in the direction from <1, 0> to <0, 1>). So, for example, f(<1, 0>) = <cos(35 degrees), sin(35 degrees)>. In fact, rotation by any particular angle is a distributive operation. So operations like “Rotate 182 degrees” can be thought of as multiplications, just as we’re already used to thinking of things like "Become twice as large" as a multiplication.
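The claim that rotation is distributive can be checked numerically: rotating a sum of vectors gives the same answer as summing the rotated vectors (a sketch; the `rotate` helper is my own):

```python
import math

def rotate(degrees):
    # returns the operation "rotate by this many degrees"
    th = math.radians(degrees)
    c, s = math.cos(th), math.sin(th)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

f = rotate(35)
u, w = (1.0, 0.0), (2.0, 3.0)
s35 = (u[0] + w[0], u[1] + w[1])

print(f(s35))                                 # f(u + w)
print(f(u)[0] + f(w)[0], f(u)[1] + f(w)[1])   # f(u) + f(w): same vector
```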
One extremely common and important rotation function is the one defined by n(x) = x rotated by 180 degrees. [So that, e.g., n(<2, 4>) = <-2, -4>]. Note that n(v) + v = 0 for any vector v. In other words, (n + 1) * v = 0 * v for all v; in other words, n + 1 = 0. In this sense, n = -1, and indeed, we usually do use the name “-1” for this operator, even though we could just as well call it “rotate by 180 degrees” or “half turn” or such things.
Another extremely common and important rotation function is the one defined by i(x) = x rotated by 90 degrees. [So that, e.g., i(<1, 0>) = <0, 1> and i(<0, 1>) = <-1, 0>]. Note that i(i(v)) = n(v) for any vector v [since rotating something by 90 degrees and then rotating it by 90 degrees again is as good as rotating it by 180 degrees]. In other words, i * i * v = n * v; in other words, i * i = n = -1. In this sense, i is a square root of -1, and indeed, often, is introduced and conceptualized solely in these terms, even though we could just as well call it “rotate by 90 degrees” or “quarter turn” or such things.
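These two rotations, and the identities n(v) + v = 0 and i * i = n, can be checked numerically (a sketch; `rotate` and the variable names are my own choices):

```python
import math

def rotate(degrees):
    th = math.radians(degrees)
    c, s = math.cos(th), math.sin(th)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

n = rotate(180)   # the operation the text calls -1
i = rotate(90)    # the operation the text calls i

v = (2.0, 4.0)
print(n(v))       # approximately (-2, -4), so n(v) + v = 0
print(i(i(v)))    # also approximately (-2, -4): i * i = n = -1
```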
Continuing on to exponentiation: Just as the key property of multiplication was that it turned addition of inputs into addition of outputs, the key property of exponentiation is that it turns addition of inputs into multiplication of outputs. That is, b^(f + g + h + …) = b^f * b^g * b^h * … As before, let us feel free to think of any function with this sort of property as exponentiation with some base. For now, let us restrict the inputs to such a function to be real numbers >= 0.
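With ordinary non-negative real numbers, this is just the familiar law b^(f + g + h) = b^f * b^g * b^h:

```python
b = 2.0
f, g, h = 3.0, 4.0, 0.5
print(b ** (f + g + h))          # one growth over the total time
print(b ** f * b ** g * b ** h)  # the separate growths multiplied: same number
```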
In particular, consider the function r(t) = the “rotate by t many radians” operation. This function sends any input angle t (in radians) to an output operation on 2d vectors. What’s more, it satisfies the all-important property to be thought of as exponentiation: r(t) * r(s) = r(t + s), since r(t) * r(s) * v = (v rotated by s radians) rotated by t radians = v rotated by (t + s) radians = r(t + s) * v. So instead of writing r(t), we’ll feel free to write r^t instead.
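Numerically, with rotations as operations on pairs (a sketch; `r` is my stand-in for the text’s r(t)):

```python
import math

def r(t):
    # "rotate by t radians", to be thought of as r^t
    c, s = math.cos(t), math.sin(t)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

v = (1.0, 2.0)
print(r(0.4)(r(0.9)(v)))   # r(0.4) * r(0.9) applied to v
print(r(1.3)(v))           # r(0.4 + 0.9) applied to v: same vector
```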
Natural logarithms: By ln(x), I mean the derivative of x^t with respect to t when t = 0; that is, ln(x)/unit of time is the “interest rate” of the kind of exponential growth where one multiplies by x over every unit of time.
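A finite-difference check of this definition (the step size h and the function name are my choices):

```python
import math

def interest_rate(x, h=1e-8):
    # derivative of x**t with respect to t at t = 0,
    # approximated by (x**h - x**0) / h
    return (x ** h - 1) / h

print(interest_rate(2.0))      # approximately ln(2) = 0.693...
print(interest_rate(math.e))   # approximately 1
```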
Note: Although an exponential function isn’t necessarily uniquely determined by its value at 1 [knowing b^1 doesn’t necessarily tell one what b^0.5 is, for example], it is uniquely determined by its “interest rate”; that is, the function which sends t to b^t is entirely specified by ln(b).
General exponentiation: One key equation, justified quite readily when p is one of our already allowed exponents [a non-negative real number], is that ln(b^p) = p * ln(b). But we can use this to define exponentiation even when p is a more general type of entity; once we know what b and p are, we know what p * ln(b) is, which tells us what ln(b^p) should be, which tells us what the exponential function with base b^p is. In other words, the function sending t to (b^p)^t is the exponential function whose “interest rate” is p * the interest rate of the function sending t to b^t. Now we know how to carry out exponentiation with all kinds of exponents, not just non-negative real exponents. For example, 5^(rotate by 35 degrees) is the result of exponential growth for one unit of time, in such a way that one’s rate of growth is always equal to one’s current value * ln(5), rotated by 35 degrees, per unit of time. It’s just the solution to some differential equation in the context of 2d space. Nothing magical.
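To make the differential-equation reading concrete, here is a sketch that Euler-integrates that growth rule, using complex numbers as a stand-in for 2d vectors (so that “rotate by 35 degrees” is multiplication by e^(i·35°)); the step count and all names are my own choices:

```python
import cmath, math

# growth rule from the text: at every instant,
# dz/dt = (current value) * ln(5), rotated by 35 degrees
rate = math.log(5) * cmath.exp(1j * math.radians(35))

z = 1 + 0j                 # start at the value 1
steps = 100_000
dt = 1.0 / steps
for _ in range(steps):     # crude Euler integration from t = 0 to t = 1
    z += rate * z * dt

print(z)                   # approximately 5^(rotate by 35 degrees), applied to 1
print(cmath.exp(rate))     # closed form: exp(ln(5) * e^(i*35 degrees))
```

The two printed values agree (up to the integration error), which is the sense in which 5^(rotate by 35 degrees) is “just the solution to some differential equation”.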
Euler’s theorem: Returning to the concepts of rotation, exponential growth, natural logarithms, etc., one particular, not really all that amazing fact is that ln(r) = i, which is simply to say “The derivative at t = 0 of the ‘rotate by t radians’ operation is the ‘rotate by 90 degrees’ operation”. This amounts to little more than recognizing that the tangents to a circle are perpendicular to its radius [and remembering how radians are defined as arclength-to-radius ratios]. However, this isn’t terribly important, really, for recognizing the relationship between complex numbers and rotation. Still, it’s there.
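The claim ln(r) = i can likewise be checked by finite differences: the derivative at t = 0 of “rotate by t radians”, applied to a vector v, should come out to v rotated by 90 degrees (a sketch; names and step size are mine):

```python
import math

def r(t):   # rotate by t radians
    c, s = math.cos(t), math.sin(t)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

def i(v):   # rotate by 90 degrees
    return (-v[1], v[0])

v = (3.0, 4.0)
h = 1e-6
deriv = tuple((r(h)(v)[k] - v[k]) / h for k in range(2))
print(deriv)   # approximately (-4, 3)
print(i(v))    # exactly (-4.0, 3.0)
```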
I’ll continue from here later, though it should be enough to indicate some of the relationship between complex numbers and rotation. Since trigonometry is fundamentally the study of rotation (and not really so much about triangles, as such, despite the name), this gives us the bridge between complex numbers and trigonometry.