The technical definition of a linear system is as follows. Suppose we know that an input $f_1(t)$ produces an output $g_1(t)$, while an input $f_2(t)$ produces an output $g_2(t)$. Then it will always be the case that any linear combination of the two inputs will produce the same linear combination of the outputs. So for example input $2f_1-f_2$ will give the output $2g_1-g_2$. Provided we also assume that the properties of the system itself do not vary with time (although of course the input will vary with time), then this is enough information to be able to prove the “sine wave in, sine wave out” result described in the main text. I will give the proof at the end of this section.
The underlying mathematical reason for the importance of sine waves is that they are eigenfunctions of the process of differentiation: differentiating a sine wave gives another sine wave, changed only in amplitude and phase. This is related to the “sine wave in, sine wave out” property: if you imagine that your system in the “black box” is described by a differential equation of some kind, then if the equation is linear in the usual mathematical sense, it consists of a linear combination of the function and its derivatives. It follows that if the function is a sine wave, then the total output will still be a sine wave since each separate term in the equation gives a sine wave, and adding sine waves together (all at the same frequency) still leaves another sine wave.
For many algebraic manipulations involving sine waves, it is easier to use a complex-number representation. If you are not familiar with the idea of complex numbers, break off here to look at the next link.
Now we take advantage of the famous result
$$e^{i \omega t}=\cos \omega t+i\sin \omega t . \tag{1}$$
This allows us to package the amplitude and phase of a sine wave into a single complex number. From equation (1), $\cos\omega t=\mathrm{Re}(e^{i \omega t})$. Now suppose we have a sine wave with amplitude $a$ and phase $\phi$:
$$f(t)=a\cos(\omega t + \phi)=\mathrm{Re}(ae^{i \omega t+i\phi})=\mathrm{Re}(Ae^{i \omega t}) \tag{2}$$
where
$$A=ae^{i\phi}. \tag{3}$$
As a shorthand, we usually do not bother to write the $\mathrm{Re}(…)$. We just talk about a sine wave $Ae^{i \omega t}$ with frequency $\omega$ and (complex) amplitude $A$. Just keep in the back of the mind that if the answer to a calculation involves complex numbers, then what we really mean is that the physical solution is the Real part of the complex answer.
Now we are ready to prove the “sine wave in, sine wave out” property. Suppose our input $f(t)=e^{i \omega t}$, and suppose the corresponding output is $g(t)$. Now think about a small time shift by $\delta$. The input is
$$x(t+\delta)=e^{i \omega (t+\delta)} = e^{i \omega t} e^{i \omega \delta} = x(t) e^{i \omega \delta}. \tag{4}$$
By the assumption of linearity, the output must follow the same pattern:
$$y(t+\delta)=y(t) e^{i \omega \delta} \tag{5}$$
because the term $e^{i \omega \delta}$ is simply a constant multiplier. But we know from Taylor’s theorem that
$$y(t+\delta) \approx y(t) + \delta \dot{y}(t) \tag{6}$$
provided $\delta \ll 1$, where $\dot{y}$ is $dy/dt$. So we have
$$y(t) + \delta \dot{y}(t) \approx y(t) e^{i \omega \delta} \approx y(t) (1 + i \omega \delta) \tag{7}$$
leading to
$$\dot{y}(t) \approx i \omega y(t). \tag{8}$$
In the limit $\delta \rightarrow 0$ this equation becomes exact. It is a simple first-order differential equation for the output $y(t)$, which has the general solution
$$y(t) = A e^{i \omega t} \tag{9}$$
where $A$ is an arbitrary (complex) constant. This is the result we want: the output must indeed be a sinusoidal wave at the same frequency $\omega$ as the input, scaled by a factor $a$ and phase-shifted by an angle $\phi$ where $A=a e^{i \phi}$.