I’ve been mulling over the topic these last few days and have some more comments (on the off chance that anyone is still interested…). One major point to keep in mind is that biology is the study of insanely complex systems which are not the result of a rational designer. In such systems, very subtle details can be absolutely critical for understanding how the system functions.
It is advanced in some ways. We’ve learned an awful lot over the last several decades, and we can tell all sorts of stories about how cells behave. But our understanding is largely qualitative. That’s led to the enumeration of (probably) most cell signaling pathways, and a general understanding of how those pathways work. For instance, in developmental biology, model organism genetics has let us enumerate nearly every mutation that screws up embryonic development. That work led to the discovery of many conserved signaling pathways, and subsequent research has worked out the general outlines of each pathway, which are usually the same in all animals. To answer the question in the OP directly, I’d say these approaches have probably identified a large majority of the signals and receptors in the human genome.
However, our quantitative understanding of most pathways is very poor. When I started research I was very enthusiastic about mathematical models of cell signaling pathways. Now that I’m familiar with the gory biochemical details, I’m less impressed. The models that exist rely on a lot of approximations and empirical fudging, because we can’t measure relevant biochemical parameters in sufficient detail.
For instance, if you knew concentrations, catalysis rates, and dissociation constants for every set of interacting proteins in a pathway, you ought to be able to make a pretty good model of how signals are transmitted through the pathway. But we can’t measure those in a cell. Most of our biochemical techniques require huge amounts of purified proteins to measure those parameters in isolated, artificial conditions. Even worse, those measurements aren’t even very consistent – I’ve seen measurements of receptor-ligand binding that differ by orders of magnitude. And that was with a system where ligand dimers interact with receptor tetramers. If we can’t reliably measure the interaction between one ligand and one receptor, how the hell can we predict the relative interaction between a dimer composed of two different ligands and four different receptors? When there are also a half dozen or so co-receptors and antagonists, all interacting at the same time and place?
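To put a number on why those inconsistent measurements matter, here’s a toy sketch (all numbers made up for illustration, not from any real pathway) of how an order-of-magnitude uncertainty in a dissociation constant propagates straight into the model’s output, using simple 1:1 equilibrium binding:

```python
# Toy illustration: order-of-magnitude disagreement in a measured Kd
# translates directly into wildly different predicted receptor occupancy.
# Ligand concentration and Kd values are hypothetical.

def fraction_bound(ligand_conc, kd):
    """Equilibrium fraction of receptor bound for simple 1:1 binding."""
    return ligand_conc / (ligand_conc + kd)

ligand = 1.0  # nM, hypothetical
for kd in (0.1, 1.0, 10.0):  # "measured" Kd values spanning two orders of magnitude
    print(f"Kd = {kd:4.1f} nM -> fraction bound = {fraction_bound(ligand, kd):.2f}")
```

With reported Kds differing by 100-fold, the same model predicts anywhere from ~9% to ~91% receptor occupancy – and that’s before adding dimers, tetramers, or co-receptors.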
And those are the best techniques. Most of the techniques I’m using right now come down to “is this blob darker or smaller than that blob?”
There’s a famous essay on the inadequacy of biological approaches to fully understand the systems we study: “Can A Biologist Fix A Radio?” That essay makes some very good points about how we can only learn so much about a system when most of our best approaches ultimately come down to breaking it and describing the results. However, while I can agree with the desire for a better quantitative understanding, we simply lack the tools to measure these systems with sufficient accuracy. To take the radio analogy further, let’s say that we can figure out how all the components are wired. But we can’t measure the basic properties of most of the components, and the few measurements that are possible are wildly inaccurate. What the hell use is a circuit diagram where we can measure most of the capacitors with decent accuracy, half of the resistors within two orders of magnitude, and none of the inductors, and don’t even know about the existence of diodes?
One receptor responding to multiple ligands is very common in the cell signaling pathways I’m familiar with, which include large “families” of ligands and receptors. For example, the EGF receptor (EGFR) can be activated by both EGF and TGF-α (as well as a host of other EGF family ligands).
To get back to the OP, EGFR activation is a good example of how subtle differences in protein interactions can lead to very different system behaviors. EGFR is a receptor tyrosine kinase. Conceptually, it has one of the simplest methods of receptor activation: in the absence of EGF, the receptors don’t interact. When EGF is present, the extracellular domains of EGFR stick to each other, bringing the intracellular kinase domains close enough to interact, which allows them to phosphorylate each other and downstream signaling proteins.
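The dimerization logic above can be sketched quantitatively. This is my own cartoon simplification, not a model from the literature: assume receptors only pair up when ligand-bound, so the signaling output tracks the dimer fraction. If each receptor is occupied with probability p = L/(L + Kd), then a dimer needs two occupied receptors, and the output goes roughly as p², i.e. it’s nonlinear in ligand concentration:

```python
# Cartoon sketch of ligand-induced receptor dimerization (my simplification).
# Assumption: only ligand-bound receptors dimerize, and signal ~ dimer fraction.
# Kd and ligand concentrations are arbitrary illustrative numbers.

def bound_fraction(ligand, kd):
    """Probability a given receptor is ligand-bound (1:1 equilibrium)."""
    return ligand / (ligand + kd)

def dimer_signal(ligand, kd=1.0):
    """Relative signal, assuming both partners in a dimer must be bound."""
    p = bound_fraction(ligand, kd)
    return p * p

for ligand in (0.1, 1.0, 10.0):
    print(f"[L] = {ligand:4.1f} -> occupancy {bound_fraction(ligand, 1.0):.2f}, "
          f"relative dimer signal {dimer_signal(ligand):.3f}")
```

Even this cartoon shows why intuition fails: doubling receptor occupancy quadruples the dimer signal, and the real system layers conformational details (see below) on top of that.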
Naively, you might then think that any way to get two EGFR proteins close enough to activate each other would be equivalent. But that’s not the case. In certain cell types, TGF-α can cause cancerous proliferation, but EGF does not. Yet both bind to EGFR with similar dissociation constants, and result in similar levels of receptor phosphorylation. How can this be? I’ve read some very fascinating structural biology papers showing how EGF and TGF-α binding cause subtly different changes in the structure of the EGFR intracellular domain. It seems to be a matter of two short helices, outside the EGFR catalytic domain, that adopt slightly different orientations when bound by EGF and TGF-α. That, in turn, results in a very different output: EGF-bound EGFR is taken into the cell and degraded, but TGF-α-bound EGFR stays on the cell surface and remains active.
That’s the sort of subtle detail that, in human terms, means the difference between cancer and normal wound healing. And since protein structure prediction is still not a solved problem, it’s not yet possible to predict such differences even in the case of one receptor that binds two ligands.
And this is just a single pathway. There are all sorts of ways for different pathways to interact with each other, at multiple levels.
That wall of text deserves a tl;dr: Biology is complicated. We’ve probably identified most cell signaling ligands, agonists, and receptors, but we can only describe their behavior in qualitative terms.