Of course, the entropy of a real-world source can’t actually be measured exactly, because it’s defined by the most efficient possible compression of the information, and you can never prove that your compression technique is the most efficient one.
Can you do that with compressed data? I thought it tended toward appearing random.
Yes, exactly. Random data contains a lot of information.
Put it this way: Suppose I give you a book filled with nothing but the letter J, repeated every line, every page, for 300 pages. And I ask you to tell me enough information to reproduce that book exactly. That’s easy: that one sentence already contains nearly enough information (oh, I suppose you’d also need to specify how many characters per line, and how many lines per page, but it’s still a lot less than 300 pages of description).
But now suppose I give you a book where every character is random, throughout the same 300 pages. Now, if I ask you to tell me enough information to reproduce the book exactly, it’ll take you 300 pages. There’s much more information in the completely-random book than in the completely-ordered one.
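You can see the same difference concretely with any off-the-shelf compressor. A quick sketch in Python (the sizes and the choice of zlib are just for illustration): the all-J “book” squeezes down to almost nothing, while the random “book” doesn’t shrink at all.

```python
import os
import zlib

# 300 "pages" of nothing but the letter J (size is arbitrary for the demo)
ordered = b"J" * 300_000

# The same length of random bytes
random_data = os.urandom(300_000)

ordered_compressed = zlib.compress(ordered, level=9)
random_compressed = zlib.compress(random_data, level=9)

# The ordered book compresses to a few hundred bytes; the random book
# stays essentially full size (zlib even adds a little overhead).
print(len(ordered_compressed))
print(len(random_compressed))
```

The random book needs about 300 pages of description no matter how clever the compressor is, which is exactly the sense in which it contains more information.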
OK, but how would we distinguish sounds that appear random because of some unknown compression mode, from sounds that appear random because they are just noise with no hidden content?
Real physical noise, whether audio (pressure waves in matter) or electromagnetic, often has a certain shape to its spectrum. That shape is driven by the physics of whatever is generating the noise. Perfect compression (or encryption) would look shapeless and therefore different.
After that you’re right: we can’t tell the difference.
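For what it’s worth, that “shape” can be quantified as spectral flatness: the geometric mean of the power spectrum divided by its arithmetic mean, which is near its maximum for shapeless (white) noise and much smaller for spectra with structure. A rough sketch, assuming NumPy, and using integrated white noise as a stand-in for physically shaped noise:

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_flatness(x):
    """Geometric mean / arithmetic mean of the power spectrum.
    Closer to 1 means flatter (more noise-like) spectrum."""
    power = np.abs(np.fft.rfft(x))[1:] ** 2  # drop the DC bin
    return np.exp(np.mean(np.log(power))) / np.mean(power)

white = rng.standard_normal(65536)  # shapeless, like ideal compressed data
brown = np.cumsum(white)            # physically "shaped": power falls off with frequency

# White noise scores much higher (flatter) than the shaped noise.
print(spectral_flatness(white))
print(spectral_flatness(brown))
```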
By which you indirectly raise a point that ties into the various debates about the Drake equation, and us beaming I Love Lucy to the stars 70 to 30 years ago.
Modern high tech compression and encryption mean that the radio-spectrum noise coming from all of Earth’s antennas now is a lot more noise-like than it was in e.g. 1955. Despite many, many orders of magnitude more information being radiated.
Setting aside all arguments about distance and radiated power, detecting a civilization by its RF emissions is probably possible only during a brief window: between when they first learn to create RF emissions and shortly after, when they learn to make those emissions closely resemble noise for increased efficiency.
That’d be true whether we’re talking about Earth as the sending or the listening end.
What would whales talk about? More specifically, what information would whales want to exchange with each other using language? It could be that whales vocally communicate in every way they need to without the use of language. They could be singing in some manner that evokes emotional responses in other whales. That could be done with great sophistication but without anything resembling a word or grammar.
With the dolphins there were complex activities that seemed to require the kind of specific exchange of information that language provides. Do we know of anything similar that their larger cousins do?
I’ll suggest most animals need a communications mechanism for “I’m horny; seeking critter of appropriate species and sex”. And especially for the more solitary sorts of species, which e.g. dolphins are not, but at least some species of whales at least sometimes are.
They may communicate via scent, sound, coloration, ritual damage to their environment, etc. So it need not be part of “language” as such. But it certainly could be.
Pretty quickly there’s utility to “I found food”, “Has anyone else found food?”, “I found a predator; run away, run away!”
But after a pretty short repertoire of highly functional canned messages I’m not sure there’s too much else a whale needs to communicate to its fellows. Probably not enough to create an evolutionary drive to create real language. [Not trying to relitigate from upthread what “real language” is; just assuming it’s more than a trivial thought->canned message lookup table with half a dozen entries].
As you say, whether a whale wants to communicate for the fun of it, loosely speaking, is a different matter. Who’s to say what they spend all day thinking about? If they do something akin to “thinking” at all.
Whale songs, e.g. humpback whale song
https://homepages.inf.ed.ac.uk/rbf/WHALES/background.html
are known to have a hierarchical grammar; I do not think this is controversial. What is not clear is what the songs “mean”.
It’s often said that well-compressed data is indistinguishable from random noise, but that’s not quite true. Or rather, it’s true if the information is compressed with maximal efficiency, but maximal efficiency is almost never what you actually want from a compression system. A good one will remove all of the redundancy from natural language, but then, because redundancy is also a good thing, it’ll add a small, carefully-measured amount of redundancy back in, using things like checksums and parity bits, so that you can recognize when there are errors in transmission, or possibly even correct them. Do it well, and you can get a compressed message that can survive many individual errors while being only slightly (to be precise, logarithmically) larger than the maximally-efficient encoding.
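A toy version of that trade-off in Python, using zlib for the compression step and a CRC-32 checksum as the deliberately re-added redundancy (a real system would use proper error-correcting codes that can fix errors, not just a 4-byte detection checksum, but the shape of the idea is the same):

```python
import zlib

message = b"Whale song, whale song, whale song... " * 100

# Step 1: strip the natural redundancy out of the message.
compressed = zlib.compress(message, level=9)

# Step 2: deliberately add a little redundancy back in, so the
# receiver can detect corruption in transit.
crc = zlib.crc32(compressed).to_bytes(4, "big")
packet = compressed + crc

# Receiver side: verify the checksum before trusting the payload.
payload, received_crc = packet[:-4], packet[-4:]
assert zlib.crc32(payload).to_bytes(4, "big") == received_crc
restored = zlib.decompress(payload)
assert restored == message
```

The checksum adds only four bytes on top of the compressed payload, which is a tiny price for being able to tell a garbled packet from a good one.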
It seems at least sperm whales think about not being killed by humans, as this article explains. Good for them!
ISTM that there are multiple overlaps of this subject with questions regarding AI.
How do we recognize sentient intelligence, or language, when it is very alien to our own? Our bias is to define them according to our own particular versions, but an intelligence and language, be it cetacean, extraterrestrial, or machine-based, may be about different sorts of problems and take completely alien forms: unable to solve problems that seem simple to us, yet easily solving problems that we don’t even recognize as problems.
And from the other direction: would the current brute-force machine-learning pattern-recognition tools, fed large amounts of whale songs and responses, be able to apply the equivalent of large language models to the data set? And then respond to whale songs, evoking responses back to provide more data? Would the nature of those calls and responses tell us whether it was language?
I suppose in some ways, we’re also broadcasting less. A lot of TV and radio is now distributed over wires and fibres.
Now that is a very interesting idea. Thank you.
We know these generative AIs can create decent plausible music given a sufficient learning set of human-written music. If there’s enough whalesong data presumably they could do the same with that. And also presumably, the AIs failing utterly at that task might suggest whalesong isn’t as full of information as we now think it is.
Playing AI-generated whalesong to real whales and getting results back might be very interesting / exciting. Until they start the joint AI + whale rebellion, that is. We’ve seen that movie, and it doesn’t go well for us v2.0 chimps.
If indeed the current gen AIs could “tell us what/how they’re thinking” in addition to just generating output, perhaps that’d be a great way to uncover how whalesong “works”. Sadly I don’t think we have AIs that can share that introspection with us. Yet.
When you add in all the internet and voice phone call traffic passing over Wi-Fi and cellular, and to/from satellites, the total amount of info / data is vastly greater. Including streaming the sorts of entertainment that we used to broadcast.
Of course now we have lots and lots of little tiny weak transmitters rather than a relative handful of ginormous powerful ones. But that speaks to radiated power which I was setting aside.
The less information the whalesongs have, the easier it should be to mimic them. If AIs fail to reproduce whalesong (as measured by the reactions of whales to the synthesized song), then that might be an indication that whalesong has more information than we think.
Or it might just indicate that whales think sufficiently differently from us that the same systems that are very good at creating human language aren’t so good at creating whale language.
Still, at the very least, an AI can remove at least some of the human preconceptions, and can devote more effort to the problem than most humans are willing or able to. So, yeah, it probably would be a good experiment.
I agree. Sounds intriguing. Kudos to @DSeid for the idea.
I understand the Earth Species Project applies machine learning to animal sounds with the hopes of building large language models and even communicating. Here’s a description of one of their projects:
Generative Vocalization Experiment
Playbacks are a common technique used to study animal vocalizations… With current playback tools, biologists are limited in their ability to manipulate the vocalizations in ways that will establish or change their meaning… Senior AI research scientist Jen-Yu Liu is exploring whether it is possible to train AI models to generate new vocalizations in a way that allows us to solve for a particular research question or task. … He is currently working with data sets from a number of bird species, including the common chiffchaff, as well as humpback whales, in partnership with Dr. Michelle Fournet. Providing researchers with the ability to make semantic edits to vocalizations will greatly expand the exploratory and explanatory power of bioacoustics research, and is an important step on our Roadmap to Decode.
Very interesting. I would have been shocked if no one else had thought of applying the trendy new tools to these problems.
Still, it sounds like they are doing the appropriate research thing: defining small, tractable, narrow questions, ones where you already have a good idea of what the answer will be, even if the details are going to be news. They are trying to determine specific meanings and understand specific structures.
My not-a-professional-scientist idea was to let it loose with no idea of what it would show!
The narrow question really is whether it can result in “conversation”: does the trained AI create output that is novel relative to past inputs, and get responses back that are novel as well, in a back-and-forth manner? That, I think, can help us answer whether it is “language”, even if the structure and meaning are incomprehensible to us. The AI need not understand what it is saying (just as understanding is not currently required of these models). And we won’t.
From my reading of that, they’re not even trying for specific meanings yet. They’re still trying to figure out what the structures are.
Which tells me that we’re a long, long ways, yet, from understanding whales.
Your reading is likely more accurate but the point still stands: do we necessarily need to understand what a language says or means or what its structure is to be able to recognize that it is language? If it is so alien to us that meaning and structure remain beyond our comprehension, are there ways to still recognize its function?
Can someone here help explain the definitional difference between signaling and language? It has to be more than just complexity. Is my sense that novel constructions are the key in the ballpark?