Some bad news on the whale front:
I often think that many animals exist in a realm that shapes their language in ways that so far we cannot decode. We are looking at the sounds they make from too different a perspective.
Sea creatures would at a basic level involve three dimensions much more in their language as to locations. Humans live on the surface. A more two dimensional realm. We seldom refer to altitude in things that involve location. Sea creatures exist in a moving medium. Currents. They are like pilots in an aircraft. Continents remain stationary. But the paths among them are flowing. So how do they speak locations and paths? Can we stationary beings decipher the concepts of position of those who live in a constantly flowing domain?
We also have developed our linguistics in respect to things that we control, even create. I think it is probably a more static based language. Lots of solid anchors for things. Sea life exists in obviously a more fluid realm of time and space. Maybe the most extremely nomadic and least constructive surface dwelling people might have some linguistic commonality with sea life. But even then they are two dimensional and construct things.
Also.
How might animals incorporate senses that are more obvious and important to them into their language?
If they are very conscious of geomagnetic influence for direction and location. Depth and pressure.
Compared to many animals we have big blind spots that we have not had to incorporate into our language.
Accepting that we cannot decode it, are there still functional characteristics that we can recognize as language?
It’s a good point, but one solution is to incorporate the location and additional sensing data with the audio data during training. It complicates data collection and reduces the value of existing data. However if ocean currents are a popular whale topic, these would be easier concepts to extract.
The really hard part is that it isn’t a translation problem, but an understanding problem. There isn’t a Rosetta Stone (the OG or the SW) to map whale to human. It’s more like WWII cryptography where we expect certain concepts, like islands or bases, to occur more frequently.
Humans have senses for detecting relevant features of our environment, too, but incorporating data from those senses wasn’t necessary for training AIs to do a convincing job of producing human language. We could almost certainly do a better job of it if we had, and the same is doubtless true for whale communications, but it wasn’t needed.
Then again, our corpus of human language is a lot larger than our corpus of whale language, and it was also in a highly simplified version of our language, a simplification that whales don’t have.
Yes, but it’s not exactly the same problem.
A language model, without a context engine, degenerates to ‘given the prior words, what is the most likely word to come next?’ It’s trained on a corpus created by humans, and the inference is consumed by humans. In most cases the same language is used for both training and inference.
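To make the "most likely next word" idea concrete, here is a minimal sketch using plain bigram counts. It is nowhere near how ChatGPT actually works; the toy corpus and the function names `train_bigram` and `most_likely_next` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the corpus."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows


def most_likely_next(follows, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    if prev not in follows:
        return None
    return follows[prev].most_common(1)[0][0]


# Hypothetical toy corpus, already split on word boundaries.
corpus = "the whale sings and the whale dives and the boat waits".split()
model = train_bigram(corpus)

print(most_likely_next(model, "the"))
print(most_likely_next(model, "zebra"))
```

Note that the sketch assumes the corpus is already split into words, which is exactly the step nobody knows how to do for whale audio.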
We could try the same for whales provided we knew the equivalent of word boundaries. We could generate whale sounds that they would recognize, but we still wouldn’t know what it means. It would be like a language model trained on Rot13 text.
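A rough way to see the Rot13 point: the statistics a model learns from are untouched by the cipher, so a model can predict Rot13 text just as well as plain text without anyone knowing what it says. This is only a sketch on a made-up sentence; `bigram_counts` and `rot13` are illustrative helper names, and the only library call is Python's built-in `rot_13` codec.

```python
import codecs
from collections import Counter


def bigram_counts(tokens):
    """Count adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rot13(word):
    """Apply the Rot13 letter substitution using Python's built-in codec."""
    return codecs.encode(word, "rot_13")


# Hypothetical toy sentence and its Rot13 version.
plain = "the whale sings and the whale dives".split()
cipher = [rot13(w) for w in plain]

plain_counts = bigram_counts(plain)
cipher_counts = bigram_counts(cipher)

# The pattern of counts is identical; only the labels differ.
same_shape = sorted(plain_counts.values()) == sorted(cipher_counts.values())

# With the key (Rot13 is its own inverse) the counts map back exactly.
decoded = Counter({(rot13(a), rot13(b)): n for (a, b), n in cipher_counts.items()})

print(cipher)
print(same_shape, decoded == plain_counts)
```

With whales we would have the equivalent of `cipher_counts` but no `rot13` function to map it back, which is why extra context is needed.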
To understand whale language, we would have to include whale context to find the associations between whale words and concepts.
Even discovering whale word boundaries probably requires context. There was a discussion up-thread about how hard that might be. What’s the signal, what’s the carrier, what’s the noise?
Admittedly I might be going off topic from the OP.
As I understand it, for ChatGPT and its ilk, the only context it had was more text. It doesn’t know the context outside of that, and that imposes some real, fundamental limits on it, but with enough content, you can do a really good job of faking the context.
For what it’s worth, ChatGPT was also trained on Rot13, and understands it… but it understands it as a separate language from English.
That was my understanding, but I haven’t read the paper. Also I wasn’t sure if there was a knowledge base or context engine used during inference to direct the results.
Right, ChatGPT can learn and repeat Rot13, but it looks like noise to us. Similarly we could train an AI to learn and repeat whale, but it would still be noise to us humans. Even with such an AI, it would be difficult to determine if whales have language. We might be able to see how whales react to generated snippets.
My thought is that by providing the AI additional sensing or context we would be providing additional input features besides the whale audio. And while the whale audio is a mystery to us, the additional features are not. We could use those features as an incomplete Rosetta Stone – for example every time a whale is in this ocean current, we get these blurbs of audio.
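As a sketch of what that "incomplete Rosetta Stone" could look like in practice: label each recording with a context we do understand, then score how strongly each sound type is associated with each context. The observations, the context names, and the sound labels "A", "B", "C" below are all invented, and pointwise mutual information is just one association measure picked for illustration, not anything Project CETI is known to use.

```python
import math
from collections import Counter

# Hypothetical toy data: (known context, label of the audio blurb heard).
observations = [
    ("in_current", "A"), ("in_current", "A"), ("in_current", "B"),
    ("out_of_current", "B"), ("out_of_current", "B"), ("out_of_current", "C"),
]


def pmi_table(pairs):
    """Pointwise mutual information (in bits) for each observed (context, sound) pair."""
    n = len(pairs)
    joint = Counter(pairs)
    contexts = Counter(c for c, _ in pairs)
    sounds = Counter(s for _, s in pairs)
    return {
        (c, s): math.log2((k / n) / ((contexts[c] / n) * (sounds[s] / n)))
        for (c, s), k in joint.items()
    }


table = pmi_table(observations)

# Positive scores: the sound shows up in that context more often than chance.
for (context, sound), score in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{context:15s} {sound}  PMI={score:+.2f}")
```

A high score only says a sound and a context go together; it says nothing about what the sound means, and it still assumes we can already tell one blurb type from another.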
Even if whales have a language, it would not be that straightforward since whales likely think very differently than humans.
Roundup of Whale Communication Research
I’ll add emphasis.
iScience Perspective June 17, 2022: Toward understanding the communication in sperm whales
Machine learning has been advancing dramatically over the past decade. Most strides are human-based applications due to the availability of large-scale datasets; however, opportunities are ripe to apply this technology to more deeply understand non-human communication. We detail a scientific roadmap…
https://www.cell.com/iscience/fulltext/S2589-0042(22)00664-2
Submitted on 20 Nov 2022: A Theory of Unsupervised Translation Motivated by Understanding Animal Communication
Recent years have seen breakthroughs in neural language models that capture nuances of language, culture, and knowledge. Neural networks are capable of translating between languages – in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent animals. Our work is motivated by an ambitious interdisciplinary initiative, Project CETI, which is collecting a large corpus of sperm whale communications for machine analysis. We propose a theoretical framework for analyzing UMT when no parallel data are available…
[2211.11081] A Theory of Unsupervised Translation Motivated by Understanding Animal Communication
Submitted on 20 Mar 2023: Approaching an unknown communication system by latent space exploration and causal inference
This paper proposes a methodology for discovering meaningful properties in data by exploring the latent space of unsupervised deep generative models. … We apply the methodology to test what is meaningful in the communication system of sperm whales, one of the most intriguing and understudied animal communication systems. We train a network that has been shown to learn meaningful representations of speech and test whether we can leverage such unsupervised learning to decipher the properties of another vocal communication system for which we have no ground truth. The proposed technique suggests that sperm whales encode information using the number of clicks in a sequence, the regularity of their timing, and audio properties such as the spectral mean and the acoustic regularity of the sequences. Some of these findings are consistent with existing hypotheses, while others are proposed for the first time. …
[2303.10931] Approaching an unknown communication system by latent space exploration and causal inference
Incredible.
Project CETI is pulling together the great and the good. Partial list of institutions: UC Berkeley, Harvard, MIT, Oxford, Amazon Web Services, Google Research, Microsoft Research, National Geographic Society. Everyone seems to want a piece of this. https://www.projectceti.org/
And all of this assumes words.
Can there be language that doesn’t use words? And if so, can we recognize it?
– there is certainly communication that doesn’t use words.
It could be a language that doesn’t have standard components at the word level.
There could be a complex language combining tail movements and free form musical vocalizations that might still qualify as a language yet not have components at the word level or even phrases, yet still have something like grammar that would provide consistent understanding of the language. Blue whales began to diverge from other whale species 5 to 10 million years ago. With not that much to do except find food and mate, while at the same time having a pretty big brain, Big Blues could have developed an extraordinary language.
OTOH, if intelligent animals can communicate effectively even in a non-language system they’ll still likely have some words. What I mean by a word is some repeated, recognized expression with a consistent meaning. Something that means “Look out!” at a minimum, and probably more.
Such as a name? In the case of dolphins, the “word” is a whistle.
Researchers Find That Dolphins Call Each Other By 'Name' : The Two-Way : NPR.
ETA: I know that dolphins aren’t whales, but in following up on this I see that they have areas of their brains that are associated with the use of language (i.e., Broca’s area and Wernicke’s area).
Certainly names are useful. Offhand, I can’t think of an effective alternative to using words for names. The basic words can exist without much else in the way of a language. I doubt that’s the case though.
The trouble is that old thing about what it’s like to be a whale. They don’t need much language to exist, not much need to provide detailed descriptions or instructions. If they’re just talking it’s mostly going to be about food, weather, sex, and children. Which is pretty much most of what humans talk about also, but the experiences of our lives are vastly different.
My first thought is that music (without lyrics) is a form of communication. @TriPolar gave that example as well as fin motions. I assume any language, or communication, can be broken down into primitives – vocalizations or otherwise.
If whales have the capacity for language, but only the need for a limited set of topics, perhaps they apply the excess capacity towards developing a rich set of art. Or maybe they have been telling and retelling an oral history going back thousands of years. Not the most likely options, but at least humans have a tendency to develop to our limits and not just our needs.
Thanks for finding and sharing these references!
Communication and language are not the same; that is the problem here. For a living organism it is practically impossible not to communicate (simply hiding and staying put may mean “I don’t want to be eaten”), but a language is more complex and structured. I would set the bar for “language” at translatability (professional deformation shining through here). If you can translate it into another language, it is a language. If you can’t, it is mere communication. As for music: can you translate pop into blues (without words!) and keep the meaning?
Seconded!
The question of what distinguishes “language” from other “communication” is very key.
I found the following to NOT be a satisfactory answer, however.
Native Americans couldn’t translate the European languages when they met. Should they have considered Europeans to not have language?
At best it is an anthropocentric concept: communication similar enough to human language that it translates to concepts that make sense to us can be language, but something that communicates complex concepts unfathomable to humans, in varied, non-stereotyped ways, back and forth between individuals, is not.
I don’t buy that.
Our language is thought to have co-evolved with complex tool creation and tool use as linear processes.
Cetaceans’ minds clearly are different from ours, having evolved to use sound to build an understanding of multiple salient objects relating to each other across vast amounts of three-dimensional space and across time.
If they have a language, it may be more like aurally communicating changing four-dimensional maps that attach varying values to different objects, with the map inclusive of self, group, and groups of other species, none of which may be translatable to us.
To claim that, because our minds are so alien to each other that we cannot grasp theirs, such hypothetical complex communication is not language? That would reflect only our limitations, not whether it is language.
ETA also thanks to MtM!
Not when they met, maybe, but they did eventually. American languages and European languages can be translated into each other.
I was thinking also of the possibility of images transmitted by sonar. If the image is understood as a whole, there wouldn’t be words involved.
Yes. I think a lot of the question is: are they telling each other stories?
– and if so, are any of those stories fiction, recognized as such? (I don’t think that’s necessary for language – but I think it’s an important fact about human language that it makes it so easy to state counterfactuals. Though other creatures do manage to do so without anything we call language.)
I don’t know. I “translate” the communication of cats and dogs into human language all the time. “I want to go in/out, come open the door!” “I want food! No, not this stuff. Something tastier!” “Give me some pats.” “Leave me alone!” “I don’t like that dog!” “(cat to another cat) I’m about to start a play fight, but it’s just play, I’m washing you first to show that we’re friends.” But I doubt that the cats and dogs are thinking any of those things in words.
And I don’t think a non-human language would need to translate into a human language to be a language.