Are our devices listening to us?

Look at it this way - if they’re not bugging you yet, you can be damned sure they will at some point in the future. It’s too ripe a plum to go unplucked for long.

That is a lot of data to store and analyze.

Who is paying for all this? They would be losing money on me.

Storing and analyzing data is Google’s entire business model.

Stranger

Alexa is listening for keywords before it wakes up and does its magic. While it’s possible that it’s saving some snippets of sound from other times, you would definitely be able to detect massive amounts of data being sent for analysis and collation.

Yeah, that would be on my dime…er, bandwidth.

Not really. State-of-the-art audio codecs can achieve remarkably low bitrates on speech. Opus, which is based on SILK (the codec developed for and used by Skype), works down to about 6 kb/s, and specialized speech codecs like Codec 2 can go as low as 0.7 kb/s. (The quality will be terrible, but in theory the speech will at least be intelligible.) This means that even if you had a personal assistant that you talked to 24 hours a day for an entire year, it would take only 2.6 GB to store all the audio data. If you had a million people talking to a million such devices, that’s still only 2.5 PB of data, which is well within the capacity of a large company to store and process. (By comparison, at the time Microsoft retired Hotmail, it stored 150 PB of user data. Google and eBay are each estimated to process about 100 PB of data per day.)
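To sanity-check those figures, here’s the arithmetic as a short Python script (the bitrate and fleet size are just the numbers from the paragraph above; the totals come out in binary gigabytes/petabytes):

```python
# Back-of-the-envelope storage math for always-on speech recording.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 s
BITRATE_BPS = 700                        # 0.7 kb/s ultra-low-bitrate speech
DEVICES = 1_000_000                      # hypothetical fleet size

bytes_per_device = BITRATE_BPS * SECONDS_PER_YEAR / 8
print(f"One device, one year: {bytes_per_device / 2**30:.1f} GB")            # ~2.6 GB
print(f"A million devices:    {bytes_per_device * DEVICES / 2**50:.1f} PB")  # ~2.5 PB
```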

Of course, this isn’t to say that personal assistant manufacturers really are storing all your speech data indefinitely. They might not be doing it at all, or they might be doing it only for specific users for specific purposes. But they certainly have the technical ability to do it to everyone.

Given how little data is required to store speech these days, I’m not sure that you would be able to detect it, particularly if you didn’t know exactly what to look for. It would be interesting to do a controlled experiment, though: buy two identical devices and configure them the same way. Give them the same audio commands at the same time for a week, but keep one of them in a completely silent room while exposing the other to normal household background noise. Use your router to monitor their network output to see how much data they send, and to where. It may not be easy, or even possible, to tell exactly what that data is, as the formats probably aren’t disclosed and the stream may be encrypted. But you could at least compare the amount of data sent.
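If you wanted to try it, the measurement half is only a few lines of Python with the scapy packet-capture library. This is a sketch, not a turnkey tool: the MAC addresses are placeholders for your two test units, and it assumes the machine running it can actually see the devices’ packets (e.g. via a mirrored port on your router):

```python
# Tally upstream bytes per test device so the quiet-room and noisy-room
# units can be compared at the end of the week. Needs capture privileges.
from collections import defaultdict
from scapy.all import sniff, Ether, IP

DEVICE_MACS = {                       # placeholder MACs for the two units
    "aa:bb:cc:00:00:01": "quiet-room unit",
    "aa:bb:cc:00:00:02": "noisy-room unit",
}
upstream = defaultdict(int)           # bytes sent, keyed by device label

def tally(pkt):
    if Ether in pkt and IP in pkt:
        label = DEVICE_MACS.get(pkt[Ether].src.lower())
        if label:
            upstream[label] += len(pkt)

sniff(prn=tally, store=False, timeout=3600)   # capture one hour per run
for label, total in upstream.items():
    print(f"{label}: {total / 1e6:.2f} MB sent")
```

Even without decrypting anything, a large and persistent difference between the two totals would be telling.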

The problem is that even this experiment might not control all the variables, or account for all possible behaviours of the device. For example, the devices might be always recording everything, but caching the recordings locally and not transmitting them while the experiment happens to be running. Or they might be capable of recording everything, but actually do so only at random intervals, or under certain conditions, or on demand by the manufacturer. These scenarios can’t be ruled out except by long-term continuous monitoring, or access to the device’s source code, neither of which is very feasible.

I have some experience in this area, and I can tell you the amount of data it would take to transport your voice recordings would be in the kilobytes. You would never notice it unless you were looking for it and did some sort of forensics.

Well, obviously Alexa and Google are “always listening”, but I believe they are listening locally for the “listen command” (e.g. “Hey Google”). I don’t think anything is being transmitted until that happens.

If it were constantly streaming audio to Amazon and Google, that should be readily visible using a network traffic monitor. I haven’t bothered, but I’m sure somebody has.

Alexa is listening only for the wake word. This does not require communication with a server. It can handle that part locally.

One of the consequences of this is that you have a very limited number of options for wake words. Amazon has tested out and optimized things for their select list so that the waking process is local only.
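For the curious, local keyword spotting is easy to play with these days. Here’s a minimal loop using Picovoice’s off-the-shelf Porcupine engine (pip install pvporcupine pvrecorder). To be clear, this is not Amazon’s implementation, just an illustration of the principle: the detector chews on audio frames entirely on-device, and nothing would touch the network until it reports a hit.

```python
# Minimal on-device wake-word loop. Everything here runs locally;
# the network would only be involved after a detection.
import pvporcupine
from pvrecorder import PvRecorder

porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_KEY",  # placeholder credential
    keywords=["alexa"],               # one of the engine's built-in models
)
recorder = PvRecorder(frame_length=porcupine.frame_length)
recorder.start()
try:
    while True:
        frame = recorder.read()            # ~32 ms of 16 kHz PCM
        if porcupine.process(frame) >= 0:  # -1 means no keyword this frame
            print("Wake word heard; only now would audio leave the device")
finally:
    recorder.stop()
    porcupine.delete()
```

It also illustrates why the wake-word list is short: each keyword needs its own carefully trained model, small enough to run continuously on cheap hardware.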

Note that Alexa devices need to connect to a server from time to time to get firmware updates and such. But none of that requires an audio data dump.

There are a bunch of people who like to analyze these devices and see what sort of data is being phoned home. Nothing odd has yet been found for Alexa and Google devices.

And these are people who find cheap Chinese cell phones calling home with lots of very private data. So they know what they are doing.

I have an Amazon Dash Wand, which is so primitive that you have to press a button to wake it. A lot of design decisions are based on cost: how cheaply can you make one of these devices and still have it work? So forget it doing things that it doesn’t need to do.

I also have a Google AIY voice kit with an old spare Raspberry Pi 2b inside it. Cost $10 at MicroCenter.

To turn a Pi into a Google Assistant, the core parts are a dual-mic board, a speaker, a Voice HAT board, and, for some reason, an LED push-button switch I don’t need.

It’s the HAT board that does all the magic in terms of pre-processing the speech. And since it’s part of a $10 kit, the HAT ain’t all that tricky of a board.
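For flavor, the kit’s own demo code (the V2 aiyprojects-raspbian API; names here are from memory, so treat them as approximate) makes the division of labor pretty clear: the HAT is just the microphone/speaker interface, the Pi shovels audio up to Google’s servers, and the recognition happens there. It also explains the push button: the stock demo uses it as push-to-talk.

```python
# Sketch of the AIY cloudspeech demo's flow (V2 API, approximate).
from aiy.board import Board, Led
from aiy.cloudspeech import CloudSpeechClient

client = CloudSpeechClient()           # needs Google Cloud credentials
with Board() as board:
    while True:
        board.button.wait_for_press()  # push-to-talk: nothing is sent...
        board.led.state = Led.ON       # ...until you press the button
        text = client.recognize()      # streams mic audio to Google
        board.led.state = Led.OFF
        if text and "turn off the light" in text.lower():
            print("would toggle the light here")
```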

Again, if this was recording and forwarding all the time there’d be several thousand Pi enthusiasts who would notice this.

Beware of the RAT!

*Remote Access Trojan.

Most of this conversation has been about home assistants, but your smart TV may be monitoring you too, with your “permission” if you don’t read the terms and conditions.

Your smart TV might be monitoring you even for things that you haven’t given permission for, and even if you disable the monitoring in the TV’s configuration. LG was the subject of a scandal a couple of years back when its smart TVs were found to be scanning for files shared on customers’ local networks and sending the information to LG, even when the TVs were supposedly configured not to phone home with usage data. Further details and discussion in this thread: Can’t cable companies tell what I’m watching? How is Nielsen still relevant?

“An off-switch! …She’ll get years for that.”

Stranger

[Me] “Alexa, turn off the lights.”

[Alexa] “(clap clap)”

It’s probably a coincidence. Yes, there are several different devices on the market now that listen to you (I have an Apple Watch, for instance), but they are listening for key words to wake up (mine uses “Hey Siri”). They aren’t streaming your entire conversation back to some narrow AI to process and see if you mentioned cowboy boots so it can direct ads your way. Not that they couldn’t do this, but you’d almost certainly notice the extra bandwidth being used, especially if, like my watch, the device was on an LTE cellular network where you pay for data. Even for an internet-connected device it would be a noticeable amount of bandwidth, and most of it would be garbage… hell, some of it would just be your TV playing, which I doubt the current crop of narrow AIs could differentiate. There would be too much dross for it to be worthwhile to dedicate processing on the off chance some random conversation would lead to a sale of boots to someone.

That said, brace yourself, because eventually this is probably going to happen. You could do all of this today, and if bandwidth prices drop further and AI improves, especially distributed AI, you could have personal devices able to really sift the dross for the nuggets and make it worth a company’s while. Today, I can’t even consistently ask Siri (or the Google version that I’ve used… or Microsoft’s) a question and have it answered in a rational way in a lot of cases. When I first got my watch I used to ask it stuff all the time, but the odd replies I got mean I now ask it only very narrow questions, or have it do very specific functions.