In a couple of my linguist positions they had me doing exactly this (for audio recordings and written texts, not in person). I was successful at it mostly because we weren’t dealing with a wide-open worldwide scope, but with certain fairly broad regions of the world, and I had concentrated on studying as many languages as possible from those regions. I would start by identifying the language family, and then pick out more specific clues that distinguished languages within the family.
To take a random example, all Indo-Aryan languages share a range of common phonological, lexical, and syntactic characteristics. Someone who’d studied at least a few of them could recognize the Indo-Aryan group immediately. Then if you heard implosive consonants, that would pinpoint it as either Sindhi or Lahnda. At this point I would ask them to call in the Sindhi specialist to verify and begin translating (if it was Lahnda, she knew that too). I was able to nail it 100% of the time, because the scope of the work included a limited number of language families, all of which I’d already familiarized myself with, down to the specific identifying features of each language.
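For anyone who thinks in code, that two-step narrowing (family first, then the distinguishing feature) can be sketched roughly as below. This is only a toy illustration in Python: the feature labels, the two lookup tables, and the `identify` function are placeholders I made up for the sketch, not a real linguistic database or anything we actually used on the job.

```python
# Toy lookup tables. Every feature label here is a hypothetical placeholder;
# a real inventory would hold concrete phonological, lexical, and syntactic cues.
FAMILY_MARKERS = {
    "Indo-Aryan": {
        "indo_aryan_phonology",
        "indo_aryan_lexicon",
        "indo_aryan_syntax",
    },
}

# Within a family: (features that must all be present, candidate languages).
DISTINGUISHING = {
    "Indo-Aryan": [
        ({"implosive_consonants"}, ["Sindhi", "Lahnda"]),
    ],
}


def identify(observed):
    """Return (family, candidate_languages) for a set of observed features."""
    # Step 1: choose the family whose markers overlap most with what was heard.
    best_family, best_score = None, 0
    for family, markers in FAMILY_MARKERS.items():
        score = len(markers & observed)
        if score > best_score:
            best_family, best_score = family, score
    if best_family is None:
        return None, []

    # Step 2: narrow down using features that set languages apart in the family.
    for required, candidates in DISTINGUISHING.get(best_family, []):
        if required <= observed:
            return best_family, candidates
    return best_family, []


if __name__ == "__main__":
    heard = {"indo_aryan_lexicon", "implosive_consonants"}
    print(identify(heard))
```

With the Indo-Aryan cues plus implosives, the sketch returns the family and the two candidates, which is the point where a human specialist would take over to verify.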
There were maybe only 100 or so languages within the geographical scope we worked in. If it had been worldwide, this would have been much harder, though I could make educated guesses based on my sketchy knowledge of the phonology of Eskimo-Aleut, Mon-Khmer, Niger-Congo, etc. The written texts whose languages I was asked to identify were much wider in scope, but in the few cases where I couldn’t recognize one right off, Google (as explained by psychonaut) would reveal it in short order. The only times the Google method failed me were when the text wasn’t a real language (gibberish) or was encrypted. I got a good laugh every time they brought me the “Lorem ipsum” text, which completely baffled the other staff; no one else had heard of it. I could also recognize all the writing systems right off (I had a copy of The World’s Writing Systems on my desk), which helped greatly.
The DC metro area where I live is very ethnically diverse, with immigrants from around the world, so anywhere you go you can overhear conversations in many different languages. I can make educated guesses like “that’s probably Quechua” or “that’s definitely Slavic,” but it’s harder to identify snippets of overheard speech from passersby on the fly without stopping them to ask, which I don’t do. Once at a picnic, though, I asked some nearby women what language they were speaking, and they told me Yoruba. By listening, paying attention, and remembering examples like this, I’ve continually built up my knowledge base.