The other day, I was looking for the book “Buddenbrooks” by Thomas Mann on the Project Gutenberg web site. It was published in 1901, so the original German version is out of copyright, but I was wondering if they had an English translation on the web site as well.
Unfortunately, I could only find the German version. Google Chrome offered to translate the page, so out of curiosity I tried it. The result was a bit better than I expected, probably good enough that I could get the gist of the novel if I wanted to read it.
But it made me wonder: if one of the data sets that Google Translate used to train its algorithm was an existing English translation of “Buddenbrooks” that is not out of copyright, would it theoretically be possible that asking for an English translation might return chunks of copyrighted text? And if that happened, who would be to blame for the copyright violation (if any) – me or Google?
It’s an interesting question, and the answer is likely to be unsatisfying.
Ground rules: copyright only protects against a few things, and one of those things is copying. If you separately come up with the same text, but without copying the original text, you do not infringe copyright. This is unlike patent law, where even if you come up with the exact same invention without copying the original invention, you still infringe the patent.
By this rule, if you did not directly copy the English translation, you should be fine. However, there is such a thing as “indirect copying”. See Thomson v Barton, where a kiwifruit container was deemed to be copied, even though the manufacturer was independently designing a container that met the requirements of a kiwifruit container specification.
By feeding the German version into a machine that was trained on the English version, it is possible that you would be indirectly copying the English version.
Of course, AI is a fast-developing area, and I personally think that the whole area is due for some disruption as AI becomes more widespread and more cases appear in the courts. There just isn’t enough certainty now to advise on these things.
Can a computer-generated translation be, itself, copyrighted? If I run a public-domain foreign-language text through translation software, can I copyright the result?
IANAL, but if you take a photo with a digital camera, the result is copyrightable. If you photoshop the results, that is also copyrightable. So using a computer or electronics to create a new work does not invalidate copyright. However, IIRC, the phone book is not copyrightable since there is no creative input; it is just a collation of data. Which one is Google Translate?
I’m going to guess: not copyrightable. No creative input. The real value of a human translation is the care that the translator (allegedly) puts in, not just to get the content correct but also to capture the essence and the mood of the original; it may not simply be word-for-word replacement. There are nuances which, presumably, mechanical translation may not be able to match.
For example - not a political jab, but a valid translation issue: Putin allegedly called Trump “brilliant”, which Trump took pride in. Some commentators said the correct translation of the word was more closely “brightly colourful” with overtones of lively character, not intellect. Choice of word can seriously nuance the translation. This is the original work that a translator brings that can be copyright.
It would seem to me that if you did your own human translation of a public domain work, that would be copyright. However, if you contest someone stealing it, you would have to show that it was simply lifted rather than that they did their own translation (e.g. by introducing a few hidden embellishments that could only come from copying).
However, if all you have is Google Translate output - you’ve done nothing creative. Anyone else could do the exact same thing, and get the exact same result.
Look up the ‘terms of service’ or license agreement for Google Translate. That’s where you’ll usually find copyright information about material derived from or produced with a tool.
You would be fairly safe. As far as you KNOW, the German version is out of copyright. The only evidence you have of a possible copyrighted English version is the quality of the output from Google Translate. If that material were protected by copyright, Google would be obligated to inform you of this fact and to clearly distinguish between copyrighted and noncopyrighted material.
Google is one of the heaviest recipients of DMCA notices (in terms of # of complaints received), so they’re pretty good about things like that.