How do academics looking for plagiarism define plagiarism?

Hi,
I understand plagiarism is passing some text (comment/analysis etc) off as your own, but what if one uses well-worn phrases or an insight that may be commonplace today without attributing that phrase or insight to someone one has read or borrowed it from? In other words, where is the boundary between plagiarism and freedom of thought (considering the fact that we all acquire some phrases and insights from sources we no longer remember)?
davidmich

There is a free program called Viper that will scan an essay for plagiarism. Different universities use different methods, but I guess it might ultimately depend on how good the tutor is at spotting it.

Of course you can plagiarise as much as you like so long as you cite the quotes?

Expand on the idea. Do not throw it out as something that will help what you are saying, but explain why you are putting it in there to help you say what you want to say.

I would say that in the vast, vast majority of cases there is nothing borderline about it: it’s whole sentences or paragraphs lifted directly from text that is never cited anywhere in the paper, so it’s not a matter of “I guess that phrasing just stuck in my mind” or “I just forgot to cite that quote, but I cited the others.” Plagiarism tends to be egregious.

Ambiguous cases are more likely to get a discussion, and spark a closer scrutiny of that student’s past and future work. If there is a pattern of lifting phrases and ideas, it usually will pop out at that point.

I’ve heard of companies like Turnitin, and Viper–but how effective are they, and who uses them?

May I widen the thread a bit? Not just how to define plagiarism, but how to catch it?
How has the problem of plagiarism changed over the past few years? And how has the definition changed?

I went to university before the internet existed, so plagiarism was pretty difficult to do, in any case. If you had a friend who had taken the same course a year or two earlier, he might give you a copy of his typewritten paper; basically there was no other way to access* somebody else’s work.

But today, obviously, anybody–whether 11 years old or 31–can google a paper of whatever length he needs, and turn it in as his own. How do the teachers prevent it–at all levels–middle school, high school, university?

Also–can Dopers currently studying at a university tell me stories of plagiarising? ("Friend of a friend" anecdotes welcome–but this is GQ–so try to put them in factual context–i.e. how common is it on your campus?)


*And the verb “to access” didn’t exist, either :)

I teach high school, and 1) we do a lot more of our writing in class than we did when I was in school and 2) papers tend to have much more specific requirements than was typical 20 years ago. I don’t have to “prove” a paper was lifted off the internet if it earns an F anyway because it doesn’t do any of the things it was supposed to do.

On the other hand, and more philosophically, I don’t really consider it my job to catch cheating. I mean, I take all reasonable precautions, but my job is to teach, not to accurately assign grades. My energy, my time, my worry goes into providing the best learning experience I can for my students, and I am not going to cut that short to spend time making 100% sure that little Johnny didn’t get a grade he didn’t deserve. I don’t really, deep down inside, care. Most habitual cheaters are clustered around the bottom-middle of their class, anyway, because they can’t cheat on everything, and they bomb the things that they can’t fake. So they aren’t keeping some diligent child out of college. And beyond that, it’s not worth taking time from that diligent child to make sure the cheater doesn’t end up with the C+ instead of the C- they deserve.

In a college course, where there is just one paper and 2 tests, it might be very different. But even then I suspect that the kid who rips off a paper whole-hog also tends to fail those tests.

That’s an interesting angle: How does it work?
Do you assign very specific questions to write about? I’m imagining an assignment like “explain what the author means by the sentence…xxxxxx, who said it, and why.”
That may prevent a high school kid from googling an essay answer, but does it mean you have fewer of the general discussions such as “what is the theme of this book?”

Two anecdotes. A student once did a paper on Japanese temple problems and had some really neat illustrations. But they looked familiar and I looked up a Scientific American from about a half year earlier. It turned out that the illustrations were identical to those in the SciAm article. The text was rewritten from the actual article, but no actual sentences were copied. Had the student cited the article, it would have been hard to make a plagiarism case (although copyright infringement might have been an issue). But not only didn’t he cite the article, his bibliography consisted of three of the four references in the article. The fourth citation in the article was to something written by the author of the article. So it seemed clear to me (and to the dean) that he was trying to avoid even mentioning the name of the article’s author. Of course, had he cited the article, I would have given him a low mark because he had obviously put very little into it.

The second case is when I was accused of plagiarism by a French mathematician. I had worked on the same question that he had, and he had published his results in a private journal that only he and his students ever published in and that was not distributed outside France. I had never seen it and obviously my paper had nothing in common as far as the exposition (there was certainly an overlap in the results–mathematics is like that). He wrote to the editor of the journal where I published it accusing me of plagiarism (I assume that’s what “le plagiat” means). As it happened, the journal I published in was also in France.

To me, failure to cite closely related results is not plagiarism, but poor form.

My mother was a high-school chemistry teacher (now retired). One year (this would have been about 10 years ago), she asked her students to write a brief biography of a chemist of their choice. I remember being home over the Christmas holiday showing her how to Google unusual turns of phrase from the submitted papers, and usually getting hits pretty easily.

In subsequent years, she changed the assignment to, “imagine you are a famous chemist (of your choice). You have come back to your high school to accept an award as a distinguished alumnus. Write your acceptance speech.” It forces them to write the paper in a particular, uncommon way, which means they can’t just lift a biography of Lavoisier off of Wikipedia.

We use Turnitin here on campus. What it does is compare the paper with text from sites on the Web, standard reference books, and other papers students have submitted to TurnItIn. Each paper gets an “originality score” which tells what percent of the paper is found elsewhere, and gives an indication where the similar text was found. It’s supposed to ignore properly footnoted text, but that’s hit or miss. Scores of 20% or less are usually routine and don’t indicate plagiarism, since the matches can be properly referenced, or there are just only so many ways you can state a fact (for instance, “Abraham Lincoln was shot by John Wilkes Booth in Ford’s Theater” probably shows up in hundreds of papers discussing Lincoln).

Scores closer to 100% indicate some plagiarism – or sometimes the student was allowed to resubmit an edited paper and TII compares it to her earlier draft. The number is not meant to be an absolute judgement, but to flag the paper so the instructor can determine if this is plagiarism.
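To make the “percent of the paper found elsewhere” idea concrete, here is a rough sketch in Python. It is not how TurnItIn actually works (that isn't described in this thread); it just assumes one simple approach, counting what share of the paper's five-word runs also turn up in a pile of source texts. The names `word_ngrams` and `overlap_percent` are made up for the illustration.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of n-word runs in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_percent(submission, sources, n=5):
    """Percent of the submission's n-word runs that also occur in any source."""
    submitted = word_ngrams(submission, n)
    if not submitted:
        return 0.0  # too short to score
    known = set()
    for source in sources:
        known |= word_ngrams(source, n)
    return 100.0 * len(submitted & known) / len(submitted)


source = "Abraham Lincoln was shot by John Wilkes Booth in Ford's Theater."
paper = ("As every textbook says, Abraham Lincoln was shot by John Wilkes "
         "Booth in Ford's Theater. My own argument starts elsewhere.")
print(overlap_percent(paper, [source]))
```

A stock sentence like the Lincoln one pushes the number up even in an honest paper, which is why the score can only flag a paper for a human to read.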

I don’t think the definition has changed: it’s passing off someone else’s work as your own. But the problem has changed due to the Internet. Plagiarists are basically lazy, and it’s pretty easy to find text on any subject on the Internet and copy and paste it.

As for finding plagiarism, it’s actually easier to determine it now than it was in the past. You may not recognize something taken from an obscure book, but now, since the source was probably taken from the Internet, you can Google it. Even before we had TurnItIn, professors would search on phrases and sentences they thought were plagiarized, and would find the source. That’s how you can do it today if your school doesn’t have TurnItIn (or WebAssign).

My school (Western Governors University) also uses TurnItIn, which is integrated with our Taskstream website where we turn in our work. They encourage students to run a TurnItIn report for our work before we submit it, and to do rewrites as needed. TurnItIn makes it super simple to do this - it literally highlights the words that match other sources, so you can go in and jigger with the wording or add any quotation marks and citations you may have missed.

Anything coming back with more than 30% matches in general or more than 10% matches for a single source needs to be rewritten before submission. There is some wiggle room allowed by some teachers for some assignments. My last one was a PowerPoint presentation with bulleted points. So I was allowed to change the TurnItIn settings to ignore References and ignore any strings less than 20 words long. There are only so many ways to bullet “Fluid Mosaic Model of Plasma Membrane,” after all. At the default settings, I matched another student submission 25%; without the References, 8%. My largest area of match was in red:

Roles of Fat/Fatty Acids in the Body
-Energy Storage
-Cell Membrane Structure
-Absorb and Store Fat Soluble Vitamins
-Insulation and Thermoregulation
-Cushion Vital Organs
-Cellular Signalling
-Aids in Inflammatory Immune Response
(Sanders)

So then I get to decide: am I okay with that, or do I want to reword it? Given the bullet thing, I decided to let it be and submitted it. If it had been a narrative report, I might have changed it up a bit, but the reason we match is that’s just the best way to say it for effective and concise communication.

To be honest, I was a bit surprised at how easy it was, and how *little* my ~200 word PowerPoint matched everyone else’s ~200 word PowerPoint. I was expecting much more of a headache.
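For anyone who wants the rewrite rule from my school spelled out, it boils down to two comparisons. This is only a sketch of the rule of thumb as I described it (over 30% overall, or over 10% from any single source); the function name and the dictionary of per-source percentages are my own invention, not anything TurnItIn or Taskstream exposes.

```python
def needs_rewrite(overall_pct, per_source_pct, overall_limit=30.0, source_limit=10.0):
    """Rule of thumb: rewrite if the overall match or any single source is over its limit."""
    worst_source = max(per_source_pct.values(), default=0.0)
    return overall_pct > overall_limit or worst_source > source_limit


# The two figures from my PowerPoint: default settings, then with References ignored.
print(needs_rewrite(25.0, {"another student submission": 25.0}))
print(needs_rewrite(8.0, {"another student submission": 8.0}))
```

With the default settings the single-source limit is what trips, even though the overall figure is under 30%.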

If you are writing the paper, the general rule seems to be, “Talk to the instructor (if for a class), to the advisor (if writing a thesis or dissertation) or the editor (if for an academic journal)”.

There is a body of actions that are considered plagiarism almost anywhere, including wholesale copying and extremely tight paraphrasing. Beyond that, there can be wiggle room that is dependent on the rules of the context in which you are writing. For example, some classes have a hardliner approach against reusing material you had previously written (e.g. so no beefing up that high school paper with additional citations and analysis and turning it in for that 200 level college class), but some are ok with it if you ask the professor and agree to do at least X amount of new research, since the point really is to prove you can write an essay, not that you wrote the essay during the semester.

Ha! As a university teacher I have had a couple of pre-internet cases where a student had simply copied large chunks from the readings I had assigned the class. In the first case some of the wording was changed a little bit. In the second, he didn’t even bother to do that.

I also had a case, at a very elite (and strictly meritocratic) university, where one student told a classmate (and roommate, I think) that he would hand his essay in for him, then copied out his classmate’s essay with some minor alterations (mostly making it less grammatical, or less well expressed), then handed both essays in to me in person. The fact that they were next to each other in the pile when I graded them made it easier to spot the similarities!

Of course, those were the ones I caught. I expect many students got stuff past me by copying from sources with which I was less familiar.

Yes it did.

What action, if any, did your editor take?

You would probably be surprised by how little it takes for a particular phrasing to be unique.

Take, for example, the previous sentence (or this one). Put either one into Google with quotes around it, and there are 0 results found. No one has ever written either one word for word (in a source Google knows about, which is rapidly approaching all public written knowledge). Yet neither one uses particularly uncommon words or expresses a unique idea. Most people’s working vocabulary is 10-20k words, and the number of ways to combine those into sentences of many words is really big.

Even short sentences that retread and echo very well known writing often get no hits (“For sale: Chevy truck, never ran” is only six words long, they’re all common, and only three are changed from the original, yet still: 0 hits)
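To put a rough number on “really big”, here is a back-of-the-envelope count in Python. It assumes the low end of the vocabulary range above (10,000 words) and a ten-word sentence, and it counts every ordered string of words, grammatical or not, so treat it as a generous upper bound rather than a count of sensible sentences. The function name is just for illustration.

```python
def word_sequences(vocabulary_size, sentence_length):
    """Number of ordered word strings of the given length (most are not real sentences)."""
    return vocabulary_size ** sentence_length


count = word_sequences(10_000, 10)
print(f"{count:.1e}")  # 1.0e+40
```

Even if only a vanishingly small fraction of those strings make sense, what is left is still far more than has ever been written down.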

It’s quite easy to determine plagiarism in the vast majority of cases. Sure, an automated system might get a false positive or two by using a very common phrase. But anyone grading papers should be sufficiently well-versed in the topic to recognize common ideas and statements. And generally, someone’s not going to plagiarize a single line. They’re going to grab entire paragraphs and modify them slightly. After all, putting together a bunch of sentences from different sources and changing the wording of each one is probably more work than just writing the damned paper.

Law students are particularly in danger of being cited: a lot of the time we actually have to lift whole passages verbatim from case law, and sometimes the citation is really obscure. Or in law French.

I tend to catch deliberate plagiarism in two main ways:

  1. When the writer’s style or voice shifts in the middle of a paper. If you don’t read or write much, you don’t have any idea how obvious this is to those of us who do read and write a lot. Equally, the game of just changing some words to near-synonyms to fool plagiarism software sticks out like a sore thumb.

  2. Stupid cut-and-paste errors, often to do more with formatting than the words themselves: smart quotes go to straight-up-and-down quotes, or the font size or spacing shifts.
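The quote-mark tell in point 2 is easy to check mechanically. This is only a sketch of one possible check, not a tool I actually use: it labels each paragraph by the kind of quote marks it contains and reports where the style changes from one quoted paragraph to the next. Fonts and spacing would need the original file rather than plain text. The function names are hypothetical.

```python
CURLY = set("\u2018\u2019\u201c\u201d")   # smart single and double quotes
STRAIGHT = set("'\"")                      # typewriter quotes


def quote_style(paragraph):
    """Classify a paragraph as 'curly', 'straight', 'mixed' or 'none'."""
    has_curly = any(ch in CURLY for ch in paragraph)
    has_straight = any(ch in STRAIGHT for ch in paragraph)
    if has_curly and has_straight:
        return "mixed"
    if has_curly:
        return "curly"
    if has_straight:
        return "straight"
    return "none"


def style_shifts(paragraphs):
    """Indexes of paragraphs whose quote style differs from the previous quoted paragraph."""
    shifts = []
    previous = None
    for index, paragraph in enumerate(paragraphs):
        style = quote_style(paragraph)
        if style == "none":
            continue  # nothing to compare in this paragraph
        if previous is not None and style != previous:
            shifts.append(index)
        previous = style
    return shifts
```

A flagged paragraph is only a reason to look closer; plenty of honest writers paste their own notes in from another program.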

Maybe not for papers, but when I was TAing a programming class in grad school almost 40 years ago one of the other TAs developed a program to compare two programs. It could handle the plagiarist renaming variables. We didn’t catch too many people, but I’m sure the existence of the program helped reduce the incidence of plagiarism.
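I don't have that TA's program, so the following is just a sketch of the general idea in Python, for Python source: replace every identifier with the same placeholder, then compare the two token streams. It uses the standard `tokenize`, `keyword` and `difflib` modules; the function names `normalized_tokens` and `similarity` are mine.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types that carry layout or comments rather than program logic.
SKIPPED = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
           tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalized_tokens(source):
    """Token strings with every non-keyword name replaced by 'ID'."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")  # variable, function and built-in names all look alike
        else:
            tokens.append(tok.string)
    return tokens


def similarity(source_a, source_b):
    """Ratio between 0 and 1 of how closely the normalized token streams match."""
    return SequenceMatcher(None, normalized_tokens(source_a),
                           normalized_tokens(source_b)).ratio()


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_up(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
print(similarity(original, renamed))
```

Because names are erased before comparing, renaming every variable changes nothing. Reordering statements or restructuring loops would lower the score, so a real checker would need more than this.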

And stupid. In my experience one of the big reasons to cite closely related work is that the author of that work may be reviewing your paper, and it is good to butter her up.

I noticed that long ago. Rather common and rather short sentences get no results. However, I had assumed until now that the reason was that a Google search was far from complete (for instance, that it searched for only, say, 1/50th of a second, beginning with the most linked-to pages).

Are you sure that a Google search actually peruses all accessible material?