Anyone Seen AI Submissions?

We’ve all seen, I’m sure, that Clarkesworld and Asimov’s got flooded with AI submissions. It’s not ending there. I just looked at the submitted paper list for a conference in India where I’m on the program committee, and easily 20% of the papers submitted have nothing to do with the topic. I looked at a couple of them and they are hilariously bad.
The best so far has five authors from a real but scammy college in Bangalore, two of whom have suspicious names and three of whom have email addresses that are identical except for three digits at the end. The writing is dubious at best. The previous work section has generic talk about “previous work” with no reference to any particular papers, and the fourth of four references has no authors, just a date, a title, and some numbers. I didn’t bother to check whether the other three exist.

Anyone else involved with submissions seen this yet?

I received an analysis report on a thermal simulation that purported to explain why the sim differed from flight data. As I delved into the overly convoluted treatment of statistical mechanics and thermofluid principles applied to what should have been a straightforward heat transfer problem, I came to the realization that it was a snow job of complete gibberish that had passed through several other reviewers, who presumably looked at it and decided that anything that complicated to read must be correct (or, more likely, just didn’t want to put forth the effort). When I pressed the author about why the analysis made no sense and why he had gone to such lengths to conceal it, he fessed up that he’d run variations of the problem through some chatbot repeatedly and stitched the responses together into something superficially convincing even though it was actually utter nonsense.

There are real applications for machine learning and abstraction in the interpretation of scientific data, and likely even an essential need for them in order to make progress in areas like protein simulation, cosmology and astrophysics, et cetera, but using generative AI to produce textual explanations is a dangerous and stupid path to go down because these systems aren’t (and likely will never be) validated for producing factual assessments; they just generate verbiage that sounds good but is often semantically void and factually wrong. And they’ll certainly get better at sounding good (and likely at manipulating people who read or view the output) to the point that it may actually become very difficult for even expert peer reviewers in the field to penetrate a dense buffet of bullshit without essentially redoing the research themselves.

I’ve already heard several recommendations for using generative AI to perform some kind of ‘replication studies’ rather than having human scientists perform the real work, under the thesis that the AI will be much more efficient. But of course, if you give a chatbot a prompt to go and validate some experiment through metastudies and simulation, it will naturally respond with a bunch of good-sounding nonsense telling you exactly what you want to hear, complete with fraudulent citations and bogus rationale.

I can see a day coming when verified facts are as rare in scientific publications as they are in politics (as if we didn’t already have enough problems with image and data fraud, cherry-picking of experimental results, and p-hacking), and it does not look good.

Stranger

I’m pretty sure my first book was reviewed in a YouTube video by an AI. Does that count?

The review is only purportedly of my book; most of the statements could apply to 85-90% of the books, scholarly articles, and studies written about being genderqueer and what genderqueer people have to navigate in the modern social world. So it’s as if someone had been charged with “write a review about a book that’s about being genderqueer” but hadn’t read it. And it has that AI feel.

My wife teaches at a community college. She gets AI-generated assignments all the time. Very easy to identify. The kids earn zeroes.

You are right that the future is bleak when it comes to the reliability of information. Part of the problem is that the harm done by destroying trust in important institutions and systems is vastly underappreciated by the general public, and even by the so-called experts who participate willingly in unethical and destructive behavior for short-term gains.

The only solution I can see is that the repercussions for such behavior become severe enough that people at least think twice. Unfortunately, our society holds students and petty criminals to a much more draconian standard for behaviors that are orders of magnitude less destructive.

AI and unethical fools are currently undermining the whole foundation of knowledge. That’s scary stuff.

Maybe they are going to have to start paying reviewers, since unfortunately some are very superficial in their reviews, which is not surprising when the work is done for free and you are overly busy anyhow.
In this case it helped that the ones I looked at were not at all relevant to the conference, but since I’m not program chair I didn’t look at some of the papers with relevant titles. For all I know some might be just as bad.
I’m judging an independently published book contest, and a couple of the books had identified AI content. One was as awful as you’d expect. The other had human generated poems with AI generated art. The rules this year didn’t prohibit this kind of entry.
The only solution is going to be requiring AI content to be identified, and a publication death penalty for anyone caught trying to sneak AI in. Kind of like how plagiarism is handled.

Unfortunately, I expect generative AI will get better to the point that it will truly require an ‘expert’ peer to distinguish between real work and impressive-sounding nonsense, and for non-technical work it may become nearly impossible to definitively distinguish the work of human authors from that of AI. I’ve seen proposals for textually ‘watermarking’ AI output such that a pattern recognition algorithm could differentiate even if a human reader could not, but I doubt there will be broad willingness to comply. Companies are already trying to integrate chatbots and other generative AI into their workflows to reduce labor costs (headcount) despite how inconsistent and unreliable the output is, and I’m sure that will continue at an accelerating pace.

Stranger

Peer review has never been perfect. I know of a paper in a highly prestigious journal that “proved” P = NP and got retracted the very next month.
I think we could train reviewers on what to look for. We might need to require that experimental results be submitted rather than just trusting that they are available. It shouldn’t be too hard to get reviewers who know the references or can look them up. If a Previous Work section from an AI accurately summarizes real papers and where they fall short, I think we might need to pay attention to it; I’m not expecting that to happen any time soon. Right now we trust the authors, so we more or less believe the summaries. With AI we’d have to check for hallucinations.
I don’t believe watermarks are ever going to solve the problem, since people could just copy and maybe even slightly reword the AI output. I’ve seen an article about a method of poisoning source material so that humans see a real paper or picture, but an AI creates (even worse) crap from it.
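For the curious, the watermarking proposals I’ve seen work roughly like this: the generator nudges its word choices toward a pseudorandom ‘green list’ keyed off the preceding token, and a detector re-derives those lists and runs a statistical test on how often the text lands on them. Here is a minimal sketch, purely for illustration; the hashing trick, the names is_green and watermark_z_score, and the GREEN_FRACTION parameter are all my own inventions, not any production scheme. It also shows why light rewording dilutes the signal: every changed token is a fresh coin flip.

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # assumed fraction of the vocabulary marked "green" at each step

def is_green(prev_token: str, token: str) -> bool:
    # Hypothetical green-list test: hash the (previous token, current token)
    # pair and check whether it lands in the green half of the hash space.
    # A real scheme would partition the model's actual vocabulary instead.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def watermark_z_score(tokens: list[str]) -> float:
    # Count how often the text lands on its green list and compare to what
    # unwatermarked text would do (a binomial with p = GREEN_FRACTION).
    # A large z-score suggests the text was generated with the watermark on.
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    std_dev = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std_dev

# Ordinary human text should hover near z = 0; watermarked output scores high,
# but every reworded token pushes the score back toward zero.
print(watermark_z_score("the quick brown fox jumps over the lazy dog".split()))
```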
It takes an expert to distinguish crap research from good research, so if it takes the same expert to distinguish real research from AI research we may not be too far behind. If an AI gets good enough to extract real information from papers and research results, and come up with something new, it might deserve some credit and might actually advance the field. That’s far beyond what we have now.
As for regular business, ever write a press release? My wife does, and I’ve been involved with the process, and an AI could do just as good a job as the average person.
My comments come from hundreds of reviews I’ve written and thousands of reviews I’ve read as an editor and program chair.

Yeah, pretty much. What chatbots generate is hardly flawlessly accurate, but then, neither are most press releases. They’re nowhere near that with technical details at the depth of even a conference paper, much less a peer-reviewed journal, but the output is often just convoluted enough that reviewers don’t try to pierce the bullshit, and as grad students and PIs become more desperate to crank out papers even when their research stalls, I can see people pushing enough nonsense that it is no longer possible to go through all of it with sufficient rigor. I have a colleague using a GPT-4-based bot to summarize papers (as a precursor to doing an actual lit review), and although the summary is often wrong, it is fluent enough at assembling technical jargon that what it spits out seems authoritative even when there are obvious errors.

Of course, people point out that chatbots are not physics engines; they are designed to parse natural language and produce responses that are intelligible and statistically likely to seem appropriate, and so (they say) the fact that such systems can ‘understand’ physics and mathematics at all is an unexpected and emergent capability that demonstrates some deep and broad model of the real world, and thus they are on a path to general artificial intelligence, demonstrate a ‘spark of consciousness’, et cetera ad nauseam. I don’t buy that argument, in part because there is no means by which these systems can actually relate any of the concepts that may be semantically embedded in language (which, as any computational linguist will tell you, has an in-built logic by the way it is used) to phenomena in the real world, and also because these systems simply don’t have the complex interoperating set of mechanisms that underlie true cognition in even simple animal brains, much less actual sapience or ‘consciousness’ (however you define it).

What these systems do have is a transformer-based neural network trained on an enormous amount of textual data, which lets them generate cromulent-seeming text that approximates a correct answer for the casual reader even if an expert in any particular area can immediately spot errors, if not in fact then in usage or underlying logic. But as they get access to more data and more refined training, I’m sure their capability to seem authoritative will increase until they become superlative bullshit machines. I am uncertain, however, when or if they will actually become reliable general knowledge generators or parsers, or indeed, how they can ever be validated to be accurate and not produce seemingly correct but actually fake information and data. I personally prefer to focus on the use of machine learning for pattern recognition (or perhaps a better term is ‘divination’) as a tool to assist researchers in managing enormous volumes of data that otherwise could not be processed and evaluated in a human lifetime, rather than on the nifty trick of a bot that spews impressive nonsense.
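To make that concrete, the core operation underneath all the fluency is just repeatedly sampling the next token from a probability distribution conditioned on the text so far. A toy sketch, with scores I made up for illustration (a real model produces them over a vocabulary of tens of thousands of tokens; the function name and numbers here are hypothetical):

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float = 0.8) -> str:
    # Softmax the scores and sample: the model picks whatever *sounds* likely
    # given its training text, with no check against the real world.
    weights = [math.exp(score / temperature) for score in logits.values()]
    return random.choices(list(logits), weights=weights, k=1)[0]

# Invented scores standing in for what a trained transformer might emit for
# the next word after some prompt.
toy_logits = {"plausible": 3.1, "correct": 2.7, "nonsense": 1.9, "banana": -0.5}
print(sample_next_token(toy_logits))
```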

Stranger

My employer (which deals with production of regulatory disclosure documents for investment companies in European finance) is trying to figure out how to integrate LLM tools into our processes to accelerate and simplify the content generation steps. The fact that the output quality is demonstrably so unpredictable (ranging from okay to dodgy to total nonsense) has not caused their enthusiasm to flag, despite the fact that these are legal materials for which our clients (and, ultimately, we) will be held accountable. The siren call of “fast, easy, and cheap” is just too powerful.

However, I did make an argument that has given them pause: If we are using tools that anyone can use to generate output that anyone can generate, then why would our clients want to pay us for the work? Our market differentiator is our technical and regulatory expertise, and if we eliminate that, we don’t have a business. That seems to have brought them up short.

(Of course, this leads to the question about whether our expertise in fact does have meaningful market value, and whether an attempt to rely on it will cause our client base to move onto “fast, easy, and cheap” without us, leaving us without a business anyway.)

Nevertheless, I find it astonishing that a business in the legal-compliance sector is apparently willing to put their faith in a robo-writer which not infrequently produces gibberish that is at best meaningless and at worst legally unfounded.

My personal advice, whether you are a lawyer or a mathematician, is that if you are going to cite a paper, take the time to read it yourself and go through it carefully. That was true before the proliferation of computer-generated submissions and the Sokal and Bogdanov affairs, and it is still true now.

You’d hope that you’d only reference papers you actually read. But there are standard background papers in many fields. IEEE magazines had (and maybe still have) strict limits on the number of references, while Transactions allow tons.
But people are lazy. I wrote a survey paper in grad school, and five years later over half the papers in the field’s workshop referenced it, since it allowed authors to cite many background papers with a single reference.

Not all sections of a paper are equally amenable to this. I suspect an AI will be able to generate a superficially convincing introduction section fairly soon, since many introductions are practically identical. I don’t think they’ve taken the first step toward understanding other papers well enough for a previous work section, which is not a paper summary but just covers the salient points of the other papers. Grad students seem to be able to do this easily.
I’m dubious that an AI can do experimental design any time soon; the papers I saw didn’t even seem to try. Results, maybe, but they may come from nowhere, and the same goes for conclusions.
Basically a previous work section requires a lot more semantic understanding than an introduction section.
But someday soon there will be a paper with the results copied from an old Vladivostok telephone directory, mark my words.

Agreed on all points. However, grad students cost money (not much, but still) while generative AI time is cheap, and as universities become more and more about generating ‘results’ rather than actually producing well-trained researchers or advancing the boundaries of science, the compromise of less useful and possibly wrong publications is the kind of trade that many are willing to make in the name of “productivity”.

As a side observation, I read a lot of older papers because there is much in propulsion that falls under the dictum “everything old is new again”, and it is quite apparent, despite the primitive figure graphics and typesetting, how much clearer papers written even two or three decades ago are than those being published today. In part there is just less jargon and less assumed discipline knowledge, but in large measure the details are simply laid out in plainer language without embellishment or any attempt to inflate the value of the results, and the data is presented clearly. I haven’t encountered the degree of systematic fraud I see reported in medical, nutritional, and bioscience papers, but it would also be more difficult to conceal such machinations even without access to raw data. I think so many current scientific and technical papers have become so obfuscated that it is easy to believe a generative AI could produce something of equal merit, even though, as you note, they are nowhere near being able to actually ‘understand’ or frame original experimental work or observations in a real context.

Of this, I have no doubt.

Stranger

I just got a scientific article to peer review that was obviously written by a chatbot. I didn’t bother going through every little detail, just recommended rejection with a note to the editor: “This is absolute garbage. Next time, look at these before you send them out to me.”

I have better things to do with my time.

Pointless sidetrack, but I can tell you from personal experience - grad students are friggen EXPENSIVE. Mine cost me >$50K per year…

Compared to a postdoc?

Stranger

It varies among institutions, but for me they’re about the same, give or take, because I have to pay the grad student’s tuition as well. A grad student is about $10K cheaper than a postdoc, but even that varies because I take students from multiple programs and the stipends can vary wildly.

It can be cheaper at other places.

On edit: you might ask why take students at all? First, I’m evaluated on many criteria, one of which is how many grad students I graduate. Second, historically, my grad students are, on average, actually more productive than postdocs: students HAVE to have 3 data chapters in their theses, which translates to ~3 papers. I’ve had postdocs who published nothing. Grad students can’t get away with that.

Holy inflation, Batman. 50 years ago I made $400 a month in central Illinois and was able to save some of it.
My field doesn’t do postdocs since it is very easy to get a good researchy industry job without one.

I mentioned someplace else that if a paper is old enough not to have been entered into the online paper repository, it might as well not exist. I haven’t seen this effect, but I don’t doubt it. Sometimes the Least Publishable Unit, as a friend of mine calls it, gets smaller and smaller, so you have to make up for having only a smidgen of useful results with a lot of blab.
Which reminds me of another way to tell the difference between a real paper and an AI paper. One of them always begins with a statement that the industry will just collapse if this 1% improvement in the speed of an algorithm is not adopted. The other is the AI paper. (Grad student papers are awful about this.)

Does her school have a policy yet that equates using AI with plagiarism, i.e., it will get your butt kicked out of school?

If a grad student (or post-doc!) costs $50k of research funds, not all of that goes directly to support them; obviously the university gets half or so.