Is it cheating to use AI-generated text in a Steem post?
Discussing the use of AI-generated text (neural texts) with the Steem blockchain.
Background
Pixabay license from Markus Winkler at source.
One of the things I have noticed over the years is that the things we consider to be "abusive" here on the Steem blockchain are often not unique to Steem. For example, academic literature and the news media both have their fair share of plagiarism, and phishing also happens in email.
In a recent article in The Communications of the ACM, Carlos Baquero asked the question, Is Having AI Generate Text Cheating? I think most of us would answer that question "yes" as a knee-jerk reaction, but Baquero raises a number of points for consideration. He notes, for example, that new technology is often challenged by questions of fairness and resisted by the current incumbents. When the benefits are clear, he argues, this challenge doesn't last very long, and he gives examples of technologies where usefulness triumphed. These include boats vs. swimming, writing vs. memory, and grammar checking for writers.
Baquero goes on to point out that AI has seen a recent acceleration in usefulness, and discusses the topic from five vantage points: Blended writing (mixing human and AI content in a single document); Authorship (does the human or the AI get credit for the work?); Separating wheat and chaff (distinguishing AI content from human content); AI as an oracle (human prompting - or leading questions - to drive AI responses); and The Muse (using AI as a tool to circumvent writer's block).
Making the point about increasing usefulness, he offers the following bit of text - which was generated by an AI system:
Some people argue that using AI-generated text is cheating, as it gives the user an unfair advantage. However, others argue that AI-generated text is simply another tool that can be used to improve writing.
In this post, I'm going to look at the question of using neural texts from the perspective of a participant on the Steem blockchain. As readers will likely be aware, this is especially relevant because the Steem blockchain distributes rewards to accounts that post textual content.
For whom do we work?
Anyone who is participating in content creation or voting on the Steem blockchain can be said, in a sense, to be working for the blockchain, so it is necessary to ask: what sort of content does the blockchain want?
Clearly, we're all in agreement that the blockchain doesn't want plagiarized content or spam (well... one or two voting bots might disagree on this point ;-). But what about original content that wasn't necessarily created by a human? This sort of content is still unique to the blockchain, it still drives clicks from search engines, and - if the quality and relevance are high enough - it might even attract readers and commenters. (Look at something like twitter.com/theWhaleAlert, for example.)
To the best of my knowledge, the desired level of quality isn't really available in AI systems at the moment, but who knows what might be possible in a year or five or ten?
So, does the blockchain want exclusively human content, or does it want content that attracts human attention without regard to whether the creator was human or silicon? I don't know the answer.
Two big challenges here are pointed out by Baquero: (i) For non-fiction articles, AI is very likely to include factually incorrect information - especially at today's levels of quality; and (ii) it is nearly impossible to distinguish texts generated by humans from texts generated by AIs. On the latter point, he notes:
To us humans, the more pressing concern is whether we can distinguish human from AI-generated content.
The answer is negative, as the current systems are already very good at fooling humans. Benchmarks of GPT-3 on the human accuracy at identifying if a short text, of about 200 words, was machine-generated lead to a result of 52%, almost equal to random guessing, expected at 50%.
and he goes on to say that detecting AI-generated texts is an "open question". The only tool I'm aware of for this purpose is the Giant Language model Test Room, and to me it definitely seems to be more of an art than a science.
In the end, having an AI creating full articles out of whole cloth is a thorny issue. I'm willing to say that it may be appropriate at some level of technology, but I don't think we're near that point yet. Unfortunately, detecting this sort of activity is very difficult, so it may almost be a moot question. But what about other use cases?
Blended authorship
As we saw in the introduction, there are times when an author might want to include AI text in the content of an article. Conceptually, it's not much different from including a quote from another human author.
So what is the difference between using a computer for grammar checking vs. using a computer for AI text creation? If there are times when AI text is generated, are there guidelines for when and how to use it? For a start, Baquero links to a set of guidelines for posting the content to social media or for coauthoring with an AI system. The guidelines for posting to social media include the following:
- Manually review each generation before sharing or while streaming.
- Attribute the content to your name or your company.
- Indicate that the content is AI-generated in a way no user could reasonably miss or misunderstand.
- Do not share content that violates our Content Policy or that may offend others.
- If taking audience requests for prompts, use good judgment; do not input prompts that might result in violations of our Content Policy.
I guess there's some limit to what percentage of an article should be AI-generated, but in principle this seems OK in moderate amounts, especially with the use of guidelines like the ones above.
Abusive AI
So what constitutes abusive AI use at this point in time? As a starting point, I would argue that on the Steem blockchain, maybe the following characteristics should be considered as abuse:
- Posting AI-generated text that is not clearly identified as such.
- Overly frequent posting -OR- overly repetitive content within a single post (both forms of spam).
- Posting AI-generated text that is barely distinguishable (or indistinguishable) from gibberish.
- What else?
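The two spam criteria in the list above (overly frequent posting and repetitive content within a single post) are the ones that lend themselves to a mechanical check. As a rough sketch only, and not anything that Steem or its front ends are known to implement, here is one way they might be approximated in Python. The function names, the 24-hour window, and the `max_per_day` threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def repeated_sentence_ratio(text: str) -> float:
    """Fraction of sentences in `text` that duplicate an earlier sentence."""
    # Naive split on sentence-ending punctuation; good enough for a sketch.
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip().lower() for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0
    counts = Counter(sentences)
    # Every occurrence beyond the first counts as a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(sentences)


def posts_too_often(timestamps: list[datetime], max_per_day: int = 4) -> bool:
    """True if more than `max_per_day` posts fall within any 24-hour window.

    `max_per_day` is an arbitrary, assumed threshold, not a Steem rule.
    """
    ordered = sorted(timestamps)
    window = timedelta(hours=24)
    start = 0
    for end, ts in enumerate(ordered):
        # Slide the window start forward until it is within 24 hours of `ts`.
        while ts - ordered[start] >= window:
            start += 1
        if end - start + 1 > max_per_day:
            return True
    return False
```

Neither check says anything about whether a text was written by an AI; they only flag the spam-like behavior, and any threshold would still need human judgment behind it.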
Conclusion
So, my answer to the question, "Is Having AI Generate Text Cheating?" is, "It depends".
If it's clearly identified as AI-generated, and it's accompanied by a sufficient amount of human text, then I'd say it's ok. If someone is attempting to pass off AI-generated content as human, then I'd say it becomes more problematic.
The next questions are the harder ones: How do we identify malicious AI-generated text, and what do we do about it? I have tried using the Giant Language model Test Room, but that is time-consuming, and it doesn't really provide certainty. At some point in time, if we don't have answers to these questions, then the default answer is that we are welcoming it.
In the end, these may not be purely theoretical questions. I am aware of a number of accounts that are making daily posts which I suspect to be generated by some sort of AI platform. And they are receiving substantial upvotes from a particular user. I'm not naming the accounts because I cannot prove that their content hasn't been created by humans, but this is a question of some relevance.
If you'd like more on the topic, I definitely recommend clicking through to read Is Having AI Generate Text Cheating? And please reply here with your thoughts about the use and abuse of neural texts on the Steem blockchain.
Thank you for your time and attention.
As a general rule, I up-vote comments that demonstrate "proof of reading".
Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.
Pixabay license, source
Reminder
Visit the /promoted page and #burnsteem25 to support the inflation-fighters who are helping to enable decentralized regulation of Steem token supply growth.
Are there users posting AI generated content? Interesting. I haven't noticed it.
Regarding your question... I think you've outlined the situations very well. I think it's okay if we use it as a sort of "extra tool", and not as a replacement.
Anyways, I've wondered if I could use it, but since I write about news, I don't see how I can.
I can't be sure, but yeah, I think there are a bunch of them.
I agree with this point:
AI needs a lot of training data, and that almost certainly wouldn't exist for news stories.
Not to sound condescending, but people with real intelligence don't need to rely on Artificial Intelligence to make a point. But with that, as long as a person is honest about their sources, it doesn't matter to me what they consult to substantiate their content.
I agree on both points. The AI is only as good as its training data, but as long as the text is not posted in a misleading way, I think it's ok to use (within limits).
The only thing I hope is that we humans are the ones who are first, that artificial intelligence does not surpass ours.
I wish you a happy day
Any copied text is plagiarism. I've seen this kind of fraud before. The text was taken in Spanish, then translated into English through a translator, and all the text is unique! No plagiarism checker in the world will determine that the text has been copied. This is only possible for people who go deeper into this problem. And an AI that writes texts is an interesting topic for discussion. I really have never encountered such a thing.
Absolute truth in this. I'm glad I didn't miss out on this.
To be honest, I never thought that an AI for writing texts existed. I used to use voice typing. That is, I speak my thoughts into the microphone and the program records them. But I gave up on this method. It makes a lot of mistakes when converting speech into written text. It would be interesting to read a text written by AI.
This is an interesting development: