RE: Chasing shadows: Is AI text detection a critical need or a fool's errand?

in Popular STEM, 7 months ago

I find the entire LLM and AI landscape fascinating. I’ve already set up my own privateGPT which allows me to use an uncensored LLM and I’m now experimenting with training it with my own dataset, which is going very badly.

GPT-4 (paid-for which I don’t have) allows people to upload their own training data - training data that can be generated using a web scraper of any website (all available for free on GitHub). This is where I can’t see any detection tool ever being reliable. You can even program it to “talk like (name)” which could be the person you’ve trained it to be.

As you suggest, it’s like trying to identify if maths homework used a calculator or not.

Even the idea of getting an identifiable “watermark” would be easily circumvented via privateGPTs.

It’s probably a very interesting area to research if you’re lucky enough to get paid to do it!

 7 months ago 

I’ve already set up my own privateGPT which allows me to use an uncensored LLM and I’m now experimenting with training it with my own dataset.

I tried setting up LLM Studio but it crushed my computer. One of these days, I'll modernize my hardware and try it again. The censorship, CYA, and scolding from big-tech's free implementations are insufferable.

GPT-4 (paid-for which I don’t have) allows people to upload their own training data - training data that can be generated using a web scraper of any website (all available for free on GitHub).

To me this is the biggest tragedy of the way that many of Steem's high-powered curators are curating. We're missing out on a massive opportunity that we should have been perfectly positioned for. If curators were valuing posts correctly, Steem would be an ideal platform for AI training and implementation, which could massively increase the value of the tokens. Instead, they're flooding the database with noise that's basically worthless if you're training for things that appeal to humans, or that do anything useful for that matter.

It’s probably a very interesting area to research if you’re lucky enough to get paid to do it!

Agreed. On the "detection" side and on the "adversarial attack" side.

I haven't tried LLM Studio yet - do you have the option of using your GPU for processing?

The PrivateGPT that I set up was much quicker with the GPU, even using my laptop. It was a pain to set up and took me pretty much a full day because the GPU needed CUDA in my WSL environment. It was so tricky that I deleted it all and gave up, but I tried a second time, which went more smoothly.
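If it helps anyone going down the same road, here is a rough sanity check for whether the NVIDIA driver is visible from inside the environment (WSL or otherwise) before spending hours on the rest of the setup. It is only a sketch: it assumes an NVIDIA card and that the `nvidia-smi` tool is on the PATH, and the function name `visible_nvidia_gpus` is my own, not something from PrivateGPT.

```python
import shutil
import subprocess


def visible_nvidia_gpus() -> list[str]:
    """Return the GPU names reported by nvidia-smi, or [] if none are visible.

    Assumption: an NVIDIA GPU whose driver exposes the nvidia-smi tool.
    An empty list only means nvidia-smi is missing or failed to run.
    """
    # nvidia-smi not on PATH: the driver is probably not exposed here.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # The tool exists but could not talk to the driver (or timed out).
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

An empty list here would suggest sorting out the driver side first. A non-empty list does not prove the CUDA toolkit itself is installed correctly, only that the GPU is visible.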

A very quick way to get a local GPT is Ollama: there are a few uncensored models available, and it takes about three or four commands to install and run.
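Once Ollama is running, you can also talk to it from a script through its local HTTP API. A minimal sketch, assuming the server is on its default port (11434) and that you have already pulled a model. The helper names `build_request` and `ask` are just mine for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally pulled model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With streaming disabled, the whole reply arrives in the "response" field.
    return body["response"]
```

Calling something like `ask("llama2-uncensored", "Say hello in one sentence.")` should return the reply as a string, where the model name is whichever one you pulled (that one is only an example). The first call can be slow while the model loads, hence the generous timeout.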

We're missing out on a massive opportunity that we should have been perfectly positioned for.

That's perhaps reflective of the majority-community perspective that AI generated content is bad. Full stop. There are only a handful of us now who want to do something different - the herd appear to be happy with their diary games and engagement challenges and choose not to look (or think) beyond that. Maybe aspiring to be a curator or representative themselves, or run a community. Nothing beyond "the template".

 7 months ago 

do you have the option of using your GPU for processing?

I don't remember, and I've already uninstalled it. If I get motivated, maybe I'll reinstall it and check. My PCs are so old, I think I'll need to modernize, though.

That's perhaps reflective of the majority-community perspective that AI generated content is bad. Full stop.

Yeah, true. And, to be fair, AI right now is very easy to misuse here. It's a conundrum.
