As I’ve been sharing examples of sites getting pummeled by the Helpful Content Update (HCU) or the October Spam Update, I’ve also been sharing screenshots from tools that detect AI content (since some sites getting hit are using AI to pump out a lot of lower-quality content – among other things they were doing that could get them in trouble). And based on those screenshots, many people have been asking me which tools I’m using.
So, instead of answering that question a million times (seriously, it might be a million), I figured I would write a quick post listing the top tools I have come across. Then I can just quickly point people to this post versus answering the question over and over.
And note, I’m not saying these tools are foolproof. I have just found them to be pretty darn good at detecting lower-quality AI content. And that’s what we should be trying to detect by the way (not all AI content… but just low-quality AI content that could potentially get a site in trouble SEO-wise).
For example, here is high-quality human content run through a tool:
And here is an example of lower-quality AI content run through a tool:
Again, it’s not foolproof, but it can give you a quick feel for whether AI was used to generate the content. Below, I’ll cover my favorite AI content detectors I’ve come across so far. I’ll also keep adding to this list, so feel free to ping me on Twitter if you have a tool that’s great at detecting lower-quality AI content!
Here is a list of tools covered in this post for detecting AI content:
- Writer’s AI content detector tool.
- Huggingface GPT-2 Output Detector Demo.
- Giant Language Model Test Room (GLTR).
1. Writer’s AI content detector tool:
The first tool I’ll cover is from a company that has an AI writing platform (sort of ironic, but it does make sense). From what I can see, the platform is geared more toward assisting writers; you can check out their site for more information. They also have a nifty AI content detector that works very well. You have probably seen my screenshots from the tool several times on Twitter and LinkedIn. :) Below are some examples.
Here is Writer’s tool detecting higher-quality human content:
And here is Writer’s tool detecting lower-quality AI content:
2. Huggingface GPT-2 Output Detector Demo:
If you’re not familiar with Huggingface, it’s one of the top communities and platforms for machine learning. You can check out their site for more information about what they do. Well, they also have a helpful AI content detector tool. Just paste some text and see what it returns. I have found it to be pretty good for detecting lower-quality AI content.
For example, here is Huggingface’s tool detecting higher-quality human content:
And here is Huggingface’s tool detecting lower-quality AI content:
3. Giant Language Model Test Room (GLTR):
The third tool I’ll cover was actually down recently, but I had heard good things about it from several people (when it was working). It turns out there was a server issue and the tool was hanging. Well, GLTR is back online now and I’ve been testing it to see how well it detects AI content.
The tool was developed by Hendrik Strobelt, Sebastian Gehrmann, and Alexander Rush from the MIT-IBM Watson AI Lab and Harvard NLP. It’s definitely not as intuitive as the first two tools I covered, but once you get the hang of it, it can definitely be helpful.
How it works:
You can paste text into the tool and view a visual representation of the analysis, along with several histograms providing statistics about the text. I think most people will focus on the visual representation to get a feel for how likely each word was to be predicted by the language model based on the text to its left. And that can help you identify if a text was written by AI or by a human. Again, nothing is foolproof, but it can be helpful (and I’ve found the tool does work well). To learn more about GLTR and how it works, you can read the detailed introduction on the site.
For example, if a word is highlighted in green, it’s in the top 10 most likely predicted words based on the text to its left. Yellow highlighting indicates it’s in the top 100 predictions, red in the top 1,000, and the rest would be highlighted in purple (even less likely to be predicted).
The fraction of red and purple words (unlikely predictions) tends to be higher when the text was written by a human. If you see a lot of green and yellow highlighting, that can indicate the text contains many highly predicted words based on the language model (signaling the text could have been written by AI).
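To make that color-coding concrete, here is a minimal sketch in Python (my own illustration, not code from GLTR itself) that maps a word’s prediction rank to the same four buckets the tool uses, plus a simple fraction-of-likely-words heuristic. The rank lists at the bottom are made-up numbers purely for demonstration:

```python
def gltr_bucket(rank):
    """Map a word's prediction rank (1 = the model's most likely
    next word) to GLTR's highlight color."""
    if rank <= 10:
        return "green"   # top 10 predictions
    elif rank <= 100:
        return "yellow"  # top 100 predictions
    elif rank <= 1000:
        return "red"     # top 1,000 predictions
    else:
        return "purple"  # even less likely to be predicted

def likely_word_fraction(ranks):
    """Fraction of words in the green/yellow buckets. A high value
    means the text closely tracks the model's predictions, which
    can signal AI-generated text."""
    likely = sum(1 for r in ranks if r <= 100)
    return likely / len(ranks)

# Hypothetical per-word ranks for two short texts (illustrative only):
ai_like_ranks = [1, 3, 2, 8, 15, 1, 4, 60, 2, 5]
human_like_ranks = [2, 450, 12, 3000, 90, 7, 1200, 35, 5000, 4]

print(likely_word_fraction(ai_like_ranks))     # every word is "likely"
print(likely_word_fraction(human_like_ranks))  # more surprising word choices
```

In the real tool, the ranks come from running the text through GPT-2; the point here is just that a text made up almost entirely of green and yellow words is suspiciously predictable.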
Here are two examples. The first shows AI content (many words highlighted in green and yellow). This text was generated via GPT-2.
And here is an example from one of my articles about broad core updates. Notice there are many words highlighted in red, and several purple words as well (signaling this is human-written text).
Summary: Although not foolproof, tools can be helpful for detecting AI content.
Again, I’ve received a ton of questions about which tools I’ve been using to detect lower-quality AI content, so I decided to write this quick post versus answering that question over and over. I hope you find these tools helpful in your own projects. And again, if you know of other tools that I should try out, feel free to ping me on Twitter!