We haven’t talked about this newer AI yet, but I know a number of techs who prefer it over mainstream AI models. It’s said to provide much more data to users, without the many guardrails currently in place on the better-known AIs.
AI startup Perplexity has been accused of scraping content from websites, which is the same way other AI tools have built the data sets behind their LLMs (large language models).
According to internet infrastructure provider Cloudflare, the problem is that Perplexity is apparently scraping data from websites that have explicitly opted out of such activity.
In a blog post published yesterday, Cloudflare revealed research indicating that Perplexity has been bypassing restrictions and concealing its scraping behavior. The company claims Perplexity masked its identity while accessing web pages, allegedly to circumvent site owners’ preferences.
AI models like those developed by Perplexity require vast amounts of data (text, images, and videos), often sourced from the internet. While scraping has long been a common practice among AI startups, many websites have pushed back with robots.txt, a web standard that tells crawlers which pages they should or shouldn’t access. However, compliance is voluntary, and enforcement of these rules has yielded mixed results.
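For the curious, here is a minimal sketch of how a well-behaved crawler could check a site's robots.txt before fetching a page, using Python's standard urllib.robotparser module. The robots.txt contents, the example.com address and the "PerplexityBot" token are assumptions chosen for illustration, not taken from Cloudflare's post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit this agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed("PerplexityBot", "https://example.com/article"))  # False
print(is_allowed("SomeOtherBot", "https://example.com/article"))   # True
```

Note that nothing here stops a crawler from skipping the check or announcing itself under a different name, which is why robots.txt only works when crawlers cooperate.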
Cloudflare alleges that Perplexity deliberately circumvented these blocks by altering its bots’ user-agent strings—which identify the type of device and browser accessing a site—and switching autonomous system numbers (ASNs), which identify large networks on the internet.
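To see why changing the user-agent string matters, here is a rough sketch of a simple server-side block that looks only at that string. The blocklist tokens and both sample strings are illustrative assumptions, and this is not Cloudflare's actual detection method.

```python
# Hypothetical blocklist of declared crawler names (assumed for illustration).
BLOCKED_TOKENS = ("perplexitybot", "perplexity-user")


def blocked_by_user_agent(user_agent: str) -> bool:
    """Return True if the user-agent string contains a blocked crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


# Illustrative strings: a crawler that names itself, and a generic
# Chrome-on-macOS browser string that names no crawler at all.
declared = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
generic = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

print(blocked_by_user_agent(declared))  # True
print(blocked_by_user_agent(generic))   # False
```

A block like this catches only crawlers that identify themselves honestly. A request carrying the generic browser string passes straight through, so spotting it takes other signals beyond the user-agent.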
“This activity was observed across tens of thousands of domains and millions of requests per day. We were able to fingerprint this crawler using a combination of machine learning and network signals,” Cloudflare stated.
In response, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare’s claims, calling the blog post a “sales pitch.” Dwyer also asserted that the screenshots shared by Cloudflare “show that no content was accessed,” and further claimed that the bot identified in the post “isn’t even ours.”
Cloudflare said it began investigating after receiving complaints from customers who noticed Perplexity scraping their sites despite having implemented robots.txt rules and blocks targeting Perplexity’s known bots. Cloudflare conducted tests and confirmed that the startup was bypassing these restrictions.
“We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked,” Cloudflare added.
As a result, Cloudflare has removed Perplexity’s bots from its verified list and introduced new techniques to block them.
This isn’t the first time Perplexity has faced allegations of unauthorized scraping. In 2024, media outlets including Wired accused the company of plagiarizing content. During an interview at TechCrunch Disrupt 2024, CEO Aravind Srinivas struggled to define plagiarism when questioned about the controversy.
Perplexity vs ChatGPT: Which AI tool is better?
AI chatbots pretty much all feel the same. Sure, they use different models under the hood, but whether you’re using ChatGPT, Meta AI, or Google Gemini, the experience is pretty similar. You enter your question and a generated AI response comes out—which is why Perplexity AI is so interesting.
Instead of just being another chatbot, Perplexity is billed as an alternative to traditional search engines. Yes, it works kind of like a typical conversational AI chatbot, but it’s designed to be more accurate and up to date. So how does this compare to ChatGPT, which has also been held up as a possible replacement for search engines?
You decide… Give Perplexity a try, and maybe compare the results with ChatGPT, Meta AI or Google’s Gemini.

For more than 20 years, David Snell’s Tech Talk has been a regular spot on The South Shore’s Morning News on 95.9 WATD FM. At 8:11, David chats with show hosts Rob Hakala and Beth Foster about what’s happening in IT today. The subjects range from computer viruses, scams and cybercriminals to what Amazon, Apple or Microsoft are planning next.
He often shares new product information and reviews software that may help you, especially when there is a free version to try!
On this blog, he provides links, sources and other necessary information. And, on the Tuesday before Christmas, you can expect his annual NORAD Santa report!
If you have a question that you’d like him to answer on the show, please email him.