Caleb Ulku compares three AI writing tools — Claude 3.5, ChatGPT, and Perplexity — across four criteria: content quality, AI detection evasion, SEO optimization, and adherence to word/character limits. Using a single standardized prompt targeting the keyword 'Plumber Houston,' he finds Claude 3.5 consistently outperforms the others: it scored highest on content quality (4-5/5), was nearly undetectable by ZeroGPT (15.4% AI score vs. ChatGPT's 97%), achieved the best SEO optimization score (67/100), and most accurately followed the 1,000-word limit. ChatGPT ranked last in every category, producing generic, fluffy content that was easily flagged as AI-written. As a result, Ulku announces his agency is switching from ChatGPT to Claude 3.5 for all content generation.
Claude 3.5 significantly outperformed the other tools at evading AI detection. When tested with ZeroGPT, Claude scored only 15.4% (meaning it was largely not detected as AI-written), Perplexity scored 55%, and ChatGPT scored 97% — meaning ChatGPT was almost entirely flagged as AI-generated content. The same pattern held across several other AI detection tools tested.
Professional copywriters evaluated each tool's content on a scale of 1 to 5 for helpfulness, engagement, and informativeness. Claude scored the highest at 4 to 5 out of 5, Perplexity came in at 3.5, and ChatGPT scored the lowest at 3 out of 5. The ChatGPT content was described as very generic, vague, and full of fluff — even though the prompt specifically instructed it to be concise and minimize fluff.
Using Page Optimizer Pro, which analyzes top-ranking Google content for LSI keywords and keyword usage patterns, the three tools scored as follows out of 100: Claude scored 67, Perplexity scored 66 (very close to Claude), and ChatGPT scored significantly lower at 59. All scores were relatively low because the articles were written at 1,000 words, well below the 3,000-word target recommended by Page Optimizer Pro.
Claude wrote 1,065 words and Perplexity 1,010 words, both very close to the requested 1,000-word target. ChatGPT wrote approximately 1,400 words, overshooting the target by about 400 words (roughly 40%). While not drastically off, ChatGPT clearly did not follow the requested length.
The title tag character limit test was the one area where all three tools performed comparably. Each was asked to generate 15 title tag suggestions optimized for 'Plumber Houston' within a 60-character limit, and Perplexity, Claude, and ChatGPT all produced 15 ideas that stayed within that limit.
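The two length constraints used in the comparison (a 1,000-word article target and a 60-character title tag limit) are easy to verify programmatically rather than by hand. A minimal Python sketch, with helper names that are illustrative and not from any tool mentioned in the video:

```python
# Illustrative sketch (not from the video): checking the two length
# constraints used in the comparison. Word counting here is a simple
# whitespace split, which may differ slightly from a word processor's count.

ARTICLE_WORD_TARGET = 1000   # requested article length
TITLE_CHAR_LIMIT = 60        # max characters per title tag

def word_count(text: str) -> int:
    """Count words by splitting on whitespace."""
    return len(text.split())

def titles_within_limit(titles: list[str], limit: int = TITLE_CHAR_LIMIT) -> bool:
    """Return True if every title fits within the character limit."""
    return all(len(t.strip()) <= limit for t in titles)

# Word counts as reported in the comparison:
reported = {"Claude 3.5": 1065, "Perplexity": 1010, "ChatGPT": 1400}
for tool, words in reported.items():
    overshoot = words - ARTICLE_WORD_TARGET
    print(f"{tool}: {words} words ({overshoot:+d} vs. target)")
```

A check like this could be run on each tool's raw output before handing it to copywriters, catching length violations (such as ChatGPT's roughly 400-word overshoot) automatically.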
The four factors used to compare the three AI writing tools were: (1) Quality and helpfulness of the content, judged by professional copywriters on a scale of 1 to 5; (2) Ability to pass AI detection tests, using ZeroGPT and other tools; (3) How well SEO-optimized the content is for Google search, evaluated using Page Optimizer Pro; and (4) How effectively each tool follows specific character or word limits given in the prompt.
ZeroGPT is a free AI content detection tool that claims an accuracy rate of 98% for detecting AI-written content. In this comparison, it was used to test all three AI writing tools — Claude 3.5, ChatGPT, and Perplexity — to see how detectable their generated content was.
Page Optimizer Pro is an SEO tool that analyzes content currently ranking well on Google, examining LSI keywords, keyword usage, and patterns to evaluate what Google's algorithm looks for when reading content. In this comparison, it was used to score the SEO optimization of articles generated by Claude, Perplexity, and ChatGPT (all targeting 'Plumber Houston') against the top-ranking results for that keyword. Scores are given on a 0-to-100 scale, and the tool also recommends a target content length, which in this case was 3,000 words.
After the comparison, the agency decided to switch from ChatGPT to Claude 3.5 Sonnet for all content generation. Claude outperformed ChatGPT across every metric tested: it produced higher-quality, more engaging content (rated 4–5 vs. 3 out of 5), scored far better on AI detection (15.4% vs. 97% detected), achieved a higher SEO optimization score (67 vs. 59), and more accurately followed the requested word count. The presenter stated that after seeing Claude 3.5 Sonnet's capabilities, the agency would use Claude going forward.
As AI detectors become more sophisticated and platforms and search engines invest more resources into AI content detection, there is an increasing risk that AI-generated content will be flagged and penalized. This is especially relevant for SEO, where being penalized by Google can significantly hurt a website's rankings and visibility. The challenge is not just creating content with AI, but creating content that is genuinely helpful to users, valuable for SEO, and capable of passing stringent AI detection tests.
Claude 3.5 was the overall winner of the comparison. It outperformed ChatGPT across all four metrics — content quality (4–5/5), AI detection evasion (only 15.4% detected), SEO optimization score (67/100), and word count accuracy (1,065 words vs. the requested 1,000). Perplexity was a close second, performing similarly to Claude on most metrics, but Claude still had the edge. ChatGPT consistently performed the worst across all categories.
All three AI tools — Claude 3.5, ChatGPT, and Perplexity — were given the exact same single prompt with no edits requested. The prompt asked for 1,000 words of content targeting the keyword 'Plumber Houston,' along with 15 title tag suggestions within a 60-character limit. The presenter noted that at his agency, content generation typically involves a three-step process with three individual prompts, but a single prompt was used here to keep the comparison as equal as possible. The resulting content was then evaluated on four metrics: content quality, AI detection, SEO optimization, and adherence to word/character limits.
ChatGPT performed the worst across nearly every metric tested. Its content was rated 3 out of 5 by professional copywriters — described as very generic, vague, and full of fluff, despite the prompt explicitly requesting concise content with minimal fluff. It scored 97% on ZeroGPT's AI detection test, meaning it was almost entirely flagged as AI-written. Its SEO optimization score was 59 out of 100, the lowest of the three tools. It also failed to follow the requested 1,000-word limit, producing about 1,400 words instead. The only area where it matched the others was in generating title tags within the 60-character limit.
According to the video, you should absolutely be using AI to create SEO content if you aren't already, due to the efficiency and scalability it offers. However, the real challenge is not simply generating content with AI — it's creating content that is genuinely helpful to users, valuable for SEO, and capable of passing increasingly stringent AI detection tests. The goal should not be to deceive, but to use AI as a tool to create high-quality content more efficiently. Not all AI writing tools are equally capable of meeting these combined requirements.
Perplexity performed solidly in the middle — better than ChatGPT but slightly behind Claude in most metrics. It scored 3.5 out of 5 for content quality, 55% on ZeroGPT's AI detection test (meaning it was partially flagged as AI-written), 66 out of 100 for SEO optimization (very close to Claude's 67), and wrote 1,010 words — very close to the requested 1,000. It also successfully generated 15 title tags within the 60-character limit. While Perplexity was competitive with Claude, Claude still held the overall edge.