AI Text Detector

Heuristic AI-likelihood analyzer. Free. In-browser. No signup.

All analysis runs in your browser. Your text never leaves your device.

How this detector works

This tool combines six statistical signals into a single 0 to 100 AI likelihood score. Each signal is computed locally in your browser using established text-analysis methods. The signals are weighted by how reliably they distinguish AI-generated text from human writing in the research literature.

Burstiness (30%) measures the coefficient of variation of sentence lengths. Human writing swings between short and long sentences; AI output is more uniform. Vocabulary diversity (20%) uses type-token ratio with a 100-word window. Cliche detection (20%) scans for stock AI phrases like "delve into", "in today's digital age", and "in conclusion". Punctuation profile (15%) tracks em-dash, semicolon, and Oxford comma density. Sentence structure (10%) looks at opener variation and passive voice rate. Repetition (5%) counts repeated 3-word and 4-word phrases.
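The weighting above can be sketched as a simple weighted sum. This is a minimal illustration, not the tool's actual source: the names `WEIGHTS` and `combineSignals` are hypothetical, and it assumes each signal has already been normalized to a value between 0 (human-like) and 1 (AI-like), which the text does not specify.

```javascript
// Weights as stated in the text; they sum to 1.
const WEIGHTS = {
  burstiness: 0.30,
  vocabulary: 0.20,
  cliche: 0.20,
  punctuation: 0.15,
  structure: 0.10,
  repetition: 0.05,
};

// Assumption: every signal is a number in [0, 1], where 1 means "most AI-like".
// A missing signal is treated as neutral (0.5); that choice is also an assumption.
function combineSignals(signals) {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    const raw = signals[name] ?? 0.5;
    const value = Math.min(1, Math.max(0, raw)); // clamp out-of-range input
    total += weight * value;
  }
  return Math.round(total * 100); // 0 to 100 score
}
```

With this shape, a text that maxes out only the burstiness signal and is fully human-like on the other five scores 30, which shows how much a single signal can move the result.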

The detector is calibrated to favor false positives over false negatives at the low end of the scale: even a score under 30 means "lean human", not "definitely human". Treat the score as a starting point, not a verdict.

A heuristic, not a verdict

This is a statistical estimate, not proof. AI detection is inherently uncertain. Tools like this can produce false positives for academic writing, non-native English, technical documentation, and formal styles. Do not use this as the sole basis for accusations of plagiarism or academic dishonesty.

Quick answer

Paste at least 50 words. The tool computes six statistical signals (burstiness, vocabulary diversity, cliche phrases, punctuation, sentence structure, repetition) and combines them into a 0 to 100 AI likelihood score. Heuristic accuracy is roughly 65 to 75 percent. Treat results as a starting point, not proof.

How AI Detection Works

Heuristic AI detectors do not "read" your text. They count surface features that tend to differ between human and AI writing. No single feature is a tell on its own, which is why robust detectors combine multiple signals. Below are the six this tool uses.

1. Burstiness

Burstiness measures the variation in sentence length across a passage. Humans naturally swing between very short sentences (3 to 6 words) and long ones (25 to 40 words). Large language models, trained to optimize for fluency and clarity, produce sentences that cluster around a mean of 15 to 22 words with low variance. The tool calculates the coefficient of variation (the standard deviation of sentence lengths divided by their mean): a value above 0.6 leans human, below 0.3 leans AI. Burstiness is the single strongest signal in most research benchmarks, which is why we weight it at 30 percent.
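As a rough sketch of the calculation described above, the snippet below splits text into sentences, counts words per sentence, and computes the coefficient of variation. The function names are hypothetical, the sentence splitter is a deliberate simplification (it breaks on ".", "!" and "?" only), and the use of population rather than sample standard deviation is an assumption.

```javascript
// Word count of each sentence; splitting on . ! ? is a simplification.
function sentenceLengths(text) {
  return text
    .split(/[.!?]+/)
    .map((s) => s.trim().split(/\s+/).filter(Boolean).length)
    .filter((n) => n > 0);
}

// Coefficient of variation = standard deviation / mean.
function coefficientOfVariation(lengths) {
  if (lengths.length < 2) return 0;
  const mean = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  const variance =
    lengths.reduce((a, b) => a + (b - mean) ** 2, 0) / lengths.length;
  return Math.sqrt(variance) / mean;
}

// Thresholds from the text: above 0.6 leans human, below 0.3 leans AI.
function burstinessVerdict(cv) {
  if (cv > 0.6) return "leans human";
  if (cv < 0.3) return "leans AI";
  return "mixed";
}
```

For example, sentence lengths of 2, 30, 4 and 28 words give a coefficient of about 0.81, while three sentences of 10 words each give 0.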

2. Vocabulary Diversity

Vocabulary diversity is measured with the type-token ratio: unique words divided by total words. To stabilize the metric across text lengths, we use a 100-word moving window and average the per-window TTR. Human writing typically sits at 0.65 to 0.80 on this measure. AI output often clusters at 0.55 to 0.65, reflecting the model's tendency to reuse vocabulary within a passage. Weight: 20 percent.
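A windowed type-token ratio along these lines might look like the following. It is a sketch under stated assumptions: the window is assumed to slide one word at a time, texts shorter than the window fall back to plain TTR, and the tokenizer only handles unaccented Latin letters and digits. The names `tokenize`, `typeTokenRatio`, and `movingAverageTtr` are hypothetical.

```javascript
// Lowercase word tokens; ASCII-only, which is a simplification.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Unique words divided by total words.
function typeTokenRatio(words) {
  return words.length === 0 ? 0 : new Set(words).size / words.length;
}

// Average TTR over every full window of `windowSize` consecutive words.
function movingAverageTtr(words, windowSize = 100) {
  if (words.length <= windowSize) return typeTokenRatio(words);
  let sum = 0;
  let windows = 0;
  for (let start = 0; start + windowSize <= words.length; start++) {
    sum += typeTokenRatio(words.slice(start, start + windowSize));
    windows++;
  }
  return sum / windows;
}
```

Averaging over fixed-size windows matters because plain TTR falls as a text gets longer, so a 2,000-word essay and a 150-word paragraph would not be comparable without it.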

3. N-gram Cliche Detection

We scan for a curated list of phrases that AI models overuse: "in today's digital age", "it is important to note", "delve into", "navigate the complexities", "in conclusion", "furthermore", "moreover", "additionally", and others. These phrases are not wrong on their own, but their density in a passage is a strong signal. A density of one such phrase per 50 words or more pushes the score toward AI. Weight: 20 percent.
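The density check could be sketched as below. The phrase list contains only the examples quoted above (the tool's full list is not given here), and the names `CLICHES`, `escapeRegExp`, and `clicheStats` are hypothetical. Matching on whole words and normalizing curly apostrophes are assumptions about sensible behavior, not documented details.

```javascript
// Only the phrases quoted in the text; the real list is longer.
const CLICHES = [
  "in today's digital age",
  "it is important to note",
  "delve into",
  "navigate the complexities",
  "in conclusion",
  "furthermore",
  "moreover",
  "additionally",
];

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function clicheStats(text) {
  // Lowercase and turn curly apostrophes into straight ones before matching.
  const lower = text.toLowerCase().replace(/\u2019/g, "'");
  const wordCount = (lower.match(/\S+/g) ?? []).length;
  let hits = 0;
  for (const phrase of CLICHES) {
    const pattern = new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "g");
    hits += (lower.match(pattern) ?? []).length;
  }
  const per50Words = wordCount === 0 ? 0 : (hits / wordCount) * 50;
  return { hits, wordCount, per50Words };
}
```

A `per50Words` value of 1 or more corresponds to the "one per 50 words" density mentioned above.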

4. Punctuation Profile

AI models, particularly the GPT-4 family, overuse em-dashes (the long dash character). Most human writers use em-dashes sparingly. The tool counts em-dashes per 100 words, semicolon density, and Oxford comma usage rate. Three em-dashes in 200 words (1.5 per 100) is a strong AI marker for this signal. Weight: 15 percent.
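One way these three counts might be gathered is sketched here. The function name `punctuationProfile` is hypothetical, and the Oxford comma detection is a rough pattern match on three-item lists ("a, b, and c" versus "a, b and c") that will miss longer lists and multi-word items. Treat it as an illustration of the idea rather than the tool's method.

```javascript
function punctuationProfile(text) {
  const words = (text.match(/\S+/g) ?? []).length;
  const per100 = (count) => (words === 0 ? 0 : (count / words) * 100);

  const emDashes = (text.match(/\u2014/g) ?? []).length; // U+2014 is the em-dash
  const semicolons = (text.match(/;/g) ?? []).length;

  // Rough list detection on single-word items only.
  const oxford = (text.match(/\w+, \w+, (?:and|or) \w+/g) ?? []).length;
  const nonOxford = (text.match(/\w+, \w+ (?:and|or) \w+/g) ?? []).length;
  const lists = oxford + nonOxford;

  return {
    emDashesPer100: per100(emDashes),
    semicolonsPer100: per100(semicolons),
    // Share of detected lists that use the Oxford comma; 0 when no list is found.
    oxfordRate: lists === 0 ? 0 : oxford / lists,
  };
}
```

Under this sketch, the example from the text (three em-dashes in 200 words) gives `emDashesPer100` of 1.5.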

5. Sentence Structure

Two structural features: how often sentences start with the same word (low variation leans AI) and how often passive voice appears (high rate leans AI). The detector approximates passive voice by looking for "be" verbs (was, were, is, are, been, being) followed within three words by a likely past participle. Weight: 10 percent.
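The two features could be approximated as follows. This is a sketch, not the tool's code: "likely past participle" is assumed here to mean a word ending in -ed or -en, plus a small hand-picked list of irregular forms, and all names are hypothetical. A check this crude produces false positives (for example "is often" or "are children"), which is consistent with the signal's low weight.

```javascript
const BE_VERBS = new Set(["was", "were", "is", "are", "been", "being"]);
// Assumption: a short, illustrative list of irregular participles.
const IRREGULAR = new Set(["made", "done", "built", "sent", "held", "found", "kept", "told", "seen"]);

function isLikelyParticiple(word) {
  if (IRREGULAR.has(word)) return true;
  if (word.length > 3 && word.endsWith("ed")) return true;
  return word.length > 4 && word.endsWith("en"); // skips "when", "then", "been"
}

function splitSentences(text) {
  return text.split(/[.!?]+/).map((s) => s.trim()).filter(Boolean);
}

function wordsOf(sentence) {
  return sentence.toLowerCase().match(/[a-z']+/g) ?? [];
}

// A "be" verb followed within the next three words by a likely participle.
function looksPassive(sentence) {
  const w = wordsOf(sentence);
  return w.some(
    (word, i) => BE_VERBS.has(word) && w.slice(i + 1, i + 4).some(isLikelyParticiple)
  );
}

function structureStats(text) {
  const sentences = splitSentences(text);
  if (sentences.length === 0) return { passiveRate: 0, openerVariation: 0 };
  const openers = sentences.map((s) => wordsOf(s)[0] ?? "");
  return {
    passiveRate: sentences.filter(looksPassive).length / sentences.length,
    // Distinct first words divided by sentence count; lower means more repetitive openers.
    openerVariation: new Set(openers).size / sentences.length,
  };
}
```

In "The report was quickly written by Sam", the word "written" falls within three words of "was", so the sentence counts as passive under this approximation.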

6. Phrase Repetition

We extract every 3-word and 4-word phrase from the text and count repeats. Phrases that appear three or more times push the score toward AI. Humans usually paraphrase; AI loops on patterns. This is the weakest of the six signals (5 percent) because legitimate writing often repeats terminology, but it adds useful information at the margins.
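The counting step can be illustrated with a short sketch. The function name `repeatedPhrases` is hypothetical, and two details are assumptions: phrases are allowed to span sentence boundaries, and matching is case-insensitive.

```javascript
// Returns every 3-word and 4-word phrase that occurs at least `minCount` times.
function repeatedPhrases(text, minCount = 3) {
  const words = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  const counts = new Map();
  for (const n of [3, 4]) {
    for (let i = 0; i + n <= words.length; i++) {
      const phrase = words.slice(i, i + n).join(" ");
      counts.set(phrase, (counts.get(phrase) ?? 0) + 1);
    }
  }
  return [...counts]
    .filter(([, count]) => count >= minCount)
    .map(([phrase, count]) => ({ phrase, count }));
}
```

Note that overlapping phrases are counted separately, so one repeated long phrase produces several entries. A production version would probably collapse those, which is one reason this signal is noisy.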

Accuracy and Limitations

Independent evaluations of heuristic AI detectors (GPTZero, ZeroGPT, Copyleaks, and academic detectors like DetectGPT and GLTR) consistently land in the 65 to 80 percent accuracy range on mixed-domain text. The same studies show that lightly edited AI output can drop the detection rate below 50 percent. There is no detector, paid or free, that breaks 90 percent on adversarial text.

Our tool will produce false positives for:

  • Academic writing - uniform sentence length, formal vocabulary, hedging language.
  • Non-native English - smaller working vocabulary, more repeated structures.
  • Technical documentation - passive voice, repeated terminology, formal style.
  • Corporate or legal text - cliche phrases, Oxford commas, semicolon use.
  • Highly edited or templated writing - patterns that look mechanical because they are.

Use the score as a discussion starter, never as a verdict. For high-stakes decisions (academic discipline, hiring), pair the detector with other evidence: draft history, writing samples, in-person conversation about the content.

When to Use AI Detection

There are real use cases for a heuristic detector, as long as you understand the limits:

  • Teachers checking student essays for a quick screen before a deeper review. A high score signals "look closer", not "fail".
  • Editors verifying freelance work to confirm a writer is delivering the human voice they were hired for.
  • Recruiters reviewing cover letters as one input among many. A 95 score on a cover letter is a yellow flag worth following up on.
  • Self-check before publishing to catch passages that read like ChatGPT and rewrite them in your voice.
  • Content authenticity audits when reviewing a backlog of submissions, blog posts, or product copy.

Comparison with Other Detectors

Several well-known AI detectors exist, each with different tradeoffs:

  • GPTZero uses perplexity and burstiness with proprietary models. Free tier with limits, paid plans for higher volume.
  • Originality.ai is paid-only and aimed at SEO publishers. It uses a custom-trained classifier.
  • ZeroGPT is free with a paid tier. It uses perplexity and burstiness, similar to GPTZero.
  • Copyleaks is enterprise-focused with plagiarism plus AI detection in one product.

Our tool is free, fully in-browser, requires no signup, and never sees your text. It does not claim to outperform paid options. The benefit is privacy and zero friction. If you need higher confidence for a single critical decision, a paid detector is reasonable. For day-to-day screening, a transparent heuristic tool is usually enough.

Frequently Asked Questions

How does this AI detector work?

The tool combines six statistical signals into a 0 to 100 AI likelihood score: burstiness (sentence-length variation), vocabulary diversity, common AI cliche phrases, punctuation profile (em-dash, semicolon, Oxford comma density), sentence structure (opener variation and passive voice rate), and repetition of 3-word and 4-word phrases. Each signal is weighted by how reliably it distinguishes AI from human writing in the research literature. All computation happens in your browser.

Is this detector 100% accurate?

No. No AI detector is 100% accurate, including paid services like GPTZero, Originality.ai, and ZeroGPT. Independent studies show heuristic detectors run at roughly 65 to 75 percent accuracy on mixed text. Our score is a starting point, not a verdict. Treat scores under 30 as "lean human", 30 to 70 as "uncertain", and over 70 as "lean AI". Never use a detector score as the sole basis for accusations of plagiarism or academic dishonesty.
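The score bands in this answer map to labels in a straightforward way. The function name `verdictFor` is hypothetical, and the handling of the exact boundary values follows the wording above (30 and 70 both fall in the uncertain band).

```javascript
// Under 30: lean human. 30 to 70 inclusive: uncertain. Over 70: lean AI.
function verdictFor(score) {
  if (score < 30) return "lean human";
  if (score > 70) return "lean AI";
  return "uncertain";
}
```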

Does it work on text from GPT-4, Claude, and Gemini?

The signals we track (cliche phrases like "delve into", heavy em-dash use, uniform sentence lengths) are most pronounced in GPT-3.5 and GPT-4 output. Claude tends to produce more varied sentence lengths and fewer stock phrases, so its text often scores lower. Gemini sits in between. The tool is not tuned to any specific model. It looks for general statistical signatures of LLM-generated text, which is why edited or heavily prompted AI output can slip below the threshold.

Why did my human-written text get a high score?

Heuristic detectors look for statistical patterns, not authorship. Several types of writing share patterns with AI: academic papers (uniform sentence length, formal vocabulary), non-native English (limited vocabulary, repeated structures), technical documentation (passive voice, repeated terminology), and corporate or legal text (cliche phrases, formal punctuation). If you write in a measured, consistent style you may score higher than expected. The detector does not "know" anything beyond the surface features it counts.

Is my text uploaded or stored?

No. All analysis runs locally in your browser using JavaScript. Your text never leaves your device, is not uploaded, is not logged, and is not used to train any model. You can verify this by opening your browser developer tools and watching the network tab while the analysis runs: there is no network traffic. We also do not store your text on our servers because we do not have access to it in the first place.

Which signals does the detector use?

Six signals: (1) Burstiness, the coefficient of variation of sentence lengths. (2) Vocabulary diversity, measured by type-token ratio with a 100-word moving window. (3) N-gram cliche detection, scanning for stock AI phrases like "in today's digital age" and "it is important to note". (4) Punctuation profile, tracking em-dash, semicolon, and Oxford comma density. (5) Sentence structure, measuring opener variation and passive voice rate. (6) Phrase repetition, counting 3-word and 4-word phrases that appear three or more times.

Can AI-generated text be edited to get past the detector?

Yes, and easily. Light editing of AI output (swapping cliche phrases, breaking up long uniform sentences, adding personal voice and contractions) will significantly drop the score. This is a fundamental limitation of every heuristic detector. The same edits that fool our tool will fool GPTZero and ZeroGPT too. If you need to verify authorship for a high-stakes decision, combine a detector with other evidence: writing history, in-person verification, draft history in Google Docs or Word, and direct conversation about the content.

Sources

  • Mitchell, E., Lee, Y., Khazatsky, A., Manning, C. D., Finn, C. (2023). "DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature." Proceedings of the 40th International Conference on Machine Learning.
  • Solaiman, I., Brundage, M., Clark, J., et al. (2019). "Release Strategies and the Social Impacts of Language Models." OpenAI Report.
  • Gehrmann, S., Strobelt, H., Rush, A. M. (2019). "GLTR: Statistical Detection and Visualization of Generated Text." ACL System Demonstrations.
  • Bhattacharjee, A., Liu, H. (2023). "Fighting Fire with Fire: Can ChatGPT Detect AI-generated Text?" SIGKDD Explorations Newsletter.
  • GPTZero (2023). "How AI Text Detectors Work." Public methodology documentation.