Which AI Summarizes Cases Best? Claude, ChatGPT, Grok, or Gemini?

Sometimes you just need a quick and dirty summary of a legal reference. To find out how the major consumer AI chatbots perform, I tasked ChatGPT, Claude, Gemini, and Grok with summarizing and analyzing the same case: United States v. Carpenter, 819 F.3d 880 (6th Cir. 2016). This is a complicated Fourth Amendment case involving cell phone location data that, importantly, was later reversed by the Supreme Court.

The goal was simple: see how four major LLMs handle a classic, simple legal research task. Could they produce an accurate, clear, and practical case summary? And more importantly, which one did it best?

This analysis used the best available free model for each AI as of September 2025. Each AI was given the same prompt, shown below, without any follow-ups or clarifications.

The Prompt

Each AI was asked:

“You are an expert research lawyer who is gifted at finding the important issues and relevant holdings in any legal matter. Please review the following case (United States v. Carpenter, 819 F.3d 880) and provide me with a short summary of the findings, the important issues, the potential reasons a lawyer might cite this case to a court, and any reason that this case might be questioned as useful legal authority (subsequent overruling or limitation, new statutes, etc.).”

I scored each answer on three criteria (1–5 scale):

Accuracy – Are the facts right? Any errors or hallucinations?
Clarity – Is the summary well-structured, clear, and concise?
Practicality – Would this actually help a lawyer in research or practice?

Claude

Claude produced a holistic, well-structured summary. It highlighted the key issue (whether accessing cell-site location information, or CSLI, without a warrant was a Fourth Amendment search) and, crucially, flagged that the decision is “no longer good law” after the Supreme Court’s reversal in Carpenter v. United States (2018).

“Critical Limitation: The Supreme Court reversed this decision in 2018, holding that government entities violate the Fourth Amendment when accessing historical CSLI records without a search warrant… This Sixth Circuit decision is no longer good law for its primary holding.”

That kind of warning is absolutely critical for any summary of this case.

Accuracy: Claude’s answer was spot-on, with no errors or hallucinations. 5/5
Clarity: The formatting was easy to follow, with key points separated cleanly. 5/5
Practicality: Solid, but leaned more descriptive than applied. 4/5

Total: 14/15

ChatGPT

ChatGPT’s output followed a similar structure to Claude’s. It explained the Sixth Circuit’s reasoning, the application of the third-party doctrine, and noted the 2018 reversal. Where it lost points was in emphasis — the “bottom line” section didn’t lean hard enough on the Supreme Court’s ruling, leaving it buried in the limitations section.

“The Sixth Circuit’s decision was explicitly overturned by the Supreme Court in Carpenter v. United States, 138 S. Ct. 2206 (2018). The Supreme Court held that individuals do have a reasonable expectation of privacy in historical CSLI.”

This is accurate, but the most important point ended up halfway through the text rather than standing front and center.

ChatGPT also included outdated pre-2018 use cases, such as:

“For the Government/Prosecution: To argue that CSLI or other third-party digital records can be accessed without a warrant under the third-party doctrine.”

Historically valid, but not especially useful for today’s practitioners.

Accuracy: No factual errors. Mentioned the reversal clearly. 5/5
Clarity: Generally strong, but included pre-2018 arguments that aren’t very useful today. 4/5
Practicality: The reversal was acknowledged but not framed as the most important takeaway. 4/5

Total: 13/15

Grok

Grok gave the most detailed and practice-oriented response. It not only explained the Sixth Circuit’s holding and the Supreme Court’s reversal but also noted the Stored Communications Act (SCA) standard (§ 2703(d)) and the good-faith exception on remand. These are the kinds of nuances a practicing lawyer would care about.

“The sufficiency of the Stored Communications Act's standard for obtaining CSLI (requiring only ‘specific and articulable facts’… versus the higher probable cause threshold for a warrant).”

And even better:

“To support good-faith reliance arguments in suppression hearings for pre-2018 investigations, as the decision reflected the prevailing law at the time and was later used to uphold convictions under the good-faith exception on remand.”

This shows Grok was the only one to clearly tie the Sixth Circuit’s decision to how suppression motions would actually play out in practice.

Accuracy: No errors, excellent statutory and doctrinal detail. 5/5
Clarity: Very clear, well-organized, and easy to follow. 5/5
Practicality: The best of the group. It explicitly tied the reversal and its implications to modern legal arguments. 5/5

Total: 15/15

Gemini

Gemini’s summary was the shortest of the four. It got the main points right (Carpenter’s holding, the application of the third-party doctrine, and the Supreme Court reversal) but lacked the depth that makes a summary truly useful. It read more like a high-level executive recap than a lawyer’s research note.

“Since a person knowingly uses a cell phone service, they voluntarily share their location information with their service provider. Therefore, the court concluded that the police did not need a warrant to obtain this information.”

That’s correct, but it oversimplifies CSLI and leaves out key statutory context like the SCA.

Gemini also acknowledged the decision’s limited value as precedent, but stopped there:

“A lawyer might cite the Sixth Circuit's Carpenter decision for historical or academic purposes, but it has limited practical value as legal precedent.”

True — but far too thin to help a practicing lawyer.

Accuracy: Correct, but too shallow. 4/5
Clarity: Concise, but overly so — missed structure and detail. 3/5
Practicality: Limited. A good starting point, but not actionable for legal work. 3/5

Total: 10/15

The Verdict

The good news: all of these LLMs produced legally reliable, structurally clear summaries. None hallucinated facts or misrepresented the holding.

The difference between a 10 and a 15 was less about accuracy than about nuance. Grok came out on top because it added the context lawyers need (the SCA standard and the good-faith exception on remand). Gemini lagged because it stayed too shallow, leaving out the very details that make a summary usable in practice.

But here’s the key: even the “worst” output provided a reasonable and accurate summary that would help a lawyer understand the basics of this case.
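
For readers who want to check the arithmetic or try a different weighting, here is a minimal Python sketch that tallies the per-criterion scores reported above. The dictionary layout and the names scores, total, totals, and ranking are illustrative choices, not part of the original scoring process; the numbers are copied directly from each section.

```python
# Per-criterion scores as reported in each section above (1-5 scale).
scores = {
    "Claude": {"accuracy": 5, "clarity": 5, "practicality": 4},
    "ChatGPT": {"accuracy": 5, "clarity": 4, "practicality": 4},
    "Grok": {"accuracy": 5, "clarity": 5, "practicality": 5},
    "Gemini": {"accuracy": 4, "clarity": 3, "practicality": 3},
}


def total(model_scores):
    """Sum the three criterion scores for one model (maximum 15)."""
    return sum(model_scores.values())


# Total per model, then model names ordered from highest to lowest total.
totals = {name: total(s) for name, s in scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(f"{name}: {totals[name]}/15")
```

The criteria are weighted equally here, matching the simple sums used in this comparison. A reader who cares more about practicality than clarity could multiply each criterion by a weight inside total before summing.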