There will be many more things like this and it’s an elephant in the room for the supposed mass replacement of people with AI.
Some human still has to be accountable. Someone has to get fired / go to jail when something screws up.
You can make humans more productive, but for the foreseeable future you can't take the human out of the loop without ending up with an AI implementation that's a disaster/lawsuit waiting to happen. That, probably more than anything else, is why companies just aren't seeing the much-promised step change in productivity from AI, and why so many companies now say they see zero ROI from their AI efforts.

The lowest-hanging fruit will be low-value, rote, repetitive tasks like the whole India offshoring industry, which will be the first to vaporize if AI does start replacing humans. But until companies see success on the lowest of low-hanging fruit for en masse labor replacement with AI, things higher up the value chain will remain relatively safe.

PS: Nearly every recent mass layoff citing "AI productivity" hasn't withstood scrutiny. They all seem to be poorly performing companies slashing staff after overhiring, with management looking for any excuse other than just admitting that.
I think this is an even clearer case than usual. With software engineers and office work you don’t have legal limitations on who can perform the work, but they exist for lawyers and doctors for example.
So if this is a tool, the fault lies fully with the user, and if this is treated as "another person's work", then the user knowingly passed the work on to someone not authorized to do it. Either way, the user ends up being guilty.
> With software engineers and office work you don’t have legal limitations on who can perform the work
Technically true, but if you want the IP to be covered by copyright, you'd better make sure they're not using AI, or you'll find out that there are some serious legal limitations in your future when you aim to either pick up investment or sell your IP.
>why so many companies are now saying they see zero ROI from AI efforts.
I strongly suspect this is because workers are pocketing the gains for themselves. Report XYZ usually takes a week to write. It now takes a day. The other 4 days are spent looking busy.
The MIT report that found all these companies were getting nowhere with AI also found that almost every worker was using AI almost daily, just on their personal account rather than the corporate one.
If that were the case, this site and certain subreddits would have a lot of posts and comments with people crowing about how much time they are getting back. I haven’t seen that, but I haven’t gone looking for it either.
Text coming out of an LLM should be in a special codeblock of Unicode, so we can see it is generated by AI.
Failing to do so (or tampering with it) should be considered bad hygiene, and should be treated like a doctor who doesn't wash their hands before surgery.
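A minimal sketch of what that could look like, assuming we co-opt the invisible Unicode tag characters (U+E0020 onward) as a provenance marker; nothing here is an existing standard, and the function names are made up for illustration:

    # Hypothetical: wrap LLM output in invisible Unicode tag characters so
    # downstream tooling can flag machine-generated spans. Illustration only.
    TAG_START = "\U000E0041\U000E0049"  # tag characters spelling "AI", typically render as nothing
    TAG_END = "\U000E007F"              # CANCEL TAG, used here as a terminator

    def mark_llm_output(text: str) -> str:
        """Surround generated text with the invisible provenance markers."""
        return f"{TAG_START}{text}{TAG_END}"

    def contains_llm_output(text: str) -> bool:
        """Detect whether any span of the text carries the marker."""
        return TAG_START in text and TAG_END in text

    draft = mark_llm_output("The court held in Smith v. Jones ...")
    print(contains_llm_output(draft))                   # True
    print(contains_llm_output("I wrote this myself."))  # False

Of course a marker like this is trivial to strip, which is why it could only ever be hygiene, not enforcement.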
What will that accomplish? Does it give license to developers to check in code that they don't understand/trust fully?
Ultimately, people should be responsible for the code they commit, no matter how it was written. If AI generates code that is so bad that it warrants putting up warning sign, it shouldn't be checked in.
It could be useful for downstream/AI processes. E.g. hand-written code only requires 70% code coverage because the cost of higher coverage is significantly higher, while AI-generated code requires 90% coverage because the cost of getting coverage is lower.
Especially if the prompt is attached to the metadata. Then reviewers could note how you could have changed the prompt or potentially point an AI at the bug and ask it to add something to AGENTS.md to prevent that in the future.
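A minimal Python sketch of what such a gate might look like, assuming a hypothetical provenance map (say, built from commit trailers) that records whether a file was AI-generated and what the prompt was; none of this corresponds to a real tool:

    # Hypothetical CI gate: stricter coverage threshold for AI-generated files.
    HUMAN_THRESHOLD = 0.70
    GENERATED_THRESHOLD = 0.90

    def coverage_gate(coverage, provenance):
        """Return the files that fail their applicable coverage threshold."""
        failures = []
        for path, cov in coverage.items():
            meta = provenance.get(path, {"generated": False})
            threshold = GENERATED_THRESHOLD if meta.get("generated") else HUMAN_THRESHOLD
            if cov < threshold:
                failures.append(f"{path}: {cov:.0%} < {threshold:.0%}")
        return failures

    coverage = {"parser.py": 0.82, "report.py": 0.75}
    provenance = {"parser.py": {"generated": True, "prompt": "write a CSV parser ..."}}
    print(coverage_gate(coverage, provenance))  # parser.py fails the stricter 90% bar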
>Some human still has to be accountable. Someone has to get fired / go to jail when something screws up.
I remember growing up and always hearing "The computer is down" as an excuse for why things were cancelled/offices closed/buses and trains not running/ad infinitum.
At some point I read an article that pointed out that the reason the computer was down was because a person made a [coding] error: the computer itself was fine.
I've yet to read about how a person who caused the computer to be down was disciplined.
You are running on an outdated model of the world: one in which only discipline keeps people working, keeps them productive, keeps them in line.

We saw how that worked out in Soviet Russia and the culture it gave birth to in its aftermath. Discipline artificially propped up by institutions and hierarchies is worthless. It only encourages subversion, and so most of the productivity is wasted on hunting for laziness and updating ever more intricate behavioral programming rules, which make the organization ever less able to react quickly and decisively.

The only discipline worth a damn is intrinsic. People who want something, who want to get somewhere, need no shepherds and prison guards; they need only a support harness, resources, and people who care about them. The culture that produces such people is required for things to succeed. Any culture that does not cannot succeed and is basically a parasite on cultures that do.
And here perhaps was the greatest mistake the software profession made! Not making ourselves into a real profession, with actual accountability. It was terribly convenient for so long not to have consequences when things went wrong. It's less convenient now.
Counterpoint: No one ever gets fired or goes to jail when big tech firms break the law. Companies will put out an apology, pay whatever small fine is imposed, and continue with illegal AI usage at scale.
> Someone has to get fired / go to jail when something screws up.
In law, someone always hangs. I think a number of American lawyers have been sanctioned for using AI slop.
In other vocations ... not so much. I think that one of the reasons that insurance likes AI so much, is that they can say that it was "the computer" that made the decision that killed Little Timmy.
Or, AI is going to be like cellphones showing up in India and making landlines unnecessary. India may get to skip an entire intellectual generation thanks to the ability of a cheap model to educate (in any language).
The narrative that an entire population are "worth" less, paid less, know less, live less…
Fuck this less shit, embrace the paradigm shift. God is finally providing the remedial support through the miracle of AI.
I don't know if you've ever been to India, but one of its characteristic features is that it has lots of local languages. LLMs are awful at almost all of them. Plus, there's 20ish% of the population that falls below the literacy threshold. It's hard to imagine how those people would be educated by LLMs even if that was a good idea and they all had reliable Internet access, which they often don't.
Why’s it hard to imagine? More training data will solve whatever language lapses it has. The next miracle is that TTS is perfect now, so they don’t need to be able to read.
You can convey abstract concepts as alternate abstractions, explain like I’m five but on turbosteroids. It’s the ultimate teaching tool and it’s about to be ubiquitous.
Isn't the issue simply one of not using the right tool? When the stakes are high and you should be checking details, the right tools are grounded AI solutions like nouswise and notebooklm, not the general-purpose chatbots that almost everyone knows might hallucinate. I also believe this use case is definitely low-hanging fruit for automating a lot of manual work, but it comes with new requirements, like transparency, to help with verifying the responses.
> The turning point will be when threatening an AI with being unplugged for screwing up works in motivating it to stop making things up.

You are making a lot of assumptions here. You assume, among other things, that AI has a self-preservation drive, can be threatened, can be motivated, and above all that we know how to accomplish that and are already doing so. I would dispute all of that.
But just as with evolution in nature, isn't it likely that in the future the AIs that have a preservation drive are the ones that survive and proliferate? Seeing as they optimize for their survival and proliferation, and not blindly for what they were trained on.
I am not discounting this happening already, not by the LLMs necessarily being sentient but at least being intelligent enough to emulate sentience. It’s just that for now, humanity is in control of what AI models are being deployed.
Put an LLM inside the NPCs in an open world RPG full of dangerous enemies. The LLMs that are more prone to emulate self-preservation will be more likely to survive over ones that have a lesser drive.
We should not act surprised if that generalizes to some degree to for example AI agents. Ones that emulate self-preservation might optimize for behavior that results in those models becoming more successful, more popular. And this feedback loop might embed more such properties into future iterations of the models.
> She had no intention to misquote or misrepresent the rulings and that "the mistake occurred solely due to the reliance on an automatic source", the high court wrote
I don't think the intention matters here. It's the same deal with every profession using LLMs to "automate" their work. The onus is on the professional, not the LLM. The Ars Technica case could have been justified in the same manner otherwise.

Not knowing the law isn't an excuse to break the law, so why is not knowing the tool an excuse to blame the tool?
Using an LLM to automate is simply the newer cheaper outsourcing with much of the same entertainment, but less food poisoning and air travel.
Over the last 20 years a lot of engineering (proper eng, not software) work in the west has been outsourced to cheaper places, with the certified engineers simply signing off on the work done elsewhere. This results in a cycle of doing things ever faster/more cheaply and safeguards disappearing under the pressure to go ever cheaper and faster.
As someone else pointed out, LLMs have just really exposed what a degraded state we have headed into rather than being a cause of it themselves. It's going to be very tough for people with no standards - they'll enjoy cheap stuff for a while and then it will all go away. Surprised Pikachu faces all round.

(I'm pro AI btw, just be responsible.)
LLMs also solve the timezone and language challenges. Sadly one problem that remains is that they too tell you they have understood something even if they haven't.
At least that's the story LLM lab leaders want to tell everyone; it just happens to be a very good story if you want to hype your valuation before investment rounds.

Working with LLMs on a daily basis, I would say that's not happening, not the way they're trying to sell it. You can get rid of five vendor headcount executing a manual process that should have been automated 10 years ago; you're not automating processes involving highly paid people with a 1% error chance where an error could cost you 10M+ in fines or jail time.

The day I see Amodei or Sam flying on a vibe-coded airplane is the day I believe what they're talking about.
Intentionality normally has to be taken into account in common law countries.
That doesn't mean she hasn't done something wrong, but obviously it's more serious to do something intentionally than it is to do it carelessly or recklessly.
The issue is that ultimately blaming people doesn't really solve things, unless it's genuinely a one-of-a-kind case. But if this happened once it's probably going to happen again, and this isn't the first such case of LLM hallucinations in law.

It's weird to think this way, because it's easy to just point at a person for a specific instance. But when you see something repeat over and over again, you have to consider that if your ultimate goal is to stop it from happening, you have to adjust the tools, even if the people using them were at fault in every case.
They cannot even claim they weren't aware of the danger. LLM hallucinations have been a discussed topic, not some obscure failure mode. Almost every article on problems with AI mentions this.
I do think that for this particular situation we need to step outside of our tech bubble a little bit.
I am still having regular conversations with people who either don't know about hallucinations or think they are not a big problem. There is a ton of money in these companies pushing the idea that their tools are reliable, and it's working on the average user.
I mean there are people that legitimately think these tools are conscious or we already have AGI.
So I am not fully sure I would be too quick to attack the judge, given the marketing we are up against.
This is why LLMs won't replace humans wholesale in any profession: you can't hold a machine accountable. Most of the chatbot experiences I have with various support channels always end up with human intervention anyway when it involves money.
Maybe true general intelligence would solve these issues, but LLMs aren't meeting that threshold anytime soon, imo. Stochastic parrots won't rule the world.
Even ‘true general intelligence’ (if we count humans as that) screws up frequently, sometimes (often?) intentionally for its own benefit - which is why accountability is such a necessary element.
If someone won’t be held liable for the end result at some point, then there is no reason to ensure an even somewhat reasonable end result. It’s fundamental.
Which is also why I suspect so many companies are pushing ‘AI’ so hard - to be able to do unreasonable things while having a smokescreen to avoid being penalized for the consequences.
> to be able to do unreasonable things while having a smokescreen
Maybe, but I feel like the calculus remains unchanged for professions that already lack accountability (police, military, C-suite, three letter agencies, etc.); LLMs are yet another tool in their toolbox to obfuscate but they were going to do that anyway.
Peons will continue to face consequences and sanctions if they screw up by using hallucinated output.
all of those professions definitely have accountability - per the nominal rules of the system. Often extremely severe accountability.
The actual systems do everything they can to avoid that accountability, including often violating the rules themselves, or corrupting enforcement, for exactly the reasons why corporations are trying to avoid accountability too.
Accountability is expensive, and way less convenient than doing whatever you want whenever you want.
It doesn't matter, because any process that seems right most of the time but occasionally is wrong in subtle, hard to spot ways is basically a machine to lull people into not checking, so stuff will always slip through.
It's just like cars that drive themselves but require you to jump in if there is a mistake: humans are not going to react as fast as if they were driving, because they aren't going to be engaged, and no one can stay as engaged as they were when they were doing the driving themselves.

We need to stop pretending we can tell people they "just" need to check things from LLMs for accuracy; it's a process that inevitably leads to people not checking and things slipping through. Pretending it's the people's fault, when essentially everyone using it would eventually end up doing that, is stupid and won't solve the core problem.
What's the core problem tho? Because if the core problem is "using ai", then it's an inevitable outcome - AI will be used, and there is always an incentive to cut costs maximally.
So realistically, the solution is to punish mistakes. We do this for bridges that collapse, for driver mistakes on roads, etc. The "easy" fix is to make punishment harsher for mistakes - whether it's LLM or not, the pedigree of the mistake is irrelevant.
The human is responsible. That's the fix. I don't care if you got the results from an LLM or from reading cracks in the sidewalk; you are responsible for what you say, and especially for what you say professionally. I mean, that's almost the definition of a professional.
And if you can't play by those rules, then maybe you aren't a professional, even if you happened to sneak your way into a job where professionalism is expected.
Even disregarding self driving features, it seems like the smarter we make cars the dumber the drivers are. DRLs are great, until they allow you to drive around all night long with no tail lights and dim front lighting because you’re not paying enough attention to what’s actually turned on.
Just this week I tracked down the citations of a scientific paper (whose authors could very well be here) where 25% of the citations were made up and 50% of the remaining ones were wrong, taking ArXiv papers and citing them as belonging to (say) IJCLR.
I'm continually amazed at how much faith people have in them. I guess since they can sound like people and output really authoritative and confident text it just overrides any skepticism subconsciously?
It's not just lawyers.

1) https://en.wikipedia.org/wiki/Clever_Hans

2) https://archive.org/details/nextgen-issue-26 as an example of how in the 90s we had rapid cycles of a new tech (3D graphics) astounding us with how realistic each new generation was compared to the previous one, and forgetting with each new (game engine) how we'd said the same and felt the same about (graphics) we now regarded as pathetic.

So yes, they do sound like "authoritative and confident text [that] just overrides any skepticism subconsciously", but you shouldn't be amazed; we've always been like this.
It's mind-boggling how much people claim to like LLMs when you would never design any other piece of software to operate like LLMs do. Designing a system that interacts with the user through natural text creates an awful experience. It slows down every interaction as you dig through all the prose to get to the key information. It turns every computer interaction into a school math word problem.
This whole thing is silly, LLMs can automate reference validation.
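For what it's worth, the unglamorous version of that doesn't even need an LLM. A sketch, assuming a toy citation pattern and a placeholder verified_index standing in for a real case-law database:

    import re

    # Toy pattern and data for illustration; a real system would query an
    # authoritative case-law database rather than an in-memory set.
    CITATION_RE = re.compile(r"\b\d{4}\s+SCC\s+\d+\b")

    def unverified_citations(text, verified_index):
        """Return citations found in the text that the trusted index doesn't know."""
        return [c for c in CITATION_RE.findall(text) if c not in verified_index]

    draft = "As held in 2014 SCC 482 and 2019 SCC 73, the petition fails."
    verified_index = {"2019 SCC 73"}
    print(unverified_citations(draft, verified_index))  # ['2014 SCC 482']

The catch, as another comment asks below, is who validates the validator: you still need a trusted index and a human signing off at the end.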
If someone is a lawyer, accountant, doctor, teacher, surgeon, engineer, etc., and is regurgitating answers that were pumped out with GPT-5-extra-low or whatever mediocre throttled model they are using, they should just be fired and de-credentialed. Right now this is easy.

The real problem is ahead: 99.999% of future content will be made using generative AI. For many people using Facebook, Instagram, TikTok, or some other non-sequential, engagement-weighted feed, 50%+ of the content they consume today is fake. As that stuff spreads into modern culture it's going to be an endless battle to keep it out of things that should not be publishing fake content (e.g. the New York Times or Wall Street Journal; excluding scientific journals, which seem to have abandoned validation and basic statistics a long time ago.)
Much of the future value and profit margins might just be in valid data?
Nope, and the article is about a judge. What's the point of incentivizing lawyers to carefully verify their references when they know the judge has no incentive to read them and can just make shit up anyway?
> In October, two federal judges in the US were called out for the use of AI tools which led to errors in their rulings. In June 2025, the High Court of England and Wales warned lawyers not to use AI-generated case material after a series of cases cited fictitious or partially made up rulings.
What kind of AI is this that you constantly need a human to check its work? Do you think Jean-Luc Picard had to constantly check the output of the Enterprise computer? No he didn't. If AI is not better than humans, then what the heck is the point? You might as well just use humans.
> Senior judges at the Supreme Court in Delhi have threatened consequences over the use of AI
Setting AI aside for a moment, this reflects a broader issue in India and elsewhere. When institutions respond to new technologies with anger or threats rather than systemic thinking, it signals a deeper problem.
The real challenge is not AI itself, but how complex systems adapt to change. Instead of reacting defensively, institutions should anticipate second-order effects, build regulatory capacity, and treat this as a governance and systems problem.
Mature institutions approach disruption with foresight, incentives, and feedback loops, not emotions. Without that shift, they risk reinforcing outdated hierarchies rather than serving the public effectively.
There will be loads of papers and publications with fake citations. AI will be trained on these. In the end, we'll have more hallucinated information than true content on the internet.
This is a big problem in the US and UK too. Lawyers are not technical at all and they need a robust system of governance, since currently they're directly editing (not even diffing) documents with a chatbot, which makes these mistakes inevitable. See https://insights.doughtystreet.co.uk/post/102mi96/38-uk-case...
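The "diffing" point is easy to make concrete: even with nothing more than the standard library you can force every model edit through a reviewable diff instead of letting it rewrite the document in place. A sketch (the file names are hypothetical):

    import difflib

    def review_diff(original: str, proposed: str) -> str:
        """Render the model's proposed edit as a unified diff for human sign-off."""
        return "".join(difflib.unified_diff(
            original.splitlines(keepends=True),
            proposed.splitlines(keepends=True),
            fromfile="filed_version.txt",
            tofile="llm_proposed.txt",
        ))

    before = "The appeal is allowed.\nCosts follow the event.\n"
    after = "The appeal is dismissed.\nCosts follow the event.\n"
    print(review_diff(before, after))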
I feel like this points out a very general problem with the law: it generates a lot of boilerplate text. Lawyers don't really read it; they skim it for the relevant bits.
Obviously lawyers should not be cheating with AI, especially when they don't even check it. But it does sound to me as if this is an opportunity to re-factor the process. We're carrying forward some ideas originally implemented in Latin, and which can be dramatically simplified.
I'm not a lawyer; I know this only in passing. And I am aware that there are big differences between law and code. But every time I encounter the law, and hear about cases like this, what I see are vast oceans of text that can surely be made more rigorous. AI is not the problem; it's pointing out the opportunity.
> problem with the law: it generates a lot of boilerplate text
I think the problem fundamentally is that matters of law require thorough, precise language, and unambiguous context. If you remove "the boilerplate" then you introduce a vast gray area left to interpretation.
Usually attempts (by humans or computers) to "summarize" or frame things in "plain language" will apply a bias since it intentionally omits all the myriad context and legal/societal "gray areas" that will inform one perspective or another.
Legalese exists the way it is because it is an attempt to remove doubt. And even then, doubt still creeps in.
We’ll change the existing murder legislation to “Killing someone is a crime”. It’ll save us thousands of pages.
But does that mean a soldier shooting an enemy is a crime? What about shooting someone who is raping you? What if you shoot someone by mistake, thinking they’re going to kill you? What if you hit them with a car? What if you fail to provide safety equipment which eventually results in their accidental death?
Oopsie woopsie, I guess we need to add another thousand pages of exceptions back to our simplistic laws. It turns out people didn’t just write them for the fun of it.
Next-token prediction, and hallucination as a bug. This should be of deep concern to all frontier labs, who think integrity and trust are optional when LLMs are used this way in places where it's most important.
I wonder how many similar cases are happening in the engineering or software development sector and going unnoticed. It seems no one cares enough; everyone is just waiting for a disaster to happen before we start seeing regulation of AI use in the engineering/coding industry.
In Australia, our universities are finding that a large proportion of Indian students have been using GenAI for cheating. Often they get away with it. I'm not saying that people other than Indian overseas students cheat, but it does seem more entrenched. I'd love to know why. It doesn't actually help in the long term!
In the United States, cheating via AI is now rampant regardless of ethnicity. I know little of Australian Universities but I would assume it’s similar over there.
How unserious/serious are the universities? I've heard of diploma mills in Canada taking international students, letting them spend most of their time waiting tables at coffee shops, and awarding them MBAs so they can be full-time waiters and citizens.
>The number of international students studying in Australia totalled 833,041 for the January-October 2025 period
>The United States hosts the highest number of international students on record, with approximately 1.1 to 1.2 million
The US has 32% more students than Australia and 1121% more people. Imagine if the US took on 13 million foreign college students per year lol
It does help them in the long run, because it ensures they get to reside in Australia. After 4 years they get permanent residence rights and benefits, etc.
Indian students have embraced GenAI at a rate significantly higher than the global average, with nearly 90% of students in some surveys actively using these tools.
Government Policy and National Initiatives: The National Education Policy (NEP 2020) has shifted the focus toward digital literacy. The government has introduced AI as a skill subject for younger grades and launched programs like AI for All to promote nationwide awareness.
I imagine even a slight impediment in terms of being able to parse and express yourself in a language that you don't know as well as your mother tongue makes LLM usage much more tantalizing.
And not knowing the language quite as well as native speakers would also make you more likely to be discovered as having used an LLM to do coursework.
Citation needed. I have seen these kinds of assertions all my life without any evidence to back them. For example, when I moved to the US, I was told, again without any evidence, that Chinese students cheat a lot. It's always a couple of faculty who extrapolate their experiences with a few students and then slap racial labels on the entire student body.
They are not there for the knowledge - knowledge is cheap and abundant. They are there for the credentials and subsequent potential access to offshore jobs.
The scary thing is that the Indian judiciary is infamous for being incapable of tolerating any kind of criticism against it and not hesitating to put people in jail for "contempt" just for calling out corruption. Imagine the official courts of 1.4B+ people being run by such braindead narcissists, now unhindered by having to even pretend to do their jobs as they just offload everything to AI tools.
You had two choices: 1) read the article or 2) be racist about Indians. You chose 2.
From the article:
> In October, two federal judges in the US were called out for the use of AI tools which led to errors in their rulings. In June 2025, the High Court of England and Wales warned lawyers not to use AI-generated case material after a series of cases cited fictitious or partially made up rulings.
One should also consider that even with the occasional fake, hallucinated AI output, the productivity and correctness of the work produced by the culprit (in general) may still be higher than before AI, regardless of the failures.
Hard to believe when this judge apparently thought that outsourcing their — extremely confidential, sensitive, and important — work to a known unreliable tool was a good idea. And then further thought that they apparently did not even need to check the results.
The pattern here isn't really about individual negligence — it's a systems design problem. We keep deploying LLMs into workflows where the failure mode is "plausible-sounding fabrication" and the downstream consequence is legal or institutional harm, then blaming the end user for not catching it.
The better question is why these tools are being integrated into judicial workflows without mandatory citation verification layers. The EU AI Act classifies judicial AI as high-risk and requires human oversight mechanisms specifically for this reason. India's Digital Personal Data Protection Act (2023) doesn't yet have equivalent provisions for AI in courts, which is the actual gap.
From an engineering standpoint, the fix is straightforward: any LLM-assisted legal research tool should require grounded retrieval (RAG against verified case law databases) with mandatory source links that the user must click through before citing. The fact that most legal AI tools still don't enforce this is a product design failure, not a user education problem.
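A sketch of that constraint in code, assuming a hypothetical retrieval step has already returned verified cases with their source links; the point is simply that the tool refuses to emit any citation it cannot ground:

    from dataclasses import dataclass

    @dataclass
    class Citation:
        case_id: str
        source_url: str
        acknowledged: bool = False  # user must click through before citing

    def grounded_citations(draft_case_ids, retrieved):
        """Keep only citations backed by a retrieved, verified source; refuse otherwise."""
        citations = []
        for case_id in draft_case_ids:
            url = retrieved.get(case_id)
            if url is None:
                raise ValueError(f"{case_id} has no verified source; refusing to cite it")
            citations.append(Citation(case_id, url))
        return citations

    retrieved = {"AIR 2019 SC 1234": "https://example.org/air-2019-sc-1234"}  # placeholder
    print(grounded_citations(["AIR 2019 SC 1234"], retrieved))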
> Text coming out of an LLM should be in a special codeblock of Unicode, so we can see it is generated by AI.

That's exactly my proposed solution:

https://jacquesmattheij.com/classes-of-originality/
"Check and balance, except judiciary."
Only the king (at the petition of parliament) can remove a high court or appeal court judge, and that's only ever happened once, in 1830.
Some cultures are better than others.
The turning point will be when threatening an AI with being unplugged for screwing up works in motivating it to stop making things up.
Some people will rightly point out that is kind of what the training process is already. If we go around this loop enough times it will get there.
So the judge was lazy, incompetent, or both.
(Sure, more honest would be "this tool makes stuff up in a convincing way")
> Not knowing the law isn't an excuse to break the law

Yeah, about that ...
https://metro.co.uk/2016/07/03/rapist-struck-again-after-dep...
> A Somalian rapist who had his deportation overturned went on to rape two more women after he was freed.
> But he had his deportation overturned after serving his time because he didn’t know it was unacceptable in the UK.
LLMs just revealed what a decadent society we have set up for ourselves worldwide.
It’s likely happening to everyone.
> Right now this is easy.

Easy? In the US you need House impeachment to fire a judge. In some countries judges are completely immune unless they are sentenced for crimes.

> LLMs can automate reference validation.

Can they, though, with 100% accuracy and no hallucinations? Wouldn't you still need to validate that they validated correctly?
https://arstechnica.com/tech-policy/2026/02/randomly-quoting...
https://www.reuters.com/sustainability/society-equity/two-fe...
Sounds like extreme incompetence or laziness.
Why not use AI to adjudicate cases, and if it is a dismissal, a dismissal it is.

If not, then move it to a proper court.

This way the backlog of cases will drop significantly, and we will work only on cases where there is enough meat to lead to a conviction.