Large Language Models

From prophet to product: How AI came back down to earth in 2025

Following two years of immense hype in 2023 and 2024, this year felt more like a settling-in period for the LLM-based token prediction industry. After more than two years of public fretting over AI models as future threats to human civilization or the seedlings of future gods, it’s starting to look like hype is giving way to pragmatism: […]

How do AI coding agents work? We look under the hood.

AI coding agents from OpenAI, Anthropic, and Google can now work on software projects for hours at a time, writing complete apps, running tests, and fixing bugs with human supervision. But these tools are not magic and can complicate rather than simplify a software project. Understanding how they work under the hood can help developers

There is yet another AI productivity gap

When I first started as a data scientist, there was a gap. I met with dozens of organizations who would invest time and resources into building accurate and tuned models and then ask, “What now?” They had a fantastic model in hand but couldn’t get it into a place and […]

10 Most Downloaded Hugging Face Datasets and Their Use-cases

If you have ever trained a model, fine-tuned an LLM, or even experimented with AI on a weekend, chances are you have landed on Hugging Face. It has quietly become the GitHub of datasets – a place where developers, researchers, and data professionals go to build models and accelerate ideas. From code benchmarks and web-scale

OpenAI built an AI coding agent and uses it to improve the agent itself

With the popularity of AI coding tools rising among some software developers, their adoption has begun to touch every aspect of the process, including human developers using the tools to improve existing AI coding tools. We’re not talking about runaway self-improvement here; just people using tools to improve the tools themselves. In interviews with Ars

OpenAI releases GPT-5.2 after “code red” Google threat alert

On Thursday, OpenAI released GPT-5.2, its newest family of AI models for ChatGPT, in three versions called Instant, Thinking, and Pro. The release follows CEO Sam Altman’s internal “code red” memo earlier this month, which directed company resources toward improving ChatGPT in response to competitive pressure from Google’s Gemini 3 AI model. “We designed 5.2

How Confessions Can Keep Language Models Honest?

When a person admits they made a mistake, something surprising happens. The confession often restores trust rather than breaking it. People feel safer around someone who owns their errors than someone who hides them. Accountability builds confidence. What if AI models can do the same? Most AI systems give confident answers, even when they are

Using LLMs to create synthetic data and tangible progress in the public sector

People are starting to compile resolutions for the new year, focusing on evolving their own habits and goals. At SAS, we’ve also looked toward 2026 to gather predictions on how AI in the public sector might evolve over the next 12 months. Prediction: By 2026, governments will utilize large language […]

LLM Benchmarking, Reimagined: Put Human Judgment Back In

If you only look at automated scores, most LLMs seem great—until they write something subtly wrong, risky, or off-tone. That’s the gap between what static benchmarks measure and what your users actually need. In this guide, we show how to blend human judgment (HITL) with automation so your LLM benchmarking reflects truthfulness, safety, and domain

Role of Large Language Models in Powering Multilingual AI Virtual Assistants

Virtual assistants are progressing beyond simple question-and-answer formats to solving complex queries. Today, AI-driven virtual assistants communicate in multiple languages easily, and large language models, or LLMs, power this transformation. Now you can ask your device for restaurant recommendations in English and get an answer in Spanish. That’s what LLMs have made possible in recent