Technology

Google Introduces TurboQuant: A New Compression Algorithm that Reduces LLM Key-Value Cache Memory by 6x and Delivers Up to 8x Speedup, All with Zero Accuracy Loss

The scaling of Large Language Models (LLMs) is increasingly constrained by the overhead of moving data between High-Bandwidth Memory (HBM) and SRAM. Specifically, the Key-Value (KV) cache size scales with both model dimensions and context length, creating a significant bottleneck for long-context inference. A Google research team has proposed TurboQuant, a data-oblivious quantization framework designed to achieve near-optimal […]
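
To make the scaling claim concrete, here is a minimal sketch of the usual back-of-the-envelope KV cache estimate: two tensors (keys and values) per layer, each holding one vector per KV head per token. The function name `kv_cache_bytes` and the model shape are hypothetical, chosen only for illustration; nothing here reflects TurboQuant's actual method or bit widths.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bits_per_value=16):
    """Estimate the bytes needed to cache keys and values for every layer and token."""
    # Factor of 2 covers the two cached tensors: keys and values.
    values = 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size
    return values * bits_per_value // 8


# Hypothetical model shape (an assumption, not taken from the article).
config = dict(num_layers=32, num_kv_heads=8, head_dim=128)

for seq_len in (4_096, 32_768, 131_072):
    size = kv_cache_bytes(seq_len=seq_len, **config)
    print(f"{seq_len:>7} tokens: {size / 2**30:.2f} GiB at 16 bits per value")
```

The estimate grows linearly with context length and with each model dimension, and it shrinks in proportion to `bits_per_value`, which is why quantizing the cache targets this bottleneck directly.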

Meta ordered to pay $375m after being found liable in child exploitation case

New Mexico hails ‘historic’ win after jury finds firm misled consumers over safety and enabled harm against users

A New Mexico jury on Tuesday ordered Meta to pay $375m in civil penalties after it found the company misled consumers about the safety of its platforms and enabled harm, including child sexual exploitation, against its users. This is

OpenAI shutters AI video generator Sora after just six months

App that allowed people to make and share AI videos was popular but received criticism for racist and violent content

In an abrupt announcement on Tuesday, OpenAI said it was “saying goodbye” to its AI video generator Sora. The move comes just six months after the company’s splashy launch of a stand-alone app where people could

Paged Attention in Large Language Models (LLMs)

When running LLMs at scale, the real limitation is GPU memory rather than compute, mainly because each request requires a KV cache to store token-level data. In traditional setups, a large fixed memory block is reserved per request based on the maximum sequence length, which leads to significant unused space and limits concurrency. Paged Attention
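
The waste from per-request reservation can be illustrated with a small calculation. This is a sketch under stated assumptions: it compares reserving the maximum sequence length for every request against allocating fixed-size blocks on demand, which is how paged KV cache allocation is commonly described. The request lengths, `max_seq_len`, `block_size`, and both function names are hypothetical.

```python
import math


def reserved_tokens_fixed(request_lengths, max_seq_len):
    """Token slots reserved when every request gets a full max-length block."""
    return len(request_lengths) * max_seq_len


def reserved_tokens_paged(request_lengths, block_size):
    """Token slots reserved when each request only holds the blocks it needs."""
    return sum(math.ceil(n / block_size) * block_size for n in request_lengths)


# Hypothetical workload: four requests with very different lengths.
lengths = [120, 900, 35, 2048]
max_seq_len = 4096
block_size = 16  # assumed block size, for illustration only

used = sum(lengths)
fixed = reserved_tokens_fixed(lengths, max_seq_len)
paged = reserved_tokens_paged(lengths, block_size)

print(f"tokens actually used:      {used}")
print(f"fixed reservation:         {fixed} slots ({used / fixed:.0%} utilized)")
print(f"block-based reservation:   {paged} slots ({used / paged:.0%} utilized)")
```

With block-based allocation the only waste is the partially filled last block of each request, so the same memory can serve more concurrent requests.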

Anthropic and Pentagon face off in court over ban on company’s AI model

After Anthropic refused to let its AI be used in autonomous weapons systems, Trump ordered US agencies to quit using it

Anthropic is facing off against the Department of Defense in a federal court on Tuesday afternoon, as the artificial intelligence

Baltimore sues Elon Musk’s AI company over Grok’s fake nude images

Lawsuit argues xAI failed to disclose risks, limitations and exposure to harm that come with using chatbot

The mayor and city council of Baltimore, Maryland, filed a lawsuit against Elon Musk’s xAI company on Tuesday, alleging that its Grok chatbot violated consumer protections by generating nonconsensual sexualized images. Baltimore’s lawsuit argues that xAI deceptively marketed Grok as

This AI Paper Introduces TinyLoRA, A 13-Parameter Fine-Tuning Method That Reaches 91.8 Percent GSM8K on Qwen2.5-7B

Researchers from FAIR at Meta, Cornell University, and Carnegie Mellon University have demonstrated that large language models (LLMs) can learn to reason using a remarkably small number of trained parameters. The research team introduces TinyLoRA, a parameterization that can scale down to a single trainable parameter under extreme sharing settings. Using this method on a

Does your business English let you down? Turn it into pure corporate gibberish with LinkedIn Speak

Struggling to find the right buzzwords to adorn your CV, or to put a gloss on a series of professional setbacks? There’s a translation app for that

Name: LinkedIn Speak.

Age: One month old.

Kalshi and Polymarket ban insider trading as senators look to curb prediction markets

Top prediction market sites usher in new guardrails after senators introduced bill that could limit booming industry

Kalshi and Polymarket, the two biggest prediction market sites, rushed to institute new industry guardrails and add new surveillance tools on Monday after two key senators announced legislation that could severely curtail the industry’s prospects. Kalshi said it would ban

Divide between Silicon Valley and ordinary people grows ever larger

Big tech believes the future is AI while everyday Americans remain wary; and the dangers of riding in a Tesla Cybertruck

Hello, and welcome to TechScape. I’m your host, Blake Montgomery. This week in tech, we discuss a moment of divergence between Silicon Valley and everyday people; deep cuts at Meta to maximize spending on
