Red Hat, the global leader in open source software, has released llm-d, a new open source project designed to solve the key challenges of generative AI by running large AI models efficiently at scale. By combining Kubernetes with vLLM technology, llm-d enables fast, flexible, and cost-effective AI inference across a variety of clouds and hardware.
CoreWeave, Google Cloud, IBM Research, and NVIDIA are founding contributors to llm-d. Partners such as AMD, Cisco, Hugging Face, Intel, Lambda, and Mistral AI are also on board, and the project is supported by researchers at the University of California, Berkeley, who developed vLLM, and the University of Chicago, who developed LMCache.
A new era of flexible and scalable AI
Red Hat’s goal is clear: let enterprises run any model, on any accelerator, in any cloud, without being locked into expensive or complex systems. Just as Red Hat helped make Linux a standard for enterprises, it wants to make vLLM and llm-d the new standard for running AI inference at scale.
By building a strong and open community, Red Hat aims to make AI easier, faster, and more accessible.
What llm-d brings to the table
llm-d introduces a variety of new technologies to speed up and simplify AI workloads:
- vLLM integration: a widely adopted open source inference server that supports the latest AI models and runs on many hardware types, including Google Cloud TPUs (a basic usage sketch follows this list).
- Disaggregated prefill and decode: splits inference into two phases, prompt processing (prefill) and token generation (decode), and runs them on different machines to improve performance (sketched after the list).
- Smarter memory usage (KV cache offloading): moves cached attention data out of expensive GPU memory into inexpensive CPU or network memory using LMCache (see the tiered-cache sketch below).
- Efficient resource management with Kubernetes: balances compute and storage demands in real time to keep serving fast and smooth.
- AI-aware routing: sends requests to servers where the relevant data is already cached, speeding up responses (a minimal routing sketch follows the list).
- Faster data sharing between servers: moves data quickly between systems using high-performance transfer libraries such as NVIDIA’s NIXL.
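For context, here is what the vLLM engine that llm-d builds on looks like in basic use. This is a minimal offline-inference sketch using vLLM’s public Python API; the model name is just an example, and any vLLM-supported model would work.

```python
from vllm import LLM, SamplingParams

# Load a model into the vLLM engine (model name is an example).
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain Kubernetes in one sentence."], params)

for output in outputs:
    print(output.outputs[0].text)
```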
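The prefill/decode split can be illustrated with a self-contained toy sketch. This is not llm-d’s actual code; all names are hypothetical. It only shows why the two phases have different resource profiles and can therefore run on separate pools of machines.

```python
# Toy illustration of disaggregated serving (not llm-d's real implementation).
# Prefill processes the whole prompt in one batched pass and produces a KV
# cache; decode then generates tokens one at a time against that cache.

def prefill(prompt_tokens: list[int]) -> list[tuple[int, int]]:
    """Compute-bound phase: one pass over all prompt tokens.
    Returns a stand-in 'KV cache' (here just token/position pairs)."""
    return [(tok, pos) for pos, tok in enumerate(prompt_tokens)]

def decode(kv_cache: list[tuple[int, int]], max_new_tokens: int) -> list[int]:
    """Memory-bandwidth-bound phase: generate one token per step,
    extending the cache after each step."""
    generated = []
    for _ in range(max_new_tokens):
        # A real model would attend over kv_cache here; we fake a token.
        next_token = sum(tok for tok, _ in kv_cache) % 50_000
        kv_cache.append((next_token, len(kv_cache)))
        generated.append(next_token)
    return generated

# In a disaggregated deployment, prefill() runs on one pool, the cache is
# shipped to another machine (e.g. via NIXL), and decode() runs there.
cache = prefill([101, 2054, 2003, 102])
print(decode(cache, max_new_tokens=5))
```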
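The idea behind KV cache offloading can be sketched as a two-tier cache. This is a hypothetical illustration in the spirit of LMCache, not its real API: hot entries stay in scarce GPU memory, and cold entries spill to cheap CPU memory instead of being discarded and recomputed. The slot count and names are illustrative.

```python
from collections import OrderedDict

GPU_SLOTS = 2  # illustrative capacity of the "hot" GPU tier

class TieredKVCache:
    """Hypothetical two-tier KV cache: GPU (hot, bounded) + CPU (cold, spill)."""

    def __init__(self):
        self.gpu = OrderedDict()  # hot tier, LRU-ordered
        self.cpu = {}             # cold tier, offload target

    def put(self, prefix_hash: str, kv_blocks) -> None:
        self.gpu[prefix_hash] = kv_blocks
        self.gpu.move_to_end(prefix_hash)
        while len(self.gpu) > GPU_SLOTS:
            evicted, blocks = self.gpu.popitem(last=False)
            self.cpu[evicted] = blocks  # offload instead of discarding

    def get(self, prefix_hash: str):
        if prefix_hash in self.gpu:
            self.gpu.move_to_end(prefix_hash)
            return self.gpu[prefix_hash]
        if prefix_hash in self.cpu:
            # Promote from CPU back to GPU; much cheaper than re-running prefill.
            self.put(prefix_hash, self.cpu.pop(prefix_hash))
            return self.gpu[prefix_hash]
        return None  # true miss: prefill must be recomputed

cache = TieredKVCache()
cache.put("prefix-1", "kv-blocks-1")
cache.put("prefix-2", "kv-blocks-2")
cache.put("prefix-3", "kv-blocks-3")  # evicts prefix-1 to the CPU tier
print(cache.get("prefix-1"))          # promoted back: "kv-blocks-1"
```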
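Finally, AI-aware routing can be approximated with a prefix-hash scheme: requests that share a prompt prefix (for example, the same system prompt) land on the same replica, so its existing KV cache for that prefix can be reused. llm-d’s actual scheduler is far richer than this; the replica names and prefix length here are illustrative.

```python
import hashlib

# Hypothetical sketch of cache-aware routing (not llm-d's real scheduler).
REPLICAS = ["llm-pod-a", "llm-pod-b", "llm-pod-c"]

def route(prompt: str, prefix_chars: int = 256) -> str:
    """Route requests sharing a prompt prefix to the same replica,
    so that replica's cached KV blocks for the prefix can be reused."""
    digest = hashlib.sha256(prompt[:prefix_chars].encode("utf-8")).digest()
    return REPLICAS[digest[0] % len(REPLICAS)]

system_prompt = "You are a helpful assistant. "
# Both requests share the system prompt, so both land on the same replica.
print(route(system_prompt + "Summarize this article."))
print(route(system_prompt + "Translate this to French."))
```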
Taken together, llm-d is a powerful new platform for running large AI models quickly and efficiently, helping businesses adopt AI without high costs or performance slowdowns.
Conclusion
With the release of llm-d, Red Hat takes a major step toward making generative AI practical and scalable. By combining the power of Kubernetes, vLLM, and advanced AI infrastructure strategies, llm-d enables businesses to run large language models more efficiently on any cloud, hardware, or environment. With strong industry support and a focus on open collaboration, Red Hat is not only solving the technical barriers of AI inference but also laying the foundation for a flexible, affordable, and standardized AI future.