ACCESS Newswire

Atlas Cloud Launches High-Efficiency AI Inference Platform, Outperforming DeepSeek

Developed with SGLang, Atlas Inference surpasses leading AI companies in throughput and cost, running DeepSeek V3 & R1 faster than DeepSeek themselves.

NEW YORK CITY, NEW YORK / ACCESS Newswire / May 28, 2025 / Atlas Cloud, the all-in-one AI competency center for training and deploying AI models, today announced the launch of Atlas Inference, an AI inference platform that dramatically reduces GPU and server requirements, enabling faster, more cost-effective deployment of large language models (LLMs).

Atlas Inference, co-developed with SGLang, an AI inference engine, maximizes GPU efficiency by processing more tokens faster and with less hardware. Measured against DeepSeek's published performance results, Atlas Inference's 12-node H100 cluster outperformed DeepSeek's reference implementation of its DeepSeek-V3 model while using two-thirds of the servers. The platform reduces infrastructure requirements and operational costs, and it targets hardware costs in particular, which represent up to 80% of AI operational expenses.

"We built Atlas Inference to fundamentally break down the economics of AI deployment," said Jerry Tang, Atlas CEO. "Our platform's ability to process 54,500 input tokens and 22,500 output tokens per second per node means businesses can finally make high-volume LLM services profitable instead of merely break-even. I believe this will have a significant ripple effect throughout the industry. Simply put, we're surpassing industry standards set by hyperscalers by delivering superior throughput with fewer resources."

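As a rough illustration of what the per-node figures in the quote imply, the following sketch multiplies them out for the 12-node cluster described above. It assumes the quoted per-node rates apply uniformly to every node and that throughput scales perfectly linearly; both are assumptions for the sake of the arithmetic, and the function names are hypothetical.

```python
# Per-node figures quoted in the release.
INPUT_TOKENS_PER_SEC_PER_NODE = 54_500
OUTPUT_TOKENS_PER_SEC_PER_NODE = 22_500
NODES = 12  # size of the H100 cluster described in the release
SECONDS_PER_DAY = 86_400


def cluster_throughput(per_node: int, nodes: int) -> int:
    """Aggregate tokens per second, ASSUMING perfectly linear scaling."""
    return per_node * nodes


def tokens_per_day(tokens_per_sec: int) -> int:
    """Tokens processed in 24 hours at a constant rate (an idealization)."""
    return tokens_per_sec * SECONDS_PER_DAY


input_total = cluster_throughput(INPUT_TOKENS_PER_SEC_PER_NODE, NODES)
output_total = cluster_throughput(OUTPUT_TOKENS_PER_SEC_PER_NODE, NODES)

print(f"input:  {input_total:,} tokens/s, {tokens_per_day(input_total):,} per day")
print(f"output: {output_total:,} tokens/s, {tokens_per_day(output_total):,} per day")
```

Under these assumptions the cluster would handle 654,000 input tokens and 270,000 output tokens per second. Real deployments rarely sustain peak rates around the clock, so the daily totals are an upper bound.
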
Atlas Inference's performance also exceeds that of major players such as Amazon, NVIDIA and Microsoft, delivering up to 2.1 times greater throughput on 12 nodes than competitors' larger setups. It maintains sub-5-second first-token latency and 100-millisecond inter-token latency with more than 10,000 concurrent sessions, so the experience holds up at scale. The platform's performance is driven by four key innovations:

  • Prefill/Decode Disaggregation: Separates compute-intensive operations from memory-bound processes to optimize efficiency

  • DeepExpert (DeepEP) Parallelism with Load Balancers: Ensures over 90% GPU utilization

  • Two-Batch Overlap Technology: Increases throughput by enabling larger batches and by using the compute and communication phases simultaneously

  • DisposableTensor Memory Models: Prevents crashes during long sequences for reliable operation
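
The two-batch overlap idea in the list above can be illustrated with a toy timing model. This is a sketch under stated assumptions, not Atlas Cloud's or SGLang's implementation: the function names, the two-resource model (one compute unit, one interconnect), and the fixed phase durations are all hypothetical. Each micro-batch alternates a compute phase and a communication phase, and one micro-batch may communicate while the other computes.

```python
def sequential_time(steps: int, compute: float, comm: float) -> float:
    """Two micro-batches processed back to back, with no overlap at all."""
    return 2 * steps * (compute + comm)


def overlapped_time(steps: int, compute: float, comm: float) -> float:
    """Two micro-batches sharing one compute unit and one interconnect.

    A micro-batch can only start computing once its previous communication
    phase has finished, and each resource serves one micro-batch at a time.
    """
    compute_free = 0.0   # time at which the compute unit is next available
    comm_free = 0.0      # time at which the interconnect is next available
    ready = [0.0, 0.0]   # time at which each micro-batch may compute again
    for _ in range(steps):
        for mb in (0, 1):
            start_compute = max(compute_free, ready[mb])
            end_compute = start_compute + compute
            compute_free = end_compute
            start_comm = max(comm_free, end_compute)
            end_comm = start_comm + comm
            comm_free = end_comm
            ready[mb] = end_comm
    return max(ready)


if __name__ == "__main__":
    steps, compute, comm = 4, 1.0, 1.0
    print("sequential:", sequential_time(steps, compute, comm))
    print("overlapped:", overlapped_time(steps, compute, comm))
```

In this model, with equal compute and communication times, four steps take 9 time units with overlap versus 16 without. The gain shrinks as one phase comes to dominate the other, since the busier resource becomes the bottleneck.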

"This platform represents a significant leap forward for AI inference," said Yineng Zhang, Core Developer at SGLang. "What we built here may become the new standard for GPU utilization and latency management. We believe this will unlock capabilities previously out of reach for the majority of the industry regarding throughput and efficiency."

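To put the latency figures quoted earlier in concrete terms, the sketch below estimates how long a single streamed response would take and what aggregate decode rate the concurrency figure implies. It treats 5 seconds and 100 milliseconds as fixed per-request values, which is a simplifying assumption (the release states them as bounds under load), and the function names are hypothetical.

```python
TTFT_MS = 5_000          # upper bound from "sub-5-second first-token latency"
INTER_TOKEN_MS = 100     # "100-millisecond inter-token latency"
SESSIONS = 10_000        # "more than 10,000 concurrent sessions"


def response_time_ms(output_tokens: int,
                     ttft_ms: int = TTFT_MS,
                     inter_token_ms: int = INTER_TOKEN_MS) -> int:
    """Time until the last token arrives: first token, then one gap per token."""
    if output_tokens < 1:
        raise ValueError("a response needs at least one output token")
    return ttft_ms + (output_tokens - 1) * inter_token_ms


def aggregate_decode_rate(sessions: int,
                          inter_token_ms: int = INTER_TOKEN_MS) -> int:
    """Tokens per second across all sessions if each emits one token per gap."""
    return sessions * 1_000 // inter_token_ms


print(response_time_ms(501) / 1_000, "seconds for a 501-token response")
print(aggregate_decode_rate(SESSIONS), "output tokens/s across all sessions")
```

Under these assumptions a 501-token response completes in at most 55 seconds, and 10,000 sessions streaming simultaneously correspond to 100,000 output tokens per second.
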
Combined with a lower cost per token, linear scaling behavior, and reduced emissions compared to leading vendors, these capabilities make Atlas Inference a cost-efficient and scalable option for AI deployment.

Atlas Inference works with standard hardware and supports custom models, giving customers complete flexibility. Teams can upload fine-tuned models and keep them isolated on dedicated GPUs, making the platform ideal for organizations requiring brand-specific voice or domain expertise.

The platform is available immediately for enterprise customers and early-stage startups.

About Atlas Cloud

Atlas Cloud is your all-in-one AI competency center, powering leading AI teams with safe, simple, and scalable infrastructure for training and deploying models. Atlas Cloud also offers an on-demand GPU platform that delivers fast, serverless compute. Backed by Dell, HPE, and Supermicro, Atlas delivers near-instant access to up to 5,000 GPUs across a global SuperCloud fabric with 99% uptime and baked-in compliance. Learn more at atlascloud.ai.

Contact Information

Jason Dotson
Head of Marketing
jason.dotson@atlascloud.ai
214-878-3807

SOURCE: Atlas Cloud



View the original press release on ACCESS Newswire:
https://www.accessnewswire.com/newsroom/en/computers-technology-and-internet/atlas-cloud-launches-high-efficiency-ai-inference-platform-outper-1032683

© 2025 ACCESS Newswire