QNAP has introduced its new AI NAS, which packs a 16-core AMD EPYC "Zen 2" CPU & can be paired with an NVIDIA RTX PRO 6000 Blackwell GPU.
Old Meets New At QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU
“QAI-h1290FX” is QNAP’s latest Edge AI offering that combines two different hardware components. But before we get into the details, it’s worth mentioning that QNAP’s latest AI NAS is designed for LLM, RAG, and various GenAI applications.
Two components power the server. The first is the AMD EPYC 7302P CPU, which has 16 cores and 32 threads. The chip is based on the Zen 2 core architecture and offers enough performance to handle AI inference tasks at the edge.
The second component is the GPU, of which QNAP offers two options: a 32 GB RTX PRO 4500 Blackwell or NVIDIA's flagship 96 GB RTX PRO 6000 Blackwell. Both offer massive computing power for AI, with the PRO 4500 aimed at LLMs of around 30 billion parameters, while the PRO 6000 is suited to models of 70 billion parameters and above.
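A rough rule of thumb helps explain the pairing of model sizes to VRAM capacities above (this is a generic estimate, not QNAP's sizing method): a quantized LLM's weights occupy roughly parameters × bits-per-weight ÷ 8 bytes, plus headroom for the KV cache and activations.

```python
def weight_vram_gb(params_billion, bits_per_weight=4):
    """Approximate VRAM needed for model weights alone, in GB.

    Assumes a uniform quantization level (e.g. 4-bit, as in the
    q4_K_M quants benchmarked later in this article); real usage
    adds KV cache and runtime overhead on top.
    """
    return params_billion * bits_per_weight / 8

# A ~30B model at 4-bit needs ~15 GB of weights -> fits the 32 GB RTX PRO 4500.
print(weight_vram_gb(30))  # 15.0
# A 70B model at 4-bit needs ~35 GB of weights -> comfortable on the 96 GB RTX PRO 6000.
print(weight_vram_gb(70))  # 35.0
```

The ~41 GB VRAM figure QNAP reports for the 70B q4_K_M model in the table below is consistent with this estimate once cache overhead is included.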
Apart from the CPU and GPU, the QNAP QAI-h1290FX provides support for 12 NVMe/SATA U.2 SSDs, as well as high-speed networking in the form of dual 25GbE and dual 2.5GbE LAN ports. The PCIe slot can also carry an additional 100GbE AIC, although it is sold separately. The NAS is also compatible with QNAP's JBOD expansion enclosures for large-scale AI data storage.
Key NAS features include:
- All-Flash Storage Architecture: Twelve U.2 NVMe/SATA SSD slots enable ultra-fast I/O for high-frequency AI model execution and data streaming.
- AMD EPYC 7302P 16-core Processor: Provides 32 threads of server-class computing power—ideal for AI inference, virtualization, and heavy parallel workloads.
- GPU-Ready Architecture: Supports optional NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation GPU, featuring up to 96 GB of GPU memory and support for CUDA, Tensor, and Transformer Engine acceleration—significantly improving performance for local LLM inference, image generation, and deep learning workloads.
- AI Environment in Containers & GPU Resource Management: Supports Docker and LXD with intuitive GPU allocation. Users can quickly launch AI tools through the built-in AI application center and assign GPU resources without command line configuration.
- Completely Local Deployment without Cloud Dependencies: Run an AI-powered chat assistant, document search engine, or knowledge base completely on-premises. Keep sensitive data in-house while accelerating AI workflows.
- High Speed Network and Scalable Architecture: Equipped with dual 25GbE and dual 2.5GbE ports. PCIe slot supports optional 100GbE upgrade. Compatible with QNAP JBOD expansion enclosures for large-scale AI data storage.
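The container-based GPU allocation described above maps onto Docker's standard NVIDIA passthrough mechanism. As a hedged sketch (the image name and port are illustrative; QNAP's AI application center handles this through its GUI rather than the command line), launching a GPU-enabled inference container might be scripted like this:

```python
def gpu_container_cmd(image, gpus="all", port=11434):
    """Build a `docker run` argument list that exposes host GPUs.

    `--gpus all` is Docker's standard flag for NVIDIA GPU
    passthrough and requires the NVIDIA Container Toolkit on the
    host. The port default matches Ollama's usual service port.
    """
    return [
        "docker", "run", "-d",
        "--gpus", gpus,          # "all", or e.g. "device=0" for one GPU
        "-p", f"{port}:{port}",  # publish the inference service port
        image,
    ]

# Example: run the public Ollama image with full GPU access.
print(" ".join(gpu_container_cmd("ollama/ollama")))
```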
QNAP also shared some real-world performance figures for its new AI NAS using the NVIDIA RTX PRO 6000 Blackwell 96 GB GPU. Testing covers a variety of models of different sizes, reaching up to 172 tokens/sec. The results can be seen below:
| Model | Tokens/sec | VRAM usage |
|---|---|---|
| gpt-oss:120b (MXFP4) | 90 Tokens/sec | ~63GB |
| deepseek-r1:70b (q4_K_M) | 24 Tokens/sec | ~41GB |
| qwen3:32b (q4_K_M) | 46 Tokens/sec | ~21GB |
| gemma3:27b (q4_K_M) | 54 Tokens/sec | ~19GB |
| deepseek-r1:8b (q4_K_M) | 140 Tokens/sec | ~7GB |
| qwen3:8b (q4_K_M) | 172 Tokens/sec | ~7GB |
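Per-model tokens/sec figures like those above can be derived directly from Ollama's API metrics: the `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so throughput is simply their ratio. A minimal sketch, assuming those two response fields:

```python
def tokens_per_sec(eval_count, eval_duration_ns):
    """Generation throughput from Ollama's response metrics.

    eval_count:       number of tokens generated
    eval_duration_ns: time spent generating, in nanoseconds
    """
    return eval_count / (eval_duration_ns / 1e9)

# e.g. 90 tokens generated over 1 second of eval time -> 90 tokens/sec
print(tokens_per_sec(90, 1_000_000_000))
```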
In addition to LLMs running natively via Ollama, QNAP also shared a vLLM concurrent inference test for the same configuration, with results given below:
Large Language Model Tested: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B (Hugging Face)
| Threads | Total Tokens/sec | Avg Tokens/sec per Thread |
|---|---|---|
| 1 | 79 Tokens/sec | 79 Tokens/sec |
| 2 | 166 Tokens/sec | 83 Tokens/sec |
| 5 | 410 Tokens/sec | 82 Tokens/sec |
| 10 | 688 Tokens/sec | 68.8 Tokens/sec |
| 20 | 810 Tokens/sec | 40.5 Tokens/sec |
| 50 | 850 Tokens/sec | 17 Tokens/sec |
Large Language Model Tested: openai/gpt-oss-20b (Hugging Face)
| Threads | Total Tokens/sec | Avg Tokens/sec per Thread |
|---|---|---|
| 1 | 218 Tokens/sec | 218 Tokens/sec |
| 2 | 340 Tokens/sec | 170 Tokens/sec |
| 5 | 1045 Tokens/sec | 209 Tokens/sec |
| 10 | 880 Tokens/sec | 88 Tokens/sec |
| 20 | 600 Tokens/sec | 30 Tokens/sec |
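The "avg per thread" column in both vLLM tables is just the total throughput divided by the thread count, and the numbers show the usual batching pattern: aggregate throughput climbs and then saturates, while per-thread rates fall as concurrency grows. The arithmetic can be checked against the rows above:

```python
def per_thread_tps(total_tps, threads):
    """Average per-thread throughput from aggregate throughput."""
    return total_tps / threads

# Checking rows from the DeepSeek-R1-Distill-Qwen-7B table:
print(per_thread_tps(810, 20))  # 40.5, matching the 20-thread row
print(per_thread_tps(688, 10))  # 68.8, matching the 10-thread row
```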
QNAP offers a wide range of storage, networking, and interface expansion cards, which can be purchased separately to expand the AI server's capabilities. RAM is also sold separately, with options ranging from 8 GB DDR4-3200 modules to 64 GB DDR4-3200 kits. The system comes with a 5-year warranty and is priced at $8,999 for the 64 GB, $13,499 for the 128 GB, and $15,999 for the 256 GB variant.
Follow Wccftech on Google to get more of our news coverage in your feed.