QNAP has introduced its new AI NAS, which packs a 16-core AMD EPYC "Zen 2" CPU & can be paired with an NVIDIA RTX PRO 6000 Blackwell GPU.
Old Meets New At QNAP's "QAI-h1290FX" AI NAS: 16 Zen 2 Cores & 96 GB RTX PRO 6000 GPU
“QAI-h1290FX” is QNAP’s latest Edge AI offering that combines two different hardware components. But before we get into the details, it’s worth mentioning that QNAP’s latest AI NAS is designed for LLM, RAG, and various GenAI applications.
Two components power the server. The first is the AMD EPYC 7302P CPU, which has 16 cores and 32 threads. The chip is based on the Zen 2 core architecture and offers enough performance to handle AI inference tasks at the edge.
The second component is the GPU, of which QNAP offers two options: a 32 GB RTX PRO 4500 Blackwell or NVIDIA's flagship 96 GB RTX PRO 6000 Blackwell. Both offer massive computing power for AI, with the PRO 4500 aimed at LLMs of around 30 billion parameters, while the PRO 6000 is suited to models of 70 billion parameters and above.
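A rough rule of thumb helps explain the pairing of model sizes to VRAM capacities above (this is a generic estimate, not QNAP's sizing method): a quantized LLM's weights occupy roughly parameters × bits-per-weight ÷ 8 bytes, plus headroom for the KV cache and activations.

```python
def weight_vram_gb(params_billion, bits_per_weight=4):
    """Approximate VRAM needed for model weights alone, in GB.

    Assumes a uniform quantization level (e.g. 4-bit, as in the
    q4_K_M quants benchmarked later in this article); real usage
    adds KV cache and runtime overhead on top.
    """
    return params_billion * bits_per_weight / 8

# A ~30B model at 4-bit needs ~15 GB of weights -> fits the 32 GB RTX PRO 4500.
print(weight_vram_gb(30))  # 15.0
# A 70B model at 4-bit needs ~35 GB of weights -> comfortable on the 96 GB RTX PRO 6000.
print(weight_vram_gb(70))  # 35.0
```

The ~41 GB VRAM figure QNAP reports for the 70B q4_K_M model in the table below is consistent with this estimate once cache overhead is included.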
Apart from the CPU and GPU, the QNAP QAI-h1290FX provides support for 12 NVMe/SATA U.2 SSDs, as well as high-speed networking in the form of dual 25GbE and dual 2.5GbE LAN ports. The PCIe slot can also carry an additional 100GbE AIC, although it is sold separately. The NAS is also compatible with QNAP's JBOD expansion enclosures for large-scale AI data storage.
Key NAS features include:
- All-Flash Storage Architecture: Twelve U.2 NVMe/SATA SSD slots enable ultra-fast I/O for high-frequency AI model execution and data streaming.
- AMD EPYC 7302P 16-core Processor: Provides 32 threads of server-class computing power—ideal for AI inference, virtualization, and heavy parallel workloads.
- GPU-Ready Architecture: Supports optional NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation GPU, featuring up to 96 GB of GPU memory and support for CUDA, Tensor, and Transformer Engine acceleration—significantly improving performance for local LLM inference, image generation, and deep learning workloads.
- AI Environment in Containers & GPU Resource Management: Supports Docker and LXD with intuitive GPU allocation. Users can quickly launch AI tools through the built-in AI application center and assign GPU resources without command line configuration.
- Completely Local Deployment without Cloud Dependencies: Run an AI-powered chat assistant, document search engine, or knowledge base completely on-premises. Keep sensitive data in-house while accelerating AI workflows.
- High Speed Network and Scalable Architecture: Equipped with dual 25GbE and dual 2.5GbE ports. PCIe slot supports optional 100GbE upgrade. Compatible with QNAP JBOD expansion enclosures for large-scale AI data storage.
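The container-based GPU allocation described above maps onto Docker's standard NVIDIA passthrough mechanism. As a hedged sketch (the image name and port are illustrative; QNAP's AI application center handles this through its GUI rather than the command line), launching a GPU-enabled inference container might be scripted like this:

```python
def gpu_container_cmd(image, gpus="all", port=11434):
    """Build a `docker run` argument list that exposes host GPUs.

    `--gpus all` is Docker's standard flag for NVIDIA GPU
    passthrough and requires the NVIDIA Container Toolkit on the
    host. The port default matches Ollama's usual service port.
    """
    return [
        "docker", "run", "-d",
        "--gpus", gpus,          # "all", or e.g. "device=0" for one GPU
        "-p", f"{port}:{port}",  # publish the inference service port
        image,
    ]

# Example: run the public Ollama image with full GPU access.
print(" ".join(gpu_container_cmd("ollama/ollama")))
```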
QNAP also shared some real-world performance figures for its new AI NAS using the NVIDIA RTX PRO 6000 Blackwell 96 GB GPU. Testing covers a variety of models of different sizes, reaching up to 172 tokens/sec. The results can be seen below:
| Model | Tokens/sec | VRAM usage |
|---|---|---|
| gpt-oss:120b (MXFP4) | 90 Tokens/sec | ~63GB |
| deepseek-r1:70b (q4_K_M) | 24 Tokens/sec | ~41GB |
| qwen3:32b (q4_K_M) | 46 Tokens/sec | ~21GB |
| gemma3:27b (q4_K_M) | 54 Tokens/sec | ~19GB |
| deepseek-r1:8b (q4_K_M) | 140 Tokens/sec | ~7GB |
| qwen3:8b (q4_K_M) | 172 Tokens/sec | ~7GB |
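Per-model tokens/sec figures like those above can be derived directly from Ollama's API metrics: the `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so throughput is simply their ratio. A minimal sketch, assuming those two response fields:

```python
def tokens_per_sec(eval_count, eval_duration_ns):
    """Generation throughput from Ollama's response metrics.

    eval_count:       number of tokens generated
    eval_duration_ns: time spent generating, in nanoseconds
    """
    return eval_count / (eval_duration_ns / 1e9)

# e.g. 90 tokens generated over 1 second of eval time -> 90 tokens/sec
print(tokens_per_sec(90, 1_000_000_000))
```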
In addition to LLMs running natively via Ollama, QNAP also shared a vLLM concurrent inference test for the same configuration, with results given below:
Large Language Model Tested: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B (Hugging Face)
| Threads | Total Tokens/sec | Avg Tokens/sec per Thread |
|---|---|---|
| 1 | 79 Tokens/sec | 79 Tokens/sec |
| 2 | 166 Tokens/sec | 83 Tokens/sec |
| 5 | 410 Tokens/sec | 82 Tokens/sec |
| 10 | 688 Tokens/sec | 68.8 Tokens/sec |
| 20 | 810 Tokens/sec | 40.5 Tokens/sec |
| 50 | 850 Tokens/sec | 17 Tokens/sec |
Large Language Model Tested: openai/gpt-oss-20b (Hugging Face)
| Threads | Total Tokens/sec | Avg Tokens/sec per Thread |
|---|---|---|
| 1 | 218 Tokens/sec | 218 Tokens/sec |
| 2 | 340 Tokens/sec | 170 Tokens/sec |
| 5 | 1045 Tokens/sec | 209 Tokens/sec |
| 10 | 880 Tokens/sec | 88 Tokens/sec |
| 20 | 600 Tokens/sec | 30 Tokens/sec |
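The "avg per thread" column in both vLLM tables is just the total throughput divided by the thread count, and the numbers show the usual batching pattern: aggregate throughput climbs and then saturates, while per-thread rates fall as concurrency grows. The arithmetic can be checked against the rows above:

```python
def per_thread_tps(total_tps, threads):
    """Average per-thread throughput from aggregate throughput."""
    return total_tps / threads

# Checking rows from the DeepSeek-R1-Distill-Qwen-7B table:
print(per_thread_tps(810, 20))  # 40.5, matching the 20-thread row
print(per_thread_tps(688, 10))  # 68.8, matching the 10-thread row
```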
QNAP offers a wide range of storage, networking, and interface expansion cards, which can be purchased separately to expand the AI server's capabilities. RAM is also sold separately, with options ranging from 8 GB DDR4-3200 modules to 64 GB DDR4-3200 kits. The system comes with a 5-year warranty and is priced at $8,999 for the 64 GB, $13,499 for the 128 GB, and $15,999 for the 256 GB variant.
Follow Wccftech on Google to get more of our news coverage in your feed.