
Neuronspike Moore chip

Meet the world's fastest LLM chip! The Neuronspike Moore chip uses a compute-in-memory architecture optimized for memory speed to deliver extremely high throughput.

Neuronspike Moore delivers up to 650 tokens/s on LLM workloads, with up to 3x higher energy efficiency and up to 8x lower price.


01

Breakthrough acceleration

The compute-in-memory processing unit (CMPU®) removes the memory-wall limitation and opens up a new computing paradigm. The CMPU® is well-suited for memory-bound computations such as generative AI.
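Why generative AI is memory-bound can be seen with a roofline-style estimate: during autoregressive decoding, every generated token requires streaming all model weights from memory, so throughput is capped by memory bandwidth rather than peak compute. A minimal sketch (the model size and bandwidth figures below are illustrative assumptions, not Neuronspike specifications):

```python
# Roofline-style estimate: in autoregressive decoding (batch size 1),
# every token requires reading all model weights once, so the token
# rate is bounded by memory bandwidth rather than peak FLOP/s.

def max_tokens_per_sec(param_count: float, bytes_per_param: float,
                       mem_bandwidth_gbs: float) -> float:
    """Upper bound on tokens/s when decoding is memory-bound."""
    bytes_per_token = param_count * bytes_per_param  # weights streamed per token
    return mem_bandwidth_gbs * 1e9 / bytes_per_token

# Hypothetical example: a 7B-parameter model with 8-bit weights on a
# device with 1 TB/s of memory bandwidth.
bound = max_tokens_per_sec(7e9, 1.0, 1000.0)
print(f"Memory-bound decode limit: {bound:.0f} tokens/s")
```

This is why raising effective memory speed, as compute-in-memory does, translates directly into more tokens per second.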

02

Versatile architecture

Neuronspike Moore is well-suited for any transformer-based AI model, including text-to-text, text-to-image, and text-to-video, or any other AI model that uses a self-attention mechanism.
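These models share the same core operation, which is why one architecture can serve them all. A minimal NumPy sketch of single-head scaled dot-product self-attention (illustrative only, not the chip's implementation):

```python
import numpy as np

def self_attention(x: np.ndarray, wq, wk, wv) -> np.ndarray:
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = x @ wq, x @ wk, x @ wv             # project to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])      # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                           # weighted sum of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.standard_normal((seq_len, d_model))
w = [rng.standard_normal((d_model, d_model)) for _ in range(3)]
out = self_attention(x, *w)
print(out.shape)  # (4, 8)
```

Every step is a matrix multiply or an elementwise operation over the sequence, i.e. exactly the memory-bound workload a compute-in-memory design targets.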


03

Low power, superior performance

Neuronspike Moore supports up to 180 GB of memory capacity and mixed-precision multiply-add computations, and offers up to 650 tokens/s throughput on a single chip. The CMPU® is powered by low-power ReRAM technology, making Neuronspike Moore highly energy-efficient.
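To put 180 GB of capacity in context, a back-of-the-envelope check of which models fit on a single chip (the parameter counts and precisions here are illustrative assumptions, not vendor figures):

```python
def fits_on_chip(param_count: float, bytes_per_param: float,
                 capacity_gb: float = 180.0) -> bool:
    """Whether a model's weights fit in the chip's on-device memory."""
    return param_count * bytes_per_param <= capacity_gb * 1e9

# Hypothetical examples:
print(fits_on_chip(70e9, 1.0))   # 70B params, 8-bit weights: 70 GB
print(fits_on_chip(180e9, 2.0))  # 180B params, fp16 weights: 360 GB
```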

04

Scalable design

The Neuronspike Moore chip is designed to scale: two or more chips can be combined to deliver server-class compute and to train extremely large generative AI models.
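Multi-chip scaling of this kind is commonly realized with tensor parallelism, where a layer's weight matrix is split across devices and the partial results are gathered. A toy NumPy sketch of the idea (a generic technique, not a description of Neuronspike's interconnect):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((2, 6))   # activations
w = rng.standard_normal((6, 4))   # full weight matrix of one layer

# Split the weight columns across two "chips"; each computes a slice.
w_chip0, w_chip1 = np.hsplit(w, 2)
y0 = x @ w_chip0                  # computed on chip 0
y1 = x @ w_chip1                  # computed on chip 1

# Gathering the slices reproduces the single-chip result.
y = np.concatenate([y0, y1], axis=1)
assert np.allclose(y, x @ w)
```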
