
Neuronspike Moore chip

Meet the world's fastest LLM chip! The Neuronspike chip uses a compute-in-memory architecture optimized for memory bandwidth to deliver extremely high throughput.

Neuronspike Moore delivers up to 650 tokens/s on LLM workloads, with up to 3x higher energy efficiency and up to 8x lower cost.



Breakthrough acceleration

The compute-in-memory processing unit (CMPU®) removes the memory-wall limitation and opens up a new computing paradigm. The CMPU® is well suited to memory-bound computations such as generative AI.
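To see why generative AI is memory-bound, consider single-batch LLM decoding: every model parameter must be streamed from memory for each generated token, so throughput is capped by memory bandwidth rather than compute. The back-of-envelope sketch below illustrates this; the model size and bandwidth figures are illustrative assumptions, not Neuronspike specifications.

```python
def decode_tokens_per_sec(params_billion: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode throughput for a dense model:
    each parameter is read from memory once per generated token."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical 7B-parameter model in FP16 (2 bytes per parameter):
ddr = decode_tokens_per_sec(7, 2, 100)    # ~100 GB/s DDR-class bus
hbm = decode_tokens_per_sec(7, 2, 2000)   # ~2 TB/s HBM-class stack

print(f"DDR-class: {ddr:.1f} tok/s, HBM-class: {hbm:.1f} tok/s")
```

With the same compute, raising effective memory bandwidth raises token throughput almost linearly, which is the bottleneck compute-in-memory designs target.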


Versatile architecture

Neuronspike Moore is well suited to any transformer-based AI model, including text-to-text, text-to-image, and text-to-video, as well as any other AI model that uses a self-attention mechanism.
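Self-attention, the operation shared by all of these models, can be stated compactly. The sketch below is the standard textbook scaled dot-product formulation (single head, no masking) in NumPy, shown only to make the common workload concrete; it is not Neuronspike code.

```python
import numpy as np

def self_attention(x: np.ndarray, wq: np.ndarray,
                   wk: np.ndarray, wv: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention over a sequence x of shape
    (seq_len, d_model); single head, no masking."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq, seq) similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # weighted mix of values

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((4, d))
out = self_attention(x, *(rng.standard_normal((d, d)) for _ in range(3)))
print(out.shape)
```

The matrix multiplies dominating this operation are exactly the memory-heavy kernels the CMPU® is designed to accelerate.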



Low power, superior performance

Neuronspike Moore supports up to 180 GB of memory capacity and mixed-precision multiply-add computation, and delivers up to 650 tokens/s of throughput on a single chip. The CMPU® is built on low-power ReRAM technology, making Neuronspike Moore highly energy-efficient.


Scalable design

The Neuronspike Moore chip is designed to scale: two or more chips can be combined for server-class compute and for training extremely large generative AI models.
