NuPIC by Numenta – a deep learning system by sophisticated people.

At the time of writing, our knowledge of NuPIC is limited; these are mainly talk notes and first thoughts on the NuPIC session at OSCON (O'Reilly's Open Source Convention) in 2013.
We are publishing the notes because we think NuPIC is a great system and like its technical and organisational position. Please form your own impression of NuPIC.


About the Theory

It is basically based on “On Intelligence” by Jeff Hawkins (and Sandra Blakeslee), which is largely about understanding the brain’s cortical learning algorithm in order to build a machine learning system as close to the cortex as possible. So it is, in a way, a very deep learning system, based on simulating a part of the brain.
They state that it takes a high-capacity stream of sequence data as input and builds patterns from it, including fault tolerance. The patterns are spatial as well as temporal.

Sparse Distributed Representation

The patterns are based on ‘bits’. Of course everything is digital, but information is usually recognized for pattern creation at a very high level of abstraction: for example, an ASCII file and a Unicode file can contain the same letters (e.g. ‘Hello Eliza’), yet their bit representations would be totally different. NuPIC evaluates all data as a bit stream and is therefore even robust against bit errors (1000 is similar to 0000).
That means every bit in the system represents a cell of the brain. If the data changes over time and old patterns become obsolete, the system will figure that out too. It works best with machine-generated data with timestamps, for online sequence learning.
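As a rough illustration (our own sketch, not NuPIC’s actual data structures or API), the robustness of sparse distributed representations can be pictured as bit overlap between binary vectors – a pattern with a few flipped bits still overlaps heavily with the original, while an unrelated pattern barely overlaps at all:

```python
# Sketch: similarity between sparse distributed representations (SDRs)
# measured as the overlap of active bits. Illustrative only.

def overlap(a, b):
    """Count positions where both SDRs have an active (1) bit."""
    return sum(x & y for x, y in zip(a, b))

sdr   = [0, 1, 0, 0, 1, 0, 1, 0]
noisy = [0, 1, 0, 0, 1, 0, 0, 0]   # one active bit flipped off
other = [1, 0, 1, 1, 0, 0, 0, 0]   # unrelated pattern

high = overlap(sdr, noisy)  # large overlap -> still recognized
low  = overlap(sdr, other)  # small overlap -> different pattern
```

In a realistic SDR the vectors are much longer and only a few percent of the bits are active, which is what makes this overlap test so noise-tolerant.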

Some thoughts by ai-claudio:

First, at A.I. Claudio we are heavily focussed on natural language processing, so all our data is written text, possibly including annotations. The question is whether transforming the input data makes sense. The text may arrive as Unicode, which means many more bytes than necessary. A bit representation of letters, or even of whole words and expressions, could be much smaller – reducing the computational effort by a two-digit percentage (even 6 bits instead of 8 or 16 per letter would be a lot).
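A quick back-of-the-envelope sketch (our own idea, not NuPIC code) of the 6-bit estimate: packing a small lower-case alphabet into 6 bits per symbol instead of 8-bit ASCII already saves a quarter of the stream:

```python
# Sketch (ai-claudio idea, not NuPIC code): encode text with 6 bits
# per symbol instead of 8 (ASCII) or 16 (some Unicode encodings).

ALPHABET = "abcdefghijklmnopqrstuvwxyz .,!?'"  # 32 symbols; 6 bits leaves headroom
CODE = {ch: i for i, ch in enumerate(ALPHABET)}

def encode6(text):
    """Return the text as a bit string, 6 bits per character."""
    return "".join(format(CODE[ch], "06b") for ch in text)

msg = "hello eliza"
bits6  = len(encode6(msg))   # 11 chars * 6 bits = 66 bits
bits8  = len(msg) * 8        # 11 chars * 8 bits = 88 bits -> 25% smaller
```

The saving against 16-bit-per-letter encodings would be larger still – which is exactly the two-digit percentage mentioned above.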

Another question: would the data-reduction algorithms out there have a positive or negative impact? A regular ‘ZIP’ or ‘RAR’ probably a bad one, as it hides the ‘sub-’patterns of a data set.
But an imaginary “dictionary words into an unbalanced binary tree by frequency” data-reduction algorithm (by ai-claudio) could improve results considerably, as it cuts out a lot of noise. (Or is this noise needed as a buffer? In reality the eyes see much more than a TV screen shows, but can still focus well – sometimes.)
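That hypothetical “unbalanced binary tree by frequency” is, in effect, Huffman coding over a word dictionary. A minimal sketch of the idea (ours, purely illustrative – frequent words end up near the top of the tree and get short codes):

```python
# Sketch of the hypothetical "words into an unbalanced binary tree
# by frequency" idea: in effect, Huffman coding over a word dictionary.
import heapq
from collections import Counter

def word_codes(text):
    """Build prefix codes: frequent words get shorter bit strings."""
    freq = Counter(text.split())
    # Each heap entry: (frequency, tie-breaker, {word: code-so-far})
    heap = [(n, i, {w: ""}) for i, (w, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)   # two rarest subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {w: "0" + c for w, c in c1.items()}
        merged.update({w: "1" + c for w, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, i, merged))
        i += 1
    return heap[0][2]

codes = word_codes("the cat sat on the mat the cat")
# "the" is the most frequent word, so its code is among the shortest.
assert len(codes["the"]) < len(codes["sat"])
```

Whether feeding such pre-compressed codes into NuPIC helps or hurts is exactly the open question above – the short codes remove redundancy, but that redundancy might be the “buffer” the learner needs.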

Second, how important is it that the input data does not change its ‘format’, e.g. learning on ASCII (or lower-case letters only) when suddenly new data arrives in Unicode (or capitalized)?

They name the following possible future applications:

  • Vision
  • Robotics
  • Audio
  • Natural language processing
  • Supply / Demand
  • IT (which is a huge field, right??)
  • Automated actions

Numenta is currently (in 2013) focussing on the fields of energy consumption (Grok), automation and ‘IT’ ;-).

The Software

  • in Python with some C++
  • as GPLv3 and commercial license
  • using Travis CI + Github
  • default settings simulate 64,000 neurons
  • a model result can be a set of predicted numbers with their likelihoods / probabilities
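The last point – model results as numbers with likelihoods – is easiest to picture with a tiny sketch. The field names below are our own illustration, not NuPIC’s actual output format:

```python
# Sketch only: a "numbers with likelihoods" model result.
# Field names are illustrative, not NuPIC's actual output format.

prediction = {
    "best_value": 42.0,
    "likelihoods": {40.0: 0.10, 42.0: 0.75, 45.0: 0.15},
}

# The likelihoods form a probability distribution over candidate
# values; the best value is simply its mode.
total = sum(prediction["likelihoods"].values())
best = max(prediction["likelihoods"], key=prediction["likelihoods"].get)
```

Having a distribution rather than a single number is what makes downstream uses like anomaly detection possible: a surprising input is one the model assigned low likelihood.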


Open Source Community

Numenta develops the core of their product in the open: the GitHub project is also their own development code.
The community is also reachable via newsletter, IRC, hackathon videos, … and JIRA.
There are even tasks for newbies 🙂


Good resources

  1. On Intelligence:
  2. The Paper: HTM White Paper:
  3. TED Talk: How brain science will change computing (2007):
  4. Website:
  5. NuPIC Introduction: NuPIC at OSCON 2013:

Thanks to Matt Taylor and Scott Purdy for the great presentation at OSCON – the Open Source Convention 🙂
As well as Jeff Hawkins and the whole Numenta team for NuPIC!