Posts

  • Deep Learning for RegEx

      Recently I decided to try my hand at the “Extraction of product attribute values” competition hosted on CrowdAnalytix, a website that allows companies to outsource data science problems to people with the skills to solve them. I usually work with image or video data, so working with text was a refreshing exercise. The challenge was to extract the Manufacturer Part Number (MPN) from provided product titles and descriptions of varying length – a standard RegEx problem. After a cursory look at the data, I saw that there were ~54,000 training examples, so I decided to give Deep Learning a chance. Here I describe the solution that earned me 4th place on the public leaderboard.

      Read more...

  • A Response to Anti-Representationalists

      Coming from a background in cognitive science, where representationalist positions are the norm, I have read literature on non-representationalist viewpoints such as Lawrence Shapiro’s “Embodied Cognition” and O’Regan’s work on the Sensorimotor Theory of Consciousness to put my work in perspective. These works have not had a major impact on my research, as representations of some form still seemed necessary for many of the examples they cited. I now work daily on artificial neural networks and “deep learning” is part of my basic vocabulary. Recently, I’ve come across this article from Aeon multiple times through media outlets such as Reddit and Facebook. In the essay, the much-respected Dr. Robert Epstein voices his position against using computational metaphors for the brain. Here, I give my response to his claims, and to more general claims surrounding this issue, drawing on my background in cognitive psychology and my relatively short experience conducting research in machine learning and artificial intelligence. Please share your thoughts!

      Read more...

  • Visualizing CIFAR-10 Categories with WordNet and NetworkX

      In this post, I will describe how the object categories from CIFAR-10 can be visualized as a semantic network. CIFAR-10 is a database of images that is used by the computer vision community to benchmark the performance of different learning algorithms. For some of my current work, I was interested in the semantic relations between the object categories, as other researchers have been in the past. We can explore these relations by defining them with WordNet and then visualizing them using NetworkX combined with Graphviz.

      Read more...

  • Topographic Locally Competitive Algorithm

      Recent studies have shown that, in addition to sparse representation models giving rise to receptive fields similar to those observed in biological vision, the self-organization of those receptive fields can emerge from group sparsity constraints. Here I will briefly review research demonstrating topological organization of receptive fields using group sparsity principles and then describe a two-layer model implemented in a Locally Competitive Algorithm, which I will term the Topographic Locally Competitive Algorithm (tLCA).

      Read more...

  • Sparse Filtering in Theano

      Sparse Filtering is a form of unsupervised feature learning that learns a sparse representation of the input data without directly modeling it. This algorithm is attractive because it is essentially hyperparameter-free, making it much easier to implement than other existing algorithms, such as Restricted Boltzmann Machines, which have many hyperparameters. Here I will review a selection of sparse representation models for computer vision as well as Sparse Filtering’s place within that space and then demonstrate an implementation of the algorithm in Theano combined with the L-BFGS method in SciPy’s optimization library.

      Read more...