Some good readings

And a brief overview of each, written by myself.

Also take a look at the MIT Probabilistic Computing Project's Reading List



Introduction to Mathematical Logic
by Alonzo Church

Formal logic is the foundation of most modern scientific breakthroughs. This book provides a strong and thorough stepping stone into mathematics.

Set Theory and Its Logic
by Willard Van Orman Quine

This book delves into the mathematical theory of classes. I'll leave a quote by Zermelo, which Quine mentions in the introduction: "Set theory is that branch of mathematics whose task is to investigate mathematically the fundamental notions of 'number', 'order', and 'function' taking them in their pristine, simple form, and to develop thereby the logical foundations of all of arithmetic and analysis."

The Calculi of Lambda-Conversion
by Alonzo Church

The content of this work has heavily served for advancement in the field of computation. At its core lies functions and abstraction as concepts for a formal system. Also, it's extremely well written.

Introduction to Probability
by Dimitri P. Bertsekas & John N. Tsitsiklis

This book effectively communicates basic concepts in probability theory and thoroughly describes them by both motivation and formal expression.


Introduction to the Theory of Computation
by Michael Sipser

This book covers the foundations of computation: automata & languages, computability theory, and complexity theory.

Introduction to Algorithms
by Thomas H. Cormen, Ronald L. Rivest, Charles E. Leiserson, & Clifford Stien

This book covers algorithms and algorithmic design techniques that are crucial to efficiently solving large classes of mathematical problems.

The Structure and Interpretation of Computer Programs
by Harold Abelson, Gerald Sussman, & Julie Sussman

This book teaches principes of computer programming. It's focus is on abstraction, recursion, modularity, and interpreters. It can be appreciated more after reading The Calculi of Lambda-Conversion by Alonzo Church (see above).

Principles of Computer System Design
by Jerome H. Saltzer & M. Franz Kaashoek

This book analyzes fundamental principles and abstractions in computer system design. It teaches how to analyze computer systems and how to design good computer systems.

Artifical Intelligence: A Modern Approach
by Stuart Russel & Peter Norvig

This book is standard for understanding what AI typically refers to, and covers ideas for solving problems, including the roles of learning, reasoning, planning, and predicting in solving goal-oriented tasks.

Elements of Statistical Learning: Data Mining, Inference, and Prediction
by Trevor Hastie, Robert Tibshirani, & Jerome Friedman

This book covers the field of statistics as it applies to learning from data. It explains concepts for learning in a statistical framework as a mean for solving variants of a basic learning problem. It covers topics including regression, classification, neural networks, inference, and graphical models.

Machine Learning: A Probabilistic Perspective
by Kevin P. Murphy

This book applies the tools of probability theory to machine learning, where the goal of machine learning is "to develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest." It covers topics including generative models, Bayesian variants of regressors and classifiers, Gaussian processes, {exact, variational, & Monte Carlo} inference, clustering, structure learning, and deep learning.

Reinforcement Learning: An Introduction
by Richard S. Sutton & Andrew G. Barto

This book covers ideas and algorithms of reinforcement learning (as both planning and learning). It discusses dynamic programming for perfect models, Monte Carlo methods, and temporal-difference learning methods such as Q-learning. It expands on these methods for approximation in large or infinite state spaces. For more reinforcement learning, check out Deep Reinforcement Learning: An Overview by Yuxi Li (on arXiv).

Deep Learning in Neural Networks: An Overview. [arXiv]
by Jürgen Schmidhuber

This preprint of an overview of deep learning covers the historical progression of artificial neural networks and their applications, primarily focused in supervised an unsupervised learning settings with some coverage of reinforcement learning. From data handling to backpropagation, feedforward networks to recurrent networks, and shallow to deep networks, this work is a great handbook for deep learning, particularly in an historical context.

Natural Intelligence

Society of Mind
by Marvin Minky

This classic book models intelligence as a well-organized collection of simple interacting agents. It develops theories for learning, memory, action, and the processing of language within this system. It is extremely well written, and a must-read for any study of intelligence.

Gödel, Escher, Bach: An Eternal Golden Braid
by Douglas R. Hofstadter

This classic book discusses themes in mathematics, art, and music as metaphors for (a theory of) the emergence of cognition. Its focus on the philosophy of mind is maintained primarily as an undertone, though the analogies become apparent as the book progresses. It is extremely well written.

The Architecture of Cognition
by John R. Anderson

This book covers a theory of production systems, "ACT", as a basic framework for a unified theory of cognition.

Unified Theories of Cognition
by Allen Newell

This book covers a thoery of problem-solvers, "Soar," as a basic framework for a unified theory of cognition.

Research Papers

Artificial Intelligence

Neuro-Symbolic Program Synthesis [arXiv]
(2016) Parisotto, E., Mohamed, A., Singh, R., Li, L., Zhou, D., & Kohli, P.

This introduces a technique for synthesizing programs in a domain-specific language from input-output examples. It maps the examples to a continuous space using a selection of encoders with deep recurrent networks, and sends those encoded examples to neural networks that generate and encode partial solution trees.

Probabilistic data analysis with probabilistic programming [arXiv]
(2016) Saad, F., & Mansinghka, V.

This is a thorough introduction and overview of probabilistic programming and its applications with the conceptual and mathematical formalism for Composable Generative Population Models (CGPMs). Figure 8 (p. 23) has a really useful example and a diagram describing the flow of logic at a high level.

Building Machines that Learn and Think Like People [arXiv]
(2016) Lake, B. M., Ullman, T. D., Tenenbaum. J. B., & Gershman, S. J.

This reviews progress in cognitive science and AI research, and argues for future direction to get close to building machines that learn and think like people. This primarily involves the building of explanatory and causal models rather than pattern-recognizers, initializing systems with intuitive theories of physics and psychology, and build machings that support learning-to-learn to accelerate the learning process for related tasks.

Do you see what I mean? Visual resolution of linguistic ambiguities [arXiv]
(2016) Berzak, Y., Barbu, A., Harari, D., Katz, B., & Ullman, S.

This describes a model that uses visual context to disambiguate a sentence. It constructs an HMM for each predicate interpreted from a query sentence, which are connected to variable trackers that are themselves HMMs for potential bounding boxes. A sort-of linguistic encoder then constrains the system to resolve the ambiguities.

Understanding visual concepts with continuation learning [arXiv]
(2016) Whitney, W. F., Chang, M., Kulkarni, T., & Tenenbaum, J. B.

This describes a model for disentangling visual concepts by learning the transformations of input over time, yielding a symbolic representation for objects and motion. It uses a gating mechanism which receives from two consecutive frames. The goal is essentially to learn concepts like lighting and azimuth, which were given in a supervised manner in Kulkarni's deepconv paper (see below).

Human-level concept learning through probabilistic program induction [link]
(2015) Lake, B. M., Salakhutdinov, R., & Tenenbaum, J. B.

This describes a model for program induction that aims to perform well in one-shot learning scenarios — to successfully generalize from a single example. It introduces Bayesian program learning. In the context of handwritten characters (the focus of the paper), it achieves human-level performance and outperforms deep learning.

Deep convolutional inverse graphics network [arXiv]
(2015) Kulkarni, T. D., Whitney, W. F., Kohli, P., & Tenenbaum, J. B.

This describes a system centered on performing analysis by synthesis. It uses a hybrid encoder-decoder model to learn a disentangled representation for transformations like lighting at azimuthal rotation. After the encoding step, it uses a generic Stochastic Gradient Variational Bayes estimator to backpropagate reconstruction error when decoding. Clamping is used for (and against) training on each transformation to disentangle the representation.

Concepts in a probabilistic language of thought [pdf]
(2014) Goodman, N. D., Tenenbaum, J. B., & Gerstenberg, T.

This describes knowledge of concepts in a probabilistic language of thought, formally represented as functions in an enriched stochastic lambda calculus. Primary features are compositionality and symbolic scaffolding for productive thought, and probabilistic inference for successful reasoning under uncertainty.

Bootstrap learning via modular concept discovery [pdf]
(2013) Dechter, E., Malmaud, J., Adams, R. P., & Tenenbaum, J. B.

This describes a model for concept learning as a problem of program induction. It represents programs as a subset of simply-typed lambda calculus restricted to primitive combinators, and defines a stochastic grammer over programs. The frontier of promising programs are split into partial solutions which are preferentially enumerated to decide on modular subtrees which yield the most compressive set of solutions. The experiments are in symbolic regression and Boolean function learning.

Structure Discovery in Nonparametric Regression through Compositional Kernel Search [pdf]
(2013) Duvenaud, D. K., Lloyd, J. R., Grosse, R. B., Tenenbaum, J. B., & Ghahramani, Z.

This presents an automatic statistician, focused on regression problems. It uses kernel families and their attributes, and compositions of those kernels, as an expressive language of models. It does stochastic search over an implementation of Bayesian Occam's Razor on the model language. Also check out the follow-up paper and source code for more info.

Bootstrapping in a language of thought: A formal model of numerical concept learning [link]
(2012) Piantadosi S. T., Goodman, N. D., & Tenenbaum, J. B.

This describes a model for making elegant inductive leaps on a logical system. It is built on a lambda calculus with a collection of sort-of cognitive primitives, and inference is done with a Markov-chain Monte-Carlo algorithm to find the highest probability lexicon.

Theory learning as stochastic search in a language of thought [pdf]
(2012) Ullman, T. D., Goodman, N. D., & Tenenbaum, J. B.

This describes a hierarchical Bayesian framework for theory acquisition, aimed at being an algorithmic model for the development of theories in children. It uses a construction of statements with probabilistic Horn clause grammars starting with foundational predicates, then with theories as simple laws relating them, then with concrete models to extend the predicates to objects in the given domain, then finally tying to the observations to yield constraints for inference.

Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis [pdf]
(2011) McDermott, J. H., & Simoncelli, E. P.

This describes a model for processing real-world sound textures. Figure 1 shows an overview of the process for analyzing waveforms to obtain values of statistical representations for the sounds. Sound is then synthesized by applying an iterative procedure to random knows which converges to meeting the same original statistics. There are discrepancies in realism when sound perception involves pitch, rhythm, and reverberation. Introducing feedback mechanisms to perpetuate perceptions would certainly improve the system, and also aligns with typical musical recognition.

Church: a language for generative models [pdf]
(2008) Goodman, N. D., Mansinghka, V. K., Roy, D. M., Bonawitz, K., & Tenenbaum, J. B.

This introduces a universal language for describing stochastic generative processes. The most remarkable features are its querying capabilities with conditional sampling, stochastic memoization, and generic schemes for inference. The language, Church, is a generalization of Scheme which introduces probabilistic computation to the language. The most useful resource for probabilistic models of cognition using Church is at

A rational analysis of rule-based concept learning [pdf]
(2008) Goodman, N. D., Tenenbaum J. B., Feldman J., & Griffiths, T. L.

This describes a model for concept learning with rational analysis. It starts with a context-free grammar comprising of a set of feature predicates and logical connectives, and defines a syntactic prior over formulae of the language to capture simplicity bias and a likelihood corresponding to the explanatory completeness of a formula. It goes on to define the Rational Rules model which joins these priors and likelihoods to yield generalized inference consistent with the standard practices of Bayesian modeling.

The rational basis of representativeness [pdf]
(2001) Tenenbaum, J. B. & Griffiths, T. L.

This defines representativeness in the context of Bayesian inference. It analyzes how Bayesian inference stands compared to statistical and inductive methods in addressing the task of categorization. It features the famous example of coin flips and addressing whether a coin is biased.

Distributed Systems

No compromises: distributed transactions with consistency, availability, and performance [pdf]
(2015) Dragojević, A., Narayanan, D., Nightingale, E. B., Renzelmann, M., ... & Castro, M.
Large-scale cluster management at Google with Borg [pdf]
(2015) Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., & Wilkes, J.
Wormhole: reliable pub-sub to support geo-replicated internet services [pdf]
(2015) Sharma, Y., Ajoux, P., Ang, P., Callies, D., Choudhary, A., ... & Kumar, S.
In search of an understandable consensus algorithm [pdf]
(2014) Ongaro, D., & Ousterhout, J.
The tail at scale [pdf]
(2013) Dean, J., & Barroso, L. A.
Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing [pdf]
(2012) Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., ... & Stoica, I.
ZooKeeper: wait-free coordination for Internet-scale systems [pdf]
(2010) Hunt, P., Konar, M., Junqueira, F. P., & Reed, B.
PNUTS: Yahoo!'s hosted data serving platform [pdf]
(2008) Cooper, B. F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., ... & Yerneni, R.
Dynamo: amazon's highly available key-value store [pdf]
(2007) DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., ... & Vogels, W.
Chord: A scalable peer-to-peer lookup service for internet applications [pdf]
(2001) Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., & Balakrishnan, H.

Copyright (c) 2016, 2017 Lucas Morales