Monday, January 16, 2017

What is artificial intelligence?

Most people think of artificial intelligence as a walking, talking robot that looks and acts almost human, and that this is the goal of AI: to make a simulacrum of all that a human does. But there are a lot of nuances to this, well, frankly a lot of major differences from what artificial intelligence really is.

I am writing this to clear up a lot of misconceptions that I read about AI, not to give _a_ definition, but rather to give _some_ definition to it.

So, a first attempt. Artificial intelligence is a computer method (rarely mechanical) that seems to do something only a human could do. A talking, responding machine is AI, but a walking machine, with articulated legs, is not. Unless... well, there it is: sometimes it is the application that is AI and sometimes it is the methods. And a chess-playing machine... well, is that AI or just really good engineering, exploring all the possible game lines quickly?

We already have two nuances: how an AI is engineered, and what the label applies to. The methods of AI engineering fall into two broad areas: human simulation, attempting to mimic the internal biological/psychological processes, and direct engineering, doing whatever it takes to make the engineered device act externally intelligent. A language parser uses scientific theories of linguistics to mimic how the brain is supposed to manipulate the text of a sentence to determine understanding. A chess-playing machine uses alpha-beta pruning of deep game trees, contrary to the usual human method, which is to look only two or three moves deep and vaguely judge the 'strength' of a position based on experience.
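
(To make the chess example concrete, here is a minimal sketch of alpha-beta pruning in Python. The game interface, `moves`, `apply`, `score`, `is_terminal`, is made up for illustration; a real engine would plug in actual chess rules and a serious evaluation function.)

```python
# Minimal negamax search with alpha-beta pruning over an abstract game.
# `game` is assumed to provide: moves(state), apply(state, move),
# is_terminal(state), and score(state) from the side to move's view.
# These names are placeholders, not any particular library's API.

def alphabeta(game, state, depth, alpha=float("-inf"), beta=float("inf")):
    if depth == 0 or game.is_terminal(state):
        return game.score(state)
    best = float("-inf")
    for move in game.moves(state):
        child = game.apply(state, move)
        # Negamax trick: the child's value from the opponent's view, negated.
        value = -alphabeta(game, child, depth - 1, -beta, -alpha)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # prune: the opponent will never let play reach this line
    return best
```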

A lot of what was once considered AI is now considered plain old engineering. Once you see under the covers that the method used to look outwardly human is only a boring step-by-step recipe or lookup table, it loses its 'wow' AI appeal and just seems like regular non-AI computing. In the 1600s, calculating machines were magical and might have been considered AI (if that had been a thing then) because most people thought that a person doing arithmetic was magical, much less a machine (OK, this is a bit hyperbolic. What is 'most'? Who are these people?). But you get the idea. A machine playing chess is magical (or has a small hidden Turkish chess master manipulating the controls within). But Deep Blue, which beat the chess grandmaster Kasparov in 1997, was pretty simple, just using search trees, with no ethereal silicon-embodied intuition ('just' = very rocket-sciency search trees and some hand-curated openings, gambits, and endgames).

An AI, with the indefinite article, is, by popular usage, the thinking, talking machine. Because of the potential for confusion, that is nowadays usually referred to as an AGI, an Artificial General Intelligence.

But now to more substance. All along we have been going in a vague style, not really knowing what AI concretely refers to, relying on a presumed common idea of AI without really knowing. So here are some concrete examples of what AI is.

First, applications, which are what externally look like everybody's general idea of AI:

  • vision - converting digital images into words 
  • language - speech recognition, natural language processing and understanding, chatbots
  • games - chess, go, crosswords
  • problem solving/reasoning - word problems and puzzles, expert systems
  • planning - taking a set of goals and initial conditions and producing a sequence of actions that achieves the goals
  • robotics - not the mechanical part (for example, an articulated hand grasping an object) but the coordination: plotting a path over a complicated landscape or separating parts on an assembly line
  • learning and memory - most of computer science has been intent on solving the memory problem and has done it well: financial statements, medical records, airline reservations, online libraries.

And then there are the tools and methods, how it's actually done, the man behind the curtain:

  • logic and other classical mathematical methods - for reasoning
  • probability/statistics - most machine learning methods are within the realm of stats
  • combinatorial algorithms - when a perfect algorithm for one domain works most of the time for another, it's called a heuristic
  • decision trees - chess? One big decision tree? Yes.
  • optimization - specifically numerical algorithms for optimization like simplex or A* (a sketch of A* follows below)
  • cognitive science - cognitive psychology, neurology (brain physiology, neurons), linguistics, so you know how it works in actual biology, so you can either be inspired by it (neural networks) or simulate it (parsing a natural language according to linguists' rules)

Apps and tools are orthogonal; an app can use different tools. So NLP is one application area of AI, and many tools are used by it: machine learning using large language corpora, and, independently, linguistic knowledge.
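
(As promised above, here is a minimal sketch of A*, which also shows what a heuristic buys you. Everything here, the grid, the unit step costs, the Manhattan-distance estimate, is a stand-in for a real domain.)

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Minimal A*: neighbors(n) yields (next_node, step_cost) pairs;
    heuristic(n) is an optimistic estimate of the remaining cost to goal."""
    frontier = [(heuristic(start), 0, start)]  # (estimated total, cost so far, node)
    best_cost = {start: 0}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nxt, step in neighbors(node):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # goal unreachable

# Toy usage: shortest path across a 4x4 grid with unit steps.
def grid_neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < 4 and 0 <= y + dy < 4:
            yield (x + dx, y + dy), 1

manhattan = lambda p: abs(3 - p[0]) + abs(3 - p[1])
print(a_star((0, 0), (3, 3), grid_neighbors, manhattan))  # 6
```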

The man behind the curtain is an apt metaphor for AI because, as in Oz, the apps look magical, but it turns out the methods are, well, not exactly charlatanry, but so... mechanical and spartan and inhuman, just small tricks. For example, crosswords seem like you need not just broad and deep knowledge but also a clever associative memory and a facility with puns to read behind the clues. The recent, very successful AI method for crosswords is just to use dictionary search and almost ignore the clues themselves: by any means necessary, no matter how cleverless.
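
(A minimal sketch of that dictionary-search idea, with a tiny stand-in word list; a real solver would use a huge word list with frequency scores plus constraint propagation between crossing slots.)

```python
import re

# Stand-in dictionary; a real solver would load a large scored word list.
WORDS = ["chess", "class", "cross", "caress", "crass", "brass"]

def fill_candidates(pattern):
    """Pattern like 'c..ss': known letters fixed, dots for blank squares.
    Note that the clue itself is never consulted, only letter constraints."""
    rx = re.compile("^" + pattern + "$")
    return [w for w in WORDS if rx.match(w)]

print(fill_candidates("c..ss"))  # ['chess', 'class', 'cross', 'crass']
```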

With all that said, the motivation for this post was what I find to be a lot of misuse of all these labels, one term being used for another. So here's a bullet-point summary:

  • AI is what you call the big show, all the different things together
  • AGI (Artificial General Intelligence) is a very limited example of an AI app but the most salient one, the 2001/HAL-style intelligent spoken interface. This seems to be the end goal of AI but is currently unrealistic, or rather very realistic only with very limited expectations.
  • ML (Machine Learning), which is mostly just statistics, is a tool to quickly create AI apps from lots of data without having to rely on a domain expert
  • ANN (Artificial Neural Nets) and DL (Deep Learning, which is just a big NN) are just two (of many) ML techniques (a toy example follows below)
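
(Here is a toy sketch of those last two bullets at once: a one-neuron 'network', the classic perceptron, learning a rule purely from labeled examples. Everything is a stand-in, the data is a trivial OR table, but the shape is the ML point: parameters fit from data, no domain expert encoding the rule.)

```python
# Toy perceptron: learns a linear rule from labeled examples alone.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]  # the OR function

w = [0.0, 0.0]  # one weight per input
b = 0.0         # bias
lr = 0.1        # learning rate

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Classic perceptron update: nudge the weights toward each mistake.
for epoch in range(20):
    for x, y in data:
        err = y - predict(x)
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b += lr * err

print([predict(x) for x, _ in data])  # matches the labels: [0, 1, 1, 1]
```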

Friday, January 6, 2017

A meta-note about NP-completeness and the current state of (lack of) knowledge about it

Recently, Scott Aaronson wrote a short-book-length summary of the current state of knowledge about NP-completeness. This is an honorable tradition, starting with the classic and definitive book by Garey and Johnson, then Sipser, Cook, Wigderson, and Allender. Aaronson gives his own spin on things (a little more on the quantum-computing side) but, like the others, hits all the great points well: Turing Machines, reductions, completeness, the hierarchy of harder problems, what is conjectured to be the case (P != NP), etc.

What I want to do is point out the unsaid things, those things that are obvious to insiders and don't need to be said, but are inscrutable to outsiders. The outsider may realize them, or be told them explicitly, but not really understand them because, you know, math.

The most important thing is that, of all the things stated, all the theorems, all the papers and books, the great number of words and symbols written about the subject, all these things are written about what we don't know about NP-completeness. We don't know whether P = NP. We don't know whether NP = coNP. We don't even know whether Graph Isomorphism is in P (in early 2017 Babai retracted his 2015 quasi-polynomial-time algorithm down to merely sub-exponential). We don't know. We don't know so much. Every day, books and papers are published that are essentially just outlining the boundaries of the so many things we don't know about P vs NP.
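
(To keep the object of all this non-knowledge concrete, here is a minimal sketch in Python of SUBSET-SUM, a standard NP-complete problem; the function names are mine, purely for illustration. Checking a proposed answer is fast, but no known method finds one in time polynomial in the input size; the brute-force search below is exponential.)

```python
from itertools import combinations

# SUBSET-SUM: given numbers and a target, is there a subset summing to the target?

def verify(numbers, target, certificate):
    """The 'NP' side: given a proposed subset (a certificate),
    checking it takes one cheap pass, i.e., polynomial time."""
    return all(c in numbers for c in certificate) and sum(certificate) == target

def solve(numbers, target):
    """The search side: brute force over all 2^n subsets."""
    for r in range(len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return list(subset)
    return None

print(solve([3, 34, 4, 12, 5, 2], 9))           # [4, 5]
print(verify([3, 34, 4, 12, 5, 2], 9, [4, 5]))  # True
```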

Most of the theory surrounding P vs NP was developed in the 1970s. Lots of small results (achieved with great effort!) have been established over the years. We do know now that NP = PCP(log n, 1) (Arora et al. 1998) and that NEXP is not in ACC^0 (Williams 2011). No, not small results, but proofs of things that we really expected to be easy. Some specific problems have been shown to be in a particular class when that was unknown before: Primality is in P (AKS 2004), Linear Programming is in P (Karmarkar 1984), Undirected Graph Reachability is in L (Reingold 2004). One usually expects it to be (relatively) easy to find an optimal algorithm for a problem... it just turns out that all of these were very difficult to discover.

But all these are minor results. One-offs. They may spark a cottage industry in their subcommunity, and they are undeniably important. But they are... they are only making infinitesimal progress towards P vs NP. In fact, because of this meta-intractability (get it? the problem of intractability is itself intractable), no one is currently attacking the direct problem of P vs NP. Most of the (great amount of) work goes into ancillary situations, extending minor results, changing the problem, all in the hope that these might uncover the one little tiny bit that makes the difference. It feels like panning for gold: you dig up tons and tons of material, wash it through multiple levels of gratings, wash it some more, refine the dirt, wash the dirt some more, and if you get a couple of tiny specks of non-dirt that just turn out to be copper, you call that a good day.

As far as the history of science goes, this is a rare situation. Usually, a bunch of data is collected, some patterns are noticed, you collect a lot more data (and all this new data is useful and interesting), the patterns look like they repeat in different weird ways, then a consolidating theory (a scientific, experiential theory, that is) is formulated that generalizes and explains the prior data, and then a search is made for even more data, which either confirms the theory or encourages refinement or maybe even overthrow of the first theory. Similarly in math (of which the P vs NP problem is a part): a bunch of small theorems are proven, then some people (usually a group, though usually a single person gets the good press) see a pattern, add a few definitions, and prove a big theorem that combines them all, and then further consequences of that big theory are applied to more instances. (Note that, yes, in math it is very rare (never?) for a big theory to be overthrown, because a proof is a proof (right? of course it is); the refinement comes instead in how many new, more complicated problems are created to explore beyond the first theory.)

Anyway, the rare situation with P vs NP is that we've now gone almost 40 years with the same state of affairs of lack of knowledge. What progress there is is not steady, and any gains seem underwhelming, at least from the outside (the first outsider thought is "You're so smart, shouldn't you have figured that out a long time ago?"). Even the most minor-seeming result is astounding (NEXP is a very intractable class and ACC^0 a very tractable one; how come it was so hard to show NEXP is not in ACC^0?).

Also, back to more basic things. All this literature, all this theorem proving hasn't really established, hasn't really given a definitive proof of, the pinnacle question P vs NP... but it's almost a scientific proof, a proof by having lots of examples in mostly one direction. All the reductions, all the intermediate problems, all the proofs that there is no proof by method X, are really supporting evidence (not mathematical certitude) for the particular direction that P is a strict, proper, much smaller subset of NP: that NP problems are actually harder and we really can't find exact and fast algorithms for NP-complete problems. That is, even though we have 40 years of not proving it one way or the other, it (almost) all goes in the direction that they're probably not the same.