Bringing Open CourseWare to North Korea

I was just interviewed by Voice of America about my trip with Choson Exchange last September to take Open CourseWare to North Korea: "North Korean universities build an advanced material-sharing system" (북한 대학, 고도의 자료 공유체계 구축) [audio recording]

It's all in Korean, so here is an executive summary of the key points in English:

We took OCW and Wikibooks to North Korea and presented them at the Pyongyang International Science and Technology Book Fair. Choi Thae Bok, one of the highest leaders in NK, came to inspect our books and said "every student and professor in NK needs access to these materials", so he called the Dean of Kim Chaek University (the "MIT of North Korea") to come meet with us.

The reporter asked about Internet access in NK. Some high-ranking people have access to the Internet, and they copy information from it onto the national Intranet (in effect, human-based content filtering).

Finally, North Korea has an extensive book scanning project in which they digitize thousands of books and make them available to anyone with access to the Intranet. The book scanning project and the computer labs were pretty advanced: they had all the latest Dell and HP computer equipment with big flat panel monitors, and the computers all had signs on them that read, "Gift from the Great General Kim Jong Il".


Watson's Jeopardy win, and a reality check on the future of AI

So Watson beat the two best Jeopardy champions at their own game. What now?

Call me cynical, but as someone who has undertaken machine learning research for 14 years, the Jeopardy result is really not that surprising -- you would win too if you had the equivalent knowledgebase of all of Wikipedia at your fingertips for instant recall, and if you had a huge buzzer advantage by being able to process individual pieces of information in parallel at much faster rates than the human brain!

But there is a much deeper problem with some of the media pontification about the future of AI and machines taking over the world: try asking Watson how "he" feels about winning.

Watson's learning model is currently (only?) really, really good at figuring out what question you were asking given an answer to a general knowledge question. I'm sure there are lots of reusable pieces of the Watson system (some natural language processing (NLP) code, etc.). But what the mainstream media doesn't seem to understand is that it would be an enormous stretch to say that the system could simply and easily be applied to other domains.

The promise of machine learning is that algorithms should in theory be reusable in many situations. The Weka machine learning toolkit, for example, provides a generic ML framework that is used for all sorts of things. But extracting the right features from your data, and deciding how to represent them, is a huge problem on its own, and can be tackled completely separately from the learning issues. (This is all further muddied once you throw in NLP.)

Today most of the feature selection for any given learning task is done by hand, by engineers. An AGI (Artificial General Intelligence) would have to do that itself. We don't have much of a clue yet how to teach an AGI to pick its own reasonable and useful feature sets in a totally generic or smart way. But it's quite easy to show that, for most complex datasets, your feature selection strategy is almost as important as, or more important than, the exact machine learning algorithm you apply.
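To make that last point concrete, here is a minimal sketch on a toy problem of my own choosing (nothing here comes from Watson or any real system). The same deliberately simple learner, a nearest-centroid classifier, is trained twice on points labelled by whether they fall inside the unit circle: once on the raw (x, y) coordinates and once on a hand-picked feature, x^2 + y^2. The function names, the dataset and the choice of learner are all assumptions made for illustration.

```python
import random

def make_data(n, rng):
    """Random points in [-2, 2]^2, labelled 1 if inside the unit circle, else 0."""
    data = []
    for _ in range(n):
        x, y = rng.uniform(-2, 2), rng.uniform(-2, 2)
        data.append(((x, y), 1 if x * x + y * y < 1 else 0))
    return data

def raw_features(point):
    # Representation 1: hand the learner the coordinates as-is.
    return [point[0], point[1]]

def radius_feature(point):
    # Representation 2: a hand-engineered feature (squared distance from origin).
    return [point[0] ** 2 + point[1] ** 2]

def fit_centroids(train, featurize):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    sums, counts = {}, {}
    for point, label in train:
        feats = featurize(point)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, feats):
    # Pick the class whose centroid is closest in feature space.
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(feats, centroids[label])),
    )

def accuracy(centroids, test, featurize):
    hits = sum(predict(centroids, featurize(p)) == label for p, label in test)
    return hits / len(test)

rng = random.Random(0)
train, test = make_data(2000, rng), make_data(1000, rng)

acc_raw = accuracy(fit_centroids(train, raw_features), test, raw_features)
acc_radius = accuracy(fit_centroids(train, radius_feature), test, radius_feature)
print(f"raw (x, y): {acc_raw:.2f}   hand-picked x^2 + y^2: {acc_radius:.2f}")
```

The learning algorithm is identical in both runs; only the representation changes. On the raw coordinates the two class centroids nearly coincide, so the classifier can do little better than guess, while the single engineered feature makes the classes largely separable.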

What very few people appreciate is that machine learning has so far amounted to little more than learning arbitrary function approximators. You learn a mapping from a domain to a range, or from an input to an output. Minimizing the classification error is the process of refining that function approximation to minimize error on as-yet unseen data (the test dataset, i.e. data that was not used to train the previous iteration of function approximation). Because all machine learning algorithms (as they are currently framed) are basically just trying to learn a function, they are all in some deep sense quite equivalent. (Of course in practice, not all algorithms even work with the same data types, so that's why this is mostly only true in the deepest sense, but there has been quite a bit of work done to show that at the end of the day, most of today's machine learning algorithms are basically doing the same thing with different strengths and weaknesses.)
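As a minimal sketch of what "learning a function approximator" means in practice, the toy example below fits a straight line to noisy samples of a hidden function by gradient descent, then measures the error on held-out data that was never used for training. The target function, the learning rate and the 150/50 train/test split are arbitrary assumptions of mine, chosen only to keep the example small.

```python
import random

rng = random.Random(1)

def target(x):
    # The "unknown" mapping we pretend not to know; an arbitrary choice.
    return 3.0 * x - 1.0

# Noisy input/output pairs drawn from the hidden function.
samples = []
for _ in range(200):
    x = rng.uniform(-1, 1)
    samples.append((x, target(x) + rng.gauss(0, 0.1)))

train, test = samples[:150], samples[150:]  # test data is never trained on

def mse(w, b, data):
    """Mean squared error of the approximator f(x) = w*x + b on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

w, b, lr = 0.0, 0.0, 0.1
initial_test_error = mse(w, b, test)

for _ in range(500):
    # Gradient of the training error with respect to each parameter.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
    w -= lr * grad_w
    b -= lr * grad_b

final_test_error = mse(w, b, test)
print(f"learned f(x) = {w:.2f}*x + {b:.2f}")
print(f"test error before: {initial_test_error:.3f}, after: {final_test_error:.3f}")
```

Swap the line for a decision tree, a neural network or a kernel machine and the shape of the exercise stays the same: pick a family of functions, adjust its parameters against training data, and judge it by its error on data it has not seen.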

Incidentally, the fact that the whole field of machine learning is about learning arbitrary function approximators is pretty much the whole reason that a lot of people in CS learning theory don't really talk about AI anymore, only ML. There's nothing much intelligent about machine learning as it stands currently. I heard it said that CSAIL (the CS and AI Lab) here where I work at MIT is only still called CSAIL in deference to Marvin Minsky and the glory days of AI, and that a lot of people don't like the name and want to change it when Marvin finally totally retires. (That probably won't happen, but the statement alone was illustrative...) We need a complete revolution in learning theory before we can start to truly claim we're creating AI, even if the behaviors of ML algorithms feel "smart" to us: they only feel smart because they are correctly predicting outputs given inputs. But you could write down a function to do that on paper.

I'm not claiming we can't do it -- "It won't happen overnight, but it will happen" -- I'm just stating that ML and AI are quite different, and we're very good at ML and not at all good at AI.

Efforts to simulate the brain are moving along, and Ray Kurzweil predicts that in just a decade or two we should be able to build a computer as powerful as the brain. While that may be true in terms of total computational throughput of the hardware, there is no way to know if we will be able to create the right software to run on this hardware by that time. The software is everything.

One of the problems is that we don't know exactly how neurons work. People (even many neuroscientists) will tell you, "of course we know how a neuron works, it's a switching unit, it receives and accumulates signals until a certain potential is reached, then it sends on a signal to the other neurons it is connected to." I suspect in several years' time we will realize just how naive that assumption is. For now, there are already lots of fascinating discoveries made that show that things are just not that simple, e.g. (hot off the press yesterday): http://www.eurekalert.org/pub_releases/2011-02/nu-rtt021711.php

From that article:
> "It's not always stimulus in, immediate action potential out."
> "It's very unusual to think that a neuron could fire continually without stimuli"
> "The researchers think that others have seen this persistent firing behavior in neurons but dismissed it as something wrong with the signal recording."
> "...the biggest surprise of all. The researchers found that one axon can talk to another."

This is exactly the sort of thing that makes me think it's going to take a lot longer than Ray predicts to simulate the brain: we don't even know what a neuron is doing. A cell is an immense, extraordinarily complex machine on the molecular scale, and simplifying it to a transistor or thresholded gate is not necessarily going to produce the correct emergent behavior when you connect a lot of them together. I'm glad people like the researchers behind the above study are doing more fundamental work on what a neuron actually is and how it really functions. I suspect that years down the line we'll discover much more complicated information processing capabilities of individual cells -- e.g. the ability of a nerve cell to store information in custom RNA strands based on incoming electrical impulses in order to encode memories internally [you read it here first], or something funky like that.
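For reference, the textbook "switching unit" picture of a neuron quoted earlier can be written down in a handful of lines. The sketch below is a bare-bones leaky accumulate-and-fire unit; the function name, the threshold and the leak factor are illustrative assumptions of mine, not values taken from the article or from any neuroscience model.

```python
def simulate_threshold_neuron(inputs, threshold=1.0, leak=0.9):
    """Accumulate incoming signal; emit a spike (1) and reset once the
    potential reaches the threshold, otherwise emit 0 for that time step."""
    potential = 0.0
    spikes = []
    for current in inputs:
        # Old potential decays a little, then the new input is added.
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
        else:
            spikes.append(0)
    return spikes

# A steady input makes the unit fire at regular intervals...
print(simulate_threshold_neuron([0.3] * 10))
# ...and with no input at all it stays silent forever.
print(simulate_threshold_neuron([0.0] * 10))
```

By construction this unit cannot fire without stimulus, cannot keep firing after the stimulus stops, and has no notion of one axon talking to another, which are exactly the behaviors the quoted study reports.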

Of course even a simplified model is still valuable: "Essentially, all models are wrong, but some are useful" (--George E. P. Box). However, we have to get the brain model right if we want to recreate intelligence the biologically-inspired way. Simply stated, we can't predict what it will take to build intelligence, or how long it will take, until we understand what it actually is we're trying to build. Just saying "it's an emergent property" is not a sufficient explanation. And emergent properties might only emerge if some very specific part of the behavior of our simplified models works correctly -- but we have no way of knowing which salient features must be modeled correctly and which can be simplified.

But a much bigger problem will hold up the arrival of AGI: not only do we not know how single neurons really work, we have NO CLUE what intelligence really is. And even less clue what consciousness really is. The problem with Ray's predictions is this: we can forecast the progress of a specific quantifiable parameter of known technology, perhaps even when the underlying technology that embodies the parameter changes form (e.g. the exponential growth in computing power that Kurzweil treats as an extended Moore's Law has held for at least 50 years, across the switch from vacuum tubes to discrete transistors to integrated circuits). But we can't forecast the time of creation or invention of a new technology that is for all intents and purposes "magic" right now, because we still don't know how it would work. In fact we can predict the arrival of a specific magic technology about as well as we can predict the time of discovery of a specific mathematical proof or scientific principle. Nature sometimes chooses simply not to reveal herself to us. Can we even approximately predict when we will prove or disprove P=NP or the Goldbach Conjecture? How much harder is it to define intelligence (or even more so, consciousness) than to prove or disprove a mathematical statement?

Finally, and most importantly, somebody needs to get Watson to compete in Jeopardy against Deep Thought to guess the correct question to the answer 42...


The Social Network 2.0 -- and the REAL reason for the competition between Google and Facebook

A friend of mine is writing an article about The Future of Facebook, and asked me to review his article. I sent him the following response, which covers my view of the needed reinvention of the social network, and the real reason for the major competition between Google and Facebook.


First, make sure you look closely through “The Real Life Social Network”. Paul Adams, the author, worked at Google and got lured away by Facebook after he became famous for publishing these slides. This presentation reframes everything about social networking, and his talk already resulted in some big changes in how Facebook works.

Google is currently trying to create their own social network, as you have probably heard -- “Google Me” is supposedly the working name. Google figured they could easily duplicate all of Facebook’s functionality -- and improve upon it -- with only about three months of work. Word on the street is that they shut down entire project teams to work on this... but after a major overrun of the initial time estimate, there’s still a lot of disagreement even inside Google as to whether or not what they have built so far should even be released. I haven’t seen it, but I get the feeling it’s basically Buzz all over again. It’s interesting to see Google struggling to figure out the social space -- especially since Facebook has not done a stellar job of it themselves (privacy etc.), but they got the critical mass.

Also -- the big, big point that most Google-vs.-Facebook commentaries miss is that Facebook is starting to move into the advertising space in a *major* way -- and advertising is well over 90% of Google’s revenue. However, Facebook has the potential to monetize their ads more effectively than Google, because the demographics, friendships and interests in somebody’s profile and social graph allow extremely targeted ads, which are incredibly valuable to marketers. *This* is why Google is trying to move into the social space -- because if they don’t, the rug will be pulled out from under their feet and they will start losing major advertising revenue to Facebook once Facebook feels ready to launch a full frontal attack on Google’s advertising dominance. (This hasn’t happened yet, but it will, and by then it may be too late if Google doesn’t act now and do things right with building their own social graph infrastructure.)

The great equalizing factor could be that a lot of people are disgruntled with Facebook: privacy problems, annoying apps, spam, password stealing, and the forced overlap of social circles that would never overlap in real life, leading to “oops” moments when their boss friend-requests them, they feel like they can’t say no, and then their boss sees their drunken weekend photos. That creates a business opportunity for a competitor (be it Google or somebody else) to gain ground on Facebook’s almost complete monopoly in the social space. Ultimately, however, the social network needs to use open protocols and federated servers, or decentralization/P2P of some form, allowing users to keep their data on whatever service they want and still tie in with their friends’ social networks on other services. I suspect this will be Google’s approach with Google Me (since that’s what they tried to do with Wave and other products), and employing open protocols and federation is also the approach taken by Diaspora and other direct Facebook competitors (so it’s a growing trend). Once people expect and demand that their data be openly federated, Facebook will lose their grip on their own walled garden if they don’t innovate one level of value above the social graph itself -- for at that point the social graph will have become commoditized. See the “Ubiquity creates infrastructure” figure here: http://www.searls.com/doc/os2/docchapter.html