The idea of creating a center that would apply machine learning to vast amounts of otherwise unusable data, in order to unlock solutions to real-world challenges, took shape in 2001. By April 2003, the founders were ready to launch it as the Center for Computational Learning Systems.

The Idea. After the attacks of September 11, 2001, David Waltz, then a _____ with Columbia’s [EDIT? Lamont Lab in the Computer Sciences Department or other?], wanted to apply the skills of research scientists like himself in machine learning and data mining to strengthen New York City’s understanding of its crucial systems, to predict events such as failure points or desired results, and then to design ways to produce the needed outcomes. That meant finding urgent, high-priority challenges involving masses of data that people had despaired of ever interpreting, then using the machine learning and data mining skills of research scientists, together with their powerful computers, to look for patterns and learn what people cannot learn from the data on their own.

The Founders and Early Pioneers. Waltz had been working with Albert Boulanger and Roger Anderson. Boulanger had been an early pioneer in thinking machines and in Internet hardware and software at Bolt, Beranek & Newman. Anderson was a specialist in electricity, energy and engineering from the University of Illinois. As Boulanger and Anderson discussed with Waltz how they could put their skills at the service of the wounded city, they considered the vast archives of raw data collected by financial institutions, libraries, medical services, biomedical research labs, linguistics labs, utilities and government services. With computers becoming more and more powerful, and more data being saved as a result, they were excited about the potential of machine learning to understand patterns and to predict and produce vital outcomes. Vladimir Vapnik, an expert in Support Vector Machine (SVM) theory, provided high-level insights from the very beginning and still contributes at least 20% of his time to the Center’s work. [EDIT: Add something about his NEC connection?]

Because of their experience in energy, the initial founders had begun in 2002 to focus their application of computational learning systems on the functioning and security of New York City’s electrical grid. They were soon joined by other Columbia scientists with projects in linguistics, where Kathy McKeown and Julia Hirschberg were spearheading the application of machine learning to natural language processing. In addition, synergies developed with Christina Leslie and other scientists at Columbia’s Center for Computational Biology and Bioinformatics, where the Computational Biology group was applying machine learning and data mining to protein and DNA studies, genome research and other areas.

The Launch. The Center was launched in the first week of April 2003. Each research scientist’s individual experience and interests had shaped the four cornerstones of the Center’s foundational projects: 1) urgent challenges confronting industry and government, 2) the life sciences’ need for bioinformatics, 3) natural language processing, and 4) an area that all members of the emerging team were passionate about: advancing the field of machine learning and its applications to projects of service to humanity.

Columbia’s School of Engineering and Applied Science (SEAS) Vice Dean Mort Friedman had agreed to fund the start-up Center for its first three to four years. Even before that period ended, the Center had become fully self-sufficient, returning 50% to pay for facilities and other infrastructure and overhead costs. The Center hired aggressively from the beginning, seeking research scientists who were not only leaders in their academic disciplines but also committed to working full time on applying their expertise to industry and government projects, in order to devise needed solutions and not just interesting theories. The timing turned out to be auspicious: because of an economic downturn, the Center could hire the best scientists affected by the downsizing of corporate research labs. The Center first hired senior pathfinders in their fields, then younger research scientists, and then tapped into the pool of top graduate students from Computer Science and other Columbia University departments.