Editor's note: In computing, a new wave of applications arrives roughly every 30 years. In the 1950s, people used computers to model the physical world; in the 1980s, they turned to computers to strengthen their connections with one another; and around 2010, computing opened a new chapter of engaging with the physical world. What will future developments in computer science bring us, and what devices, methods, and technologies will they require? Here, Turing Award winner Butler Lampson shares his thinking on these questions.


This article is translated from Butler Lampson's address at the conference commemorating the centennial of Alan Turing's birth.

What Computers Do: Model, Connect, and Engage


The first uses of computers, around 1950, were to model or simulate other things. Whether the target is a nuclear weapon or a payroll, the method is the same: build a computer system that behaves in some important ways like the target, observe the system, and infer something about the behavior of the target. The key idea is abstraction: there is an ideal system, often defined by a system of equations, which behaves like both the target system and the computer model. Modeling has been enormously successful; today it is used to understand, and often control, galaxies, proteins, inventories, airplanes in flight and many other systems, both physical and conceptual, and it has only begun to be exploited.
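This kind of modeling can be illustrated in a few lines of code. The sketch below is a toy Python example (not from the talk): it steps a simple system of equations for a falling object forward in time, and observing the simulated trajectory tells us something about the behavior of the real target.

# A toy model: simulate a falling object with air drag by stepping
# its equations of motion forward in time (Euler integration).
# Illustrative sketch only; the numbers are made up for the example.

def simulate_fall(mass=1.0, drag=0.1, dt=0.01, duration=5.0):
    g = 9.81                    # gravitational acceleration, m/s^2
    velocity, height = 0.0, 100.0
    for _ in range(int(duration / dt)):
        # Ideal system: m*dv/dt = m*g - drag*v ; dh/dt = -v
        acceleration = g - (drag / mass) * velocity
        velocity += acceleration * dt
        height -= velocity * dt
        if height <= 0:          # the object has reached the ground
            break
    return velocity, max(height, 0.0)

print(simulate_fall())           # observe the model to learn about the target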



Models can be very simple or enormously complex, quite sketchy or very detailed, so they can be adapted to the available hardware capacity even when it is very small.

Using early computers to connect people was either impossible or too expensive, compared to letters, telephones and meetings. But around 1980 Moore's law improvements in digital hardware made it economical to use computers for word processing, e-mail, mobile phones, the web, search, music, social networks, e-books, and video. Much of this communication is real time, but even more involves stored information, often many petabytes of it.




So modeling and connection are old stories—there must be little more to do. Not so. Both the physical and the conceptual worlds are enormously complex, and there are great opportunities to model them more accurately: chemical reactions, airplane wings, disposable diapers, economies, and social networks are still far from being well understood. Telepresence is still much worse than face-to-face meetings between people, real time translation of spoken language is primitive, and the machine can seldom understand what the user doing a search is actually looking for. So there’s still lots of opportunity for innovations in modeling and connection. This is especially true in education, where computers could provide teachers with power tools.



Nonetheless, I think that the most exciting applications of computing in the next 30 years will engage with the physical world in a non-trivial way. Put another way, computers will become embodied. Today this is in its infancy, with surgical robots and airplanes that are operated remotely by people, autonomous vacuum cleaners, adaptive cruise control for cars, and cellphone-based sensor networks for traffic data. In a few years we will have cars that drive themselves, prosthetic eyes and ears, health sensors in our homes and bodies, and effective automated personal assistants. I have a very bad memory for people’s names and faces, so my own dream (easier than a car) is a tiny camera I can clip to my shirt that will whisper in my ear, “That’s John Smith, you met him in Los Angeles last year.” In addition to saving many lives, these systems will have vast economic consequences. Autonomous cars alone will make the existing road system much more productive, as well as freeing drivers to do something more useful or pleasant, and using less fuel. 



What is it that determines when a new application of computing is feasible? Usually it’s improvements in the underlying hardware, driven by Moore’s law (2× gain / 18 months). Today’s what-you-see-is-what-you-get word processors were not possible in the 1960s, because the machines were too slow and expensive. The first machine that was recognizably a modern PC was the Xerox Alto in 1973, and it could support a decent word processor or spreadsheet, but it was much too small and slow to handle photographs or video, or to store music or books. Engagement needs vision, speech recognition, world modeling, planning, processing of large scale data, and many other things that are just beginning to become possible at reasonable cost. It’s not clear how to compare the capacity of a human brain with that of a computer, but the brain’s 10^15 synapses (connections) and cycle time of 5 ms yield 2×10^17 synapse events/sec, compared to 10^12 bit events/sec for a 2 GHz, 8 core, 64 bit processor. It will take another 27 years of Moore’s law to make these numbers equal, but a mouse has only 10^12 synapses, so perhaps we’ll have a digital mouse in 12 years (but it will draw more power than a real mouse).

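The arithmetic behind these estimates is easy to check. The short Python sketch below is an illustrative back-of-the-envelope calculation (not part of the original talk) that reproduces the roughly 27-year and 12-year figures from the synapse and processor numbers given above.

import math

# Rough capacity estimates from the text above.
brain_synapse_events = 1e15 * (1 / 5e-3)   # 10^15 synapses at a 5 ms cycle -> 2e17 events/sec
cpu_bit_events       = 2e9 * 8 * 64        # 2 GHz x 8 cores x 64 bits -> ~1e12 bit events/sec
mouse_synapse_events = 1e12 * (1 / 5e-3)   # a mouse brain: 10^12 synapses -> 2e14 events/sec

def years_of_moores_law(target, current, doubling_years=1.5):
    # Number of doublings needed, at one doubling every 18 months.
    return math.log2(target / current) * doubling_years

print(years_of_moores_law(brain_synapse_events, cpu_bit_events))   # about 26-27 years
print(years_of_moores_law(mouse_synapse_events, cpu_bit_events))   # about 11-12 years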


Hardware is not the whole story, of course. It takes software to make a computer do anything, and the intellectual foundations of software are algorithms (for making each machine cycle do more useful work) and abstraction (for mastering complexity). We measure a computer or communication system externally by its bandwidth (jobs done per unit time), latency (start to finish time for one job) and availability (probability that a job gets done on time). Internally we measure the complexity, albeit much less precisely; it has something to do with how many component parts there are, how many and how complex are the connections between parts, and how well we can organize groups of parts into a single part with only a few external connections.



There are many methods for building systems, but most of them fit comfortably under one of three headings: Approximate, Increment, and Divide and conquer—AID for short.


● An approximate result is usually a good first step that’s easy to take, and often suffices. Even more important, there are many systems in which there is no right answer, or in which timeliness and agility are more important than correctness: internet packet delivery, search engines, social networks, even retail web sites. These systems are fundamentally different from the flight control, accounting, word processing and email systems that are the traditional bread and butter of computing.


● Incrementally adjusting the state as conditions change, rather than recomputing it from scratch, is the best way to speed up a system (lacking a better algorithm); a small code sketch after this list illustrates the idea. Caches in their many forms, copy-on-write, load balancing, dynamic scale-out, and just-in-time compilation are a few examples. In development, it’s best to incrementally change and test a functioning system. Device drivers, apps, browser plugins and JavaScript incrementally extend a platform, and plug-and-play and hot swapping extend the hardware.


● Divide and conquer is the best single rule: break a big problem down into smaller pieces. Recursion, path names such as file or DNS names, redo logs for failure recovery, transactions, striping and partitioning, and replication are examples. Modern systems are structured hierarchically, and they are built out of big components such as an operating system, database, a browser or a vision system such as Kinect.

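As a concrete illustration of the Increment rule, here is a minimal Python sketch (a made-up example, not from the talk): a cached total over a large collection is adjusted in place when one element changes, instead of being recomputed from scratch, which is the same idea that underlies caches and copy-on-write.

# Incremental update: keep a cached aggregate and adjust it when one
# element changes, rather than recomputing over the whole collection.
# Illustrative sketch only; the names here are invented for the example.

class RunningTotal:
    def __init__(self, values):
        self.values = list(values)
        self.total = sum(self.values)      # computed from scratch once

    def update(self, index, new_value):
        # O(1) incremental adjustment instead of an O(n) recomputation.
        self.total += new_value - self.values[index]
        self.values[index] = new_value

    def recompute(self):
        # The slow path, kept only to check the incremental answer.
        return sum(self.values)

totals = RunningTotal(range(1_000_000))
totals.update(42, 7)
assert totals.total == totals.recompute()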


For engagement, algorithms and abstraction are not enough. Probability is also essential, since the machine’s model of the physical world is necessarily uncertain. We are just beginning to learn how to write programs that can handle uncertainty. They use the techniques of statistics, Bayesian inference and machine learning to combine models of the connections among random variables, both observable and hidden, with observed data to learn parameters of the models and then to infer hidden variables such as the location of vehicles on a road from observations such as the image data from a camera.
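To show the flavor of such programs, the toy Python sketch below implements a discrete Bayes filter (an illustrative instance of the general technique, not code from the talk): it maintains a probability distribution over a vehicle's position on a small one-dimensional road and combines a motion model with noisy sensor readings to infer the hidden position.

# Toy discrete Bayes filter: infer a hidden variable (vehicle position
# on a road with 10 cells) from uncertain observations.
# Illustrative sketch; the numbers and sensor model are made up.

def normalize(dist):
    total = sum(dist)
    return [p / total for p in dist]

def predict(belief, move_noise=0.1):
    # Motion model: the vehicle probably moved one cell forward.
    n = len(belief)
    new_belief = [0.0] * n
    for i, p in enumerate(belief):
        new_belief[(i + 1) % n] += p * (1 - move_noise)   # moved as expected
        new_belief[i] += p * move_noise                    # stayed put
    return new_belief

def update(belief, measured_cell, sensor_accuracy=0.8):
    # Measurement model: the sensor reports the right cell 80% of the time.
    n = len(belief)
    likelihood = [(sensor_accuracy if i == measured_cell
                   else (1 - sensor_accuracy) / (n - 1)) for i in range(n)]
    return normalize([p * l for p, l in zip(belief, likelihood)])

belief = [0.1] * 10                       # start with no idea where the car is
for reading in [3, 4, 5]:                 # three noisy position readings
    belief = update(predict(belief), reading)
print(max(range(10), key=lambda i: belief[i]))   # most likely position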



Some applications of engagement are safety critical, such as driving a car or performing surgery, and these need to be much more dependable than typical computer systems. There are methods for building dependable systems: writing careful specifications of their desired behavior, giving more or less formal proofs that their code actually implements the specs, and using replicated state machines to ensure that the system will work even when some of its components fail. Today these methods only work for fairly simple systems. There’s much to be learned about how to scale them up, and also about how to design systems so that the safety critical part is small enough to be dependable.



Engagement can be very valuable to users, and when it is they will put up with a lot of hassle to get the value; consider an artificial eye for a blind person, for example. But other applications, such as a system that tells you which of your friends are nearby, are examples of ubiquitous computing that although useful, have only modest value. These systems have to be very well engineered, so that the hassle of using them is less than their modest value. Many such systems have failed because they didn’t meet this requirement.



The computing systems of the next few decades will expand the already successful application domains that model the world and connect people, and exploit the new domain that engages computers with the physical world in non-trivial ways. They will continue to be a rich source of value to their users, who will include almost everyone in the world, and an exciting source of problems, both intellectual and practical, for their builders. 



About the author: Butler Lampson is a renowned computer scientist and winner of the 1992 Turing Award. He is currently a Technical Fellow at Microsoft. One of the founders of Xerox PARC, he helped design the SDS 940 time-sharing system, the Alto personal distributed computing system, the Xerox 9700 laser printer, two-phase commit protocols, the Autonet LAN, the SPKI system for network security, the Microsoft Tablet PC software, the Microsoft Palladium high-assurance stack, and several programming languages. His work has spanned computer architecture, local-area networks, raster printers, page description languages, operating systems, remote procedure call, programming languages and their semantics, programming in the large, fault-tolerant computing, transaction processing, computer security, WYSIWYG editors, and tablet computers. Among many honors, he received the ACM Software System Award in 1984, the IEEE Computer Pioneer Award in 1996, the von Neumann Medal in 2001, and the National Academy of Engineering's Draper Prize in 2004.

