Invited Program

 

The COOL Chips VII Organization Committee would like to inform you of the topics of COOL Chips VII, as the deadline for online registration (5 April 2004) is approaching.

 

   Contents
  • Panel Discussion: How can we survive in an era of billion-transistor chips?
  • ACCESS Co. Ltd., ARM Ltd., Fujitsu Limited, IPFlex Inc., Transmeta Corporation
  • Lecture-1, Lecture-2-1, Lecture-2-2, Lecture-3



 

Special Lecture

Date:  Wednesday, April 14, 2004
Place:  Yokohama Joho Bunka Center  (Yokohama Media & Communications Center)

  • Lecture 1   (9:30-12:00)
    "Grid and Cluster Computing"
     Chair: Yuetsu Kodama (National Institute of Advanced Industrial Science and Technology)

  • Lecture 2   (13:00-15:30)
    "Chip Multiprocessor"
     Chair: Hideharu Amano (Keio University)

  • Lecture 3   (16:00-18:30)
    "Low Energy Computing"
     Chair: Shinichiro Haruyama (Keio University)

 


 

Registration Fee

                     Member    Non-Member   Student
(after 26 March)     36,000    46,000       10,000

 


 

Keynote 1.    John Cornish   (Director, Product Marketing, ARM Ltd.)

[A] Title:
Energy Optimisation in Multi-Core Systems

[B] Abstract:
Energy efficiency is now the number one issue for many applications, determining weight and cost, and constraining performance. As the industry moves to 90 nm and 65 nm processes the efficiency limitations of complex uni-processors are becoming more apparent. Multi-processor systems have traditionally been used to deliver high computational performance for demanding applications. In the system-on-chip world multi-core designs can also deliver greater energy efficiency than comparable uni-processors. In a multi-core system the programmer identifies opportunities for parallel execution and writes threads which can be dynamically assigned to individual processors. In a high performance uni-processor, instruction level parallelism must be extracted using large numbers of logic gates which consume both silicon area and power. Multi-core implementations can deliver other power advantages as a result of their scalability. For light workloads CPUs can be turned off completely or can be voltage scaled to minimise dynamic power while still satisfying application requirements and user expectations. The fundamental energy advantages of multi-core design provide a powerful incentive to address the software challenges of symmetric multi-processing.
This presentation describes work which ARM and its partners are doing to realise the full potential of multi-core design for system-on-chip.

Keynote 2.    Daniel F. Zucker   (Director of Technology, ACCESS Co. Ltd.)

[A] Title:
Cool Chips Enable Ubiquitous Web Browsing

[B] Abstract:
Computing is now in the midst of yet another great paradigm shift. First mainframes gave way to mini-computers, then mini-computers to PCs, and today handheld computers are drastically outnumbering PCs. In 2003, approximately 510 million mobile handsets were sold, compared with 164 million PCs (Gartner). Virtually everyone now carries a cellphone with them everywhere, and a cellphone today is just a specialized type of mobile computer. With increasing computing power, increasing storage capacity, and plummeting costs, soon everyone will be armed with a ubiquitous general-purpose mobile computer.

ACCESS Co. Ltd. is the world's leading supplier of web browsers for mobile, wireless, and embedded devices. ACCESS is based in Tokyo, a center of leading-edge mobile wireless technology. Our browser has been ported to many operating systems and hardware devices. The web browser is often the gateway application to mobile information, media, and services.
This talk presents an overview of mobile applications and trends in mobile terminals and PDA devices.
I survey smartphone OSes and some popular mobile CPU architectures, and present several demonstrations of the latest mobile multimedia technology. The system-level components that enable the comprehensive mobile computing of today will also drive the ubiquitous web browsing of tomorrow.

Invited 1.      Tomoyoshi Sato   (Vice President & CTO, IPFlex Inc.)

[A] Title:
Dynamically Reconfigurable Processor DAP/DNA-2 and Development DAP/DNA-FW

[B] Abstract:
DAP/DNA-2 is a dynamically reconfigurable processor developed for production use. It has 12 million gates and runs at a clock frequency of 166 MHz. It consists mainly of two parts: an original 32-bit RISC core, and DNA matrices capable of fully parallel 16/32-bit data processing. In the session, measured performance and power consumption data are shown for several application examples.
We describe some architectural trade-offs of the device, in particular the current efforts to save power, and the future plan.
For this technology, it is even more important to discuss the development environment, including compilation technology and cycle-accurate simulation technology whose results match emulation. We provide two methods of performance analysis: a statistical method based on simulation or emulation, and source code analysis. More seamless software tools are necessary to make design easy.

We also describe the granularity issue in dynamically reconfigurable processor technology. An architecture combining coarse grain and fine grain can handle a wide range of implementations for various applications. We are taking on this challenge, and some issues remain to be solved in the next step.

Invited 2.      Masahito Kubo   (Director, Architecture/ FR-V Solution, LSI Group, Fujitsu Limited)

[A] Title:
Bio Server :
--- Packing a New Level of Processing in a Small Footprint
--- Density-Intensive Computing for Higher Parallelism

[B] Abstract:
We have developed a network-based, highly parallel computer system that uses 8-way VLIW FR550 series CPUs for the molecular dynamics simulation required in bioinformatics-based genomic drug design.
The system is called "BioServer" and has been developed as a bioinformatics platform for molecular dynamics simulation (MD simulation) of proteins. A few prototype systems were built and began operation in November 2003. The systems are being evaluated with respect to applicability, usability, and performance for the application. The system performs high-speed computation, including protein structure analysis and the movement of protein molecules, which is essential for today's ever-evolving biomedical technology. The system offers the possibility of drastically improving the efficiency of genomic drug design.
The system is a parallel processing server that adopts 8-way VLIW FR550s as its core engines. The FR550 is the high-end product of the FR-V series of microprocessors based on a VLIW architecture. The system connects numerous FR550 processors through a network.
Taking advantage of the low power consumption and high performance of the FR550, 128 CPUs are packed in a compact, rack-mountable chassis. Based on these rack-mountable building blocks, about 2000 CPUs can be packed in a standard 19" rack. With this extremely high processing density, it achieves ultra-high parallel processing at low TCO.

In the presentation, the author discusses the requirements for the project, the architecture of the system, and, briefly, the FR550 CPU used in the system. The author also discusses a new index of computing performance, processing density, which represents a new area of computing achievable only with high-performance but low-power microcontrollers; in a sense, hot but cool chips: hot in performance, cool in thermal and efficiency terms.

Invited 3.      David R. Ditzel   (Founder & CTO, Transmeta Corporation)

[A] Title:
Transmeta's Low Power Efficeon Microprocessor

[B] Abstract:
Transmeta's Efficeon processor is a new high-performance, low-power x86 compatible microprocessor.
This talk will present the system interfaces and microarchitecture of Efficeon. Efficeon provides x86 compatibility through a unique combination of VLIW hardware and Code Morphing translation software. The different stages of Code Morphing translation will be described. The performance, power, and board space compared to other x86 processors will be discussed. Transmeta's processor roadmap in 130nm, 90nm and beyond will be shown.

This talk will also describe a new power management technique, called LongRun2, which can help reduce leakage power and transistor Vt variations at runtime.

Lecture 2-2.    Shorin KYO   (Principal Researcher, NEC Corporation)


[A] Title:
Programmable SIMD Highly Parallel Processor Chips for Real-time Video Recognition Applications

[B] Abstract:
Real-time video recognition requires very high performance on complex algorithms, owing to the large amount of pixel data and the wide variety of open-air scenes; nevertheless, compactness and power efficiency are often required of its hardware implementation. After a brief review of some previous attempts, a highly parallel video recognition chip targeting vision-based intelligent cruise control applications is presented. By integrating 128 4-way VLIW (Very Long Instruction Word) PEs in a SIMD linear array architecture and operating at 100 MHz, the chip provides enough computation power for a weather-robust lane mark and vehicle detection function, written in a high-level programming language, to run at video rate, while at the same time satisfying the power efficiency requirements of an in-vehicle LSI. Four basic parallel methods and a software environment, including an optimizing compiler for an extended C language, facilitate efficient development of real-time video recognition applications that effectively utilize the 128 processing elements. Benchmark results show that the chip can provide four times better performance than a 2.4 GHz general-purpose microprocessor. The result shows the potential of highly parallel processors as a major hardware architecture for real-time video recognition applications.

Lecture 1.    Satoshi Matsuoka   (Professor / Global Scientific Information & Computing Center, Tokyo Institute of Technology)


[A] Title:
Grid-Cluster Federation: Towards a Petascale Research Grid Infrastructure

[B] Abstract:
Grid computing will not only allow researchers to tackle large problems not possible with current computing infrastructures, but will also make available large, collaborative virtual computing environments within organizations as well as across organizations, in some cases on a global scale. We believe that a federation of clusters on a Grid, ranging from the very small to the very large, will serve as the mainstream computing and storage resource to address these massive IT resource needs. The talk will cover our ongoing R&D efforts to effectively implement such infrastructures, which amount to providing multi-teraflops to petaflops of computing power and multi-petabytes of storage in a ubiquitous fashion, namely (1) the Titech Campus Grid project, an experimental Grid deployment project within the Titech campus, consisting of over 1000 processors of various GSIC resources; (2) the latest Japanese National Research Grid Initiative (NAREGI), which aims to build the next-generation Grid middleware intended to be used throughout the research Grids in Japan as well as in other countries, and is structured as a triad of Grid middleware R&D, Grid enabling of nanoscience applications, and provision of a 100-teraflops-scale testbed with demonstration of grand-challenge nanoscience applications on it; and finally (3) the JST-CREST MegaScale project, which attempts to address the issues of massive power consumption, very large space, and reliability requirements through advanced application of low-power and dependable computing technology to such Grid-Cluster federation environments.

Lecture 3.    Krishna V. Palem   (Professor / School of Electrical and Computer Eng. & College of Computing Senior Research Leader, Georgia Institute of Technology)


[A] Title:
Low Energy Computing

[B] Abstract:
The energy consumed by computations is a significant concern, especially within the context of embedded systems, on par with the past focus on raw speed or its derivative performance in the high-performance computing domain. In this talk, we will outline a novel framework for low energy computing built on the fundamental and novel thesis that the energy consumed by a computation is proportional to the associated accuracy, characterized as the probability with which each computed bit is correct. This has led to the exploration of an entirely new dimension of energy-aware computing: trading the probability of the BIT being correct for savings in the energy consumed, yielding a probabilistic bit or PBIT (instead of a conventional BIT which is guaranteed to be correct).
With this as background, probabilistic hardware devices and gates realized from conventional CMOS technology for computing PBITs will be described. These probabilistic hardware devices provide a natural implementation for designing and analyzing probabilistic algorithms wherein the figure of merit is the energy complexity---a measure of the (physical) energy consumed. Such algorithms have found wide use in emerging applications such as speech and image understanding, robotics, and decision aids. Probabilistic devices compute with a definite probability of error, and have been shown to serve as natural building blocks for realizing probabilistic logic designs, yielding significant energy savings in a variety of embedded computing applications---over a factor of 100 in interesting cases.
Some of these results, as well as the implications of probabilistic hardware for sustaining and accelerating Moore's law, will be surveyed. At a deeper level, all of this work rests on the twin foundations of classical thermodynamics (of Maxwell, Boltzmann and Gibbs) and the relatively modern computational complexity theory. Time permitting, these foundations will be surveyed.

Panel Discussion

[A] Title:
Outlook for low-power and high-performance processor design:
- How can we survive in an era of billion-transistor chips? -

Thanks to rapid progress in semiconductor technology, we are entering an era of billion-transistor chips. Even now, it is not rare for aggressive architectures discussed in papers a couple of years ago to be commercially available as advanced processors. As we will be able to bring virtually anything we want onto a chip, one of the primary limitations in designing processor architectures may be the architects' imagination. To survive in the billion-transistor era, we have to find the right solution to avoid the serious situation of starving amid plenty in processor design with a billion-transistor budget. The panel will explore the design space in pursuit of not only computing performance and functionality but also power and space efficiency, from viewpoints ranging from logic design to system-level integration, taking into account the requirements of many application fields.

[B] Moderator:
       Hiroaki Kobayashi, Professor, Tohoku University

[C] Panelists:
       Hideharu Amano, Professor, Keio University

       Tack-Don Han, Professor, Yonsei University

       Albert A. Liddicoat, Professor, California Polytechnic State University

       Yasuhiko Hagihara, Researcher, NEC

       Wonyong Sung, Professor, Seoul National University

       Osamu Takahashi, Researcher, IBM Austin





Prof. Hideharu Amano
Hideharu Amano received the Ph.D. degree from the Department of Electronic Engineering, Keio University, Yokohama, Japan, in 1986. He is currently a professor in the Department of Information and Computer Science, Keio University. His research interests include parallel architectures and reconfigurable systems.


Prof. Tack-Don Han
Tack-Don Han received his Ph.D. from the University of Massachusetts at Amherst, U.S.A., in 1986. He received his master's degree from Wayne State University in Detroit, U.S.A., in 1982 and his B.S. from Yonsei University in Korea in 1978.

He was an assistant professor at Cleveland State University between 1987 and 1989. He was a faculty member (part-time teaching position) at NASA Lewis Center in 1998. He has been a faculty member of Yonsei University since 1999. Currently, he is director of a National Research Lab. and leads a research team working on 3-D graphics architecture.

He was also a visiting professor at Stanford University in 1996 and at Kyoto University in 1993. He is a founder of Colorzip Media Inc., and he developed the Color Code solution, which has been used as a new code system in mobile camera phones since 2000.

His research interests include High Performance Computing Architecture, 3-D Graphics Architecture and Ubiquitous Computing.


Prof. Albert A. Liddicoat
Albert A. Liddicoat received his B.S. degree in Electronic Engineering from California Polytechnic State University, San Luis Obispo, in 1989. He received his M.S. degree in Electrical Engineering, his M.S. degree in Industrial Engineering and Engineering Management, and his Ph.D. degree in Electrical Engineering from Stanford University in 1997, 1999, and 2002, respectively.

Albert joined the Computer Engineering Program and EE Department at Cal Poly in September 2002. Before joining the Cal Poly faculty, Albert worked for IBM in the Storage Technology Division, where he held many positions in the area of disk drive development, including servo system test and integration, ASIC development, system electronics and architecture, project leadership, program management, and business line management. In the spring of 1999 and the spring of 2001 he was a teaching fellow at the Stanford Japan Center in Kyoto, Japan, where he taught undergraduate engineering courses and helped coach industry-sponsored product design teams. Albert's research interests include computer architecture, computer arithmetic, networks, and reconfigurable computing.


Mr. Yasuhiko Hagihara
Mr. Hagihara received the B.S. and M.S. degrees in electrical engineering from the University of Tokyo in 1986 and 1988, respectively. He joined the Microelectronics Laboratory at NEC in 1988 and is currently a member of the System Devices Research Laboratories. His interests include high-speed circuit design and design methodology.


Prof. Wonyong Sung
Wonyong Sung received his B.S. degree in electrical engineering from Seoul National University in 1978, his M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST) in 1980, and his Ph.D. degree in electrical and computer engineering from the University of California, Santa Barbara, in 1987. During his Ph.D. course, he studied parallel processing algorithms, vector and multiprocessor implementation, and low-complexity FIR filter design. He has been a member of the faculty of Seoul National University since 1989. From January 1998 to December 1999, he served as chief of SEED (System Engineering and Design Center, now the Embedded Systems Research Center) at Seoul National University. He was deputy chair of the School of Electrical Engineering from September 2001 to August 2003. He is now visiting the Center for Embedded Computer Systems at the University of California, Irvine, from January to December 2004.

Wonyong Sung has been an active member of the academic community in his field. He was an associate editor of the IEEE Transactions on Circuits and Systems II from 2000 to 2001, and is a member of the design and implementation technical committee of the IEEE Signal Processing Society and of the VLSI systems and application technical committee of the IEEE Circuits and Systems Society. He was the general chair of the 2003 SIPS (Signal Processing Systems) Workshop in Seoul. He also attended the SOC Workshop held in Tampere, Finland, as an invited speaker in 2003.

He has also been interested in applying his engineering expertise in industry. From 1980 to 1983, he worked for LG Electronics in Korea. From 1993 to 1994, he consulted for the Alta Group of Cadence Design Systems on the development of the Fixed Point Optimizer. In 2000, he founded EduMediaTech and developed a handheld multimedia educational device, SpeakingPartner.

Currently, his major research interests are the development of fixed-point optimization tools, embedded multimedia system architecture design and optimization, parallel processor architecture based implementations, VLSI design for digital signal processing, and high-bandwidth memory architecture design.


Dr. Osamu Takahashi
Osamu Takahashi received the B.S. degree in engineering physics and the M.S. degree in electrical engineering from the University of California, Berkeley, in 1993 and 1995, respectively. He received the Ph.D. in computer and mathematical sciences from Tohoku University, Sendai, Japan, in 2001.

He joined the IBM Austin Research Lab, Texas, in 1995 as a member of the high-performance VLSI design research team. He was involved in the development of the world's first 1 GHz processor, presented at ISSCC 1998, as well as the development of a fully synchronous and pipelined 1 GHz, 1 MB embedded DRAM macro, presented at ISSCC 2000. His other work includes high-performance embedded PowerPC processor core development and multi-gigahertz PowerPC processor development. His specialties include general-purpose multi-port register file design, SRAM array design, dynamic programmable logic array design, latch design, custom logic design, processor internal power distribution design, and physical chip integration. He is the lead author of seven publications and a co-author of six others. He currently holds 16 US patents.

Since he transferred to the IBM Microelectronics division in 2001, he has been involved in the media processor design jointly developed by SONY, TOSHIBA, and IBM. Currently, he is the senior engineering manager of the design team for the auxiliary processor. He is responsible for the physical aspects of the auxiliary processor design including circuits, timing, physical design verification, and chip integration.



Page last updated: April 2, 2004