March 30, 2008
4 More Years Of Silicon Chip Speed Increases Left?
If silicon chips stop speeding up four or five years from now we'll experience lower economic growth. Faster computer chips are one of the drivers of higher productivity and economic output. Well, silicon might finally be approaching the end of the line for speed-ups. This has important (mainly negative) implications for economic growth.
The silicon chip, which has supplied several decades’ worth of remarkable increases in computing power and speed, looks unlikely to be capable of sustaining this pace for more than another decade – in fact, in a plenary talk at the conference, Suman Datta of Pennsylvania State University, USA, gives the conventional silicon chip no longer than four years left to run.
As silicon computer circuitry gets ever smaller in the quest to pack more components into smaller areas on a chip, eventually the miniaturized electronic devices are undermined by fundamental physical limits. They start to become leaky, making them incapable of holding onto digital information. So if the steady increases in computing capability that we have come to take for granted are to continue, some new technology will have to take over from silicon.
We could still extend the speed-up era by several years with more parallel architectures. That's already happening to some extent with multi-core CPU chips. But software that can effectively utilize many CPU cores in parallel has been slow in coming. You can see this with Mozilla Firefox, for example. I have a dual core CPU. If I open up 50 to 100 web pages in Firefox at once (and I do this often) then Firefox never takes more than 50% of available CPU. Why? It can't use multiple threads (at least in the 2.x releases - can v3?) and so Firefox maxes out a single thread of execution, fully utilizing just one of my 2 CPU cores. The 50% of total CPU usage in Windows Task Manager means 50% of 2 cores in my case. So 100% of 1 core.
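The single-core ceiling is easy to demonstrate: one CPU-bound process pegs exactly one core, while splitting the same work across a small process pool can keep both cores of a dual-core machine busy. A minimal Python sketch (the workload here is a made-up stand-in, not anything Firefox actually does):

```python
import multiprocessing as mp

def busy_work(n):
    # A CPU-bound loop: a made-up stand-in for any single-threaded workload.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    # Running busy_work once pegs one core; a pool of two processes
    # can keep both cores of a dual-core machine busy at the same time.
    with mp.Pool(processes=2) as pool:
        results = pool.map(busy_work, [200_000, 200_000])
```

The answers are identical either way; the only difference is wall-clock time, which is exactly the gain a multithreaded browser would see.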
This limit on the part of Firefox is disappointing. If a very popular development project with a large number of contributors and millions of users is lagging, then how long will it take for less important apps to become more parallelized?
Some areas of computing could still accelerate once silicon chips stop getting faster. Subsets of computer algorithms could migrate into gate logic rather than being expressed as software that runs as a sequence of instructions in memory. In other words, abandon the Von Neumann architecture. That's not easy to do in the general case. But lots of algorithms (such as those in graphics chips) already get implemented in logic gates.
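As a toy illustration of expressing an algorithm directly as gate logic, here is a ripple-carry adder built from nothing but AND, OR, and XOR operations - the software equivalent of the combinational circuit a chip designer would lay down in gates:

```python
# A 1-bit full adder expressed purely as logic gates (AND, OR, XOR).
def full_adder(a, b, carry_in):
    s1 = a ^ b                              # XOR gate
    total = s1 ^ carry_in                   # sum bit
    carry_out = (a & b) | (s1 & carry_in)   # carry logic
    return total, carry_out

def ripple_add(x, y, bits=8):
    # Chain full adders bit by bit, as a ripple-carry circuit would.
    carry, result = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

# 8-bit addition done entirely in "gate logic":
assert ripple_add(25, 17) == 42
```

A hardware implementation evaluates all of those gates in parallel in a single propagation delay, which is the whole appeal of moving algorithms out of instruction streams and into logic.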
As a way to get past silicon's speed limits, carbon nanotubes might replace silicon in computer fabrication.
At the conference, researchers at Leeds University in the UK will report an important step towards one prospective replacement. Carbon nanotubes, discovered in 1991, are tubes of pure carbon just a few nanometres wide – about the width of a typical protein molecule, and tens of thousands of times thinner than a human hair. Because they conduct electricity, they have been proposed as ready-made molecular-scale wires for making electronic circuitry.
It seems unlikely carbon nanotubes will be ready to replace silicon in 5 years. So I suspect we are going to enter a gap period where computing capacity doesn't grow as rapidly as it has in the last 50 years.
Graphene also might replace silicon in computer fabrication.
Research results from University of Maryland physicists show that graphene, a new material that combines aspects of semiconductors and metals, could be a leading candidate to replace silicon in applications ranging from high-speed computer chips to biochemical sensors.
The research, funded by the National Science Foundation (NSF) and published online in the journal Nature Nanotechnology, reveals that graphene conducts electricity at room temperature with less intrinsic resistance than any other known material.
"Graphene is one of the materials being considered as a potential replacement of silicon for future computing," said NSF Program Manager Charles Ying. "The recent results obtained by the University of Maryland scientists provide directions to achieve high-electron speed in graphene near room temperature, which is critically important for practical applications."
Graphene is a sheet of carbon that is only 1 atom thick. That's as thin as thin gets.
Carbon comes in many different forms, from the graphite found in pencils to the world's most expensive diamonds. In 1980, we knew of only three basic forms of carbon, namely diamond, graphite, and amorphous carbon. Then, fullerenes and carbon nanotubes were discovered and all of a sudden that was where nanotechnology researchers wanted to be. Recently, though, there has been quite a buzz about graphene. Discovered only in 2004, graphene is a flat one-atom thick sheet of carbon.
We might hit a computer Peak Silicon at the same time we hit Peak Oil. But while the 2010s are looking problematic I'm more bullish on the 2020s due to advances in biotechnology that should really start to cause radical changes by then. Also, by the 2020s advances in photovoltaics, batteries, and other energy technologies should start to bring in replacement energy sources faster than fossil fuels production declines.
We still have a lot of room for computing speed growth even while the speed of the chips stays the same. I'm talking about the efficiency of algorithms. As of today, programs are extremely inefficient in their use of hardware, and the reason is cost: there's no need to think about (and invest in) saving memory and processing cycles if machines keep getting faster all the time. I know it, I see it every day in the course of my work, and I'm very frustrated.
So, we'll just need more (and better) programmers to speed things up.
Greg, time spent optimizing is a cost. I happen to like doing optimizations. But I'm always having to trade off time spent optimizing in one area against time spent adding new functionality in other areas. The need to optimize will slow down development.
Also, memory capacity won't increase as rapidly.
Parallel computing will soon become important for mainstream software. Already, many software companies are insisting on multithreading skills before hiring. The 8-core versions of Intel's server chips are about to hit the market, and in a few years we shall almost certainly have 100 cores per CPU. But then the challenge will be to use those cores simultaneously. Improvements will also be made in bandwidth. With multithreading, Windows programs can be made dramatically faster - if programmers actually take advantage of 100 cores.
Additionally, optical computing will dramatically accelerate computing by many orders of magnitude. The limit will be the speed of light. But since it is practically impossible for any mass to travel faster than the speed of light, interstellar expeditions will suffer severe constraints: there will be mutinies on long interstellar trips if crew members born en route disagree with the original plan and decide to return to Earth.
This is a postdiction, actually. CPUs have been two-point-something GHz for the last five years. It's already over.
One of the reasons to bet big on virtualization is that behemoths like Windows will never be able to scale to hundreds of CPU cores. So you write a VM (virtual machine) that *can* scale, and then run all your old crappy single-threaded apps and operating systems in parallel as separate VMs. It's criminally inefficient to add yet another layer of goop like this, but it's more efficient than fixing all the code.
The reality of parallel computing is that it is just hard. Very few everyday algorithms are parallel by nature or easy to make parallel in an effective way. Sure, you can thread, but not everything is sensibly threadable either. Operating systems are particularly hard since most of the obviously parallel stuff started migrating out into device controllers in the 60s! (However, why on earth do CD drives still spin? Why don't they just have a gig of memory, read the disc once, and then shut down? The reverse for writing. This really bugs me.)
There's already, I believe, a model out there for a non-Von Neumann architecture machine, where most of the processing power is in a vast pool of programmable logic arrays, and the software simply directs the machine on how to wire them up to implement the program in logic gates. This makes the logic optimization tools already available for chip design applicable to 'program' design.
Of course, I've stopped guessing where these things are going, since I blew all that money on a transputer array to practice my programming skills on, in an effort to get ahead of the curve...
But my impression is that the number of clock ticks per instruction has been dropping. Also, design rules are still shrinking. So there has been some increase in performance, if not at the rate of the 80s and 90s.
I totally agree about the non-parallel nature of most algorithms. We can't just switch to more parallel algorithms either. Most problem statements aren't that parallel.
Maybe climate models lend themselves to parallelization since each square of atmosphere has to have calcs done on it. Some other math models likewise. But a lot of stuff is transactional with all sorts of order dependency built in.
xman, compact disc drives don't work the way you describe because reading an entire disc up front takes too long. Even a 52x drive needs a couple of minutes to read a full 700MB disc, and it only reaches its rated speed at the outer edge.
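A rough sanity check on that arithmetic (idealized, since it assumes the drive sustains its rated speed across the whole disc, which real constant-angular-velocity drives don't):

```python
# 1x CD speed is 150 kilobytes per second; a "52x" drive peaks at 52x that.
BASE_KBPS = 150

def full_read_seconds(disc_mb, speed_factor):
    # Idealized time to read the whole disc at a sustained speed.
    return disc_mb * 1024 / (BASE_KBPS * speed_factor)

t = full_read_seconds(700, 52)   # about 90 seconds at a sustained 52x
# Real CAV drives only hit 52x at the outer edge, so an actual full read
# takes a few minutes -- still too long to do on every disc insert.
```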
There are at least 2 measures of interest. One is clock speed, which has stagnated for reasons of power consumption. The other is feature size, which has not yet stagnated, and which is the measure I think led to the "4 year" forecast. Silicon may indeed be approaching end of life on that measure. But as the graphene meme shows, we are likely to be able to shift tracks and keep things going awhile longer - but not forever.
Also, most applications that a) require lots of CPU; and b) are sufficiently important that people are willing to pay > commodity prices, have c) already been parallelized so that commodity hardware suffices. E.g., databases and web sites.
Unlike Randall, most people (excepting gamers and video editors) won't benefit much from > 2 cores, at least not until usage patterns change significantly because we mostly do 1 thing at a time, and most things we do (surf, email, word processing) are fine with 1 thread. FYI, I got Firefox 3 beta 4 to drive my 2 cores to a total of 50.7%, opening many pages. That may be a measurement error.
Speed up the processor near the speed of light and you'll get more work done per unit time, at least relative to other, slower processors.
I work in the industry.
The consensus is that silicon will work down to the 22nm design rule (around 2012). The 22nm rule is likely to be the last generation in terms of dimensionality. There is considerable work (and money being spent) on silicon replacements. The two leading contenders in the industry are graphene and carbon nanotubes. Carbon nanotubes are considered most likely, but recent advances in graphene processing could change this. Both of these technologies will lead to molecular-scale electronics. Both also have significant technical hurdles to overcome in order to make it to prime time. Progress in miniaturization will end once electronics reaches the molecular level, which is expected sometime around 2030.
Future technological advances, beyond 2030, will come mostly from biotechnology and "wet" nanotechnology (which is really a form of biotechnology).
Moore's Law will last until 2018, by which time Nanotubes, Molecular computing, etc. will be viable.
Plus, no one is talking about storage slowing down. Why would silicon chips slow down when storage is not?
As someone who has actually prepared graphene in the laboratory, I guarantee you will not see it in a computer near you any time soon. The yields are terrible. It likes to fold over on itself. It cannot be manipulated.
The method for manufacturing graphene at the moment is to rub a piece of highly oriented pyrolytic graphite onto a silicon wafer coated with a thin layer of SiO2, and then locate graphene by interference microscopy and _hope_ it's one layer and not two, or three, or whatever. There's no way you can use this process commercially. Graphene will almost certainly end up at the same dead end as high-temperature superconductors: too difficult to fabricate and form for commercial applications.
More economic productivity is lost due to poor software system design brought on by code bloat than is lost due to slow processors -- lots more -- I would venture to guess a factor of 10 more.
There are several things that can be done to address this but none of them involve empowering more bad software system design with more powerful processors.
Robert, unlike you I don't have the advantage of actually having worked with graphene, but predicting that today's "state of the art" graphene fabrication will never improve to commercial levels seems a bit premature.
"The consensus is that silicon will work down to the 22nm design rule (around 2012). The 22nm rule is likely to be the last generation in terms of dimensionality. There is considerable work (and money being spent) on silicon replacements. The two leading contenders in the industry are graphene and carbon nanotubes. Carbon nanotubes are considered most likely, but recent advances in graphene processing could change this. Both of these technologies will lead to molecular-scale electronics. Both of these also have significant technical hurdles in order to make it to the prime time. Progress in miniturization will end once electronics reaches the molecular level, which is expected sometime around 2030. "
Wikipedia states it is 16 nm...
Why won't it go down to 16 nm?
Aren't all of these minimum-feature-size limits correlated with chip speed only if we assume a 2D architecture?
That can change if it has to.
Yes, silicon may go down to the 16nm design rule. I think it unlikely to go beyond this.
I personally am guessing we have ten years of scaling down to go on semiconductors. I don't know if we'll see a new technology to continue scaling beyond that, Intel and co are researching a number of different ideas.
But even if we can't scale down below, say, 16nm, we can move outwards and upwards. Outwards meaning chips side by side, or even processors on another part of the motherboard. And upwards meaning stacking on top of each other: two layers, 4 layers, 8 layers, and so on. And to speed up applications you make different types of cores specialized for whatever you are wanting to do.
Ok, chip experts. Any hope on the speed front?
Software experts - Progress there is nowhere near following Moore's law. Research on parallelism has been running for decades. The last big breakthrough was what? Standard markup? Debuggers? IDEs? Component architectures? Any other big breakthroughs coming?
There is an IBM developed system for automatically migrating repeated software functions over to FPGA. (they brand it as warp technology) Can speed systems up 100 times.
There is work by IBM, NEC, and SUN on optical connections between chips and cores. Could speed processing up 10-1000 times and reduce power usage 10 times.
There is work by IBM to use DNA to position carbon nanotubes to a precision of 2 nanometers.
There are several new computer memory architectures which could enlarge memory and speed up memory. Giving the equivalent of unlimited cache and main memory and again speeding up the overall processing.
Clock ticks per instruction doesn't actually matter anymore. When memory was fast relative to CPU, deep pipelines and superscalar designs made a big difference. Today, your CPU spends the overwhelming majority of its time in load stalls (waiting for memory to return the result of a load instruction). A single load stall is typically over 100 cycles. You could design a 100GHz CPU and it wouldn't matter. The speed of memory is the dominant factor. To address this, recent CPU designs have simpler pipelines, less superscalar instruction issue, and more hardware threading. The idea of hardware threading is that the CPU has a few (4-8) threads it knows about, and whenever one thread stalls waiting for a load to complete, the CPU switches to a different thread. It's a single-cycle context switch. This doesn't make the individual threads any faster, but it allows you to get a lot more work done in parallel.
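The hardware-threading idea can be sketched in miniature using coroutines as stand-ins for hardware threads: whenever a "thread" hits a simulated load stall it yields, and a round-robin scheduler immediately hands the "CPU" to the next ready thread. This is only a conceptual sketch of the switching discipline, not how any real CPU is programmed:

```python
from collections import deque

def worker(name, items):
    # A toy "hardware thread": yields whenever it hits a simulated load stall.
    out = []
    for x in items:
        yield                    # stall: hand the "CPU" to another thread
        out.append((name, x))    # the "load" has returned; do the work
    return out

def run_threads(threads):
    # Round-robin scheduler: on every stall, switch to the next ready thread,
    # mimicking a barrel processor's single-cycle context switch.
    ready = deque(threads)
    finished = []
    while ready:
        t = ready.popleft()
        try:
            next(t)
            ready.append(t)              # still running; requeue it
        except StopIteration as stop:
            finished.append(stop.value)  # thread completed; collect its result
    return finished

results = run_threads([worker("a", [1, 2]), worker("b", [3])])
```

No thread runs any faster, but the "CPU" never sits idle waiting on a single stalled thread, which is exactly the point made above.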
Back in the late 1980s a guy I knew named Mario Nemirovsky was doing his Ph.D. at UCSB on an architecture (DISC) where the CPU has multiple hardware threads. Every single cycle a different thread executed, across 4, 8, or 16 threads.
Mario was promoting this approach to General Motors as a sure fire way to get guaranteed execution of all threads in a real time deterministic system. Ultimately GM didn't fund the commercialization of the idea. But I had a lot of talks with him about jumps, bus traffic, and all that. He argued his approach would reduce cache misses by giving the CPU other things to do during a wait for main memory. Just run a thread that isn't blocked. The shift between threads would occur at the time interval of a single clock tick. No need for the OS scheduler to run to switch threads.
He had lots of register files and the CPU had to know which register file it was accessing at any given time. In a way it is slightly like SPARC (if memory serves) where the register frame window shifts during routine calls. So there are a lot more variables in register rather than in a memory stack. Mario's difference is that he had no cost to thread switching.
Memory latency is a big problem. I spent a month last year shrinking down call paths on a Linux ARM device to get some protocol stacks running in cache. Long call paths are killers for cache. So is object-oriented design. The pieces of data that get accessed together are not necessarily in the same class or in close proximity. I give a lot of thought to locality of access when I'm coding for speed.
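The locality effect is visible even from a high-level language: walking the same array sequentially versus in a shuffled order does identical work but very different things to the cache. A small sketch (the size of the timing gap varies by machine; the results are always identical):

```python
import random
import time

N = 1_000_000
data = list(range(N))
sequential = list(range(N))
shuffled = sequential[:]
random.shuffle(shuffled)

def walk(order):
    # Touch every element in the given order: identical work, different locality.
    total = 0
    for i in order:
        total += data[i]
    return total

t0 = time.perf_counter(); s_seq = walk(sequential); t1 = time.perf_counter()
t2 = time.perf_counter(); s_rnd = walk(shuffled);   t3 = time.perf_counter()
# Both sums are identical; on most machines the shuffled walk is measurably
# slower because it defeats the prefetcher and thrashes the cache.
assert s_seq == s_rnd
```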
I've read a little about optical connections between chips/cores, but I'm not sure what that actually means (admittedly, I'm not a computer expert, but hell)--does it make the connection between multiple chips so seamless that it practically acts as one chip?
Also, does anybody know anything about advances in quantum computing? I've heard that quantum computers will make all other computers - and, the way it sounds, the need for human thinking - obsolete in rapid order once they become practical.
> Faster computer chips are one of the drivers of higher productivity and economic output.
It's a nice assertion, but it also happens to be unjustified.
The main drivers of higher productivity are word processors, e-mail, databases, and various kinds of groupware.
For all of these, the microprocessors of five years ago were quite adequate. (Adequate meaning that they produced response times faster than human users could react.)
Faster processors are mostly needed for entertainment (video, games, etc) - and for relatively uncommon cases of scientific computation.
The reality, of course, is that productivity depends more on quality of software than on processor speeds. And that is not improving much (and, looking at Vista and the new king of bloatware aka Linux one could argue that it is getting worse).
Very cool! There aren't many of us left -- i.e. programmers who truly understand what we're actually asking the hardware to do. Most of what is called "programming" today would have been called "using" 20 years ago -- scripting, macros in Word, etc.
You'll appreciate this. When I interview someone for a kernel programming position, this is my first question:
"On your computer, you type the letter L. An L appears on your screen. What just happened?"
Most people, even most CS grads, have no idea. Among people who can offer any answer at all, the depth of understanding varies by orders of magnitude. It's a good filter, not only to assess knowledge, but as a measure of curiosity. If you've been programming since you were 12 and never wondered how your computer works -- or if you did wonder, but couldn't be bothered to find out -- that's valuable information.
One last random observation: every once in a while, I use a computer to perform an actual computation. When I do, it's a stunning reminder of how *fast* these machines have become. You don't notice because most software is *such* a pig. Saving a document as PDF takes forever, but doing a 1000 x 1000 matrix multiply is almost instantaneous.
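For anyone who wants to try the experiment, here is the matrix multiply as a short NumPy sketch (NumPy hands the work to an optimized BLAS routine, which is exactly why it is so fast):

```python
import time
import numpy as np

# Two 1000 x 1000 matrices: multiplying them takes on the order of a
# billion floating-point multiply-adds.
a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)

start = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - start
# Even 2008-era commodity hardware finished this in well under a second;
# modern machines do it in tens of milliseconds.
```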
"On your computer, you type the letter L. An L appears on your screen. What just happened?"
So Jeff, let's see how deep your knowledge is... lots of room to give an answer below:
Currently we have wire connections between chips and cores. Optical connections can speed those up by a hundred times or more.
Looking at total speed, there are delays all over. The processor may have to wait for memory to return results.
For a hard drive there is a delay to physically spin the drive to the right spot and then read the info and transfer it over to the registers on the chip. Caches can also be involved.
For larger computations or with parallelization there can be a lot of communication between chips and cores. Faster is better.
Think of it like having a dial-up modem versus an always-on broadband connection. You are on the internet communicating with other servers (other CPUs). How much does faster communication and lower latency help versus switching to a faster machine on either end?
Quantum computers. What type of quantum computer - adiabatic (heat, analog) versus gate model.
Adiabatic seems to be getting realized (with some controversy) by D-Wave Systems.
What are the quantum algorithms that can be implemented?
Optimization problems, and simulations of quantum effects. Problems that can be successfully implemented on a given type of quantum computer can be substantially sped up. How much difference that makes will depend upon the class and instance of the problem. If it is an important problem there are already approximate answers; how much better will the quantum-computer-generated answer be?
Eventually large-scale quantum computers will be made that can run all of the desired quantum algorithms. Then we can have quantum computers as co-processors for those problem instances. If quantum database search can be implemented, that is a generic speedup on large instances which could be broadly useful; many of the others are specialized speedups that matter to society for the result but that most individuals will never need to run. Having more quantum computers available should also allow more quantum algorithms to be developed. In general, quantum algorithms involve speeding up a Fourier transform as a key step. If your problem cannot be helped by a massive speedup of such a step then it will not matter whether you run it on a quantum computer or not. There are a dozen or so significant quantum computer algorithms.
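Purely as a classical illustration of why a faster Fourier transform is such leverage: the naive discrete Fourier transform below does O(n^2) work, while the FFT computes the identical answer in O(n log n) - and a quantum Fourier transform pushes that asymptotic shortcut further still. A sketch with NumPy:

```python
import numpy as np

def naive_dft(x):
    # Direct O(n^2) discrete Fourier transform: one row of phase
    # factors per output frequency.
    n = len(x)
    k = np.arange(n)
    M = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return M @ x

x = np.random.rand(256)
# The FFT computes the identical result in O(n log n) operations.
assert np.allclose(naive_dft(x), np.fft.fft(x))
```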
Randall you regularly open 50 to 100 web pages at a time?
Umm - why?
I've written DSP assembly without operating systems. In fact, I've got code in orbit around Saturn on the Cassini probe for the inertial reference unit. Since we had to design a custom rad hard DSP at the time I even got to have a say in the choice of instructions in the instruction set. So I understand what is going on under the hood.
I do not know how people can stand to be programmers without being able to picture the assembly language beneath the C/C++/Java/etc. I find developers with these deficiencies to be real obstacles at times. They come up with incredibly inefficient designs and I have to fix what they did.
I have taught critical sections and mutexes and all that to dozens of programmers over the years. I've done this lately for some developers in India. It's a shame how poorly understood this material is.
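For reference, the canonical critical-section example - several threads incrementing a shared counter - looks like this in Python. Without the lock, interleaved read-modify-write sequences can silently lose updates:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The lock makes the read-modify-write atomic; without it two
        # threads can read the same old value and one update is lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 400_000
```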
Actual computations: A Ph.D. physicist friend who models bond markets can never get enough computational power for what he does. There are always the problems we haven't solved yet that need more computational power. For example, and very importantly, we need to be able to do 3-D folding for proteins and beyond that simulate a complete mammalian cell and predict what molecules have affinity for each other. Imagine what that would do for drug development and the study of disease processes. Imagine you could model whether a gene therapy could make it into a cell.
There's always need for more computational power for simulation.
No, you are completely wrong. Stop and think about the far larger number of processors that get used in devices other than desktop and laptop computers. Better performance is always useful in lots of embedded apps. You can reduce the number of processors in, say, a car if you can make processors faster. You can use more complex algorithms to control an engine and change the timing of fuel, air, and ignition more often and more accurately if you have more processing power.
AI needs much more processing power than we have. Steps along the road toward AI each need additional processing power. Climate modeling needs orders of magnitude more processing power than we have. Computer aided engineering needs more processing power for a large assortment of modeling problems.
I see lots of ways to increase productivity with more processing power. The more processing power you have then the more signals you can collect and process to extract useful info. Use of more processing power is happening in many industries in many ways to raise productivity. You personally might not see it. But work in or near the right engineering development teams and you'll see this up close.
1) Firefox will reestablish the session I had when I last shut down. Since I have lots of pages open when I shut down, Firefox opens them all back up again on startup. I do this in part because I'm looking for different articles and reports to weave into posts.
2) I open whole folders at a time in tabs. I've got folders that fit with themes I do digging thru. Again, this is part of digging for material to write about.
My browser use case is not typical.
A big factor in embedded apps is power consumption. Geometry shrinking has done wonders for cramming more stuff into a given die area, and (after fixes for the leakage and gate capacitance issues) this usually means less power per gate toggle. So a slowdown in Moore's Law would be very bad. However, even if gate thickness limits become a hard obstacle below 22nm (FinFETs look like a joke), there may still be power consumption benefits from finer lithography, because you can make more complex transistors that are more power efficient.
I'm skeptical that hedonic efficiency in electronics is as strongly coupled to Moore's law as it was in the past. I'm not saying that this isn't a problem but most of the "killer apps" that I can think of aren't as substantially constrained by electronic density as I think they were 15 years ago. Maybe it's just wishful thinking because it's hard to imagine the pace of my industry slowing.
Interesting - I did not know that FireFox re-opened sessions like that - thanks for the tip!
It's probably about time I increased my web efficiency - Thx!
Go into Tools | Session Manager | Settings and turn it on. You have to turn on session management to get the option to restore sessions. I've been using this feature for years.
Carbon nanotubes (multiwalled ones at least) were discovered by Russian scientists way back in the mid-1960s, if I recall correctly.
It's just that they did not publish in the Western journals at the time, for obvious reasons.
Also, back then they weren't deemed useful for anything, and they were forgotten until fairly recently.