By LinuxHPC.org and Cluster Resources
In computing, FLOPS (or flops) is an abbreviation of Floating Point Operations Per Second. It is used as a measure of a computer's performance, especially in fields of scientific computing that make heavy use of floating-point calculations. (Compare MIPS, million instructions per second.) One should speak in the singular of a FLOPS and not of a FLOP, although the latter is frequently encountered: the final S stands for "second" and does not indicate a plural.

Alternatively, the singular FLOP (or flop) is used as an abbreviation for "floating-point operation", and a flop count is a count of these operations (e.g., required by a given algorithm or computer program). In this context, "flops" is simply the plural rather than a rate.

Computing devices exhibit an enormous range of performance levels in floating-point applications, so it makes sense to introduce larger units than the FLOPS. The standard SI prefixes can be used for this purpose, resulting in such units as the megaFLOPS (MFLOPS, 10^6 FLOPS), the gigaFLOPS (GFLOPS, 10^9 FLOPS), the teraFLOPS (TFLOPS, 10^12 FLOPS), the petaFLOPS (PFLOPS, 10^15 FLOPS) and the exaFLOPS (EFLOPS, 10^18 FLOPS).

The performance spectrum

A relatively cheap but modern desktop computer using, for example, a Pentium 4 or Athlon 64 CPU typically runs at a clock frequency in excess of 2 GHz and provides computational performance in the range of a few GFLOPS. Even some video game consoles of the late 1990s and early 2000s, such as the Nintendo GameCube and Sega Dreamcast, had performance in excess of one GFLOPS (but see below).

The original supercomputer, the Cray-1, was set up at Los Alamos National Laboratory in 1976. The Cray-1 was capable of 80 MFLOPS (or, according to another source, 138–250 MFLOPS). In the roughly 30 years since then, the computational speed of supercomputers has jumped a millionfold.
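As a rough illustration of the distinction drawn above between a flop count and a FLOPS rate, and of the SI-prefixed units, here is a minimal Python sketch. The helper names `flops_rate` and `format_flops` are hypothetical and exist only for this example; the prefix table simply mirrors the units listed in the text.

```python
# Illustrative sketch only: helper names are assumptions, not a standard API.
# The prefix table mirrors the SI units listed in the text, largest first.
SI_PREFIXES = [
    (1e18, "EFLOPS"),
    (1e15, "PFLOPS"),
    (1e12, "TFLOPS"),
    (1e9, "GFLOPS"),
    (1e6, "MFLOPS"),
]


def flops_rate(flop_count, seconds):
    """Turn a flop count (number of operations) into a rate in FLOPS."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return flop_count / seconds


def format_flops(rate):
    """Express a rate in FLOPS using the largest SI prefix that fits."""
    for scale, unit in SI_PREFIXES:
        if rate >= scale:
            return f"{rate / scale:g} {unit}"
    return f"{rate:g} FLOPS"


if __name__ == "__main__":
    # The 80 MFLOPS figure quoted for the Cray-1.
    print(format_flops(80e6))
    # Hypothetical program: 2e9 floating-point operations in half a second.
    print(format_flops(flops_rate(2e9, 0.5)))
```

Run as a script, this prints `80 MFLOPS` and `4 GFLOPS`. The second figure comes from a made-up workload and is there only to show how a count and an elapsed time combine into a rate.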
According to the TOP500 list, the fastest computer in the world as of June 2006 was the IBM Blue Gene/L supercomputer, measuring a peak of 280.6 TFLOPS. That is more than twice the previous Blue Gene/L record of 136.8 TFLOPS, set when only half the machine was installed. Blue Gene (unveiled October 27, 2005) contains 131,072 processor cores, yet each of these cores is quite similar to those found in many mid-performance computers (PowerPC 440). Blue Gene/L is a joint project of the Lawrence Livermore National Laboratory and IBM.

In June 2006, the Japanese research institute RIKEN announced a new computer, the MDGRAPE-3. Its performance tops out at one petaFLOPS, over three times faster than the Blue Gene/L. MDGRAPE-3 is not a general-purpose computer, which is why it does not appear in the TOP500 list; it has special-purpose pipelines for simulating molecular dynamics. MDGRAPE-3 houses only 4,808 processors rather than the 131,072 needed for the Blue Gene/L and, rather than costing billions of dollars, the machine cost only 7 million dollars to build. The computer is a joint project between RIKEN, Hitachi, Intel, and NEC subsidiary SGI Japan.

Distributed computing uses the Internet to link personal computers to achieve a similar effect: Folding@home, the most powerful distributed computing project, has been able to sustain over 200 TFLOPS. SETI@home computes data at more than 100 TFLOPS. As of June 2005, GIMPS was sustaining 17 TFLOPS, while Einstein@home was sustaining more than 50 TFLOPS against a theoretical computing speed of 167 TFLOPS.

Pocket calculators are at the other end of the performance spectrum. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for it to respond faster than its operator can perceive.
Any response time below 0.1 second is experienced as instantaneous by a human operator, so a simple calculator could be said to operate at about 10 FLOPS.

Humans are even worse floating-point processors at the purely mathematical level. If it takes a person a quarter of an hour to carry out a pencil-and-paper long division problem with 10 significant digits, that person would be calculating in the milliFLOPS range (one operation in 900 seconds is about 1.1 milliFLOPS). Bear in mind, however, that a purely mathematical test may not truly measure a human's FLOPS, as a human is also processing smells, sounds, touch, sight and motor coordination.

FLOPS as a measure of performance

In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.

FLOPS in isolation are arguably not very useful as a benchmark for modern computers. There are many factors in computer performance other than raw floating-point computation speed, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general only capable of a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS performance of every element of the system). Even when operating on large, highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both peak actual FLOPS performance and sustained FLOPS performance.

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating-point operation speed, therefore, does not accurately predict how the processor will perform on just any problem. However, for many scientific jobs, such as analysis of data, a FLOPS rating is effective.
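The relationship between theoretical peak, sustained performance and Amdahl's law described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a benchmark: the function names are hypothetical, the node and sustained figures are invented, and the 99% parallel fraction is an arbitrary choice. Only the core count of 131,072 is taken from the Blue Gene/L description earlier in the article.

```python
# Illustrative sketch only: function names and sample figures are assumptions.

def theoretical_peak(element_peaks):
    """Theoretical peak: the sum of the peak FLOPS of every element."""
    return sum(element_peaks)


def efficiency(sustained, peak):
    """Fraction of the theoretical peak that is actually sustained."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


def amdahl_speedup(parallel_fraction, n_processors):
    """Amdahl's law: speedup is capped by the serial part of the work."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Hypothetical system: four elements rated at 2 GFLOPS each.
    peak = theoretical_peak([2e9, 2e9, 2e9, 2e9])
    # Hypothetical measured sustained rate of 6 GFLOPS.
    print(f"efficiency: {efficiency(6e9, peak):.2f}")
    # Assumed 99% parallel workload on 131,072 cores.
    print(f"speedup: {amdahl_speedup(0.99, 131072):.1f}")
```

With these assumed inputs the speedup stays just below 100 even on 131,072 cores, because the 1% serial portion dominates. This is one way to see why sustained figures fall well short of a peak obtained by simple addition.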
Historically, the earliest reliably documented serious use of the floating-point operation as a metric appears to be the Atomic Energy Commission's (AEC) justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.

The terminology is currently so confusing that, until April 24, 2006, U.S. export control was based upon measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second", or MTOPS. On that date, however, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP) in Weighted TeraFLOPS (WT).

All text used in this article is available under the GNU Free Documentation License. It uses material from the Wikipedia article "FLOPS".
© 2010 Adaptive Computing