Pittsburgh Technical Council

Supercomputing in the Pittsburgh Region

Article Published: June 12, 2014

Supercomputing can be described as a search for ever faster, more powerful computing capabilities via groundbreaking advances in hardware, software, memory, data storage and networking equipment. But while the technical achievements are often awe-inspiring, it is the important applications that both demonstrate the value and dictate the growth of global advances in supercomputing.

Applications often made possible solely through supercomputing can include complex scientific calculations, visualization, simulation and modeling, data collection, processing and storage and biomedical discovery tools of an unprecedented range. Researchers nationwide use supercomputers for a range of projects that include AIDS research, astrophysics, fluid dynamics, weather prediction, and materials science.

Known biological projects include the folding of a relatively small protein molecule, three-dimensional explorations of the visible human nervous system and DNA research. Simulation capabilities address everything from earthquakes and hurricanes to colliding galaxies, atoms, icecaps or cars. Nearly every aspect of research and industry is affected by supercomputing capacity, and with it comes an always-increasing need for more.

Competing Globally

In late 2004, the U.S. reclaimed its position as the global leader of supercomputing resources, following a two-and-a-half-year period of Japanese dominance.

According to the TOP500, a twice-yearly survey of global supercomputing rankings, the Tianhe-2, developed by China’s National University of Defense Technology, earned the crown of world’s fastest supercomputer. The Tianhe-2 achieved 33.86 petaflop/s, or 33.86 quadrillion floating-point operations per second. “Flops” is an acronym for floating-point operations per second. A petaflop is one thousand times faster than a teraflop.

To grasp the scale of performance of the Tianhe-2 supercomputer, if every one of 6.5 billion people on earth held a calculator and did one calculation per second, they would all together still be more than five million times slower than the Tianhe-2 (33.86 quadrillion divided by 6.5 billion is about 5.2 million).

The U.S. houses eight of the world’s top 10 supercomputers, five of which are manufactured by IBM. One, perhaps surprisingly, is made by Dell, a company known mostly for its consumer laptops.

One example of the type of problem that can now be examined with the expanded computing power is global warming, which defies experimentation in a lab. Now people can describe these processes in mathematical language and then put all those equations in a computer program.

The National Academies’ National Research Council reported that a 1,000-fold increase in computing power is needed almost immediately and that a one-million-fold increase ultimately will be required for applications such as drug discovery, climate prediction and automobile collision simulations. To that end, the next milestone being pursued is the exaflop, which is 1,000 times faster than the petaflop.
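The prefixes used in this section each differ by a factor of 1,000, which a few lines of arithmetic make concrete. The sketch below is illustrative only: the constant and function names are hypothetical, and the only measured input is the 33.86 petaflop/s Tianhe-2 figure quoted above.

```python
# Illustrative arithmetic only; these names are hypothetical, not from any library.
TERAFLOPS = 10**12   # one trillion floating-point operations per second
PETAFLOPS = 10**15   # one quadrillion, i.e. 1,000 teraflops
EXAFLOPS = 10**18    # one quintillion, i.e. 1,000 petaflops

# Benchmark figure for the Tianhe-2 as quoted in the article.
TIANHE2_FLOPS = 33.86 * PETAFLOPS


def fold_increase(target_flops: float, baseline_flops: float) -> float:
    """Return how many times faster the target rate is than the baseline."""
    return target_flops / baseline_flops


if __name__ == "__main__":
    print("petaflop / teraflop:", PETAFLOPS // TERAFLOPS)
    print("exaflop / petaflop:", EXAFLOPS // PETAFLOPS)
    # Gap between a hypothetical one-exaflop machine and the Tianhe-2.
    print("exaflop / Tianhe-2:", round(fold_increase(EXAFLOPS, TIANHE2_FLOPS), 1))
```

By this arithmetic, a one-exaflop system would be roughly 30 times faster than the Tianhe-2 at 33.86 petaflop/s.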

High Performance Computing

The Council on Competitiveness is the nation’s leading organization of CEOs, university presidents and labor leaders committed to promoting U.S. economic growth and success in global markets and to raising the standard of living for all Americans. It has made High Performance Computing (HPC) one of its top priorities, recognizing both the importance of HPC and the need to make it accessible to private businesses.

The Council has an HPC systems initiative intended to stimulate and facilitate wider usage of HPC across the private sector to propel productivity, innovation and competitiveness. To do this, the Council has brought together a national brain trust of industrial HPC users to gain insights into how the private sector currently uses advanced computing capabilities.

Conventional wisdom is that the U.S. is now a service economy and can no longer be a leader in manufacturing. However, HPC is enabling a renaissance in advanced manufacturing, where technology can be used for rapid prototyping to negate the labor cost advantages of other countries.

High-performance computing can help companies reduce costs by minimizing the need to build physical models, by allowing more thorough testing of designs before building and by creating the ability to develop more robust processes and higher quality products.

By performing more calculations per unit of time and moving from design to production in a shorter timeframe, HPC can help speed time to market, which is a critical factor in competing successfully in the global market.

Pittsburgh is playing a dramatic role in HPC with the Pittsburgh Supercomputing Center (PSC).

PSC

In 1985, two Pittsburgh physics professors, Ralph Z. Roskies of the University of Pittsburgh and Michael J. Levine of Carnegie Mellon University, collaborated with Jim Kasdorf, then vice president of supercomputing at Westinghouse Electric Corporation, to develop the proposal that led to the creation of the Pittsburgh Supercomputing Center (PSC). Established in 1986, the PSC is a joint venture of Carnegie Mellon and the University of Pittsburgh, together with Westinghouse Electric Company. 

Laboratories such as Oak Ridge, Lawrence Berkeley and Los Alamos housed early supercomputer facilities, which were reserved almost exclusively for classified research. As one of the first public research supercomputing facilities, the PSC has become a leading-edge site in the National Science Foundation’s (NSF) TeraGrid programs, which provide U.S. academic researchers with support for and access to high-end computing infrastructure and research.

The PSC mission is to:

  • provide university, government and industrial researchers, scientists and engineers with access to several of the most powerful systems for high-performance computing, communications and data handling available for unclassified research
  • advance the state of the art in high-performance computing, communications and informatics
  • offer a flexible environment for solving the largest and most challenging problems in computational science
  • act as a leading partner in XSEDE, the National Science Foundation program of coordinated cyberinfrastructure for education and research

PSC works with its XSEDE partners to harness the full range of information technologies to enable discovery in U.S. science and engineering.

Jim Kasdorf, a high-performance computing operations and hardware expert at Westinghouse, helped secure the original National Science Foundation funding which brought the PSC into existence. Today, the organization stands as one of the region’s most successful experiments in collaboration.

With computer room facilities housed at Westinghouse, the PSC is administered from a building on South Craig Street in Oakland owned by Carnegie Mellon. Approximately 75 staff members serve the organization, and Roskies continues to serve as a scientific director. 

Funding

Currently, there are only eight university-based supercomputing centers originally funded by the NSF. They include Purdue University, the University of Minnesota, Colorado State University, Princeton University, the University of Illinois at Urbana-Champaign, the University of California at San Diego, Cornell University and the PSC. The PSC was among the first four to be established.

In addition to NSF funding, the PSC has also received funding from the U.S. Department of Energy, the National Institutes of Health, the Commonwealth of Pennsylvania and private industry. Over its 28-year history, the PSC has received nearly $30 million from the state. Using the state’s funding as leverage, the Center has received more than $378 million in federal government and industry grants.

In 2004, the PSC and the University of Pittsburgh received a research grant of $900,000 from IBM for a three-year regional research project to develop a software tool, called the Standardized User Monitoring Suite, or SUMS. The software quantifies and analyzes the programming time required for next-generation supercomputing. IBM received $53 million from DARPA as one of three contractors pursuing the research. Rami Melhem, chairman of Pitt’s computer science department, directed the local effort.

The NSF awarded a five-year grant ending in 2010 totaling $52 million to support the PSC as a leading partner in the TeraGrid, NSF’s program to provide national cyberinfrastructure for education and research. Built over the last decade, the TeraGrid is the world’s largest, most comprehensive distributed cyberinfrastructure for open scientific research. The PSC also received about $5 million to continue its role in user support and security.

Much as physical infrastructure, such as power grids, telephone lines and water systems, enables modern life, cyberinfrastructure makes possible much of modern scientific research. Through high-performance network connections, the TeraGrid integrates high-performance computers, data resources and tools, and high-end experimental facilities at eight partner sites around the country.

More than 10,000 scientists used the TeraGrid to complete thousands of research projects, at no cost to the scientists.

In 2010, the PSC led a coalition of statewide organizations that won $100 million in federal stimulus grants to build a high-bandwidth network across Pennsylvania. This network serves businesses and medical centers and improves rural access to the Internet.

State-of-the-Art in Pittsburgh

Pittsburgh maintains a reputation for providing the “big iron,” the largest and most powerful systems, along with particular expertise in maximizing the productivity of these systems. Pennsylvania researchers routinely use upwards of seven million processor hours, approximately 30 percent of the time available on PSC’s five major computing platforms.

The PSC’s first supercomputer was a Cray X-MP, which cost $18 million and could perform 840 million flops. Today, a typical laptop has more computing power.

Currently, the lineup at the PSC consists of:

  • Anton, a special purpose supercomputer for molecular dynamics simulations, designed and constructed by D. E. Shaw Research (DESRES). In collaboration with DESRES, the National Resource for Biomedical Supercomputing at PSC is hosting an Anton machine for general availability to the national biomedical community.
  • Axon, a 32-node cluster with a total of 256 cores. Most nodes contain eight gigabytes (GB) of memory; four nodes have 16 GB.
  • BioU, a three-node computational cluster containing 16 cores and 128 GB of memory per node. BioU is available to researchers conducting biomedical research.
  • Blacklight, an SGI Altix UV1000 coherent shared-memory machine with 512 eight-core Intel Xeon 7500 processors and 32 terabytes of memory. It is partitioned into two connected 16-terabyte coherent shared-memory systems, creating the largest coherent shared-memory system in the world.
  • The Data Supercell, PSC's archival system, a low-cost, high-bandwidth, high-capacity and high-reliability data management system.
  • Salk, an SGI Altix SMP machine with 144 processing cores and 288 GB of shared memory, dedicated to biomedical research.
  • Sherlock, a YarcData Universal RDF Integration Knowledge Appliance with PSC enhancements. It enables large-scale, rapid graph analytics through massive multithreading, a shared address space, sophisticated memory optimizations, a productive user environment, and support for heterogeneous applications.
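The core and memory figures in the lineup can be tallied to compare the systems on a common footing. The following is a rough sketch, not an official PSC specification: the `System` class is hypothetical, Axon is assumed to have 28 nodes with 8 GB each alongside the four 16 GB nodes, and a terabyte is assumed to be 1,024 GB.

```python
from dataclasses import dataclass

GB_PER_TB = 1024  # assumption: binary terabytes


@dataclass(frozen=True)
class System:
    """Hypothetical record holding the figures quoted in the lineup."""
    name: str
    cores: int
    memory_gb: int

    @property
    def gb_per_core(self) -> float:
        return self.memory_gb / self.cores


SYSTEMS = [
    # Axon: assumes the 28 nodes not singled out each hold 8 GB.
    System("Axon", cores=256, memory_gb=28 * 8 + 4 * 16),
    # BioU: three nodes, each with 16 cores and 128 GB.
    System("BioU", cores=3 * 16, memory_gb=3 * 128),
    # Blacklight: 512 eight-core processors sharing 32 TB.
    System("Blacklight", cores=512 * 8, memory_gb=32 * GB_PER_TB),
    # Salk: figures are stated directly in the lineup.
    System("Salk", cores=144, memory_gb=288),
]

if __name__ == "__main__":
    for s in SYSTEMS:
        print(f"{s.name:<10} {s.cores:>5} cores {s.memory_gb:>6} GB "
              f"{s.gb_per_core:.2f} GB/core")
```

Under these assumptions Blacklight works out to 4,096 cores, and both Blacklight and BioU offer about 8 GB of memory per core, compared with 2 GB per core on Salk.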

Many PSC systems marked important firsts in the field of supercomputing. Jaromir was a 512-processor SGI Cray T3E 900, featuring a peak-performance rating of 460 billion flops.

The PSC pursued massively parallel processing (MPP) in 1993 with the first Cray T3D installed anywhere in the world. The $15 million machine featured 540 processors. It is now decommissioned along with Mario, the first nongovernmental Cray C90 installed in the United States. Purchased for $35 million in 1992, the vector-processing machine used 16 high-speed processors, each arrayed on 70-pound circuit boards.

Biomedical Supercomputing

During 2006, the PSC received $8.5 million from the National Institutes of Health (NIH) to renew its program in biomedical supercomputing. Through this program, the National Resource for Biomedical Supercomputing (NRBSC), PSC scientists pursue research in the life sciences and foster exchange nationwide among experts in computational science and biomedicine. The renewal award supports NRBSC’s research in three core areas: spatially realistic cellular modeling, large-scale volumetric visualization and analysis and computational structural biology.

Established in 1987, the PSC’s biomedical supercomputing program, renamed NRBSC, was the first such program in the country external to NIH. Along with core research, NRBSC develops collaborations with biomedical researchers at many centers around the country and provides computational resources, outreach and training. A grant was awarded by the National Institute of General Medical Sciences in late 2010 for continued operations.

The Extreme Science and Engineering Discovery Environment

High-performance computers often work with large data sets, and often the data and the processing power are not in the same location. To bring together supercomputing resources with users across the country, in 2001 the NSF launched the TeraGrid, a transcontinental high-performance network.

TeraGrid’s unified user support infrastructure and software environment allow users to access storage, information and computational resources at 11 centers across the U.S. via a single allocation, either as stand-alone resources or as components of a distributed application using Grid software capabilities. The multi-year effort builds and deploys the world’s largest, most comprehensive distributed infrastructure for open scientific research.

The PSC’s involvement with the TeraGrid began in 2002, when it announced that the TeraGrid had entered full production mode, providing a coordinated set of services for the nation’s science and engineering community.

The TeraGrid underwent a transition to the Extreme Science and Engineering Discovery Environment (XSEDE), a follow-on project approved in 2011 that involves a partnership of 17 institutions. NSF announced funding for the XSEDE project for five years, at $121 million.

Currently, the PSC is a leading partner in XSEDE, which replaces and expands the TeraGrid. This NSF-funded program provides U.S. academic researchers with support for and access to leadership-class computing infrastructure and research.

Other XSEDE partners include:

  • Cornell University Center for Advanced Computing
  • Indiana University
  • Jülich Supercomputing Centre
  • National Center for Atmospheric Research
  • National Center for Supercomputing Applications - University of Illinois at Urbana-Champaign
  • National Institute for Computational Sciences - University of Tennessee Knoxville/Oak Ridge National Laboratory
  • Ohio Supercomputer Center - The Ohio State University
  • Purdue University
  • Rice University
  • San Diego Supercomputer Center - University of California San Diego
  • Shodor Education Foundation
  • Southeastern Universities Research Association
  • Texas Advanced Computing Center - The University of Texas at Austin
  • University of California Berkeley
  • University of Chicago
  • University of Virginia

Advanced Networking

In 2004, PSC staff members demonstrated that real-world data transmission at 40 Gbps was attainable over a single light wave, or lambda. The link was established with two next-generation Cisco CRS-1 routing systems, fitted with OC-768 interfaces. One OC-768 supports the same bandwidth as four OC-192s, the prevailing standard at the time.
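The relationship between OC-768 and OC-192 follows from how SONET optical carrier levels are defined: an OC-n link runs at n times the OC-1 line rate of 51.84 Mbit/s. That base rate is a standard SONET figure, not something stated in this article, and the function name below is hypothetical. A minimal sketch:

```python
OC1_MBPS = 51.84  # SONET OC-1 line rate in megabits per second


def oc_line_rate_gbps(level: int) -> float:
    """Line rate of a SONET OC-<level> carrier in gigabits per second."""
    return level * OC1_MBPS / 1000


if __name__ == "__main__":
    oc192 = oc_line_rate_gbps(192)
    oc768 = oc_line_rate_gbps(768)
    print(f"OC-192: {oc192:.2f} Gbps")
    print(f"OC-768: {oc768:.2f} Gbps")
    print(f"OC-768 / OC-192: {oc768 / oc192:.0f}")
```

The OC-768 line rate comes to just under 40 Gbps, four times that of OC-192, which is why the link is commonly described as a 40 Gbps connection.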

The PSC and the University of Pittsburgh share membership and a seat on the board of the National LambdaRail (NLR), a national network infrastructure supporting experimental and production networks for the U.S. research community. The consortium joins leading U.S. universities and companies in deploying an advanced, nationwide fiber-optic infrastructure to encourage next-generation applications in science, engineering and medicine. Through NLR, many different networks will exist side-by-side in the same fiber-optic cable, but will be independent of each other, each supported by its own lightwave or lambda.

The PSC’s Advanced Networking Group conducts research on network performance and analysis in support of high performance computing applications. The group also develops software to support distributed supercomputing applications and to implement high-speed interfaces to archival and mass storage systems. Research is focused on such areas as TCP implementations, tools to tune TCP for better performance and software to monitor and improve network performance. 

Work addresses the maximization of usable network bandwidth for any office or research computing system. National projects such as Net100 and Web100 seek to improve “real” network performance for each network host and to provide tools to diagnose problems between the host and the network that might limit the host’s available bandwidth.

Partnerships

From its inception, PSC has fostered a spirit of collaboration throughout the Pittsburgh Region.

The Pittsburgh Supercomputing Center joined with the Department of Energy’s National Energy Technology Laboratory, Carnegie Mellon University, West Virginia University and the West Virginia Governor’s Office of Technology, in creating the Supercomputing Science Consortium. Known simply as (SC)2, the regional partnership acts to advance energy and environment technologies through the application of high performance computing and communications. Since its establishment, the University of Pittsburgh, The Pennsylvania State University, Duquesne University, Waynesburg College, the Institute for Scientific Research and the NASA Independent Verification and Validation facility also have joined the partnership. 

Research by the life sciences community is supported by the PSC’s high-performance computing resource as part of the NRBSC. The internal users group has operated at the PSC offices for nearly 20 years, and it is funded primarily through the National Institutes of Health, instead of by the PSC’s core NSF grant.

Other users in the region include PPG and the Bettis Atomic Power Laboratory. The reach of the PSC now extends throughout the Pittsburgh region in the form of a network called the Three Rivers Optical Exchange (3ROX). Administered by PSC’s Advanced Networking Group, 3ROX provides high-bandwidth Internet access to area educational institutions, including the University of Pittsburgh, The Pennsylvania State University, Carnegie Mellon University, West Virginia University and several Pittsburgh area public schools.

Supercomputing Conferences

The PSC won the award for “Best Demonstration at TG08” during the annual conference of the TeraGrid, the National Science Foundation’s program of cyberinfrastructure for U.S. science and education. A PSC team of two scientists and a University of Pittsburgh student received the award for “WiiMD,” an innovative project that merges the video-game technology of the Nintendo Wii with interactive supercomputing.

Pittsburgh hosted an earlier supercomputing conference, SC2004, which drew more than 7,000 attendees interested in high-performance computing.

An annual feature of the conference is SC Global, an Access Grid-enabled component that provides remote participation in conference events. That year marked the first-ever demonstration of a simultaneous connection of Access Grid (AG) nodes on all six inhabited continents.

Organizers of the event also assembled the largest data storage capacity on the planet. StorCloud provided an unprecedented one petabyte of storage, enough for 1,000 trillion bytes of computer information or the equivalent of 100 times the contents of the Library of Congress, the largest library in the world. Thirty-two tons of equipment worth about $80 million and donated by 22 vendors were required to create StorCloud on the exhibition hall floor. StorCloud consumed some 300 kilowatts of power, making it the warmest place in the building.

Accessing Supercomputing

Users may apply to use the PSC’s supercomputing resources through the National Science Foundation’s TeraGrid program, through the Corporate Affiliates program and through biomedical or starter grants. Any academic researcher is eligible to use the PSC facility under the NSF funding, and non-classified corporate research is also supported for a fee.

The PSC Corporate Affiliates Program is designed to bring resources to bear on helping businesses solve their most challenging information-processing and research problems. The program combines training, consulting and access to high-performance systems to meet the needs of each participant. Each affiliate relationship is uniquely designed to meet the needs of the corporate partner.

The technology is useful to anybody who needs to visualize large volumes of data in a three-dimensional space — perhaps a jet engine, skyscraper or human heart — with potential customers including engineering firms and medical laboratories. The PSC staff actively participates in research, including co-authoring papers, and mobilization of the staff’s expertise is an added benefit that Pittsburgh region researchers enjoy.  

Research suitable for supercomputing resources often shares a common chicken-or-egg complexity: a researcher needs to know enough to create a model, but the problem is complicated enough that the researcher cannot see a way through it short of creating the model. In general, however, proposals that show that “the science is worthy” are sufficient for PSC work.

Each year, researchers who use resources at the PSC number in the thousands from government, industry and universities across the U.S. Projects run the gamut: heart modeling and prosthetics, Nobel Prize-winning work in protein simulations, research on diabetes and kidney disease, beverage can modeling, coatings for eyewear lenses, epidemiological modeling for an H1N1 flu outbreak and tracing connections among neurons in a portion of the visual cortex, to name a few. Availability of PSC resources and its pool of computing professionals also have led to thousands of published scientific papers. 