Research in Indiana researcher participation in SC2003 technical program

Technical papers
Tutorials
Panels
Posters
Birds of a Feather sessions
HPC Challenge

Technical papers

A Compiler Analysis of Interprocedural Data Communication

Chair: John Feo (Sun Microsystems)
Date: Tuesday, November 18
Time: 1:30PM - 2:00PM
Room: 36-37

Speaker(s)/Author(s):
Yonghua Ding (Purdue University), Zhiyuan Li (Purdue University)

Description:
This paper presents a compiler analysis for data communication for the purpose of transforming ordinary programs into ones that run on distributed systems. Such transformations have been used for process migration and computation offloading to improve the performance of mobile computing devices. In a client-server distributed environment, the efficiency of an application can be improved by careful partitioning of tasks between the server and the client. Optimal task partitioning depends on the tradeoff between the computation workload and the communication cost. Our compiler analysis, assisted by a minimum set of user assertions, estimates the amount of data communication between procedures. The paper also presents experimental results based on an implementation in the GCC compiler. The static estimates for several multimedia programs are compared against dynamic measurements performed using Shade, an instruction-level simulator from Sun Microsystems. The results show a high precision of the static analysis for most pairs of the procedures.

Link: Download PDF


A Self-Organizing Flock of Condors

Chair: Xian-He Sun (Illinois Institute of Technology)
Date: Tuesday, November 18
Time: 2:30PM - 3:00PM
Room: 38-39

Speaker(s)/Author(s):
Ali Raza Butt (Purdue University), Rongmei Zhang (Purdue University), Y. Charlie Hu (Purdue University)

Description:
Condor provides high throughput computing by leveraging idle-cycles on off-the-shelf desktop machines. It also supports flocking, a mechanism for sharing resources among Condor pools. Since Condor pools distributed over a wide area can have dynamically changing availability and sharing preferences, the current flocking mechanism based on static configurations can limit the potential of sharing resources across Condor pools. This paper presents a technique for resource discovery in distributed Condor pools using peer-to-peer mechanisms that are self-organizing, fault-tolerant, scalable, and locality-aware. Locality-awareness guarantees that applications are not shipped across long distances when nearby resources are available. Measurements using a synthetic job trace show that self-organized flocking reduces the maximum job wait time in queue for a heavily loaded pool by a factor of 10 compared to without flocking. Simulations of 1000 Condor pools are also presented and the results confirm that our technique discovers and utilizes physically nearby resources.

This paper has been nominated for the Best Student Paper of SC2003 award.

Link: Download PDF



Tutorials

S2: A Tutorial Introduction to High Performance Data Transport

Chair: Robert L. Grossman (University of Illinois at Chicago)
Date: Sunday, November 16
Time: 8:30AM - 5:00PM
Room:

Speaker(s)/Author(s):
Bill Allcock (Argonne National Laboratory), Robert Grossman (University of Illinois at Chicago), Steven Wallace (Indiana University)

Description:
Content-Level: 40% Introductory 40% Intermediate 20% Advanced

Abstract:
Developing high performance data intensive applications requires not only high performance computing resources but, just as importantly, high performance data transport linking them. With emerging 1, 2.5 and 10 Gigabit per second links, there is an unprecedented opportunity for creating distributed data intensive applications. In this tutorial, we give an overview of different protocols for high performance data transport and show how to build applications using them.

Link: Download PDF


S14: Computational Biology

Chair: Craig A. Stewart (Indiana University)
Date: Sunday, November 16
Time: 1:30PM - 5:00PM
Room:

Speaker(s)/Author(s):
Craig A. Stewart (Indiana University)

Description:
Content-Level: 15% Introductory 70% Intermediate 15% Advanced

Abstract:
Computational biology, bioinformatics, genomics, systems biology and related areas stand to be very important to the high performance community. There are tremendous opportunities to advance knowledge in biological and biomedical research areas through the use of high performance computing. This tutorial will begin with a brief overview of the essential biological bases for the current revolution in life sciences computing. Topics to be covered in depth include: sequence alignment and pattern matching; protein structure prediction; phylogenetics; systems biology; grid computing applications; and thoughts about the future of computational biology. This tutorial is intended for people who are interested in a rapid and useful introduction to computational biology and high performance computing. Tutorial attendees can expect to have a basic understanding of the area of computational biology and have a real feel for the nature of the work in this area as a result of hands-on experience with key applications. There will be hands-on exercises as part of the tutorial. A limited number of laptops will be provided and assigned on a first-come, first-served basis. Attendees with laptops and wireless network adapters are encouraged to bring them to the tutorial. Attendee laptops must have ssh installed in order to participate but be aware that there will be no support available to debug problems with attendee laptops. Hands-on exercises may also be done throughout the week at the "Research in Indiana" exhibit.

Link:
Download PDF
Download exercises for tutorial



Panels

Open Source Software Policy Issues for High Performance Computing

Chair: Rod Oldehoeft (Los Alamos National Laboratory)
Date: Friday, November 21
Time: 10:30AM - 12:00PM
Room: 40-41

Speaker(s)/Author(s):
Chair: Rod Oldehoeft (Los Alamos National Laboratory);
Panelists: Paul Gottlieb (DOE), Terry Bollinger (MITRE), Tony Stanco (The Center of Open Source and Government), Tim Witham (Open Source Development Lab), Dennis Gannon (Indiana University), Todd Needham (Microsoft Research)

Description:
Each of the panelists has experience with open-source software, as a government or industrial policy-maker, or as an implementor, distributor, or consumer of open-source software products. Panelists will summarize their experiences, and address issues of how open-source software can best be harnessed to advance high-performance computing. The collective experience of panelists will be valuable to those in industry and government as they formulate policies about open-source software for their organizations.



Posters

1 TFLOPS achieved with distributed Linux cluster

Chair: Michelle Hribar (Pacific University) and Karen L. Karavanic (Portland State University)
Date: Tuesday, November 18
Time: 5:01PM - 7:00PM
Room: Lobby 2

Speaker(s)/Author(s):
Craig A. Stewart (University Information Technology Services, Indiana University), George Turner (UITS, Indiana University), Peng Wang (UITS, Indiana University), David Hart (UITS, Indiana University), Stephen Simms (UITS, Indiana University), Daniel Lauer (UITS, Indiana University), Mary Papakhian (UITS, Indiana University), Matthew Allen (UITS, Indiana University), Jeff Squyres (Open Systems Laboratory, Indiana University), Andrew Lumsdaine (Open Systems Laboratory, Indiana University)

Description:
The most recent Top500 list released in June 2003 includes, for the first time ever, a persistent, distributed Linux cluster with an achieved performance of more than 1 TFLOPS on the Linpack benchmark - the IU AVIDD facility. The purposes of this report are to describe this distributed cluster, describe the results of our performance analyses and tuning efforts, and consider the costs and benefits of this distributed approach in terms of robustness and performance.

Indiana University's AVIDD facility (Analysis and Visualization of Instrument-Driven Data) includes two identical Linux clusters, each with 208 2.4 GHz Prestonia processors. One is located in Indianapolis, the other in Bloomington (IN). They are connected via 53 miles of the I-light network (www.i-light.org). Communication over the I-light network is achieved via two Force10 E600 switches, configured with 1000Base-TX (Gigabit Ethernet over copper) ports for connections to local machines, and with 10Gbase-ER modules to achieve communication between Indianapolis and Bloomington. Each local cluster also includes a Myrinet interconnect.

Methods and Materials:
Performance of the AVIDD facility was investigated with the High Performance Linpack (HPL) benchmark (http://www.netlib.org/benchmark/hpl/) using a beta version of LAM 7.0 (CVS snapshot date 03/23/2003) (http://www.lam-mpi.org/). Benchmarks were run under three conditions. HPL was run on each local cluster using the Myrinet interconnect. This test was repeated using gigabit Ethernet running over the local Force10 switch. The HPL benchmark was also run across the combined Bloomington and Indianapolis clusters, using gigabit Ethernet, the Force10 switches, and two dedicated fibers of the I-light network. For the benchmarks we used 192 processors in each cluster, for a peak theoretical capacity per cluster of 0.922 TFLOPS, and a combined peak theoretical capacity of 1.843 TFLOPS.
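The peak figures quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the processor count and clock rate come from the text, while the value of 2 floating-point operations per cycle per processor is an assumption chosen because it matches the stated totals. The function name is hypothetical.

```python
# Theoretical peak for the processors used in the benchmark runs.
PROCESSORS_PER_CLUSTER = 192  # from the text
CLOCK_HZ = 2.4e9              # 2.4 GHz, from the text
FLOPS_PER_CYCLE = 2           # assumption: not stated in the text


def peak_tflops(processors, clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Return theoretical peak in TFLOPS for a given processor count."""
    return processors * clock_hz * flops_per_cycle / 1e12


per_cluster = peak_tflops(PROCESSORS_PER_CLUSTER)
combined = peak_tflops(2 * PROCESSORS_PER_CLUSTER)

print(f"per cluster: {per_cluster:.4f} TFLOPS")
print(f"combined:    {combined:.4f} TFLOPS")
```

Under this assumption the per-cluster value is 0.9216 TFLOPS, which rounds to the 0.922 quoted above, and the combined value is 1.8432 TFLOPS.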

Results:
With one cluster, connected locally via Myrinet, the HPL benchmark achieved 0.614 TFLOPS, or 66.7% of peak theoretical capacity. The peak theoretical capacity of the cluster used for this benchmark was 0.921 TFLOPS. This benchmark was performed using MPICH-GM 1.2.4.8 and Myrinet GM 1.6.3. Problem size was set to 150,000.

With one cluster, connected locally via gigabit Ethernet and a Force10 E600 switch, the HPL benchmark achieved 0.575 TFLOPS, or 62.4% of peak theoretical capacity. This benchmark was performed using Force10 FTOS v4.3.2.0b. Problem size was again 150,000.

The benchmarks run on one local cluster were duplicated, which verified that the two clusters were operating identically.

Running across the combined Bloomington and Indianapolis clusters, the HPL benchmark achieved 1.058 TFLOPS, or 57.4% of the peak theoretical capacity of the two clusters. The problem size was set to 220,000. Network parameters were tuned as follows: jumbo frames were enabled (packet size 9252), adaptive packet tuning was disabled, and packet handling latencies were minimized on the compute nodes' NICs. The default and maximum read and write TCP buffer sizes were optimized for the latencies (distances) involved. The ability of the gigabit Ethernet NICs to do the segmentation and checksumming work was critical to the performance achieved.
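The text does not give the TCP buffer sizes that were chosen. As a rough illustration of why distance matters, a common starting point for sizing TCP buffers is the bandwidth-delay product of the path. The sketch below estimates a propagation-only lower bound for one gigabit Ethernet stream over 53 miles of fiber. The signal speed in fiber is an assumed approximation, switch and host latencies are ignored, and the function name is hypothetical.

```python
# Propagation-only bandwidth-delay product for one gigabit stream.
MILES_TO_M = 1609.344
FIBER_LENGTH_M = 53 * MILES_TO_M  # 53 miles, from the text
LIGHT_SPEED_IN_FIBER = 2.0e8      # m/s, assumption (roughly 2/3 of c)
LINK_BITS_PER_S = 1e9             # gigabit Ethernet, from the text


def bandwidth_delay_product_bytes(length_m, bits_per_s,
                                  speed=LIGHT_SPEED_IN_FIBER):
    """Bytes in flight needed to keep the link full for one round trip."""
    rtt_s = 2 * length_m / speed  # out and back, propagation delay only
    return bits_per_s * rtt_s / 8


bdp = bandwidth_delay_product_bytes(FIBER_LENGTH_M, LINK_BITS_PER_S)
print(f"round-trip time: {2 * FIBER_LENGTH_M / LIGHT_SPEED_IN_FIBER * 1e3:.2f} ms")
print(f"propagation-only BDP: {bdp / 1024:.0f} KiB")
```

Measured round-trip times would be higher than this estimate, so buffers sized from real measurements would be correspondingly larger.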

Conclusion:
The AVIDD facility provides an unusual opportunity to directly consider the costs and benefits of a distributed, grid-based approach as compared to a single installation. The upper bound on the performance cost was approximately 9% on the HPL benchmark, relative to the results expected for a single-location installation. (This 9% figure is based on doubling the size of one local Myrinet-connected cluster with perfect scaling.) The benefits of the distributed approach have to do primarily with facilities and robustness. The distributed installation reduced the impact on IU's existing machine rooms in terms of electrical power and cooling. This was important - neither of the two existing machine rooms had electrical power and cooling facilities sufficient to house the entire system. In addition, the distributed approach creates resilience in the face of a disaster striking one of the two machine rooms.
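One reading that appears to reproduce the 9% figure is the difference in fraction of peak between a perfectly scaled Myrinet cluster and the distributed run. The sketch below uses the benchmark results from the text. The per-cluster peak of 0.9216 TFLOPS assumes 192 processors at 2.4 GHz and 2 floating-point operations per cycle, and the interpretation itself is an assumption rather than something the text states.

```python
# Comparing the distributed run with an idealized doubled Myrinet cluster.
PEAK_PER_CLUSTER = 0.9216         # TFLOPS; assumes 2 flops/cycle
MYRINET_ONE_CLUSTER = 0.614       # TFLOPS, from the text
DISTRIBUTED_TWO_CLUSTERS = 1.058  # TFLOPS, from the text

# Perfect scaling: twice the single-cluster Myrinet result.
ideal_two_clusters = 2 * MYRINET_ONE_CLUSTER

# Fraction of peak achieved in each case.
eff_ideal = ideal_two_clusters / (2 * PEAK_PER_CLUSTER)
eff_distributed = DISTRIBUTED_TWO_CLUSTERS / (2 * PEAK_PER_CLUSTER)

# Gap expressed in percentage points of peak.
gap_points = (eff_ideal - eff_distributed) * 100

# The same gap expressed relative to the idealized result.
relative_shortfall = (1 - DISTRIBUTED_TWO_CLUSTERS / ideal_two_clusters) * 100

print(f"ideal efficiency:       {eff_ideal * 100:.1f}% of peak")
print(f"distributed efficiency: {eff_distributed * 100:.1f}% of peak")
print(f"gap:                    {gap_points:.1f} points of peak")
print(f"relative shortfall:     {relative_shortfall:.1f}%")
```

Measured in points of peak, the gap is a little over 9. Measured relative to the idealized 1.228 TFLOPS, the shortfall is larger, so the basis of comparison matters when quoting the figure.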

There are many other reasons for a grid-based approach to computing infrastructure, but overall the grid approach has proven useful to IU in practice in this installation.

NOTE: Prior to presentation of our poster, all of the benchmarks will be re-run with the latest versions of software. We also plan two additional variants of the benchmark tests. Namely, we plan to run the benchmark of the cluster using Force10 switches and Gigabit Ethernet, but connected over a shared network. In addition, we plan to run a performance analysis using multi-protocol support in LAM/MPI, which will permit us to use a hybrid of Myrinet locally and Gigabit Ethernet between the two clusters.



Birds of a Feather sessions

.NET and Grids/HPC

Chair: Marty Humphrey (University of Virginia)
Date: Tuesday, November 18
Time: 5:00PM - 6:00PM
Room: 42-43

Speaker(s)/Author(s):
Marty Humphrey (University of Virginia)

Description:
Microsoft is actively promoting .NET as a standard platform for supporting a wide variety of services and applications. Given the increasing reliance of HPC and Grid activities on commodity systems, and the near certain ubiquity of .NET in the future, it is important that .NET technologies be evaluated relative to the needs of the HPC and the Grid. The purpose of this BoF is to gauge community interest in .NET as a platform for high-performance and Grid computing and to initiate the process of creating an on-going discussion and development of .NET as it relates in particular to these communities.

This year's BOF follows up on the initial, successful BOF held at SC2002, which largely focused on HPC. At this year's BOF, we are broadening the scope to focus more equally on HPC and the Grid. Specifically, updates of HPC.NET (Indiana University), OGSI.NET (University of Virginia), and other related activities will be given. Both producers and consumers of all things .NET and HPC/Grid are strongly encouraged to attend!


OSCAR Community Meeting

Chair: Stephen L. Scott (Oak Ridge National Lab)
Date: Tuesday, November 18
Time: 5:00PM - 6:00PM
Room: 40-41

Speaker(s)/Author(s):
Stephen L. Scott (Oak Ridge National Laboratory), Thomas Naughton (Oak Ridge National Laboratory), Chokchai Leangsuksun (Louisiana Tech University), Jeff Squyres (Indiana University), Benoit des Ligneris (University of Sherbrooke), Richard M. Libby (Intel), Tom Lehmann (Intel), Jeremy Enos (NCSA), Neil Gorsuch (NCSA)

Description:
Since the first public release in 2001, there have been well over 100,000 downloads of the Open Source Cluster Application Resources (OSCAR) software stack. OSCAR is a self-extracting cluster configuration, installation, maintenance, and operation suite consisting of "best known practices" for cluster computing. In the past year we have seen an expansion of the OSCAR effort in the directions of diskless clusters (Thin OSCAR) and high-availability cluster computing (HA-OSCAR). Furthermore, this past May, over 100 attendees gathered in Sherbrooke, Quebec, Canada at the 1st international OSCAR symposium. With the continued growth of the OSCAR community, this meeting will be a focal point for the OSCAR community where both developers and users may gather to discuss the current state as well as future directions for the OSCAR software stack.


Impacts of Public and Private Sector Collaborations for Advanced Cyberinfrastructure

Chair: Dawnetta Michelle Van Dunk (Purdue University)
Date: Wednesday, November 19
Time: 5:00PM - 6:00PM
Room: 38-39

Speaker(s)/Author(s):
Dawnetta M. Van Dunk (Purdue University), Gary Bertoline (Purdue University), Gilbert Rochon (Purdue University)

Description:
"...A vast opportunity exists for creating new research environments based upon cyberinfrastructure, but there are also significant risks and costs if we do not act quickly and at a sufficient level of investment. The dangers, all increasing with the passage of time,... increased technological ('not invented here') balkanizations rather than interoperability among disciplines, wasteful redundant system-building activities among science fields or between science fields and industry; lack of synergy among information technology research, and the IT industry… resulting in under- or overestimating technological futures..." This statement in the Blue Ribbon Panel's final report on advanced cyberinfrastructure presents a significant challenge for maximizing the possibilities of advanced cyberinfrastructure - establishing sustainable linkages between academia and private industry.

The private sector seems to be vital to advancing cyberinfrastructure. While a large number of innovations are generated by universities, the private sector implements them and assures dissemination to the public. Therefore, developments of advanced cyberinfrastructure will require strengthened and lasting public-private sector relationships. Well-developed technology transfer strategies are important in this development, as they are the solution to the challenge of effective collaboration between scientists and the private business sector.

Proponents of advanced cyberinfrastructure envision its benefits to the economy on many levels, including increased efficiency among workers and lower start-up costs for small companies. These efforts give new markets a boost - creating jobs. The workforce becomes more efficient through knowledge exchange, thereby increasing value for the private sector.

This BOF session will address issues in the reliability of the private sector for advanced cyberinfrastructure (interoperability concerns, open source community?); new market creation (what would the business models look like?); and workforce generation (what are impacts for graduate education and filling the talent pipeline?).



HPC Challenge

Global Analysis of Arthropod Evolution

Date: Wednesday, November 19
Time: 11:00AM
Room: 40-41

Speakers/Presenter:
Rainer Keller (HLRS, University of Stuttgart), Craig Stewart (UITS, Indiana University), John Colbourne (Center for Genomics and Bioinformatics, Indiana University), Matthias Hess (HLRS, U. Stuttgart), David Hart (UITS, Indiana U.), Jennifer Steinbachs (Center for Genomics and Bioinformatics, Indiana University), Uwe Woessner (HLRS, U. Stuttgart), Donald Berry (UITS, Indiana U.), Richard Repasky (UITS, Indiana U.), Matthias Mueller (HLRS, U. Stuttgart), Huian Li (UITS, Indiana U.), Gary W. Stuart (Center for Genomics and Bioinformatics, Indiana University), Michael Resch (HLRS, U. Stuttgart), Katy Boerner (School of Library and Information Science, Indiana U.), Eric Wernert (UITS, Indiana U.), Markus Buchhorn (Australian National University), Hiroshi Takemiya (National Institute of Advanced Industrial Science and Technology, Japan), Rim Belhaj (ISET'Com, Tunisia), Wolfgang E. Nagel (Center for High Performance Computing (ZHR), Technical University of Dresden), Sergiu Sanielevici (Pittsburgh Supercomputing Center), Sergio Takeo Kofuji (LCCA/CCE-USP), David Bannon (Victorian Partnership for Advanced Computing, Australia), Norihiro Nakajima (Japan Atomic Energy Research Institute), Rosa Badia (CEPBA-IBM Research Institute), Mark A. Miller (San Diego Supercomputer Center), Hyungwoo Park (Korea Institute of Science and Technology Information), Rick Stevens (Argonne National Laboratory), Fang-Pang Lin (National Center for High Performance Computing), John Brooke (Manchester Computing), David Moffett (Purdue University), Tan Tin Wee (National University of Singapore), Greg Newby (Arctic Region Supercomputer Center), J.C.T. Poole (CACR, Caltech), Ramched Hamza (Sup'com, Tunisia), Mary Papakhian (UITS, Indiana U.), Leigh Grundhoefer (UITS, Indiana U.), Peter Cherbas (Center for Genomics and Bioinformatics, Indiana U.), John Trueman (Australian National University)

Description:
fastDNAml is a well-established parallel program for inference of phylogenetic relationships based on DNA sequences. PACX is a tool for linking and geographically distributing MPI codes (http://www.hlrs.de/organization/pds/projects/pacx-mpi/) that permits use of distributed MPI applications across heterogeneous systems with the use of multiple, vendor- or system-specific optimized MPI libraries. We will use Dimemas to predict the performance of the application implemented across the grid created for this project, and will attempt to match the pattern of work distribution to variations in communication speeds across the network to optimize the overall effectiveness of the calculations.

Our approach will be to link many supercomputers and workstations across as many continents as possible, using PACX to link fastDNAml so as to solve problems that would otherwise be too large to solve in a reasonable amount of time. Our plan, then, is to demonstrate a global-scale distributed computing application and at the same time perform meaningful analysis of an important biological problem.



Last revised November 14, 2003

Copyright 2003, The Trustees of Indiana University