Accelerating Depth Map Building with the GPU
Brown, Nathan Daniel
The Scanning Electron Microscope (SEM) produces images that illustrate the structure of the specimen's surface. By tilting the image plane, one can create a pair of stereo images and qualitatively view the 3D structure.
A quantitative depth measurement at each image pixel, derived from this pair, is far more useful. Although the two images can be compared in a variety of ways, normalized cross correlation (NCC) appears to be the best choice for these images.
Unfortunately, NCC is computationally intensive on large SEM images. Fortunately, the problem maps well onto a low-cost Graphics Processing Unit (GPU), which offers acceptable performance for interactive use. Extracted from LA-UR-08-2439.
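As an illustration of the NCC matching described above, here is a minimal CPU sketch in plain Python. It is not the poster's GPU implementation. The function names (`ncc`, `best_disparity`), the one-dimensional scanline window, and the search over non-negative horizontal shifts are all assumptions made for the example.

```python
import math

def ncc(a, b):
    """Normalized cross correlation of two equal-length pixel windows."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat window: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom

def best_disparity(left_row, right_row, x, half_window, max_disparity):
    """Hypothetical helper: find the shift d in [0, max_disparity] whose
    window in right_row best matches the window centred at x in left_row.
    The caller must ensure the window around x lies inside left_row."""
    width = 2 * half_window + 1
    ref = left_row[x - half_window : x + half_window + 1]
    best_d, best_score = 0, -2.0  # NCC scores lie in [-1, 1]
    for d in range(max_disparity + 1):
        start = x - d - half_window
        if start < 0:
            break  # candidate window would fall off the image edge
        score = ncc(ref, right_row[start : start + width])
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score
```

Every pixel's search is independent of every other pixel's, which is the property that makes this kind of computation a natural fit for a GPU. Because each window is mean-centred and normalized, the score is also insensitive to uniform brightness and contrast differences between the two images.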
ADIOS User Library
Wolf, Matthew
Transitioning to and beyond petascale architectures places an extreme stress on the I/O interfaces. Work being done jointly by Oak Ridge and Georgia Tech, along with other partners, showcases some technologies for adaptive and adaptable I/O libraries to address this need. The ADIOS user library captures a simplified user interface designed to enable highly optimized transfers, and the DataTap asynchronous I/O library leverages that to provide large throughput with minimal perturbation of the main executing code. Measurements for both in NCCS production codes and environments will be presented.
The Climate End Station
Buja, Lawrence
The Grand Challenge of Climate Change Science is to predict future climates based on scenarios of anthropogenic emissions and other changes resulting from options in energy policies. The Climate Science Computational End Station (CCES) is an ongoing dedicated effort to apply DOE leadership computing resources to this problem through advanced computational simulations of the Earth System.
Flame Stabilization in Turbulent Lifted Autoignitive Jet Flames
Chen, Jacqueline H.
In many modern combustion systems such as diesel engines and gas turbines, fuel is injected into an environment of hot gases, and a flame may be stabilized through the recirculation of hot air and combustion products. Under such conditions, a turbulent lifted flame is formed and the hot environment admits the possibility of autoignition, as a mechanism contributing to the stabilization of the flame base. Recent terascale direct numerical simulations of a lifted autoignitive turbulent jet flame have been performed to study stabilization mechanisms. The roles of autoignition, flame propagation, and organized fluid motion at the lifted flame base are identified. Diverse fuels exhibiting a wide range of ignition characteristics representative of oxygenated and multi-stage ignition fuels will be considered in future DNS studies.
Gyrokinetic Tokamak Simulation
Kolesnikov, Roman Alexandrovich
We use the Gyrokinetic Tokamak Simulation (GTS) code to carry out realistic simulations of National Spherical Torus Experiment (NSTX) plasmas for magnetic fusion research. GTS is the only truly global code in the U.S. that has the right geometry to interface with the data given by the advanced experimental diagnostics from NSTX. Current research topics include investigation of electron temperature gradient (ETG) driven transport, understanding of the non-diffusive mechanisms for toroidal rotation, and the study of trapped electron mode (TEM) driven heat and momentum transport, where high beta and strong shaping make these simulations computationally very challenging.
Habanero Project
Koelbel, Charles Howard
The Habanero project at Rice University (http://habanero.rice.edu/Habanero_Home.html) was initiated in Fall 2007 to address the multicore software challenge by developing new programming technologies --- languages, compilers, managed runtimes, concurrency libraries, and tools --- that support portable parallel abstractions for future multicore hardware with high productivity and high performance. Our goal is to ensure that future software rewrites are done on software platforms that enable application developers to reuse their investment across multiple generations of homogeneous and heterogeneous multicore hardware.
High-reliability, availability, and serviceability for extreme-scale HPC systems
Engelmann, Christian
In order to address anticipated high failure rates, resiliency characteristics have become an urgent priority for next-generation extreme-scale high-performance computing (HPC) systems. This poster summarizes our efforts in novel system software solutions for providing high-level reliability, availability and serviceability (RAS) for extreme-scale HPC systems. The poster showcases results of developed proof-of-concept implementations and performed theoretical analyses, and outlines planned activities. Recent breakthrough accomplishments, detailed in the poster, include two newly developed technologies for (1) providing high availability of service nodes, and (2) supporting advanced proactive fault resilience on compute nodes.
Large Eddy Simulation
Oefelein, Joseph C.
Progress related to our recent INCITE allocation will be presented, with emphasis on the Large Eddy Simulation technique. The work focuses on turbulent combustion processes typically encountered in power and propulsion systems, in particular those associated with internal combustion engines. Establishing a high-fidelity simulation capability in this area will provide the foundation required to develop validated, predictive combustion models for optimizing evolving fuels in advanced engines for transportation applications.
Multiscale Materials Modeling via Petascale Computing
Keffer, David
download poster (pdf)
The Computational Materials Research Group at the University of Tennessee (UT CMRG; http://clausius.engr.utk.edu/cmrg/index.html) integrates a broad repertoire of materials modeling tools to develop structure/property relationships for complex, nanostructured materials. These tools include quantum mechanical modeling, classical equilibrium and non-equilibrium molecular dynamics, Monte Carlo methods, reactive molecular dynamics, mesoscale models, and continuum models. This suite of tools is applied to develop a molecular-level understanding of the fundamental mechanisms underlying the structure/property relationship of interest. Examples include (i) proton transport in PEM fuel cells, (ii) chain dynamics in flowing polymers, and (iii) nanoporous materials tailored for sensing of explosive material.
Performance Evaluation and Analysis Consortium End Station
Worley, Patrick
download poster (pdf)
Overview of recent results and of current and planned research activities of the Performance Evaluation and Analysis Consortium End Station INCITE project.
Performance Improvement in memory access costs
Jessup, Elizabeth
Some scientific programs use only 10% of the processor's peak computing capacity due to data retrieval bottlenecks. Our work concerns the reduction of such memory access costs for problems in matrix algebra.
Matrix algebra codes are typically constructed as a sequence of calls to the Basic Linear Algebra Subprograms (BLAS), which promotes readability and maintainability at the expense of memory efficiency. We present a method for combining, or composing, sequences of matrix operations through loop fusion. The resulting performance improvement is directly proportional to the reduction in memory traffic, and speedups of up to 90% have been observed.
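To make the idea of loop fusion concrete, the following sketch shows two matrix-vector products written first as separate BLAS-style routines and then as one fused loop. The operation pair (q = A p and s = Aᵀ r) and all function names are assumptions chosen for illustration, not the poster's method. Python is used only to show the transformation and will not reproduce the memory-traffic savings of compiled code.

```python
def matvec(A, x):
    """q = A x: one full pass over the matrix A."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def matvec_transposed(A, y):
    """s = A^T y: a second full pass over the matrix A."""
    out = [0.0] * len(A[0])
    for row, y_i in zip(A, y):
        for j, a_ij in enumerate(row):
            out[j] += a_ij * y_i
    return out

def unfused(A, p, r):
    """Sequence of separate calls: A is read from memory twice."""
    return matvec(A, p), matvec_transposed(A, r)

def fused(A, p, r):
    """Fused loop: each element of A is read once and used for both results."""
    q = [0.0] * len(A)
    s = [0.0] * len(A[0])
    for i, row in enumerate(A):
        acc = 0.0
        r_i = r[i]
        for j, a_ij in enumerate(row):
            acc += a_ij * p[j]   # contribution to q = A p
            s[j] += a_ij * r_i   # contribution to s = A^T r
        q[i] = acc
    return q, s
```

Both versions compute the same results. For a matrix too large to fit in cache, the fused form roughly halves the number of times matrix elements are fetched from memory, and that reduction in traffic is what a fusion approach exploits.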
Recent progress in lattice QCD under INCITE
Joo, Balint
download poster (pdf)
We present recent progress in Lattice QCD calculations under INCITE and discuss future challenges and opportunities.
Reconstruction of transcriptional networks of Shewanella oneidensis
Rocha, A. M., G. Kora, P. Stroot, and N. Samatova
Applying system models in microbial studies is important for understanding transcriptional regulatory interactions in potentially important microorganisms. Prediction of gene response to environmental perturbations is critical to developing bioremediation strategies. The goal of this study is to evaluate two different algorithms, cMonkey and context likelihood of relatedness (CLR), for predicting transcriptional gene response. Transcriptional regulatory networks were constructed using annotated gene sequences, available microarray data, and known association networks for S. oneidensis. Reconstruction results will identify co-regulated transcriptional gene clusters and potential transcriptional gene response by Shewanella to environmental factors.
Resource Aware Partial Differential Equation Compiler
Duffy, Edward B
In this poster, we present the design of a PDE compiler that transforms equations written by scientists into optimized code that can be executed on different petascale systems.
By incorporating a higher level PDE language specification and Computer Algebra System, we empower the scientist to spend more time creating executables for their specific research rather than waste time debugging code that uses error-prone coding methods.
The hardware/software requirements are provided by the Application Specification Language, and the description of available resources is provided by the Resource Description Language. These specifications are used to generate executable code optimized for the specific architecture.
Resource-centric monitoring and performance analysis tools
Fowler, Robert, Wu Feng
download poster (pdf)
In high-end systems, there will be hundreds of thousands, or even millions, of concurrent computational threads. The resources through which these threads interact and coordinate will be scarce and will constrain system performance. Performance, correctness, and reliability in the system will depend on the use of, and contention for, these resources. We therefore present resource-centric monitoring and performance analysis tools that capture the interaction and contention on these resources without adding to the problem. We intend for these tools to complement and interoperate with other projects that focus on the problem of scaling monitoring to petascale systems.
System-level Virtualization for High-Performance Computing
Scott, Stephen L.
Current system-level virtualization solutions are guided by the server consolidation market, creating a gap with High Performance Computing requirements. In the context of HPC, virtualization typically enables execution environment customization, workload isolation, security, reliability, and workload balancing. We have developed OSCAR-V, a system management tool for virtual platforms, which enables the customization of the execution environment via the concept of Virtual System Environments. It also allows users to switch between non-virtualized environments and virtualized systems. We have also developed a new virtualization solution, based on a modular architecture, that aims at decreasing the system footprint and at enabling performance characterization.
Untitled
Mueller, Frank
Petascale applications require monitoring and analysis tools capable of processing enormous data volumes. For 100,000 cores, an instrumented application can generate terabytes of event traces. Thus, the transfer and storage of complete traces is infeasible, necessitating tools that reduce, process, and analyze data on-line before human exploration and optimization.
We are developing a scalable, reconfigurable infrastructure for performance analysis on large-scale machines that collects near-constant size communication traces. We introduce novel mechanisms that annotate runtime traces with timing statistics and computational load measures. Preliminary results confirm the applicability of these techniques to visualize load imbalance and to replay codes for accurate analysis.
Untitled
Romero, Adrian
Appropriate monitoring for high-performance clusters goes beyond real-time alerts. Suitable sifting of data, convenient access to system data for event investigation, and an accurate historical record of issues are all part of monitoring. Correlated information from disparate data sources is vital in determining the overall health of these complex systems. Data generated by these machines provide the clues needed to resolve past, present, and future problems. Use of an effective monitoring tool will boost the ability to understand and resolve problems, help identify potential problems, and reduce costs by increasing the efficiency of existing personnel. This increase in efficiency allows the systems to be managed without additional staff.
Untitled
Scheibe, Timothy
download poster (jpg)
Many important subsurface flow and transport problems involve coupled non-linear processes in porous media exhibiting complex heterogeneity. Experimental research has revealed important details about the physical, chemical, and biological mechanisms that control these processes from the molecular to laboratory scales. We are developing a hybrid multiscale modeling framework that combines discrete pore-scale models (which explicitly represent the pore space geometry at a local scale) with continuum field-scale models (which conceptualize flow and transport in a porous medium without a detailed representation of the pore space geometry). The framework utilizes scalable parallel codes at both scales linked in a component-based network.