An examination of the dual-core capability of the HP xw9300 Workstation.

By employing single- and dual-core AMD Opteron™ processor technology, users benefit from multiple processing power options in a high-end, ultra-high-performance personal workstation.

 


Contents

Introduction
Single vs. Multi-core Technology
    Introduction
    Overview of Dual-core Technology
    Multitasking and Multithreading
    Applications Environment
    Applications Licensing
Performance Comparison
    Application benchmarks
        Ansys
        3D Studio Max
        GeoProbe/SeisWorks
    Performance Summary
Conclusion
For more information

 

 


Introduction

Performance requirements for applications in the major market segments for personal workstations continue to grow. In fact, there is a “leapfrog” phenomenon at work—the more powerful the hardware becomes, the larger the problems that can be solved, and the more functionality software vendors add to applications. As users employ these larger problem sizes and increased amounts of functionality, workstation technology scrambles to increase performance, and the cycle repeats itself.

The two most common methods of increasing performance are (a) increasing the clock speed of the system’s processor(s) and (b) increasing the number of processors. (This excludes changing the underlying processor architecture, e.g., pipeline lengths; the discussion here focuses on increasing performance within a specific architecture.) Increasing performance through higher clock speeds is often impractical; increasing the number of processors often works quite well, especially in multitasked and multithreaded environments (see below).

The latest high-end workstation from HP, the HP xw9300 (Figure 1), leverages dual-core technology to double the number of processors (from two to four) available in the same physical enclosure. Doubling the number of processors provides ultra-high-end levels of performance for scientists, engineers, designers and digital artists who have extremely complex analyses and/or advanced visualization requirements.

Because different applications take advantage of multiple processors differently, it is useful to examine how multiple processors are used and where the benefit of their use lies. This paper discusses uses of dual-core technology, the different processor options available on the HP xw9300, and offers suggestions to help customers select the appropriate processor configuration.

 

Figure 1. The current line of HP personal workstations and the position of the HP xw9300.

 

 


Single vs. Multi-core Technology

Introduction

The availability of a dual-core processor has a simple implication: there are more processors available to do work in a given environment, which should translate into more performance. However, one of the first difficulties we encounter is determining what is really meant by “performance.” For example, the popular standard benchmark SPECfp2000 from the Standard Performance Evaluation Corporation (SPEC) is often used to determine the relative floating-point performance of a system. Assuming the benchmark has not been parallelized (see below), a multiprocessor system would record the same results on the SPECfp2000 benchmark regardless of whether it had one processor or one hundred.

Another standard benchmark, the SPECfp_rate2000, runs multiple copies of the SPECfp2000 benchmark, and provides results based on the total throughput (number of jobs) that a system is capable of executing in a fixed amount of time. In this case, the more processors the better the result. Therefore, “performance” can depend on both how fast a single job completes as well as how quickly many jobs complete.
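The distinction between the two kinds of result can be made concrete with a little arithmetic. The sketch below is not how SPEC computes its metrics; it is a simplified model that assumes identical, non-parallelized jobs and ideal scaling with no contention for memory or I/O. The function names and the 600-second job time are hypothetical.

```python
import math


def single_job_seconds(job_seconds: float, processors: int) -> float:
    """Time to finish ONE non-parallelized job.

    The job runs on a single processor, so adding processors
    does not change the result.
    """
    return job_seconds


def batch_seconds(job_seconds: float, copies: int, processors: int) -> float:
    """Time to finish `copies` identical jobs, one job per processor at a time."""
    rounds = math.ceil(copies / processors)
    return rounds * job_seconds


def jobs_per_hour(job_seconds: float, copies: int, processors: int) -> float:
    """Throughput: completed jobs divided by elapsed time, scaled to one hour."""
    return copies * 3600.0 / batch_seconds(job_seconds, copies, processors)


if __name__ == "__main__":
    for processors in (1, 2, 4):
        print(
            processors,
            single_job_seconds(600.0, processors),
            jobs_per_hour(600.0, copies=4, processors=processors),
        )
```

Under these assumptions the single-job time stays at 600 seconds for every configuration, while throughput for four copies rises from 6 to 24 jobs per hour as the processor count goes from one to four.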

Performance is discussed further in the “Applications Environment” and “Performance Comparison” sections below. For now, keep in mind that performance is nearly always a combination of multiprocessor throughput and single-application performance. Generally, we characterize performance primarily as multiprocessor throughput (i.e., aggregate system performance).

Overview of Dual-core Technology

Dual-core technology is a design in which two processor cores are placed on a single die, generally in the same package as a single-core processor (Figure 2). In fact, the first dual-core processors produced by AMD are socket-compatible with the corresponding single-core processors[1]. Using dual-core processors provides the best performance per watt and allows HP to provide more aggregate performance in the same workstation enclosure that is used for single-core processors.

 

Figure 2. Dual-core technology places two processor cores on a single die.

 

Multitasking and Multithreading

Successfully employing multiple processors to increase performance always involves splitting work up across the system’s processors, whether these pieces of work are processes (jobs) or threads (portions of a single process). The former is called multitasking (or multiprocessing[2]); the latter is called multithreading. Figure 3 illustrates multitasking—note that three tasks may use multiple processors to get more work done in a shorter amount of time.

An example of multitasking might be a routine set of operations in a Digital Content Creation (DCC) environment. In a typical video editing workflow, the artist must render segments of video, compress video streams, and capture video from an external source, all of which are compute- and I/O-intensive applications. In a multitasking environment, all of these processes may be executing at the same time; each process is scheduled for some amount of time on the available processors in the system. The result is an overall reduction in the amount of time to complete the entire set of tasks.
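The effect described above can be approximated with a small scheduling model. The sketch below assumes three independent tasks with hypothetical durations (30, 20, and 10 minutes), ignores I/O contention and scheduling overhead, and uses a simple greedy rule (longest task first, placed on the least-loaded processor). A real operating system scheduler time-slices processes and is considerably more sophisticated.

```python
import heapq


def makespan(task_minutes, processors):
    """Return the time at which the last task finishes.

    Greedy list scheduling: take tasks from longest to shortest and
    assign each one to the processor that is currently least loaded.
    """
    if processors < 1:
        raise ValueError("at least one processor is required")
    loads = [0.0] * processors  # accumulated work per processor
    heapq.heapify(loads)
    for minutes in sorted(task_minutes, reverse=True):
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + minutes)
    return max(loads)


if __name__ == "__main__":
    # Hypothetical durations for render, compress, and capture tasks.
    tasks = [30.0, 20.0, 10.0]
    for processors in (1, 2, 4):
        print(processors, makespan(tasks, processors))
```

With these numbers, one processor needs 60 minutes and two processors need 30. Four processors also need 30, because the longest single task sets a floor that multitasking alone cannot lower.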

 

Figure 3. Overall system performance (throughput) can be increased through multitasking.

 

A single task may also use multiple processors to increase performance through multithreading (parallelism). Multithreading involves breaking an application into pieces and spreading the pieces over multiple processors (Figure 4).

In the DCC workflow example, rendering of video frames is a highly parallelizable operation. This is because video frames in a video segment are largely independent, and the rendering of each frame, or groups of frames, can be distributed across threads of the rendering application. Thus, employing multiple processors increases the performance (reduces the time-to-result) of the rendering application.
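A minimal sketch of this structure, using a thread pool from the Python standard library, is shown below. The `render_frame` function is a hypothetical stand-in for real rendering work, and the file-name format is invented for illustration. Note that in standard CPython builds, pure-Python CPU-bound code is serialized by the global interpreter lock, so a production renderer would typically do its heavy work in native code or in separate processes; the example shows only how independent frames map onto worker threads.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number: int) -> str:
    """Hypothetical stand-in for rendering one frame.

    Each frame depends only on its own number, which is what makes
    the work easy to distribute across threads.
    """
    return f"frame-{frame_number:04d}.rendered"


def render_segment(frame_numbers, workers: int):
    """Render all frames using `workers` threads; results keep frame order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of the inputs,
        # regardless of which thread finishes first.
        return list(pool.map(render_frame, frame_numbers))


if __name__ == "__main__":
    # Four workers, matching a four-core configuration.
    print(render_segment(range(8), workers=4))
```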

As a general example, some compilers (HP Fortran and C for example) automatically identify parallelism in a program, and generate code that is thread-level parallelized. Additionally, industry-standard programming interfaces (APIs) and preprocessors are available that allow programmers to explicitly parallelize applications.

 

Figure 4. Performance of a single application can be accelerated by multithreading (parallelism).

 

Modern operating systems do some of each (multitasking and multithreading). For example, the UNIX operating system itself is multithreaded, so parts of the operating system can run on multiple processors; UNIX also supports multiprocessing, which enables multiple jobs to run simultaneously when multiple processors are available. As shown in Figure 5, a multitasking/multithreaded environment allows multiple threads of multiple tasks to be scheduled to run on the available processors.

 

Figure 5. System and application performance can be increased through both multitasking and multithreading.

 

In Figure 5, tasks 1, 2, and 3 are broken into multiple threads (represented by different shades of the task’s color); these threads are then scheduled across the processors. For simplicity, this example assumes the overhead of parallelization is minimal; in reality there may be substantial overhead in multithreading. This is another reason why performance of multithreaded applications is difficult to predict.
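One common way to reason about this limit is Amdahl's law, which is not part of the original discussion but is a standard model: if only a fraction of a program can run in parallel, the serial remainder bounds the achievable speedup. The sketch below adds an optional `overhead_fraction` term as a crude, hypothetical way to represent parallelization overhead; both the parameter and the example values are assumptions for illustration.

```python
def amdahl_speedup(
    parallel_fraction: float,
    processors: int,
    overhead_fraction: float = 0.0,
) -> float:
    """Estimated speedup of one task on `processors` processors.

    parallel_fraction: share of single-processor runtime that can be
        spread across processors (0.0 to 1.0).
    overhead_fraction: extra runtime, as a share of the single-processor
        runtime, charged only when more than one processor is used.
        This term is an illustrative assumption, not part of Amdahl's law.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial = 1.0 - parallel_fraction
    overhead = overhead_fraction if processors > 1 else 0.0
    return 1.0 / (serial + parallel_fraction / processors + overhead)


if __name__ == "__main__":
    for fraction in (0.5, 0.9, 1.0):
        print(fraction, round(amdahl_speedup(fraction, 4), 2))
    print("with overhead:", round(amdahl_speedup(0.9, 4, 0.05), 2))
```

In this model, a task that is 90% parallelizable speeds up by roughly 3.1 times on four processors rather than 4 times, and any overhead reduces the figure further.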

Applications Environment

A critical criterion in deciding what processor configuration to purchase is performance with a user’s key applications. Users are strongly encouraged to ask the vendor of their software applications about performance on multiprocessor systems, as the amount of performance improvement is highly varied. Figure 6 below provides an overview of some of the market segments for which the HP xw9300 is primarily suited[3], as well as an indication of the applicability of a dual-core processor configuration to that segment.

 

Figure 6. General applicability of different application segments to dual-core processor technology.

Application Segment | General Characteristics | Applicability to dual-core processors
--- | --- | ---
Mechanical Computer-Aided Engineering (MCAE) | MCAE applications require high processor performance, and many are designed to run on multiprocessor systems. | High
Digital Content Creation (DCC) | The DCC applications that demand the most of a workstation processor are generally complex animations, rendering, and physics systems as well as video effects. Most DCC applications are multithreaded. Rendering is a highly parallelizable operation and benefits from multiple processors. | High
Scientific Research | The Scientific Research market segment is something of a “catch-all” market. However, many of the software developers in this market segment, in areas such as imaging and the life and material sciences, employ parallelism (both thread- and process-level). | High
Oil and Gas | The Oil and Gas industries make heavy use of both 32- and 64-bit applications, and many of these applications are designed to use parallelism to increase performance[4]. | High

 

An additional important performance consideration is that of response time to the workstation user. All of the popular operating systems today are multi-threaded to some degree. Multithreading allows multiple operating system functions (e.g., file system access, window management, printing functions) to be carried out simultaneously, and systems with multiple processors will generally provide better response time to the workstation user.

Applications Licensing

Another important issue when comparing single- to multi-core processor technology is that of applications licensing. Some software vendors license applications by the computer system, some by the processor, and some by the core. It is prudent to check with your independent software vendor (ISV) before making a decision, since customers who use software that is licensed per core may face increased software costs when upgrading to multi-core processor systems.

AMD recommends that software developers license their software by socket and schedule threads by available cores[5]. At least one major software vendor, Microsoft, has announced licensing based on the number of processor chips, regardless of the number of cores on the chip[6]. Thus, Microsoft operating systems and applications will install and run on multi-core systems just as they do on current single-core systems.
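The cost implication of the three licensing models can be sketched with simple arithmetic. The model names below (`per_system`, `per_socket`, `per_core`) and the function are hypothetical labels for the schemes described above; actual license terms vary by vendor and should be confirmed with the ISV.

```python
def licenses_required(model: str, sockets: int, cores_per_socket: int) -> int:
    """Number of license units needed under a given licensing model.

    model: "per_system", "per_socket" (per processor chip), or "per_core".
    These names are illustrative labels, not vendor terminology.
    """
    if sockets < 1 or cores_per_socket < 1:
        raise ValueError("sockets and cores_per_socket must be at least 1")
    if model == "per_system":
        return 1
    if model == "per_socket":
        return sockets
    if model == "per_core":
        return sockets * cores_per_socket
    raise ValueError(f"unknown licensing model: {model}")


if __name__ == "__main__":
    # A two-socket workstation with single-core, then dual-core, processors.
    for cores_per_socket in (1, 2):
        for model in ("per_system", "per_socket", "per_core"):
            print(cores_per_socket, model, licenses_required(model, 2, cores_per_socket))
```

For a two-socket workstation, moving from single-core to dual-core processors leaves the per-system and per-socket counts unchanged (1 and 2) but doubles the per-core count from 2 to 4.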


Performance Comparison

Many benchmarks are available that attempt to predict the performance of an application on a specific platform. Users are strongly advised to test individual applications on specific architectures, especially since dual-core technology is very sensitive to an application’s use of multiple processors.

Application benchmarks

As described in the section “Single vs. Multi-core Technology” above, performance generally reflects some mixture of multithreading and multitasking. Workstation users will nearly always benefit from multiple processors, if for no other reason than that multiple processors allow multiple operating system tasks to execute simultaneously. By having multiple tasks executing concurrently, the operating system can respond more quickly to interactive requests and provide more resources to applications. To assess performance of specific applications, several benchmarks are shown below.

Ansys

The first, Figure 7, illustrates performance on a suite of twenty-six engineering simulation programs using the Ansys 9.0 application[7]. For each processor configuration, the total runtime of all twenty-six benchmarks is shown (thus, the smaller the result, the better the performance). Since Ansys has been optimized with multiprocessor configurations in mind, it benefits substantially from the multiprocessor, and specifically dual-core, configuration of the xw9300 workstation.

 

Figure 7. Comparison of performance using Ansys 9.0 standard benchmark suite[8].

 

3D Studio Max

The second application benchmark, shown in Figure 8, is based on the SPECapc (Application Performance Characterization) suite[9] and measures performance based on the workload of a typical workstation user using the application 3D Studio Max (v6)[10]. The 3D Studio Max benchmark includes functions such as wireframe modeling, shading, texturing, lighting, blending, inverse kinematics, object creation and manipulation, editing, scene creation, particle tracing, animation and rendering.

The total number of seconds to run each test is normalized based on a reference machine, and a composite score is computed. Composite scores are reported for both rendering and interactive tests. An overall composite score is also reported.
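The normalization step can be illustrated as follows. This sketch assumes that each test's score is the reference machine's time divided by the measured time, and that the composite is an unweighted geometric mean of those scores. The exact weighting used by the SPECapc suite is not described in this paper, so both the formula and the timing values are assumptions for illustration only.

```python
import math


def normalized_scores(reference_seconds, measured_seconds):
    """Score each test relative to the reference machine.

    A score above 1.0 means the measured system was faster than the
    reference on that test.
    """
    if len(reference_seconds) != len(measured_seconds):
        raise ValueError("both lists must cover the same tests")
    return [ref / measured for ref, measured in zip(reference_seconds, measured_seconds)]


def composite(scores):
    """Unweighted geometric mean of per-test scores (an assumed formula)."""
    if not scores:
        raise ValueError("at least one score is required")
    return math.exp(sum(math.log(score) for score in scores) / len(scores))


if __name__ == "__main__":
    # Hypothetical timings, in seconds, for three rendering tests.
    reference = [120.0, 300.0, 90.0]
    measured = [60.0, 100.0, 45.0]
    scores = normalized_scores(reference, measured)
    print(scores, round(composite(scores), 3))
```

A geometric mean is often chosen for composites of normalized ratios because no single test with a large ratio can dominate the result the way it would in an arithmetic mean.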

As shown in Figure 8, the rendering activity benefits greatly from the dual-core implementation on the xw9300. This is because the rendering algorithms are highly parallelizable and the 3D Studio Max application implements them with multiple threads. The interactive component, dominated by graphics and single-threaded portions of the application, does not benefit as much from multiple processors.

 

Figure 8. Comparison of performance for different workloads using the 3D Studio Max application.

 

GeoProbe/SeisWorks

Another benchmark that reflects increased performance through dual-core technology is that of the GeoProbe/SeisWorks applications from Landmark Graphics[11]. The applications allow interpreters to simultaneously view multi-attribute/multi-volume seismic data, well data, cultural data, and reservoir models. The benchmarks show two different methods of processing, and are operating on fairly large (10 GByte) data sets. The applications are designed to automatically take advantage of multiple processors/cores if they are present in the system on which the application is executing.

 

Figure 9. Comparison of performance for different configurations of the xw9300 workstation on the Landmark Graphics applications GeoProbe and SeisWorks.

 

Performance Summary

As would be expected, users in a multitasking environment, or with applications that employ multitasking or multithreading, will benefit from dual-core configurations; users whose workloads are not so structured will not. As we have seen, many off-the-shelf applications are multitasked or multithreaded, and even operating systems such as Microsoft Windows and Linux are able to use multiple processors in day-to-day activities. Nonetheless, it is prudent for users to check with their software vendor to determine the benefit of dual-core technology for a specific application.

Conclusion

The introduction of multi-core processors promises to bring higher levels of performance to workstation users. Users benefit from reduced response times, faster job turnaround, and the ability to perform multiple tasks simultaneously. As more applications are specifically architected for multiprocessor systems, users will benefit from reduced turnaround time on compute-intensive applications.

For scientists, engineers, and content creators who require the highest levels of performance, the HP xw9300 can provide that performance. HP’s close partnership with independent software vendors helps ensure that critical applications are designed for optimal performance, including the use of multiple processors. Further, HP expertise with multiprocessor technology and 64-bit operating systems combines to deliver strong problem-solving performance, high reliability, and extreme graphics capabilities. Users who are considering acquiring high-end levels of performance with superior price/performance are urged to evaluate the HP xw9300 workstation.


For more information

http://www.hp.com/workstations/

HP’s personal workstations home page.

http://www.hp.com/workstations/pws/xw9300/index.html

The HP xw9300 personal workstation specifications.

http://www.amd.com/us-en/Processors/ProductInformation

AMD Opteron™ processor product page.

http://multi-core.amd.com/

AMD multi-core processor technology home page.

http://www.microsoft.com/licensing/highlights/multi-core.mspx

A statement from Microsoft on licensing issues related to multi-core processors.

© 2005 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Itanium is a trademark or registered trademark of Intel Corporation in the U.S. and other countries and is used under license. AMD, the AMD Arrow logo, AMD Opteron, combinations thereof, are trademarks of Advanced Micro Devices, Inc.

XXXX-XXXXENW, 06/2005

 
 



[1]http://multi-core.amd.com/Products/

[2] The terms multitasking and multiprocessing are used synonymously in this paper.

[3] The xw9300 workstation is well suited for a wide variety of applications; for brevity, we present the most common.

[4] Some use thread-level parallelism, others use process-level parallelism.

[5] See http://multi-core.amd.com/Technology/SoftwareLicensing/

[6] See http://www.microsoft.com/licensing/highlights/multi-core.mspx

[7] Please see http://www.ansys.com

[8] HP notation for number of processors/cores is “nP/mC,” where n=total number of processor modules, and m= total number of cores.

[9] Please see http://www.spec.org/gpc/apc.static/max6info.html

[10] Please see http://www4.discreet.com/3dsmax/

[11] See http://www.lgc.com/