Direct N-Body Simulation

May 21, 2015, 6:38 am

In some domains, an N-Body simulation is key to solving for the movement and forces of a dynamic system of particles. At each time step, the force that each body exerts on every other body is computed, and from these forces the velocity of each body can be updated. The simulation continues for the desired number of time steps.
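As a rough illustration of the time-stepping loop described here, the following is a minimal sketch of a direct (all-pairs) update. It assumes a gravitational force law, a simple Euler-style integrator, and a small softening term; the `Body` struct and the names `nbody_step` and `simulate` are hypothetical and not taken from the article.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle record; field names are illustrative only.
struct Body {
    double x, y, z;     // position
    double vx, vy, vz;  // velocity
    double mass;
};

// Advance every body by one time step using direct O(N^2) pairwise summation.
// Assumptions: gravitational attraction with constant g, and a softening
// length that keeps the division finite when two bodies come very close.
void nbody_step(std::vector<Body>& bodies, double dt,
                double g = 1.0, double softening = 1e-9) {
    assert(dt > 0.0);
    const std::size_t n = bodies.size();
    std::vector<double> ax(n, 0.0), ay(n, 0.0), az(n, 0.0);

    // Accumulate the acceleration on body i caused by every other body j.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            const double dx = bodies[j].x - bodies[i].x;
            const double dy = bodies[j].y - bodies[i].y;
            const double dz = bodies[j].z - bodies[i].z;
            const double r2 = dx * dx + dy * dy + dz * dz + softening * softening;
            const double inv_r3 = 1.0 / (r2 * std::sqrt(r2));
            ax[i] += g * bodies[j].mass * dx * inv_r3;
            ay[i] += g * bodies[j].mass * dy * inv_r3;
            az[i] += g * bodies[j].mass * dz * inv_r3;
        }
    }

    // Update velocities from the accelerations, then positions from the
    // new velocities (semi-implicit Euler).
    for (std::size_t i = 0; i < n; ++i) {
        Body& b = bodies[i];
        b.vx += ax[i] * dt;
        b.vy += ay[i] * dt;
        b.vz += az[i] * dt;
        b.x += b.vx * dt;
        b.y += b.vy * dt;
        b.z += b.vz * dt;
    }
}

// Run the simulation for a fixed number of time steps.
void simulate(std::vector<Body>& bodies, double dt, int steps) {
    for (int s = 0; s < steps; ++s) {
        nbody_step(bodies, dt);
    }
}
```

The double loop is what makes the method "direct": every pair of bodies is visited at each step, so the cost grows with the square of the particle count.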

The post Direct N-Body Simulation appeared first on insideHPC.

OpenMP and OpenCL on Intel Xeon Phi

February 18, 2016, 7:02 am

"In a heterogeneous system that combines both the Intel Xeon CPU and the Intel Xeon Phi coprocessor, there are various options available to optimize applications. Whether one has an advantage over another is somewhat dependent on the application that is being run. Comparisons can be made comparing the two methods, as long as the algorithm lends itself to run and take advantage of either OpenMP or OpenCL."

The post OpenMP and OpenCL on Intel Xeon Phi appeared first on insideHPC.

Design Optimization for HPC Clusters

August 31, 2016, 6:52 pm

Advanced simulation software can dramatically shorten the design phase by allowing engineers to virtually optimize and validate new ideas earlier in the process, minimizing the expense of building physical prototypes and streamlining real-world testing.

The post Design Optimization for HPC Clusters appeared first on insideHPC.

Intel® VTune™ Amplifier Turns Raw Profiling Data Into Performance Insights

April 13, 2017, 12:06 am

Discovering where the performance bottlenecks are and knowing what to do about them can be a mysterious and complex art that calls for sophisticated performance analysis tools. That’s where Intel® VTune™ Amplifier XE 2017, part of Intel Parallel Studio XE, comes in.

The post Intel® VTune™ Amplifier Turns Raw Profiling Data Into Performance Insights appeared first on insideHPC.

Intel Advisor Roofline Analysis Finds New Opportunities for Optimizing Application Performance

April 27, 2017, 7:00 am

Intel Advisor, an integral part of Intel Parallel Studio XE 2017, can help identify portions of code that could be good candidates for parallelization (both vectorization and threading). It can also help determine when it might not be appropriate to parallelize a section of code, depending on the platform, processor, and configuration it’s running on. Intel Advisor Roofline Analysis reveals the gap between an application’s performance and its expected performance.

The post Intel Advisor Roofline Analysis Finds New Opportunities for Optimizing Application Performance appeared first on insideHPC.

C++ Parallel STL Introduced in Intel Parallel Studio XE 2018 Beta

May 11, 2017, 7:51 am

Parallel STL now makes it possible to transform existing sequential C++ code to take advantage of the threading and vectorization capabilities of modern hardware architectures. It does this by extending the C++ Standard Template Library with an execution policy argument that specifies the degree of threading and vectorization for each algorithm used.
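To make the execution policy argument concrete, here is a minimal sketch using the policies standardized in C++17 (`std::execution::par` and `std::execution::par_unseq`). This is an assumption about the form the API takes, since the beta release described in the article may expose the policies under a different namespace. The helper names `scale_in_place` and `dot` are hypothetical, and some toolchains need a threading backend such as Intel TBB linked in for the parallel policies to run in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <numeric>
#include <vector>

// Multiply every element by a factor. The par_unseq policy permits the
// library to use both multiple threads and vector instructions.
void scale_in_place(std::vector<double>& values, double factor) {
    std::transform(std::execution::par_unseq,
                   values.begin(), values.end(), values.begin(),
                   [factor](double v) { return v * factor; });
}

// Dot product of two equally sized vectors. The par policy permits
// threading; the default operations are multiply, then add.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    return std::transform_reduce(std::execution::par,
                                 a.begin(), a.end(), b.begin(), 0.0);
}
```

Switching the first argument to `std::execution::seq` gives the sequential behavior back, which is what makes it practical to convert existing code one algorithm call at a time.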

The post C++ Parallel STL Introduced in Intel Parallel Studio XE 2018 Beta appeared first on insideHPC.

The OpenMP API Celebrates 20 Years of Success

May 25, 2017, 7:55 am

OpenMP is a good example of how hardware and software vendors, researchers, and academia, volunteering to work together, can successfully design a standard that benefits the entire developer community. Today, most software vendors track OpenMP advances closely and have implemented the latest API features in their compilers and tools. With OpenMP, application portability is assured across the latest multicore systems, including Intel Xeon Phi processors.

The post The OpenMP API Celebrates 20 Years of Success appeared first on insideHPC.

Multicore Performance Challenges for Game Developers

June 8, 2017, 6:00 am

Game developers face a unique challenge – how to make their graphics-heavy applications perform well across a very wide spectrum of hardware devices, not just high-end systems. So while an early version of a game might have been developed on a high-end system with 10 teraflops of GPU potential in a discrete graphics card, how do you scale it down to smaller consumer devices where optimization options are more limited?

The post Multicore Performance Challenges for Game Developers appeared first on insideHPC.

OpenMP at 20 Moving Forward to 5.0

September 28, 2017, 7:07 am

This year, OpenMP, the widely used API for shared memory parallelism supported in many C/C++ and Fortran compilers, turns 20. OpenMP is a great example of how hardware and software vendors, researchers, and academia, volunteering to work together, can successfully design a specification that benefits the entire developer community.
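For readers who have not used the API, the following is a minimal sketch of shared memory parallelism with a single OpenMP directive. The function name `estimate_pi` and the choice of a midpoint-rule integration are illustrative assumptions, not taken from the article. A compiler without OpenMP enabled ignores the pragma and runs the loop serially, producing the same result.

```cpp
#include <cassert>

// Estimate pi by integrating 4 / (1 + x^2) over [0, 1] with the midpoint rule.
// The loop iterations are independent, so they can be shared among threads;
// the reduction clause gives each thread a private partial sum and combines
// the partial sums when the loop ends.
double estimate_pi(long steps) {
    assert(steps > 0);
    const double width = 1.0 / static_cast<double>(steps);
    double sum = 0.0;

    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < steps; ++i) {
        const double x = (static_cast<double>(i) + 0.5) * width;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * width;
}
```

With GCC or Clang the directive is typically activated by a flag such as `-fopenmp`; the source code itself stays the same across compilers that support the specification.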

The post OpenMP at 20 Moving Forward to 5.0 appeared first on insideHPC.

Building Fast Data Compression Code with Intel Integrated Performance Primitives (Intel IPP) 2018

November 9, 2017, 7:15 am

Intel® Integrated Performance Primitives (Intel IPP) is a highly optimized, production-ready library of functions, including lossless data compression and decompression, for image, signal, and data processing and cryptography applications. Intel IPP includes more than 2,500 image processing, 1,300 signal processing, 500 computer vision, and 300 cryptography optimized functions for creating digital media, enterprise data, embedded, communications, and scientific, technical, and security applications.

The post Building Fast Data Compression Code with Intel Integrated Performance Primitives (Intel IPP) 2018 appeared first on insideHPC.

A New Way to Visualize Performance Optimization Tradeoffs

November 30, 2017, 6:58 am

A valuable feature of Intel Advisor is its Roofline Analysis Chart, which provides an intuitive and powerful visualization of actual performance measured against hardware-imposed performance ceilings. Intel Advisor's vector parallelism optimization analysis and memory-versus-compute roofline analysis, working together, offer a powerful tool for visualizing an application’s complete current and potential performance profile on a given platform.
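The idea of a hardware-imposed ceiling can be sketched with the basic roofline model, in which attainable performance is the smaller of the peak compute rate and the memory bandwidth multiplied by arithmetic intensity. The code below is a simplified illustration of that model, not Intel Advisor's implementation; the `Platform` struct, the function names, and the single-bandwidth assumption are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical platform description with one compute roof and one memory roof.
struct Platform {
    double peak_gflops;        // peak compute rate, in GFLOP/s
    double mem_bandwidth_gbs;  // peak memory bandwidth, in GB/s
};

// Ceiling on performance (GFLOP/s) for a kernel with the given arithmetic
// intensity (FLOPs per byte moved to or from memory).
double roofline_ceiling(const Platform& p, double arithmetic_intensity) {
    assert(arithmetic_intensity > 0.0);
    return std::min(p.peak_gflops, p.mem_bandwidth_gbs * arithmetic_intensity);
}

// A kernel is memory bound when the bandwidth roof lies below the compute roof.
bool is_memory_bound(const Platform& p, double arithmetic_intensity) {
    return p.mem_bandwidth_gbs * arithmetic_intensity < p.peak_gflops;
}

// Gap between the ceiling and a measured rate: the potential still available.
double headroom_gflops(const Platform& p, double arithmetic_intensity,
                       double measured_gflops) {
    return roofline_ceiling(p, arithmetic_intensity) - measured_gflops;
}
```

For example, on a platform with a 1,000 GFLOP/s compute peak and 100 GB/s of bandwidth, a kernel at 2 FLOPs per byte is capped at 200 GFLOP/s by memory, so vectorizing it further would help less than reducing its memory traffic.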

The post A New Way to Visualize Performance Optimization Tradeoffs appeared first on insideHPC.
