Special Lecture: Beyond Manycores Comes Heterogeneous Accelerated Computing

Mauricio Breternitz Jr, Ph.D., Advanced Software and Analytics Technology Group, Advanced Micro Devices, Austin, TX. February 15, at 15:30, IC Auditorium, Room 85 - IC 2.

What: Lecture
When: February 15, 2011, from 15:30 to 17:30
Where: IC Auditorium (Room 85) - IC 2
The microprocessor industry has recently undergone, and is still
absorbing, a transition to multiple execution cores. This change was
motivated by exponential increases in power consumption and area
costs, which precluded continued growth in clock frequency and
single-core complexity. CPU vendors currently place multiple (tens
of) cores on a single chip. Software vendors are still looking for
new parallelization technology.
     Around the same time, parallel processing was being used
efficiently in specialized applications such as scientific computing
and graphics processing. Graphics processing units (GPUs) capable of
multiple gigaflops became common and affordable. Once the potential
of heterogeneous systems using GPU acceleration was recognized, the
search for GPGPU (general-purpose GPU computing) opportunities began.
However, the heterogeneous processor (e.g. a GPU) is usually connected
to the system via an attached bus (PCI), introducing extra complexity,
latency, and bandwidth limitations to be overcome. Furthermore, the
SIMD execution model requires highly regular parallelism, which is
not present in all computations. Still, there are a good number of
applications for which this solution is cost-effective.
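The trade-off described above can be illustrated with a back-of-the-envelope cost model. The sketch below is not from the talk: the function name, its parameters, and the example numbers are all hypothetical, and it assumes that transfer time is simply bus latency plus bytes moved divided by bus bandwidth.

```python
def offload_pays_off(bytes_moved, cpu_seconds, gpu_seconds,
                     bus_bandwidth_bytes_per_s, bus_latency_s):
    """Return True if running on the attached accelerator is faster overall.

    Hypothetical model: total accelerator time is bus latency
    plus data transfer time plus accelerator compute time.
    """
    transfer_seconds = bus_latency_s + bytes_moved / bus_bandwidth_bytes_per_s
    return transfer_seconds + gpu_seconds < cpu_seconds


# Illustrative numbers only (assumed, not measured).
# Large, compute-heavy kernel: the transfer cost is amortized.
big = offload_pays_off(bytes_moved=1e9, cpu_seconds=10.0, gpu_seconds=0.5,
                       bus_bandwidth_bytes_per_s=4e9, bus_latency_s=1e-5)

# Small kernel: moving the data over the bus dominates.
small = offload_pays_off(bytes_moved=1e6, cpu_seconds=1e-4, gpu_seconds=1e-5,
                         bus_bandwidth_bytes_per_s=4e9, bus_latency_s=1e-5)

print(big, small)
```

Under these assumed numbers the large kernel benefits from offloading while the small one does not, which is one way to read the remark that the solution is cost-effective only for some applications.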
     Traditionally, GPU instruction sets are not exposed beyond the
operating-system-specific drivers, and as such are neither public nor
stable across generations. To enable widespread adoption and
cross-generation compatibility, approaches such as CUDA and the
open-standards-based OpenCL have emerged.
     A recent development prescribes a tight integration of scalar
execution cores with a highly parallel accelerator via access to
shared memory. This organization is called an APU (accelerated
processing unit). The scalar execution cores provide efficient
execution for the 'serial portion' of applications, the multiplicity
of cores provides efficient speedup in situations in which a limited
amount of parallelism has been identified, and the highly parallel
accelerator provides cost-effective, low-power performance. APUs
enable a wider class of applications beyond the regular, highly
parallel computing pattern required for GPU acceleration.
     The advent of APUs introduces a new set of challenges: at each
step of the computation, the programmer (along with the system
runtime) must specify and decide the appropriate execution resource
that provides the most efficient performance. We describe challenges,
initial results, and potential research ideas to improve the
programmability and efficiency of such systems.
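As a rough illustration of the per-step decision mentioned above, the following sketch picks, for each step, the resource with the lowest estimated cost. It is a toy under stated assumptions, not the approach presented in the talk: the `schedule` function, the step names, the resource names, and the cost estimates are all hypothetical, and a real runtime would have to obtain such estimates by profiling or modeling.

```python
def schedule(steps):
    """Map each step name to the resource with the lowest estimated cost.

    `steps` maps a step name to a dict of {resource name: estimated cost}.
    All names and costs are hypothetical placeholders.
    """
    return {name: min(costs, key=costs.get) for name, costs in steps.items()}


# Made-up cost estimates (arbitrary units) for three steps of a computation.
steps = {
    "parse_input": {"scalar_core": 1.0, "multicore": 1.2, "accelerator": 5.0},
    "filter_rows": {"scalar_core": 8.0, "multicore": 2.5, "accelerator": 3.0},
    "dense_math": {"scalar_core": 40.0, "multicore": 11.0, "accelerator": 1.5},
}

plan = schedule(steps)
for step, resource in plan.items():
    print(step, "->", resource)
```

With these assumed estimates, the serial step stays on a scalar core, the step with limited parallelism goes to the multiple cores, and the highly parallel step goes to the accelerator. The hard part, which this sketch leaves out, is producing trustworthy cost estimates.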

Instituto de Computação :: Universidade Estadual de Campinas
Av. Albert Einstein, 1251 - Cidade Universitária • CEP 13083-852 • Campinas/SP - Brazil • Phone: [19] 3521-5838