Wattch + Dynamic SimpleScalar (DSS): Cycle-accurate power estimation

ABSTRACT

DSSWattch is a tool that enables users of the Dynamic SimpleScalar (DSS) toolset to obtain cycle-accurate power estimates from a detailed out-of-order simulation environment. DSSWattch is an adaptation of Wattch, which was originally written for SimpleScalar on the PISA and Alpha architectures, to DSS on the PowerPC architecture.

BACKGROUND AND APPROACHES

Heat and power dissipation are rapidly becoming key constraints in microarchitecture design, so it has become vital for architects to explore the power implications of microarchitectural design decisions. Conventional power estimation tools not only provide information late in the design process but are also slow (n.a, 2006). The objective of DSSWattch and its predecessor, Wattch, is to offer fast, accessible, and reasonably accurate tools for exploring power consumption during the design process (Jochen, 2008).

DSSWattch is an adaptation of Wattch to Dynamic SimpleScalar (DSS), a newer version of SimpleScalar. DSS adds dynamic features that let it simulate programs such as the Jikes RVM Java virtual machine, which perform dynamic compilation and memory mapping or which require dynamic linking (Jochen, 2008; Robert, 2006). Additionally, DSSWattch extends the functionality of Wattch with improved register file modeling, floating-point capability, and support for the mixed data-width 32-bit PowerPC architecture (Hennessy & Patterson, 2012).

The main changes between Wattch and DSSWattch fall into four categories: operand harvesting, register file modeling, handling of floating-point operands together with population count enhancements, and differentiation of integer, floating-point, and address data-widths (Leiterman, 2010).

  1. Operand harvesting

Wattch assumed that every instruction takes two arguments and generates one result. In addition, memory instructions had the option of updating their base address register. This simplification is not suited to the PowerPC instruction set architecture (ISA), where instructions can take more than two operands, can read special-purpose registers as input, and can update special-purpose registers as a side effect (Barry & Crowley, 2012). To support this ISA, DSSWattch uses a flexible operand harvesting method that stores all inputs and outputs, together with their register identifiers, in the RUU station associated with each in-flight instruction. The arrays that store the identifiers for inputs and outputs are called watch_ideps and watch_odeps, respectively. The actual register values are placed in the watch_inputs and watch_outputs arrays.
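The harvesting scheme described above can be pictured with a minimal C sketch. The array names follow the text; the struct name, the counters, the capacities MAX_IDEPS and MAX_ODEPS, and the two helper functions are assumptions made for illustration and are not taken from the DSSWattch source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed capacities for illustration; not taken from the DSSWattch source. */
#define MAX_IDEPS 4
#define MAX_ODEPS 4

/* Hypothetical stand-in for the operand fields of one RUU station. */
struct ruu_station_sketch {
    int      num_ideps;                 /* inputs harvested so far     */
    int      num_odeps;                 /* outputs harvested so far    */
    int      watch_ideps[MAX_IDEPS];    /* input register identifiers  */
    int      watch_odeps[MAX_ODEPS];    /* output register identifiers */
    uint64_t watch_inputs[MAX_IDEPS];   /* input register values       */
    uint64_t watch_outputs[MAX_ODEPS];  /* output register values      */
};

/* Record one source register and the value read from it. */
void harvest_input(struct ruu_station_sketch *rs, int reg_id, uint64_t value)
{
    assert(rs->num_ideps < MAX_IDEPS);
    rs->watch_ideps[rs->num_ideps]  = reg_id;
    rs->watch_inputs[rs->num_ideps] = value;
    rs->num_ideps++;
}

/* Record one destination register and the value written to it. */
void harvest_output(struct ruu_station_sketch *rs, int reg_id, uint64_t value)
{
    assert(rs->num_odeps < MAX_ODEPS);
    rs->watch_odeps[rs->num_odeps]   = reg_id;
    rs->watch_outputs[rs->num_odeps] = value;
    rs->num_odeps++;
}
```

Because each station keeps its own counts, an instruction with three inputs and two outputs (for example, one that also touches a special-purpose register) is recorded in the same way as a simple two-input, one-output instruction.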

  2. Register File Modeling

Wattch was originally designed to model a single integer register file with a uniform word length. This model does not match the 32-bit PowerPC ISA, which calls for a 32-bit integer register file and a 64-bit floating-point register file. To accommodate this layout, DSSWattch was extended to support any number of register files, which are described in the files dsswattch_regfile.h and dsswattch_regfile.c. The register files are stored in the watch_regfiles array. During power calculations, each register file is examined and an independent power estimate is computed for it; the estimates are then added together to give the total register file power use.
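The per-file estimate and the final summation can be sketched as follows. This is only an illustration: the struct layout, the field names, and the cost formula (accesses times width times a per-bit cost) are placeholders assumed here, not the actual DSSWattch power model.

```c
#include <stddef.h>

/* Hypothetical description of one register file. */
struct regfile_sketch {
    const char   *name;
    int           width_bits;       /* word length of this file              */
    unsigned long accesses;         /* accesses counted by the simulator     */
    double        cost_per_bit;     /* placeholder per-bit cost, not a real
                                       DSSWattch parameter                   */
};

/* Independent estimate for a single register file (placeholder formula). */
double regfile_estimate(const struct regfile_sketch *rf)
{
    return (double)rf->accesses * (double)rf->width_bits * rf->cost_per_bit;
}

/* Total register file estimate: the sum of the per-file estimates. */
double total_regfile_estimate(const struct regfile_sketch *files, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += regfile_estimate(&files[i]);
    return total;
}
```

Keeping the estimate per file means that a 32-bit integer file and a 64-bit floating-point file are each costed at their own width rather than at a single uniform word length.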

  3. Handling of Floating-Point Operands and Population Counting

Wattch can estimate dynamic bus power usage by performing population counts on the data sent over the buses in the power model in order to generate a dynamic activity factor (AF). Population counting means counting the number of one-bits in a variable. Wattch was originally designed to work with integer operands, and its dynamic AF could not differentiate between floating-point and integer registers. This functionality was added to DSSWattch using the new operand harvesting data.
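A small C sketch makes the population count concrete. The function names are hypothetical, and the activity factor shown here (the fraction of bus lines that carry a one) is one plausible definition assumed for illustration; the exact formula used by Wattch and DSSWattch may differ. Floating-point operands are handled by counting the bits of their 64-bit representation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Count the one-bits in a 64-bit value (clears the lowest set bit each pass). */
int pop_count64(uint64_t v)
{
    int n = 0;
    while (v != 0) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Reinterpret a double as its raw 64-bit pattern so it can be pop-counted. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Assumed definition: fraction of the bus lines (width_bits wide) set to one. */
double activity_factor(uint64_t value, int width_bits)
{
    assert(width_bits > 0 && width_bits <= 64);
    uint64_t mask = (width_bits == 64) ? ~(uint64_t)0
                                       : (((uint64_t)1 << width_bits) - 1);
    return (double)pop_count64(value & mask) / (double)width_bits;
}
```

With the operand kind known from the harvested data, an integer operand would be evaluated against a 32-bit bus and a floating-point operand against a 64-bit bus.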

  4. Differentiation of Data-Widths

The 32-bit PowerPC uses 32-bit addressing and performs 32-bit integer and 64-bit floating-point operations. Wattch, however, was designed for a uniform 64-bit architecture. Therefore, Wattch’s single constant data_width was split into int_data_width, fp_data_width, address_width, and a vestigial data_width that reflects the default data width for various units, for instance the RUU.
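The split of the single width constant might look like the sketch below. The three specific widths come from the text; the value 64 for the vestigial data_width, the operand_kind enumeration, and both helper functions are assumptions introduced only for illustration.

```c
/* Hypothetical operand classification used to pick a width. */
enum operand_kind { OPERAND_INT, OPERAND_FP, OPERAND_ADDR, OPERAND_OTHER };

static const int int_data_width = 32;  /* 32-bit integer operations        */
static const int fp_data_width  = 64;  /* 64-bit floating-point operations */
static const int address_width  = 32;  /* 32-bit addressing                */
static const int data_width     = 64;  /* vestigial default; the value 64
                                          is an assumption                 */

/* Select the bus width that applies to a given kind of operand. */
int width_for(enum operand_kind kind)
{
    switch (kind) {
    case OPERAND_INT:  return int_data_width;
    case OPERAND_FP:   return fp_data_width;
    case OPERAND_ADDR: return address_width;
    default:           return data_width;
    }
}

/* Number of bits moved by a given number of accesses of one operand kind. */
unsigned long bits_transferred(enum operand_kind kind, unsigned long accesses)
{
    return accesses * (unsigned long)width_for(kind);
}
```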

RESULTS

Using DSSWattch

DSSWattch can be built independently of the other simulators in the DSS suite by building the sim-wattch target or by defining BUILD_WATTCH when compiling. This generates a binary called sim-wattch, which is a copy of sim-outorder with DSSWattch power modeling enabled.

Because the large switch statement in sim-outorder.c’s ruu_dispatch function is made even larger by DSSWattch, compilation fails in older GCC versions when optimizations are enabled. This problem can be overcome by disabling optimizations or by using a newer version of GCC. In addition, generation of the dynamic activity factor can be disabled at compile time by defining STATIC_AF instead of DYNAMIC_AF in the power.h file. Using a static activity factor greatly reduces simulation time, but it also decreases the accuracy of the power statistics that depend on activity factors.
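The effect of choosing between the two macros can be sketched with ordinary conditional compilation. Only the macro names STATIC_AF and DYNAMIC_AF come from the text; the function, the placeholder constant ASSUMED_STATIC_AF, and the fallback to DYNAMIC_AF when neither macro is defined are assumptions made for this example.

```c
#include <assert.h>

/* For this sketch only: fall back to the dynamic factor if nothing is set. */
#if !defined(STATIC_AF) && !defined(DYNAMIC_AF)
#define DYNAMIC_AF
#endif

/* Placeholder constant; not the value used by DSSWattch. */
#define ASSUMED_STATIC_AF 0.5

/* Activity factor for a bus given the ones observed and the bits sent. */
double bus_activity_factor(unsigned long ones_seen, unsigned long bits_sent)
{
#ifdef STATIC_AF
    /* Static mode: no per-cycle counting is needed, so simulation is faster. */
    (void)ones_seen;
    (void)bits_sent;
    return ASSUMED_STATIC_AF;
#else
    /* Dynamic mode: derive the factor from the observed population counts. */
    assert(ones_seen <= bits_sent);
    if (bits_sent == 0)
        return 0.0;
    return (double)ones_seen / (double)bits_sent;
#endif
}
```

In static mode the population counts are never consulted, which is where the time saving comes from, and also why statistics that depend on the activity factor lose accuracy.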

Running DSSWattch

sim-wattch is run in the same manner as sim-outorder. Before the simulation starts, sim-wattch outputs information about the power model; when the simulation has finished, it prints the results of the power simulation after the DSS simulation statistics (Byna, 2006).

Making changes to DSSWattch

DSSWattch is easy to extend and customize for special simulation situations. In addition to the files needed by sim-outorder, DSSWattch comprises these files:

sim-outorder.c: DSS’s out-of-order simulator. It was modified to generate access counts and population counts for all elements in DSSWattch’s power model.

dsswattch.h: contains the code that interfaces with sim-outorder.

dsswattch_regfile.{h,c}: these files make up DSSWattch’s new register file abstraction.

power.{h,c}: these files contain DSSWattch’s power model and all the code used to compute power estimates from the information gathered in sim-outorder.

cacti/: the files in the cacti directory help with power modeling of various structures. Cacti is an independent project that was started at DEC and is presently maintained by Hewlett-Packard. A copy of Cacti was adopted and is used in DSSWattch.

Adding register files

Register files can be added to DSSWattch by changing NUM_REGFILES in dsswattch_regfile.h and updating the watch_init_regfiles() function in dsswattch_regfile.c to set the new register file’s configurable parameters. In addition, sim-outorder.c must be modified to maintain per-cycle access and population counts for the new register file. Once these changes are in place, DSSWattch automatically includes the new register file in the power model and calculates its per-cycle statistics.
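The three steps above can be sketched in C as follows. NUM_REGFILES, watch_regfiles, and watch_init_regfiles are named in the text; the struct layout, the third "extra" register file, and the two counting helpers are hypothetical and stand in for whatever the real source defines. The 32-entry sizes for the integer and floating-point files reflect the PowerPC general-purpose and floating-point register counts.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGFILES 3   /* step 1: raised from 2 to make room for a new file */

/* Hypothetical register file record; the real layout may differ. */
struct regfile_sketch {
    const char   *name;
    int           num_regs;
    int           width_bits;
    unsigned long accesses;    /* accesses in the current cycle       */
    unsigned long pop_count;   /* one-bits moved in the current cycle */
};

struct regfile_sketch watch_regfiles[NUM_REGFILES];

/* Step 2: describe every register file, including the new one. */
void watch_init_regfiles(void)
{
    watch_regfiles[0] = (struct regfile_sketch){ "integer",        32, 32, 0, 0 };
    watch_regfiles[1] = (struct regfile_sketch){ "floating-point", 32, 64, 0, 0 };
    watch_regfiles[2] = (struct regfile_sketch){ "extra",          32, 64, 0, 0 }; /* hypothetical */
}

/* Step 3: what the simulator would call on each access to a register file. */
void record_regfile_access(int idx, uint64_t value)
{
    assert(idx >= 0 && idx < NUM_REGFILES);
    watch_regfiles[idx].accesses++;
    for (; value != 0; value &= value - 1)   /* one pass per set bit */
        watch_regfiles[idx].pop_count++;
}

/* Clear the per-cycle counters at the start of each simulated cycle. */
void reset_cycle_counts(void)
{
    for (int i = 0; i < NUM_REGFILES; i++) {
        watch_regfiles[i].accesses  = 0;
        watch_regfiles[i].pop_count = 0;
    }
}
```

Any code that loops over all NUM_REGFILES entries, such as a per-file power summation, picks up the new file without further changes.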

Changing Bit-Widths

Register-file widths are defined as fp_data_width and int_data_width in the dsswattch_regfile.c file. The data_width, addr_width, opcode_length, and instr_length constants are defined in the power.c file and can be updated to reflect new architectures.

CONCLUSION

The functional unit power model is based on power consumption estimates. Accesses can be made to the integer unit or the floating-point unit, and each has a constant power cost. Therefore, all floating-point operations use the same amount of energy, and all integer operations use the same amount of energy. This simple model generates a reasonable average power-consumption estimate and would be easy to improve should more detailed power numbers from the functional units be needed. Power used by special-purpose registers is not simulated by DSSWattch. The overall error from this decision is minimal, because the power consumed by special-purpose registers is very small compared with the power consumed by the PowerPC processor as a whole (Dinan & Moss, 2004).
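The constant-cost functional unit model amounts to a weighted count of accesses, as in this sketch. The cost values are placeholders chosen for the example and the names are hypothetical; none of them come from the DSSWattch source.

```c
/* Placeholder per-access costs in arbitrary units; not DSSWattch values. */
#define ASSUMED_INT_UNIT_COST 1.0
#define ASSUMED_FP_UNIT_COST  2.0

/* Every integer access costs the same, and every floating-point access
   costs the same, regardless of the operation or its operands. */
double functional_unit_estimate(unsigned long int_accesses,
                                unsigned long fp_accesses)
{
    return (double)int_accesses * ASSUMED_INT_UNIT_COST
         + (double)fp_accesses  * ASSUMED_FP_UNIT_COST;
}
```

A more detailed model would replace the two constants with a cost that depends on the operation type, which is the kind of refinement the paragraph above describes as easy to add.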

References

  1. Barry, P. & Crowley, P. (2012). Modern embedded computing: Designing connected, pervasive, media-rich systems. Waltham: Elsevier.

  2. Byna, S. (2006). Server-based data push architecture for data access performance optimization. New York: Illinois Institute of Technology.

  3. Dinan, J. & Moss, E. (2004). DSSWattch: Power estimation in Dynamic SimpleScalar. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.8700

  4. Hennessy, J. & Patterson, D. (2012). Computer architecture: A quantitative approach. Waltham: Elsevier.

  5. Jochen, M. (2008). Mobile code integrity through static program analysis, steganography and dynamic transformation control. London: ProQuest.

  6. Leiterman, J. (2010). 32/64-Bit 80X86 Assembly language architecture. Plano: Jones & Bartlett Publishers.

  7. n.a, (2006). International symposium on memory management. London: ACM Press.

  8. Robert, Y. (2006). High performance computing - HiPC 2006: 13th international conference, Bangalore, India, December 18-21, 2006, proceedings. London: Springer Science & Business Media.