By Fabiha Hannan on 11 February 2014

Start time: around 12:10pm

In attendance: Prof Spjut, Fabiha, Erik, Paul, Akhil, Sami, Donghyeon

__

Paul:

-Paul thinks he is oversimplifying his project

-Testing with more instructions is the ideal

-Instructions should forwards branch as much as possible; if there's a backwards branch, there need to be some code to make it forwards branch again

-Getting raspberry pi to work is the hard part, so he is fine; getting to that ASAP would be good

-Prof Harris' implementation does have a bug; testing should catch it (Paul should test Prof Harris' thing with his version)

__

Sami:

-Sami was not able to run GPGPU-Sim

-It is working; Akhil says it is working

-The default is .bashrc

__

Fabiha:

-One thing to note is that there are no reservation fails recorded in the output file we are currently looking out

-The miss rates for these simulations are very low; we may want to decrease the cache size to increase the miss rate (argue it's good in the paper for some reason, such as saving space [though it saves very little])

-Ideally, we would want fewer instruction caches (vs smaller ones) and figure out how to share them

-We want the static number of instructions

-NOT the dynamic number of instructions (ex. the accesses + misses + reservation fails)

-We need a way to record this

-We also need a way to figure out when the banks want to access the same instruction (this has to be included with a print command in the source)

-"inputfile.readlines()[x:y]" This Python function will allow only certain lines (x through y) to be read and this can be a variable inputted as an argument to the parser function

-What I need to do for next week: table with list of benchmarks with number of instructions per benchmark (ideally we want sass and not ptx)

-Ptx: like llvm in that it is a high level assembly, but it is symbolic assembly

-Sass: like mips or ARM, actual instructions that get executed

-Compiling using nvcc compiles it down to ptx and then loads and compiles into sass

-Sass is different depending on what GPU you have

-Commit to ptx, but not sass (we will probably get ptx instructions and that's fine, but sass would be ideal)

__

Donghyeon&Akhil:

-They had questions to ask about their project

-warp is 32

-simd has 16 or 32 shader cores (number of shader cores divided by 16 or 32)

-simd width * simd threads= total number of shader cores, indicates how powerful it is

-execution is outside of it

-4 CTAs (NVIDIA term for warp), indication of how many parallels your core can handle

-CTA = cooperative thread array, array of threads that executes kernel concurrently or in parallel (read more here)

-shader cores are tightly coupled (shader is the graphics term for something programmable, ex. creating vertex shaders to create illumination model but shader cores do general computation)

category: tags:


blog comments powered by Disqus