Opticks replaces Geant4 optical photon simulation with an equivalent implementation that benefits from state-of-the-art GPU ray tracing from NVIDIA OptiX, which can yield performance >1000x Geant4.
All optically relevant aspects of Geant4 context are translated+copied to GPU:
- geometry : solids, structure, material+surface properties
- generation : Cerenkov + Scintillation (using Gensteps from Geant4)
- propagation : Rayleigh scattering, absorption, reemission, boundary
COMPLETED : Full Simulation re-implementation for OptiX 7 API
- NEXT : A-B Validation iteration, geometry issue fixes
Integrate NVIDIA OptiX with Geant4
SEvt NumPy arrays : Extreme Detail
: a.genstep : (1, 6, 4) : generation param : a.seed : (10000,) : photon -> genstep : a.inphoton: (10000, 4, 4) : input photon : a.photon : (10000, 4, 4) : final photon param : a.hit : (9633, 4, 4) : hitmasked photon : a.seq : (10000, 2) : history sequence : a.record :(10000, 10, 4, 4) : steppoint record : a.rec :(10000, 10, 2, 4) : compressed record : a.domain : (2, 4, 4) : used for compression : a.sframe : (4, 4, 4) : used for targetting : a.flat : (10000, 64) : random numbers tagged : a.tag : (10000, 4) : compressed tag enum : a.prd :(10000, 10, 2, 4) : per-ray-data isect
Every random consumed by every photon step
A : Opticks : GPU geometry + optical sim.
B : Geant4 simulation
A-B : comparison via python/NumPy analysis
March : Shifted focus, geom issues => optical sim, July : A-B iteration starting
A : Opticks GPU sim, B : Geant4 CPU sim (both fully instrumented: point recording, histories, tagged random consumption)
pkg | hh/cc/cu/py | Old CUDA Simulation and Visualization |
---|---|---|
cudarap | 16/8/3/0 | old CUDA interface |
thrustrap | 20/2/6/1 | old CUDA thrust interface : photon seeding, indexing |
optixrap | 43/30/3/2 | old optical photon simulation implemented in old OptiX API |
okop | 14/10/1/0 | high level pkg on top of optixrap |
oglrap | 38/29/0/0 | old OpenGL based visualization of opticksgeo geometry |
opticksgl | 9/5/0/0 | integrating OpenGL with okop OptiX |
pkg | hh/cc/cu/py | New Geometry model and CUDA simulation |
---|---|---|
CSG_GGeo | 3/2/0/0 | GGeo to CSG geometry translation |
GeoChain | 3/2/0/0 | geometry translation testing |
CSG | 37/18/0/7 | New geometry model |
CSGOptiX | 20/14/4/2 | CSG intersection with OptiX 7 |
qudarap | 43/25/13/1 | CUDA optical photon simulation, CUDA upload, download, textures |
u4 | 32/10/0/2 | New Geant4 interface, genstep collection, U4Recorder |
gdxml | 6/4/0/0 | GDML loaded as XML for fixups |
g4cx | 3/2/0/0 | New top level package integrating Geant4 and CSGOptiX |
green : active development, blue : plan to remove, red : removed
Old simulation (OptiXRap) | New simulation (QUDARap/qsim.h + CSGOptiX) |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Goals of re-implementation : flexible, modular GPU simulation, easily testable, less code
CSGOptiX built upon:
Geometry + Simulation brought together here
void simulate(const uint3& idx, const uint3& dim) { // skip some setup curandState rng = sim->rngstate[idx] ; sim->generate_photon(ctx.p, rng, gs, idx, gs_idx ); int command = START ; int bounce = 0 ; ctx.point(bounce); while( bounce < evt->max_bounce ) { trace( params.handle, ctx.p.pos, ctx.p.mom, params.tmin, params.tmax, prd); if( prd->boundary() == 0xffffu ) break ; ctx.trace(bounce); // record PerRayData intersect command = sim->propagate(bounce, rng, ctx); bounce++; ctx.point(bounce) ; if(command == BREAK) break ; } ctx.end(); // write seq, tag, flat evt->photon[idx] = ctx.p ; // final photon }
https://bitbucket.org/simoncblyth/opticks/src/master/CSGOptiX/CSGOptiX7.cu
Strong Code Reduction Potential
Several pkgs only used by geo translation, not sim
Longer term: revisit geometry translation + viz
Same translation in less code:
pkg | hh/cc/cu/py | Base |
---|---|---|
okconf | 3/2/0/0 | config version detection |
sysrap | 150/70/1/8 | basis types, new array NP.hh |
boostrap | 46/42/0/0 | boost tools |
npy | 181/165/0/6 | geo primitives, old array NPY.hpp |
optickscore | 70/62/0/1 | old core, argument parsing |
pkg | hh/cc/cu/py | Geometry |
---|---|---|
ggeo | 68/65/0/2 | complete geometry model : no Geant4 dependency |
extg4 | 62/53/0/0 | Geant4 geometry translation into GGeo model |
opticksgeo | 10/6/0/0 | high level pkg on top of ggeo |
cfg4 | 126/117/0/3 | Old Geant4 comparison machinery, eg CRecorder |
green : active development, blue : plan to remove, red : removed
sysrap/stag.h enum
enum {
stag_undef = 0,
stag_to_sci = 1,
stag_to_bnd = 2,
stag_to_sca = 3,
stag_to_abs = 4,
stag_at_burn_sf_sd = 5,
stag_at_ref = 6,
stag_sf_burn = 7,
stag_sc = 8,
stag_to_ree = 9,
stag_re_wl = 10,
stag_re_mom_ph = 11,
stag_re_mom_ct = 12,
stag_re_pol_ph = 13,
stag_re_pol_ct = 14,
stag_hp_ph = 15
//stag_hp_ct = 16
}; // 4 bit tags pack better
qudarap/qsim.h:
459 inline QSIM_METHOD int qsim::propagate_to_boundary(unsigned& flag, curandStateXORWOW& rng, sctx& ctx) 460 { 462 const sstate& s = ctx.s ; 464 const float& absorption_length = s.material1.y ; 465 const float& scattering_length = s.material1.z ; 471 #ifdef DEBUG_TAG 472 float u_to_sci = curand_uniform(&rng) ; // burn to align 473 float u_to_bnd = curand_uniform(&rng) ; // burn to align 474 #endif 475 float u_scattering = curand_uniform(&rng) ; 476 float u_absorption = curand_uniform(&rng) ; 478 #ifdef DEBUG_TAG 479 stagr& tagr = ctx.tagr ; 480 tagr.add( stag_to_sci, u_to_sci); 481 tagr.add( stag_to_bnd, u_to_bnd); 482 tagr.add( stag_to_sca, u_scattering); 483 tagr.add( stag_to_abs, u_absorption); 491 #endif 489 float scattering_distance = -scattering_length*logf(u_scattering); 490 float absorption_distance = -absorption_length*logf(u_absorption); ...
sysrap/stag.h/stagr collects into tag.npy flat.npy : A-side straightforward as already control all code
u4/U4Stack.h enum
enum { U4Stack_Unclassified = 0, U4Stack_RestDiscreteReset = 1, U4Stack_DiscreteReset = 2, U4Stack_ScintDiscreteReset = 3, U4Stack_BoundaryDiscreteReset = 4, U4Stack_RayleighDiscreteReset = 5, U4Stack_AbsorptionDiscreteReset = 6, U4Stack_BoundaryBurn_SurfaceReflectTransmitAbsorb = 7, U4Stack_BoundaryDiDiTransCoeff = 8, U4Stack_AbsorptionEffDetect = 9, U4Stack_RayleighScatter = 10, U4Stack_BoundaryDiMeReflectivity = 11, U4Stack_ChooseReflection = 12, U4Stack_RandomDirection = 13, U4Stack_LambertianRand = 14, U4Stack_Reemission = 15 // 4-bit better }; u4/InstrumentedG4OpBoundaryProcess.hh u4/InstrumentedG4OpBoundaryProcess.cc u4/tests/ShimG4OpRayleigh.hh u4/tests/ShimG4OpRayleigh.cc u4/tests/ShimG4OpAbsorption.hh u4/tests/ShimG4OpAbsorption.cc u4/tests/DsG4Scintillation.h u4/tests/DsG4Scintillation.cc
u4/tests/ShimG4OpAbsorption.cc:
32 void ShimG4OpAbsorption::ResetNumberOfInteractionLengthLeft()
33 {
35 G4double u = G4UniformRand() ;
36 SEvt::AddTag( U4Stack_AbsorptionDiscreteReset, u );
45 theNumberOfInteractionLengthLeft = -1.*G4Log(u) ;
49 }
882 void SEvt::addTag(unsigned tag, float flat){
883 {
885 stagr& tagr = current_ctx.tagr ;
899 tagr.add(tag,flat) ;
929 }
sysrap/SEvt.cc:SEvt::AddTag using same stagr struct as A
u4/U4Process::ClearNumberOfInteractionLengthLeft
Aligned : same u, same purpose
epsilon:~ blyth$ gx /Users/blyth/opticks/g4cx epsilon:g4cx blyth$ ./ab.sh In [1]: AB(54) Out[1]: A(54) : TO BT BR BR BT SA B(54) : TO BT BR BR BT SA A.t : (1000, 64) B.t : (1000, 64) A.t2 : (1000, 64) B.t2 : (1000, 64) 0 : 0.7082 : 1 : to_sci 0 : 0.7082 : 3 : ScintDiscreteReset : 1 : 0.0797 : 2 : to_bnd 1 : 0.0797 : 4 : BoundaryDiscreteReset : 2 : 0.1970 : 3 : to_sca 2 : 0.1970 : 5 : RayleighDiscreteReset : 3 : 0.4009 : 4 : to_abs 3 : 0.4009 : 6 : AbsorptionDiscreteReset : 4 : 0.3778 : 5 : at_burn_sf_sd 4 : 0.3778 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 5 : 0.7441 : 6 : at_ref 5 : 0.7441 : 8 : BoundaryDiDiTransCoeff : ... 32 : 0.2452 : 1 : to_sci 32 : 0.2452 : 3 : ScintDiscreteReset : 33 : 0.5043 : 2 : to_bnd 33 : 0.5043 : 4 : BoundaryDiscreteReset : 34 : 0.1788 : 3 : to_sca 34 : 0.1788 : 5 : RayleighDiscreteReset : 35 : 0.8004 : 4 : to_abs 35 : 0.8004 : 6 : AbsorptionDiscreteReset : 36 : 0.3335 : 5 : at_burn_sf_sd 36 : 0.3335 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 37 : 0.7170 : 7 : sf_burn 37 : 0.7170 : 9 : AbsorptionEffDetect : 38 : 0.0000 : 0 : undef 38 : 0.0000 : 0 : Unclassified : A : /tmp/blyth/opticks/RaindropRockAirWaterSmall/G4CXSimulateTest/ALL B : /tmp/blyth/opticks/RaindropRockAirWaterSmall/U4RecorderTest/ShimG4OpAbsorption_FLOAT_ShimG4OpRayleigh_FLOAT/ALL In [2]: sm = np.all( a.seq[:,0] == b.seq[:,0] ) : True ww = np.where( a.seq[:,0] != b.seq[:,0] )[0] : [] History alignment : can be accidental we = np.where( A.t.view('|S64') != B.t2.view('|S64') )[0] : [] Tag enum alignment : much more stringent match sequences of 64 tags by viewing as 64 char string
epsilon:~ blyth$ ~/opticks/g4cx/ab.sh sm = np.all( a.seq[:,0] == b.seq[:,0] ) : True ## all 1M histories match we = np.where( A.t.view('|S64') != B.t2.view('|S64') )[0] : [ 41595 114799 125032 158475 174993 243023 ... ] In [2]: seqhis_(a.seq[we,0]) ## 22/1M not enum aligned : all with truncated record Out[2]: ['TO BT SC BR BR BR BR BR BR BR', 'TO BT SC BR BR BR BR BR BR BR', ... In [4]: wm = np.where( A.t.view('|S64') == B.t2.view('|S64') )[0] ; wm, len(wm) ## select the A-B enum aligned Out[4]: array([ 0, 1, 2, 3, ..., 999995, 999996, 999997, 999998, 999999]), 999978 In [29]: np.abs(a.record[wm] - b.record[wm]).max() ## max deviation : 0.018 mm Out[29]: 0.018722534 In [30]: np.where( np.abs(a.record[wm] - b.record[wm]) > 0.01 ) ## select photon step points with deviation > 0.01 Out[30]: (array([ 18157, 125121, 467717, 499537, 624529, 695184, 759091, 779861, 938053]), ## photon index array([1, 1, 3, 1, 1, 2, 4, 1, 1]), ## step point index array([0, 0, 0, 0, 0, 0, 0, 0, 0]), ## first photon row is position, time array([2, 2, 2, 2, 2, 0, 2, 2, 2])) ## 2:Z 0:X coordinate In [31]: dv = wm[np.unique(np.where( np.abs(a.record[wm] - b.record[wm]) > 0.01 )[0])] ## deviant indices In [36]: seqhis_(a.seq[dv,0]) ## deviant step point mostly AB/SC position Out[36]: ['TO AB','TO AB','TO BT BT AB','TO AB','TO AB','TO BR AB','TO SC BT BT SA','TO AB','TO AB']
In [4]: AB(we[0]) ## all 22/1M loose alignment at 1st BR after SC Out[4]: A(41595) : TO BT SC BR BR BR BR BR BR BR B(41595) : TO BT SC BR BR BR BR BR BR BR 0 : 0.6518 : 1 : to_sci 0 : 0.6518 : 3 : ScintDiscreteReset : 1 : 0.3244 : 2 : to_bnd 1 : 0.3244 : 4 : BoundaryDiscreteReset : 2 : 0.2309 : 3 : to_sca 2 : 0.2309 : 5 : RayleighDiscreteReset : 3 : 0.7327 : 4 : to_abs 3 : 0.7327 : 6 : AbsorptionDiscreteReset : 4 : 0.1133 : 5 : at_burn_sf_sd 4 : 0.1133 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 5 : 0.6275 : 6 : at_ref 5 : 0.6275 : 8 : BoundaryDiDiTransCoeff : 6 : 0.4789 : 1 : to_sci 6 : 0.4789 : 3 : ScintDiscreteReset : 7 : 0.6990 : 2 : to_bnd 7 : 0.6990 : 4 : BoundaryDiscreteReset : 8 : 0.9998 : 3 : to_sca 8 : 0.9998 : 5 : RayleighDiscreteReset : 9 : 0.0067 : 4 : to_abs 9 : 0.0067 : 6 : AbsorptionDiscreteReset : 10 : 0.1797 : 8 : sc 10 : 0.1797 : 10 : RayleighScatter : 11 : 0.0088 : 8 : sc 11 : 0.0088 : 10 : RayleighScatter : 12 : 0.5316 : 8 : sc 12 : 0.5316 : 10 : RayleighScatter : 13 : 0.8436 : 8 : sc 13 : 0.8436 : 10 : RayleighScatter : 14 : 0.4477 : 8 : sc 14 : 0.4477 : 10 : RayleighScatter : 15 : 0.4004 : 1 : to_sci 15 : 0.4004 : 3 : ScintDiscreteReset : 16 : 0.8328 : 2 : to_bnd 16 : 0.8328 : 4 : BoundaryDiscreteReset : 17 : 0.0016 : 3 : to_sca 17 : 0.0016 : 5 : RayleighDiscreteReset : 18 : 0.2059 : 4 : to_abs 18 : 0.2059 : 6 : AbsorptionDiscreteReset : 19 : 0.5832 : 5 : at_burn_sf_sd 19 : 0.5832 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 20 : 0.7585 : 6 : at_ref <----- ALIGNMENT LOST HERE : DO NOT GET EXPECTED TAG : BoundaryDiDiTransCoeff 20 : 0.7585 : 3 : ScintDiscreteReset : 21 : 0.6396 : 1 : to_sci 21 : 0.6396 : 4 : BoundaryDiscreteReset : 22 : 0.9837 : 2 : to_bnd 22 : 0.9837 : 5 : RayleighDiscreteReset :
SC/AB distance = -length*logf(u)
SC/AB lengths often large > 1e6 mm
small logf(u) deviations scaled up
=> SC/AB position difference
SC+AB compete on distance with reaching boundary
=> when SC/AB win : u > 0.998
-log(u) u > 0.998 -log(1-x) ~= x x < 0.002
"Raindrop" geometry : 50mm radius sphere of water in air
Previously assumed float/double difference between Opticks/Geant4
sysrap/tests/logTest.cu : CUDA comparison
Deviations small,
Using KLUDGE_FASTMATH_LOGF reduces SC/AB position deviation
##define KLUDGE_FASTMATH_LOGF(u) (u < 0.998f ? __logf(u) : __logf(u) - 0.46735790f*1e-7f )
Full geom, but target single PMT
Input photons simplify random consumption
Configure geometry and input photons:
geom_ oip
Run A and B simulations and "T" simtrace running:
cd ~/opticks/g4cx ./gxs.sh ## A ./gxt.sh ## T collect "simtrace" arrays of intersect cd ~/opticks/u4 ./u4s.sh ## B
gxt.sh uses "simtrace" arrays to make 2D slice plots (pyvista, matplotlib)
cd ~/opticks/g4cx ./gxt.sh ana ./ab.sh # compare A and B with NumPy
"simtrace" : scatter plot of intersects, looks like missed ellipsoid transform
gx ; EYE=0,1000,200 LOOK=0,0,200 ./ab.sh recplot
single PMT test : no-such missed ellipsoid transform issue
Full GDML geometry tests + Standalone tests
First step to investigate issues:
Aligned A-B validation iteration
Very powerful tool for finding geometry issues
Non-aligned A-B iteration : statistical comparison
Update JUNO-Opticks integration to use new Opticks
JUNO Simulation -> Opticks -> GPU | |
Benefits: |
|
Status: |
|
Next Steps Summary
Changed gears:
Many Thanks to : Yuxiang Hu
https://bitbucket.org/simoncblyth/opticks | code repository |
https://simoncblyth.bitbucket.io | presentations and videos |