 
Huge CPU Memory+Time Expense
 
 
        
        Not a Photo, a Calculation
 
Much in common : geometry, light sources, optical physics
Many Applications of ray tracing :
https://nvidianews.nvidia.com/news/nvidia-reveals-the-titan-of-turing-titan-rtx
Offload Ray Trace to Dedicated HW
SM : Streaming Multiprocessor
BVH : Bounding Volume Hierarchy
Tree of Bounding Boxes (bbox)
OptiX Raytracing Pipeline
Analogous to OpenGL rasterization pipeline:
 
OptiX makes GPU ray tracing accessible
NVIDIA expertise:
https://developer.nvidia.com/rtx
User provides (Yellow):
[1] Turing+ GPUs eg NVIDIA TITAN RTX
GPU Resident Photons
Thrust: high level C++ access to CUDA
 
OptiX : single-ray programming model -> line-by-line translation
 
Sphere, Cylinder, Disc, Cone, Convex Polyhedron, Hyperboloid, Torus, ...
Outside/Inside Unions
dot(normal,rayDir) -> Enter/Exit
 
 
Complete Binary Tree, pick between pairs of nearest intersects:
| UNION tA < tB | Enter B | Exit B | Miss B | 
|---|---|---|---|
| Enter A | ReturnA | LoopA | ReturnA | 
| Exit A | ReturnA | ReturnB | ReturnA | 
| Miss A | ReturnB | ReturnB | ReturnMiss | 
Materials/Surfaces -> GPU Texture
Material/Surface/Scintillator properties
Material/surface boundary : 4 indices
Primitives labelled with unique boundary index
simple/fast properties + reemission wavelength
G4 Structure Tree -> Instance+Global Arrays -> OptiX
Group structure into repeated instances + global remainder:
instancing -> huge memory savings for JUNO PMTs
Why Avoid Torus ?
Double heavy quartic root finding
Alternative Approaches To Cope With Torus
tut_detsim.py options (in SVN trunk r3990)
"cylinder - torus" -> polycone
Opticks Build Generalized
Builds against CMAKE_PREFIX_PATH
opticks-config (pkg-config .pc file based)
releases on github : latest v0.0.0-rc3
https://github.com/simoncblyth/opticks/releases/tag/v0.0.0-rc3
Add Opticks to an existing installation (SVN trunk r3990+):
bash junoenv libs all opticks
Four mandatory envvars :
export CMTEXTRATAGS=opticks # -> Opticks ON in JunoENV + CMT export OPTICKS_COMPUTE_CAPABILITY=70 # -> nvcc flags export OPTICKS_CUDA_PREFIX=/usr/local/cuda-10.1 export OPTICKS_OPTIX_PREFIX=/usr/local/OptiX_650 # location of CUDA and OptiX installs
Build all JUNO externals, including Opticks (takes ~1hr, 13G) :
jlibs(){
    mkdir -p $JUNOTOP && cd $JUNOTOP
    svn co https://juno.ihep.ac.cn/svn/offline/trunk/installation/junoenv && cd $JUNOTOP/junoenv
    local libs=$(bash junoenv libs list | perl -ne 'm, (\S*)@, && print "$1\n"' -)
    for lib in $libs ; do
        bash junoenv libs all $lib || return 1
    done
}
        
        Offline Opticks : #ifdef WITH_G4OPTICKS ... #endif + if(m_opticksMode > 0){ ... }
cd $JUNOTOP/offline/Simulation/DetSimV2 ; find . -type f -exec grep -l WITH_G4OPTICKS {} \+
./DetSimOptions/src/DetSim0Svc.cc                  ## initialize/finalize
./DetSimOptions/src/LSExpDetectorConstruction.cc   ## translate geometry to GPU
./PhysiSim/src/DsG4Scintillation.cc    ## collect Scintillation gensteps, excluding reemission
./PhysiSim/src/LocalG4Cerenkov1042.cc  ## collect Cerenkov gensteps
./PhysiSim/src/DsPhysConsOptical.cc    ## use LocalG4Cerenkov1042 rather than G4Cerenkov
./PMTSim/src/PMTSDMgr.cc         ## setup separate PMTHitMerger m_pmthitmerger_opticks
./PMTSim/src/junoSD_PMT_v2.cc    ## bulk hit creation+collection at EndOfEvent
tut_detsim.py options relevant to Opticks usage
tds : Example Bash Function :
tds-(){ python $JUNOTOP/offline/Examples/Tutorial/share/tut_detsim.py $* ; }
tds(){ tds- --opticks-mode 1 --no-guide_tube --pmt20inch-polycone-neck --evtmax 2 gun ; }
        
        Collection Effic. Culling -> GPU ?
Significant performance/memory benefit (factor ~2)
Aligns CPU/GPU random consumption
(1) Complete CPU side bulk hit creation in PMTSim: junoSD_PMT_v2::EndOfEvent_Opticks
Hits currently missing:
(2) CUDA port: junoSD_PMT_v2::get_ce
 
[1] set in NormalTrackInfo by NormalAnaMgr::UserSteppingAction [2] PMTID int distinguishes NNVT/Hamamatsu !
[3] https://github.com/DannyRuijters/CubicInterpolationCUDA "GPU Prefilter for Accurate Cubic B-spline Interpolation"
Opticks interest : Geant4 + HSF + ...
Other currently active Opticks users:
Full integration of Opticks with Geant4 : LAr TPCs, DUNE
Creation of a Geant4 extended example G4OpticksTest
Advised Hans on Opticks use, adding Wavelength Shifting
Prototype of gamma or electron transport using GPU vendor library : HL-LHC ECAL
Examine the Opticks simulation of optical photons, to identify similarities/differences ... for gamma and/or electron interactions
Agreed to act in a consulting role for this task ; email exchanges with John Apostolakis
[1] HSF : HEP Software Foundation
Next Steps
 
Opticks : state-of-the-art GPU ray traced optical simulation integrated with Geant4. Now integrated with JUNO Offline Pioneer users welcome

- Drastic speedup -> better detector understanding -> greater precision
- any simulation limited by optical photons can benefit
- more photon limited -> more overall speedup (99% -> 100x)
| https://bitbucket.org/simoncblyth/opticks | code repository | 
| https://github.com/simoncblyth/opticks/releases | .zip .tar.gz archives | 
| https://simoncblyth.bitbucket.io | presentations and videos | 
| https://groups.io/g/opticks | forum/mailing list archive | 
| email:opticks+subscribe@groups.io | subscribe to mailing list | 
| JUNO analytic, 400M photons from center | Speedup | |
|---|---|---|
| Geant4 Extrap. | 95,600 s (26 hrs) | |
| Opticks RTX ON (i) | 58 s | 1650x | 
014743 <physvol name="pLPMT_NNVT_MCPPMT0x329e330"> <-- zero is special ? NOT A GOOD ID 14744 <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/> 14745 <position name="pLPMT_NNVT_MCPPMT0x329e330_pos" unit="mm" x="-930.2976" y="-111.8724" z="19365"/> 14746 <rotation name="pLPMT_NNVT_MCPPMT0x329e330_rot" unit="deg" x="180" y="0" z="180"/> 14747 </physvol> 14748 <physvol copynumber="1" name="pLPMT_Hamamatsu_R128600x329e3b0"> 14749 <volumeref ref="HamamatsuR12860lMaskVirtual0x3290b70"/> 14750 <position name="pLPMT_Hamamatsu_R128600x329e3b0_pos" unit="mm" x="-492.5657" y="-797.0872" z="19365"/> 14751 <rotation name="pLPMT_Hamamatsu_R128600x329e3b0_rot" unit="deg" x="180" y="0" z="180"/> 14752 </physvol> 14753 <physvol copynumber="2" name="pLPMT_NNVT_MCPPMT0x329e4b0"> <-- forced to use name OR PMTParamSvc -> PMTType 14754 <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/> 14755 <position name="pLPMT_NNVT_MCPPMT0x329e4b0_pos" unit="mm" x="316.0782" y="-882.0791" z="19365"/> 14756 <rotation name="pLPMT_NNVT_MCPPMT0x329e4b0_rot" unit="deg" x="180" y="0" z="180"/> 14757 </physvol> ... 102774 <physvol copynumber="17610" name="pLPMT_Hamamatsu_R128600x3a2a9d0"> Simple fix : separate NNVT/Hamamatsu offsets 102775 <volumeref ref="HamamatsuR12860lMaskVirtual0x3290b70"/> 102776 <position name="pLPMT_Hamamatsu_R128600x3a2a9d0_pos" unit="mm" x="-468.5" y="811.4658" z="-19365"/> 102777 </physvol> 102778 <physvol copynumber="17611" name="pLPMT_NNVT_MCPPMT0x3a2aa80"> 102779 <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/> 102780 <position name="pLPMT_NNVT_MCPPMT0x3a2aa80_pos" unit="mm" x="-926.5345" y="139.6526" z="-19365"/> 102781 </physvol> 102782 <physvol copynumber="300000" name="PMT_3inch_log_phys0x3a307c0"> 102783 <volumeref ref="PMT_3inch_log0x3a2e430"/> 102784 <position name="PMT_3inch_log_phys0x3a307c0_pos" unit="mm" x="-3734.24733197939" y="1835.06563378074" z="18932.1786116475"/> 102785 <rotation name="PMT_3inch_log_phys0x3a307c0_rot" unit="deg" x="-174.463707019134" y="-11.1072339347848" z="-26.7089289722046"/> 102786 </physvol> 102787 <physvol copynumber="300001" name="PMT_3inch_log_phys0x3a30890"> 102788 <volumeref ref="PMT_3inch_log0x3a2e430"/> 102789 <position name="PMT_3inch_log_phys0x3a30890_pos" unit="mm" x="-3470.82479867782" y="1705.61581563788" z="18994.3071973244"/> 102790 <rotation name="PMT_3inch_log_phys0x3a30890_rot" unit="deg" x="-174.86882108069" y="-10.3147892102431" z="-26.6336342093065"/> 102791 </physvol>
43 class PMTParamSvc: public SvcBase {      //  Encode all this into : 64 bit pmtdat (unsigned long long)
44 public:                                  //
45   PMTParamSvc(const std::string& n)      //
..                                          //  16 bits : which PMT identifier (enough for 1..65,536)
53   bool isCD(int pmtid);                  //  16 bits : 8 bits of type flags + 8 bits spare
54   bool isWP(int pmtid);                  //  32 bits : float qe
56   bool is20inch(int pmtid);              //  ----------------------------------------------
57   bool is3inch(int pmtid);               //  64 bits : unsigned long long pmtdat
59   bool isNNVT(int pmtid);                //
60   bool isHamamatsu(int pmtid);           //
61   bool isHighQE(int pmtid);              //   Adhere to hex 4bit boundaries:
62   bool isHZC(int pmtid);                 //
                                            //                      Which PMT 16 bits
63   float getPMTQE(int pmtid);             //                       |
..                                          //   0x|....|....|..00|....|
74   std::map<int, int> m_pmt_categories;   //                 |
75   std::map<int, float> m_pmt_qe;         //       float qe  type flags
76 };                                       //
                                            //   unsigned long long pmtdata[npmt] ;