csg_intersect_boolean.h : General CSG intersection using evaluative_csg approach¶
Intersecting rays with general CSG shapes requires the appropriate primitive intersect to be selected depending on the origin and direction of the ray and the current t_min.
Traditional implementations of CSG intersection first calculate ray intervals with each primitive and then combine these intervals using the boolean operators to determine intersects. Efficient use of GPUs requires many thousands of simultaneously operational threads which disfavors the traditional approach due to the requirement to store intervals for all constituent primitives. A quite different approach described by Andrew Kensler avoids interval storage by instead selecting between the two candidate intersects at each level of the binary tree, which allows a recursive algorithm to be developed. The two candidate intersects at each level are classified as “Enter”, “Exit” or “Miss” using the angle between the ray direction and surface normal. Six decision tables corresponding to which side is closer and to the three boolean operators are used to determine an action from the classifications such as returning an intersect or advancing t_min and intersecting again.
Recursive function calls are a natural way to process self similar structures such as CSG trees, however recursion is a memory expensive technique which makes it inappropriate for GPU usage. Although NVIDIA OptiX supports recursive ray tracing in does not support recursion within intersect programs. The Opticks “evaluative” CSG implementation was inspired by the realization that CSG node tree intersection directly parallels binary expression tree evaluation and that techniques to simplify expression tree evaluation such as using postorder traversals could be applied. Binary expression trees are used to represent and evaluate mathematical expressions. Factoring out the postorder sequence allowed an iterative solution to be developed for a recursive problem.
The CSG implementation relies on selection of the closer of two intersects at each level of the node tree. When the faces of constituent shapes coincide the ambiguity regarding which is closer can cause spurious intersects. Modifying some constituents to prevent coincident faces avoids the issue without changing the intended geometry. As such coincidences are rather common Opticks includes detection and automated fixing for some common situations.
- postorder traversal means that have always visited left and right subtrees before visiting a node
- slavish python translations of the csg intersect implementationa are in dev/csg/slavish.py
- postorder_sequence for four tree heights were prepared by opticks/dev/csg/node.py:Node.postOrderSequence
- the sequence contains 1-based levelorder indices(nodeIdx) in right to left postorder
- left,right child of nodeIdx are at (nodeIdx*2, nodeIdx*2+1)
- for non-perfect trees, the height means the maximum
Below indices are postorder slice flavor, not levelorder
Height 3 tree, numNodes = 15, halfNodes = 7, root at i = 14
upTree i : numNodes 14 : 15 = 14:15
leftTree i - 2h : i - h 14 - 2*7 : 14 - 7 = 0:7
rightTree i - h : i 14 - 7 : 14 = 7:14
NB all nodes including root needs an upTree tranche to evaluate left and right
CSG shape complete binary tree serialization¶
A complete binary tree serialization with array indices matching level order tree indices and zeros at missing nodes is used for the serialization of the CSG trees. This simple serialization allows tree navigation directly from bitwise manipulations of the serialized array index.
Advantages:
- simple, with no tree overheads for child/parent etc..
- no need to deserialize as can postorder tree traverse direct from the serialized buffer just by bit twiddling, as shown below
Disadvantages:
- very inefficient for complex CSG shapes with many constituent nodes
- must implement tree balancing to handle CSG shapes of “medium” complexity, this is done in geometry preprocessing
.
depth elevation
1 0 3
10 11 1 2
100 101 110 111 2 1
1000 1001 1010 1011 1100 1101 1110 1111 3 0
postorder_next(i,elevation) = i & 1 ? i >> 1 : (i << elevation) + (1 << elevation) ; // from pattern of bits
A postorder tree traverse visits all nodes, starting from leftmost, such that children are visited prior to their parents.
CSG Development code including python implementations of the CUDA¶
1-based binary tree indices in binary¶
Repeat above enumeration in nodeIdx lingo
nodeIdx{dec}
[postorderIdx]
height 3 tree
1{1} elev:3 depth:0
[14]
10{2} 11{3} elev:2 depth:1
[6] [13]
100{4} 101{5} 110{6} 111{7} elev:1 depth:2
[2] [5] [9] [12]
1000{8} 1001{9} 1010{a} 1011{b} 1100{c} 1101{d} 1110{e} 1111{f} elev:0 depth:3
[0] [1] [3] [4] [7] [8] [10] [11]
upTree 1 : 0
root 1 >> 1 ("parent" of root)
leftTree
10{2} << 2 11{3} << 2 (leftmost of the rhs is one-past the leftTree)
1000{8} 1100{c}
rightTree
11{3} << 2 1 <-- one-past the end of the rightTree is root
1100{c}
Algorithm to find leftTree of nodeIdx, is find leftIdx
leftIdx = (nodeIdx << 1)
rightIdx = (nodeIdx << 1) + 1
leftTree (leftIdx << (elevation-1)) : (rightIdx << (elevation-1))
rightTree (rightIdx << (elevation-1)) : nodeIdx
POSTORDER_RANGE debug ideas¶
- parallelR variables
- use quad-decker begin,end,beginR,endR inside the tranche slice so can see exactly where things diverge
- maybe running into elev 0 and doing << -1
Recursive within intersect, nope¶
Hmm seems cannot do recursive within an intersect program anyhow ??
libc++abi.dylib: terminating with uncaught exception of type optix::Exception:
Parse error (Details: Function "RTresult _rtProgramCreateFromPTXFile(RTcontext, const char *, const char *, RTprogram *)"
caught exception: /usr/local/opticks/installcache/PTX/OptiXRap_generated_hemi-pmt.cu.ptx:
error: Recursion detected in function
_Z15recursive_csg_rjjjf( file /usr/local/opticks/installcache/PTX/OptiXRap_generated_hemi-pmt.cu.ptx ),
cannot inline. [5308892], [5308892])
csg_intersect_boolean.h : struct Tranche¶
Postorder Tranch storing a stack of slices into the postorder sequence.
- slice
- 32 bit unsigned holding a pair of begin and end indices specifying a range over the postorder traversal sequence
- tranche_push
- push (slice, tmin) onto the small stack
- tranche_pop
- pops off (slice, tmin)
- tranche_repr
- representation of the stack of slices packed into a 64 bit unsigned long long
csg_intersect_boolean.h : struct CSG¶
Small stack of float4 isect (surface normals and t:distance_to_intersect).
- csg_push
- push float4 isect and nodeIdx into stack
- csg_pop
- pop float4 isect and nodeIdx off the stack
csg_intersect_boolean.h : evaluative_csg¶
Recursive CSG intersection algorithm implemented iteratively using a stack of slices into the postorder traversal sequence of the complete binary tree, effectively emulating recursion.
The whole algoritm depends on the postorder traversal feature that the left and right children of a node are always visited before the node itself.
- USE_TWIDDLE_POSTORDER
macro that is now standardly defined
bit-twiddle postorder is limited to trees of height 7, ie a maximum of 0xff (255) nodes (using 2-bytes with PACK2 would bump that to 0xffff (65535) nodes) In any case 0xff nodes are far more than this is expected to be used with