# Multi-Hand Tracking (GPU)

This doc focuses on the
[example graph](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/hand_tracking/multi_hand_tracking_mobile.pbtxt)
that performs multi-hand tracking with TensorFlow Lite on GPU. It is related to
the [hand_tracking_example](./hand_tracking_mobile_gpu.md), and we recommend
reviewing the (single) hand tracking example first.

![multi_hand_tracking_android_gpu.gif](images/mobile/multi_hand_tracking_android_gpu.gif)

In the visualization above, the red dots represent the hand landmarks, and the
green lines are simply connections between selected landmark pairs for
visualizing the hand skeleton. When there are fewer than `N` hands (`N=2` in
the graphs here), the purple box represents a hand rectangle that covers the
entire hand, derived from hand detection (see the
[hand_detection_example](./hand_detection_mobile_gpu.md)). When there are `N`
hands (i.e., 2 hands in the graphs here), the red boxes represent hand
rectangles for each of the hands, derived from the previous round of hand
landmark localization using an ML model (see also the
[model card](https://mediapipe.page.link/handmc)). Hand landmark localization
for each hand is performed only within the hand rectangle, for computational
efficiency and accuracy. Hand detection is invoked only when there were fewer
than `N` hands in the previous iteration.

This example can also run a model that localizes hand landmarks in 3D (i.e.,
estimating an extra z coordinate):

![multi_hand_tracking_3d_android_gpu.gif](images/mobile/multi_hand_tracking_3d_android_gpu.gif)

In the visualization above, the localized hand landmarks are represented by
dots in different shades, with the brighter ones denoting landmarks closer to
the camera.

## Android

[Source](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/multihandtrackinggpu)

To build the app yourself, run:

```bash
bazel build -c opt --config=android_arm64 mediapipe/examples/android/src/java/com/google/mediapipe/apps/multihandtrackinggpu
```

To build for the 3D mode, run:

```bash
bazel build -c opt --config=android_arm64 --define 3D=true mediapipe/examples/android/src/java/com/google/mediapipe/apps/multihandtrackinggpu
```

Once the app is built, install it on an Android device with:

```bash
adb install bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/multihandtrackinggpu/multihandtrackinggpu.apk
```

## iOS

[Source](https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/multihandtrackinggpu).

See the general [instructions](./mediapipe_ios_setup.md) for building iOS
examples and generating an Xcode project. This will be the
MultiHandTrackingGpuApp target.

To build on the command line:

```bash
bazel build -c opt --config=ios_arm64 mediapipe/examples/ios/multihandtrackinggpu:MultiHandTrackingGpuApp
```

To build for the 3D mode, run:

```bash
bazel build -c opt --config=ios_arm64 --define 3D=true mediapipe/examples/ios/multihandtrackinggpu:MultiHandTrackingGpuApp
```

## Graph

The multi-hand tracking [main graph](#main-graph) internally utilizes a
[multi_hand_detection_subgraph](#multi-hand-detection-subgraph), a
[multi_hand_landmark_subgraph](#multi-hand-landmark-subgraph), and a
[multi_hand_renderer_subgraph](#multi-hand-renderer-subgraph). The subgraphs
show up in the main graph visualization as nodes colored in purple, and a
subgraph itself can also be visualized just like a regular graph.

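A subgraph is instantiated in the main graph like a regular calculator node,
referenced by the `type` name it declares. The sketch below is only
illustrative (the stream names are placeholders, not taken from the actual
graph); the real wiring appears in the main graph further down:

```bash
# Using a subgraph as a node: "MultiHandDetectionSubgraph" is the type declared
# in the subgraph's own pbtxt file, and it is wired up just like a calculator.
node {
  calculator: "MultiHandDetectionSubgraph"
  input_stream: "some_input_video"
  output_stream: "DETECTIONS:some_palm_detections"
  output_stream: "NORM_RECTS:some_palm_rects"
}
```
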
For more information on how to visualize a graph that includes subgraphs, see
the Visualizing Subgraphs section in the
[visualizer documentation](./visualizer.md).

### Main Graph

![multi_hand_tracking_mobile_graph](images/mobile/multi_hand_tracking_mobile.png)

There are two key differences between this graph and the
[single_hand_tracking_mobile_graph](./hand_tracking_mobile_gpu.md):

1.  There is a `NormalizedRectVectorHasMinSizeCalculator` that checks if an
    input vector of `NormalizedRect` objects has a minimum size equal to `N`.
    In this graph, if the vector contains fewer than `N` objects, the
    `MultiHandDetection` subgraph runs. Otherwise, the `GateCalculator` doesn't
    send any image packets to the `MultiHandDetection` subgraph. This way, the
    main graph is efficient in that it avoids running the costly hand detection
    step when there are already `N` hands in the frame.

2.  The `MergeCalculator` has been replaced by the `AssociationNormRect`
    calculator. This calculator takes as input a vector of `NormalizedRect`
    objects from the `MultiHandDetection` subgraph for the current frame, and a
    vector of `NormalizedRect` objects from the `MultiHandLandmark` subgraph
    for the previous frame, and performs an association operation between these
    objects. It ensures that the output vector doesn't contain overlapping
    regions, based on the specified `min_similarity_threshold`.

[Source pbtxt file](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/hand_tracking/multi_hand_tracking_mobile.pbtxt)

```bash
# MediaPipe graph that performs multi-hand tracking with TensorFlow Lite on GPU.
# Used in the examples in
# mediapipe/examples/android/src/java/com/google/mediapipe/apps/multihandtrackinggpu.

# Images coming into and out of the graph.
input_stream: "input_video"
output_stream: "output_video"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most parts of the graph
# to 1. This prevents the downstream nodes from queuing up incoming images and
# data excessively, which leads to increased latency and memory usage, unwanted
# in real-time mobile applications. It also eliminates unnecessary computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:multi_hand_rects"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Determines if an input vector of NormalizedRect has a size greater than or
# equal to the provided min_size.
node {
  calculator: "NormalizedRectVectorHasMinSizeCalculator"
  input_stream: "ITERABLE:prev_multi_hand_rects_from_landmarks"
  output_stream: "prev_has_enough_hands"
  node_options: {
    [type.googleapis.com/mediapipe.CollectionHasMinSizeCalculatorOptions] {
      # This value can be changed to support tracking arbitrary number of hands.
      # Please also remember to modify max_vec_size in
      # ClipVectorSizeCalculatorOptions in
      # mediapipe/graphs/hand_tracking/subgraphs/multi_hand_detection_gpu.pbtxt
      min_size: 2
    }
  }
}

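# Note: prev_has_enough_hands carries a single boolean value per frame. It is
# true only when at least min_size hands were tracked in the previous frame,
# and it is consumed below as the DISALLOW signal that skips palm detection on
# frames where tracking already has enough hands.
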
# Drops the incoming image if the previous frame had at least N hands.
# Otherwise, passes the incoming image through to trigger a new round of hand
# detection in MultiHandDetectionSubgraph.
node {
  calculator: "GateCalculator"
  input_stream: "throttled_input_video"
  input_stream: "DISALLOW:prev_has_enough_hands"
  output_stream: "multi_hand_detection_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.GateCalculatorOptions] {
      empty_packets_as_allow: true
    }
  }
}

# Subgraph that detects hands (see multi_hand_detection_gpu.pbtxt).
node {
  calculator: "MultiHandDetectionSubgraph"
  input_stream: "multi_hand_detection_input_video"
  output_stream: "DETECTIONS:multi_palm_detections"
  output_stream: "NORM_RECTS:multi_palm_rects"
}

# Subgraph that localizes hand landmarks for multiple hands (see
# multi_hand_landmark.pbtxt).
node {
  calculator: "MultiHandLandmarkSubgraph"
  input_stream: "IMAGE:throttled_input_video"
  input_stream: "NORM_RECTS:multi_hand_rects"
  output_stream: "LANDMARKS:multi_hand_landmarks"
  output_stream: "NORM_RECTS:multi_hand_rects_from_landmarks"
}

# Caches a hand rectangle fed back from MultiHandLandmarkSubgraph, and upon the
# arrival of the next input image sends out the cached rectangle with the
# timestamp replaced by that of the input image, essentially generating a packet
# that carries the previous hand rectangle. Note that upon the arrival of the
# very first input image, an empty packet is sent out to jump start the
# feedback loop.
node {
  calculator: "PreviousLoopbackCalculator"
  input_stream: "MAIN:throttled_input_video"
  input_stream: "LOOP:multi_hand_rects_from_landmarks"
  input_stream_info: {
    tag_index: "LOOP"
    back_edge: true
  }
  output_stream: "PREV_LOOP:prev_multi_hand_rects_from_landmarks"
}

# Performs association between NormalizedRect vector elements from the previous
# frame and those from the current frame if MultiHandDetectionSubgraph runs.
# This calculator ensures that the output multi_hand_rects vector doesn't
# contain overlapping regions based on the specified min_similarity_threshold.
node {
  calculator: "AssociationNormRectCalculator"
  input_stream: "prev_multi_hand_rects_from_landmarks"
  input_stream: "multi_palm_rects"
  output_stream: "multi_hand_rects"
  node_options: {
    [type.googleapis.com/mediapipe.AssociationCalculatorOptions] {
      min_similarity_threshold: 0.5
    }
  }
}

# Subgraph that renders annotations and overlays them on top of the input
# images (see multi_hand_renderer_gpu.pbtxt).
node {
  calculator: "MultiHandRendererSubgraph"
  input_stream: "IMAGE:throttled_input_video"
  input_stream: "DETECTIONS:multi_palm_detections"
  input_stream: "LANDMARKS:multi_hand_landmarks"
  input_stream: "NORM_RECTS:0:multi_palm_rects"
  input_stream: "NORM_RECTS:1:multi_hand_rects"
  output_stream: "IMAGE:output_video"
}
```

### Multi-Hand Detection Subgraph

![multi_hand_detection_gpu_subgraph](images/mobile/multi_hand_detection_gpu_subgraph.png)

This graph outputs a vector of `NormalizedRect` objects, one for each of the
hand instances visible in the frame. Note that at the end of this graph there
is a `ClipNormalizedRectVectorSizeCalculator`, which clips the size of the
input vector to a maximum size `N`. This implies that the `MultiHandDetection`
subgraph outputs a vector of at most `N` hand instance locations.

[Source pbtxt file](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/hand_tracking/subgraphs/multi_hand_detection_gpu.pbtxt)

```bash
# MediaPipe multi-hand detection subgraph.
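# Pipeline overview: resize the input image to 256x256 -> run the TFLite palm
# detection model -> decode and non-max-suppress the detections -> convert each
# detection into a rotated, expanded hand rectangle -> clip the rectangle
# vector to at most max_vec_size entries.
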
type: "MultiHandDetectionSubgraph" input_stream: "input_video" output_stream: "DETECTIONS:palm_detections" output_stream: "NORM_RECTS:clipped_hand_rects_from_palm_detections" # Transforms the input image on GPU to a 256x256 image. To scale the input # image, the scale_mode option is set to FIT to preserve the aspect ratio, # resulting in potential letterboxing in the transformed image. node: { calculator: "ImageTransformationCalculator" input_stream: "IMAGE_GPU:input_video" output_stream: "IMAGE_GPU:transformed_input_video" output_stream: "LETTERBOX_PADDING:letterbox_padding" node_options: { [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] { output_width: 256 output_height: 256 scale_mode: FIT } } } # Generates a single side packet containing a TensorFlow Lite op resolver that # supports custom ops needed by the model used in this graph. node { calculator: "TfLiteCustomOpResolverCalculator" output_side_packet: "opresolver" node_options: { [type.googleapis.com/mediapipe.TfLiteCustomOpResolverCalculatorOptions] { use_gpu: true } } } # Converts the transformed input image on GPU into an image tensor stored as a # TfLiteTensor. node { calculator: "TfLiteConverterCalculator" input_stream: "IMAGE_GPU:transformed_input_video" output_stream: "TENSORS_GPU:image_tensor" } # Runs a TensorFlow Lite model on GPU that takes an image tensor and outputs a # vector of tensors representing, for instance, detection boxes/keypoints and # scores. node { calculator: "TfLiteInferenceCalculator" input_stream: "TENSORS_GPU:image_tensor" output_stream: "TENSORS_GPU:detection_tensors" input_side_packet: "CUSTOM_OP_RESOLVER:opresolver" node_options: { [type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] { model_path: "mediapipe/models/palm_detection.tflite" use_gpu: true } } } # Generates a single side packet containing a vector of SSD anchors based on # the specification in the options. node { calculator: "SsdAnchorsCalculator" output_side_packet: "anchors" node_options: { [type.googleapis.com/mediapipe.SsdAnchorsCalculatorOptions] { num_layers: 5 min_scale: 0.1171875 max_scale: 0.75 input_size_height: 256 input_size_width: 256 anchor_offset_x: 0.5 anchor_offset_y: 0.5 strides: 8 strides: 16 strides: 32 strides: 32 strides: 32 aspect_ratios: 1.0 fixed_anchor_size: true } } } # Decodes the detection tensors generated by the TensorFlow Lite model, based on # the SSD anchors and the specification in the options, into a vector of # detections. Each detection describes a detected object. node { calculator: "TfLiteTensorsToDetectionsCalculator" input_stream: "TENSORS_GPU:detection_tensors" input_side_packet: "ANCHORS:anchors" output_stream: "DETECTIONS:detections" node_options: { [type.googleapis.com/mediapipe.TfLiteTensorsToDetectionsCalculatorOptions] { num_classes: 1 num_boxes: 2944 num_coords: 18 box_coord_offset: 0 keypoint_coord_offset: 4 num_keypoints: 7 num_values_per_keypoint: 2 sigmoid_score: true score_clipping_thresh: 100.0 reverse_output_order: true x_scale: 256.0 y_scale: 256.0 h_scale: 256.0 w_scale: 256.0 min_score_thresh: 0.7 } } } # Performs non-max suppression to remove excessive detections. 
node { calculator: "NonMaxSuppressionCalculator" input_stream: "detections" output_stream: "filtered_detections" node_options: { [type.googleapis.com/mediapipe.NonMaxSuppressionCalculatorOptions] { min_suppression_threshold: 0.3 overlap_type: INTERSECTION_OVER_UNION algorithm: WEIGHTED return_empty_detections: true } } } # Maps detection label IDs to the corresponding label text ("Palm"). The label # map is provided in the label_map_path option. node { calculator: "DetectionLabelIdToTextCalculator" input_stream: "filtered_detections" output_stream: "labeled_detections" node_options: { [type.googleapis.com/mediapipe.DetectionLabelIdToTextCalculatorOptions] { label_map_path: "mediapipe/models/palm_detection_labelmap.txt" } } } # Adjusts detection locations (already normalized to [0.f, 1.f]) on the # letterboxed image (after image transformation with the FIT scale mode) to the # corresponding locations on the same image with the letterbox removed (the # input image to the graph before image transformation). node { calculator: "DetectionLetterboxRemovalCalculator" input_stream: "DETECTIONS:labeled_detections" input_stream: "LETTERBOX_PADDING:letterbox_padding" output_stream: "DETECTIONS:palm_detections" } # Extracts image size from the input images. node { calculator: "ImagePropertiesCalculator" input_stream: "IMAGE_GPU:input_video" output_stream: "SIZE:image_size" } # Converts each palm detection into a rectangle (normalized by image size) # that encloses the palm and is rotated such that the line connecting center of # the wrist and MCP of the middle finger is aligned with the Y-axis of the # rectangle. node { calculator: "DetectionsToRectsCalculator" input_stream: "DETECTIONS:palm_detections" input_stream: "IMAGE_SIZE:image_size" output_stream: "NORM_RECTS:palm_rects" node_options: { [type.googleapis.com/mediapipe.DetectionsToRectsCalculatorOptions] { rotation_vector_start_keypoint_index: 0 # Center of wrist. rotation_vector_end_keypoint_index: 2 # MCP of middle finger. rotation_vector_target_angle_degrees: 90 output_zero_rect_for_empty_detections: true } } } # Expands and shifts the rectangle that contains the palm so that it's likely # to cover the entire hand. node { calculator: "RectTransformationCalculator" input_stream: "NORM_RECTS:palm_rects" input_stream: "IMAGE_SIZE:image_size" output_stream: "hand_rects_from_palm_detections" node_options: { [type.googleapis.com/mediapipe.RectTransformationCalculatorOptions] { scale_x: 2.6 scale_y: 2.6 shift_y: -0.5 square_long: true } } } # Clips the size of the input vector to the provided max_vec_size. This # determines the maximum number of hand instances this graph outputs. # Note that the performance gain of clipping detections earlier in this graph is # minimal because NMS will minimize overlapping detections and the number of # detections isn't expected to exceed 5-10. node { calculator: "ClipNormalizedRectVectorSizeCalculator" input_stream: "hand_rects_from_palm_detections" output_stream: "clipped_hand_rects_from_palm_detections" node_options: { [type.googleapis.com/mediapipe.ClipVectorSizeCalculatorOptions] { # This value can be changed to support tracking arbitrary number of hands. # Please also remember to modify min_size in # CollectionHsMinSizeCalculatorOptions in # mediapipe/graphs/hand_tracking/multi_hand_tracking_mobile.pbtxt and # mediapipe/graphs/hand_tracking/multi_hand_tracking_desktop_live.pbtxt. 
      max_vec_size: 2
    }
  }
}
```

### Multi-Hand Landmark Subgraph

![multi_hand_landmark_subgraph.pbtxt](images/mobile/multi_hand_landmark_subgraph.png)

This graph accepts as input a vector of `NormalizedRect` objects, each
corresponding to the region of one hand instance in the input image. For each
`NormalizedRect` object, the graph runs the existing `HandLandmark` subgraph
and collects the outputs of this subgraph into vectors. This is enabled by the
`BeginLoop` and `EndLoop` calculators.

The `BeginLoop` calculator accepts as input a packet containing an iterable
collection of elements. This calculator is templatized (see
[begin_loop_calculator.h](https://github.com/google/mediapipe/tree/master/mediapipe/calculators/core/begin_loop_calculator.h)).
If the input packet arrives at timestamp `ts`, this calculator outputs each
element in the collection at a fake timestamp `internal_ts`. At the end of the
collection, the calculator outputs the arrival timestamp `ts` in the output
stream tagged with `BATCH_END`.

The nodes between the `BeginLoop` calculator and the corresponding `EndLoop`
calculator process individual packets at the fake timestamps `internal_ts`.
After each element is processed, it is sent to the `EndLoop` calculator (see
[end_loop_calculator.h](https://github.com/google/mediapipe/tree/master/mediapipe/calculators/core/end_loop_calculator.h)),
which collects these elements in an output collection. The `EndLoop` calculator
listens for packets from the `BATCH_END` output stream of the `BeginLoop`
calculator. When the `BATCH_END` packet containing the real timestamp `ts`
arrives at the `EndLoop` calculator, the `EndLoop` calculator outputs a packet
containing the collection of processed elements at the real timestamp `ts`.

In the multi-hand landmark subgraph, the `EndLoop` calculators collect the
output vectors of hand landmarks per hand instance, the boolean values
indicating the presence of each hand, and the `NormalizedRect` objects
corresponding to the regions surrounding each hand into vectors. Finally, based
on the hand presence boolean values, the graph filters the collections of hand
landmarks and `NormalizedRect` objects corresponding to each hand instance.

[Source pbtxt file](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/hand_tracking/subgraphs/multi_hand_landmark.pbtxt)

```bash
# MediaPipe hand landmark localization subgraph.

type: "MultiHandLandmarkSubgraph"

input_stream: "IMAGE:input_video"
# A vector of NormalizedRect, one per each hand detected.
input_stream: "NORM_RECTS:multi_hand_rects"
# A vector of NormalizedLandmarks, one set per each hand.
output_stream: "LANDMARKS:filtered_multi_hand_landmarks"
# A vector of NormalizedRect, one per each hand.
output_stream: "NORM_RECTS:filtered_multi_hand_rects_for_next_frame"

# Outputs each element of multi_hand_rects at a fake timestamp for the rest
# of the graph to process. Clones the input_video packet for each
# single_hand_rect at the fake timestamp. At the end of the loop, outputs the
# BATCH_END timestamp for downstream calculators to inform them that all
# elements in the vector have been processed.
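# For example, if multi_hand_rects arrives at timestamp ts with two rects in
# it, the loop below runs HandLandmarkSubgraph twice at fake timestamps, and
# each EndLoop* calculator downstream emits a two-element vector back at the
# original timestamp ts.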
node { calculator: "BeginLoopNormalizedRectCalculator" input_stream: "ITERABLE:multi_hand_rects" input_stream: "CLONE:input_video" output_stream: "ITEM:single_hand_rect" output_stream: "CLONE:input_video_cloned" output_stream: "BATCH_END:single_hand_rect_timestamp" } node { calculator: "HandLandmarkSubgraph" input_stream: "IMAGE:input_video_cloned" input_stream: "NORM_RECT:single_hand_rect" output_stream: "LANDMARKS:single_hand_landmarks" output_stream: "NORM_RECT:single_hand_rect_from_landmarks" output_stream: "PRESENCE:single_hand_presence" } # Collects the boolean presence value for each single hand into a vector. Upon # receiving the BATCH_END timestamp, outputs a vector of boolean values at the # BATCH_END timestamp. node { calculator: "EndLoopBooleanCalculator" input_stream: "ITEM:single_hand_presence" input_stream: "BATCH_END:single_hand_rect_timestamp" output_stream: "ITERABLE:multi_hand_presence" } # Collects a set of landmarks for each hand into a vector. Upon receiving the # BATCH_END timestamp, outputs the vector of landmarks at the BATCH_END # timestamp. node { calculator: "EndLoopNormalizedLandmarksVectorCalculator" input_stream: "ITEM:single_hand_landmarks" input_stream: "BATCH_END:single_hand_rect_timestamp" output_stream: "ITERABLE:multi_hand_landmarks" } # Collects a NormalizedRect for each hand into a vector. Upon receiving the # BATCH_END timestamp, outputs the vector of NormalizedRect at the BATCH_END # timestamp. node { calculator: "EndLoopNormalizedRectCalculator" input_stream: "ITEM:single_hand_rect_from_landmarks" input_stream: "BATCH_END:single_hand_rect_timestamp" output_stream: "ITERABLE:multi_hand_rects_for_next_frame" } # Filters the input vector of landmarks based on hand presence value for each # hand. If the hand presence for hand #i is false, the set of landmarks # corresponding to that hand are dropped from the vector. node { calculator: "FilterLandmarksCollectionCalculator" input_stream: "ITERABLE:multi_hand_landmarks" input_stream: "CONDITION:multi_hand_presence" output_stream: "ITERABLE:filtered_multi_hand_landmarks" } # Filters the input vector of NormalizedRect based on hand presence value for # each hand. If the hand presence for hand #i is false, the NormalizedRect # corresponding to that hand are dropped from the vector. node { calculator: "FilterNormalizedRectCollectionCalculator" input_stream: "ITERABLE:multi_hand_rects_for_next_frame" input_stream: "CONDITION:multi_hand_presence" output_stream: "ITERABLE:filtered_multi_hand_rects_for_next_frame" } ``` ### Multi-Hand Renderer Subgraph ![multi_hand_renderer_gpu_subgraph.pbtxt](images/mobile/multi_hand_renderer_gpu_subgraph.png) This graph also uses `BeginLoop` and `EndLoop` calculators to iteratively convert a set of hand landmarks per hand instance into corresponding `RenderData` objects. [Source pbtxt file](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/hand_tracking/subgraphs/multi_hand_renderer_gpu.pbtxt) ```bash # MediaPipe multi-hand tracking rendering subgraph. type: "MultiHandRendererSubgraph" input_stream: "IMAGE:input_image" # A vector of NormalizedLandmarks, one for each hand. input_stream: "LANDMARKS:multi_hand_landmarks" # A vector of NormalizedRect, one for each hand. input_stream: "NORM_RECTS:0:multi_palm_rects" # A vector of NormalizedRect, one for each hand. input_stream: "NORM_RECTS:1:multi_hand_rects" # A vector of Detection, one for each hand. 
input_stream: "DETECTIONS:palm_detections" output_stream: "IMAGE:output_image" # Converts detections to drawing primitives for annotation overlay. node { calculator: "DetectionsToRenderDataCalculator" input_stream: "DETECTIONS:palm_detections" output_stream: "RENDER_DATA:detection_render_data" node_options: { [type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] { thickness: 4.0 color { r: 0 g: 255 b: 0 } } } } # Converts normalized rects to drawing primitives for annotation overlay. node { calculator: "RectToRenderDataCalculator" input_stream: "NORM_RECTS:multi_hand_rects" output_stream: "RENDER_DATA:multi_hand_rects_render_data" node_options: { [type.googleapis.com/mediapipe.RectToRenderDataCalculatorOptions] { filled: false color { r: 255 g: 0 b: 0 } thickness: 4.0 } } } # Converts normalized rects to drawing primitives for annotation overlay. node { calculator: "RectToRenderDataCalculator" input_stream: "NORM_RECTS:multi_palm_rects" output_stream: "RENDER_DATA:multi_palm_rects_render_data" node_options: { [type.googleapis.com/mediapipe.RectToRenderDataCalculatorOptions] { filled: false color { r: 125 g: 0 b: 122 } thickness: 4.0 } } } # Outputs each element of multi_palm_landmarks at a fake timestamp for the rest # of the graph to process. At the end of the loop, outputs the BATCH_END # timestamp for downstream calculators to inform them that all elements in the # vector have been processed. node { calculator: "BeginLoopNormalizedLandmarksVectorCalculator" input_stream: "ITERABLE:multi_hand_landmarks" output_stream: "ITEM:single_hand_landmarks" output_stream: "BATCH_END:landmark_timestamp" } # Converts landmarks to drawing primitives for annotation overlay. node { calculator: "LandmarksToRenderDataCalculator" input_stream: "NORM_LANDMARKS:single_hand_landmarks" output_stream: "RENDER_DATA:single_hand_landmark_render_data" node_options: { [type.googleapis.com/mediapipe.LandmarksToRenderDataCalculatorOptions] { landmark_connections: 0 landmark_connections: 1 landmark_connections: 1 landmark_connections: 2 landmark_connections: 2 landmark_connections: 3 landmark_connections: 3 landmark_connections: 4 landmark_connections: 0 landmark_connections: 5 landmark_connections: 5 landmark_connections: 6 landmark_connections: 6 landmark_connections: 7 landmark_connections: 7 landmark_connections: 8 landmark_connections: 5 landmark_connections: 9 landmark_connections: 9 landmark_connections: 10 landmark_connections: 10 landmark_connections: 11 landmark_connections: 11 landmark_connections: 12 landmark_connections: 9 landmark_connections: 13 landmark_connections: 13 landmark_connections: 14 landmark_connections: 14 landmark_connections: 15 landmark_connections: 15 landmark_connections: 16 landmark_connections: 13 landmark_connections: 17 landmark_connections: 0 landmark_connections: 17 landmark_connections: 17 landmark_connections: 18 landmark_connections: 18 landmark_connections: 19 landmark_connections: 19 landmark_connections: 20 landmark_color { r: 255 g: 0 b: 0 } connection_color { r: 0 g: 255 b: 0 } thickness: 4.0 } } } # Collects a RenderData object for each hand into a vector. Upon receiving the # BATCH_END timestamp, outputs the vector of RenderData at the BATCH_END # timestamp. node { calculator: "EndLoopRenderDataCalculator" input_stream: "ITEM:single_hand_landmark_render_data" input_stream: "BATCH_END:landmark_timestamp" output_stream: "ITERABLE:multi_hand_landmarks_render_data" } # Draws annotations and overlays them on top of the input images. 
# Consumes a vector of RenderData objects and draws each of them on the input
# frame.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "INPUT_FRAME_GPU:input_image"
  input_stream: "detection_render_data"
  input_stream: "multi_hand_rects_render_data"
  input_stream: "multi_palm_rects_render_data"
  input_stream: "VECTOR:0:multi_hand_landmarks_render_data"
  output_stream: "OUTPUT_FRAME_GPU:output_image"
}
```