Grammar and spelling updates to the tech docs

This commit is contained in:
zainali 2022-03-02 15:08:53 +02:00
parent e6c19885c6
commit b4dda635a1
7 changed files with 51 additions and 7 deletions

.idea/.gitignore vendored Normal file

@@ -0,0 +1,8 @@
# Default ignored files
/shelf/
/workspace.xml
# Datasource local storage ignored files
/dataSources/
/dataSources.local.xml
# Editor-based HTTP Client requests
/httpRequests/

.idea/inspectionProfiles/profiles_settings.xml Normal file

@@ -0,0 +1,6 @@
<component name="InspectionProjectProfileManager">
<settings>
<option name="USE_PROJECT_PROFILE" value="false" />
<version value="1.0" />
</settings>
</component>

.idea/mediapipe.iml Normal file

@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="PYTHON_MODULE" version="4">
<component name="NewModuleRootManager">
<content url="file://$MODULE_DIR$" />
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
<component name="PyDocumentationSettings">
<option name="format" value="PLAIN" />
<option name="myDocStringFormat" value="Plain" />
</component>
</module>

.idea/misc.xml Normal file

@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.9 (projects python)" project-jdk-type="Python SDK" />
</project>

.idea/modules.xml Normal file

@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ProjectModuleManager">
<modules>
<module fileurl="file://$PROJECT_DIR$/.idea/mediapipe.iml" filepath="$PROJECT_DIR$/.idea/mediapipe.iml" />
</modules>
</component>
</project>

.idea/vcs.xml Normal file

@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="VcsDirectoryMappings">
<mapping directory="$PROJECT_DIR$" vcs="Git" />
</component>
</project>

docs/solutions/hands.md

@@ -27,7 +27,7 @@ and hand gesture control, and can also enable the overlay of digital content and
 information on top of the physical world in augmented reality. While coming
 naturally to people, robust real-time hand perception is a decidedly challenging
 computer vision task, as hands often occlude themselves or each other (e.g.
-finger/palm occlusions and hand shakes) and lack high contrast patterns.
+finger/palm occlusions and handshakes) and lack high contrast patterns.
 
 MediaPipe Hands is a high-fidelity hand and finger tracking solution. It employs
 machine learning (ML) to infer 21 3D landmarks of a hand from just a single
@@ -107,12 +107,12 @@ train a palm detector instead of a hand detector, since estimating bounding
 boxes of rigid objects like palms and fists is significantly simpler than
 detecting hands with articulated fingers. In addition, as palms are smaller
 objects, the non-maximum suppression algorithm works well even for two-hand
-self-occlusion cases, like handshakes. Moreover, palms can be modelled using
+self-occlusion cases, like handshakes. Moreover, palms can be modeled using
 square bounding boxes (anchors in ML terminology) ignoring other aspect ratios,
 and therefore reducing the number of anchors by a factor of 3-5. Second, an
 encoder-decoder feature extractor is used for bigger scene context awareness
 even for small objects (similar to the RetinaNet approach). Lastly, we minimize
-the focal loss during training to support a large amount of anchors resulting
+the focal loss during training to support a large number of anchors resulting
 from the high scale variance.
 
 With the above techniques, we achieve an average precision of 95.7% in palm
@@ -129,7 +129,7 @@ The model learns a consistent internal hand pose representation and is robust
 even to partially visible hands and self-occlusions.
 
 To obtain ground truth data, we have manually annotated ~30K real-world images
-with 21 3D coordinates, as shown below (we take Z-value from image depth map, if
+with 21 3D coordinates, as shown below (we take Z-value from the image depth map, if
 it exists per corresponding coordinate). To better cover the possible hand poses
 and provide additional supervision on the nature of hand geometry, we also
 render a high-quality synthetic hand model over various backgrounds and map it
@@ -163,11 +163,11 @@ unrelated, images. Default to `false`.
 
 #### max_num_hands
 
-Maximum number of hands to detect. Default to `2`.
+The maximum number of hands to detect. Default to `2`.
 
 #### model_complexity
 
-Complexity of the hand landmark model: `0` or `1`. Landmark accuracy as well as
+The complexity of the hand landmark model: `0` or `1`. Landmark accuracy as well as
 inference latency generally go up with the model complexity. Default to `1`.
 
 #### min_detection_confidence
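As context for the options touched by this hunk, here is a minimal sketch of how they are passed to the MediaPipe Hands Python API. The input path `hand.jpg` is a placeholder assumption, not part of the docs change.

```python
import cv2
import mediapipe as mp

# Minimal sketch: configure the options documented in the hunk above.
with mp.solutions.hands.Hands(
        static_image_mode=True,        # treat inputs as unrelated still images
        max_num_hands=2,               # see max_num_hands above
        model_complexity=1,            # 0 = faster, 1 = more accurate landmarks
        min_detection_confidence=0.5) as hands:
    image = cv2.imread('hand.jpg')     # placeholder input image
    # MediaPipe expects RGB input; OpenCV loads images as BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
```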
@@ -208,7 +208,7 @@ approximate geometric center.
 
 Collection of handedness of the detected/tracked hands (i.e. is it a left or
 right hand). Each hand is composed of `label` and `score`. `label` is a string
-of value either `"Left"` or `"Right"`. `score` is the estimated probability of
+of values either `"Left"` or `"Right"`. `score` is the estimated probability of
 the predicted handedness and is always greater than or equal to `0.5` (and the
 opposite handedness has an estimated probability of `1 - score`).
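The `label` and `score` fields described in this hunk live in `results.multi_handedness`. A short sketch of reading them, under the same placeholder-image assumption as above:

```python
import cv2
import mediapipe as mp

# Sketch of reading the handedness output; 'hand.jpg' is a placeholder input.
with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
    image = cv2.imread('hand.jpg')
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    # multi_handedness is None when no hand is detected.
    for handedness in results.multi_handedness or []:
        top = handedness.classification[0]
        # top.label is "Left" or "Right"; top.score >= 0.5 as documented.
        print(f'{top.label}: {top.score:.2f}')
```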