diff --git a/.github/ISSUE_TEMPLATE/00-build-installation-issue.md b/.github/ISSUE_TEMPLATE/00-build-installation-issue.md
new file mode 100644
index 000000000..f4300e42a
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/00-build-installation-issue.md
@@ -0,0 +1,27 @@
+---
+name: "Build/Installation Issue"
+about: Use this template for build/installation issues
+labels: type:build/install
+
+---
+Please make sure that this is a build/installation issue, and refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) documentation before raising any issues.
+
+**System information** (Please provide as much relevant information as possible)
+- OS Platform and Distribution (e.g. Linux Ubuntu 16.04, Android 11, iOS 14.4):
+- Compiler version (e.g. gcc/g++ 8 /Apple clang version 12.0.0):
+- Programming Language and version (e.g. C++14, Python 3.6, Java):
+- Installed using virtualenv? pip? Conda? (if python):
+- [MediaPipe version](https://github.com/google/mediapipe/releases):
+- Bazel version:
+- Xcode and Tulsi versions (if iOS):
+- Android SDK and NDK versions (if Android):
+- Android [AAR](https://google.github.io/mediapipe/getting_started/android_archive_library.html) (if Android):
+- OpenCV version (if running on desktop):
+
+**Describe the problem**:
+
+
+**[Provide the exact sequence of commands / steps that you executed before running into the problem](https://google.github.io/mediapipe/getting_started/getting_started.html):**
+
+**Complete Logs:**
+Include complete log information or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached:
diff --git a/.github/ISSUE_TEMPLATE/10-solution-issue.md b/.github/ISSUE_TEMPLATE/10-solution-issue.md
new file mode 100644
index 000000000..a5332cb36
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/10-solution-issue.md
@@ -0,0 +1,26 @@
+---
+name: "Solution Issue"
+about: Use this template for assistance with a specific MediaPipe solution, such as "Pose" or "Iris", including inference model usage/training, solution-specific calculators, etc.
+labels: type:support
+
+---
+Please make sure that this is a [solution](https://google.github.io/mediapipe/solutions/solutions.html) issue.
+
+**System information** (Please provide as much relevant information as possible)
+- Have I written custom code (as opposed to using a stock example script provided in MediaPipe):
+- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4):
+- [MediaPipe version](https://github.com/google/mediapipe/releases):
+- Bazel version:
+- Solution (e.g. FaceMesh, Pose, Holistic):
+- Programming Language and version (e.g. C++, Python, Java):
+
+**Describe the expected behavior:**
+
+**Standalone code you may have used to try to get what you need:**
+
+If there is a problem, provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to a Colab, repo, or notebook:
+
+**Other info / Complete Logs:**
+Include any logs or source code that would be helpful to
+diagnose the problem. If including tracebacks, please include the full
+traceback. Large logs and files should be attached:
diff --git a/.github/ISSUE_TEMPLATE/20-documentation-issue.md b/.github/ISSUE_TEMPLATE/20-documentation-issue.md
new file mode 100644
index 000000000..2918e03b4
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/20-documentation-issue.md
@@ -0,0 +1,51 @@
+---
+name: "Documentation Issue"
+about: Use this template for documentation related issues
+labels: type:docs
+
+---
+Thank you for submitting a MediaPipe documentation issue.
+The MediaPipe docs are open source! To get involved, read the documentation Contributor Guide.
+## URL(s) with the issue:
+
+Please provide a link to the documentation entry, for example: https://github.com/google/mediapipe/blob/master/docs/solutions/face_mesh.md#models
+
+## Description of issue (what needs changing):
+
+Kinds of documentation problems:
+
+### Clear description
+
+For example, why should someone use this method? How is it useful?
+
+### Correct links
+
+Is the link to the source code correct?
+
+### Parameters defined
+Are all parameters defined and formatted correctly?
+
+### Returns defined
+
+Are return values defined?
+
+### Raises listed and defined
+
+Are the errors that can be raised listed and defined?
+
+### Usage example
+
+Is there a usage example?
+
+See the API guide for how to write testable usage examples.
+
+### Request visuals, if applicable
+
+Are there currently visuals? If not, would adding them clarify the content?
+
+### Submit a pull request?
+
+Are you planning to also submit a pull request to fix the issue? See the docs:
+https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md
+
diff --git a/.github/ISSUE_TEMPLATE/30-bug-issue.md b/.github/ISSUE_TEMPLATE/30-bug-issue.md
new file mode 100644
index 000000000..996c06cf5
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/30-bug-issue.md
@@ -0,0 +1,32 @@
+---
+name: "Bug Issue"
+about: Use this template for reporting a bug
+labels: type:bug
+
+---
+Please make sure that this is a bug, and refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) and FAQ documentation before raising any issues.
+
+**System information** (Please provide as much relevant information as possible)
+
+- Have I written custom code (as opposed to using a stock example script provided in MediaPipe):
+- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4):
+- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on a mobile device:
+- Browser and version (e.g. Google Chrome, Safari) if the issue happens in a browser:
+- Programming Language and version (e.g. C++, Python, Java):
+- [MediaPipe version](https://github.com/google/mediapipe/releases):
+- Bazel version (if compiling from source):
+- Solution (e.g. FaceMesh, Pose, Holistic):
+- Android Studio, NDK, SDK versions (if issue is related to building in Android environment):
+- Xcode & Tulsi version (if issue is related to building for iOS):
+
+**Describe the current behavior:**
+
+**Describe the expected behavior:**
+
+**Standalone code to reproduce the issue:**
+Provide a reproducible test case that is the bare minimum necessary to replicate the problem. If possible, please share a link to a Colab, repo, or notebook:
+
+**Other info / Complete Logs:**
+ Include any logs or source code that would be helpful to
+diagnose the problem. If including tracebacks, please include the full
+traceback. Large logs and files should be attached
diff --git a/.github/ISSUE_TEMPLATE/40-feature-request.md b/.github/ISSUE_TEMPLATE/40-feature-request.md
new file mode 100644
index 000000000..2e1aafc7a
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/40-feature-request.md
@@ -0,0 +1,24 @@
+---
+name: "Feature Request"
+about: Use this template for raising a feature request
+labels: type:feature
+
+---
+Please make sure that this is a feature request.
+
+**System information** (Please provide as much relevant information as possible)
+
+- MediaPipe Solution (you are using):
+- Programming language: C++ / TypeScript / Python / Objective-C / Android Java
+- Are you willing to contribute it (Yes/No):
+
+
+**Describe the feature and the current behavior/state:**
+
+**Will this change the current API? How?**
+
+**Who will benefit from this feature?**
+
+**Please specify the use cases for this feature:**
+
+**Any other info:**
diff --git a/.github/ISSUE_TEMPLATE/50-other-issues.md b/.github/ISSUE_TEMPLATE/50-other-issues.md
new file mode 100644
index 000000000..e51add916
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/50-other-issues.md
@@ -0,0 +1,14 @@
+---
+name: "Other Issue"
+about: Use this template for any other non-support related issues.
+labels: type:others
+
+---
+This template is for miscellaneous issues not covered by the other issue categories.
+
+For questions on how to work with MediaPipe, or support for problems that are not verified bugs in MediaPipe, please go to [StackOverflow](https://stackoverflow.com/questions/tagged/mediapipe) and [Slack](https://mediapipe.page.link/joinslack) communities.
+
+If you are reporting a vulnerability, please use the [dedicated reporting process](https://github.com/google/mediapipe/security).
+
+For high-level discussions about MediaPipe, please post to discuss@mediapipe.org. For questions about the development or internal workings of MediaPipe, or if you would like to know how to contribute, please post to developers@mediapipe.org.
+
diff --git a/.github/bot_config.yml b/.github/bot_config.yml
new file mode 100644
index 000000000..b1b2d98ea
--- /dev/null
+++ b/.github/bot_config.yml
@@ -0,0 +1,18 @@
+# Copyright 2021 The MediaPipe Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+# A list of assignees
+assignees:
+ - sgowroji
diff --git a/.github/stale.yml b/.github/stale.yml
new file mode 100644
index 000000000..03c67d0f6
--- /dev/null
+++ b/.github/stale.yml
@@ -0,0 +1,34 @@
+# Copyright 2021 The MediaPipe Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+#
+# This file was assembled from multiple pieces, whose use is documented
+# throughout. Please refer to the TensorFlow dockerfiles documentation
+# for more information.
+
+# Number of days of inactivity before an Issue or Pull Request becomes stale
+daysUntilStale: 7
+# Number of days of inactivity before a stale Issue or Pull Request is closed
+daysUntilClose: 7
+# Only issues or pull requests with all of these labels are checked if stale. Defaults to `[]` (disabled)
+onlyLabels:
+ - stat:awaiting response
+# Comment to post when marking as stale. Set to `false` to disable
+markComment: >
+ This issue has been automatically marked as stale because it has not had
+ recent activity. It will be closed if no further activity occurs. Thank you.
+# Comment to post when removing the stale label. Set to `false` to disable
+unmarkComment: false
+closeComment: >
+ Closing as stale. Please reopen if you'd like to work on this further.
diff --git a/MANIFEST.in b/MANIFEST.in
index 8d5c4ec50..14afffebe 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -8,6 +8,7 @@ include README.md
include requirements.txt
recursive-include mediapipe/modules *.tflite *.txt *.binarypb
+exclude mediapipe/modules/face_detection/face_detection_full_range.tflite
exclude mediapipe/modules/objectron/object_detection_3d_chair_1stage.tflite
exclude mediapipe/modules/objectron/object_detection_3d_sneakers_1stage.tflite
exclude mediapipe/modules/objectron/object_detection_3d_sneakers.tflite
diff --git a/README.md b/README.md
index 23e0d9981..9ea72ab8a 100644
--- a/README.md
+++ b/README.md
@@ -40,11 +40,12 @@ Hair Segmentation
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
+[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
-[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
+[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
@@ -54,46 +55,22 @@ See also
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
for ML models released in MediaPipe.
-## MediaPipe in Python
-
-MediaPipe offers customizable Python solutions as a prebuilt Python package on
-[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
-`pip install mediapipe`. It also provides tools for users to build their own
-solutions. Please see
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
-for more info.
-
-## MediaPipe on the Web
-
-MediaPipe on the Web is an effort to run the same ML solutions built for mobile
-and desktop also in web browsers. The official API is under construction, but
-the core technology has been proven effective. Please see
-[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
-in Google Developers Blog for details.
-
-You can use the following links to load a demo in the MediaPipe Visualizer, and
-over there click the "Runner" icon in the top bar like shown below. The demos
-use your webcam video as input, which is processed all locally in real-time and
-never leaves your device.
-
-
-
-* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
-* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
-* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
-* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
-* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
-* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
-* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
-
## Getting started
-Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
-MediaPipe and
-[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
-and start exploring our ready-to-use
-[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
-further extend and customize.
+To start using MediaPipe
+[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
+lines of code, see example code and demos in
+[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
+[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+
+To use MediaPipe in C++, Android and iOS, which allow further customization of
+the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
+building your own, learn how to
+[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
+start building example applications in
+[C++](https://google.github.io/mediapipe/getting_started/cpp),
+[Android](https://google.github.io/mediapipe/getting_started/android) and
+[iOS](https://google.github.io/mediapipe/getting_started/ios).
The source code is hosted in the
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
@@ -167,6 +144,13 @@ bash build_macos_desktop_examples.sh --cpu i386 --app face_detection -r
## Publications
+* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
+ in Google Developers Blog
+* [Prosthesis control via Mirru App using MediaPipe hand tracking](https://developers.googleblog.com/2021/05/control-your-mirru-prosthesis-with-mediapipe-hand-tracking.html)
+ in Google Developers Blog
+* [SignAll SDK: Sign language interface using MediaPipe is now available for
+ developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
+ in Google Developers Blog
* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
in Google AI Blog
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
diff --git a/WORKSPACE b/WORKSPACE
index c7cb94346..9b0a7e86c 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -65,26 +65,19 @@ rules_foreign_cc_dependencies()
all_content = """filegroup(name = "all", srcs = glob(["**"]), visibility = ["//visibility:public"])"""
# GoogleTest/GoogleMock framework. Used by most unit-tests.
-# Last updated 2020-06-30.
+# Last updated 2021-07-02.
http_archive(
name = "com_google_googletest",
- urls = ["https://github.com/google/googletest/archive/aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e.zip"],
- patches = [
- # fix for https://github.com/google/googletest/issues/2817
- "@//third_party:com_google_googletest_9d580ea80592189e6d44fa35bcf9cdea8bf620d6.diff"
- ],
- patch_args = [
- "-p1",
- ],
- strip_prefix = "googletest-aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e",
- sha256 = "04a1751f94244307cebe695a69cc945f9387a80b0ef1af21394a490697c5c895",
+ urls = ["https://github.com/google/googletest/archive/4ec4cd23f486bf70efcc5d2caa40f24368f752e3.zip"],
+ strip_prefix = "googletest-4ec4cd23f486bf70efcc5d2caa40f24368f752e3",
+ sha256 = "de682ea824bfffba05b4e33b67431c247397d6175962534305136aa06f92e049",
)
# Google Benchmark library.
http_archive(
name = "com_google_benchmark",
- urls = ["https://github.com/google/benchmark/archive/master.zip"],
- strip_prefix = "benchmark-master",
+ urls = ["https://github.com/google/benchmark/archive/main.zip"],
+ strip_prefix = "benchmark-main",
build_file = "@//third_party:benchmark.BUILD",
)
@@ -176,11 +169,11 @@ http_archive(
http_archive(
name = "pybind11",
urls = [
- "https://storage.googleapis.com/mirror.tensorflow.org/github.com/pybind/pybind11/archive/v2.4.3.tar.gz",
- "https://github.com/pybind/pybind11/archive/v2.4.3.tar.gz",
+ "https://storage.googleapis.com/mirror.tensorflow.org/github.com/pybind/pybind11/archive/v2.7.1.tar.gz",
+ "https://github.com/pybind/pybind11/archive/v2.7.1.tar.gz",
],
- sha256 = "1eed57bc6863190e35637290f97a20c81cfe4d9090ac0a24f3bbf08f265eb71d",
- strip_prefix = "pybind11-2.4.3",
+ sha256 = "616d1c42e4cf14fa27b2a4ff759d7d7b33006fdc5ad8fd603bb2c22622f27020",
+ strip_prefix = "pybind11-2.7.1",
build_file = "@pybind11_bazel//:pybind11.BUILD",
)
@@ -254,6 +247,20 @@ http_archive(
url = "https://github.com/opencv/opencv/releases/download/3.2.0/opencv-3.2.0-ios-framework.zip",
)
+http_archive(
+ name = "stblib",
+ strip_prefix = "stb-b42009b3b9d4ca35bc703f5310eedc74f584be58",
+ sha256 = "13a99ad430e930907f5611325ec384168a958bf7610e63e60e2fd8e7b7379610",
+ urls = ["https://github.com/nothings/stb/archive/b42009b3b9d4ca35bc703f5310eedc74f584be58.tar.gz"],
+ build_file = "@//third_party:stblib.BUILD",
+ patches = [
+ "@//third_party:stb_image_impl.diff"
+ ],
+ patch_args = [
+ "-p1",
+ ],
+)
+
# You may run setup_android.sh to install Android SDK and NDK.
android_ndk_repository(
name = "androidndk",
@@ -336,7 +343,9 @@ load("@rules_jvm_external//:defs.bzl", "maven_install")
maven_install(
artifacts = [
"androidx.concurrent:concurrent-futures:1.0.0-alpha03",
- "androidx.lifecycle:lifecycle-common:2.2.0",
+ "androidx.lifecycle:lifecycle-common:2.3.1",
+ "androidx.activity:activity:1.2.2",
+ "androidx.fragment:fragment:1.3.4",
"androidx.annotation:annotation:aar:1.1.0",
"androidx.appcompat:appcompat:aar:1.1.0-rc01",
"androidx.camera:camera-core:1.0.0-beta10",
@@ -349,11 +358,11 @@ maven_install(
"androidx.test.espresso:espresso-core:3.1.1",
"com.github.bumptech.glide:glide:4.11.0",
"com.google.android.material:material:aar:1.0.0-rc01",
- "com.google.auto.value:auto-value:1.6.4",
- "com.google.auto.value:auto-value-annotations:1.6.4",
- "com.google.code.findbugs:jsr305:3.0.2",
- "com.google.flogger:flogger-system-backend:0.3.1",
- "com.google.flogger:flogger:0.3.1",
+ "com.google.auto.value:auto-value:1.8.1",
+ "com.google.auto.value:auto-value-annotations:1.8.1",
+ "com.google.code.findbugs:jsr305:latest.release",
+ "com.google.flogger:flogger-system-backend:latest.release",
+ "com.google.flogger:flogger:latest.release",
"com.google.guava:guava:27.0.1-android",
"com.google.guava:listenablefuture:1.0",
"junit:junit:4.12",
@@ -381,9 +390,9 @@ http_archive(
)
# Tensorflow repo should always go after the other external dependencies.
-# 2021-04-30
-_TENSORFLOW_GIT_COMMIT = "5bd3c57ef184543d22e34e36cff9d9bea608e06d"
-_TENSORFLOW_SHA256= "9a45862834221aafacf6fb275f92b3876bc89443cbecc51be93f13839a6609f0"
+# 2021-07-29
+_TENSORFLOW_GIT_COMMIT = "52a2905cbc21034766c08041933053178c5d10e3"
+_TENSORFLOW_SHA256 = "06d4691bcdb700f3275fa0971a1585221c2b9f3dffe867963be565a6643d7f56"
http_archive(
name = "org_tensorflow",
urls = [
@@ -404,3 +413,18 @@ load("@org_tensorflow//tensorflow:workspace3.bzl", "tf_workspace3")
tf_workspace3()
load("@org_tensorflow//tensorflow:workspace2.bzl", "tf_workspace2")
tf_workspace2()
+
+# Edge TPU
+http_archive(
+ name = "libedgetpu",
+ sha256 = "14d5527a943a25bc648c28a9961f954f70ba4d79c0a9ca5ae226e1831d72fe80",
+ strip_prefix = "libedgetpu-3164995622300286ef2bb14d7fdc2792dae045b7",
+ urls = [
+ "https://github.com/google-coral/libedgetpu/archive/3164995622300286ef2bb14d7fdc2792dae045b7.tar.gz"
+ ],
+)
+load("@libedgetpu//:workspace.bzl", "libedgetpu_dependencies")
+libedgetpu_dependencies()
+
+load("@coral_crosstool//:configure.bzl", "cc_crosstool")
+cc_crosstool(name = "crosstool")
diff --git a/build_desktop_examples.sh b/build_desktop_examples.sh
index a35556cf0..7ff8db29c 100644
--- a/build_desktop_examples.sh
+++ b/build_desktop_examples.sh
@@ -97,6 +97,7 @@ for app in ${apps}; do
if [[ ${target_name} == "holistic_tracking" ||
${target_name} == "iris_tracking" ||
${target_name} == "pose_tracking" ||
+ ${target_name} == "selfie_segmentation" ||
${target_name} == "upper_body_pose_tracking" ]]; then
graph_suffix="cpu"
else
diff --git a/docs/framework_concepts/calculators.md b/docs/framework_concepts/calculators.md
index 98bf1def4..9548fa461 100644
--- a/docs/framework_concepts/calculators.md
+++ b/docs/framework_concepts/calculators.md
@@ -248,12 +248,70 @@ absl::Status MyCalculator::Process() {
}
```
+## Calculator options
+
+Calculators accept processing parameters through (1) input stream packets, (2)
+input side packets, and (3) calculator options. Calculator options, if
+specified, appear as literal values in the `node_options` field of the
+`CalculatorGraphConfiguration.Node` message.
+
+```
+ node {
+ calculator: "TfLiteInferenceCalculator"
+ input_stream: "TENSORS:main_model_input"
+ output_stream: "TENSORS:main_model_output"
+ node_options: {
+ [type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] {
+ model_path: "mediapipe/models/detection_model.tflite"
+ }
+ }
+ }
+```
+
+The `node_options` field accepts the proto3 syntax. Alternatively, calculator
+options can be specified in the `options` field using proto2 syntax.
+
+```
+ node {
+ calculator: "TfLiteInferenceCalculator"
+ input_stream: "TENSORS:main_model_input"
+ output_stream: "TENSORS:main_model_output"
+    options: {
+      [mediapipe.TfLiteInferenceCalculatorOptions.ext] {
+ model_path: "mediapipe/models/detection_model.tflite"
+ }
+ }
+ }
+```
+
+Not all calculators accept calculator options. In order to accept options, a
+calculator will normally define a new protobuf message type to represent its
+options, such as `PacketClonerCalculatorOptions`. The calculator will then
+read that protobuf message in its `CalculatorBase::Open` method, and possibly
+also in its `CalculatorBase::GetContract` function or its
+`CalculatorBase::Process` method. Normally, the new protobuf message type will
+be defined as a protobuf schema using a ".proto" file and a
+`mediapipe_proto_library()` build rule.
+
+```
+ mediapipe_proto_library(
+ name = "packet_cloner_calculator_proto",
+ srcs = ["packet_cloner_calculator.proto"],
+ visibility = ["//visibility:public"],
+ deps = [
+ "//mediapipe/framework:calculator_options_proto",
+ "//mediapipe/framework:calculator_proto",
+ ],
+ )
+```
+
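+For illustration, a calculator might read such options in `Open` roughly as
+follows. This is only a sketch: `MyCalculatorOptions` and its `model_path`
+field are hypothetical names, not taken from the MediaPipe sources.
+
+```c++
+// Sketch only: assumes a MyCalculatorOptions proto with a model_path field.
+absl::Status MyCalculator::Open(CalculatorContext* cc) {
+  const auto& options = cc->Options<MyCalculatorOptions>();
+  model_path_ = options.model_path();  // Cache the option for use in Process().
+  return absl::OkStatus();
+}
+```
+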
+
## Example calculator
This section discusses the implementation of `PacketClonerCalculator`, which
does a relatively simple job, and is used in many calculator graphs.
-`PacketClonerCalculator` simply produces a copy of its most recent input
-packets on demand.
+`PacketClonerCalculator` simply produces a copy of its most recent input packets
+on demand.
`PacketClonerCalculator` is useful when the timestamps of arriving data packets
are not aligned perfectly. Suppose we have a room with a microphone, light
@@ -279,8 +337,8 @@ input streams:
imageframe of video data representing video collected from camera in the
room with timestamp.
-Below is the implementation of the `PacketClonerCalculator`. You can see
-the `GetContract()`, `Open()`, and `Process()` methods as well as the instance
+Below is the implementation of the `PacketClonerCalculator`. You can see the
+`GetContract()`, `Open()`, and `Process()` methods as well as the instance
variable `current_` which holds the most recent input packets.
```c++
@@ -401,6 +459,6 @@ node {
The diagram below shows how the `PacketClonerCalculator` defines its output
packets (bottom) based on its series of input packets (top).
-|  |
-| :---------------------------------------------------------------------------: |
-| *Each time it receives a packet on its TICK input stream, the PacketClonerCalculator outputs the most recent packet from each of its input streams. The sequence of output packets (bottom) is determined by the sequence of input packets (top) and their timestamps. The timestamps are shown along the right side of the diagram.* |
+ |
+:--------------------------------------------------------------------------: |
+*Each time it receives a packet on its TICK input stream, the PacketClonerCalculator outputs the most recent packet from each of its input streams. The sequence of output packets (bottom) is determined by the sequence of input packets (top) and their timestamps. The timestamps are shown along the right side of the diagram.* |
diff --git a/docs/framework_concepts/framework_concepts.md b/docs/framework_concepts/framework_concepts.md
index dcf446a9d..dd43d830c 100644
--- a/docs/framework_concepts/framework_concepts.md
+++ b/docs/framework_concepts/framework_concepts.md
@@ -111,11 +111,11 @@ component known as an InputStreamHandler.
See [Synchronization](synchronization.md) for more details.
-### Realtime data streams
+### Real-time streams
MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. Normally, each Calculator runs as soon as
all of its input packets for a given timestamp become available. Calculators
-used in realtime graphs need to define output timestamp bounds based on input
+used in real-time graphs need to define output timestamp bounds based on input
timestamp bounds in order to allow downstream calculators to be scheduled
-promptly. See [Realtime data streams](realtime.md) for details.
+promptly. See [Real-time Streams](realtime_streams.md) for details.
diff --git a/docs/framework_concepts/realtime.md b/docs/framework_concepts/realtime_streams.md
similarity index 91%
rename from docs/framework_concepts/realtime.md
rename to docs/framework_concepts/realtime_streams.md
index 36b606825..038081453 100644
--- a/docs/framework_concepts/realtime.md
+++ b/docs/framework_concepts/realtime_streams.md
@@ -1,29 +1,28 @@
---
layout: default
-title: Processing real-time data streams
+title: Real-time Streams
+parent: Framework Concepts
nav_order: 6
-has_children: true
-has_toc: false
---
-# Processing real-time data streams
+# Real-time Streams
{: .no_toc }
1. TOC
{:toc}
---
-## Realtime timestamps
+## Real-time timestamps
MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. The MediaPipe framework requires only that
successive packets be assigned monotonically increasing timestamps. By
-convention, realtime calculators and graphs use the recording time or the
+convention, real-time calculators and graphs use the recording time or the
presentation time of each frame as its timestamp, with each timestamp indicating
the microseconds since `Jan/1/1970:00:00:00`. This allows packets from various
sources to be processed in a globally consistent sequence.
-## Realtime scheduling
+## Real-time scheduling
Normally, each Calculator runs as soon as all of its input packets for a given
timestamp become available. Normally, this happens when the calculator has
@@ -38,7 +37,7 @@ When a calculator does not produce any output packets for a given timestamp, it
can instead output a "timestamp bound" indicating that no packet will be
produced for that timestamp. This indication is necessary to allow downstream
calculators to run at that timestamp, even though no packet has arrived for
-certain streams for that timestamp. This is especially important for realtime
+certain streams for that timestamp. This is especially important for real-time
graphs in interactive applications, where it is crucial that each calculator
begin processing as soon as possible.
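+
+For illustration, a calculator that decides to skip a timestamp might advance
+its output timestamp bound like this (a sketch, not taken from the MediaPipe
+sources; `ShouldSkip` is a hypothetical helper):
+
+```c++
+// Sketch: no packet is produced for this timestamp, but downstream
+// calculators are unblocked by advancing the output timestamp bound.
+absl::Status MyCalculator::Process(CalculatorContext* cc) {
+  if (ShouldSkip(cc)) {  // Hypothetical helper.
+    cc->Outputs().Index(0).SetNextTimestampBound(
+        cc->InputTimestamp().NextAllowedInStream());
+    return absl::OkStatus();
+  }
+  // ... otherwise produce an output packet as usual.
+  return absl::OkStatus();
+}
+```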
@@ -83,12 +82,12 @@ For example, `Timestamp(1).NextAllowedInStream() == Timestamp(2)`.
## Propagating timestamp bounds
-Calculators that will be used in realtime graphs need to define output timestamp
-bounds based on input timestamp bounds in order to allow downstream calculators
-to be scheduled promptly. A common pattern is for calculators to output packets
-with the same timestamps as their input packets. In this case, simply outputting
-a packet on every call to `Calculator::Process` is sufficient to define output
-timestamp bounds.
+Calculators that will be used in real-time graphs need to define output
+timestamp bounds based on input timestamp bounds in order to allow downstream
+calculators to be scheduled promptly. A common pattern is for calculators to
+output packets with the same timestamps as their input packets. In this case,
+simply outputting a packet on every call to `Calculator::Process` is sufficient
+to define output timestamp bounds.
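+
+A minimal sketch of that pattern (not from the MediaPipe sources; the
+calculator name is hypothetical) could look like this:
+
+```c++
+// Sketch: forwarding each input packet at its input timestamp is enough
+// to define the output timestamp bound for downstream calculators.
+absl::Status MyPassThroughCalculator::Process(CalculatorContext* cc) {
+  cc->Outputs().Index(0).AddPacket(
+      cc->Inputs().Index(0).Value().At(cc->InputTimestamp()));
+  return absl::OkStatus();
+}
+```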
However, calculators are not required to follow this common pattern for output
timestamps, they are only required to choose monotonically increasing output
diff --git a/docs/getting_started/android.md b/docs/getting_started/android.md
index 71224a258..c3c6506ee 100644
--- a/docs/getting_started/android.md
+++ b/docs/getting_started/android.md
@@ -16,12 +16,14 @@ nav_order: 1
Please follow instructions below to build Android example apps in the supported
MediaPipe [solutions](../solutions/solutions.md). To learn more about these
-example apps, start from [Hello World! on Android](./hello_world_android.md). To
-incorporate MediaPipe into an existing Android Studio project, see these
-[instructions](./android_archive_library.md) that use Android Archive (AAR) and
-Gradle.
+example apps, start from [Hello World! on Android](./hello_world_android.md).
-## Building Android example apps
+To incorporate MediaPipe into Android Studio projects, see these
+[instructions](./android_solutions.md) for using the MediaPipe Android Solution
+APIs (currently in alpha), which are now available in
+[Google's Maven Repository](https://maven.google.com/web/index.html?#com.google.mediapipe).
+
+## Building Android example apps with Bazel
### Prerequisite
@@ -51,16 +53,6 @@ $YOUR_INTENDED_API_LEVEL` in android_ndk_repository() and/or
android_sdk_repository() in the
[`WORKSPACE`](https://github.com/google/mediapipe/blob/master/WORKSPACE) file.
-Please verify all the necessary packages are installed.
-
-* Android SDK Platform API Level 28 or 29
-* Android SDK Build-Tools 28 or 29
-* Android SDK Platform-Tools 28 or 29
-* Android SDK Tools 26.1.1
-* Android NDK 19c or above
-
-### Option 1: Build with Bazel in Command Line
-
Tip: You can run this
[script](https://github.com/google/mediapipe/blob/master/build_android_examples.sh)
to build (and install) all MediaPipe Android example apps.
@@ -84,108 +76,3 @@ to build (and install) all MediaPipe Android example apps.
```bash
adb install bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/handtrackinggpu/handtrackinggpu.apk
```
-
-### Option 2: Build with Bazel in Android Studio
-
-The MediaPipe project can be imported into Android Studio using the Bazel
-plugins. This allows the MediaPipe examples to be built and modified in Android
-Studio.
-
-To incorporate MediaPipe into an existing Android Studio project, see these
-[instructions](./android_archive_library.md) that use Android Archive (AAR) and
-Gradle.
-
-The steps below use Android Studio 3.5 to build and install a MediaPipe example
-app:
-
-1. Install and launch Android Studio 3.5.
-
-2. Select `Configure` -> `SDK Manager` -> `SDK Platforms`.
-
- * Verify that Android SDK Platform API Level 28 or 29 is installed.
- * Take note of the Android SDK Location, e.g.,
- `/usr/local/home/Android/Sdk`.
-
-3. Select `Configure` -> `SDK Manager` -> `SDK Tools`.
-
- * Verify that Android SDK Build-Tools 28 or 29 is installed.
- * Verify that Android SDK Platform-Tools 28 or 29 is installed.
- * Verify that Android SDK Tools 26.1.1 is installed.
- * Verify that Android NDK 19c or above is installed.
- * Take note of the Android NDK Location, e.g.,
- `/usr/local/home/Android/Sdk/ndk-bundle` or
- `/usr/local/home/Android/Sdk/ndk/20.0.5594570`.
-
-4. Set environment variables `$ANDROID_HOME` and `$ANDROID_NDK_HOME` to point
- to the installed SDK and NDK.
-
- ```bash
- export ANDROID_HOME=/usr/local/home/Android/Sdk
-
- # If the NDK libraries are installed by a previous version of Android Studio, do
- export ANDROID_NDK_HOME=/usr/local/home/Android/Sdk/ndk-bundle
- # If the NDK libraries are installed by Android Studio 3.5, do
- export ANDROID_NDK_HOME=/usr/local/home/Android/Sdk/ndk/
- ```
-
-5. Select `Configure` -> `Plugins` to install `Bazel`.
-
-6. On Linux, select `File` -> `Settings` -> `Bazel settings`. On macos, select
- `Android Studio` -> `Preferences` -> `Bazel settings`. Then, modify `Bazel
- binary location` to be the same as the output of `$ which bazel`.
-
-7. Select `Import Bazel Project`.
-
- * Select `Workspace`: `/path/to/mediapipe` and select `Next`.
- * Select `Generate from BUILD file`: `/path/to/mediapipe/BUILD` and select
- `Next`.
- * Modify `Project View` to be the following and select `Finish`.
-
- ```
- directories:
- # read project settings, e.g., .bazelrc
- .
- -mediapipe/objc
- -mediapipe/examples/ios
-
- targets:
- //mediapipe/examples/android/...:all
- //mediapipe/java/...:all
-
- android_sdk_platform: android-29
-
- sync_flags:
- --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
- ```
-
-8. Select `Bazel` -> `Sync` -> `Sync project with Build files`.
-
- Note: Even after doing step 4, if you still see the error: `"no such package
- '@androidsdk//': Either the path attribute of android_sdk_repository or the
- ANDROID_HOME environment variable must be set."`, please modify the
- [`WORKSPACE`](https://github.com/google/mediapipe/blob/master/WORKSPACE)
- file to point to your SDK and NDK library locations, as below:
-
- ```
- android_sdk_repository(
- name = "androidsdk",
- path = "/path/to/android/sdk"
- )
-
- android_ndk_repository(
- name = "androidndk",
- path = "/path/to/android/ndk"
- )
- ```
-
-9. Connect an Android device to the workstation.
-
-10. Select `Run...` -> `Edit Configurations...`.
-
- * Select `Templates` -> `Bazel Command`.
- * Enter Target Expression:
- `//mediapipe/examples/android/src/java/com/google/mediapipe/apps/handtrackinggpu:handtrackinggpu`
- * Enter Bazel command: `mobile-install`.
- * Enter Bazel flags: `-c opt --config=android_arm64`.
- * Press the `[+]` button to add the new configuration.
- * Select `Run` to run the example app on the connected Android device.
diff --git a/docs/getting_started/android_archive_library.md b/docs/getting_started/android_archive_library.md
index 2c2ca99f3..d2f25213f 100644
--- a/docs/getting_started/android_archive_library.md
+++ b/docs/getting_started/android_archive_library.md
@@ -3,7 +3,7 @@ layout: default
title: MediaPipe Android Archive
parent: MediaPipe on Android
grand_parent: Getting Started
-nav_order: 2
+nav_order: 3
---
# MediaPipe Android Archive
@@ -92,12 +92,12 @@ each project.
and copy
[the binary graph](https://github.com/google/mediapipe/blob/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/facedetectiongpu/BUILD#L41)
and
- [the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite).
+ [the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite).
```bash
bazel build -c opt mediapipe/graphs/face_detection:face_detection_mobile_gpu_binary_graph
cp bazel-bin/mediapipe/graphs/face_detection/face_detection_mobile_gpu.binarypb /path/to/your/app/src/main/assets/
- cp mediapipe/modules/face_detection/face_detection_front.tflite /path/to/your/app/src/main/assets/
+ cp mediapipe/modules/face_detection/face_detection_short_range.tflite /path/to/your/app/src/main/assets/
```

@@ -113,10 +113,9 @@ each project.
androidTestImplementation 'androidx.test.ext:junit:1.1.0'
androidTestImplementation 'androidx.test.espresso:espresso-core:3.1.1'
// MediaPipe deps
- implementation 'com.google.flogger:flogger:0.3.1'
- implementation 'com.google.flogger:flogger-system-backend:0.3.1'
- implementation 'com.google.code.findbugs:jsr305:3.0.2'
- implementation 'com.google.guava:guava:27.0.1-android'
+ implementation 'com.google.flogger:flogger:latest.release'
+ implementation 'com.google.flogger:flogger-system-backend:latest.release'
+ implementation 'com.google.code.findbugs:jsr305:latest.release'
implementation 'com.google.guava:guava:27.0.1-android'
implementation 'com.google.protobuf:protobuf-java:3.11.4'
// CameraX core library
@@ -125,7 +124,7 @@ each project.
implementation "androidx.camera:camera-camera2:$camerax_version"
implementation "androidx.camera:camera-lifecycle:$camerax_version"
// AutoValue
- def auto_value_version = "1.6.4"
+ def auto_value_version = "1.8.1"
implementation "com.google.auto.value:auto-value-annotations:$auto_value_version"
annotationProcessor "com.google.auto.value:auto-value:$auto_value_version"
}
diff --git a/docs/getting_started/android_solutions.md b/docs/getting_started/android_solutions.md
new file mode 100644
index 000000000..de7135c18
--- /dev/null
+++ b/docs/getting_started/android_solutions.md
@@ -0,0 +1,79 @@
+---
+layout: default
+title: Android Solutions
+parent: MediaPipe on Android
+grand_parent: Getting Started
+nav_order: 2
+---
+
+# Android Solution APIs
+{: .no_toc }
+
+1. TOC
+{:toc}
+---
+
+Please follow instructions below to use the MediaPipe Solution APIs in Android
+Studio projects and build the Android example apps in the supported MediaPipe
+[solutions](../solutions/solutions.md).
+
+## Integrate MediaPipe Android Solutions in Android Studio
+
+MediaPipe Android Solution APIs (currently in alpha) are now available in
+[Google's Maven Repository](https://maven.google.com/web/index.html?#com.google.mediapipe).
+To incorporate MediaPipe Android Solutions into an Android Studio project, add
+the following to the project's Gradle dependencies:
+
+```
+dependencies {
+ // MediaPipe solution-core is the foundation of any MediaPipe solutions.
+ implementation 'com.google.mediapipe:solution-core:latest.release'
+ // Optional: MediaPipe Hands solution.
+ implementation 'com.google.mediapipe:hands:latest.release'
+ // Optional: MediaPipe FaceMesh solution.
+ implementation 'com.google.mediapipe:facemesh:latest.release'
+ // MediaPipe deps
+ implementation 'com.google.flogger:flogger:latest.release'
+ implementation 'com.google.flogger:flogger-system-backend:latest.release'
+ implementation 'com.google.guava:guava:27.0.1-android'
+ implementation 'com.google.protobuf:protobuf-java:3.11.4'
+ // CameraX core library
+ def camerax_version = "1.0.0-beta10"
+ implementation "androidx.camera:camera-core:$camerax_version"
+ implementation "androidx.camera:camera-camera2:$camerax_version"
+ implementation "androidx.camera:camera-lifecycle:$camerax_version"
+}
+```
+
+See the detailed solutions API usage examples for different use cases in the
+solution example apps'
+[source code](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions).
+If the prebuilt Maven packages are not sufficient, build the MediaPipe Android
+archive library locally by following these
+[instructions](./android_archive_library.md).
+
+## Build solution example apps in Android Studio
+
+1. Open Android Studio Arctic Fox on Linux, macOS, or Windows.
+
+2. Import the mediapipe/examples/android/solutions directory into Android Studio.
+
+ 
+
+3. For Windows users, run `create_win_symlinks.bat` as administrator to create
+ res directory symlinks.
+
+ 
+
+4. Select "File" -> "Sync Project with Gradle Files" to sync the project.
+
+5. Run the solution example app in Android Studio.
+
+ 
+
+6. (Optional) Run solutions on CPU.
+
+   MediaPipe solution example apps run the pipeline and the model inference on
+   GPU by default. If needed, for example to run the apps on the Android
+   Emulator, set the `RUN_ON_GPU` boolean variable to `false` in the app's
+   `MainActivity.java` to run the pipeline and the model inference on CPU.
diff --git a/docs/getting_started/hello_world_android.md b/docs/getting_started/hello_world_android.md
index 9f277f799..6674d4023 100644
--- a/docs/getting_started/hello_world_android.md
+++ b/docs/getting_started/hello_world_android.md
@@ -31,8 +31,8 @@ stream on an Android device.
## Setup
-1. Install MediaPipe on your system, see [MediaPipe installation guide] for
- details.
+1. Install MediaPipe on your system, see
+ [MediaPipe installation guide](./install.md) for details.
2. Install Android Development SDK and Android NDK. See how to do so also in
[MediaPipe installation guide].
3. Enable [developer options] on your Android device.
@@ -770,7 +770,6 @@ If you ran into any issues, please see the full code of the tutorial
[`ExternalTextureConverter`]:https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/ExternalTextureConverter.java
[`FrameLayout`]:https://developer.android.com/reference/android/widget/FrameLayout
[`FrameProcessor`]:https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/FrameProcessor.java
-[MediaPipe installation guide]:./install.md
[`PermissionHelper`]: https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/PermissionHelper.java
[`SurfaceHolder.Callback`]:https://developer.android.com/reference/android/view/SurfaceHolder.Callback.html
[`SurfaceView`]:https://developer.android.com/reference/android/view/SurfaceView
diff --git a/docs/getting_started/hello_world_ios.md b/docs/getting_started/hello_world_ios.md
index 06d79c67d..4591b5f33 100644
--- a/docs/getting_started/hello_world_ios.md
+++ b/docs/getting_started/hello_world_ios.md
@@ -31,8 +31,8 @@ stream on an iOS device.
## Setup
-1. Install MediaPipe on your system, see [MediaPipe installation guide] for
- details.
+1. Install MediaPipe on your system, see
+ [MediaPipe installation guide](./install.md) for details.
2. Setup your iOS device for development.
3. Setup [Bazel] on your system to build and deploy the iOS app.
@@ -113,6 +113,10 @@ bazel to build the iOS application. The content of the
5. `Main.storyboard` and `Launch.storyboard`
6. `Assets.xcassets` directory.
+Note: In newer versions of Xcode, you may see additional files `SceneDelegate.h`
+and `SceneDelegate.m`. Make sure to copy them too and add them to the `BUILD`
+file mentioned below.
+
Copy these files to a directory named `HelloWorld` to a location that can access
the MediaPipe source code. For example, the source code of the application that
we will build in this tutorial is located in
@@ -247,6 +251,12 @@ We need to get frames from the `_cameraSource` into our application
`MPPInputSourceDelegate`. So our application `ViewController` can be a delegate
of `_cameraSource`.
+Update the interface definition of `ViewController` accordingly:
+
+```
+@interface ViewController () <MPPInputSourceDelegate>
+```
+
To handle camera setup and process incoming frames, we should use a queue
different from the main queue. Add the following to the implementation block of
the `ViewController`:
@@ -288,6 +298,12 @@ utility called `MPPLayerRenderer` to display images on the screen. This utility
can be used to display `CVPixelBufferRef` objects, which is the type of the
images provided by `MPPCameraInputSource` to its delegates.
+In `ViewController.m`, add the following import line:
+
+```
+#import "mediapipe/objc/MPPLayerRenderer.h"
+```
+
To display images of the screen, we need to add a new `UIView` object called
`_liveView` to the `ViewController`.
@@ -411,6 +427,12 @@ Objective-C++.
### Use the graph in `ViewController`
+In `ViewController.m`, add the following import line:
+
+```
+#import "mediapipe/objc/MPPGraph.h"
+```
+
Declare a static constant with the name of the graph, the input stream and the
output stream:
@@ -549,6 +571,12 @@ method to receive packets on this output stream and display them on the screen:
}
```
+Update the interface definition of `ViewController` with `MPPGraphDelegate`:
+
+```
+@interface ViewController () <MPPGraphDelegate, MPPInputSourceDelegate>
+```
+
And that is all! Build and run the app on your iOS device. You should see the
results of running the edge detection graph on a live video feed. Congrats!
@@ -560,6 +588,5 @@ appropriate `BUILD` file dependencies for the edge detection graph.
[Bazel]:https://bazel.build/
[`edge_detection_mobile_gpu.pbtxt`]:https://github.com/google/mediapipe/tree/master/mediapipe/graphs/edge_detection/edge_detection_mobile_gpu.pbtxt
-[MediaPipe installation guide]:./install.md
-[common]:(https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/common)
-[helloworld]:(https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/helloworld)
+[common]:https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/common
+[helloworld]:https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/helloworld
diff --git a/docs/getting_started/install.md b/docs/getting_started/install.md
index 95dce1d17..bb2539d33 100644
--- a/docs/getting_started/install.md
+++ b/docs/getting_started/install.md
@@ -43,104 +43,189 @@ install --user six`.
3. Install OpenCV and FFmpeg.
- Option 1. Use package manager tool to install the pre-compiled OpenCV
- libraries. FFmpeg will be installed via libopencv-video-dev.
+    **Option 1**. Use a package manager to install the pre-compiled OpenCV
+    libraries. FFmpeg will be installed via `libopencv-video-dev`.
- Note: Debian 9 and Ubuntu 16.04 provide OpenCV 2.4.9. You may want to take
- option 2 or 3 to install OpenCV 3 or above.
+ OS | OpenCV
+ -------------------- | ------
+ Debian 9 (stretch) | 2.4
+ Debian 10 (buster) | 3.2
+ Debian 11 (bullseye) | 4.5
+ Ubuntu 16.04 LTS | 2.4
+ Ubuntu 18.04 LTS | 3.2
+ Ubuntu 20.04 LTS | 4.2
+ Ubuntu 21.04 | 4.5
```bash
- $ sudo apt-get install libopencv-core-dev libopencv-highgui-dev \
- libopencv-calib3d-dev libopencv-features2d-dev \
- libopencv-imgproc-dev libopencv-video-dev
+ $ sudo apt-get install -y \
+ libopencv-core-dev \
+ libopencv-highgui-dev \
+ libopencv-calib3d-dev \
+ libopencv-features2d-dev \
+ libopencv-imgproc-dev \
+ libopencv-video-dev
```
- Debian 9 and Ubuntu 18.04 install the packages in
- `/usr/lib/x86_64-linux-gnu`. MediaPipe's [`opencv_linux.BUILD`] and
- [`ffmpeg_linux.BUILD`] are configured for this library path. Ubuntu 20.04
- may install the OpenCV and FFmpeg packages in `/usr/local`, Please follow
- the option 3 below to modify the [`WORKSPACE`], [`opencv_linux.BUILD`] and
- [`ffmpeg_linux.BUILD`] files accordingly.
-
- Moreover, for Nvidia Jetson and Raspberry Pi devices with ARM Ubuntu, the
- library path needs to be modified like the following:
+ MediaPipe's [`opencv_linux.BUILD`] and [`WORKSPACE`] are already configured
+ for OpenCV 2/3 and should work correctly on any architecture:
```bash
- sed -i "s/x86_64-linux-gnu/aarch64-linux-gnu/g" third_party/opencv_linux.BUILD
+ # WORKSPACE
+ new_local_repository(
+ name = "linux_opencv",
+ build_file = "@//third_party:opencv_linux.BUILD",
+ path = "/usr",
+ )
+
+ # opencv_linux.BUILD for OpenCV 2/3 installed from Debian package
+ cc_library(
+ name = "opencv",
+ linkopts = [
+ "-l:libopencv_core.so",
+ "-l:libopencv_calib3d.so",
+ "-l:libopencv_features2d.so",
+ "-l:libopencv_highgui.so",
+ "-l:libopencv_imgcodecs.so",
+ "-l:libopencv_imgproc.so",
+ "-l:libopencv_video.so",
+ "-l:libopencv_videoio.so",
+ ],
+ )
```
- Option 2. Run [`setup_opencv.sh`] to automatically build OpenCV from source
- and modify MediaPipe's OpenCV config.
+    For OpenCV 4 you need to modify [`opencv_linux.BUILD`] taking into account
+    the current architecture:
- Option 3. Follow OpenCV's
+ ```bash
+ # WORKSPACE
+ new_local_repository(
+ name = "linux_opencv",
+ build_file = "@//third_party:opencv_linux.BUILD",
+ path = "/usr",
+ )
+
+ # opencv_linux.BUILD for OpenCV 4 installed from Debian package
+ cc_library(
+ name = "opencv",
+ hdrs = glob([
+ # Uncomment according to your multiarch value (gcc -print-multiarch):
+ # "include/aarch64-linux-gnu/opencv4/opencv2/cvconfig.h",
+ # "include/arm-linux-gnueabihf/opencv4/opencv2/cvconfig.h",
+ # "include/x86_64-linux-gnu/opencv4/opencv2/cvconfig.h",
+ "include/opencv4/opencv2/**/*.h*",
+ ]),
+ includes = [
+ # Uncomment according to your multiarch value (gcc -print-multiarch):
+ # "include/aarch64-linux-gnu/opencv4/",
+ # "include/arm-linux-gnueabihf/opencv4/",
+ # "include/x86_64-linux-gnu/opencv4/",
+ "include/opencv4/",
+ ],
+ linkopts = [
+ "-l:libopencv_core.so",
+ "-l:libopencv_calib3d.so",
+ "-l:libopencv_features2d.so",
+ "-l:libopencv_highgui.so",
+ "-l:libopencv_imgcodecs.so",
+ "-l:libopencv_imgproc.so",
+ "-l:libopencv_video.so",
+ "-l:libopencv_videoio.so",
+ ],
+ )
+ ```
+
+ **Option 2**. Run [`setup_opencv.sh`] to automatically build OpenCV from
+ source and modify MediaPipe's OpenCV config. This option will do all steps
+ defined in Option 3 automatically.
+
+ **Option 3**. Follow OpenCV's
[documentation](https://docs.opencv.org/3.4.6/d7/d9f/tutorial_linux_install.html)
to manually build OpenCV from source code.
- Note: You may need to modify [`WORKSPACE`], [`opencv_linux.BUILD`] and
- [`ffmpeg_linux.BUILD`] to point MediaPipe to your own OpenCV and FFmpeg
- libraries. For example if OpenCV and FFmpeg are both manually installed in
- "/usr/local/", you will need to update: (1) the "linux_opencv" and
- "linux_ffmpeg" new_local_repository rules in [`WORKSPACE`], (2) the "opencv"
- cc_library rule in [`opencv_linux.BUILD`], and (3) the "libffmpeg"
- cc_library rule in [`ffmpeg_linux.BUILD`]. These 3 changes are shown below:
+ You may need to modify [`WORKSPACE`] and [`opencv_linux.BUILD`] to point
+    MediaPipe to your own OpenCV libraries. The examples below assume OpenCV is
+    installed to `/usr/local/`, which is the recommended default.
+
+ OpenCV 2/3 setup:
```bash
+ # WORKSPACE
new_local_repository(
- name = "linux_opencv",
- build_file = "@//third_party:opencv_linux.BUILD",
- path = "/usr/local",
+ name = "linux_opencv",
+ build_file = "@//third_party:opencv_linux.BUILD",
+ path = "/usr/local",
)
+ # opencv_linux.BUILD for OpenCV 2/3 installed to /usr/local
+ cc_library(
+ name = "opencv",
+ linkopts = [
+ "-L/usr/local/lib",
+ "-l:libopencv_core.so",
+ "-l:libopencv_calib3d.so",
+ "-l:libopencv_features2d.so",
+ "-l:libopencv_highgui.so",
+ "-l:libopencv_imgcodecs.so",
+ "-l:libopencv_imgproc.so",
+ "-l:libopencv_video.so",
+ "-l:libopencv_videoio.so",
+ ],
+ )
+ ```
+
+ OpenCV 4 setup:
+
+ ```bash
+ # WORKSPACE
new_local_repository(
- name = "linux_ffmpeg",
- build_file = "@//third_party:ffmpeg_linux.BUILD",
- path = "/usr/local",
+ name = "linux_opencv",
+ build_file = "@//third_party:opencv_linux.BUILD",
+ path = "/usr/local",
)
+ # opencv_linux.BUILD for OpenCV 4 installed to /usr/local
cc_library(
- name = "opencv",
- srcs = glob(
- [
- "lib/libopencv_core.so",
- "lib/libopencv_highgui.so",
- "lib/libopencv_imgcodecs.so",
- "lib/libopencv_imgproc.so",
- "lib/libopencv_video.so",
- "lib/libopencv_videoio.so",
- ],
- ),
- hdrs = glob([
- # For OpenCV 3.x
- "include/opencv2/**/*.h*",
- # For OpenCV 4.x
- # "include/opencv4/opencv2/**/*.h*",
- ]),
- includes = [
- # For OpenCV 3.x
- "include/",
- # For OpenCV 4.x
- # "include/opencv4/",
- ],
- linkstatic = 1,
- visibility = ["//visibility:public"],
+ name = "opencv",
+ hdrs = glob([
+ "include/opencv4/opencv2/**/*.h*",
+ ]),
+ includes = [
+ "include/opencv4/",
+ ],
+ linkopts = [
+ "-L/usr/local/lib",
+ "-l:libopencv_core.so",
+ "-l:libopencv_calib3d.so",
+ "-l:libopencv_features2d.so",
+ "-l:libopencv_highgui.so",
+ "-l:libopencv_imgcodecs.so",
+ "-l:libopencv_imgproc.so",
+ "-l:libopencv_video.so",
+ "-l:libopencv_videoio.so",
+ ],
+ )
+ ```
+
+    The current FFmpeg setup is defined in [`ffmpeg_linux.BUILD`] and should
+    work for any architecture:
+
+ ```bash
+ # WORKSPACE
+ new_local_repository(
+ name = "linux_ffmpeg",
+ build_file = "@//third_party:ffmpeg_linux.BUILD",
+ path = "/usr"
)
+ # ffmpeg_linux.BUILD for FFmpeg installed from Debian package
cc_library(
- name = "libffmpeg",
- srcs = glob(
- [
- "lib/libav*.so",
- ],
- ),
- hdrs = glob(["include/libav*/*.h"]),
- includes = ["include"],
- linkopts = [
- "-lavcodec",
- "-lavformat",
- "-lavutil",
- ],
- linkstatic = 1,
- visibility = ["//visibility:public"],
+ name = "libffmpeg",
+ linkopts = [
+ "-l:libavcodec.so",
+ "-l:libavformat.so",
+ "-l:libavutil.so",
+ ],
)
```
@@ -711,7 +796,7 @@ This will use a Docker image that will isolate mediapipe's installation from the
```bash
$ docker run -it --name mediapipe mediapipe:latest
- root@bca08b91ff63:/mediapipe# GLOG_logtostderr=1 bazel run --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/hello_world:hello_world
+ root@bca08b91ff63:/mediapipe# GLOG_logtostderr=1 bazelisk run --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/hello_world:hello_world
# Should print:
# Hello World!
diff --git a/docs/getting_started/javascript.md b/docs/getting_started/javascript.md
index 0c49e1dd4..f56abcd6e 100644
--- a/docs/getting_started/javascript.md
+++ b/docs/getting_started/javascript.md
@@ -16,17 +16,29 @@ nav_order: 4
MediaPipe currently offers the following solutions:
-Solution | NPM Package | Example
------------------ | ----------------------------- | -------
-[Face Mesh][F-pg] | [@mediapipe/face_mesh][F-npm] | [mediapipe.dev/demo/face_mesh][F-demo]
-[Face Detection][Fd-pg] | [@mediapipe/face_detection][Fd-npm] | [mediapipe.dev/demo/face_detection][Fd-demo]
-[Hands][H-pg] | [@mediapipe/hands][H-npm] | [mediapipe.dev/demo/hands][H-demo]
-[Holistic][Ho-pg] | [@mediapipe/holistic][Ho-npm] | [mediapipe.dev/demo/holistic][Ho-demo]
-[Pose][P-pg] | [@mediapipe/pose][P-npm] | [mediapipe.dev/demo/pose][P-demo]
+Solution | NPM Package | Example
+--------------------------- | --------------------------------------- | -------
+[Face Mesh][F-pg] | [@mediapipe/face_mesh][F-npm] | [mediapipe.dev/demo/face_mesh][F-demo]
+[Face Detection][Fd-pg] | [@mediapipe/face_detection][Fd-npm] | [mediapipe.dev/demo/face_detection][Fd-demo]
+[Hands][H-pg] | [@mediapipe/hands][H-npm] | [mediapipe.dev/demo/hands][H-demo]
+[Holistic][Ho-pg] | [@mediapipe/holistic][Ho-npm] | [mediapipe.dev/demo/holistic][Ho-demo]
+[Objectron][Ob-pg] | [@mediapipe/objectron][Ob-npm] | [mediapipe.dev/demo/objectron][Ob-demo]
+[Pose][P-pg] | [@mediapipe/pose][P-npm] | [mediapipe.dev/demo/pose][P-demo]
+[Selfie Segmentation][S-pg] | [@mediapipe/selfie_segmentation][S-npm] | [mediapipe.dev/demo/selfie_segmentation][S-demo]
Click on a solution link above for more information, including API and code
snippets.
+### Supported platforms
+
+| Browser | Platform | Notes |
+| ------- | ----------------------- | -------------------------------------- |
+| Chrome | Android / Windows / Mac | Pixel 4 and older unsupported. Fuchsia |
+| | | unsupported. |
+| Chrome | iOS | Camera unavailable in Chrome on iOS. |
+| Safari | iPad/iPhone/Mac | iOS and Safari on iPad / iPhone / |
+| | | MacBook |
+
The quickest way to get acclimated is to look at the examples above. Each demo
has a link to a [CodePen][codepen] so that you can edit the code and try it
yourself. We have included a number of utility packages to help you get started:
@@ -66,29 +78,25 @@ affecting your work, restrict your request to a `` number. e.g.,
[F-pg]: ../solutions/face_mesh#javascript-solution-api
[Fd-pg]: ../solutions/face_detection#javascript-solution-api
[H-pg]: ../solutions/hands#javascript-solution-api
+[Ob-pg]: ../solutions/objectron#javascript-solution-api
[P-pg]: ../solutions/pose#javascript-solution-api
+[S-pg]: ../solutions/selfie_segmentation#javascript-solution-api
[Ho-npm]: https://www.npmjs.com/package/@mediapipe/holistic
[F-npm]: https://www.npmjs.com/package/@mediapipe/face_mesh
[Fd-npm]: https://www.npmjs.com/package/@mediapipe/face_detection
[H-npm]: https://www.npmjs.com/package/@mediapipe/hands
+[Ob-npm]: https://www.npmjs.com/package/@mediapipe/objectron
[P-npm]: https://www.npmjs.com/package/@mediapipe/pose
+[S-npm]: https://www.npmjs.com/package/@mediapipe/selfie_segmentation
[draw-npm]: https://www.npmjs.com/package/@mediapipe/drawing_utils
[cam-npm]: https://www.npmjs.com/package/@mediapipe/camera_utils
[ctrl-npm]: https://www.npmjs.com/package/@mediapipe/control_utils
-[Ho-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/holistic
-[F-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/face_mesh
-[Fd-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/face_detection
-[H-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/hands
-[P-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/pose
-[Ho-pen]: https://code.mediapipe.dev/codepen/holistic
-[F-pen]: https://code.mediapipe.dev/codepen/face_mesh
-[Fd-pen]: https://code.mediapipe.dev/codepen/face_detection
-[H-pen]: https://code.mediapipe.dev/codepen/hands
-[P-pen]: https://code.mediapipe.dev/codepen/pose
[Ho-demo]: https://mediapipe.dev/demo/holistic
[F-demo]: https://mediapipe.dev/demo/face_mesh
[Fd-demo]: https://mediapipe.dev/demo/face_detection
[H-demo]: https://mediapipe.dev/demo/hands
+[Ob-demo]: https://mediapipe.dev/demo/objectron
[P-demo]: https://mediapipe.dev/demo/pose
+[S-demo]: https://mediapipe.dev/demo/selfie_segmentation
[npm]: https://www.npmjs.com/package/@mediapipe
[codepen]: https://code.mediapipe.dev/codepen
diff --git a/docs/getting_started/python.md b/docs/getting_started/python.md
index d59f35bbf..83550be84 100644
--- a/docs/getting_started/python.md
+++ b/docs/getting_started/python.md
@@ -51,6 +51,7 @@ details in each solution via the links below:
* [MediaPipe Holistic](../solutions/holistic#python-solution-api)
* [MediaPipe Objectron](../solutions/objectron#python-solution-api)
* [MediaPipe Pose](../solutions/pose#python-solution-api)
+* [MediaPipe Selfie Segmentation](../solutions/selfie_segmentation#python-solution-api)
## MediaPipe on Google Colab
@@ -62,6 +63,7 @@ details in each solution via the links below:
* [MediaPipe Pose Colab](https://mediapipe.page.link/pose_py_colab)
* [MediaPipe Pose Classification Colab (Basic)](https://mediapipe.page.link/pose_classification_basic)
* [MediaPipe Pose Classification Colab (Extended)](https://mediapipe.page.link/pose_classification_extended)
+* [MediaPipe Selfie Segmentation Colab](https://mediapipe.page.link/selfie_segmentation_py_colab)
## MediaPipe Python Framework
diff --git a/docs/getting_started/python_framework.md b/docs/getting_started/python_framework.md
index ece14bc91..688285d87 100644
--- a/docs/getting_started/python_framework.md
+++ b/docs/getting_started/python_framework.md
@@ -74,7 +74,7 @@ Mapping\[str, Packet\] | std::map | create_st
np.ndarray (cv.mat and PIL.Image) | mp::ImageFrame | create_image_frame( format=ImageFormat.SRGB, data=mat) | get_image_frame(packet)
np.ndarray | mp::Matrix | create_matrix(data) | get_matrix(packet)
Google Proto Message | Google Proto Message | create_proto(proto) | get_proto(packet)
-List\[Proto\] | std::vector\<Proto\> | create_proto_vector(proto_list) | get_proto_list(packet)
+List\[Proto\] | std::vector\<Proto\> | n/a | get_proto_list(packet)
It's not uncommon that users create custom C++ classes and send those into
the graphs and calculators. To allow the custom classes to be used in Python
diff --git a/docs/images/import_mp_android_studio_project.png b/docs/images/import_mp_android_studio_project.png
new file mode 100644
index 000000000..aa02b95ce
Binary files /dev/null and b/docs/images/import_mp_android_studio_project.png differ
diff --git a/docs/images/mobile/pose_segmentation.mp4 b/docs/images/mobile/pose_segmentation.mp4
new file mode 100644
index 000000000..e0a68da70
Binary files /dev/null and b/docs/images/mobile/pose_segmentation.mp4 differ
diff --git a/docs/images/mobile/pose_tracking_pck_chart.png b/docs/images/mobile/pose_tracking_pck_chart.png
index 8b781e630..1fa4bf97d 100644
Binary files a/docs/images/mobile/pose_tracking_pck_chart.png and b/docs/images/mobile/pose_tracking_pck_chart.png differ
diff --git a/docs/images/mobile/pose_world_landmarks.mp4 b/docs/images/mobile/pose_world_landmarks.mp4
new file mode 100644
index 000000000..4a5bf3016
Binary files /dev/null and b/docs/images/mobile/pose_world_landmarks.mp4 differ
diff --git a/docs/images/run_android_solution_app.png b/docs/images/run_android_solution_app.png
new file mode 100644
index 000000000..aa21f3c24
Binary files /dev/null and b/docs/images/run_android_solution_app.png differ
diff --git a/docs/images/run_create_win_symlinks.png b/docs/images/run_create_win_symlinks.png
new file mode 100644
index 000000000..69b94b75f
Binary files /dev/null and b/docs/images/run_create_win_symlinks.png differ
diff --git a/docs/images/selfie_segmentation_web.mp4 b/docs/images/selfie_segmentation_web.mp4
new file mode 100644
index 000000000..d9e62838e
Binary files /dev/null and b/docs/images/selfie_segmentation_web.mp4 differ
diff --git a/docs/index.md b/docs/index.md
index 9035bf106..86d6ddc5e 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -40,11 +40,12 @@ Hair Segmentation
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
+[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
-[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
+[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
@@ -54,46 +55,22 @@ See also
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
for ML models released in MediaPipe.
-## MediaPipe in Python
-
-MediaPipe offers customizable Python solutions as a prebuilt Python package on
-[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
-`pip install mediapipe`. It also provides tools for users to build their own
-solutions. Please see
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
-for more info.
-
-## MediaPipe on the Web
-
-MediaPipe on the Web is an effort to run the same ML solutions built for mobile
-and desktop also in web browsers. The official API is under construction, but
-the core technology has been proven effective. Please see
-[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
-in Google Developers Blog for details.
-
-You can use the following links to load a demo in the MediaPipe Visualizer, and
-over there click the "Runner" icon in the top bar like shown below. The demos
-use your webcam video as input, which is processed all locally in real-time and
-never leaves your device.
-
-
-
-* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
-* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
-* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
-* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
-* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
-* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
-* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
-
## Getting started
-Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
-MediaPipe and
-[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
-and start exploring our ready-to-use
-[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
-further extend and customize.
+To start using MediaPipe
+[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
+lines of code, see example code and demos in
+[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
+[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+
+To use MediaPipe in C++, Android and iOS, which allow further customization of
+the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
+building your own, learn how to
+[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
+start building example applications in
+[C++](https://google.github.io/mediapipe/getting_started/cpp),
+[Android](https://google.github.io/mediapipe/getting_started/android) and
+[iOS](https://google.github.io/mediapipe/getting_started/ios).
The source code is hosted in the
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
@@ -102,6 +79,13 @@ run code search using
## Publications
+* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
+ in Google Developers Blog
+* [Prosthesis control via Mirru App using MediaPipe hand tracking](https://developers.googleblog.com/2021/05/control-your-mirru-prosthesis-with-mediapipe-hand-tracking.html)
+ in Google Developers Blog
+* [SignAll SDK: Sign language interface using MediaPipe is now available for
+ developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
+ in Google Developers Blog
* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
in Google AI Blog
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
diff --git a/docs/solutions/autoflip.md b/docs/solutions/autoflip.md
index 0e118cc55..676abcae8 100644
--- a/docs/solutions/autoflip.md
+++ b/docs/solutions/autoflip.md
@@ -2,7 +2,7 @@
layout: default
title: AutoFlip (Saliency-aware Video Cropping)
parent: Solutions
-nav_order: 13
+nav_order: 14
---
# AutoFlip: Saliency-aware Video Cropping
diff --git a/docs/solutions/box_tracking.md b/docs/solutions/box_tracking.md
index 0e7550e7f..b84a015d1 100644
--- a/docs/solutions/box_tracking.md
+++ b/docs/solutions/box_tracking.md
@@ -2,7 +2,7 @@
layout: default
title: Box Tracking
parent: Solutions
-nav_order: 9
+nav_order: 10
---
# MediaPipe Box Tracking
diff --git a/docs/solutions/face_detection.md b/docs/solutions/face_detection.md
index 8d5de36eb..9d08ee482 100644
--- a/docs/solutions/face_detection.md
+++ b/docs/solutions/face_detection.md
@@ -45,6 +45,15 @@ section.
Naming style and availability may differ slightly across platforms/languages.
+#### model_selection
+
+An integer index `0` or `1`. Use `0` to select a short-range model that works
+best for faces within 2 meters from the camera, and `1` for a full-range model
+best for faces within 5 meters. For the full-range option, a sparse model is
+used for its improved inference speed. Please refer to the
+[model cards](./models.md#face_detection) for details. Default to `0` if not
+specified.
+
#### min_detection_confidence
Minimum confidence value (`[0.0, 1.0]`) from the face detection model for the
@@ -68,10 +77,11 @@ normalized to `[0.0, 1.0]` by the image width and height respectively.
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
+* [model_selection](#model_selection)
* [min_detection_confidence](#min_detection_confidence)
```python
@@ -81,9 +91,10 @@ mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils
# For static images:
+IMAGE_FILES = []
with mp_face_detection.FaceDetection(
- min_detection_confidence=0.5) as face_detection:
- for idx, file in enumerate(file_list):
+ model_selection=1, min_detection_confidence=0.5) as face_detection:
+ for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
# Convert the BGR image to RGB and process it with MediaPipe Face Detection.
results = face_detection.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
@@ -102,7 +113,7 @@ with mp_face_detection.FaceDetection(
# For webcam input:
cap = cv2.VideoCapture(0)
with mp_face_detection.FaceDetection(
- min_detection_confidence=0.5) as face_detection:
+ model_selection=0, min_detection_confidence=0.5) as face_detection:
while cap.isOpened():
success, image = cap.read()
if not success:
@@ -138,6 +149,7 @@ and the following usage example.
Supported configuration options:
+* [modelSelection](#model_selection)
* [minDetectionConfidence](#min_detection_confidence)
```html
@@ -188,6 +200,7 @@ const faceDetection = new FaceDetection({locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/face_detection@0.0/${file}`;
}});
faceDetection.setOptions({
+ modelSelection: 0,
minDetectionConfidence: 0.5
});
faceDetection.onResults(onResults);
@@ -254,10 +267,6 @@ same configuration as the GPU pipeline, runs entirely on CPU.
* Target:
[`mediapipe/examples/desktop/face_detection:face_detection_gpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/face_detection/BUILD)
-### Web
-
-Please refer to [these instructions](../index.md#mediapipe-on-the-web).
-
### Coral
Please refer to
diff --git a/docs/solutions/face_mesh.md b/docs/solutions/face_mesh.md
index 0c620120c..a94785324 100644
--- a/docs/solutions/face_mesh.md
+++ b/docs/solutions/face_mesh.md
@@ -69,7 +69,7 @@ and renders using a dedicated
The
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
internally uses a
-[face_detection_subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front_gpu.pbtxt)
+[face_detection_subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
from the
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).
@@ -265,7 +265,7 @@ magnitude of `z` uses roughly the same scale as `x`.
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
@@ -278,15 +278,17 @@ Supported configuration options:
import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
+mp_drawing_styles = mp.solutions.drawing_styles
mp_face_mesh = mp.solutions.face_mesh
# For static images:
+IMAGE_FILES = []
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
with mp_face_mesh.FaceMesh(
static_image_mode=True,
max_num_faces=1,
min_detection_confidence=0.5) as face_mesh:
- for idx, file in enumerate(file_list):
+ for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
# Convert the BGR image to RGB before processing.
results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
@@ -300,9 +302,17 @@ with mp_face_mesh.FaceMesh(
mp_drawing.draw_landmarks(
image=annotated_image,
landmark_list=face_landmarks,
- connections=mp_face_mesh.FACE_CONNECTIONS,
- landmark_drawing_spec=drawing_spec,
- connection_drawing_spec=drawing_spec)
+ connections=mp_face_mesh.FACEMESH_TESSELATION,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_tesselation_style())
+ mp_drawing.draw_landmarks(
+ image=annotated_image,
+ landmark_list=face_landmarks,
+ connections=mp_face_mesh.FACEMESH_CONTOURS,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_contours_style())
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
# For webcam input:
@@ -334,9 +344,17 @@ with mp_face_mesh.FaceMesh(
mp_drawing.draw_landmarks(
image=image,
landmark_list=face_landmarks,
- connections=mp_face_mesh.FACE_CONNECTIONS,
- landmark_drawing_spec=drawing_spec,
- connection_drawing_spec=drawing_spec)
+ connections=mp_face_mesh.FACEMESH_TESSELATION,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_tesselation_style())
+ mp_drawing.draw_landmarks(
+ image=image,
+ landmark_list=face_landmarks,
+ connections=mp_face_mesh.FACEMESH_CONTOURS,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_contours_style())
cv2.imshow('MediaPipe FaceMesh', image)
if cv2.waitKey(5) & 0xFF == 27:
break
@@ -422,6 +440,200 @@ camera.start();
```
+### Android Solution API
+
+Please first follow general
+[instructions](../getting_started/android_solutions.md#integrate-mediapipe-android-solutions-api)
+to add MediaPipe Gradle dependencies, then try the FaceMesh solution API in the
+companion
+[example Android Studio project](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions/facemesh)
+following
+[these instructions](../getting_started/android_solutions.md#build-solution-example-apps-in-android-studio)
+and learn more in the usage example below.
+
+Supported configuration options:
+
+* [staticImageMode](#static_image_mode)
+* [maxNumFaces](#max_num_faces)
+* runOnGpu: Run the pipeline and the model inference on GPU or CPU.
+
+#### Camera Input
+
+```java
+// For camera input and result rendering with OpenGL.
+FaceMeshOptions faceMeshOptions =
+ FaceMeshOptions.builder()
+ .setMode(FaceMeshOptions.STREAMING_MODE) // API soon to become
+ .setMaxNumFaces(1) // setStaticImageMode(false)
+ .setRunOnGpu(true).build();
+FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
+facemesh.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
+
+// Initializes a new CameraInput instance and connects it to MediaPipe FaceMesh.
+CameraInput cameraInput = new CameraInput(this);
+cameraInput.setNewFrameListener(
+ textureFrame -> facemesh.send(textureFrame));
+
+// Initializes a new GlSurfaceView with a ResultGlRenderer instance
+// that provides the interfaces to run user-defined OpenGL rendering code.
+// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultGlRenderer.java
+// as an example.
+SolutionGlSurfaceView<FaceMeshResult> glSurfaceView =
+ new SolutionGlSurfaceView<>(
+ this, facemesh.getGlContext(), facemesh.getGlMajorVersion());
+glSurfaceView.setSolutionResultRenderer(new FaceMeshResultGlRenderer());
+glSurfaceView.setRenderInputImage(true);
+
+facemesh.setResultListener(
+ faceMeshResult -> {
+ NormalizedLandmark noseLandmark =
+ faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe FaceMesh nose normalized coordinates (value range: [0, 1]): x=%f, y=%f",
+ noseLandmark.getX(), noseLandmark.getY()));
+ // Request GL rendering.
+ glSurfaceView.setRenderData(faceMeshResult);
+ glSurfaceView.requestRender();
+ });
+
+// The runnable to start camera after the GLSurfaceView is attached.
+glSurfaceView.post(
+ () ->
+ cameraInput.start(
+ this,
+ facemesh.getGlContext(),
+ CameraInput.CameraFacing.FRONT,
+ glSurfaceView.getWidth(),
+ glSurfaceView.getHeight()));
+```
+
+#### Image Input
+
+```java
+// For reading images from gallery and drawing the output in an ImageView.
+FaceMeshOptions faceMeshOptions =
+ FaceMeshOptions.builder()
+ .setMode(FaceMeshOptions.STATIC_IMAGE_MODE) // API soon to become
+ .setMaxNumFaces(1) // setStaticImageMode(true)
+ .setRunOnGpu(true).build();
+FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
+
+// Connects MediaPipe FaceMesh to the user-defined ImageView instance that allows
+// users to have the custom drawing of the output landmarks on it.
+// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultImageView.java
+// as an example.
+FaceMeshResultImageView imageView = new FaceMeshResultImageView(this);
+facemesh.setResultListener(
+ faceMeshResult -> {
+ int width = faceMeshResult.inputBitmap().getWidth();
+ int height = faceMeshResult.inputBitmap().getHeight();
+ NormalizedLandmark noseLandmark =
+ faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe FaceMesh nose coordinates (pixel values): x=%f, y=%f",
+ noseLandmark.getX() * width, noseLandmark.getY() * height));
+ // Request canvas drawing.
+ imageView.setFaceMeshResult(faceMeshResult);
+ runOnUiThread(() -> imageView.update());
+ });
+facemesh.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
+
+// ActivityResultLauncher to get an image from the gallery as Bitmap.
+ActivityResultLauncher<Intent> imageGetter =
+ registerForActivityResult(
+ new ActivityResultContracts.StartActivityForResult(),
+ result -> {
+ Intent resultIntent = result.getData();
+ if (resultIntent != null && result.getResultCode() == RESULT_OK) {
+ Bitmap bitmap = null;
+ try {
+ bitmap =
+ MediaStore.Images.Media.getBitmap(
+ this.getContentResolver(), resultIntent.getData());
+ } catch (IOException e) {
+ Log.e(TAG, "Bitmap reading error:" + e);
+ }
+ if (bitmap != null) {
+ facemesh.send(bitmap);
+ }
+ }
+ });
+Intent gallery = new Intent(
+ Intent.ACTION_PICK, MediaStore.Images.Media.INTERNAL_CONTENT_URI);
+imageGetter.launch(gallery);
+```
+
+#### Video Input
+
+```java
+// For video input and result rendering with OpenGL.
+FaceMeshOptions faceMeshOptions =
+ FaceMeshOptions.builder()
+ .setMode(FaceMeshOptions.STREAMING_MODE) // API soon to become
+ .setMaxNumFaces(1) // setStaticImageMode(false)
+ .setRunOnGpu(true).build();
+FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
+facemesh.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
+
+// Initializes a new VideoInput instance and connects it to MediaPipe FaceMesh.
+VideoInput videoInput = new VideoInput(this);
+videoInput.setNewFrameListener(
+ textureFrame -> facemesh.send(textureFrame));
+
+// Initializes a new GlSurfaceView with a ResultGlRenderer instance
+// that provides the interfaces to run user-defined OpenGL rendering code.
+// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultGlRenderer.java
+// as an example.
+SolutionGlSurfaceView<FaceMeshResult> glSurfaceView =
+ new SolutionGlSurfaceView<>(
+ this, facemesh.getGlContext(), facemesh.getGlMajorVersion());
+glSurfaceView.setSolutionResultRenderer(new FaceMeshResultGlRenderer());
+glSurfaceView.setRenderInputImage(true);
+
+facemesh.setResultListener(
+ faceMeshResult -> {
+ NormalizedLandmark noseLandmark =
+ faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe FaceMesh nose normalized coordinates (value range: [0, 1]): x=%f, y=%f",
+ noseLandmark.getX(), noseLandmark.getY()));
+ // Request GL rendering.
+ glSurfaceView.setRenderData(faceMeshResult);
+ glSurfaceView.requestRender();
+ });
+
+ActivityResultLauncher<Intent> videoGetter =
+ registerForActivityResult(
+ new ActivityResultContracts.StartActivityForResult(),
+ result -> {
+ Intent resultIntent = result.getData();
+ if (resultIntent != null) {
+ if (result.getResultCode() == RESULT_OK) {
+ glSurfaceView.post(
+ () ->
+ videoInput.start(
+ this,
+ resultIntent.getData(),
+ facemesh.getGlContext(),
+ glSurfaceView.getWidth(),
+ glSurfaceView.getHeight()));
+ }
+ }
+ });
+Intent gallery =
+ new Intent(Intent.ACTION_PICK, MediaStore.Video.Media.INTERNAL_CONTENT_URI);
+videoGetter.launch(gallery);
+```
+
## Example Apps
Please first see general instructions for
diff --git a/docs/solutions/hair_segmentation.md b/docs/solutions/hair_segmentation.md
index 5e2e4a7c5..9dd997b95 100644
--- a/docs/solutions/hair_segmentation.md
+++ b/docs/solutions/hair_segmentation.md
@@ -2,7 +2,7 @@
layout: default
title: Hair Segmentation
parent: Solutions
-nav_order: 7
+nav_order: 8
---
# MediaPipe Hair Segmentation
@@ -51,7 +51,14 @@ to visualize its associated subgraphs, please see
### Web
-Please refer to [these instructions](../index.md#mediapipe-on-the-web).
+Use [this link](https://viz.mediapipe.dev/demo/hair_segmentation) to load a demo
+in the MediaPipe Visualizer, and over there click the "Runner" icon in the top
+bar like shown below. The demos use your webcam video as input, which is
+processed all locally in real-time and never leaves your device. Please see
+[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
+in Google Developers Blog for details.
+
+
## Resources
diff --git a/docs/solutions/hands.md b/docs/solutions/hands.md
index ac10124f2..c3088d64c 100644
--- a/docs/solutions/hands.md
+++ b/docs/solutions/hands.md
@@ -206,7 +206,7 @@ is not the case, please swap the handedness output in the application.
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
@@ -219,14 +219,16 @@ Supported configuration options:
import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
+mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
# For static images:
+IMAGE_FILES = []
with mp_hands.Hands(
static_image_mode=True,
max_num_hands=2,
min_detection_confidence=0.5) as hands:
- for idx, file in enumerate(file_list):
+ for idx, file in enumerate(IMAGE_FILES):
# Read an image, flip it around y-axis for correct handedness output (see
# above).
image = cv2.flip(cv2.imread(file), 1)
@@ -247,7 +249,11 @@ with mp_hands.Hands(
f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
)
mp_drawing.draw_landmarks(
- annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
+ annotated_image,
+ hand_landmarks,
+ mp_hands.HAND_CONNECTIONS,
+ mp_drawing_styles.get_default_hand_landmarks_style(),
+ mp_drawing_styles.get_default_hand_connections_style())
cv2.imwrite(
'/tmp/annotated_image' + str(idx) + '.png', cv2.flip(annotated_image, 1))
@@ -277,7 +283,11 @@ with mp_hands.Hands(
if results.multi_hand_landmarks:
for hand_landmarks in results.multi_hand_landmarks:
mp_drawing.draw_landmarks(
- image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
+ image,
+ hand_landmarks,
+ mp_hands.HAND_CONNECTIONS,
+ mp_drawing_styles.get_default_hand_landmarks_style(),
+ mp_drawing_styles.get_default_hand_connections_style())
cv2.imshow('MediaPipe Hands', image)
if cv2.waitKey(5) & 0xFF == 27:
break
@@ -358,6 +368,200 @@ camera.start();
```
+### Android Solution API
+
+Please first follow general
+[instructions](../getting_started/android_solutions.md#integrate-mediapipe-android-solutions-api)
+to add MediaPipe Gradle dependencies, then try the Hands solution API in the
+companion
+[example Android Studio project](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions/hands)
+following
+[these instructions](../getting_started/android_solutions.md#build-solution-example-apps-in-android-studio)
+and learn more in the usage example below.
+
+Supported configuration options:
+
+* [staticImageMode](#static_image_mode)
+* [maxNumHands](#max_num_hands)
+* runOnGpu: Run the pipeline and the model inference on GPU or CPU.
+
+#### Camera Input
+
+```java
+// For camera input and result rendering with OpenGL.
+HandsOptions handsOptions =
+ HandsOptions.builder()
+ .setMode(HandsOptions.STREAMING_MODE) // API soon to become
+ .setMaxNumHands(1) // setStaticImageMode(false)
+ .setRunOnGpu(true).build();
+Hands hands = new Hands(this, handsOptions);
+hands.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
+
+// Initializes a new CameraInput instance and connects it to MediaPipe Hands.
+CameraInput cameraInput = new CameraInput(this);
+cameraInput.setNewFrameListener(
+ textureFrame -> hands.send(textureFrame));
+
+// Initializes a new GlSurfaceView with a ResultGlRenderer instance
+// that provides the interfaces to run user-defined OpenGL rendering code.
+// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultGlRenderer.java
+// as an example.
+SolutionGlSurfaceView<HandsResult> glSurfaceView =
+ new SolutionGlSurfaceView<>(
+ this, hands.getGlContext(), hands.getGlMajorVersion());
+glSurfaceView.setSolutionResultRenderer(new HandsResultGlRenderer());
+glSurfaceView.setRenderInputImage(true);
+
+hands.setResultListener(
+ handsResult -> {
+ NormalizedLandmark wristLandmark = Hands.getHandLandmark(
+ handsResult, 0, HandLandmark.WRIST);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe Hand wrist normalized coordinates (value range: [0, 1]): x=%f, y=%f",
+ wristLandmark.getX(), wristLandmark.getY()));
+ // Request GL rendering.
+ glSurfaceView.setRenderData(handsResult);
+ glSurfaceView.requestRender();
+ });
+
+// The runnable to start camera after the GLSurfaceView is attached.
+glSurfaceView.post(
+ () ->
+ cameraInput.start(
+ this,
+ hands.getGlContext(),
+ CameraInput.CameraFacing.FRONT,
+ glSurfaceView.getWidth(),
+ glSurfaceView.getHeight()));
+```
+
+#### Image Input
+
+```java
+// For reading images from gallery and drawing the output in an ImageView.
+HandsOptions handsOptions =
+ HandsOptions.builder()
+ .setMode(HandsOptions.STATIC_IMAGE_MODE) // API soon to become
+ .setMaxNumHands(1) // setStaticImageMode(true)
+ .setRunOnGpu(true).build();
+Hands hands = new Hands(this, handsOptions);
+
+// Connects MediaPipe Hands to the user-defined ImageView instance that allows
+// users to have the custom drawing of the output landmarks on it.
+// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultImageView.java
+// as an example.
+HandsResultImageView imageView = new HandsResultImageView(this);
+hands.setResultListener(
+ handsResult -> {
+ int width = handsResult.inputBitmap().getWidth();
+ int height = handsResult.inputBitmap().getHeight();
+ NormalizedLandmark wristLandmark = Hands.getHandLandmark(
+ handsResult, 0, HandLandmark.WRIST);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe Hand wrist coordinates (pixel values): x=%f, y=%f",
+ wristLandmark.getX() * width, wristLandmark.getY() * height));
+ // Request canvas drawing.
+ imageView.setHandsResult(handsResult);
+ runOnUiThread(() -> imageView.update());
+ });
+hands.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
+
+// ActivityResultLauncher to get an image from the gallery as Bitmap.
+ActivityResultLauncher<Intent> imageGetter =
+ registerForActivityResult(
+ new ActivityResultContracts.StartActivityForResult(),
+ result -> {
+ Intent resultIntent = result.getData();
+ if (resultIntent != null && result.getResultCode() == RESULT_OK) {
+ Bitmap bitmap = null;
+ try {
+ bitmap =
+ MediaStore.Images.Media.getBitmap(
+ this.getContentResolver(), resultIntent.getData());
+ } catch (IOException e) {
+ Log.e(TAG, "Bitmap reading error:" + e);
+ }
+ if (bitmap != null) {
+ hands.send(bitmap);
+ }
+ }
+ });
+Intent gallery = new Intent(
+ Intent.ACTION_PICK, MediaStore.Images.Media.INTERNAL_CONTENT_URI);
+imageGetter.launch(gallery);
+```
+
+#### Video Input
+
+```java
+// For video input and result rendering with OpenGL.
+HandsOptions handsOptions =
+ HandsOptions.builder()
+ .setMode(HandsOptions.STREAMING_MODE) // API soon to become
+ .setMaxNumHands(1) // setStaticImageMode(false)
+ .setRunOnGpu(true).build();
+Hands hands = new Hands(this, handsOptions);
+hands.setErrorListener(
+ (message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
+
+// Initializes a new VideoInput instance and connects it to MediaPipe Hands.
+VideoInput videoInput = new VideoInput(this);
+videoInput.setNewFrameListener(
+ textureFrame -> hands.send(textureFrame));
+
+// Initializes a new GlSurfaceView with a ResultGlRenderer instance
+// that provides the interfaces to run user-defined OpenGL rendering code.
+// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultGlRenderer.java
+// as an example.
+SolutionGlSurfaceView<HandsResult> glSurfaceView =
+ new SolutionGlSurfaceView<>(
+ this, hands.getGlContext(), hands.getGlMajorVersion());
+glSurfaceView.setSolutionResultRenderer(new HandsResultGlRenderer());
+glSurfaceView.setRenderInputImage(true);
+
+hands.setResultListener(
+ handsResult -> {
+ NormalizedLandmark wristLandmark = Hands.getHandLandmark(
+ handsResult, 0, HandLandmark.WRIST);
+ Log.i(
+ TAG,
+ String.format(
+ "MediaPipe Hand wrist normalized coordinates (value range: [0, 1]): x=%f, y=%f",
+ wristLandmark.getX(), wristLandmark.getY()));
+ // Request GL rendering.
+ glSurfaceView.setRenderData(handsResult);
+ glSurfaceView.requestRender();
+ });
+
+ActivityResultLauncher<Intent> videoGetter =
+ registerForActivityResult(
+ new ActivityResultContracts.StartActivityForResult(),
+ result -> {
+ Intent resultIntent = result.getData();
+ if (resultIntent != null) {
+ if (result.getResultCode() == RESULT_OK) {
+ glSurfaceView.post(
+ () ->
+ videoInput.start(
+ this,
+ resultIntent.getData(),
+ hands.getGlContext(),
+ glSurfaceView.getWidth(),
+ glSurfaceView.getHeight()));
+ }
+ }
+ });
+Intent gallery =
+ new Intent(Intent.ACTION_PICK, MediaStore.Video.Media.INTERNAL_CONTENT_URI);
+videoGetter.launch(gallery);
+```
+
## Example Apps
Please first see general instructions for
diff --git a/docs/solutions/holistic.md b/docs/solutions/holistic.md
index 7c02c8d75..0532a33dd 100644
--- a/docs/solutions/holistic.md
+++ b/docs/solutions/holistic.md
@@ -176,6 +176,16 @@ A list of pose landmarks. Each landmark consists of the following:
* `visibility`: A value in `[0.0, 1.0]` indicating the likelihood of the
landmark being visible (present and not occluded) in the image.
+#### pose_world_landmarks
+
+Another list of pose landmarks in world coordinates. Each landmark consists of
+the following:
+
+* `x`, `y` and `z`: Real-world 3D coordinates in meters with the origin at the
+ center between hips.
+* `visibility`: Identical to that defined in the corresponding
+ [pose_landmarks](#pose_landmarks).
+
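+As an illustrative sketch (not part of the official example code, and assuming
+`results` is the output of `Holistic.process()`), the metric world landmarks
+can be used to estimate real-world distances, e.g., between the two wrists:
+
+```python
+import math
+
+import mediapipe as mp
+
+# PoseLandmark indices are shared with the Pose solution.
+mp_pose = mp.solutions.pose
+
+def wrist_distance_meters(results):
+  """Returns the wrist-to-wrist distance in meters, or None if unavailable."""
+  if not results.pose_world_landmarks:
+    return None
+  landmarks = results.pose_world_landmarks.landmark
+  left = landmarks[mp_pose.PoseLandmark.LEFT_WRIST]
+  right = landmarks[mp_pose.PoseLandmark.RIGHT_WRIST]
+  # World landmarks are metric 3D coordinates with the origin between the hips.
+  return math.sqrt((left.x - right.x)**2 +
+                   (left.y - right.y)**2 +
+                   (left.z - right.z)**2)
+```
+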
#### face_landmarks
A list of 468 face landmarks. Each landmark consists of `x`, `y` and `z`. `x`
@@ -201,7 +211,7 @@ A list of 21 hand landmarks on the right hand, in the same representation as
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
@@ -215,13 +225,15 @@ Supported configuration options:
import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
+mp_drawing_styles = mp.solutions.drawing_styles
mp_holistic = mp.solutions.holistic
# For static images:
+IMAGE_FILES = []
with mp_holistic.Holistic(
static_image_mode=True,
model_complexity=2) as holistic:
- for idx, file in enumerate(file_list):
+ for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
image_height, image_width, _ = image.shape
# Convert the BGR image to RGB before processing.
@@ -236,14 +248,22 @@ with mp_holistic.Holistic(
# Draw pose, left and right hands, and face landmarks on the image.
annotated_image = image.copy()
mp_drawing.draw_landmarks(
- annotated_image, results.face_landmarks, mp_holistic.FACE_CONNECTIONS)
+ annotated_image,
+ results.face_landmarks,
+ mp_holistic.FACEMESH_TESSELATION,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_tesselation_style())
mp_drawing.draw_landmarks(
- annotated_image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
- mp_drawing.draw_landmarks(
- annotated_image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
- mp_drawing.draw_landmarks(
- annotated_image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
+ annotated_image,
+ results.pose_landmarks,
+ mp_holistic.POSE_CONNECTIONS,
+ landmark_drawing_spec=mp_drawing_styles.
+ get_default_pose_landmarks_style())
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
+ # Plot pose world landmarks.
+ mp_drawing.plot_landmarks(
+ results.pose_world_landmarks, mp_holistic.POSE_CONNECTIONS)
# For webcam input:
cap = cv2.VideoCapture(0)
@@ -269,13 +289,18 @@ with mp_holistic.Holistic(
image.flags.writeable = True
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
mp_drawing.draw_landmarks(
- image, results.face_landmarks, mp_holistic.FACE_CONNECTIONS)
+ image,
+ results.face_landmarks,
+ mp_holistic.FACEMESH_CONTOURS,
+ landmark_drawing_spec=None,
+ connection_drawing_spec=mp_drawing_styles
+ .get_default_face_mesh_contours_style())
mp_drawing.draw_landmarks(
- image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
- mp_drawing.draw_landmarks(
- image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
- mp_drawing.draw_landmarks(
- image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
+ image,
+ results.pose_landmarks,
+ mp_holistic.POSE_CONNECTIONS,
+ landmark_drawing_spec=mp_drawing_styles
+ .get_default_pose_landmarks_style())
cv2.imshow('MediaPipe Holistic', image)
if cv2.waitKey(5) & 0xFF == 27:
break
diff --git a/docs/solutions/instant_motion_tracking.md b/docs/solutions/instant_motion_tracking.md
index 36e5e83e0..9fea7ec1c 100644
--- a/docs/solutions/instant_motion_tracking.md
+++ b/docs/solutions/instant_motion_tracking.md
@@ -2,7 +2,7 @@
layout: default
title: Instant Motion Tracking
parent: Solutions
-nav_order: 10
+nav_order: 11
---
# MediaPipe Instant Motion Tracking
diff --git a/docs/solutions/iris.md b/docs/solutions/iris.md
index 61ca8049c..af71c895f 100644
--- a/docs/solutions/iris.md
+++ b/docs/solutions/iris.md
@@ -69,7 +69,7 @@ and renders using a dedicated
The
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
internally uses a
-[face detection subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front_gpu.pbtxt)
+[face detection subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
from the
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).
@@ -193,7 +193,17 @@ on how to build MediaPipe examples.
### Web
-Please refer to [these instructions](../index.md#mediapipe-on-the-web).
+You can use the following links to load a demo in the MediaPipe Visualizer, and
+over there click the "Runner" icon in the top bar like shown below. The demos
+use your webcam video as input, which is processed all locally in real-time and
+never leaves your device. Please see
+[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
+in Google Developers Blog for details.
+
+
+
+* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
+* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
## Resources
diff --git a/docs/solutions/knift.md b/docs/solutions/knift.md
index 41691c418..b008f1496 100644
--- a/docs/solutions/knift.md
+++ b/docs/solutions/knift.md
@@ -2,7 +2,7 @@
layout: default
title: KNIFT (Template-based Feature Matching)
parent: Solutions
-nav_order: 12
+nav_order: 13
---
# MediaPipe KNIFT
diff --git a/docs/solutions/media_sequence.md b/docs/solutions/media_sequence.md
index cd3b7ecef..e6bd5fd44 100644
--- a/docs/solutions/media_sequence.md
+++ b/docs/solutions/media_sequence.md
@@ -2,7 +2,7 @@
layout: default
title: Dataset Preparation with MediaSequence
parent: Solutions
-nav_order: 14
+nav_order: 15
---
# Dataset Preparation with MediaSequence
diff --git a/docs/solutions/models.md b/docs/solutions/models.md
index e0ff4d14a..2f3001722 100644
--- a/docs/solutions/models.md
+++ b/docs/solutions/models.md
@@ -14,12 +14,27 @@ nav_order: 30
### [Face Detection](https://google.github.io/mediapipe/solutions/face_detection)
-* Face detection model for front-facing/selfie camera:
- [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite),
- [TFLite model quantized for EdgeTPU/Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/models/face-detector-quantized_edgetpu.tflite)
-* Face detection model for back-facing camera:
- [TFLite model ](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_back.tflite)
-* [Model card](https://mediapipe.page.link/blazeface-mc)
+* Short-range model (best for faces within 2 meters from the camera):
+ [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite),
+ [TFLite model quantized for EdgeTPU/Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/models/face-detector-quantized_edgetpu.tflite),
+ [Model card](https://mediapipe.page.link/blazeface-mc)
+* Full-range model (dense, best for faces within 5 meters from the camera):
+ [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range.tflite),
+ [Model card](https://mediapipe.page.link/blazeface-back-mc)
+* Full-range model (sparse, best for faces within 5 meters from the camera):
+ [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range_sparse.tflite),
+ [Model card](https://mediapipe.page.link/blazeface-back-sparse-mc)
+
+The full-range dense and sparse models have the same quality in terms of
+[F-score](https://en.wikipedia.org/wiki/F-score), but differ in the underlying
+metrics: the dense model is slightly better in
+[Recall](https://en.wikipedia.org/wiki/Precision_and_recall), whereas the
+sparse model outperforms the dense one in
+[Precision](https://en.wikipedia.org/wiki/Precision_and_recall). Speed-wise,
+the sparse model is ~30% faster when executing on CPU via
+[XNNPACK](https://github.com/google/XNNPACK), whereas on GPU the two models
+demonstrate comparable latencies. Depending on your application, you may
+prefer one over the other.
### [Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh)
@@ -60,6 +75,12 @@ nav_order: 30
* Hand recrop model:
[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/holistic_landmark/hand_recrop.tflite)
+### [Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation)
+
+* [TFLite model (general)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite)
+* [TFLite model (landscape)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite)
+* [Model card](https://mediapipe.page.link/selfiesegmentation-mc)
+
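+For reference, a minimal Python sketch (illustrative only; see the
+[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation)
+page for the full example), where `model_selection=0` picks the general model
+and `model_selection=1` the landscape model; the input file name is
+hypothetical:
+
+```python
+import cv2
+import mediapipe as mp
+
+mp_selfie_segmentation = mp.solutions.selfie_segmentation
+
+# model_selection=0 loads the general model, 1 the landscape model.
+with mp_selfie_segmentation.SelfieSegmentation(
+    model_selection=1) as selfie_segmentation:
+  image = cv2.imread('input.jpg')  # hypothetical input file
+  results = selfie_segmentation.process(
+      cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
+  mask = results.segmentation_mask  # float values in [0.0, 1.0]
+```
+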
### [Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation)
* [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/models/hair_segmentation.tflite)
diff --git a/docs/solutions/object_detection.md b/docs/solutions/object_detection.md
index 044748537..d7cc2cec1 100644
--- a/docs/solutions/object_detection.md
+++ b/docs/solutions/object_detection.md
@@ -2,7 +2,7 @@
layout: default
title: Object Detection
parent: Solutions
-nav_order: 8
+nav_order: 9
---
# MediaPipe Object Detection
diff --git a/docs/solutions/objectron.md b/docs/solutions/objectron.md
index 0164e23b3..d7dc8f045 100644
--- a/docs/solutions/objectron.md
+++ b/docs/solutions/objectron.md
@@ -2,7 +2,7 @@
layout: default
title: Objectron (3D Object Detection)
parent: Solutions
-nav_order: 11
+nav_order: 12
---
# MediaPipe Objectron
@@ -224,29 +224,33 @@ where object detection simply runs on every image. Default to `0.99`.
#### model_name
-Name of the model to use for predicting 3D bounding box landmarks. Currently supports
-`{'Shoe', 'Chair', 'Cup', 'Camera'}`.
+Name of the model to use for predicting 3D bounding box landmarks. Currently
+supports `{'Shoe', 'Chair', 'Cup', 'Camera'}`. Default to `Shoe`.
#### focal_length
-Camera focal length `(fx, fy)`, by default is defined in
-[NDC space](#ndc-space). To use focal length `(fx_pixel, fy_pixel)` in
-[pixel space](#pixel-space), users should provide `image_size` = `(image_width,
-image_height)` to enable conversions inside the API. For further details about
-NDC and pixel space, please see [Coordinate Systems](#coordinate-systems).
+By default, the camera focal length is defined in [NDC space](#ndc-space),
+i.e., `(fx, fy)`, and defaults to `(1.0, 1.0)`. To specify the focal length in
+[pixel space](#pixel-space) instead, i.e., `(fx_pixel, fy_pixel)`, users should
+provide [`image_size`](#image_size) = `(image_width, image_height)` to enable
+conversions inside the API. For further details about NDC and pixel space,
+please see [Coordinate Systems](#coordinate-systems).
#### principal_point
-Camera principal point `(px, py)`, by default is defined in
-[NDC space](#ndc-space). To use principal point `(px_pixel, py_pixel)` in
-[pixel space](#pixel-space), users should provide `image_size` = `(image_width,
-image_height)` to enable conversions inside the API. For further details about
-NDC and pixel space, please see [Coordinate Systems](#coordinate-systems).
+By default, the camera principal point is defined in [NDC space](#ndc-space),
+i.e., `(px, py)`, and defaults to `(0.0, 0.0)`. To specify the principal point
+in [pixel space](#pixel-space) instead, i.e., `(px_pixel, py_pixel)`, users
+should provide [`image_size`](#image_size) = `(image_width, image_height)` to
+enable conversions inside the API. For further details about NDC and pixel
+space, please see [Coordinate Systems](#coordinate-systems).
#### image_size
-(**Optional**) size `(image_width, image_height)` of the input image, **ONLY**
-needed when use `focal_length` and `principal_point` in pixel space.
+**Specify only when [`focal_length`](#focal_length) and
+[`principal_point`](#principal_point) are specified in pixel space.**
+
+Size of the input image, i.e., `(image_width, image_height)`.
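+
+For example, a minimal Python sketch (with made-up intrinsics, not from the
+official example) of passing pixel-space `focal_length` and `principal_point`
+together with `image_size` so the API can convert them to NDC internally:
+
+```python
+import mediapipe as mp
+
+mp_objectron = mp.solutions.objectron
+
+# Hypothetical intrinsics for a 1920x1080 camera, given in pixel space.
+objectron = mp_objectron.Objectron(
+    static_image_mode=False,
+    max_num_objects=5,
+    model_name='Shoe',
+    focal_length=(1480.0, 1480.0),   # (fx_pixel, fy_pixel)
+    principal_point=(960.0, 540.0),  # (px_pixel, py_pixel)
+    image_size=(1920, 1080))         # required for pixel-space intrinsics
+```
+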
### Output
@@ -277,7 +281,7 @@ following:
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
@@ -297,11 +301,12 @@ mp_drawing = mp.solutions.drawing_utils
mp_objectron = mp.solutions.objectron
# For static images:
+IMAGE_FILES = []
with mp_objectron.Objectron(static_image_mode=True,
max_num_objects=5,
min_detection_confidence=0.5,
model_name='Shoe') as objectron:
- for idx, file in enumerate(file_list):
+ for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
# Convert the BGR image to RGB and process it with MediaPipe Objectron.
results = objectron.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
@@ -355,6 +360,89 @@ with mp_objectron.Objectron(static_image_mode=False,
cap.release()
```
+## JavaScript Solution API
+
+Please first see general [introduction](../getting_started/javascript.md) on
+MediaPipe in JavaScript, then learn more in the companion [web demo](#resources)
+and the following usage example.
+
+Supported configuration options:
+
+* [staticImageMode](#static_image_mode)
+* [maxNumObjects](#max_num_objects)
+* [minDetectionConfidence](#min_detection_confidence)
+* [minTrackingConfidence](#min_tracking_confidence)
+* [modelName](#model_name)
+* [focalLength](#focal_length)
+* [principalPoint](#principal_point)
+* [imageSize](#image_size)
+
+```html
+<!-- Illustrative sketch only: the exact script URLs and markup may differ
+     from the published example. -->
+<head>
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/objectron/objectron.js" crossorigin="anonymous"></script>
+</head>
+
+<body>
+  <video class="input_video"></video>
+</body>
+```
+
+```javascript
+// Illustrative sketch only: result rendering is omitted; see the web demo in
+// Resources for a complete implementation.
+const videoElement = document.getElementsByClassName('input_video')[0];
+
+function onResults(results) {
+  // Inspect or render the detected 3D bounding boxes here.
+  console.log(results);
+}
+
+const objectron = new Objectron({locateFile: (file) => {
+  return `https://cdn.jsdelivr.net/npm/@mediapipe/objectron/${file}`;
+}});
+objectron.setOptions({
+  modelName: 'Chair',
+  maxNumObjects: 3
+});
+objectron.onResults(onResults);
+
+const camera = new Camera(videoElement, {
+  onFrame: async () => {
+    await objectron.send({image: videoElement});
+  },
+  width: 1280,
+  height: 720
+});
+camera.start();
+```
+
## Example Apps
Please first see general instructions for
@@ -441,7 +529,7 @@ Example app bounding boxes are rendered with [GlAnimationOverlayCalculator](http
> ```
> and then run
>
-> ```build
+> ```bash
> bazel run -c opt mediapipe/graphs/object_detection_3d/obj_parser:ObjParser -- input_dir=[INTERMEDIATE_OUTPUT_DIR] output_dir=[OUTPUT_DIR]
> ```
> INPUT_DIR should be the folder with initial asset .obj files to be processed,
@@ -560,11 +648,15 @@ py = -py_pixel * 2.0 / image_height + 1.0
[Announcing the Objectron Dataset](https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html)
* Google AI Blog:
[Real-Time 3D Object Detection on Mobile Devices with MediaPipe](https://ai.googleblog.com/2020/03/real-time-3d-object-detection-on-mobile.html)
-* Paper: [Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations](https://arxiv.org/abs/2012.09988), to appear in CVPR 2021
+* Paper: [Objectron: A Large Scale Dataset of Object-Centric Videos in the
+ Wild with Pose Annotations](https://arxiv.org/abs/2012.09988), to appear in
+ CVPR 2021
* Paper: [MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak
Shape Supervision](https://arxiv.org/abs/2003.03522)
* Paper:
[Instant 3D Object Tracking with Applications in Augmented Reality](https://drive.google.com/open?id=1O_zHmlgXIzAdKljp20U_JUkEHOGG52R8)
- ([presentation](https://www.youtube.com/watch?v=9ndF1AIo7h0)), Fourth Workshop on Computer Vision for AR/VR, CVPR 2020
+ ([presentation](https://www.youtube.com/watch?v=9ndF1AIo7h0)), Fourth
+ Workshop on Computer Vision for AR/VR, CVPR 2020
* [Models and model cards](./models.md#objectron)
+* [Web demo](https://code.mediapipe.dev/codepen/objectron)
* [Python Colab](https://mediapipe.page.link/objectron_py_colab)
diff --git a/docs/solutions/pose.md b/docs/solutions/pose.md
index feed2ad34..271199bb5 100644
--- a/docs/solutions/pose.md
+++ b/docs/solutions/pose.md
@@ -30,7 +30,8 @@ overlay of digital content and information on top of the physical world in
augmented reality.
MediaPipe Pose is a ML solution for high-fidelity body pose tracking, inferring
-33 3D landmarks on the whole body from RGB video frames utilizing our
+33 3D landmarks and a background segmentation mask on the whole body from RGB
+video frames utilizing our
[BlazePose](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html)
research that also powers the
[ML Kit Pose Detection API](https://developers.google.com/ml-kit/vision/pose-detection).
@@ -49,11 +50,11 @@ The solution utilizes a two-step detector-tracker ML pipeline, proven to be
effective in our [MediaPipe Hands](./hands.md) and
[MediaPipe Face Mesh](./face_mesh.md) solutions. Using a detector, the pipeline
first locates the person/pose region-of-interest (ROI) within the frame. The
-tracker subsequently predicts the pose landmarks within the ROI using the
-ROI-cropped frame as input. Note that for video use cases the detector is
-invoked only as needed, i.e., for the very first frame and when the tracker
-could no longer identify body pose presence in the previous frame. For other
-frames the pipeline simply derives the ROI from the previous frame’s pose
+tracker subsequently predicts the pose landmarks and segmentation mask within
+the ROI using the ROI-cropped frame as input. Note that for video use cases the
+detector is invoked only as needed, i.e., for the very first frame and when the
+tracker could no longer identify body pose presence in the previous frame. For
+other frames the pipeline simply derives the ROI from the previous frame’s pose
landmarks.
The pipeline is implemented as a MediaPipe
@@ -87,11 +88,11 @@ from [COCO topology](https://cocodataset.org/#keypoints-2020).
Method | Yoga [`mAP`] | Yoga [`PCK@0.2`] | Dance [`mAP`] | Dance [`PCK@0.2`] | HIIT [`mAP`] | HIIT [`PCK@0.2`]
----------------------------------------------------------------------------------------------------- | -----------------: | ---------------------: | ------------------: | ----------------------: | -----------------: | ---------------------:
-BlazePose.Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
-BlazePose.Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
-BlazePose.Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
-[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
-[Apple.Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**
+BlazePose GHUM Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
+BlazePose GHUM Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
+BlazePose GHUM Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
+[AlphaPose ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
+[Apple Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**
 |
:--------------------------------------------------------------------------: |
@@ -100,11 +101,11 @@ BlazePose.Lite
We designed our models specifically for live perception use cases, so all of
them work in real-time on the majority of modern devices.
-Method | Latency Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | Latency MacBook Pro (15-inch 2017)
---------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
-BlazePose.Heavy | 53 ms | 38 ms
-BlazePose.Full | 25 ms | 27 ms
-BlazePose.Lite | 20 ms | 25 ms
+Method | Latency Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | Latency MacBook Pro (15-inch 2017)
+-------------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
+BlazePose GHUM Heavy | 53 ms | 38 ms
+BlazePose GHUM Full | 25 ms | 27 ms
+BlazePose GHUM Lite | 20 ms | 25 ms
## Models
@@ -129,16 +130,19 @@ hip midpoints.
The landmark model in MediaPipe Pose predicts the location of 33 pose landmarks
(see figure below).
-Please find more detail in the
-[BlazePose Google AI Blog](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html),
-this [paper](https://arxiv.org/abs/2006.10204) and
-[the model card](./models.md#pose), and the attributes in each landmark
-[below](#pose_landmarks).
-
 |
:----------------------------------------------------------------------------------------------: |
*Fig 4. 33 pose landmarks.* |
+Optionally, MediaPipe Pose can also predict a full-body
+[segmentation mask](#segmentation_mask) represented as a two-class segmentation
+(human or background).
+
+Please find more detail in the
+[BlazePose Google AI Blog](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html),
+this [paper](https://arxiv.org/abs/2006.10204),
+[the model card](./models.md#pose) and the [Output](#output) section below.
+
## Solution APIs
### Cross-platform Configuration Options
@@ -167,6 +171,18 @@ If set to `true`, the solution filters pose landmarks across different input
images to reduce jitter, but ignored if [static_image_mode](#static_image_mode)
is also set to `true`. Default to `true`.
+#### enable_segmentation
+
+If set to `true`, the solution generates the segmentation mask in addition to
+the pose landmarks. Default to `false`.
+
+#### smooth_segmentation
+
+If set to `true`, the solution filters segmentation masks across different input
+images to reduce jitter. Ignored if [enable_segmentation](#enable_segmentation)
+is `false` or [static_image_mode](#static_image_mode) is `true`. Default to
+`true`.
+
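+A minimal sketch of turning both segmentation options on via the Python
+Solution API (described below); the remaining parameters keep their defaults:
+
+```python
+import mediapipe as mp
+
+mp_pose = mp.solutions.pose
+
+# Request the optional segmentation mask and keep temporal smoothing of the
+# mask enabled (smooth_segmentation only takes effect when enable_segmentation
+# is True and static_image_mode is False).
+with mp_pose.Pose(
+    static_image_mode=False,
+    enable_segmentation=True,
+    smooth_segmentation=True) as pose:
+  pass  # Call pose.process(...) per frame as in the usage examples below.
+```
+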
#### min_detection_confidence
Minimum confidence value (`[0.0, 1.0]`) from the person-detection model for the
@@ -187,28 +203,56 @@ Naming style may differ slightly across platforms/languages.
#### pose_landmarks
-A list of pose landmarks. Each lanmark consists of the following:
+A list of pose landmarks. Each landmark consists of the following:
* `x` and `y`: Landmark coordinates normalized to `[0.0, 1.0]` by the image
width and height respectively.
* `z`: Represents the landmark depth with the depth at the midpoint of hips
being the origin, and the smaller the value the closer the landmark is to
the camera. The magnitude of `z` uses roughly the same scale as `x`.
-
* `visibility`: A value in `[0.0, 1.0]` indicating the likelihood of the
landmark being visible (present and not occluded) in the image.
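+
+As a minimal sketch, the normalized `x`/`y` values can be converted to pixel
+coordinates by scaling with the image size. The hypothetical helper below
+assumes `results` is the return value of `Pose.process()`, as in the usage
+examples further down:
+
+```python
+import mediapipe as mp
+
+mp_pose = mp.solutions.pose
+
+
+def nose_pixel_coordinates(results, image_width, image_height):
+  """Sketch: nose landmark in pixel coordinates plus its visibility score."""
+  nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
+  return (int(nose.x * image_width),
+          int(nose.y * image_height),
+          nose.visibility)
+```
+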
+#### pose_world_landmarks
+
+*Fig 5. Example of MediaPipe Pose real-world 3D coordinates.*
+
+Another list of pose landmarks in world coordinates. Each landmark consists of
+the following:
+
+* `x`, `y` and `z`: Real-world 3D coordinates in meters with the origin at the
+ center between hips.
+* `visibility`: Identical to that defined in the corresponding
+ [pose_landmarks](#pose_landmarks).
+
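+Since these coordinates are metric, they allow rough real-world measurements.
+The hypothetical helper below, for example, estimates the distance between the
+wrists in meters (`results` is again assumed to be the return value of
+`Pose.process()`, as in the usage examples further down):
+
+```python
+import math
+
+import mediapipe as mp
+
+mp_pose = mp.solutions.pose
+
+
+def wrist_distance_meters(results):
+  """Sketch: approximate distance between the wrists, in meters."""
+  landmarks = results.pose_world_landmarks.landmark
+  left = landmarks[mp_pose.PoseLandmark.LEFT_WRIST]
+  right = landmarks[mp_pose.PoseLandmark.RIGHT_WRIST]
+  return math.dist((left.x, left.y, left.z), (right.x, right.y, right.z))
+```
+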
+#### segmentation_mask
+
+The output segmentation mask, predicted only when
+[enable_segmentation](#enable_segmentation) is set to `true`. The mask has the
+same width and height as the input image, and contains values in `[0.0, 1.0]`
+where `1.0` and `0.0` indicate high certainty of a "human" and "background"
+pixel respectively. Please refer to the platform-specific usage examples below
+for details.
+
+*Fig 6. Example of MediaPipe Pose segmentation mask.*
+
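+Beyond the hard-threshold compositing shown in the Python example below, the
+soft mask values can also be used directly as an alpha matte, e.g. to blur only
+the background. The sketch below assumes `image` is a BGR frame and
+`segmentation_mask` comes from `Pose.process()` with
+[enable_segmentation](#enable_segmentation) set to `true`:
+
+```python
+import cv2
+import numpy as np
+
+
+def blur_background(image, segmentation_mask, blur_ksize=55):
+  """Sketch: keep the person sharp and blur the background."""
+  # Mask values close to 1.0 mean "human", close to 0.0 mean "background".
+  alpha = np.clip(segmentation_mask, 0.0, 1.0)[..., np.newaxis]
+  blurred = cv2.GaussianBlur(image, (blur_ksize, blur_ksize), 0)
+  output = (alpha * image.astype(np.float32) +
+            (1.0 - alpha) * blurred.astype(np.float32))
+  return output.astype(np.uint8)
+```
+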
### Python Solution API
Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
-[Python Colab](#resources) and the following usage example.
+[Python Colab](#resources) and the usage example below.
Supported configuration options:
* [static_image_mode](#static_image_mode)
* [model_complexity](#model_complexity)
* [smooth_landmarks](#smooth_landmarks)
+* [enable_segmentation](#enable_segmentation)
+* [smooth_segmentation](#smooth_segmentation)
* [min_detection_confidence](#min_detection_confidence)
* [min_tracking_confidence](#min_tracking_confidence)
@@ -216,14 +260,18 @@ Supported configuration options:
import cv2
import mediapipe as mp
+import numpy as np
mp_drawing = mp.solutions.drawing_utils
+mp_drawing_styles = mp.solutions.drawing_styles
mp_pose = mp.solutions.pose
# For static images:
+IMAGE_FILES = []
+BG_COLOR = (192, 192, 192) # gray
with mp_pose.Pose(
static_image_mode=True,
model_complexity=2,
+ enable_segmentation=True,
min_detection_confidence=0.5) as pose:
- for idx, file in enumerate(file_list):
+ for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
image_height, image_width, _ = image.shape
# Convert the BGR image to RGB before processing.
@@ -233,14 +281,28 @@ with mp_pose.Pose(
continue
print(
f'Nose coordinates: ('
- f'{results.pose_landmarks.landmark[mp_holistic.PoseLandmark.NOSE].x * image_width}, '
- f'{results.pose_landmarks.landmark[mp_holistic.PoseLandmark.NOSE].y * image_height})'
+ f'{results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE].x * image_width}, '
+ f'{results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE].y * image_height})'
)
- # Draw pose landmarks on the image.
+
annotated_image = image.copy()
+ # Draw segmentation on the image.
+ # To improve segmentation around boundaries, consider applying a joint
+ # bilateral filter to "results.segmentation_mask" with "image".
+ condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
+ bg_image = np.zeros(image.shape, dtype=np.uint8)
+ bg_image[:] = BG_COLOR
+ annotated_image = np.where(condition, annotated_image, bg_image)
+ # Draw pose landmarks on the image.
mp_drawing.draw_landmarks(
- annotated_image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
+ annotated_image,
+ results.pose_landmarks,
+ mp_pose.POSE_CONNECTIONS,
+ landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style())
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
+ # Plot pose world landmarks.
+ mp_drawing.plot_landmarks(
+ results.pose_world_landmarks, mp_pose.POSE_CONNECTIONS)
# For webcam input:
cap = cv2.VideoCapture(0)
@@ -266,7 +328,10 @@ with mp_pose.Pose(
image.flags.writeable = True
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
mp_drawing.draw_landmarks(
- image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
+ image,
+ results.pose_landmarks,
+ mp_pose.POSE_CONNECTIONS,
+ landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style())
cv2.imshow('MediaPipe Pose', image)
if cv2.waitKey(5) & 0xFF == 27:
break
@@ -283,6 +348,8 @@ Supported configuration options:
* [modelComplexity](#model_complexity)
* [smoothLandmarks](#smooth_landmarks)
+* [enableSegmentation](#enable_segmentation)
+* [smoothSegmentation](#smooth_segmentation)
* [minDetectionConfidence](#min_detection_confidence)
* [minTrackingConfidence](#min_tracking_confidence)
@@ -293,6 +360,7 @@ Supported configuration options:
+
@@ -301,6 +369,7 @@ Supported configuration options: