Merge branch 'master' of https://github.com/google/mediapipe
commit a715f1ec4d
27
.github/ISSUE_TEMPLATE/00-build-installation-issue.md
vendored
Normal file
|
@ -0,0 +1,27 @@
|
|||
---
|
||||
name: "Build/Installation Issue"
|
||||
about: Use this template for build/installation issues
|
||||
labels: type:build/install
|
||||
|
||||
---
|
||||
<em>Please make sure that this is a build/installation issue and also refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) documentation before raising any issues.</em>
|
||||
|
||||
**System information** (Please provide as much relevant information as possible)
|
||||
- OS Platform and Distribution (e.g. Linux Ubuntu 16.04, Android 11, iOS 14.4):
|
||||
- Compiler version (e.g. gcc/g++ 8 / Apple clang version 12.0.0):
|
||||
- Programming Language and version (e.g. C++ 14, Python 3.6, Java):
|
||||
- Installed using virtualenv? pip? Conda? (if python):
|
||||
- [MediaPipe version](https://github.com/google/mediapipe/releases):
|
||||
- Bazel version:
|
||||
- Xcode and Tulsi versions (if iOS):
|
||||
- Android SDK and NDK versions (if android):
|
||||
- Android [AAR](https://google.github.io/mediapipe/getting_started/android_archive_library.html) (if android):
|
||||
- OpenCV version (if running on desktop):
|
||||
|
||||
**Describe the problem**:
|
||||
|
||||
|
||||
**[Provide the exact sequence of commands / steps that you executed before running into the problem](https://google.github.io/mediapipe/getting_started/getting_started.html):**
|
||||
|
||||
**Complete Logs:**
|
||||
Include Complete Log information or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached:
|
26
.github/ISSUE_TEMPLATE/10-solution-issue.md
vendored
Normal file
|
@ -0,0 +1,26 @@
|
|||
---
|
||||
name: "Solution Issue"
|
||||
about: Use this template for assistance with a specific MediaPipe solution, such as "Pose" or "Iris", including inference model usage/training, solution-specific calculators, etc.
|
||||
labels: type:support
|
||||
|
||||
---
|
||||
<em>Please make sure that this is a [solution](https://google.github.io/mediapipe/solutions/solutions.html) issue.</em>
|
||||
|
||||
**System information** (Please provide as much relevant information as possible)
|
||||
- Have I written custom code (as opposed to using a stock example script provided in Mediapipe):
|
||||
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4):
|
||||
- [MediaPipe version](https://github.com/google/mediapipe/releases):
|
||||
- Bazel version:
|
||||
- Solution (e.g. FaceMesh, Pose, Holistic):
|
||||
- Programming Language and version (e.g. C++, Python, Java):
|
||||
|
||||
**Describe the expected behavior:**
|
||||
|
||||
**Standalone code you may have used to try to get what you need:**
|
||||
|
||||
If there is a problem, provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to a Colab, repo, or notebook:
|
||||
|
||||
**Other info / Complete Logs:**
|
||||
Include any logs or source code that would be helpful to
|
||||
diagnose the problem. If including tracebacks, please include the full
|
||||
traceback. Large logs and files should be attached:
|
51
.github/ISSUE_TEMPLATE/20-documentation-issue.md
vendored
Normal file
|
@ -0,0 +1,51 @@
|
|||
---
|
||||
name: "Documentation Issue"
|
||||
about: Use this template for documentation related issues
|
||||
labels: type:docs
|
||||
|
||||
---
|
||||
Thank you for submitting a MediaPipe documentation issue.
|
||||
The MediaPipe docs are open source! To get involved, read the documentation Contributor Guide.
|
||||
## URL(s) with the issue:
|
||||
|
||||
Please provide a link to the documentation entry, for example: https://github.com/google/mediapipe/blob/master/docs/solutions/face_mesh.md#models
|
||||
|
||||
## Description of issue (what needs changing):
|
||||
|
||||
Kinds of documentation problems:
|
||||
|
||||
### Clear description
|
||||
|
||||
For example, why should someone use this method? How is it useful?
|
||||
|
||||
### Correct links
|
||||
|
||||
Is the link to the source code correct?
|
||||
|
||||
### Parameters defined
|
||||
Are all parameters defined and formatted correctly?
|
||||
|
||||
### Returns defined
|
||||
|
||||
Are return values defined?
|
||||
|
||||
### Raises listed and defined
|
||||
|
||||
Are the errors defined?
|
||||
|
||||
### Usage example
|
||||
|
||||
Is there a usage example?
|
||||
|
||||
See the API guide on how to write testable usage examples.
|
||||
|
||||
### Request visuals, if applicable
|
||||
|
||||
Are there currently visuals? If not, would adding them clarify the content?
|
||||
|
||||
### Submit a pull request?
|
||||
|
||||
Are you planning to also submit a pull request to fix the issue? See the docs
|
||||
https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md
|
||||
|
32
.github/ISSUE_TEMPLATE/30-bug-issue.md
vendored
Normal file
|
@ -0,0 +1,32 @@
|
|||
---
|
||||
name: "Bug Issue"
|
||||
about: Use this template for reporting a bug
|
||||
labels: type:bug
|
||||
|
||||
---
|
||||
<em>Please make sure that this is a bug and also refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) and FAQ documentation before raising any issues.</em>
|
||||
|
||||
**System information** (Please provide as much relevant information as possible)
|
||||
|
||||
- Have I written custom code (as opposed to using a stock example script provided in MediaPipe):
|
||||
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4):
|
||||
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
|
||||
- Browser and version (e.g. Google Chrome, Safari) if the issue happens on browser:
|
||||
- Programming Language and version (e.g. C++, Python, Java):
|
||||
- [MediaPipe version](https://github.com/google/mediapipe/releases):
|
||||
- Bazel version (if compiling from source):
|
||||
- Solution (e.g. FaceMesh, Pose, Holistic):
|
||||
- Android Studio, NDK, SDK versions (if issue is related to building in Android environment):
|
||||
- Xcode & Tulsi version (if issue is related to building for iOS):
|
||||
|
||||
**Describe the current behavior:**
|
||||
|
||||
**Describe the expected behavior:**
|
||||
|
||||
**Standalone code to reproduce the issue:**
|
||||
Provide a reproducible test case that is the bare minimum necessary to replicate the problem. If possible, please share a link to a Colab, repo, or notebook:
|
||||
|
||||
**Other info / Complete Logs:**
|
||||
Include any logs or source code that would be helpful to
|
||||
diagnose the problem. If including tracebacks, please include the full
|
||||
traceback. Large logs and files should be attached
|
24
.github/ISSUE_TEMPLATE/40-feature-request.md
vendored
Normal file
|
@ -0,0 +1,24 @@
|
|||
---
|
||||
name: "Feature Request"
|
||||
about: Use this template for raising a feature request
|
||||
labels: type:feature
|
||||
|
||||
---
|
||||
<em>Please make sure that this is a feature request.</em>
|
||||
|
||||
**System information** (Please provide as much relevant information as possible)
|
||||
|
||||
- MediaPipe Solution (the one you are using):
|
||||
- Programming language: C++ / TypeScript / Python / Objective-C / Android Java
|
||||
- Are you willing to contribute it (Yes/No):
|
||||
|
||||
|
||||
**Describe the feature and the current behavior/state:**
|
||||
|
||||
**Will this change the current API? How?**
|
||||
|
||||
**Who will benefit from this feature?**
|
||||
|
||||
**Please specify the use cases for this feature:**
|
||||
|
||||
**Any Other info:**
|
14
.github/ISSUE_TEMPLATE/50-other-issues.md
vendored
Normal file
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
name: "Other Issue"
|
||||
about: Use this template for any other non-support related issues.
|
||||
labels: type:others
|
||||
|
||||
---
|
||||
This template is for miscellaneous issues not covered by the other issue categories.
|
||||
|
||||
For questions on how to work with MediaPipe, or support for problems that are not verified bugs in MediaPipe, please go to [StackOverflow](https://stackoverflow.com/questions/tagged/mediapipe) and [Slack](https://mediapipe.page.link/joinslack) communities.
|
||||
|
||||
If you are reporting a vulnerability, please use the [dedicated reporting process](https://github.com/google/mediapipe/security).
|
||||
|
||||
For high-level discussions about MediaPipe, please post to discuss@mediapipe.org. For questions about the development or internal workings of MediaPipe, or if you would like to know how to contribute to MediaPipe, please post to developers@mediapipe.org.
|
||||
|
18
.github/bot_config.yml
vendored
Normal file
|
@ -0,0 +1,18 @@
|
|||
# Copyright 2021 The MediaPipe Authors.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
|
||||
# A list of assignees
|
||||
assignees:
|
||||
- sgowroji
|
34
.github/stale.yml
vendored
Normal file
|
@ -0,0 +1,34 @@
|
|||
# Copyright 2021 The MediaPipe Authors.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# ============================================================================
|
||||
#
|
||||
# This file was assembled from multiple pieces, whose use is documented
|
||||
# throughout. Please refer to the TensorFlow dockerfiles documentation
|
||||
# for more information.
|
||||
|
||||
# Number of days of inactivity before an Issue or Pull Request becomes stale
|
||||
daysUntilStale: 7
|
||||
# Number of days of inactivity before a stale Issue or Pull Request is closed
|
||||
daysUntilClose: 7
|
||||
# Only issues or pull requests with all of these labels are checked if stale. Defaults to `[]` (disabled)
|
||||
onlyLabels:
|
||||
- stat:awaiting response
|
||||
# Comment to post when marking as stale. Set to `false` to disable
|
||||
markComment: >
|
||||
This issue has been automatically marked as stale because it has not had
|
||||
recent activity. It will be closed if no further activity occurs. Thank you.
|
||||
# Comment to post when removing the stale label. Set to `false` to disable
|
||||
unmarkComment: false
|
||||
closeComment: >
|
||||
Closing as stale. Please reopen if you'd like to work on this further.
|
|
@ -8,6 +8,7 @@ include README.md
|
|||
include requirements.txt
|
||||
|
||||
recursive-include mediapipe/modules *.tflite *.txt *.binarypb
|
||||
exclude mediapipe/modules/face_detection/face_detection_full_range.tflite
|
||||
exclude mediapipe/modules/objectron/object_detection_3d_chair_1stage.tflite
|
||||
exclude mediapipe/modules/objectron/object_detection_3d_sneakers_1stage.tflite
|
||||
exclude mediapipe/modules/objectron/object_detection_3d_sneakers.tflite
|
||||
|
|
62
README.md
|
@ -40,11 +40,12 @@ Hair Segmentation
|
|||
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
|
||||
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
|
||||
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
|
||||
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
|
||||
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
|
||||
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
|
||||
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
|
||||
|
@ -54,46 +55,22 @@ See also
|
|||
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
|
||||
for ML models released in MediaPipe.
|
||||
|
||||
## MediaPipe in Python
|
||||
|
||||
MediaPipe offers customizable Python solutions as a prebuilt Python package on
|
||||
[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
|
||||
`pip install mediapipe`. It also provides tools for users to build their own
|
||||
solutions. Please see
|
||||
[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
|
||||
for more info.
|
||||
|
||||
## MediaPipe on the Web
|
||||
|
||||
MediaPipe on the Web is an effort to run the same ML solutions built for mobile
|
||||
and desktop also in web browsers. The official API is under construction, but
|
||||
the core technology has been proven effective. Please see
|
||||
[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
|
||||
in Google Developers Blog for details.
|
||||
|
||||
You can use the following links to load a demo in the MediaPipe Visualizer, and
there click the "Runner" icon in the top bar as shown below. The demos use your
webcam video as input, which is processed entirely locally in real time and
never leaves your device.
|
||||
|
||||

|
||||
|
||||
* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
|
||||
* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
|
||||
* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
|
||||
* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
|
||||
* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
|
||||
* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
|
||||
* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
|
||||
|
||||
## Getting started
|
||||
|
||||
Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
|
||||
MediaPipe and
|
||||
[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
|
||||
and start exploring our ready-to-use
|
||||
[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
|
||||
further extend and customize.
|
||||
To start using MediaPipe
|
||||
[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
|
||||
lines of code, see example code and demos in
|
||||
[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
|
||||
[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
|
||||
|
||||
To use MediaPipe in C++, Android and iOS, which allow further customization of
|
||||
the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
|
||||
building your own, learn how to
|
||||
[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
|
||||
start building example applications in
|
||||
[C++](https://google.github.io/mediapipe/getting_started/cpp),
|
||||
[Android](https://google.github.io/mediapipe/getting_started/android) and
|
||||
[iOS](https://google.github.io/mediapipe/getting_started/ios).
|
||||
|
||||
The source code is hosted in the
|
||||
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
|
||||
|
@ -167,6 +144,13 @@ bash build_macos_desktop_examples.sh --cpu i386 --app face_detection -r
|
|||
|
||||
## Publications
|
||||
|
||||
* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
|
||||
in Google Developers Blog
|
||||
* [Prosthesis control via Mirru App using MediaPipe hand tracking](https://developers.googleblog.com/2021/05/control-your-mirru-prosthesis-with-mediapipe-hand-tracking.html)
|
||||
in Google Developers Blog
|
||||
* [SignAll SDK: Sign language interface using MediaPipe is now available for
|
||||
developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
|
||||
in Google Developers Blog
|
||||
* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
|
||||
in Google AI Blog
|
||||
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
|
||||
|
|
76
WORKSPACE
|
@ -65,26 +65,19 @@ rules_foreign_cc_dependencies()
|
|||
all_content = """filegroup(name = "all", srcs = glob(["**"]), visibility = ["//visibility:public"])"""
|
||||
|
||||
# GoogleTest/GoogleMock framework. Used by most unit-tests.
|
||||
# Last updated 2020-06-30.
|
||||
# Last updated 2021-07-02.
|
||||
http_archive(
|
||||
name = "com_google_googletest",
|
||||
urls = ["https://github.com/google/googletest/archive/aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e.zip"],
|
||||
patches = [
|
||||
# fix for https://github.com/google/googletest/issues/2817
|
||||
"@//third_party:com_google_googletest_9d580ea80592189e6d44fa35bcf9cdea8bf620d6.diff"
|
||||
],
|
||||
patch_args = [
|
||||
"-p1",
|
||||
],
|
||||
strip_prefix = "googletest-aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e",
|
||||
sha256 = "04a1751f94244307cebe695a69cc945f9387a80b0ef1af21394a490697c5c895",
|
||||
urls = ["https://github.com/google/googletest/archive/4ec4cd23f486bf70efcc5d2caa40f24368f752e3.zip"],
|
||||
strip_prefix = "googletest-4ec4cd23f486bf70efcc5d2caa40f24368f752e3",
|
||||
sha256 = "de682ea824bfffba05b4e33b67431c247397d6175962534305136aa06f92e049",
|
||||
)
|
||||
|
||||
# Google Benchmark library.
|
||||
http_archive(
|
||||
name = "com_google_benchmark",
|
||||
urls = ["https://github.com/google/benchmark/archive/master.zip"],
|
||||
strip_prefix = "benchmark-master",
|
||||
urls = ["https://github.com/google/benchmark/archive/main.zip"],
|
||||
strip_prefix = "benchmark-main",
|
||||
build_file = "@//third_party:benchmark.BUILD",
|
||||
)
|
||||
|
||||
|
@ -176,11 +169,11 @@ http_archive(
|
|||
http_archive(
|
||||
name = "pybind11",
|
||||
urls = [
|
||||
"https://storage.googleapis.com/mirror.tensorflow.org/github.com/pybind/pybind11/archive/v2.4.3.tar.gz",
|
||||
"https://github.com/pybind/pybind11/archive/v2.4.3.tar.gz",
|
||||
"https://storage.googleapis.com/mirror.tensorflow.org/github.com/pybind/pybind11/archive/v2.7.1.tar.gz",
|
||||
"https://github.com/pybind/pybind11/archive/v2.7.1.tar.gz",
|
||||
],
|
||||
sha256 = "1eed57bc6863190e35637290f97a20c81cfe4d9090ac0a24f3bbf08f265eb71d",
|
||||
strip_prefix = "pybind11-2.4.3",
|
||||
sha256 = "616d1c42e4cf14fa27b2a4ff759d7d7b33006fdc5ad8fd603bb2c22622f27020",
|
||||
strip_prefix = "pybind11-2.7.1",
|
||||
build_file = "@pybind11_bazel//:pybind11.BUILD",
|
||||
)
|
||||
|
||||
|
@ -254,6 +247,20 @@ http_archive(
|
|||
url = "https://github.com/opencv/opencv/releases/download/3.2.0/opencv-3.2.0-ios-framework.zip",
|
||||
)
|
||||
|
||||
http_archive(
|
||||
name = "stblib",
|
||||
strip_prefix = "stb-b42009b3b9d4ca35bc703f5310eedc74f584be58",
|
||||
sha256 = "13a99ad430e930907f5611325ec384168a958bf7610e63e60e2fd8e7b7379610",
|
||||
urls = ["https://github.com/nothings/stb/archive/b42009b3b9d4ca35bc703f5310eedc74f584be58.tar.gz"],
|
||||
build_file = "@//third_party:stblib.BUILD",
|
||||
patches = [
|
||||
"@//third_party:stb_image_impl.diff"
|
||||
],
|
||||
patch_args = [
|
||||
"-p1",
|
||||
],
|
||||
)
|
||||
|
||||
# You may run setup_android.sh to install Android SDK and NDK.
|
||||
android_ndk_repository(
|
||||
name = "androidndk",
|
||||
|
@ -336,7 +343,9 @@ load("@rules_jvm_external//:defs.bzl", "maven_install")
|
|||
maven_install(
|
||||
artifacts = [
|
||||
"androidx.concurrent:concurrent-futures:1.0.0-alpha03",
|
||||
"androidx.lifecycle:lifecycle-common:2.2.0",
|
||||
"androidx.lifecycle:lifecycle-common:2.3.1",
|
||||
"androidx.activity:activity:1.2.2",
|
||||
"androidx.fragment:fragment:1.3.4",
|
||||
"androidx.annotation:annotation:aar:1.1.0",
|
||||
"androidx.appcompat:appcompat:aar:1.1.0-rc01",
|
||||
"androidx.camera:camera-core:1.0.0-beta10",
|
||||
|
@ -349,11 +358,11 @@ maven_install(
|
|||
"androidx.test.espresso:espresso-core:3.1.1",
|
||||
"com.github.bumptech.glide:glide:4.11.0",
|
||||
"com.google.android.material:material:aar:1.0.0-rc01",
|
||||
"com.google.auto.value:auto-value:1.6.4",
|
||||
"com.google.auto.value:auto-value-annotations:1.6.4",
|
||||
"com.google.code.findbugs:jsr305:3.0.2",
|
||||
"com.google.flogger:flogger-system-backend:0.3.1",
|
||||
"com.google.flogger:flogger:0.3.1",
|
||||
"com.google.auto.value:auto-value:1.8.1",
|
||||
"com.google.auto.value:auto-value-annotations:1.8.1",
|
||||
"com.google.code.findbugs:jsr305:latest.release",
|
||||
"com.google.flogger:flogger-system-backend:latest.release",
|
||||
"com.google.flogger:flogger:latest.release",
|
||||
"com.google.guava:guava:27.0.1-android",
|
||||
"com.google.guava:listenablefuture:1.0",
|
||||
"junit:junit:4.12",
|
||||
|
@ -381,9 +390,9 @@ http_archive(
|
|||
)
|
||||
|
||||
# Tensorflow repo should always go after the other external dependencies.
|
||||
# 2021-04-30
|
||||
_TENSORFLOW_GIT_COMMIT = "5bd3c57ef184543d22e34e36cff9d9bea608e06d"
|
||||
_TENSORFLOW_SHA256= "9a45862834221aafacf6fb275f92b3876bc89443cbecc51be93f13839a6609f0"
|
||||
# 2021-07-29
|
||||
_TENSORFLOW_GIT_COMMIT = "52a2905cbc21034766c08041933053178c5d10e3"
|
||||
_TENSORFLOW_SHA256 = "06d4691bcdb700f3275fa0971a1585221c2b9f3dffe867963be565a6643d7f56"
|
||||
http_archive(
|
||||
name = "org_tensorflow",
|
||||
urls = [
|
||||
|
@ -404,3 +413,18 @@ load("@org_tensorflow//tensorflow:workspace3.bzl", "tf_workspace3")
|
|||
tf_workspace3()
|
||||
load("@org_tensorflow//tensorflow:workspace2.bzl", "tf_workspace2")
|
||||
tf_workspace2()
|
||||
|
||||
# Edge TPU
|
||||
http_archive(
|
||||
name = "libedgetpu",
|
||||
sha256 = "14d5527a943a25bc648c28a9961f954f70ba4d79c0a9ca5ae226e1831d72fe80",
|
||||
strip_prefix = "libedgetpu-3164995622300286ef2bb14d7fdc2792dae045b7",
|
||||
urls = [
|
||||
"https://github.com/google-coral/libedgetpu/archive/3164995622300286ef2bb14d7fdc2792dae045b7.tar.gz"
|
||||
],
|
||||
)
|
||||
load("@libedgetpu//:workspace.bzl", "libedgetpu_dependencies")
|
||||
libedgetpu_dependencies()
|
||||
|
||||
load("@coral_crosstool//:configure.bzl", "cc_crosstool")
|
||||
cc_crosstool(name = "crosstool")
|
||||
|
|
|
@ -97,6 +97,7 @@ for app in ${apps}; do
|
|||
if [[ ${target_name} == "holistic_tracking" ||
|
||||
${target_name} == "iris_tracking" ||
|
||||
${target_name} == "pose_tracking" ||
|
||||
${target_name} == "selfie_segmentation" ||
|
||||
${target_name} == "upper_body_pose_tracking" ]]; then
|
||||
graph_suffix="cpu"
|
||||
else
|
||||
|
|
|
@ -248,12 +248,70 @@ absl::Status MyCalculator::Process() {
|
|||
}
|
||||
```
|
||||
|
||||
## Calculator options
|
||||
|
||||
Calculators accept processing parameters through (1) input stream packets, (2)
|
||||
input side packets, and (3) calculator options. Calculator options, if
|
||||
specified, appear as literal values in the `node_options` field of the
|
||||
`CalculatorGraphConfiguration.Node` message.
|
||||
|
||||
```
|
||||
node {
|
||||
calculator: "TfLiteInferenceCalculator"
|
||||
input_stream: "TENSORS:main_model_input"
|
||||
output_stream: "TENSORS:main_model_output"
|
||||
node_options: {
|
||||
[type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] {
|
||||
model_path: "mediapipe/models/detection_model.tflite"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `node_options` field accepts the proto3 syntax. Alternatively, calculator
|
||||
options can be specified in the `options` field using proto2 syntax.
|
||||
|
||||
```
|
||||
node {
|
||||
calculator: "TfLiteInferenceCalculator"
|
||||
input_stream: "TENSORS:main_model_input"
|
||||
output_stream: "TENSORS:main_model_output"
|
||||
options: {
[mediapipe.TfLiteInferenceCalculatorOptions.ext] {
|
||||
model_path: "mediapipe/models/detection_model.tflite"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
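Input side packets, item (2) above, are the usual way to pass a constant parameter that is supplied when the graph is run rather than written into the graph config. Below is a minimal sketch of reading one; the `MODEL_PATH` tag and the `MyInferenceCalculator` name are illustrative, not part of MediaPipe.

```c++
#include <string>

#include "mediapipe/framework/calculator_framework.h"

namespace mediapipe {

// Hypothetical calculator that receives its model path as an input side packet.
class MyInferenceCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    // Declare the side packet so the framework can type-check the graph.
    cc->InputSidePackets().Tag("MODEL_PATH").Set<std::string>();
    return absl::OkStatus();
  }

  absl::Status Open(CalculatorContext* cc) override {
    // Side packets are constant for the lifetime of the graph, so reading
    // them once in Open() is sufficient.
    model_path_ = cc->InputSidePackets().Tag("MODEL_PATH").Get<std::string>();
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    // Inference on the loaded model would go here.
    return absl::OkStatus();
  }

 private:
  std::string model_path_;
};
REGISTER_CALCULATOR(MyInferenceCalculator);

}  // namespace mediapipe
```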
|
||||
|
||||
Not all calculators accept calculator options. In order to accept options, a
|
||||
calculator will normally define a new protobuf message type to represent its
|
||||
options, such as `PacketClonerCalculatorOptions`. The calculator will then
|
||||
read that protobuf message in its `CalculatorBase::Open` method, and possibly
|
||||
also in its `CalculatorBase::GetContract` function or its
|
||||
`CalculatorBase::Process` method. Normally, the new protobuf message type will
|
||||
be defined as a protobuf schema using a ".proto" file and a
|
||||
`mediapipe_proto_library()` build rule.
|
||||
|
||||
```
|
||||
mediapipe_proto_library(
|
||||
name = "packet_cloner_calculator_proto",
|
||||
srcs = ["packet_cloner_calculator.proto"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
"//mediapipe/framework:calculator_options_proto",
|
||||
"//mediapipe/framework:calculator_proto",
|
||||
],
|
||||
)
|
||||
```
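As a sketch of the reading side, a calculator can fetch those literal values through `CalculatorContext::Options<T>()` once the message type has been generated by a build rule like the one above. The `MyClonerCalculatorOptions` message and its `tick_interval` field are hypothetical stand-ins for whatever options message the calculator defines.

```c++
absl::Status MyClonerCalculator::Open(CalculatorContext* cc) {
  // Options<T>() returns the values written in this node's node_options
  // (proto3) or options (proto2) field of the graph config.
  const auto& options = cc->Options<MyClonerCalculatorOptions>();
  tick_interval_ = options.tick_interval();  // Hypothetical option field.
  return absl::OkStatus();
}
```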
|
||||
|
||||
|
||||
## Example calculator
|
||||
|
||||
This section discusses the implementation of `PacketClonerCalculator`, which
|
||||
does a relatively simple job, and is used in many calculator graphs.
|
||||
`PacketClonerCalculator` simply produces a copy of its most recent input
|
||||
packets on demand.
|
||||
`PacketClonerCalculator` simply produces a copy of its most recent input packets
|
||||
on demand.
|
||||
|
||||
`PacketClonerCalculator` is useful when the timestamps of arriving data packets
|
||||
are not aligned perfectly. Suppose we have a room with a microphone, light
|
||||
|
@ -279,8 +337,8 @@ input streams:
|
|||
imageframe of video data representing video collected from camera in the
|
||||
room with timestamp.
|
||||
|
||||
Below is the implementation of the `PacketClonerCalculator`. You can see
|
||||
the `GetContract()`, `Open()`, and `Process()` methods as well as the instance
|
||||
Below is the implementation of the `PacketClonerCalculator`. You can see the
|
||||
`GetContract()`, `Open()`, and `Process()` methods as well as the instance
|
||||
variable `current_` which holds the most recent input packets.
|
||||
|
||||
```c++
|
||||
|
@ -401,6 +459,6 @@ node {
|
|||
The diagram below shows how the `PacketClonerCalculator` defines its output
|
||||
packets (bottom) based on its series of input packets (top).
|
||||
|
||||
|  |
|
||||
| :---------------------------------------------------------------------------: |
|
||||
| *Each time it receives a packet on its TICK input stream, the PacketClonerCalculator outputs the most recent packet from each of its input streams. The sequence of output packets (bottom) is determined by the sequence of input packets (top) and their timestamps. The timestamps are shown along the right side of the diagram.* |
|
||||
 |
|
||||
:--------------------------------------------------------------------------: |
|
||||
*Each time it receives a packet on its TICK input stream, the PacketClonerCalculator outputs the most recent packet from each of its input streams. The sequence of output packets (bottom) is determined by the sequence of input packets (top) and their timestamps. The timestamps are shown along the right side of the diagram.* |
|
||||
|
|
|
@ -111,11 +111,11 @@ component known as an InputStreamHandler.
|
|||
|
||||
See [Synchronization](synchronization.md) for more details.
|
||||
|
||||
### Realtime data streams
|
||||
### Real-time streams
|
||||
|
||||
MediaPipe calculator graphs are often used to process streams of video or audio
|
||||
frames for interactive applications. Normally, each Calculator runs as soon as
|
||||
all of its input packets for a given timestamp become available. Calculators
|
||||
used in realtime graphs need to define output timestamp bounds based on input
|
||||
used in real-time graphs need to define output timestamp bounds based on input
|
||||
timestamp bounds in order to allow downstream calculators to be scheduled
|
||||
promptly. See [Realtime data streams](realtime.md) for details.
|
||||
promptly. See [Real-time Streams](realtime_streams.md) for details.
|
||||
|
|
|
@ -1,29 +1,28 @@
|
|||
---
|
||||
layout: default
|
||||
title: Processing real-time data streams
|
||||
title: Real-time Streams
|
||||
parent: Framework Concepts
|
||||
nav_order: 6
|
||||
has_children: true
|
||||
has_toc: false
|
||||
---
|
||||
|
||||
# Processing real-time data streams
|
||||
# Real-time Streams
|
||||
{: .no_toc }
|
||||
|
||||
1. TOC
|
||||
{:toc}
|
||||
---
|
||||
|
||||
## Realtime timestamps
|
||||
## Real-time timestamps
|
||||
|
||||
MediaPipe calculator graphs are often used to process streams of video or audio
|
||||
frames for interactive applications. The MediaPipe framework requires only that
|
||||
successive packets be assigned monotonically increasing timestamps. By
|
||||
convention, realtime calculators and graphs use the recording time or the
|
||||
convention, real-time calculators and graphs use the recording time or the
|
||||
presentation time of each frame as its timestamp, with each timestamp indicating
|
||||
the microseconds since `Jan/1/1970:00:00:00`. This allows packets from various
|
||||
sources to be processed in a globally consistent sequence.
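For example, a source calculator following this convention might derive each packet timestamp from the frame's capture time. This is a minimal sketch, assuming the capture time is available as an `absl::Time`:

```c++
#include "absl/time/time.h"
#include "mediapipe/framework/timestamp.h"

// MediaPipe timestamps are integer microsecond values, so a Unix capture time
// converts directly into a Timestamp.
mediapipe::Timestamp TimestampFromCaptureTime(absl::Time capture_time) {
  return mediapipe::Timestamp(absl::ToUnixMicros(capture_time));
}
```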
|
||||
|
||||
## Realtime scheduling
|
||||
## Real-time scheduling
|
||||
|
||||
Normally, each Calculator runs as soon as all of its input packets for a given
|
||||
timestamp become available. Normally, this happens when the calculator has
|
||||
|
@ -38,7 +37,7 @@ When a calculator does not produce any output packets for a given timestamp, it
|
|||
can instead output a "timestamp bound" indicating that no packet will be
|
||||
produced for that timestamp. This indication is necessary to allow downstream
|
||||
calculators to run at that timestamp, even though no packet has arrived for
|
||||
certain streams for that timestamp. This is especially important for realtime
|
||||
certain streams for that timestamp. This is especially important for real-time
|
||||
graphs in interactive applications, where it is crucial that each calculator
|
||||
begin processing as soon as possible.
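What this can look like inside `Process` is sketched below; the stream tags, the calculator name, and the `ShouldEmit` predicate are illustrative, not MediaPipe APIs.

```c++
absl::Status MyFilterCalculator::Process(CalculatorContext* cc) {
  if (ShouldEmit(cc->Inputs().Tag("IN").Value())) {
    // Forward the packet; it keeps its original timestamp.
    cc->Outputs().Tag("OUT").AddPacket(cc->Inputs().Tag("IN").Value());
  } else {
    // No output at this timestamp: publish a timestamp bound instead, so
    // downstream calculators can run without waiting on this stream.
    cc->Outputs().Tag("OUT").SetNextTimestampBound(
        cc->InputTimestamp().NextAllowedInStream());
  }
  return absl::OkStatus();
}
```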
|
||||
|
||||
|
@ -83,12 +82,12 @@ For example, `Timestamp(1).NextAllowedInStream() == Timestamp(2)`.
|
|||
|
||||
## Propagating timestamp bounds
|
||||
|
||||
Calculators that will be used in realtime graphs need to define output timestamp
|
||||
bounds based on input timestamp bounds in order to allow downstream calculators
|
||||
to be scheduled promptly. A common pattern is for calculators to output packets
|
||||
with the same timestamps as their input packets. In this case, simply outputting
|
||||
a packet on every call to `Calculator::Process` is sufficient to define output
|
||||
timestamp bounds.
|
||||
Calculators that will be used in real-time graphs need to define output
|
||||
timestamp bounds based on input timestamp bounds in order to allow downstream
|
||||
calculators to be scheduled promptly. A common pattern is for calculators to
|
||||
output packets with the same timestamps as their input packets. In this case,
|
||||
simply outputting a packet on every call to `Calculator::Process` is sufficient
|
||||
to define output timestamp bounds.
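A minimal sketch of that common pattern, for a hypothetical single-stream pass-through calculator:

```c++
absl::Status MyPassThroughCalculator::Process(CalculatorContext* cc) {
  // Emitting a packet at the input timestamp on every call implicitly
  // advances the output timestamp bound; no explicit bound is needed.
  cc->Outputs().Index(0).AddPacket(
      cc->Inputs().Index(0).Value().At(cc->InputTimestamp()));
  return absl::OkStatus();
}
```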
|
||||
|
||||
However, calculators are not required to follow this common pattern for output
|
||||
timestamps; they are only required to choose monotonically increasing output
|
|
@ -16,12 +16,14 @@ nav_order: 1
|
|||
|
||||
Please follow instructions below to build Android example apps in the supported
|
||||
MediaPipe [solutions](../solutions/solutions.md). To learn more about these
|
||||
example apps, start from [Hello World! on Android](./hello_world_android.md). To
|
||||
incorporate MediaPipe into an existing Android Studio project, see these
|
||||
[instructions](./android_archive_library.md) that use Android Archive (AAR) and
|
||||
Gradle.
|
||||
example apps, start from [Hello World! on Android](./hello_world_android.md).
|
||||
|
||||
## Building Android example apps
|
||||
To incorporate MediaPipe into Android Studio projects, see these
|
||||
[instructions](./android_solutions.md) to use the MediaPipe Android Solution
|
||||
APIs (currently in alpha) that are now available in
|
||||
[Google's Maven Repository](https://maven.google.com/web/index.html?#com.google.mediapipe).
|
||||
|
||||
## Building Android example apps with Bazel
|
||||
|
||||
### Prerequisite
|
||||
|
||||
|
@ -51,16 +53,6 @@ $YOUR_INTENDED_API_LEVEL` in android_ndk_repository() and/or
|
|||
android_sdk_repository() in the
|
||||
[`WORKSPACE`](https://github.com/google/mediapipe/blob/master/WORKSPACE) file.
|
||||
|
||||
Please verify all the necessary packages are installed.
|
||||
|
||||
* Android SDK Platform API Level 28 or 29
|
||||
* Android SDK Build-Tools 28 or 29
|
||||
* Android SDK Platform-Tools 28 or 29
|
||||
* Android SDK Tools 26.1.1
|
||||
* Android NDK 19c or above
|
||||
|
||||
### Option 1: Build with Bazel in Command Line
|
||||
|
||||
Tip: You can run this
|
||||
[script](https://github.com/google/mediapipe/blob/master/build_android_examples.sh)
|
||||
to build (and install) all MediaPipe Android example apps.
|
||||
|
@ -84,108 +76,3 @@ to build (and install) all MediaPipe Android example apps.
|
|||
```bash
|
||||
adb install bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/handtrackinggpu/handtrackinggpu.apk
|
||||
```
|
||||
|
||||
### Option 2: Build with Bazel in Android Studio
|
||||
|
||||
The MediaPipe project can be imported into Android Studio using the Bazel
|
||||
plugins. This allows the MediaPipe examples to be built and modified in Android
|
||||
Studio.
|
||||
|
||||
To incorporate MediaPipe into an existing Android Studio project, see these
|
||||
[instructions](./android_archive_library.md) that use Android Archive (AAR) and
|
||||
Gradle.
|
||||
|
||||
The steps below use Android Studio 3.5 to build and install a MediaPipe example
|
||||
app:
|
||||
|
||||
1. Install and launch Android Studio 3.5.
|
||||
|
||||
2. Select `Configure` -> `SDK Manager` -> `SDK Platforms`.
|
||||
|
||||
* Verify that Android SDK Platform API Level 28 or 29 is installed.
|
||||
* Take note of the Android SDK Location, e.g.,
|
||||
`/usr/local/home/Android/Sdk`.
|
||||
|
||||
3. Select `Configure` -> `SDK Manager` -> `SDK Tools`.
|
||||
|
||||
* Verify that Android SDK Build-Tools 28 or 29 is installed.
|
||||
* Verify that Android SDK Platform-Tools 28 or 29 is installed.
|
||||
* Verify that Android SDK Tools 26.1.1 is installed.
|
||||
* Verify that Android NDK 19c or above is installed.
|
||||
* Take note of the Android NDK Location, e.g.,
|
||||
`/usr/local/home/Android/Sdk/ndk-bundle` or
|
||||
`/usr/local/home/Android/Sdk/ndk/20.0.5594570`.
|
||||
|
||||
4. Set environment variables `$ANDROID_HOME` and `$ANDROID_NDK_HOME` to point
|
||||
to the installed SDK and NDK.
|
||||
|
||||
```bash
|
||||
export ANDROID_HOME=/usr/local/home/Android/Sdk
|
||||
|
||||
# If the NDK libraries are installed by a previous version of Android Studio, do
|
||||
export ANDROID_NDK_HOME=/usr/local/home/Android/Sdk/ndk-bundle
|
||||
# If the NDK libraries are installed by Android Studio 3.5, do
|
||||
export ANDROID_NDK_HOME=/usr/local/home/Android/Sdk/ndk/<version number>
|
||||
```
|
||||
|
||||
5. Select `Configure` -> `Plugins` to install `Bazel`.
|
||||
|
||||
6. On Linux, select `File` -> `Settings` -> `Bazel settings`. On macOS, select
|
||||
`Android Studio` -> `Preferences` -> `Bazel settings`. Then, modify `Bazel
|
||||
binary location` to be the same as the output of `$ which bazel`.
|
||||
|
||||
7. Select `Import Bazel Project`.
|
||||
|
||||
* Select `Workspace`: `/path/to/mediapipe` and select `Next`.
|
||||
* Select `Generate from BUILD file`: `/path/to/mediapipe/BUILD` and select
|
||||
`Next`.
|
||||
* Modify `Project View` to be the following and select `Finish`.
|
||||
|
||||
```
|
||||
directories:
|
||||
# read project settings, e.g., .bazelrc
|
||||
.
|
||||
-mediapipe/objc
|
||||
-mediapipe/examples/ios
|
||||
|
||||
targets:
|
||||
//mediapipe/examples/android/...:all
|
||||
//mediapipe/java/...:all
|
||||
|
||||
android_sdk_platform: android-29
|
||||
|
||||
sync_flags:
|
||||
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain
|
||||
```
|
||||
|
||||
8. Select `Bazel` -> `Sync` -> `Sync project with Build files`.
|
||||
|
||||
Note: Even after doing step 4, if you still see the error: `"no such package
|
||||
'@androidsdk//': Either the path attribute of android_sdk_repository or the
|
||||
ANDROID_HOME environment variable must be set."`, please modify the
|
||||
[`WORKSPACE`](https://github.com/google/mediapipe/blob/master/WORKSPACE)
|
||||
file to point to your SDK and NDK library locations, as below:
|
||||
|
||||
```
|
||||
android_sdk_repository(
|
||||
name = "androidsdk",
|
||||
path = "/path/to/android/sdk"
|
||||
)
|
||||
|
||||
android_ndk_repository(
|
||||
name = "androidndk",
|
||||
path = "/path/to/android/ndk"
|
||||
)
|
||||
```
|
||||
|
||||
9. Connect an Android device to the workstation.
|
||||
|
||||
10. Select `Run...` -> `Edit Configurations...`.
|
||||
|
||||
* Select `Templates` -> `Bazel Command`.
|
||||
* Enter Target Expression:
|
||||
`//mediapipe/examples/android/src/java/com/google/mediapipe/apps/handtrackinggpu:handtrackinggpu`
|
||||
* Enter Bazel command: `mobile-install`.
|
||||
* Enter Bazel flags: `-c opt --config=android_arm64`.
|
||||
* Press the `[+]` button to add the new configuration.
|
||||
* Select `Run` to run the example app on the connected Android device.
|
||||
|
|
|
@ -3,7 +3,7 @@ layout: default
|
|||
title: MediaPipe Android Archive
|
||||
parent: MediaPipe on Android
|
||||
grand_parent: Getting Started
|
||||
nav_order: 2
|
||||
nav_order: 3
|
||||
---
|
||||
|
||||
# MediaPipe Android Archive
|
||||
|
@ -92,12 +92,12 @@ each project.
|
|||
and copy
|
||||
[the binary graph](https://github.com/google/mediapipe/blob/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/facedetectiongpu/BUILD#L41)
|
||||
and
|
||||
[the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite).
|
||||
[the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite).
|
||||
|
||||
```bash
|
||||
bazel build -c opt mediapipe/graphs/face_detection:face_detection_mobile_gpu_binary_graph
|
||||
cp bazel-bin/mediapipe/graphs/face_detection/face_detection_mobile_gpu.binarypb /path/to/your/app/src/main/assets/
|
||||
cp mediapipe/modules/face_detection/face_detection_front.tflite /path/to/your/app/src/main/assets/
|
||||
cp mediapipe/modules/face_detection/face_detection_short_range.tflite /path/to/your/app/src/main/assets/
|
||||
```
|
||||
|
||||

|
||||
|
@ -113,10 +113,9 @@ each project.
|
|||
androidTestImplementation 'androidx.test.ext:junit:1.1.0'
|
||||
androidTestImplementation 'androidx.test.espresso:espresso-core:3.1.1'
|
||||
// MediaPipe deps
|
||||
implementation 'com.google.flogger:flogger:0.3.1'
|
||||
implementation 'com.google.flogger:flogger-system-backend:0.3.1'
|
||||
implementation 'com.google.code.findbugs:jsr305:3.0.2'
|
||||
implementation 'com.google.guava:guava:27.0.1-android'
|
||||
implementation 'com.google.flogger:flogger:latest.release'
|
||||
implementation 'com.google.flogger:flogger-system-backend:latest.release'
|
||||
implementation 'com.google.code.findbugs:jsr305:latest.release'
|
||||
implementation 'com.google.guava:guava:27.0.1-android'
|
||||
implementation 'com.google.protobuf:protobuf-java:3.11.4'
|
||||
// CameraX core library
|
||||
|
@ -125,7 +124,7 @@ each project.
|
|||
implementation "androidx.camera:camera-camera2:$camerax_version"
|
||||
implementation "androidx.camera:camera-lifecycle:$camerax_version"
|
||||
// AutoValue
|
||||
def auto_value_version = "1.6.4"
|
||||
def auto_value_version = "1.8.1"
|
||||
implementation "com.google.auto.value:auto-value-annotations:$auto_value_version"
|
||||
annotationProcessor "com.google.auto.value:auto-value:$auto_value_version"
|
||||
}
|
||||
|
|
79
docs/getting_started/android_solutions.md
Normal file
|
@ -0,0 +1,79 @@
|
|||
---
|
||||
layout: default
|
||||
title: Android Solutions
|
||||
parent: MediaPipe on Android
|
||||
grand_parent: Getting Started
|
||||
nav_order: 2
|
||||
---
|
||||
|
||||
# Android Solution APIs
|
||||
{: .no_toc }
|
||||
|
||||
1. TOC
|
||||
{:toc}
|
||||
---
|
||||
|
||||
Please follow the instructions below to use the MediaPipe Solution APIs in Android
|
||||
Studio projects and build the Android example apps in the supported MediaPipe
|
||||
[solutions](../solutions/solutions.md).
|
||||
|
||||
## Integrate MediaPipe Android Solutions in Android Studio
|
||||
|
||||
MediaPipe Android Solution APIs (currently in alpha) are now available in
|
||||
[Google's Maven Repository](https://maven.google.com/web/index.html?#com.google.mediapipe).
|
||||
To incorporate MediaPipe Android Solutions into an Android Studio project, add
|
||||
the following into the project's Gradle dependencies:
|
||||
|
||||
```
|
||||
dependencies {
|
||||
// MediaPipe solution-core is the foundation of any MediaPipe solutions.
|
||||
implementation 'com.google.mediapipe:solution-core:latest.release'
|
||||
// Optional: MediaPipe Hands solution.
|
||||
implementation 'com.google.mediapipe:hands:latest.release'
|
||||
// Optional: MediaPipe FaceMesh solution.
|
||||
implementation 'com.google.mediapipe:facemesh:latest.release'
|
||||
// MediaPipe deps
|
||||
implementation 'com.google.flogger:flogger:latest.release'
|
||||
implementation 'com.google.flogger:flogger-system-backend:latest.release'
|
||||
implementation 'com.google.guava:guava:27.0.1-android'
|
||||
implementation 'com.google.protobuf:protobuf-java:3.11.4'
|
||||
// CameraX core library
|
||||
def camerax_version = "1.0.0-beta10"
|
||||
implementation "androidx.camera:camera-core:$camerax_version"
|
||||
implementation "androidx.camera:camera-camera2:$camerax_version"
|
||||
implementation "androidx.camera:camera-lifecycle:$camerax_version"
|
||||
}
|
||||
```
|
||||
|
||||
See the detailed solutions API usage examples for different use cases in the
|
||||
solution example apps'
|
||||
[source code](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions).
|
||||
If the prebuilt Maven packages are not sufficient, build the MediaPipe
|
||||
Android archive library locally by following these
|
||||
[instructions](./android_archive_library.md).
|
||||
|
||||
## Build solution example apps in Android Studio
|
||||
|
||||
1. Open Android Studio Arctic Fox on Linux, macOS, or Windows.
|
||||
|
||||
2. Import the mediapipe/examples/android/solutions directory into Android Studio.
|
||||
|
||||

|
||||
|
||||
3. For Windows users, run `create_win_symlinks.bat` as administrator to create
|
||||
res directory symlinks.
|
||||
|
||||

|
||||
|
||||
4. Select "File" -> "Sync Project with Gradle Files" to sync project.
|
||||
|
||||
5. Run solution example app in Android Studio.
|
||||
|
||||

|
||||
|
||||
6. (Optional) Run solutions on CPU.
|
||||
|
||||
MediaPipe solution example apps run the pipeline and the model inference on
|
||||
GPU by default. If needed, for example to run the apps on Android Emulator,
|
||||
set the `RUN_ON_GPU` boolean variable to `false` in the app's
|
||||
MainActivity.java to run the pipeline and the model inference on CPU.
|
|
@ -31,8 +31,8 @@ stream on an Android device.
|
|||
|
||||
## Setup
|
||||
|
||||
1. Install MediaPipe on your system, see [MediaPipe installation guide] for
|
||||
details.
|
||||
1. Install MediaPipe on your system, see
|
||||
[MediaPipe installation guide](./install.md) for details.
|
||||
2. Install Android Development SDK and Android NDK. See how to do so also in
|
||||
[MediaPipe installation guide].
|
||||
3. Enable [developer options] on your Android device.
|
||||
|
@ -770,7 +770,6 @@ If you ran into any issues, please see the full code of the tutorial
|
|||
[`ExternalTextureConverter`]:https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/ExternalTextureConverter.java
|
||||
[`FrameLayout`]:https://developer.android.com/reference/android/widget/FrameLayout
|
||||
[`FrameProcessor`]:https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/FrameProcessor.java
|
||||
[MediaPipe installation guide]:./install.md
|
||||
[`PermissionHelper`]: https://github.com/google/mediapipe/tree/master/mediapipe/java/com/google/mediapipe/components/PermissionHelper.java
|
||||
[`SurfaceHolder.Callback`]:https://developer.android.com/reference/android/view/SurfaceHolder.Callback.html
|
||||
[`SurfaceView`]:https://developer.android.com/reference/android/view/SurfaceView
|
||||
|
|
|
@ -31,8 +31,8 @@ stream on an iOS device.
|
|||
|
||||
## Setup
|
||||
|
||||
1. Install MediaPipe on your system, see [MediaPipe installation guide] for
|
||||
details.
|
||||
1. Install MediaPipe on your system, see
|
||||
[MediaPipe installation guide](./install.md) for details.
|
||||
2. Setup your iOS device for development.
|
||||
3. Setup [Bazel] on your system to build and deploy the iOS app.
|
||||
|
||||
|
@ -113,6 +113,10 @@ bazel to build the iOS application. The content of the
|
|||
5. `Main.storyboard` and `Launch.storyboard`
|
||||
6. `Assets.xcassets` directory.
|
||||
|
||||
Note: In newer versions of Xcode, you may see additional files `SceneDelegate.h`
|
||||
and `SceneDelegate.m`. Make sure to copy them too and add them to the `BUILD`
|
||||
file mentioned below.
|
||||
|
||||
Copy these files to a directory named `HelloWorld` in a location that can access
|
||||
the MediaPipe source code. For example, the source code of the application that
|
||||
we will build in this tutorial is located in
|
||||
|
@ -247,6 +251,12 @@ We need to get frames from the `_cameraSource` into our application
|
|||
`MPPInputSourceDelegate`. So our application `ViewController` can be a delegate
|
||||
of `_cameraSource`.
|
||||
|
||||
Update the interface definition of `ViewController` accordingly:
|
||||
|
||||
```
|
||||
@interface ViewController () <MPPInputSourceDelegate>
|
||||
```
|
||||
|
||||
To handle camera setup and process incoming frames, we should use a queue
|
||||
different from the main queue. Add the following to the implementation block of
|
||||
the `ViewController`:
|
||||
|
@ -288,6 +298,12 @@ utility called `MPPLayerRenderer` to display images on the screen. This utility
|
|||
can be used to display `CVPixelBufferRef` objects, which is the type of the
|
||||
images provided by `MPPCameraInputSource` to its delegates.
|
||||
|
||||
In `ViewController.m`, add the following import line:
|
||||
|
||||
```
|
||||
#import "mediapipe/objc/MPPLayerRenderer.h"
|
||||
```
|
||||
|
||||
To display images on the screen, we need to add a new `UIView` object called
|
||||
`_liveView` to the `ViewController`.
|
||||
|
||||
|
@ -411,6 +427,12 @@ Objective-C++.
|
|||
|
||||
### Use the graph in `ViewController`
|
||||
|
||||
In `ViewController.m`, add the following import line:
|
||||
|
||||
```
|
||||
#import "mediapipe/objc/MPPGraph.h"
|
||||
```
|
||||
|
||||
Declare a static constant with the name of the graph, the input stream and the
|
||||
output stream:
|
||||
|
||||
|
@ -549,6 +571,12 @@ method to receive packets on this output stream and display them on the screen:
|
|||
}
|
||||
```
|
||||
|
||||
Update the interface definition of `ViewController` with `MPPGraphDelegate`:
|
||||
|
||||
```
|
||||
@interface ViewController () <MPPGraphDelegate, MPPInputSourceDelegate>
|
||||
```
|
||||
|
||||
And that is all! Build and run the app on your iOS device. You should see the
|
||||
results of running the edge detection graph on a live video feed. Congrats!
|
||||
|
||||
|
@ -560,6 +588,5 @@ appropriate `BUILD` file dependencies for the edge detection graph.
|
|||
|
||||
[Bazel]:https://bazel.build/
|
||||
[`edge_detection_mobile_gpu.pbtxt`]:https://github.com/google/mediapipe/tree/master/mediapipe/graphs/edge_detection/edge_detection_mobile_gpu.pbtxt
|
||||
[MediaPipe installation guide]:./install.md
|
||||
[common]:(https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/common)
|
||||
[helloworld]:(https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/helloworld)
|
||||
[common]:https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/common
|
||||
[helloworld]:https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/helloworld
|
||||
|
|
|
@ -43,104 +43,189 @@ install --user six`.
|
|||
|
||||
3. Install OpenCV and FFmpeg.
|
||||
|
||||
Option 1. Use package manager tool to install the pre-compiled OpenCV
|
||||
libraries. FFmpeg will be installed via libopencv-video-dev.
|
||||
**Option 1**. Use a package manager to install the pre-compiled OpenCV
|
||||
libraries. FFmpeg will be installed via `libopencv-video-dev`.
|
||||
|
||||
Note: Debian 9 and Ubuntu 16.04 provide OpenCV 2.4.9. You may want to take
|
||||
option 2 or 3 to install OpenCV 3 or above.
|
||||
OS | OpenCV
|
||||
-------------------- | ------
|
||||
Debian 9 (stretch) | 2.4
|
||||
Debian 10 (buster) | 3.2
|
||||
Debian 11 (bullseye) | 4.5
|
||||
Ubuntu 16.04 LTS | 2.4
|
||||
Ubuntu 18.04 LTS | 3.2
|
||||
Ubuntu 20.04 LTS | 4.2
|
||||
Ubuntu 21.04 | 4.5
|
||||
|
||||
```bash
|
||||
$ sudo apt-get install libopencv-core-dev libopencv-highgui-dev \
|
||||
libopencv-calib3d-dev libopencv-features2d-dev \
|
||||
libopencv-imgproc-dev libopencv-video-dev
|
||||
$ sudo apt-get install -y \
|
||||
libopencv-core-dev \
|
||||
libopencv-highgui-dev \
|
||||
libopencv-calib3d-dev \
|
||||
libopencv-features2d-dev \
|
||||
libopencv-imgproc-dev \
|
||||
libopencv-video-dev
|
||||
```
|
||||
|
||||
Debian 9 and Ubuntu 18.04 install the packages in
|
||||
`/usr/lib/x86_64-linux-gnu`. MediaPipe's [`opencv_linux.BUILD`] and
|
||||
[`ffmpeg_linux.BUILD`] are configured for this library path. Ubuntu 20.04
|
||||
may install the OpenCV and FFmpeg packages in `/usr/local`, Please follow
|
||||
the option 3 below to modify the [`WORKSPACE`], [`opencv_linux.BUILD`] and
|
||||
[`ffmpeg_linux.BUILD`] files accordingly.
|
||||
|
||||
Moreover, for Nvidia Jetson and Raspberry Pi devices with ARM Ubuntu, the
|
||||
library path needs to be modified like the following:
|
||||
MediaPipe's [`opencv_linux.BUILD`] and [`WORKSPACE`] are already configured
|
||||
for OpenCV 2/3 and should work correctly on any architecture:
|
||||
|
||||
```bash
|
||||
sed -i "s/x86_64-linux-gnu/aarch64-linux-gnu/g" third_party/opencv_linux.BUILD
|
||||
# WORKSPACE
|
||||
new_local_repository(
|
||||
name = "linux_opencv",
|
||||
build_file = "@//third_party:opencv_linux.BUILD",
|
||||
path = "/usr",
|
||||
)
|
||||
|
||||
# opencv_linux.BUILD for OpenCV 2/3 installed from Debian package
|
||||
cc_library(
|
||||
name = "opencv",
|
||||
linkopts = [
|
||||
"-l:libopencv_core.so",
|
||||
"-l:libopencv_calib3d.so",
|
||||
"-l:libopencv_features2d.so",
|
||||
"-l:libopencv_highgui.so",
|
||||
"-l:libopencv_imgcodecs.so",
|
||||
"-l:libopencv_imgproc.so",
|
||||
"-l:libopencv_video.so",
|
||||
"-l:libopencv_videoio.so",
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
Option 2. Run [`setup_opencv.sh`] to automatically build OpenCV from source
|
||||
and modify MediaPipe's OpenCV config.
|
||||
For OpenCV 4 you need to modify [`opencv_linux.BUILD`] taking into account
|
||||
current architecture:
|
||||
|
||||
Option 3. Follow OpenCV's
|
||||
```bash
|
||||
# WORKSPACE
|
||||
new_local_repository(
|
||||
name = "linux_opencv",
|
||||
build_file = "@//third_party:opencv_linux.BUILD",
|
||||
path = "/usr",
|
||||
)
|
||||
|
||||
# opencv_linux.BUILD for OpenCV 4 installed from Debian package
|
||||
cc_library(
|
||||
name = "opencv",
|
||||
hdrs = glob([
|
||||
# Uncomment according to your multiarch value (gcc -print-multiarch):
|
||||
# "include/aarch64-linux-gnu/opencv4/opencv2/cvconfig.h",
|
||||
# "include/arm-linux-gnueabihf/opencv4/opencv2/cvconfig.h",
|
||||
# "include/x86_64-linux-gnu/opencv4/opencv2/cvconfig.h",
|
||||
"include/opencv4/opencv2/**/*.h*",
|
||||
]),
|
||||
includes = [
|
||||
# Uncomment according to your multiarch value (gcc -print-multiarch):
|
||||
# "include/aarch64-linux-gnu/opencv4/",
|
||||
# "include/arm-linux-gnueabihf/opencv4/",
|
||||
# "include/x86_64-linux-gnu/opencv4/",
|
||||
"include/opencv4/",
|
||||
],
|
||||
linkopts = [
|
||||
"-l:libopencv_core.so",
|
||||
"-l:libopencv_calib3d.so",
|
||||
"-l:libopencv_features2d.so",
|
||||
"-l:libopencv_highgui.so",
|
||||
"-l:libopencv_imgcodecs.so",
|
||||
"-l:libopencv_imgproc.so",
|
||||
"-l:libopencv_video.so",
|
||||
"-l:libopencv_videoio.so",
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
**Option 2**. Run [`setup_opencv.sh`] to automatically build OpenCV from
|
||||
source and modify MediaPipe's OpenCV config. This option will do all steps
|
||||
defined in Option 3 automatically.
|
||||
|
||||
**Option 3**. Follow OpenCV's
|
||||
[documentation](https://docs.opencv.org/3.4.6/d7/d9f/tutorial_linux_install.html)
|
||||
to manually build OpenCV from source code.
|
||||
|
||||
Note: You may need to modify [`WORKSPACE`], [`opencv_linux.BUILD`] and
|
||||
[`ffmpeg_linux.BUILD`] to point MediaPipe to your own OpenCV and FFmpeg
|
||||
libraries. For example if OpenCV and FFmpeg are both manually installed in
|
||||
"/usr/local/", you will need to update: (1) the "linux_opencv" and
|
||||
"linux_ffmpeg" new_local_repository rules in [`WORKSPACE`], (2) the "opencv"
|
||||
cc_library rule in [`opencv_linux.BUILD`], and (3) the "libffmpeg"
|
||||
cc_library rule in [`ffmpeg_linux.BUILD`]. These 3 changes are shown below:
|
||||
You may need to modify [`WORKSPACE`] and [`opencv_linux.BUILD`] to point
|
||||
MediaPipe to your own OpenCV libraries. The examples below assume OpenCV is
installed to `/usr/local/`, which is the recommended default.
|
||||
|
||||
OpenCV 2/3 setup:
|
||||
|
||||
```bash
|
||||
# WORKSPACE
|
||||
new_local_repository(
|
||||
name = "linux_opencv",
|
||||
build_file = "@//third_party:opencv_linux.BUILD",
|
||||
path = "/usr/local",
|
||||
)
|
||||
|
||||
# opencv_linux.BUILD for OpenCV 2/3 installed to /usr/local
|
||||
cc_library(
|
||||
name = "opencv",
|
||||
linkopts = [
|
||||
"-L/usr/local/lib",
|
||||
"-l:libopencv_core.so",
|
||||
"-l:libopencv_calib3d.so",
|
||||
"-l:libopencv_features2d.so",
|
||||
"-l:libopencv_highgui.so",
|
||||
"-l:libopencv_imgcodecs.so",
|
||||
"-l:libopencv_imgproc.so",
|
||||
"-l:libopencv_video.so",
|
||||
"-l:libopencv_videoio.so",
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
OpenCV 4 setup:
|
||||
|
||||
```bash
|
||||
# WORKSPACE
|
||||
new_local_repository(
|
||||
name = "linux_ffmpeg",
|
||||
build_file = "@//third_party:ffmpeg_linux.BUILD",
|
||||
name = "linux_opencv",
|
||||
build_file = "@//third_party:opencv_linux.BUILD",
|
||||
path = "/usr/local",
|
||||
)
|
||||
|
||||
# opencv_linux.BUILD for OpenCV 4 installed to /usr/local
|
||||
cc_library(
|
||||
name = "opencv",
|
||||
srcs = glob(
|
||||
[
|
||||
"lib/libopencv_core.so",
|
||||
"lib/libopencv_highgui.so",
|
||||
"lib/libopencv_imgcodecs.so",
|
||||
"lib/libopencv_imgproc.so",
|
||||
"lib/libopencv_video.so",
|
||||
"lib/libopencv_videoio.so",
|
||||
],
|
||||
),
|
||||
hdrs = glob([
    "include/opencv4/opencv2/**/*.h*",
]),
includes = [
    "include/opencv4/",
],
|
||||
linkstatic = 1,
|
||||
visibility = ["//visibility:public"],
|
||||
linkopts = [
|
||||
"-L/usr/local/lib",
|
||||
"-l:libopencv_core.so",
|
||||
"-l:libopencv_calib3d.so",
|
||||
"-l:libopencv_features2d.so",
|
||||
"-l:libopencv_highgui.so",
|
||||
"-l:libopencv_imgcodecs.so",
|
||||
"-l:libopencv_imgproc.so",
|
||||
"-l:libopencv_video.so",
|
||||
"-l:libopencv_videoio.so",
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
Current FFmpeg setup is defined in [`ffmpeg_linux.BUILD`] and should work
|
||||
for any architecture:
|
||||
|
||||
```bash
|
||||
# WORKSPACE
|
||||
new_local_repository(
|
||||
name = "linux_ffmpeg",
|
||||
build_file = "@//third_party:ffmpeg_linux.BUILD",
|
||||
path = "/usr"
|
||||
)
|
||||
|
||||
# ffmpeg_linux.BUILD for FFmpeg installed from Debian package
|
||||
cc_library(
|
||||
name = "libffmpeg",
|
||||
srcs = glob(
|
||||
[
|
||||
"lib/libav*.so",
|
||||
],
|
||||
),
|
||||
hdrs = glob(["include/libav*/*.h"]),
|
||||
includes = ["include"],
|
||||
linkopts = [
|
||||
"-lavcodec",
|
||||
"-lavformat",
|
||||
"-lavutil",
|
||||
"-l:libavcodec.so",
|
||||
"-l:libavformat.so",
|
||||
"-l:libavutil.so",
|
||||
],
|
||||
linkstatic = 1,
|
||||
visibility = ["//visibility:public"],
|
||||
)
|
||||
```
|
||||
|
||||
|
@ -711,7 +796,7 @@ This will use a Docker image that will isolate mediapipe's installation from the
|
|||
```bash
|
||||
$ docker run -it --name mediapipe mediapipe:latest
|
||||
|
||||
root@bca08b91ff63:/mediapipe# GLOG_logtostderr=1 bazelisk run --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/hello_world:hello_world
|
||||
|
||||
# Should print:
|
||||
# Hello World!
|
||||
|
|
|
@ -17,16 +17,28 @@ nav_order: 4
|
|||
MediaPipe currently offers the following solutions:
|
||||
|
||||
Solution | NPM Package | Example
|
||||
--------------------------- | --------------------------------------- | -------
|
||||
[Face Mesh][F-pg] | [@mediapipe/face_mesh][F-npm] | [mediapipe.dev/demo/face_mesh][F-demo]
|
||||
[Face Detection][Fd-pg] | [@mediapipe/face_detection][Fd-npm] | [mediapipe.dev/demo/face_detection][Fd-demo]
|
||||
[Hands][H-pg] | [@mediapipe/hands][H-npm] | [mediapipe.dev/demo/hands][H-demo]
|
||||
[Holistic][Ho-pg] | [@mediapipe/holistic][Ho-npm] | [mediapipe.dev/demo/holistic][Ho-demo]
|
||||
[Objectron][Ob-pg] | [@mediapipe/objectron][Ob-npm] | [mediapipe.dev/demo/objectron][Ob-demo]
|
||||
[Pose][P-pg] | [@mediapipe/pose][P-npm] | [mediapipe.dev/demo/pose][P-demo]
|
||||
[Selfie Segmentation][S-pg] | [@mediapipe/selfie_segmentation][S-npm] | [mediapipe.dev/demo/selfie_segmentation][S-demo]
|
||||
|
||||
Click on a solution link above for more information, including API and code
|
||||
snippets.
|
||||
|
||||
### Supported platforms:
|
||||
|
||||
| Browser | Platform                | Notes                                               |
| ------- | ----------------------- | --------------------------------------------------- |
| Chrome  | Android / Windows / Mac | Pixel 4 and older unsupported. Fuchsia unsupported.  |
| Chrome  | iOS                     | Camera unavailable in Chrome on iOS.                 |
| Safari  | iPad / iPhone / Mac     | Safari on iPad / iPhone / MacBook.                   |
|
||||
|
||||
The quickest way to get acclimated is to look at the examples above. Each demo
|
||||
has a link to a [CodePen][codepen] so that you can edit the code and try it
|
||||
yourself. We have included a number of utility packages to help you get started:
|
||||
|
@ -66,29 +78,25 @@ affecting your work, restrict your request to a `<minor>` number. e.g.,
|
|||
[F-pg]: ../solutions/face_mesh#javascript-solution-api
|
||||
[Fd-pg]: ../solutions/face_detection#javascript-solution-api
|
||||
[H-pg]: ../solutions/hands#javascript-solution-api
|
||||
[Ob-pg]: ../solutions/objectron#javascript-solution-api
|
||||
[P-pg]: ../solutions/pose#javascript-solution-api
|
||||
[S-pg]: ../solutions/selfie_segmentation#javascript-solution-api
|
||||
[Ho-npm]: https://www.npmjs.com/package/@mediapipe/holistic
|
||||
[F-npm]: https://www.npmjs.com/package/@mediapipe/face_mesh
|
||||
[Fd-npm]: https://www.npmjs.com/package/@mediapipe/face_detection
|
||||
[H-npm]: https://www.npmjs.com/package/@mediapipe/hands
|
||||
[Ob-npm]: https://www.npmjs.com/package/@mediapipe/objectron
|
||||
[P-npm]: https://www.npmjs.com/package/@mediapipe/pose
|
||||
[S-npm]: https://www.npmjs.com/package/@mediapipe/selfie_segmentation
|
||||
[draw-npm]: https://www.npmjs.com/package/@mediapipe/drawing_utils
|
||||
[cam-npm]: https://www.npmjs.com/package/@mediapipe/camera_utils
|
||||
[ctrl-npm]: https://www.npmjs.com/package/@mediapipe/control_utils
|
||||
[Ho-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/holistic
|
||||
[F-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/face_mesh
|
||||
[Fd-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/face_detection
|
||||
[H-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/hands
|
||||
[P-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/pose
|
||||
[Ho-pen]: https://code.mediapipe.dev/codepen/holistic
|
||||
[F-pen]: https://code.mediapipe.dev/codepen/face_mesh
|
||||
[Fd-pen]: https://code.mediapipe.dev/codepen/face_detection
|
||||
[H-pen]: https://code.mediapipe.dev/codepen/hands
|
||||
[P-pen]: https://code.mediapipe.dev/codepen/pose
|
||||
[Ho-demo]: https://mediapipe.dev/demo/holistic
|
||||
[F-demo]: https://mediapipe.dev/demo/face_mesh
|
||||
[Fd-demo]: https://mediapipe.dev/demo/face_detection
|
||||
[H-demo]: https://mediapipe.dev/demo/hands
|
||||
[Ob-demo]: https://mediapipe.dev/demo/objectron
|
||||
[P-demo]: https://mediapipe.dev/demo/pose
|
||||
[S-demo]: https://mediapipe.dev/demo/selfie_segmentation
|
||||
[npm]: https://www.npmjs.com/package/@mediapipe
|
||||
[codepen]: https://code.mediapipe.dev/codepen
|
||||
|
|
|
@ -51,6 +51,7 @@ details in each solution via the links below:
|
|||
* [MediaPipe Holistic](../solutions/holistic#python-solution-api)
|
||||
* [MediaPipe Objectron](../solutions/objectron#python-solution-api)
|
||||
* [MediaPipe Pose](../solutions/pose#python-solution-api)
|
||||
* [MediaPipe Selfie Segmentation](../solutions/selfie_segmentation#python-solution-api)
|
||||
|
||||
## MediaPipe on Google Colab
|
||||
|
||||
|
@ -62,6 +63,7 @@ details in each solution via the links below:
|
|||
* [MediaPipe Pose Colab](https://mediapipe.page.link/pose_py_colab)
|
||||
* [MediaPipe Pose Classification Colab (Basic)](https://mediapipe.page.link/pose_classification_basic)
|
||||
* [MediaPipe Pose Classification Colab (Extended)](https://mediapipe.page.link/pose_classification_extended)
|
||||
* [MediaPipe Selfie Segmentation Colab](https://mediapipe.page.link/selfie_segmentation_py_colab)
|
||||
|
||||
## MediaPipe Python Framework
|
||||
|
||||
|
|
|
@ -74,7 +74,7 @@ Mapping\[str, Packet\] | std::map<std::string, Packet> | create_st
|
|||
np.ndarray<br>(cv.mat and PIL.Image) | mp::ImageFrame | create_image_frame(<br> format=ImageFormat.SRGB,<br> data=mat) | get_image_frame(packet)
|
||||
np.ndarray | mp::Matrix | create_matrix(data) | get_matrix(packet)
|
||||
Google Proto Message | Google Proto Message | create_proto(proto) | get_proto(packet)
|
||||
List\[Proto\] | std::vector\<Proto\> | n/a | get_proto_list(packet)
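As a brief illustration of the mappings above (a sketch; it assumes the
`mediapipe` Python package exposes `packet_creator`/`packet_getter` as
documented in the framework section of these docs):

```python
import numpy as np
import mediapipe as mp

# Wrap a NumPy array as an mp::Matrix packet and read it back.
data = np.array([[1.0, 2.0], [3.0, 4.0]])
packet = mp.packet_creator.create_matrix(data)
print(mp.packet_getter.get_matrix(packet))
```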
|
||||
|
||||
It's not uncommon that users create custom C++ classes and send those into
|
||||
the graphs and calculators. To allow the custom classes to be used in Python
|
||||
|
|
BIN docs/images/import_mp_android_studio_project.png (new file, 128 KiB)
BIN docs/images/mobile/pose_segmentation.mp4 (new file)
BIN docs/images/mobile/pose_world_landmarks.mp4 (new file)
BIN docs/images/run_android_solution_app.png (new file, 258 KiB)
BIN docs/images/run_create_win_symlinks.png (new file, 51 KiB)
BIN docs/images/selfie_segmentation_web.mp4 (new file)
|
@ -40,11 +40,12 @@ Hair Segmentation
|
|||
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
|
||||
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
|
||||
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
|
||||
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
|
||||
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
|
||||
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
|
||||
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
|
||||
|
@ -54,46 +55,22 @@ See also
|
|||
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
|
||||
for ML models released in MediaPipe.
|
||||
|
||||
## MediaPipe in Python
|
||||
|
||||
MediaPipe offers customizable Python solutions as a prebuilt Python package on
|
||||
[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
|
||||
`pip install mediapipe`. It also provides tools for users to build their own
|
||||
solutions. Please see
|
||||
[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
|
||||
for more info.
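For example, a quick sanity check after installation (assuming Python 3 and pip
are available):

```bash
pip install mediapipe
python -c "import mediapipe as mp; print(mp.solutions.face_mesh.FaceMesh)"
```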
|
||||
|
||||
## MediaPipe on the Web
|
||||
|
||||
MediaPipe on the Web is an effort to run the same ML solutions built for mobile
|
||||
and desktop also in web browsers. The official API is under construction, but
|
||||
the core technology has been proven effective. Please see
|
||||
[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
|
||||
in Google Developers Blog for details.
|
||||
|
||||
You can use the following links to load a demo in the MediaPipe Visualizer;
once there, click the "Runner" icon in the top bar as shown below. The demos
use your webcam video as input, which is processed entirely locally in real
time and never leaves your device.
|
||||
|
||||

|
||||
|
||||
* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
|
||||
* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
|
||||
* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
|
||||
* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
|
||||
* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
|
||||
* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
|
||||
* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
|
||||
|
||||
## Getting started
|
||||
|
||||
Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
|
||||
MediaPipe and
|
||||
[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
|
||||
and start exploring our ready-to-use
|
||||
[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
|
||||
further extend and customize.
|
||||
To start using MediaPipe
|
||||
[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
lines of code, see example code and demos in
|
||||
[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
|
||||
[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
|
||||
|
||||
To use MediaPipe in C++, Android and iOS, which allow further customization of
|
||||
the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
|
||||
building your own, learn how to
|
||||
[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
|
||||
start building example applications in
|
||||
[C++](https://google.github.io/mediapipe/getting_started/cpp),
|
||||
[Android](https://google.github.io/mediapipe/getting_started/android) and
|
||||
[iOS](https://google.github.io/mediapipe/getting_started/ios).
|
||||
|
||||
The source code is hosted in the
|
||||
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
|
||||
|
@ -102,6 +79,13 @@ run code search using
|
|||
|
||||
## Publications
|
||||
|
||||
* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
|
||||
in Google Developers Blog
|
||||
* [Prosthesis control via Mirru App using MediaPipe hand tracking](https://developers.googleblog.com/2021/05/control-your-mirru-prosthesis-with-mediapipe-hand-tracking.html)
|
||||
in Google Developers Blog
|
||||
* [SignAll SDK: Sign language interface using MediaPipe is now available for
|
||||
developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
|
||||
in Google Developers Blog
|
||||
* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
|
||||
in Google AI Blog
|
||||
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: AutoFlip (Saliency-aware Video Cropping)
|
||||
parent: Solutions
|
||||
nav_order: 14
|
||||
---
|
||||
|
||||
# AutoFlip: Saliency-aware Video Cropping
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Box Tracking
|
||||
parent: Solutions
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# MediaPipe Box Tracking
|
||||
|
|
|
@ -45,6 +45,15 @@ section.
|
|||
|
||||
Naming style and availability may differ slightly across platforms/languages.
|
||||
|
||||
#### model_selection
|
||||
|
||||
An integer index `0` or `1`. Use `0` to select a short-range model that works
|
||||
best for faces within 2 meters from the camera, and `1` for a full-range model
|
||||
best for faces within 5 meters. For the full-range option, a sparse model is
|
||||
used for its improved inference speed. Please refer to the
|
||||
[model cards](./models.md#face_detection) for details. Defaults to `0` if not
|
||||
specified.
|
||||
|
||||
#### min_detection_confidence
|
||||
|
||||
Minimum confidence value (`[0.0, 1.0]`) from the face detection model for the
|
||||
|
@ -68,10 +77,11 @@ normalized to `[0.0, 1.0]` by the image width and height respectively.
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [model_selection](#model_selection)
|
||||
* [min_detection_confidence](#min_detection_confidence)
|
||||
|
||||
```python
|
||||
|
@ -81,9 +91,10 @@ mp_face_detection = mp.solutions.face_detection
|
|||
mp_drawing = mp.solutions.drawing_utils
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_face_detection.FaceDetection(
|
||||
model_selection=1, min_detection_confidence=0.5) as face_detection:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB and process it with MediaPipe Face Detection.
|
||||
results = face_detection.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
@ -102,7 +113,7 @@ with mp_face_detection.FaceDetection(
|
|||
# For webcam input:
|
||||
cap = cv2.VideoCapture(0)
|
||||
with mp_face_detection.FaceDetection(
|
||||
model_selection=0, min_detection_confidence=0.5) as face_detection:
|
||||
while cap.isOpened():
|
||||
success, image = cap.read()
|
||||
if not success:
|
||||
|
@ -138,6 +149,7 @@ and the following usage example.
|
|||
|
||||
Supported configuration options:
|
||||
|
||||
* [modelSelection](#model_selection)
|
||||
* [minDetectionConfidence](#min_detection_confidence)
|
||||
|
||||
```html
|
||||
|
@ -188,6 +200,7 @@ const faceDetection = new FaceDetection({locateFile: (file) => {
|
|||
return `https://cdn.jsdelivr.net/npm/@mediapipe/face_detection@0.0/${file}`;
|
||||
}});
|
||||
faceDetection.setOptions({
|
||||
modelSelection: 0,
minDetectionConfidence: 0.5
|
||||
});
|
||||
faceDetection.onResults(onResults);
|
||||
|
@ -254,10 +267,6 @@ same configuration as the GPU pipeline, runs entirely on CPU.
|
|||
* Target:
|
||||
[`mediapipe/examples/desktop/face_detection:face_detection_gpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/face_detection/BUILD)
|
||||
|
||||
### Web
|
||||
|
||||
Please refer to [these instructions](../index.md#mediapipe-on-the-web).
|
||||
|
||||
### Coral
|
||||
|
||||
Please refer to
|
||||
|
|
|
@ -69,7 +69,7 @@ and renders using a dedicated
|
|||
The
|
||||
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
|
||||
internally uses a
|
||||
[face_detection_subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
|
||||
from the
|
||||
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).
|
||||
|
||||
|
@ -265,7 +265,7 @@ magnitude of `z` uses roughly the same scale as `x`.
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -278,15 +278,17 @@ Supported configuration options:
|
|||
import cv2
|
||||
import mediapipe as mp
|
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_drawing_styles = mp.solutions.drawing_styles
|
||||
mp_face_mesh = mp.solutions.face_mesh
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
|
||||
with mp_face_mesh.FaceMesh(
|
||||
static_image_mode=True,
|
||||
max_num_faces=1,
|
||||
min_detection_confidence=0.5) as face_mesh:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB before processing.
|
||||
results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
@ -300,9 +302,17 @@ with mp_face_mesh.FaceMesh(
|
|||
mp_drawing.draw_landmarks(
|
||||
image=annotated_image,
|
||||
landmark_list=face_landmarks,
|
||||
connections=mp_face_mesh.FACEMESH_TESSELATION,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_tesselation_style())
|
||||
mp_drawing.draw_landmarks(
|
||||
image=annotated_image,
|
||||
landmark_list=face_landmarks,
|
||||
connections=mp_face_mesh.FACEMESH_CONTOURS,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_contours_style())
|
||||
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
|
||||
|
||||
# For webcam input:
|
||||
|
@ -334,9 +344,17 @@ with mp_face_mesh.FaceMesh(
|
|||
mp_drawing.draw_landmarks(
|
||||
image=image,
|
||||
landmark_list=face_landmarks,
|
||||
connections=mp_face_mesh.FACEMESH_TESSELATION,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_tesselation_style())
|
||||
mp_drawing.draw_landmarks(
|
||||
image=image,
|
||||
landmark_list=face_landmarks,
|
||||
connections=mp_face_mesh.FACEMESH_CONTOURS,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_contours_style())
|
||||
cv2.imshow('MediaPipe FaceMesh', image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
|
@ -422,6 +440,200 @@ camera.start();
|
|||
</script>
|
||||
```
|
||||
|
||||
### Android Solution API
|
||||
|
||||
Please first follow general
|
||||
[instructions](../getting_started/android_solutions.md#integrate-mediapipe-android-solutions-api)
|
||||
to add MediaPipe Gradle dependencies, then try the FaceMesh solution API in the
|
||||
companion
|
||||
[example Android Studio project](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions/facemesh)
|
||||
following
|
||||
[these instructions](../getting_started/android_solutions.md#build-solution-example-apps-in-android-studio)
|
||||
and learn more in the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [staticImageMode](#static_image_mode)
|
||||
* [maxNumFaces](#max_num_faces)
|
||||
* runOnGpu: Run the pipeline and the model inference on GPU or CPU.
|
||||
|
||||
#### Camera Input
|
||||
|
||||
```java
|
||||
// For camera input and result rendering with OpenGL.
|
||||
FaceMeshOptions faceMeshOptions =
|
||||
FaceMeshOptions.builder()
|
||||
.setMode(FaceMeshOptions.STREAMING_MODE) // API soon to become
|
||||
.setMaxNumFaces(1) // setStaticImageMode(false)
|
||||
.setRunOnGpu(true).build();
|
||||
FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
|
||||
facemesh.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
|
||||
|
||||
// Initializes a new CameraInput instance and connects it to MediaPipe FaceMesh.
|
||||
CameraInput cameraInput = new CameraInput(this);
|
||||
cameraInput.setNewFrameListener(
|
||||
textureFrame -> facemesh.send(textureFrame));
|
||||
|
||||
// Initializes a new GlSurfaceView with a ResultGlRenderer<FaceMeshResult> instance
|
||||
// that provides the interfaces to run user-defined OpenGL rendering code.
|
||||
// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultGlRenderer.java
|
||||
// as an example.
|
||||
SolutionGlSurfaceView<FaceMeshResult> glSurfaceView =
|
||||
new SolutionGlSurfaceView<>(
|
||||
this, facemesh.getGlContext(), facemesh.getGlMajorVersion());
|
||||
glSurfaceView.setSolutionResultRenderer(new FaceMeshResultGlRenderer());
|
||||
glSurfaceView.setRenderInputImage(true);
|
||||
|
||||
facemesh.setResultListener(
|
||||
faceMeshResult -> {
|
||||
NormalizedLandmark noseLandmark =
|
||||
faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe FaceMesh nose normalized coordinates (value range: [0, 1]): x=%f, y=%f",
|
||||
noseLandmark.getX(), noseLandmark.getY()));
|
||||
// Request GL rendering.
|
||||
glSurfaceView.setRenderData(faceMeshResult);
|
||||
glSurfaceView.requestRender();
|
||||
});
|
||||
|
||||
// The runnable to start camera after the GLSurfaceView is attached.
|
||||
glSurfaceView.post(
|
||||
() ->
|
||||
cameraInput.start(
|
||||
this,
|
||||
facemesh.getGlContext(),
|
||||
CameraInput.CameraFacing.FRONT,
|
||||
glSurfaceView.getWidth(),
|
||||
glSurfaceView.getHeight()));
|
||||
```
|
||||
|
||||
#### Image Input
|
||||
|
||||
```java
|
||||
// For reading images from gallery and drawing the output in an ImageView.
|
||||
FaceMeshOptions faceMeshOptions =
|
||||
FaceMeshOptions.builder()
|
||||
.setMode(FaceMeshOptions.STATIC_IMAGE_MODE) // API soon to become
|
||||
.setMaxNumFaces(1) // setStaticImageMode(true)
|
||||
.setRunOnGpu(true).build();
|
||||
FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
|
||||
|
||||
// Connects MediaPipe FaceMesh to the user-defined ImageView instance that allows
|
||||
// users to have the custom drawing of the output landmarks on it.
|
||||
// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultImageView.java
|
||||
// as an example.
|
||||
FaceMeshResultImageView imageView = new FaceMeshResultImageView(this);
|
||||
facemesh.setResultListener(
|
||||
faceMeshResult -> {
|
||||
int width = faceMeshResult.inputBitmap().getWidth();
|
||||
int height = faceMeshResult.inputBitmap().getHeight();
|
||||
NormalizedLandmark noseLandmark =
|
||||
faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe FaceMesh nose coordinates (pixel values): x=%f, y=%f",
|
||||
noseLandmark.getX() * width, noseLandmark.getY() * height));
|
||||
// Request canvas drawing.
|
||||
imageView.setFaceMeshResult(faceMeshResult);
|
||||
runOnUiThread(() -> imageView.update());
|
||||
});
|
||||
facemesh.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
|
||||
|
||||
// ActivityResultLauncher to get an image from the gallery as Bitmap.
|
||||
ActivityResultLauncher<Intent> imageGetter =
|
||||
registerForActivityResult(
|
||||
new ActivityResultContracts.StartActivityForResult(),
|
||||
result -> {
|
||||
Intent resultIntent = result.getData();
|
||||
if (resultIntent != null && result.getResultCode() == RESULT_OK) {
|
||||
Bitmap bitmap = null;
|
||||
try {
|
||||
bitmap =
|
||||
MediaStore.Images.Media.getBitmap(
|
||||
this.getContentResolver(), resultIntent.getData());
|
||||
} catch (IOException e) {
|
||||
Log.e(TAG, "Bitmap reading error:" + e);
|
||||
}
|
||||
if (bitmap != null) {
|
||||
facemesh.send(bitmap);
|
||||
}
|
||||
}
|
||||
});
|
||||
Intent gallery = new Intent(
|
||||
Intent.ACTION_PICK, MediaStore.Images.Media.INTERNAL_CONTENT_URI);
|
||||
imageGetter.launch(gallery);
|
||||
```
|
||||
|
||||
#### Video Input
|
||||
|
||||
```java
|
||||
// For video input and result rendering with OpenGL.
|
||||
FaceMeshOptions faceMeshOptions =
|
||||
FaceMeshOptions.builder()
|
||||
.setMode(FaceMeshOptions.STREAMING_MODE) // API soon to become
|
||||
.setMaxNumFaces(1) // setStaticImageMode(false)
|
||||
.setRunOnGpu(true).build();
|
||||
FaceMesh facemesh = new FaceMesh(this, faceMeshOptions);
|
||||
facemesh.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe FaceMesh error:" + message));
|
||||
|
||||
// Initializes a new VideoInput instance and connects it to MediaPipe FaceMesh.
|
||||
VideoInput videoInput = new VideoInput(this);
|
||||
videoInput.setNewFrameListener(
|
||||
textureFrame -> facemesh.send(textureFrame));
|
||||
|
||||
// Initializes a new GlSurfaceView with a ResultGlRenderer<FaceMeshResult> instance
|
||||
// that provides the interfaces to run user-defined OpenGL rendering code.
|
||||
// See mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/FaceMeshResultGlRenderer.java
|
||||
// as an example.
|
||||
SolutionGlSurfaceView<FaceMeshResult> glSurfaceView =
|
||||
new SolutionGlSurfaceView<>(
|
||||
this, facemesh.getGlContext(), facemesh.getGlMajorVersion());
|
||||
glSurfaceView.setSolutionResultRenderer(new FaceMeshResultGlRenderer());
|
||||
glSurfaceView.setRenderInputImage(true);
|
||||
|
||||
facemesh.setResultListener(
|
||||
faceMeshResult -> {
|
||||
NormalizedLandmark noseLandmark =
|
||||
faceMeshResult.multiFaceLandmarks().get(0).getLandmarkList().get(1);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe FaceMesh nose normalized coordinates (value range: [0, 1]): x=%f, y=%f",
|
||||
noseLandmark.getX(), noseLandmark.getY()));
|
||||
// Request GL rendering.
|
||||
glSurfaceView.setRenderData(faceMeshResult);
|
||||
glSurfaceView.requestRender();
|
||||
});
|
||||
|
||||
ActivityResultLauncher<Intent> videoGetter =
|
||||
registerForActivityResult(
|
||||
new ActivityResultContracts.StartActivityForResult(),
|
||||
result -> {
|
||||
Intent resultIntent = result.getData();
|
||||
if (resultIntent != null) {
|
||||
if (result.getResultCode() == RESULT_OK) {
|
||||
glSurfaceView.post(
|
||||
() ->
|
||||
videoInput.start(
|
||||
this,
|
||||
resultIntent.getData(),
|
||||
facemesh.getGlContext(),
|
||||
glSurfaceView.getWidth(),
|
||||
glSurfaceView.getHeight()));
|
||||
}
|
||||
}
|
||||
});
|
||||
Intent gallery =
|
||||
new Intent(Intent.ACTION_PICK, MediaStore.Video.Media.INTERNAL_CONTENT_URI);
|
||||
videoGetter.launch(gallery);
|
||||
```
|
||||
|
||||
## Example Apps
|
||||
|
||||
Please first see general instructions for
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Hair Segmentation
|
||||
parent: Solutions
|
||||
nav_order: 8
|
||||
---
|
||||
|
||||
# MediaPipe Hair Segmentation
|
||||
|
@ -51,7 +51,14 @@ to visualize its associated subgraphs, please see
|
|||
|
||||
### Web
|
||||
|
||||
Use [this link](https://viz.mediapipe.dev/demo/hair_segmentation) to load a demo
in the MediaPipe Visualizer; once there, click the "Runner" icon in the top bar
as shown below. The demo uses your webcam video as input, which is processed
entirely locally in real time and never leaves your device. Please see
[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
in Google Developers Blog for details.
|
||||
|
||||

|
||||
|
||||
## Resources
|
||||
|
||||
|
|
|
@ -206,7 +206,7 @@ is not the case, please swap the handedness output in the application.
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -219,14 +219,16 @@ Supported configuration options:
|
|||
import cv2
|
||||
import mediapipe as mp
|
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_drawing_styles = mp.solutions.drawing_styles
|
||||
mp_hands = mp.solutions.hands
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_hands.Hands(
|
||||
static_image_mode=True,
|
||||
max_num_hands=2,
|
||||
min_detection_confidence=0.5) as hands:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
# Read an image, flip it around y-axis for correct handedness output (see
|
||||
# above).
|
||||
image = cv2.flip(cv2.imread(file), 1)
|
||||
|
@ -247,7 +249,11 @@ with mp_hands.Hands(
|
|||
f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
|
||||
)
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image,
|
||||
hand_landmarks,
|
||||
mp_hands.HAND_CONNECTIONS,
|
||||
mp_drawing_styles.get_default_hand_landmarks_style(),
|
||||
mp_drawing_styles.get_default_hand_connections_style())
|
||||
cv2.imwrite(
|
||||
'/tmp/annotated_image' + str(idx) + '.png', cv2.flip(annotated_image, 1))
|
||||
|
||||
|
@ -277,7 +283,11 @@ with mp_hands.Hands(
|
|||
if results.multi_hand_landmarks:
|
||||
for hand_landmarks in results.multi_hand_landmarks:
|
||||
mp_drawing.draw_landmarks(
|
||||
image,
|
||||
hand_landmarks,
|
||||
mp_hands.HAND_CONNECTIONS,
|
||||
mp_drawing_styles.get_default_hand_landmarks_style(),
|
||||
mp_drawing_styles.get_default_hand_connections_style())
|
||||
cv2.imshow('MediaPipe Hands', image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
|
@ -358,6 +368,200 @@ camera.start();
|
|||
</script>
|
||||
```
|
||||
|
||||
### Android Solution API
|
||||
|
||||
Please first follow general
|
||||
[instructions](../getting_started/android_solutions.md#integrate-mediapipe-android-solutions-api)
|
||||
to add MediaPipe Gradle dependencies, then try the Hands solution API in the
|
||||
companion
|
||||
[example Android Studio project](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/solutions/hands)
|
||||
following
|
||||
[these instructions](../getting_started/android_solutions.md#build-solution-example-apps-in-android-studio)
|
||||
and learn more in the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [staticImageMode](#static_image_mode)
|
||||
* [maxNumHands](#max_num_hands)
|
||||
* runOnGpu: Run the pipeline and the model inference on GPU or CPU.
|
||||
|
||||
#### Camera Input
|
||||
|
||||
```java
|
||||
// For camera input and result rendering with OpenGL.
|
||||
HandsOptions handsOptions =
|
||||
HandsOptions.builder()
|
||||
.setMode(HandsOptions.STREAMING_MODE) // API soon to become
|
||||
.setMaxNumHands(1) // setStaticImageMode(false)
|
||||
.setRunOnGpu(true).build();
|
||||
Hands hands = new Hands(this, handsOptions);
|
||||
hands.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
|
||||
|
||||
// Initializes a new CameraInput instance and connects it to MediaPipe Hands.
|
||||
CameraInput cameraInput = new CameraInput(this);
|
||||
cameraInput.setNewFrameListener(
|
||||
textureFrame -> hands.send(textureFrame));
|
||||
|
||||
// Initializes a new GlSurfaceView with a ResultGlRenderer<HandsResult> instance
|
||||
// that provides the interfaces to run user-defined OpenGL rendering code.
|
||||
// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultGlRenderer.java
|
||||
// as an example.
|
||||
SolutionGlSurfaceView<HandsResult> glSurfaceView =
|
||||
new SolutionGlSurfaceView<>(
|
||||
this, hands.getGlContext(), hands.getGlMajorVersion());
|
||||
glSurfaceView.setSolutionResultRenderer(new HandsResultGlRenderer());
|
||||
glSurfaceView.setRenderInputImage(true);
|
||||
|
||||
hands.setResultListener(
|
||||
handsResult -> {
|
||||
NormalizedLandmark wristLandmark = Hands.getHandLandmark(
|
||||
handsResult, 0, HandLandmark.WRIST);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe Hand wrist normalized coordinates (value range: [0, 1]): x=%f, y=%f",
|
||||
wristLandmark.getX(), wristLandmark.getY()));
|
||||
// Request GL rendering.
|
||||
glSurfaceView.setRenderData(handsResult);
|
||||
glSurfaceView.requestRender();
|
||||
});
|
||||
|
||||
// The runnable to start camera after the GLSurfaceView is attached.
|
||||
glSurfaceView.post(
|
||||
() ->
|
||||
cameraInput.start(
|
||||
this,
|
||||
hands.getGlContext(),
|
||||
CameraInput.CameraFacing.FRONT,
|
||||
glSurfaceView.getWidth(),
|
||||
glSurfaceView.getHeight()));
|
||||
```
|
||||
|
||||
#### Image Input
|
||||
|
||||
```java
|
||||
// For reading images from gallery and drawing the output in an ImageView.
|
||||
HandsOptions handsOptions =
|
||||
HandsOptions.builder()
|
||||
.setMode(HandsOptions.STATIC_IMAGE_MODE) // API soon to become
|
||||
.setMaxNumHands(1) // setStaticImageMode(true)
|
||||
.setRunOnGpu(true).build();
|
||||
Hands hands = new Hands(this, handsOptions);
|
||||
|
||||
// Connects MediaPipe Hands to the user-defined ImageView instance that allows
|
||||
// users to have the custom drawing of the output landmarks on it.
|
||||
// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultImageView.java
|
||||
// as an example.
|
||||
HandsResultImageView imageView = new HandsResultImageView(this);
|
||||
hands.setResultListener(
|
||||
handsResult -> {
|
||||
int width = handsResult.inputBitmap().getWidth();
|
||||
int height = handsResult.inputBitmap().getHeight();
|
||||
NormalizedLandmark wristLandmark = Hands.getHandLandmark(
|
||||
handsResult, 0, HandLandmark.WRIST);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe Hand wrist coordinates (pixel values): x=%f, y=%f",
|
||||
wristLandmark.getX() * width, wristLandmark.getY() * height));
|
||||
// Request canvas drawing.
|
||||
imageView.setHandsResult(handsResult);
|
||||
runOnUiThread(() -> imageView.update());
|
||||
});
|
||||
hands.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
|
||||
|
||||
// ActivityResultLauncher to get an image from the gallery as Bitmap.
|
||||
ActivityResultLauncher<Intent> imageGetter =
|
||||
registerForActivityResult(
|
||||
new ActivityResultContracts.StartActivityForResult(),
|
||||
result -> {
|
||||
Intent resultIntent = result.getData();
|
||||
if (resultIntent != null && result.getResultCode() == RESULT_OK) {
|
||||
Bitmap bitmap = null;
|
||||
try {
|
||||
bitmap =
|
||||
MediaStore.Images.Media.getBitmap(
|
||||
this.getContentResolver(), resultIntent.getData());
|
||||
} catch (IOException e) {
|
||||
Log.e(TAG, "Bitmap reading error:" + e);
|
||||
}
|
||||
if (bitmap != null) {
|
||||
hands.send(bitmap);
|
||||
}
|
||||
}
|
||||
});
|
||||
Intent gallery = new Intent(
|
||||
Intent.ACTION_PICK, MediaStore.Images.Media.INTERNAL_CONTENT_URI);
|
||||
imageGetter.launch(gallery);
|
||||
```
|
||||
|
||||
#### Video Input
|
||||
|
||||
```java
|
||||
// For video input and result rendering with OpenGL.
|
||||
HandsOptions handsOptions =
|
||||
HandsOptions.builder()
|
||||
.setMode(HandsOptions.STREAMING_MODE) // API soon to become
|
||||
.setMaxNumHands(1) // setStaticImageMode(false)
|
||||
.setRunOnGpu(true).build();
|
||||
Hands hands = new Hands(this, handsOptions);
|
||||
hands.setErrorListener(
|
||||
(message, e) -> Log.e(TAG, "MediaPipe Hands error:" + message));
|
||||
|
||||
// Initializes a new VideoInput instance and connects it to MediaPipe Hands.
|
||||
VideoInput videoInput = new VideoInput(this);
|
||||
videoInput.setNewFrameListener(
|
||||
textureFrame -> hands.send(textureFrame));
|
||||
|
||||
// Initializes a new GlSurfaceView with a ResultGlRenderer<HandsResult> instance
|
||||
// that provides the interfaces to run user-defined OpenGL rendering code.
|
||||
// See mediapipe/examples/android/solutions/hands/src/main/java/com/google/mediapipe/examples/hands/HandsResultGlRenderer.java
|
||||
// as an example.
|
||||
SolutionGlSurfaceView<HandsResult> glSurfaceView =
|
||||
new SolutionGlSurfaceView<>(
|
||||
this, hands.getGlContext(), hands.getGlMajorVersion());
|
||||
glSurfaceView.setSolutionResultRenderer(new HandsResultGlRenderer());
|
||||
glSurfaceView.setRenderInputImage(true);
|
||||
|
||||
hands.setResultListener(
|
||||
handsResult -> {
|
||||
NormalizedLandmark wristLandmark = Hands.getHandLandmark(
|
||||
handsResult, 0, HandLandmark.WRIST);
|
||||
Log.i(
|
||||
TAG,
|
||||
String.format(
|
||||
"MediaPipe Hand wrist normalized coordinates (value range: [0, 1]): x=%f, y=%f",
|
||||
wristLandmark.getX(), wristLandmark.getY()));
|
||||
// Request GL rendering.
|
||||
glSurfaceView.setRenderData(handsResult);
|
||||
glSurfaceView.requestRender();
|
||||
});
|
||||
|
||||
ActivityResultLauncher<Intent> videoGetter =
|
||||
registerForActivityResult(
|
||||
new ActivityResultContracts.StartActivityForResult(),
|
||||
result -> {
|
||||
Intent resultIntent = result.getData();
|
||||
if (resultIntent != null) {
|
||||
if (result.getResultCode() == RESULT_OK) {
|
||||
glSurfaceView.post(
|
||||
() ->
|
||||
videoInput.start(
|
||||
this,
|
||||
resultIntent.getData(),
|
||||
hands.getGlContext(),
|
||||
glSurfaceView.getWidth(),
|
||||
glSurfaceView.getHeight()));
|
||||
}
|
||||
}
|
||||
});
|
||||
Intent gallery =
|
||||
new Intent(Intent.ACTION_PICK, MediaStore.Video.Media.INTERNAL_CONTENT_URI);
|
||||
videoGetter.launch(gallery);
|
||||
```
|
||||
|
||||
## Example Apps
|
||||
|
||||
Please first see general instructions for
|
||||
|
|
|
@ -176,6 +176,16 @@ A list of pose landmarks. Each landmark consists of the following:
|
|||
* `visibility`: A value in `[0.0, 1.0]` indicating the likelihood of the
|
||||
landmark being visible (present and not occluded) in the image.
|
||||
|
||||
#### pose_world_landmarks
|
||||
|
||||
Another list of pose landmarks in world coordinates. Each landmark consists of
|
||||
the following:
|
||||
|
||||
* `x`, `y` and `z`: Real-world 3D coordinates in meters with the origin at the
|
||||
center between hips.
|
||||
* `visibility`: Identical to that defined in the corresponding
|
||||
[pose_landmarks](#pose_landmarks).
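For example, a minimal sketch (assuming `results` is the output of
`holistic.process()` as in the usage example below) that reads one world
landmark in meters:

```python
import mediapipe as mp

# `results` is assumed to come from mp.solutions.holistic.Holistic.process().
if results.pose_world_landmarks:
  left_hip = results.pose_world_landmarks.landmark[
      mp.solutions.pose.PoseLandmark.LEFT_HIP]
  print(f'Left hip in meters: ({left_hip.x:.2f}, {left_hip.y:.2f}, {left_hip.z:.2f})')
```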
|
||||
|
||||
#### face_landmarks
|
||||
|
||||
A list of 468 face landmarks. Each landmark consists of `x`, `y` and `z`. `x`
|
||||
|
@ -201,7 +211,7 @@ A list of 21 hand landmarks on the right hand, in the same representation as
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -215,13 +225,15 @@ Supported configuration options:
|
|||
import cv2
|
||||
import mediapipe as mp
|
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_drawing_styles = mp.solutions.drawing_styles
|
||||
mp_holistic = mp.solutions.holistic
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_holistic.Holistic(
|
||||
static_image_mode=True,
|
||||
model_complexity=2) as holistic:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
|
@ -236,14 +248,22 @@ with mp_holistic.Holistic(
|
|||
# Draw pose, left and right hands, and face landmarks on the image.
|
||||
annotated_image = image.copy()
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image,
|
||||
results.face_landmarks,
|
||||
mp_holistic.FACEMESH_TESSELATION,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_tesselation_style())
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image,
|
||||
results.pose_landmarks,
|
||||
mp_holistic.POSE_CONNECTIONS,
|
||||
landmark_drawing_spec=mp_drawing_styles.
|
||||
get_default_pose_landmarks_style())
|
||||
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
|
||||
# Plot pose world landmarks.
|
||||
mp_drawing.plot_landmarks(
|
||||
results.pose_world_landmarks, mp_holistic.POSE_CONNECTIONS)
|
||||
|
||||
# For webcam input:
|
||||
cap = cv2.VideoCapture(0)
|
||||
|
@ -269,13 +289,18 @@ with mp_holistic.Holistic(
|
|||
image.flags.writeable = True
|
||||
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
|
||||
mp_drawing.draw_landmarks(
|
||||
image,
|
||||
results.face_landmarks,
|
||||
mp_holistic.FACEMESH_CONTOURS,
|
||||
landmark_drawing_spec=None,
|
||||
connection_drawing_spec=mp_drawing_styles
|
||||
.get_default_face_mesh_contours_style())
|
||||
mp_drawing.draw_landmarks(
|
||||
image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
|
||||
mp_drawing.draw_landmarks(
|
||||
image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
|
||||
mp_drawing.draw_landmarks(
|
||||
image,
|
||||
results.pose_landmarks,
|
||||
mp_holistic.POSE_CONNECTIONS,
|
||||
landmark_drawing_spec=mp_drawing_styles
|
||||
.get_default_pose_landmarks_style())
|
||||
cv2.imshow('MediaPipe Holistic', image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Instant Motion Tracking
|
||||
parent: Solutions
|
||||
nav_order: 11
|
||||
---
|
||||
|
||||
# MediaPipe Instant Motion Tracking
|
||||
|
|
|
@ -69,7 +69,7 @@ and renders using a dedicated
|
|||
The
|
||||
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
|
||||
internally uses a
|
||||
[face detection subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
|
||||
from the
|
||||
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).
|
||||
|
||||
|
@ -193,7 +193,17 @@ on how to build MediaPipe examples.
|
|||
|
||||
### Web
|
||||
|
||||
You can use the following links to load a demo in the MediaPipe Visualizer;
once there, click the "Runner" icon in the top bar as shown below. The demos
use your webcam video as input, which is processed entirely locally in real
time and never leaves your device. Please see
[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
in Google Developers Blog for details.
|
||||
|
||||

|
||||
|
||||
* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
|
||||
* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
|
||||
|
||||
## Resources
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: KNIFT (Template-based Feature Matching)
|
||||
parent: Solutions
|
||||
nav_order: 13
|
||||
---
|
||||
|
||||
# MediaPipe KNIFT
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Dataset Preparation with MediaSequence
|
||||
parent: Solutions
|
||||
nav_order: 15
|
||||
---
|
||||
|
||||
# Dataset Preparation with MediaSequence
|
||||
|
|
|
@ -14,12 +14,27 @@ nav_order: 30
|
|||
|
||||
### [Face Detection](https://google.github.io/mediapipe/solutions/face_detection)
|
||||
|
||||
* Short-range model (best for faces within 2 meters from the camera):
|
||||
[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite),
|
||||
[TFLite model quantized for EdgeTPU/Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/models/face-detector-quantized_edgetpu.tflite),
|
||||
[Model card](https://mediapipe.page.link/blazeface-mc)
|
||||
* Full-range model (dense, best for faces within 5 meters from the camera):
|
||||
[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range.tflite),
|
||||
[Model card](https://mediapipe.page.link/blazeface-back-mc)
|
||||
* Full-range model (sparse, best for faces within 5 meters from the camera):
|
||||
[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range_sparse.tflite),
|
||||
[Model card](https://mediapipe.page.link/blazeface-back-sparse-mc)
|
||||
|
||||
The full-range dense and sparse models have the same quality in terms of
[F-score](https://en.wikipedia.org/wiki/F-score) but differ in the underlying
metrics: the dense model is slightly better in
[Recall](https://en.wikipedia.org/wiki/Precision_and_recall), whereas the sparse
model outperforms the dense one in
[Precision](https://en.wikipedia.org/wiki/Precision_and_recall). Speed-wise, the
sparse model is ~30% faster when executing on CPU via
[XNNPACK](https://github.com/google/XNNPACK), whereas on GPU the two models
demonstrate comparable latencies. Depending on your application, you may prefer
one over the other.
|
||||
|
||||
### [Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh)
|
||||
|
||||
|
@ -60,6 +75,12 @@ nav_order: 30
|
|||
* Hand recrop model:
|
||||
[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/holistic_landmark/hand_recrop.tflite)
|
||||
|
||||
### [Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation)
|
||||
|
||||
* [TFLite model (general)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite)
|
||||
* [TFLite model (landscape)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite)
|
||||
* [Model card](https://mediapipe.page.link/selfiesegmentation-mc)
|
||||
|
||||
### [Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation)
|
||||
|
||||
* [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/models/hair_segmentation.tflite)
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Object Detection
|
||||
parent: Solutions
|
||||
nav_order: 9
|
||||
---
|
||||
|
||||
# MediaPipe Object Detection
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Objectron (3D Object Detection)
|
||||
parent: Solutions
|
||||
nav_order: 12
|
||||
---
|
||||
|
||||
# MediaPipe Objectron
|
||||
|
@ -224,29 +224,33 @@ where object detection simply runs on every image. Default to `0.99`.
|
|||
|
||||
#### model_name
|
||||
|
||||
Name of the model to use for predicting 3D bounding box landmarks. Currently
supports `{'Shoe', 'Chair', 'Cup', 'Camera'}`. Defaults to `Shoe`.
|
||||
|
||||
#### focal_length
|
||||
|
||||
By default, the camera focal length is defined in [NDC space](#ndc-space),
i.e., `(fx, fy)`, and defaults to `(1.0, 1.0)`. To specify the focal length in
[pixel space](#pixel-space) instead, i.e., `(fx_pixel, fy_pixel)`, users should
provide [`image_size`](#image_size) = `(image_width, image_height)` to enable
conversions inside the API. For further details about NDC and pixel space,
please see [Coordinate Systems](#coordinate-systems).
|
||||
|
||||
#### principal_point
|
||||
|
||||
By default, the camera principal point is defined in [NDC space](#ndc-space),
i.e., `(px, py)`, and defaults to `(0.0, 0.0)`. To specify the principal point
in [pixel space](#pixel-space) instead, i.e., `(px_pixel, py_pixel)`, users
should provide [`image_size`](#image_size) = `(image_width, image_height)` to
enable conversions inside the API. For further details about NDC and pixel
space, please see [Coordinate Systems](#coordinate-systems).
|
||||
|
||||
#### image_size
|
||||
|
||||
**Specify only when [`focal_length`](#focal_length) and
[`principal_point`](#principal_point) are specified in pixel space.**

Size of the input image, i.e., `(image_width, image_height)`.
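For example, a minimal sketch with made-up intrinsics for a 1280x720 camera,
only to show how the three options fit together:

```python
import mediapipe as mp

# Pixel-space intrinsics require image_size so the API can convert them to NDC.
# The numbers below are illustrative, not calibrated values.
objectron = mp.solutions.objectron.Objectron(
    static_image_mode=True,
    model_name='Shoe',
    focal_length=(1000.0, 1000.0),   # (fx_pixel, fy_pixel)
    principal_point=(640.0, 360.0),  # (px_pixel, py_pixel)
    image_size=(1280, 720))
```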
|
||||
|
||||
### Output
|
||||
|
||||
|
@ -277,7 +281,7 @@ following:
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -297,11 +301,12 @@ mp_drawing = mp.solutions.drawing_utils
|
|||
mp_objectron = mp.solutions.objectron
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_objectron.Objectron(static_image_mode=True,
|
||||
max_num_objects=5,
|
||||
min_detection_confidence=0.5,
|
||||
model_name='Shoe') as objectron:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB and process it with MediaPipe Objectron.
|
||||
results = objectron.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
@ -355,6 +360,89 @@ with mp_objectron.Objectron(static_image_mode=False,
|
|||
cap.release()
|
||||
```
|
||||
|
||||
## JavaScript Solution API
|
||||
|
||||
Please first see general [introduction](../getting_started/javascript.md) on
|
||||
MediaPipe in JavaScript, then learn more in the companion [web demo](#resources)
|
||||
and the following usage example.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [staticImageMode](#static_image_mode)
|
||||
* [maxNumObjects](#max_num_objects)
|
||||
* [minDetectionConfidence](#min_detection_confidence)
|
||||
* [minTrackingConfidence](#min_tracking_confidence)
|
||||
* [modelName](#model_name)
|
||||
* [focalLength](#focal_length)
|
||||
* [principalPoint](#principal_point)
|
||||
* [imageSize](#image_size)
|
||||
|
||||
```html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/control_utils_3d.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/objectron/objectron.js" crossorigin="anonymous"></script>
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<div class="container">
|
||||
<video class="input_video"></video>
|
||||
<canvas class="output_canvas" width="1280px" height="720px"></canvas>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
```javascript
|
||||
<script type="module">
|
||||
const videoElement = document.getElementsByClassName('input_video')[0];
|
||||
const canvasElement = document.getElementsByClassName('output_canvas')[0];
|
||||
const canvasCtx = canvasElement.getContext('2d');
|
||||
|
||||
function onResults(results) {
|
||||
canvasCtx.save();
|
||||
canvasCtx.drawImage(
|
||||
results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||
if (!!results.objectDetections) {
|
||||
for (const detectedObject of results.objectDetections) {
|
||||
// Reformat keypoint information as landmarks, for easy drawing.
|
||||
      const landmarks = detectedObject.keypoints.map(x => x.point2d);
|
||||
// Draw bounding box.
|
||||
drawingUtils.drawConnectors(canvasCtx, landmarks,
|
||||
mpObjectron.BOX_CONNECTIONS, {color: '#FF0000'});
|
||||
// Draw centroid.
|
||||
drawingUtils.drawLandmarks(canvasCtx, [landmarks[0]], {color: '#FFFFFF'});
|
||||
}
|
||||
}
|
||||
canvasCtx.restore();
|
||||
}
|
||||
|
||||
const objectron = new Objectron({locateFile: (file) => {
|
||||
return `https://cdn.jsdelivr.net/npm/@mediapipe/objectron/${file}`;
|
||||
}});
|
||||
objectron.setOptions({
|
||||
modelName: 'Chair',
|
||||
maxNumObjects: 3,
|
||||
});
|
||||
objectron.onResults(onResults);
|
||||
|
||||
const camera = new Camera(videoElement, {
|
||||
onFrame: async () => {
|
||||
await objectron.send({image: videoElement});
|
||||
},
|
||||
width: 1280,
|
||||
height: 720
|
||||
});
|
||||
camera.start();
|
||||
</script>
|
||||
```
|
||||
|
||||
## Example Apps
|
||||
|
||||
Please first see general instructions for
|
||||
|
@ -441,7 +529,7 @@ Example app bounding boxes are rendered with [GlAnimationOverlayCalculator](http
|
|||
> ```
|
||||
> and then run
|
||||
>
|
||||
> ```build
|
||||
> ```bash
|
||||
> bazel run -c opt mediapipe/graphs/object_detection_3d/obj_parser:ObjParser -- input_dir=[INTERMEDIATE_OUTPUT_DIR] output_dir=[OUTPUT_DIR]
|
||||
> ```
|
||||
> INPUT_DIR should be the folder with initial asset .obj files to be processed,
|
||||
|
@ -560,11 +648,15 @@ py = -py_pixel * 2.0 / image_height + 1.0
|
|||
[Announcing the Objectron Dataset](https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html)
|
||||
* Google AI Blog:
|
||||
[Real-Time 3D Object Detection on Mobile Devices with MediaPipe](https://ai.googleblog.com/2020/03/real-time-3d-object-detection-on-mobile.html)
|
||||
* Paper: [Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations](https://arxiv.org/abs/2012.09988), to appear in CVPR 2021
|
||||
* Paper: [Objectron: A Large Scale Dataset of Object-Centric Videos in the
|
||||
Wild with Pose Annotations](https://arxiv.org/abs/2012.09988), to appear in
|
||||
CVPR 2021
|
||||
* Paper: [MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak
|
||||
Shape Supervision](https://arxiv.org/abs/2003.03522)
|
||||
* Paper:
|
||||
[Instant 3D Object Tracking with Applications in Augmented Reality](https://drive.google.com/open?id=1O_zHmlgXIzAdKljp20U_JUkEHOGG52R8)
|
||||
([presentation](https://www.youtube.com/watch?v=9ndF1AIo7h0)), Fourth Workshop on Computer Vision for AR/VR, CVPR 2020
|
||||
([presentation](https://www.youtube.com/watch?v=9ndF1AIo7h0)), Fourth
|
||||
Workshop on Computer Vision for AR/VR, CVPR 2020
|
||||
* [Models and model cards](./models.md#objectron)
|
||||
* [Web demo](https://code.mediapipe.dev/codepen/objectron)
|
||||
* [Python Colab](https://mediapipe.page.link/objectron_py_colab)
|
||||
|
|
|
@ -30,7 +30,8 @@ overlay of digital content and information on top of the physical world in
|
|||
augmented reality.
|
||||
|
||||
MediaPipe Pose is an ML solution for high-fidelity body pose tracking, inferring
33 3D landmarks on the whole body from RGB video frames utilizing our
33 3D landmarks and background segmentation mask on the whole body from RGB
video frames utilizing our
|
||||
[BlazePose](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html)
|
||||
research that also powers the
|
||||
[ML Kit Pose Detection API](https://developers.google.com/ml-kit/vision/pose-detection).
|
||||
|
@ -49,11 +50,11 @@ The solution utilizes a two-step detector-tracker ML pipeline, proven to be
|
|||
effective in our [MediaPipe Hands](./hands.md) and
[MediaPipe Face Mesh](./face_mesh.md) solutions. Using a detector, the pipeline
first locates the person/pose region-of-interest (ROI) within the frame. The
tracker subsequently predicts the pose landmarks within the ROI using the
ROI-cropped frame as input. Note that for video use cases the detector is
invoked only as needed, i.e., for the very first frame and when the tracker
could no longer identify body pose presence in the previous frame. For other
frames the pipeline simply derives the ROI from the previous frame’s pose
tracker subsequently predicts the pose landmarks and segmentation mask within
the ROI using the ROI-cropped frame as input. Note that for video use cases the
detector is invoked only as needed, i.e., for the very first frame and when the
tracker could no longer identify body pose presence in the previous frame. For
other frames the pipeline simply derives the ROI from the previous frame’s pose
landmarks.
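
To make the control flow concrete, below is a minimal sketch of the
detector-tracker logic described above. It is not MediaPipe's actual
implementation; `detect_pose_roi`, `track_landmarks` and `roi_from_landmarks`
are hypothetical helpers standing in for the detector and tracker models:

```python
def run_two_step_pipeline(frames, detect_pose_roi, track_landmarks,
                          roi_from_landmarks):
  """Sketch of the detector-tracker loop: the detector runs only when needed."""
  roi = None
  all_landmarks = []
  for frame in frames:
    if roi is None:
      # Detector runs for the very first frame, or after tracking was lost.
      roi = detect_pose_roi(frame)
    landmarks = track_landmarks(frame, roi) if roi is not None else None
    if landmarks is None:
      # Tracking failed (or no person was found); re-detect on the next frame.
      roi = None
    else:
      # Otherwise derive the next ROI from this frame's landmarks.
      roi = roi_from_landmarks(landmarks)
    all_landmarks.append(landmarks)
  return all_landmarks
```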
The pipeline is implemented as a MediaPipe
|
||||
|
@ -87,11 +88,11 @@ from [COCO topology](https://cocodataset.org/#keypoints-2020).
|
|||
|
||||
Method | Yoga <br/> [`mAP`] | Yoga <br/> [`PCK@0.2`] | Dance <br/> [`mAP`] | Dance <br/> [`PCK@0.2`] | HIIT <br/> [`mAP`] | HIIT <br/> [`PCK@0.2`]
|
||||
----------------------------------------------------------------------------------------------------- | -----------------: | ---------------------: | ------------------: | ----------------------: | -----------------: | ---------------------:
|
||||
BlazePose.Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
|
||||
BlazePose.Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
|
||||
BlazePose.Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
|
||||
[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
|
||||
[Apple.Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**
|
||||
BlazePose GHUM Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
|
||||
BlazePose GHUM Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
|
||||
BlazePose GHUM Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
|
||||
[AlphaPose ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
|
||||
[Apple Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**
|
||||
|
||||
 |
|
||||
:--------------------------------------------------------------------------: |
|
||||
|
@ -101,10 +102,10 @@ We designed our models specifically for live perception use cases, so all of
|
|||
them work in real-time on the majority of modern devices.
|
||||
|
||||
Method | Latency <br/> Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | Latency <br/> MacBook Pro (15-inch 2017)
|
||||
--------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
|
||||
BlazePose.Heavy | 53 ms | 38 ms
|
||||
BlazePose.Full | 25 ms | 27 ms
|
||||
BlazePose.Lite | 20 ms | 25 ms
|
||||
-------------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
|
||||
BlazePose GHUM Heavy | 53 ms | 38 ms
|
||||
BlazePose GHUM Full | 25 ms | 27 ms
|
||||
BlazePose GHUM Lite | 20 ms | 25 ms
|
||||
|
||||
## Models
|
||||
|
||||
|
@ -129,16 +130,19 @@ hip midpoints.
|
|||
The landmark model in MediaPipe Pose predicts the location of 33 pose landmarks
|
||||
(see figure below).
|
||||
|
||||
Please find more detail in the
|
||||
[BlazePose Google AI Blog](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html),
|
||||
this [paper](https://arxiv.org/abs/2006.10204) and
|
||||
[the model card](./models.md#pose), and the attributes in each landmark
|
||||
[below](#pose_landmarks).
|
||||
|
||||
 |
|
||||
:----------------------------------------------------------------------------------------------: |
|
||||
*Fig 4. 33 pose landmarks.* |
|
||||
|
||||
Optionally, MediaPipe Pose can predict a full-body
[segmentation mask](#segmentation_mask) represented as a two-class segmentation
(human or background).
|
||||
|
||||
Please find more detail in the
|
||||
[BlazePose Google AI Blog](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html),
|
||||
this [paper](https://arxiv.org/abs/2006.10204),
|
||||
[the model card](./models.md#pose) and the [Output](#output) section below.
|
||||
|
||||
## Solution APIs
|
||||
|
||||
### Cross-platform Configuration Options
|
||||
|
@ -167,6 +171,18 @@ If set to `true`, the solution filters pose landmarks across different input
|
|||
images to reduce jitter, but ignored if [static_image_mode](#static_image_mode)
|
||||
is also set to `true`. Default to `true`.
|
||||
|
||||
#### enable_segmentation
|
||||
|
||||
If set to `true`, in addition to the pose landmarks the solution also generates
|
||||
the segmentation mask. Default to `false`.
|
||||
|
||||
#### smooth_segmentation
|
||||
|
||||
If set to `true`, the solution filters segmentation masks across different input
|
||||
images to reduce jitter. Ignored if [enable_segmentation](#enable_segmentation)
|
||||
is `false` or [static_image_mode](#static_image_mode) is `true`. Default to
|
||||
`true`.
|
||||
|
||||
#### min_detection_confidence
|
||||
|
||||
Minimum confidence value (`[0.0, 1.0]`) from the person-detection model for the
|
||||
|
@ -187,28 +203,56 @@ Naming style may differ slightly across platforms/languages.
|
|||
|
||||
#### pose_landmarks
|
||||
|
||||
A list of pose landmarks. Each lanmark consists of the following:
|
||||
A list of pose landmarks. Each landmark consists of the following:
|
||||
|
||||
*   `x` and `y`: Landmark coordinates normalized to `[0.0, 1.0]` by the image
    width and height respectively.
*   `z`: Represents the landmark depth with the depth at the midpoint of hips
    being the origin, and the smaller the value the closer the landmark is to
    the camera. The magnitude of `z` uses roughly the same scale as `x`.

*   `visibility`: A value in `[0.0, 1.0]` indicating the likelihood of the
    landmark being visible (present and not occluded) in the image.
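
For instance, the normalized `x`/`y` values can be mapped back to pixel
coordinates with the input image size (a minimal sketch; `results` is assumed
to come from `mp_pose.Pose(...).process(...)` as in the usage example below):

```python
import mediapipe as mp

mp_pose = mp.solutions.pose

def nose_in_pixels(results, image_width, image_height):
  """Return the nose landmark in pixel coordinates plus its depth/visibility."""
  nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
  return (int(nose.x * image_width), int(nose.y * image_height),
          nose.z, nose.visibility)
```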
#### pose_world_landmarks
|
||||
|
||||
*Fig 5. Example of MediaPipe Pose real-world 3D coordinates.* |
|
||||
:-----------------------------------------------------------: |
|
||||
<video autoplay muted loop preload style="height: auto; width: 480px"><source src="../images/mobile/pose_world_landmarks.mp4" type="video/mp4"></video> |
|
||||
|
||||
Another list of pose landmarks in world coordinates. Each landmark consists of
the following:

*   `x`, `y` and `z`: Real-world 3D coordinates in meters with the origin at the
    center between hips.
*   `visibility`: Identical to that defined in the corresponding
    [pose_landmarks](#pose_landmarks).
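
Because these landmarks are metric, simple real-world measurements fall out
directly. A minimal sketch (assuming `results` from the Python Solution API
below) estimating shoulder width in meters:

```python
import math

import mediapipe as mp

mp_pose = mp.solutions.pose

def shoulder_width_meters(results):
  """Approximate shoulder width in meters from the world landmarks."""
  lm = results.pose_world_landmarks.landmark
  left = lm[mp_pose.PoseLandmark.LEFT_SHOULDER]
  right = lm[mp_pose.PoseLandmark.RIGHT_SHOULDER]
  return math.dist((left.x, left.y, left.z), (right.x, right.y, right.z))
```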
#### segmentation_mask
|
||||
|
||||
The output segmentation mask, predicted only when
[enable_segmentation](#enable_segmentation) is set to `true`. The mask has the
same width and height as the input image, and contains values in `[0.0, 1.0]`
where `1.0` and `0.0` indicate high certainty of a "human" and "background"
pixel respectively. Please refer to the platform-specific usage examples below
for details.
|
||||
|
||||
*Fig 6. Example of MediaPipe Pose segmentation mask.* |
|
||||
:---------------------------------------------------: |
|
||||
<video autoplay muted loop preload style="height: auto; width: 480px"><source src="../images/mobile/pose_segmentation.mp4" type="video/mp4"></video> |
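
Instead of the hard 0.1 threshold used in the usage examples, the mask can also
be treated as a soft alpha matte. A minimal sketch (assuming `image` is the BGR
input frame and `results.segmentation_mask` its mask):

```python
import cv2
import numpy as np

def soft_composite(image, segmentation_mask, bg_color=(192, 192, 192)):
  """Blend the frame over a solid background, using the mask as an alpha matte."""
  # Lightly smooth the mask to soften the person/background boundary.
  alpha = cv2.GaussianBlur(segmentation_mask, (7, 7), 0)
  alpha = np.clip(alpha, 0.0, 1.0)[..., np.newaxis]  # Shape: (H, W, 1).
  background = np.full(image.shape, bg_color, dtype=np.float32)
  blended = alpha * image.astype(np.float32) + (1.0 - alpha) * background
  return blended.astype(np.uint8)
```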
### Python Solution API
|
||||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the following usage example.
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [static_image_mode](#static_image_mode)
|
||||
* [model_complexity](#model_complexity)
|
||||
* [smooth_landmarks](#smooth_landmarks)
|
||||
* [enable_segmentation](#enable_segmentation)
|
||||
* [smooth_segmentation](#smooth_segmentation)
|
||||
* [min_detection_confidence](#min_detection_confidence)
|
||||
* [min_tracking_confidence](#min_tracking_confidence)
|
||||
|
||||
|
@ -216,14 +260,18 @@ Supported configuration options:
|
|||
import cv2
|
||||
import mediapipe as mp
|
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_drawing_styles = mp.solutions.drawing_styles
|
||||
mp_pose = mp.solutions.pose
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
BG_COLOR = (192, 192, 192) # gray
|
||||
with mp_pose.Pose(
|
||||
static_image_mode=True,
|
||||
model_complexity=2,
|
||||
enable_segmentation=True,
|
||||
min_detection_confidence=0.5) as pose:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
|
@ -233,14 +281,28 @@ with mp_pose.Pose(
|
|||
continue
|
||||
print(
|
||||
f'Nose coordinates: ('
|
||||
f'{results.pose_landmarks.landmark[mp_holistic.PoseLandmark.NOSE].x * image_width}, '
|
||||
f'{results.pose_landmarks.landmark[mp_holistic.PoseLandmark.NOSE].y * image_height})'
|
||||
f'{results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE].x * image_width}, '
|
||||
f'{results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE].y * image_height})'
|
||||
)
|
||||
# Draw pose landmarks on the image.
|
||||
|
||||
annotated_image = image.copy()
|
||||
# Draw segmentation on the image.
|
||||
# To improve segmentation around boundaries, consider applying a joint
|
||||
# bilateral filter to "results.segmentation_mask" with "image".
|
||||
condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
|
||||
bg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
bg_image[:] = BG_COLOR
|
||||
annotated_image = np.where(condition, annotated_image, bg_image)
|
||||
# Draw pose landmarks on the image.
|
||||
mp_drawing.draw_landmarks(
|
||||
annotated_image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
|
||||
annotated_image,
|
||||
results.pose_landmarks,
|
||||
mp_pose.POSE_CONNECTIONS,
|
||||
landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style())
|
||||
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
|
||||
# Plot pose world landmarks.
|
||||
mp_drawing.plot_landmarks(
|
||||
results.pose_world_landmarks, mp_pose.POSE_CONNECTIONS)
|
||||
|
||||
# For webcam input:
|
||||
cap = cv2.VideoCapture(0)
|
||||
|
@ -266,7 +328,10 @@ with mp_pose.Pose(
|
|||
image.flags.writeable = True
|
||||
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
|
||||
mp_drawing.draw_landmarks(
|
||||
image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
|
||||
image,
|
||||
results.pose_landmarks,
|
||||
mp_pose.POSE_CONNECTIONS,
|
||||
landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style())
|
||||
cv2.imshow('MediaPipe Pose', image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
|
@ -283,6 +348,8 @@ Supported configuration options:
|
|||
|
||||
* [modelComplexity](#model_complexity)
|
||||
* [smoothLandmarks](#smooth_landmarks)
|
||||
* [enableSegmentation](#enable_segmentation)
|
||||
* [smoothSegmentation](#smooth_segmentation)
|
||||
* [minDetectionConfidence](#min_detection_confidence)
|
||||
* [minTrackingConfidence](#min_tracking_confidence)
|
||||
|
||||
|
@ -293,6 +360,7 @@ Supported configuration options:
|
|||
<meta charset="utf-8">
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/control_utils_3d.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/pose/pose.js" crossorigin="anonymous"></script>
|
||||
</head>
|
||||
|
@ -301,6 +369,7 @@ Supported configuration options:
|
|||
<div class="container">
|
||||
<video class="input_video"></video>
|
||||
<canvas class="output_canvas" width="1280px" height="720px"></canvas>
|
||||
<div class="landmark-grid-container"></div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
|
@ -311,17 +380,38 @@ Supported configuration options:
|
|||
const videoElement = document.getElementsByClassName('input_video')[0];
|
||||
const canvasElement = document.getElementsByClassName('output_canvas')[0];
|
||||
const canvasCtx = canvasElement.getContext('2d');
|
||||
const landmarkContainer = document.getElementsByClassName('landmark-grid-container')[0];
|
||||
const grid = new LandmarkGrid(landmarkContainer);
|
||||
|
||||
function onResults(results) {
|
||||
if (!results.poseLandmarks) {
|
||||
grid.updateLandmarks([]);
|
||||
return;
|
||||
}
|
||||
|
||||
canvasCtx.save();
|
||||
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
canvasCtx.drawImage(results.segmentationMask, 0, 0,
|
||||
canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite existing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'source-in';
|
||||
canvasCtx.fillStyle = '#00FF00';
|
||||
canvasCtx.fillRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite missing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'destination-atop';
|
||||
canvasCtx.drawImage(
|
||||
results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
canvasCtx.globalCompositeOperation = 'source-over';
|
||||
drawConnectors(canvasCtx, results.poseLandmarks, POSE_CONNECTIONS,
|
||||
{color: '#00FF00', lineWidth: 4});
|
||||
drawLandmarks(canvasCtx, results.poseLandmarks,
|
||||
{color: '#FF0000', lineWidth: 2});
|
||||
canvasCtx.restore();
|
||||
|
||||
grid.updateLandmarks(results.poseWorldLandmarks);
|
||||
}
|
||||
|
||||
const pose = new Pose({locateFile: (file) => {
|
||||
|
@ -330,6 +420,8 @@ const pose = new Pose({locateFile: (file) => {
|
|||
pose.setOptions({
|
||||
modelComplexity: 1,
|
||||
smoothLandmarks: true,
|
||||
enableSegmentation: true,
|
||||
smoothSegmentation: true,
|
||||
minDetectionConfidence: 0.5,
|
||||
minTrackingConfidence: 0.5
|
||||
});
|
||||
|
|
290
docs/solutions/selfie_segmentation.md
Normal file
290
docs/solutions/selfie_segmentation.md
Normal file
|
@ -0,0 +1,290 @@
|
|||
---
|
||||
layout: default
|
||||
title: Selfie Segmentation
|
||||
parent: Solutions
|
||||
nav_order: 7
|
||||
---
|
||||
|
||||
# MediaPipe Selfie Segmentation
|
||||
{: .no_toc }
|
||||
|
||||
<details close markdown="block">
|
||||
<summary>
|
||||
Table of contents
|
||||
</summary>
|
||||
{: .text-delta }
|
||||
1. TOC
|
||||
{:toc}
|
||||
</details>
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
*Fig 1. Example of MediaPipe Selfie Segmentation.* |
|
||||
:------------------------------------------------: |
|
||||
<video autoplay muted loop preload style="height: auto; width: 480px"><source src="../images/selfie_segmentation_web.mp4" type="video/mp4"></video> |
|
||||
|
||||
MediaPipe Selfie Segmentation segments the prominent humans in the scene. It can
|
||||
run in real-time on both smartphones and laptops. The intended use cases include
|
||||
selfie effects and video conferencing, where the person is close (< 2m) to the
|
||||
camera.
|
||||
|
||||
## Models
|
||||
|
||||
In this solution, we provide two models: general and landscape. Both models are
|
||||
based on
|
||||
[MobileNetV3](https://ai.googleblog.com/2019/11/introducing-next-generation-on-device.html),
|
||||
with modifications to make them more efficient. The general model operates on a
|
||||
256x256x3 (HWC) tensor, and outputs a 256x256x1 tensor representing the
|
||||
segmentation mask. The landscape model is similar to the general model, but
operates on a 144x256x3 (HWC) tensor. It has fewer FLOPs than the general model
and therefore runs faster. Note that MediaPipe Selfie Segmentation
|
||||
automatically resizes the input image to the desired tensor dimension before
|
||||
feeding it into the ML models.
|
||||
|
||||
The general model is also powering
|
||||
[ML Kit](https://developers.google.com/ml-kit/vision/selfie-segmentation), and a
|
||||
variant of the landscape model is powering
|
||||
[Google Meet](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html).
|
||||
Please find more detail about the models in the
|
||||
[model card](./models.md#selfie-segmentation).
|
||||
|
||||
## ML Pipeline
|
||||
|
||||
The pipeline is implemented as a MediaPipe
|
||||
[graph](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
that uses a
|
||||
[selfie segmentation subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
from the
|
||||
[selfie segmentation module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation).
|
||||
|
||||
Note: To visualize a graph, copy the graph and paste it into
|
||||
[MediaPipe Visualizer](https://viz.mediapipe.dev/). For more information on how
|
||||
to visualize its associated subgraphs, please see
|
||||
[visualizer documentation](../tools/visualizer.md).
|
||||
|
||||
## Solution APIs
|
||||
|
||||
### Cross-platform Configuration Options
|
||||
|
||||
Naming style and availability may differ slightly across platforms/languages.
|
||||
|
||||
#### model_selection
|
||||
|
||||
An integer index `0` or `1`. Use `0` to select the general model, and `1` to
select the landscape model (see details in [Models](#models)). Default to `0` if
not specified.
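
For example, one might pick the landscape model for wide,
video-conferencing-style input (a sketch; the aspect-ratio heuristic is only an
illustration, not part of the API):

```python
import mediapipe as mp

def make_segmenter(frame_width, frame_height):
  """Use the landscape model (1) for wide frames, the general model (0) otherwise."""
  model = 1 if frame_width / frame_height >= 16 / 9 else 0
  return mp.solutions.selfie_segmentation.SelfieSegmentation(
      model_selection=model)
```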
### Output
|
||||
|
||||
Naming style may differ slightly across platforms/languages.
|
||||
|
||||
#### segmentation_mask
|
||||
|
||||
The output segmentation mask, which has the same dimension as the input image.
|
||||
|
||||
### Python Solution API
|
||||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [model_selection](#model_selection)
|
||||
|
||||
```python
|
||||
import cv2
|
||||
import mediapipe as mp
|
||||
import numpy as np
|
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_selfie_segmentation = mp.solutions.selfie_segmentation
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
BG_COLOR = (192, 192, 192) # gray
|
||||
MASK_COLOR = (255, 255, 255) # white
|
||||
with mp_selfie_segmentation.SelfieSegmentation(
|
||||
model_selection=0) as selfie_segmentation:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
results = selfie_segmentation.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
||||
# Draw selfie segmentation on the background image.
|
||||
# To improve segmentation around boundaries, consider applying a joint
|
||||
# bilateral filter to "results.segmentation_mask" with "image".
|
||||
condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
|
||||
# Generate solid color images for showing the output selfie segmentation mask.
|
||||
fg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
fg_image[:] = MASK_COLOR
|
||||
bg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
bg_image[:] = BG_COLOR
|
||||
output_image = np.where(condition, fg_image, bg_image)
|
||||
cv2.imwrite('/tmp/selfie_segmentation_output' + str(idx) + '.png', output_image)
|
||||
|
||||
# For webcam input:
|
||||
BG_COLOR = (192, 192, 192) # gray
|
||||
cap = cv2.VideoCapture(0)
|
||||
with mp_selfie_segmentation.SelfieSegmentation(
|
||||
model_selection=1) as selfie_segmentation:
|
||||
bg_image = None
|
||||
while cap.isOpened():
|
||||
success, image = cap.read()
|
||||
if not success:
|
||||
print("Ignoring empty camera frame.")
|
||||
# If loading a video, use 'break' instead of 'continue'.
|
||||
continue
|
||||
|
||||
# Flip the image horizontally for a later selfie-view display, and convert
|
||||
# the BGR image to RGB.
|
||||
image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
|
||||
# To improve performance, optionally mark the image as not writeable to
|
||||
# pass by reference.
|
||||
image.flags.writeable = False
|
||||
results = selfie_segmentation.process(image)
|
||||
|
||||
image.flags.writeable = True
|
||||
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
|
||||
|
||||
# Draw selfie segmentation on the background image.
|
||||
# To improve segmentation around boundaries, consider applying a joint
|
||||
# bilateral filter to "results.segmentation_mask" with "image".
|
||||
condition = np.stack(
|
||||
(results.segmentation_mask,) * 3, axis=-1) > 0.1
|
||||
# The background can be customized.
|
||||
# a) Load an image (with the same width and height of the input image) to
|
||||
# be the background, e.g., bg_image = cv2.imread('/path/to/image/file')
|
||||
# b) Blur the input image by applying image filtering, e.g.,
|
||||
# bg_image = cv2.GaussianBlur(image,(55,55),0)
|
||||
if bg_image is None:
|
||||
bg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
bg_image[:] = BG_COLOR
|
||||
output_image = np.where(condition, image, bg_image)
|
||||
|
||||
cv2.imshow('MediaPipe Selfie Segmentation', output_image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
cap.release()
|
||||
```
|
||||
|
||||
### JavaScript Solution API
|
||||
|
||||
Please first see general [introduction](../getting_started/javascript.md) on
|
||||
MediaPipe in JavaScript, then learn more in the companion [web demo](#resources)
|
||||
and the following usage example.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [modelSelection](#model_selection)
|
||||
|
||||
```html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/selfie_segmentation.js" crossorigin="anonymous"></script>
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<div class="container">
|
||||
<video class="input_video"></video>
|
||||
<canvas class="output_canvas" width="1280px" height="720px"></canvas>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
```javascript
|
||||
<script type="module">
|
||||
const videoElement = document.getElementsByClassName('input_video')[0];
|
||||
const canvasElement = document.getElementsByClassName('output_canvas')[0];
|
||||
const canvasCtx = canvasElement.getContext('2d');
|
||||
|
||||
function onResults(results) {
|
||||
canvasCtx.save();
|
||||
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
canvasCtx.drawImage(results.segmentationMask, 0, 0,
|
||||
canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite existing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'source-in';
|
||||
canvasCtx.fillStyle = '#00FF00';
|
||||
canvasCtx.fillRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite missing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'destination-atop';
|
||||
canvasCtx.drawImage(
|
||||
results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
canvasCtx.restore();
|
||||
}
|
||||
|
||||
const selfieSegmentation = new SelfieSegmentation({locateFile: (file) => {
|
||||
return `https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/${file}`;
|
||||
}});
|
||||
selfieSegmentation.setOptions({
|
||||
modelSelection: 1,
|
||||
});
|
||||
selfieSegmentation.onResults(onResults);
|
||||
|
||||
const camera = new Camera(videoElement, {
|
||||
onFrame: async () => {
|
||||
await selfieSegmentation.send({image: videoElement});
|
||||
},
|
||||
width: 1280,
|
||||
height: 720
|
||||
});
|
||||
camera.start();
|
||||
</script>
|
||||
```
|
||||
|
||||
## Example Apps
|
||||
|
||||
Please first see general instructions for
|
||||
[Android](../getting_started/android.md), [iOS](../getting_started/ios.md), and
|
||||
[desktop](../getting_started/cpp.md) on how to build MediaPipe examples.
|
||||
|
||||
Note: To visualize a graph, copy the graph and paste it into
|
||||
[MediaPipe Visualizer](https://viz.mediapipe.dev/). For more information on how
|
||||
to visualize its associated subgraphs, please see
|
||||
[visualizer documentation](../tools/visualizer.md).
|
||||
|
||||
### Mobile
|
||||
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
* Android target:
|
||||
[(or download prebuilt ARM64 APK)](https://drive.google.com/file/d/1DoeyGzMmWUsjfVgZfGGecrn7GKzYcEAo/view?usp=sharing)
|
||||
[`mediapipe/examples/android/src/java/com/google/mediapipe/apps/selfiesegmentationgpu:selfiesegmentationgpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/selfiesegmentationgpu/BUILD)
|
||||
* iOS target:
|
||||
[`mediapipe/examples/ios/selfiesegmentationgpu:SelfieSegmentationGpuApp`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/selfiesegmentationgpu/BUILD)
|
||||
|
||||
### Desktop
|
||||
|
||||
Please first see general instructions for [desktop](../getting_started/cpp.md)
|
||||
on how to build MediaPipe examples.
|
||||
|
||||
* Running on CPU
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_cpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_cpu.pbtxt)
|
||||
* Target:
|
||||
[`mediapipe/examples/desktop/selfie_segmentation:selfie_segmentation_cpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/selfie_segmentation/BUILD)
|
||||
* Running on GPU
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
* Target:
|
||||
[`mediapipe/examples/desktop/selfie_segmentation:selfie_segmentation_gpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/selfie_segmentation/BUILD)
|
||||
|
||||
## Resources
|
||||
|
||||
* Google AI Blog:
|
||||
[Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
|
||||
* [ML Kit Selfie Segmentation API](https://developers.google.com/ml-kit/vision/selfie-segmentation)
|
||||
* [Models and model cards](./models.md#selfie-segmentation)
|
||||
* [Web demo](https://code.mediapipe.dev/codepen/selfie_segmentation)
|
||||
* [Python Colab](https://mediapipe.page.link/selfie_segmentation_py_colab)
|
|
@ -13,6 +13,9 @@ has_toc: false
|
|||
{:toc}
|
||||
---
|
||||
|
||||
MediaPipe offers open source cross-platform, customizable ML solutions for live
|
||||
and streaming media.
|
||||
|
||||
<!-- []() in the first cell is needed to preserve table formatting in GitHub Pages. -->
|
||||
<!-- Whenever this table is updated, paste a copy to ../external_index.md. -->
|
||||
|
||||
|
@ -24,11 +27,12 @@ has_toc: false
|
|||
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
|
||||
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
|
||||
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
|
||||
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
|
||||
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
|
||||
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
|
||||
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: YouTube-8M Feature Extraction and Model Inference
|
||||
parent: Solutions
|
||||
nav_order: 15
|
||||
nav_order: 16
|
||||
---
|
||||
|
||||
# YouTube-8M Feature Extraction and Model Inference
|
||||
|
|
|
@ -16,6 +16,7 @@
|
|||
"mediapipe/examples/ios/objectdetectiongpu/BUILD",
|
||||
"mediapipe/examples/ios/objectdetectiontrackinggpu/BUILD",
|
||||
"mediapipe/examples/ios/posetrackinggpu/BUILD",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu/BUILD",
|
||||
"mediapipe/framework/BUILD",
|
||||
"mediapipe/gpu/BUILD",
|
||||
"mediapipe/objc/BUILD",
|
||||
|
@ -35,6 +36,7 @@
|
|||
"//mediapipe/examples/ios/objectdetectiongpu:ObjectDetectionGpuApp",
|
||||
"//mediapipe/examples/ios/objectdetectiontrackinggpu:ObjectDetectionTrackingGpuApp",
|
||||
"//mediapipe/examples/ios/posetrackinggpu:PoseTrackingGpuApp",
|
||||
"//mediapipe/examples/ios/selfiesegmentationgpu:SelfieSegmentationGpuApp",
|
||||
"//mediapipe/objc:mediapipe_framework_ios"
|
||||
],
|
||||
"optionSet" : {
|
||||
|
@ -103,6 +105,7 @@
|
|||
"mediapipe/examples/ios/objectdetectioncpu",
|
||||
"mediapipe/examples/ios/objectdetectiongpu",
|
||||
"mediapipe/examples/ios/posetrackinggpu",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu",
|
||||
"mediapipe/framework",
|
||||
"mediapipe/framework/deps",
|
||||
"mediapipe/framework/formats",
|
||||
|
@ -120,6 +123,7 @@
|
|||
"mediapipe/graphs/hand_tracking",
|
||||
"mediapipe/graphs/object_detection",
|
||||
"mediapipe/graphs/pose_tracking",
|
||||
"mediapipe/graphs/selfie_segmentation",
|
||||
"mediapipe/models",
|
||||
"mediapipe/modules",
|
||||
"mediapipe/objc",
|
||||
|
|
|
@ -22,6 +22,7 @@
|
|||
"mediapipe/examples/ios/objectdetectiongpu",
|
||||
"mediapipe/examples/ios/objectdetectiontrackinggpu",
|
||||
"mediapipe/examples/ios/posetrackinggpu",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu",
|
||||
"mediapipe/objc"
|
||||
],
|
||||
"projectName" : "Mediapipe",
|
||||
|
|
|
@ -140,6 +140,16 @@ mediapipe_proto_library(
|
|||
],
|
||||
)
|
||||
|
||||
mediapipe_proto_library(
|
||||
name = "graph_profile_calculator_proto",
|
||||
srcs = ["graph_profile_calculator.proto"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
"//mediapipe/framework:calculator_options_proto",
|
||||
"//mediapipe/framework:calculator_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "add_header_calculator",
|
||||
srcs = ["add_header_calculator.cc"],
|
||||
|
@ -419,6 +429,23 @@ cc_library(
|
|||
alwayslink = 1,
|
||||
)
|
||||
|
||||
cc_test(
|
||||
name = "make_pair_calculator_test",
|
||||
size = "small",
|
||||
srcs = ["make_pair_calculator_test.cc"],
|
||||
deps = [
|
||||
":make_pair_calculator",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_runner",
|
||||
"//mediapipe/framework:timestamp",
|
||||
"//mediapipe/framework/port:gtest_main",
|
||||
"//mediapipe/framework/port:status",
|
||||
"//mediapipe/framework/tool:validate_type",
|
||||
"//mediapipe/util:packet_test_util",
|
||||
"//mediapipe/util:time_series_test_util",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "matrix_multiply_calculator",
|
||||
srcs = ["matrix_multiply_calculator.cc"],
|
||||
|
@ -933,8 +960,8 @@ cc_test(
|
|||
)
|
||||
|
||||
cc_library(
|
||||
name = "split_normalized_landmark_list_calculator",
|
||||
srcs = ["split_normalized_landmark_list_calculator.cc"],
|
||||
name = "split_landmarks_calculator",
|
||||
srcs = ["split_landmarks_calculator.cc"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
":split_vector_calculator_cc_proto",
|
||||
|
@ -948,10 +975,10 @@ cc_library(
|
|||
)
|
||||
|
||||
cc_test(
|
||||
name = "split_normalized_landmark_list_calculator_test",
|
||||
srcs = ["split_normalized_landmark_list_calculator_test.cc"],
|
||||
name = "split_landmarks_calculator_test",
|
||||
srcs = ["split_landmarks_calculator_test.cc"],
|
||||
deps = [
|
||||
":split_normalized_landmark_list_calculator",
|
||||
":split_landmarks_calculator",
|
||||
":split_vector_calculator_cc_proto",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_runner",
|
||||
|
@ -1183,3 +1210,45 @@ cc_test(
|
|||
"@com_google_absl//absl/strings",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "graph_profile_calculator",
|
||||
srcs = ["graph_profile_calculator.cc"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
":graph_profile_calculator_cc_proto",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_profile_cc_proto",
|
||||
"//mediapipe/framework/api2:node",
|
||||
"//mediapipe/framework/api2:packet",
|
||||
"//mediapipe/framework/api2:port",
|
||||
"//mediapipe/framework/port:ret_check",
|
||||
"//mediapipe/framework/port:status",
|
||||
],
|
||||
alwayslink = 1,
|
||||
)
|
||||
|
||||
cc_test(
|
||||
name = "graph_profile_calculator_test",
|
||||
srcs = ["graph_profile_calculator_test.cc"],
|
||||
deps = [
|
||||
":graph_profile_calculator",
|
||||
"//mediapipe/framework:calculator_cc_proto",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_profile_cc_proto",
|
||||
"//mediapipe/framework:test_calculators",
|
||||
"//mediapipe/framework/deps:clock",
|
||||
"//mediapipe/framework/deps:message_matchers",
|
||||
"//mediapipe/framework/port:core_proto",
|
||||
"//mediapipe/framework/port:gtest_main",
|
||||
"//mediapipe/framework/port:integral_types",
|
||||
"//mediapipe/framework/port:logging",
|
||||
"//mediapipe/framework/port:parse_text_proto",
|
||||
"//mediapipe/framework/port:threadpool",
|
||||
"//mediapipe/framework/tool:simulation_clock_executor",
|
||||
"//mediapipe/framework/tool:sink",
|
||||
"@com_google_absl//absl/status",
|
||||
"@com_google_absl//absl/strings",
|
||||
"@com_google_absl//absl/time",
|
||||
],
|
||||
)
|
||||
|
|
|
@ -24,6 +24,9 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kDataTag[] = "DATA";
|
||||
constexpr char kHeaderTag[] = "HEADER";
|
||||
|
||||
class AddHeaderCalculatorTest : public ::testing::Test {};
|
||||
|
||||
TEST_F(AddHeaderCalculatorTest, HeaderStream) {
|
||||
|
@ -36,11 +39,11 @@ TEST_F(AddHeaderCalculatorTest, HeaderStream) {
|
|||
CalculatorRunner runner(node);
|
||||
|
||||
// Set header and add 5 packets.
|
||||
runner.MutableInputs()->Tag("HEADER").header =
|
||||
runner.MutableInputs()->Tag(kHeaderTag).header =
|
||||
Adopt(new std::string("my_header"));
|
||||
for (int i = 0; i < 5; ++i) {
|
||||
Packet packet = Adopt(new int(i)).At(Timestamp(i * 1000));
|
||||
runner.MutableInputs()->Tag("DATA").packets.push_back(packet);
|
||||
runner.MutableInputs()->Tag(kDataTag).packets.push_back(packet);
|
||||
}
|
||||
|
||||
// Run calculator.
|
||||
|
@ -85,13 +88,14 @@ TEST_F(AddHeaderCalculatorTest, NoPacketsOnHeaderStream) {
|
|||
CalculatorRunner runner(node);
|
||||
|
||||
// Set header and add 5 packets.
|
||||
runner.MutableInputs()->Tag("HEADER").header =
|
||||
runner.MutableInputs()->Tag(kHeaderTag).header =
|
||||
Adopt(new std::string("my_header"));
|
||||
runner.MutableInputs()->Tag("HEADER").packets.push_back(
|
||||
Adopt(new std::string("not allowed")));
|
||||
runner.MutableInputs()
|
||||
->Tag(kHeaderTag)
|
||||
.packets.push_back(Adopt(new std::string("not allowed")));
|
||||
for (int i = 0; i < 5; ++i) {
|
||||
Packet packet = Adopt(new int(i)).At(Timestamp(i * 1000));
|
||||
runner.MutableInputs()->Tag("DATA").packets.push_back(packet);
|
||||
runner.MutableInputs()->Tag(kDataTag).packets.push_back(packet);
|
||||
}
|
||||
|
||||
// Run calculator.
|
||||
|
@ -108,11 +112,11 @@ TEST_F(AddHeaderCalculatorTest, InputSidePacket) {
|
|||
CalculatorRunner runner(node);
|
||||
|
||||
// Set header and add 5 packets.
|
||||
runner.MutableSidePackets()->Tag("HEADER") =
|
||||
runner.MutableSidePackets()->Tag(kHeaderTag) =
|
||||
Adopt(new std::string("my_header"));
|
||||
for (int i = 0; i < 5; ++i) {
|
||||
Packet packet = Adopt(new int(i)).At(Timestamp(i * 1000));
|
||||
runner.MutableInputs()->Tag("DATA").packets.push_back(packet);
|
||||
runner.MutableInputs()->Tag(kDataTag).packets.push_back(packet);
|
||||
}
|
||||
|
||||
// Run calculator.
|
||||
|
@ -143,13 +147,13 @@ TEST_F(AddHeaderCalculatorTest, UsingBothSideInputAndStream) {
|
|||
CalculatorRunner runner(node);
|
||||
|
||||
// Set both headers and add 5 packets.
|
||||
runner.MutableSidePackets()->Tag("HEADER") =
|
||||
runner.MutableSidePackets()->Tag(kHeaderTag) =
|
||||
Adopt(new std::string("my_header"));
|
||||
runner.MutableSidePackets()->Tag("HEADER") =
|
||||
runner.MutableSidePackets()->Tag(kHeaderTag) =
|
||||
Adopt(new std::string("my_header"));
|
||||
for (int i = 0; i < 5; ++i) {
|
||||
Packet packet = Adopt(new int(i)).At(Timestamp(i * 1000));
|
||||
runner.MutableInputs()->Tag("DATA").packets.push_back(packet);
|
||||
runner.MutableInputs()->Tag(kDataTag).packets.push_back(packet);
|
||||
}
|
||||
|
||||
// Run should fail because header can only be provided one way.
|
||||
|
|
|
@ -42,4 +42,9 @@ REGISTER_CALCULATOR(BeginLoopDetectionCalculator);
|
|||
typedef BeginLoopCalculator<std::vector<Matrix>> BeginLoopMatrixCalculator;
|
||||
REGISTER_CALCULATOR(BeginLoopMatrixCalculator);
|
||||
|
||||
// A calculator to process std::vector<std::vector<Matrix>>.
|
||||
typedef BeginLoopCalculator<std::vector<std::vector<Matrix>>>
|
||||
BeginLoopMatrixVectorCalculator;
|
||||
REGISTER_CALCULATOR(BeginLoopMatrixVectorCalculator);
|
||||
|
||||
} // namespace mediapipe
|
||||
|
|
|
@ -19,6 +19,13 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kIncrementTag[] = "INCREMENT";
|
||||
constexpr char kInitialValueTag[] = "INITIAL_VALUE";
|
||||
constexpr char kBatchSizeTag[] = "BATCH_SIZE";
|
||||
constexpr char kErrorCountTag[] = "ERROR_COUNT";
|
||||
constexpr char kMaxCountTag[] = "MAX_COUNT";
|
||||
constexpr char kErrorOnOpenTag[] = "ERROR_ON_OPEN";
|
||||
|
||||
// Source calculator that produces MAX_COUNT*BATCH_SIZE int packets of
|
||||
// sequential numbers from INITIAL_VALUE (default 0) with a common
|
||||
// difference of INCREMENT (default 1) between successive numbers (with
|
||||
|
@ -33,53 +40,53 @@ class CountingSourceCalculator : public CalculatorBase {
|
|||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->Outputs().Index(0).Set<int>();
|
||||
|
||||
if (cc->InputSidePackets().HasTag("ERROR_ON_OPEN")) {
|
||||
cc->InputSidePackets().Tag("ERROR_ON_OPEN").Set<bool>();
|
||||
if (cc->InputSidePackets().HasTag(kErrorOnOpenTag)) {
|
||||
cc->InputSidePackets().Tag(kErrorOnOpenTag).Set<bool>();
|
||||
}
|
||||
|
||||
RET_CHECK(cc->InputSidePackets().HasTag("MAX_COUNT") ||
|
||||
cc->InputSidePackets().HasTag("ERROR_COUNT"));
|
||||
if (cc->InputSidePackets().HasTag("MAX_COUNT")) {
|
||||
cc->InputSidePackets().Tag("MAX_COUNT").Set<int>();
|
||||
RET_CHECK(cc->InputSidePackets().HasTag(kMaxCountTag) ||
|
||||
cc->InputSidePackets().HasTag(kErrorCountTag));
|
||||
if (cc->InputSidePackets().HasTag(kMaxCountTag)) {
|
||||
cc->InputSidePackets().Tag(kMaxCountTag).Set<int>();
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("ERROR_COUNT")) {
|
||||
cc->InputSidePackets().Tag("ERROR_COUNT").Set<int>();
|
||||
if (cc->InputSidePackets().HasTag(kErrorCountTag)) {
|
||||
cc->InputSidePackets().Tag(kErrorCountTag).Set<int>();
|
||||
}
|
||||
|
||||
if (cc->InputSidePackets().HasTag("BATCH_SIZE")) {
|
||||
cc->InputSidePackets().Tag("BATCH_SIZE").Set<int>();
|
||||
if (cc->InputSidePackets().HasTag(kBatchSizeTag)) {
|
||||
cc->InputSidePackets().Tag(kBatchSizeTag).Set<int>();
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("INITIAL_VALUE")) {
|
||||
cc->InputSidePackets().Tag("INITIAL_VALUE").Set<int>();
|
||||
if (cc->InputSidePackets().HasTag(kInitialValueTag)) {
|
||||
cc->InputSidePackets().Tag(kInitialValueTag).Set<int>();
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("INCREMENT")) {
|
||||
cc->InputSidePackets().Tag("INCREMENT").Set<int>();
|
||||
if (cc->InputSidePackets().HasTag(kIncrementTag)) {
|
||||
cc->InputSidePackets().Tag(kIncrementTag).Set<int>();
|
||||
}
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Open(CalculatorContext* cc) override {
|
||||
if (cc->InputSidePackets().HasTag("ERROR_ON_OPEN") &&
|
||||
cc->InputSidePackets().Tag("ERROR_ON_OPEN").Get<bool>()) {
|
||||
if (cc->InputSidePackets().HasTag(kErrorOnOpenTag) &&
|
||||
cc->InputSidePackets().Tag(kErrorOnOpenTag).Get<bool>()) {
|
||||
return absl::NotFoundError("expected error");
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("ERROR_COUNT")) {
|
||||
error_count_ = cc->InputSidePackets().Tag("ERROR_COUNT").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kErrorCountTag)) {
|
||||
error_count_ = cc->InputSidePackets().Tag(kErrorCountTag).Get<int>();
|
||||
RET_CHECK_LE(0, error_count_);
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("MAX_COUNT")) {
|
||||
max_count_ = cc->InputSidePackets().Tag("MAX_COUNT").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kMaxCountTag)) {
|
||||
max_count_ = cc->InputSidePackets().Tag(kMaxCountTag).Get<int>();
|
||||
RET_CHECK_LE(0, max_count_);
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("BATCH_SIZE")) {
|
||||
batch_size_ = cc->InputSidePackets().Tag("BATCH_SIZE").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kBatchSizeTag)) {
|
||||
batch_size_ = cc->InputSidePackets().Tag(kBatchSizeTag).Get<int>();
|
||||
RET_CHECK_LT(0, batch_size_);
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("INITIAL_VALUE")) {
|
||||
counter_ = cc->InputSidePackets().Tag("INITIAL_VALUE").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kInitialValueTag)) {
|
||||
counter_ = cc->InputSidePackets().Tag(kInitialValueTag).Get<int>();
|
||||
}
|
||||
if (cc->InputSidePackets().HasTag("INCREMENT")) {
|
||||
increment_ = cc->InputSidePackets().Tag("INCREMENT").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kIncrementTag)) {
|
||||
increment_ = cc->InputSidePackets().Tag(kIncrementTag).Get<int>();
|
||||
RET_CHECK_LT(0, increment_);
|
||||
}
|
||||
RET_CHECK(error_count_ >= 0 || max_count_ >= 0);
|
||||
|
|
|
@ -35,11 +35,14 @@
|
|||
// }
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kFloatVectorTag[] = "FLOAT_VECTOR";
|
||||
constexpr char kEncodedTag[] = "ENCODED";
|
||||
|
||||
class DequantizeByteArrayCalculator : public CalculatorBase {
|
||||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->Inputs().Tag("ENCODED").Set<std::string>();
|
||||
cc->Outputs().Tag("FLOAT_VECTOR").Set<std::vector<float>>();
|
||||
cc->Inputs().Tag(kEncodedTag).Set<std::string>();
|
||||
cc->Outputs().Tag(kFloatVectorTag).Set<std::vector<float>>();
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
|
@ -66,7 +69,7 @@ class DequantizeByteArrayCalculator : public CalculatorBase {
|
|||
|
||||
absl::Status Process(CalculatorContext* cc) final {
|
||||
const std::string& encoded =
|
||||
cc->Inputs().Tag("ENCODED").Value().Get<std::string>();
|
||||
cc->Inputs().Tag(kEncodedTag).Value().Get<std::string>();
|
||||
std::vector<float> float_vector;
|
||||
float_vector.reserve(encoded.length());
|
||||
for (int i = 0; i < encoded.length(); ++i) {
|
||||
|
@ -74,7 +77,7 @@ class DequantizeByteArrayCalculator : public CalculatorBase {
|
|||
static_cast<unsigned char>(encoded.at(i)) * scalar_ + bias_);
|
||||
}
|
||||
cc->Outputs()
|
||||
.Tag("FLOAT_VECTOR")
|
||||
.Tag(kFloatVectorTag)
|
||||
.AddPacket(MakePacket<std::vector<float>>(float_vector)
|
||||
.At(cc->InputTimestamp()));
|
||||
return absl::OkStatus();
|
||||
|
|
|
@ -25,6 +25,9 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kFloatVectorTag[] = "FLOAT_VECTOR";
|
||||
constexpr char kEncodedTag[] = "ENCODED";
|
||||
|
||||
TEST(QuantizeFloatVectorCalculatorTest, WrongConfig) {
|
||||
CalculatorGraphConfig::Node node_config =
|
||||
ParseTextProtoOrDie<CalculatorGraphConfig::Node>(R"pb(
|
||||
|
@ -39,7 +42,9 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig) {
|
|||
)pb");
|
||||
CalculatorRunner runner(node_config);
|
||||
std::string empty_string;
|
||||
runner.MutableInputs()->Tag("ENCODED").packets.push_back(
|
||||
runner.MutableInputs()
|
||||
->Tag(kEncodedTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::string>(empty_string).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
EXPECT_FALSE(status.ok());
|
||||
|
@ -64,7 +69,9 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig2) {
|
|||
)pb");
|
||||
CalculatorRunner runner(node_config);
|
||||
std::string empty_string;
|
||||
runner.MutableInputs()->Tag("ENCODED").packets.push_back(
|
||||
runner.MutableInputs()
|
||||
->Tag(kEncodedTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::string>(empty_string).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
EXPECT_FALSE(status.ok());
|
||||
|
@ -89,7 +96,9 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig3) {
|
|||
)pb");
|
||||
CalculatorRunner runner(node_config);
|
||||
std::string empty_string;
|
||||
runner.MutableInputs()->Tag("ENCODED").packets.push_back(
|
||||
runner.MutableInputs()
|
||||
->Tag(kEncodedTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::string>(empty_string).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
EXPECT_FALSE(status.ok());
|
||||
|
@ -114,14 +123,16 @@ TEST(DequantizeByteArrayCalculatorTest, TestDequantization) {
|
|||
)pb");
|
||||
CalculatorRunner runner(node_config);
|
||||
unsigned char input[4] = {0x7F, 0xFF, 0x00, 0x01};
|
||||
runner.MutableInputs()->Tag("ENCODED").packets.push_back(
|
||||
runner.MutableInputs()
|
||||
->Tag(kEncodedTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::string>(
|
||||
std::string(reinterpret_cast<char const*>(input), 4))
|
||||
.At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
const std::vector<Packet>& outputs =
|
||||
runner.Outputs().Tag("FLOAT_VECTOR").packets;
|
||||
runner.Outputs().Tag(kFloatVectorTag).packets;
|
||||
EXPECT_EQ(1, outputs.size());
|
||||
const std::vector<float>& result = outputs[0].Get<std::vector<float>>();
|
||||
ASSERT_FALSE(result.empty());
|
||||
|
|
|
@ -28,6 +28,10 @@ typedef EndLoopCalculator<std::vector<::mediapipe::NormalizedRect>>
|
|||
EndLoopNormalizedRectCalculator;
|
||||
REGISTER_CALCULATOR(EndLoopNormalizedRectCalculator);
|
||||
|
||||
typedef EndLoopCalculator<std::vector<::mediapipe::LandmarkList>>
|
||||
EndLoopLandmarkListVectorCalculator;
|
||||
REGISTER_CALCULATOR(EndLoopLandmarkListVectorCalculator);
|
||||
|
||||
typedef EndLoopCalculator<std::vector<::mediapipe::NormalizedLandmarkList>>
|
||||
EndLoopNormalizedLandmarkListVectorCalculator;
|
||||
REGISTER_CALCULATOR(EndLoopNormalizedLandmarkListVectorCalculator);
|
||||
|
|
|
@ -24,6 +24,11 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kFinishedTag[] = "FINISHED";
|
||||
constexpr char kAllowTag[] = "ALLOW";
|
||||
constexpr char kMaxInFlightTag[] = "MAX_IN_FLIGHT";
|
||||
constexpr char kOptionsTag[] = "OPTIONS";
|
||||
|
||||
// FlowLimiterCalculator is used to limit the number of frames in flight
|
||||
// by dropping input frames when necessary.
|
||||
//
|
||||
|
@ -69,16 +74,19 @@ class FlowLimiterCalculator : public CalculatorBase {
|
|||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
auto& side_inputs = cc->InputSidePackets();
|
||||
side_inputs.Tag("OPTIONS").Set<FlowLimiterCalculatorOptions>().Optional();
|
||||
cc->Inputs().Tag("OPTIONS").Set<FlowLimiterCalculatorOptions>().Optional();
|
||||
side_inputs.Tag(kOptionsTag).Set<FlowLimiterCalculatorOptions>().Optional();
|
||||
cc->Inputs()
|
||||
.Tag(kOptionsTag)
|
||||
.Set<FlowLimiterCalculatorOptions>()
|
||||
.Optional();
|
||||
RET_CHECK_GE(cc->Inputs().NumEntries(""), 1);
|
||||
for (int i = 0; i < cc->Inputs().NumEntries(""); ++i) {
|
||||
cc->Inputs().Get("", i).SetAny();
|
||||
cc->Outputs().Get("", i).SetSameAs(&(cc->Inputs().Get("", i)));
|
||||
}
|
||||
cc->Inputs().Get("FINISHED", 0).SetAny();
|
||||
cc->InputSidePackets().Tag("MAX_IN_FLIGHT").Set<int>().Optional();
|
||||
cc->Outputs().Tag("ALLOW").Set<bool>().Optional();
|
||||
cc->InputSidePackets().Tag(kMaxInFlightTag).Set<int>().Optional();
|
||||
cc->Outputs().Tag(kAllowTag).Set<bool>().Optional();
|
||||
cc->SetInputStreamHandler("ImmediateInputStreamHandler");
|
||||
cc->SetProcessTimestampBounds(true);
|
||||
return absl::OkStatus();
|
||||
|
@ -87,9 +95,9 @@ class FlowLimiterCalculator : public CalculatorBase {
|
|||
absl::Status Open(CalculatorContext* cc) final {
|
||||
options_ = cc->Options<FlowLimiterCalculatorOptions>();
|
||||
options_ = tool::RetrieveOptions(options_, cc->InputSidePackets());
|
||||
if (cc->InputSidePackets().HasTag("MAX_IN_FLIGHT")) {
|
||||
if (cc->InputSidePackets().HasTag(kMaxInFlightTag)) {
|
||||
options_.set_max_in_flight(
|
||||
cc->InputSidePackets().Tag("MAX_IN_FLIGHT").Get<int>());
|
||||
cc->InputSidePackets().Tag(kMaxInFlightTag).Get<int>());
|
||||
}
|
||||
input_queues_.resize(cc->Inputs().NumEntries(""));
|
||||
RET_CHECK_OK(CopyInputHeadersToOutputs(cc->Inputs(), &(cc->Outputs())));
|
||||
|
@ -104,8 +112,8 @@ class FlowLimiterCalculator : public CalculatorBase {
|
|||
|
||||
// Outputs a packet indicating whether a frame was sent or dropped.
|
||||
void SendAllow(bool allow, Timestamp ts, CalculatorContext* cc) {
|
||||
if (cc->Outputs().HasTag("ALLOW")) {
|
||||
cc->Outputs().Tag("ALLOW").AddPacket(MakePacket<bool>(allow).At(ts));
|
||||
if (cc->Outputs().HasTag(kAllowTag)) {
|
||||
cc->Outputs().Tag(kAllowTag).AddPacket(MakePacket<bool>(allow).At(ts));
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -155,7 +163,7 @@ class FlowLimiterCalculator : public CalculatorBase {
|
|||
options_ = tool::RetrieveOptions(options_, cc->Inputs());
|
||||
|
||||
// Process the FINISHED input stream.
|
||||
Packet finished_packet = cc->Inputs().Tag("FINISHED").Value();
|
||||
Packet finished_packet = cc->Inputs().Tag(kFinishedTag).Value();
|
||||
if (finished_packet.Timestamp() == cc->InputTimestamp()) {
|
||||
while (!frames_in_flight_.empty() &&
|
||||
frames_in_flight_.front() <= finished_packet.Timestamp()) {
|
||||
|
@ -210,8 +218,8 @@ class FlowLimiterCalculator : public CalculatorBase {
|
|||
Timestamp bound =
|
||||
cc->Inputs().Get("", 0).Value().Timestamp().NextAllowedInStream();
|
||||
SetNextTimestampBound(bound, &cc->Outputs().Get("", 0));
|
||||
if (cc->Outputs().HasTag("ALLOW")) {
|
||||
SetNextTimestampBound(bound, &cc->Outputs().Tag("ALLOW"));
|
||||
if (cc->Outputs().HasTag(kAllowTag)) {
|
||||
SetNextTimestampBound(bound, &cc->Outputs().Tag(kAllowTag));
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -36,6 +36,13 @@
|
|||
namespace mediapipe {
|
||||
|
||||
namespace {
|
||||
|
||||
constexpr char kDropTimestampsTag[] = "DROP_TIMESTAMPS";
|
||||
constexpr char kClockTag[] = "CLOCK";
|
||||
constexpr char kWarmupTimeTag[] = "WARMUP_TIME";
|
||||
constexpr char kSleepTimeTag[] = "SLEEP_TIME";
|
||||
constexpr char kPacketTag[] = "PACKET";
|
||||
|
||||
// A simple Semaphore for synchronizing test threads.
|
||||
class AtomicSemaphore {
|
||||
public:
|
||||
|
@ -204,17 +211,17 @@ TEST_F(FlowLimiterCalculatorSemaphoreTest, FramesDropped) {
|
|||
class SleepCalculator : public CalculatorBase {
|
||||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->Inputs().Tag("PACKET").SetAny();
|
||||
cc->Outputs().Tag("PACKET").SetSameAs(&cc->Inputs().Tag("PACKET"));
|
||||
cc->InputSidePackets().Tag("SLEEP_TIME").Set<int64>();
|
||||
cc->InputSidePackets().Tag("WARMUP_TIME").Set<int64>();
|
||||
cc->InputSidePackets().Tag("CLOCK").Set<mediapipe::Clock*>();
|
||||
cc->Inputs().Tag(kPacketTag).SetAny();
|
||||
cc->Outputs().Tag(kPacketTag).SetSameAs(&cc->Inputs().Tag(kPacketTag));
|
||||
cc->InputSidePackets().Tag(kSleepTimeTag).Set<int64>();
|
||||
cc->InputSidePackets().Tag(kWarmupTimeTag).Set<int64>();
|
||||
cc->InputSidePackets().Tag(kClockTag).Set<mediapipe::Clock*>();
|
||||
cc->SetTimestampOffset(0);
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Open(CalculatorContext* cc) final {
|
||||
clock_ = cc->InputSidePackets().Tag("CLOCK").Get<mediapipe::Clock*>();
|
||||
clock_ = cc->InputSidePackets().Tag(kClockTag).Get<mediapipe::Clock*>();
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
|
@ -222,10 +229,12 @@ class SleepCalculator : public CalculatorBase {
|
|||
++packet_count;
|
||||
absl::Duration sleep_time = absl::Microseconds(
|
||||
packet_count == 1
|
||||
? cc->InputSidePackets().Tag("WARMUP_TIME").Get<int64>()
|
||||
: cc->InputSidePackets().Tag("SLEEP_TIME").Get<int64>());
|
||||
? cc->InputSidePackets().Tag(kWarmupTimeTag).Get<int64>()
|
||||
: cc->InputSidePackets().Tag(kSleepTimeTag).Get<int64>());
|
||||
clock_->Sleep(sleep_time);
|
||||
cc->Outputs().Tag("PACKET").AddPacket(cc->Inputs().Tag("PACKET").Value());
|
||||
cc->Outputs()
|
||||
.Tag(kPacketTag)
|
||||
.AddPacket(cc->Inputs().Tag(kPacketTag).Value());
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
|
@ -240,24 +249,27 @@ REGISTER_CALCULATOR(SleepCalculator);
|
|||
class DropCalculator : public CalculatorBase {
|
||||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->Inputs().Tag("PACKET").SetAny();
|
||||
cc->Outputs().Tag("PACKET").SetSameAs(&cc->Inputs().Tag("PACKET"));
|
||||
cc->InputSidePackets().Tag("DROP_TIMESTAMPS").Set<bool>();
|
||||
cc->Inputs().Tag(kPacketTag).SetAny();
|
||||
cc->Outputs().Tag(kPacketTag).SetSameAs(&cc->Inputs().Tag(kPacketTag));
|
||||
cc->InputSidePackets().Tag(kDropTimestampsTag).Set<bool>();
|
||||
cc->SetProcessTimestampBounds(true);
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Process(CalculatorContext* cc) final {
|
||||
if (!cc->Inputs().Tag("PACKET").Value().IsEmpty()) {
|
||||
if (!cc->Inputs().Tag(kPacketTag).Value().IsEmpty()) {
|
||||
++packet_count;
|
||||
}
|
||||
bool drop = (packet_count == 3);
|
||||
if (!drop && !cc->Inputs().Tag("PACKET").Value().IsEmpty()) {
|
||||
cc->Outputs().Tag("PACKET").AddPacket(cc->Inputs().Tag("PACKET").Value());
|
||||
if (!drop && !cc->Inputs().Tag(kPacketTag).Value().IsEmpty()) {
|
||||
cc->Outputs()
|
||||
.Tag(kPacketTag)
|
||||
.AddPacket(cc->Inputs().Tag(kPacketTag).Value());
|
||||
}
|
||||
if (!drop || !cc->InputSidePackets().Tag("DROP_TIMESTAMPS").Get<bool>()) {
|
||||
cc->Outputs().Tag("PACKET").SetNextTimestampBound(
|
||||
cc->InputTimestamp().NextAllowedInStream());
|
||||
if (!drop || !cc->InputSidePackets().Tag(kDropTimestampsTag).Get<bool>()) {
|
||||
cc->Outputs()
|
||||
.Tag(kPacketTag)
|
||||
.SetNextTimestampBound(cc->InputTimestamp().NextAllowedInStream());
|
||||
}
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
|
|
@ -21,6 +21,11 @@
|
|||
namespace mediapipe {
|
||||
|
||||
namespace {
|
||||
|
||||
constexpr char kStateChangeTag[] = "STATE_CHANGE";
|
||||
constexpr char kDisallowTag[] = "DISALLOW";
|
||||
constexpr char kAllowTag[] = "ALLOW";
|
||||
|
||||
enum GateState {
|
||||
GATE_UNINITIALIZED,
|
||||
GATE_ALLOW,
|
||||
|
@ -59,8 +64,9 @@ std::string ToString(GateState state) {
|
|||
// ALLOW or DISALLOW can also be specified as an input side packet. The rules
|
||||
// for evaluation remain the same as above.
|
||||
//
|
||||
// ALLOW/DISALLOW inputs must be specified either using input stream or
|
||||
// via input side packet but not both.
|
||||
// ALLOW/DISALLOW inputs must be specified either using input stream or via
|
||||
// input side packet but not both. If neither is specified, the behavior is then
|
||||
// determined by the "allow" field in the calculator options.
|
||||
//
|
||||
// Intended to be used with the default input stream handler, which synchronizes
|
||||
// all data input streams with the ALLOW/DISALLOW control input stream.
|
||||
|
@ -83,30 +89,33 @@ class GateCalculator : public CalculatorBase {
|
|||
GateCalculator() {}
|
||||
|
||||
static absl::Status CheckAndInitAllowDisallowInputs(CalculatorContract* cc) {
|
||||
bool input_via_side_packet = cc->InputSidePackets().HasTag("ALLOW") ||
|
||||
cc->InputSidePackets().HasTag("DISALLOW");
|
||||
bool input_via_side_packet = cc->InputSidePackets().HasTag(kAllowTag) ||
|
||||
cc->InputSidePackets().HasTag(kDisallowTag);
|
||||
bool input_via_stream =
|
||||
cc->Inputs().HasTag("ALLOW") || cc->Inputs().HasTag("DISALLOW");
|
||||
// Only one of input_side_packet or input_stream may specify ALLOW/DISALLOW
|
||||
// input.
|
||||
RET_CHECK(input_via_side_packet ^ input_via_stream);
|
||||
cc->Inputs().HasTag(kAllowTag) || cc->Inputs().HasTag(kDisallowTag);
|
||||
|
||||
// Only one of input_side_packet or input_stream may specify
|
||||
// ALLOW/DISALLOW input.
|
||||
if (input_via_side_packet) {
|
||||
RET_CHECK(cc->InputSidePackets().HasTag("ALLOW") ^
|
||||
cc->InputSidePackets().HasTag("DISALLOW"));
|
||||
RET_CHECK(!input_via_stream);
|
||||
RET_CHECK(cc->InputSidePackets().HasTag(kAllowTag) ^
|
||||
cc->InputSidePackets().HasTag(kDisallowTag));
|
||||
|
||||
if (cc->InputSidePackets().HasTag("ALLOW")) {
|
||||
cc->InputSidePackets().Tag("ALLOW").Set<bool>();
|
||||
if (cc->InputSidePackets().HasTag(kAllowTag)) {
|
||||
cc->InputSidePackets().Tag(kAllowTag).Set<bool>().Optional();
|
||||
} else {
|
||||
cc->InputSidePackets().Tag("DISALLOW").Set<bool>();
|
||||
cc->InputSidePackets().Tag(kDisallowTag).Set<bool>().Optional();
|
||||
}
|
||||
} else {
|
||||
RET_CHECK(cc->Inputs().HasTag("ALLOW") ^ cc->Inputs().HasTag("DISALLOW"));
|
||||
}
|
||||
if (input_via_stream) {
|
||||
RET_CHECK(!input_via_side_packet);
|
||||
RET_CHECK(cc->Inputs().HasTag(kAllowTag) ^
|
||||
cc->Inputs().HasTag(kDisallowTag));
|
||||
|
||||
if (cc->Inputs().HasTag("ALLOW")) {
|
||||
cc->Inputs().Tag("ALLOW").Set<bool>();
|
||||
if (cc->Inputs().HasTag(kAllowTag)) {
|
||||
cc->Inputs().Tag(kAllowTag).Set<bool>();
|
||||
} else {
|
||||
cc->Inputs().Tag("DISALLOW").Set<bool>();
|
||||
cc->Inputs().Tag(kDisallowTag).Set<bool>();
|
||||
}
|
||||
}
|
||||
return absl::OkStatus();
|
||||
|
@ -125,23 +134,22 @@ class GateCalculator : public CalculatorBase {
|
|||
cc->Outputs().Get("", i).SetSameAs(&cc->Inputs().Get("", i));
|
||||
}
|
||||
|
||||
if (cc->Outputs().HasTag("STATE_CHANGE")) {
|
||||
cc->Outputs().Tag("STATE_CHANGE").Set<bool>();
|
||||
if (cc->Outputs().HasTag(kStateChangeTag)) {
|
||||
cc->Outputs().Tag(kStateChangeTag).Set<bool>();
|
||||
}
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Open(CalculatorContext* cc) final {
|
||||
use_side_packet_for_allow_disallow_ = false;
|
||||
if (cc->InputSidePackets().HasTag("ALLOW")) {
|
||||
if (cc->InputSidePackets().HasTag(kAllowTag)) {
|
||||
use_side_packet_for_allow_disallow_ = true;
|
||||
allow_by_side_packet_decision_ =
|
||||
cc->InputSidePackets().Tag("ALLOW").Get<bool>();
|
||||
} else if (cc->InputSidePackets().HasTag("DISALLOW")) {
|
||||
cc->InputSidePackets().Tag(kAllowTag).Get<bool>();
|
||||
} else if (cc->InputSidePackets().HasTag(kDisallowTag)) {
|
||||
use_side_packet_for_allow_disallow_ = true;
|
||||
allow_by_side_packet_decision_ =
|
||||
!cc->InputSidePackets().Tag("DISALLOW").Get<bool>();
|
||||
!cc->InputSidePackets().Tag(kDisallowTag).Get<bool>();
|
||||
}
|
||||
|
||||
cc->SetOffset(TimestampDiff(0));
|
||||
|
@ -152,26 +160,34 @@ class GateCalculator : public CalculatorBase {
|
|||
const auto& options = cc->Options<::mediapipe::GateCalculatorOptions>();
|
||||
empty_packets_as_allow_ = options.empty_packets_as_allow();
|
||||
|
||||
if (!use_side_packet_for_allow_disallow_ &&
|
||||
!cc->Inputs().HasTag(kAllowTag) && !cc->Inputs().HasTag(kDisallowTag)) {
|
||||
use_option_for_allow_disallow_ = true;
|
||||
allow_by_option_decision_ = options.allow();
|
||||
}
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Process(CalculatorContext* cc) final {
|
||||
bool allow = empty_packets_as_allow_;
|
||||
if (use_side_packet_for_allow_disallow_) {
|
||||
if (use_option_for_allow_disallow_) {
|
||||
allow = allow_by_option_decision_;
|
||||
} else if (use_side_packet_for_allow_disallow_) {
|
||||
allow = allow_by_side_packet_decision_;
|
||||
} else {
|
||||
if (cc->Inputs().HasTag("ALLOW") &&
|
||||
!cc->Inputs().Tag("ALLOW").IsEmpty()) {
|
||||
allow = cc->Inputs().Tag("ALLOW").Get<bool>();
|
||||
if (cc->Inputs().HasTag(kAllowTag) &&
|
||||
!cc->Inputs().Tag(kAllowTag).IsEmpty()) {
|
||||
allow = cc->Inputs().Tag(kAllowTag).Get<bool>();
|
||||
}
|
||||
if (cc->Inputs().HasTag("DISALLOW") &&
|
||||
!cc->Inputs().Tag("DISALLOW").IsEmpty()) {
|
||||
allow = !cc->Inputs().Tag("DISALLOW").Get<bool>();
|
||||
if (cc->Inputs().HasTag(kDisallowTag) &&
|
||||
!cc->Inputs().Tag(kDisallowTag).IsEmpty()) {
|
||||
allow = !cc->Inputs().Tag(kDisallowTag).Get<bool>();
|
||||
}
|
||||
}
|
||||
const GateState new_gate_state = allow ? GATE_ALLOW : GATE_DISALLOW;
|
||||
|
||||
if (cc->Outputs().HasTag("STATE_CHANGE")) {
|
||||
if (cc->Outputs().HasTag(kStateChangeTag)) {
|
||||
if (last_gate_state_ != GATE_UNINITIALIZED &&
|
||||
last_gate_state_ != new_gate_state) {
|
||||
VLOG(2) << "State transition in " << cc->NodeName() << " @ "
|
||||
|
@ -179,7 +195,7 @@ class GateCalculator : public CalculatorBase {
|
|||
<< ToString(last_gate_state_) << " to "
|
||||
<< ToString(new_gate_state);
|
||||
cc->Outputs()
|
||||
.Tag("STATE_CHANGE")
|
||||
.Tag(kStateChangeTag)
|
||||
.AddPacket(MakePacket<bool>(allow).At(cc->InputTimestamp()));
|
||||
}
|
||||
}
|
||||
|
@ -211,8 +227,10 @@ class GateCalculator : public CalculatorBase {
|
|||
GateState last_gate_state_ = GATE_UNINITIALIZED;
|
||||
int num_data_streams_;
|
||||
bool empty_packets_as_allow_;
|
||||
bool use_side_packet_for_allow_disallow_;
|
||||
bool use_side_packet_for_allow_disallow_ = false;
|
||||
bool allow_by_side_packet_decision_;
|
||||
bool use_option_for_allow_disallow_ = false;
|
||||
bool allow_by_option_decision_;
|
||||
};
|
||||
REGISTER_CALCULATOR(GateCalculator);
|
||||
|
||||
|
|
|
@@ -29,4 +29,8 @@ message GateCalculatorOptions {
   // disallowing the corresponding packets in the data input streams. Setting
   // this option to true inverts that, allowing the data packets to go through.
   optional bool empty_packets_as_allow = 1;
+
+  // Whether to allow or disallow the input streams to pass when no
+  // ALLOW/DISALLOW input or side input is specified.
+  optional bool allow = 2 [default = false];
 }
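The new allow field gives GateCalculator a third configuration mode: when neither an ALLOW/DISALLOW stream nor a side packet is wired in, the gate is statically open (allow: true) or closed (the default). A minimal sketch of exercising that mode with CalculatorRunner, mirroring the AllowByALLOWOptionToTrue test added further below; the stream names and payload are illustrative:

#include "mediapipe/framework/calculator_runner.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"

// Sketch: with allow:true and no ALLOW/DISALLOW stream or side packet, the
// gate stays open and simply forwards its data stream.
absl::Status RunAlwaysOpenGate() {
  mediapipe::CalculatorRunner runner(
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig::Node>(
          R"pb(
            calculator: "GateCalculator"
            input_stream: "test_input"
            output_stream: "test_output"
            options: {
              [mediapipe.GateCalculatorOptions.ext] { allow: true }
            }
          )pb"));
  runner.MutableInputs()->Index(0).packets.push_back(
      mediapipe::MakePacket<int>(42).At(mediapipe::Timestamp(0)));
  MP_RETURN_IF_ERROR(runner.Run());
  // runner.Outputs().Index(0).packets now holds the forwarded packet.
  return absl::OkStatus();
}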
|
|
@ -22,6 +22,9 @@ namespace mediapipe {
|
|||
|
||||
namespace {
|
||||
|
||||
constexpr char kDisallowTag[] = "DISALLOW";
|
||||
constexpr char kAllowTag[] = "ALLOW";
|
||||
|
||||
class GateCalculatorTest : public ::testing::Test {
|
||||
protected:
|
||||
// Helper to run a graph and return status.
|
||||
|
@ -110,6 +113,68 @@ TEST_F(GateCalculatorTest, InvalidInputs) {
|
|||
)")));
|
||||
}
|
||||
|
||||
TEST_F(GateCalculatorTest, AllowByALLOWOptionToTrue) {
|
||||
SetRunner(R"(
|
||||
calculator: "GateCalculator"
|
||||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
options: {
|
||||
[mediapipe.GateCalculatorOptions.ext] {
|
||||
allow: true
|
||||
}
|
||||
}
|
||||
)");
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
constexpr int64 kTimestampValue1 = 43;
|
||||
RunTimeStep(kTimestampValue1, false);
|
||||
|
||||
const std::vector<Packet>& output = runner()->Outputs().Get("", 0).packets;
|
||||
ASSERT_EQ(2, output.size());
|
||||
EXPECT_EQ(kTimestampValue0, output[0].Timestamp().Value());
|
||||
EXPECT_EQ(kTimestampValue1, output[1].Timestamp().Value());
|
||||
EXPECT_EQ(true, output[0].Get<bool>());
|
||||
EXPECT_EQ(false, output[1].Get<bool>());
|
||||
}
|
||||
|
||||
TEST_F(GateCalculatorTest, DisallowByALLOWOptionSetToFalse) {
|
||||
SetRunner(R"(
|
||||
calculator: "GateCalculator"
|
||||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
options: {
|
||||
[mediapipe.GateCalculatorOptions.ext] {
|
||||
allow: false
|
||||
}
|
||||
}
|
||||
)");
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
constexpr int64 kTimestampValue1 = 43;
|
||||
RunTimeStep(kTimestampValue1, false);
|
||||
|
||||
const std::vector<Packet>& output = runner()->Outputs().Get("", 0).packets;
|
||||
ASSERT_EQ(0, output.size());
|
||||
}
|
||||
|
||||
TEST_F(GateCalculatorTest, DisallowByALLOWOptionNotSet) {
|
||||
SetRunner(R"(
|
||||
calculator: "GateCalculator"
|
||||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
)");
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
constexpr int64 kTimestampValue1 = 43;
|
||||
RunTimeStep(kTimestampValue1, false);
|
||||
|
||||
const std::vector<Packet>& output = runner()->Outputs().Get("", 0).packets;
|
||||
ASSERT_EQ(0, output.size());
|
||||
}
|
||||
|
||||
TEST_F(GateCalculatorTest, AllowByALLOWSidePacketSetToTrue) {
|
||||
SetRunner(R"(
|
||||
calculator: "GateCalculator"
|
||||
|
@ -117,7 +182,7 @@ TEST_F(GateCalculatorTest, AllowByALLOWSidePacketSetToTrue) {
|
|||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
)");
|
||||
runner()->MutableSidePackets()->Tag("ALLOW") = Adopt(new bool(true));
|
||||
runner()->MutableSidePackets()->Tag(kAllowTag) = Adopt(new bool(true));
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
|
@ -139,7 +204,7 @@ TEST_F(GateCalculatorTest, AllowByDisallowSidePacketSetToFalse) {
|
|||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
)");
|
||||
runner()->MutableSidePackets()->Tag("DISALLOW") = Adopt(new bool(false));
|
||||
runner()->MutableSidePackets()->Tag(kDisallowTag) = Adopt(new bool(false));
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
|
@ -161,7 +226,7 @@ TEST_F(GateCalculatorTest, DisallowByALLOWSidePacketSetToFalse) {
|
|||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
)");
|
||||
runner()->MutableSidePackets()->Tag("ALLOW") = Adopt(new bool(false));
|
||||
runner()->MutableSidePackets()->Tag(kAllowTag) = Adopt(new bool(false));
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
|
@ -179,7 +244,7 @@ TEST_F(GateCalculatorTest, DisallowByDISALLOWSidePacketSetToTrue) {
|
|||
input_stream: "test_input"
|
||||
output_stream: "test_output"
|
||||
)");
|
||||
runner()->MutableSidePackets()->Tag("DISALLOW") = Adopt(new bool(true));
|
||||
runner()->MutableSidePackets()->Tag(kDisallowTag) = Adopt(new bool(true));
|
||||
|
||||
constexpr int64 kTimestampValue0 = 42;
|
||||
RunTimeStep(kTimestampValue0, true);
|
||||
|
|
70
mediapipe/calculators/core/graph_profile_calculator.cc
Normal file
@@ -0,0 +1,70 @@
// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <memory>

#include "mediapipe/calculators/core/graph_profile_calculator.pb.h"
#include "mediapipe/framework/api2/node.h"
#include "mediapipe/framework/api2/packet.h"
#include "mediapipe/framework/api2/port.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/calculator_profile.pb.h"
#include "mediapipe/framework/port/ret_check.h"
#include "mediapipe/framework/port/status.h"

namespace mediapipe {
namespace api2 {

// This calculator periodically copies the GraphProfile from
// mediapipe::GraphProfiler::CaptureProfile to the "PROFILE" output stream.
//
// Example config:
// node {
//   calculator: "GraphProfileCalculator"
//   output_stream: "FRAME:any_frame"
//   output_stream: "PROFILE:graph_profile"
// }
//
class GraphProfileCalculator : public Node {
 public:
  static constexpr Input<AnyType>::Multiple kFrameIn{"FRAME"};
  static constexpr Output<GraphProfile> kProfileOut{"PROFILE"};

  MEDIAPIPE_NODE_CONTRACT(kFrameIn, kProfileOut);

  static absl::Status UpdateContract(CalculatorContract* cc) {
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) final {
    auto options = cc->Options<::mediapipe::GraphProfileCalculatorOptions>();

    if (prev_profile_ts_ == Timestamp::Unset() ||
        cc->InputTimestamp() - prev_profile_ts_ >= options.profile_interval()) {
      prev_profile_ts_ = cc->InputTimestamp();
      GraphProfile result;
      MP_RETURN_IF_ERROR(cc->GetProfilingContext()->CaptureProfile(&result));
      kProfileOut(cc).Send(result);
    }
    return absl::OkStatus();
  }

 private:
  Timestamp prev_profile_ts_;
};

MEDIAPIPE_REGISTER_NODE(GraphProfileCalculator);

}  // namespace api2
}  // namespace mediapipe
30
mediapipe/calculators/core/graph_profile_calculator.proto
Normal file
@@ -0,0 +1,30 @@
// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

syntax = "proto2";

package mediapipe;

import "mediapipe/framework/calculator.proto";

option objc_class_prefix = "MediaPipe";

message GraphProfileCalculatorOptions {
  extend mediapipe.CalculatorOptions {
    optional GraphProfileCalculatorOptions ext = 367481815;
  }

  // The interval in microseconds between successive reported GraphProfiles.
  optional int64 profile_interval = 1 [default = 1000000];
}
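Taken together, the calculator and its options mean a graph can self-report a GraphProfile at most once per profile_interval microseconds of input timestamps. A hedged usage sketch, assuming an application that polls the PROFILE stream; the 500 ms interval and the stream names are illustrative, and the profiler must be enabled for CaptureProfile to return useful data:

#include <utility>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/calculator_profile.pb.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"

// Sketch: attach GraphProfileCalculator to an existing stream and poll the
// PROFILE output from application code.
absl::Status PollGraphProfiles() {
  auto config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
        input_stream: "frame"
        profiler_config { enable_profiler: true }
        node {
          calculator: "GraphProfileCalculator"
          input_stream: "FRAME:frame"
          output_stream: "PROFILE:profile"
          options {
            [mediapipe.GraphProfileCalculatorOptions.ext] {
              profile_interval: 500000
            }
          }
        }
      )pb");
  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));
  auto poller_or = graph.AddOutputStreamPoller("profile");
  MP_RETURN_IF_ERROR(poller_or.status());
  mediapipe::OutputStreamPoller poller = std::move(poller_or).value();
  MP_RETURN_IF_ERROR(graph.StartRun({}));
  // ... feed packets into "frame"; each poller.Next(&packet) then yields a
  // GraphProfile roughly every 0.5 s of input timestamps.
  MP_RETURN_IF_ERROR(graph.CloseInputStream("frame"));
  return graph.WaitUntilDone();
}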
|
211
mediapipe/calculators/core/graph_profile_calculator_test.cc
Normal file
|
@ -0,0 +1,211 @@
|
|||
// Copyright 2019 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#include <memory>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
#include "absl/status/status.h"
|
||||
#include "absl/strings/str_cat.h"
|
||||
#include "absl/time/time.h"
|
||||
#include "mediapipe/framework/calculator.pb.h"
|
||||
#include "mediapipe/framework/calculator_framework.h"
|
||||
#include "mediapipe/framework/calculator_profile.pb.h"
|
||||
#include "mediapipe/framework/deps/clock.h"
|
||||
#include "mediapipe/framework/deps/message_matchers.h"
|
||||
#include "mediapipe/framework/port/gmock.h"
|
||||
#include "mediapipe/framework/port/gtest.h"
|
||||
#include "mediapipe/framework/port/integral_types.h"
|
||||
#include "mediapipe/framework/port/logging.h"
|
||||
#include "mediapipe/framework/port/parse_text_proto.h"
|
||||
#include "mediapipe/framework/port/proto_ns.h"
|
||||
#include "mediapipe/framework/port/status_matchers.h"
|
||||
#include "mediapipe/framework/port/threadpool.h"
|
||||
#include "mediapipe/framework/tool/simulation_clock_executor.h"
|
||||
|
||||
// Tests for GraphProfileCalculator.
|
||||
using testing::ElementsAre;
|
||||
|
||||
namespace mediapipe {
|
||||
namespace {
|
||||
|
||||
constexpr char kClockTag[] = "CLOCK";
|
||||
|
||||
using mediapipe::Clock;
|
||||
|
||||
// A Calculator with a fixed Process call latency.
|
||||
class SleepCalculator : public CalculatorBase {
|
||||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->InputSidePackets().Tag(kClockTag).Set<std::shared_ptr<Clock>>();
|
||||
cc->Inputs().Index(0).SetAny();
|
||||
cc->Outputs().Index(0).SetSameAs(&cc->Inputs().Index(0));
|
||||
cc->SetTimestampOffset(TimestampDiff(0));
|
||||
return absl::OkStatus();
|
||||
}
|
||||
absl::Status Open(CalculatorContext* cc) final {
|
||||
clock_ =
|
||||
cc->InputSidePackets().Tag(kClockTag).Get<std::shared_ptr<Clock>>();
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Process(CalculatorContext* cc) final {
|
||||
clock_->Sleep(absl::Milliseconds(5));
|
||||
cc->Outputs().Index(0).AddPacket(cc->Inputs().Index(0).Value());
|
||||
return absl::OkStatus();
|
||||
}
|
||||
std::shared_ptr<::mediapipe::Clock> clock_ = nullptr;
|
||||
};
|
||||
REGISTER_CALCULATOR(SleepCalculator);
|
||||
|
||||
// Tests showing GraphProfileCalculator reporting GraphProfile output packets.
|
||||
class GraphProfileCalculatorTest : public ::testing::Test {
|
||||
protected:
|
||||
void SetUpProfileGraph() {
|
||||
ASSERT_TRUE(proto_ns::TextFormat::ParseFromString(R"(
|
||||
input_stream: "input_packets_0"
|
||||
node {
|
||||
calculator: 'SleepCalculator'
|
||||
input_side_packet: 'CLOCK:sync_clock'
|
||||
input_stream: 'input_packets_0'
|
||||
output_stream: 'output_packets_1'
|
||||
}
|
||||
node {
|
||||
calculator: "GraphProfileCalculator"
|
||||
options: {
|
||||
[mediapipe.GraphProfileCalculatorOptions.ext]: {
|
||||
profile_interval: 25000
|
||||
}
|
||||
}
|
||||
input_stream: "FRAME:output_packets_1"
|
||||
output_stream: "PROFILE:output_packets_0"
|
||||
}
|
||||
)",
|
||||
&graph_config_));
|
||||
}
|
||||
|
||||
static Packet PacketAt(int64 ts) {
|
||||
return Adopt(new int64(999)).At(Timestamp(ts));
|
||||
}
|
||||
static Packet None() { return Packet().At(Timestamp::OneOverPostStream()); }
|
||||
static bool IsNone(const Packet& packet) {
|
||||
return packet.Timestamp() == Timestamp::OneOverPostStream();
|
||||
}
|
||||
// Return the values of the timestamps of a vector of Packets.
|
||||
static std::vector<int64> TimestampValues(
|
||||
const std::vector<Packet>& packets) {
|
||||
std::vector<int64> result;
|
||||
for (const Packet& p : packets) {
|
||||
result.push_back(p.Timestamp().Value());
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
// Runs a CalculatorGraph with a series of packet sets.
|
||||
// Returns a vector of packets from each graph output stream.
|
||||
void RunGraph(const std::vector<std::vector<Packet>>& input_sets,
|
||||
std::vector<Packet>* output_packets) {
|
||||
// Register output packet observers.
|
||||
tool::AddVectorSink("output_packets_0", &graph_config_, output_packets);
|
||||
|
||||
// Start running the graph.
|
||||
std::shared_ptr<SimulationClockExecutor> executor(
|
||||
new SimulationClockExecutor(3 /*num_threads*/));
|
||||
CalculatorGraph graph;
|
||||
MP_ASSERT_OK(graph.SetExecutor("", executor));
|
||||
graph.profiler()->SetClock(executor->GetClock());
|
||||
MP_ASSERT_OK(graph.Initialize(graph_config_));
|
||||
executor->GetClock()->ThreadStart();
|
||||
MP_ASSERT_OK(graph.StartRun({
|
||||
{"sync_clock",
|
||||
Adopt(new std::shared_ptr<::mediapipe::Clock>(executor->GetClock()))},
|
||||
}));
|
||||
|
||||
// Send each packet to the graph in the specified order.
|
||||
for (int t = 0; t < input_sets.size(); t++) {
|
||||
const std::vector<Packet>& input_set = input_sets[t];
|
||||
for (int i = 0; i < input_set.size(); i++) {
|
||||
const Packet& packet = input_set[i];
|
||||
if (!IsNone(packet)) {
|
||||
MP_EXPECT_OK(graph.AddPacketToInputStream(
|
||||
absl::StrCat("input_packets_", i), packet));
|
||||
}
|
||||
executor->GetClock()->Sleep(absl::Milliseconds(10));
|
||||
}
|
||||
}
|
||||
MP_ASSERT_OK(graph.CloseAllInputStreams());
|
||||
executor->GetClock()->Sleep(absl::Milliseconds(100));
|
||||
executor->GetClock()->ThreadFinish();
|
||||
MP_ASSERT_OK(graph.WaitUntilDone());
|
||||
}
|
||||
|
||||
CalculatorGraphConfig graph_config_;
|
||||
};
|
||||
|
||||
TEST_F(GraphProfileCalculatorTest, GraphProfile) {
|
||||
SetUpProfileGraph();
|
||||
auto profiler_config = graph_config_.mutable_profiler_config();
|
||||
profiler_config->set_enable_profiler(true);
|
||||
profiler_config->set_trace_enabled(false);
|
||||
profiler_config->set_trace_log_disabled(true);
|
||||
profiler_config->set_enable_stream_latency(true);
|
||||
profiler_config->set_calculator_filter(".*Calculator");
|
||||
|
||||
// Run the graph with a series of packet sets.
|
||||
std::vector<std::vector<Packet>> input_sets = {
|
||||
{PacketAt(10000)}, //
|
||||
{PacketAt(20000)}, //
|
||||
{PacketAt(30000)}, //
|
||||
{PacketAt(40000)},
|
||||
};
|
||||
std::vector<Packet> output_packets;
|
||||
RunGraph(input_sets, &output_packets);
|
||||
|
||||
// Validate the output packets.
|
||||
EXPECT_THAT(TimestampValues(output_packets), //
|
||||
ElementsAre(10000, 40000));
|
||||
|
||||
GraphProfile expected_profile =
|
||||
mediapipe::ParseTextProtoOrDie<GraphProfile>(R"pb(
|
||||
calculator_profiles {
|
||||
name: "GraphProfileCalculator"
|
||||
open_runtime: 0
|
||||
process_runtime { total: 0 count: 3 }
|
||||
process_input_latency { total: 15000 count: 3 }
|
||||
process_output_latency { total: 15000 count: 3 }
|
||||
input_stream_profiles {
|
||||
name: "output_packets_1"
|
||||
back_edge: false
|
||||
latency { total: 0 count: 3 }
|
||||
}
|
||||
}
|
||||
calculator_profiles {
|
||||
name: "SleepCalculator"
|
||||
open_runtime: 0
|
||||
process_runtime { total: 15000 count: 3 }
|
||||
process_input_latency { total: 0 count: 3 }
|
||||
process_output_latency { total: 15000 count: 3 }
|
||||
input_stream_profiles {
|
||||
name: "input_packets_0"
|
||||
back_edge: false
|
||||
latency { total: 0 count: 3 }
|
||||
}
|
||||
})pb");
|
||||
|
||||
EXPECT_THAT(output_packets[1].Get<GraphProfile>(),
|
||||
mediapipe::EqualsProto(expected_profile));
|
||||
}
|
||||
|
||||
} // namespace
|
||||
} // namespace mediapipe
|
70
mediapipe/calculators/core/make_pair_calculator_test.cc
Normal file
|
@ -0,0 +1,70 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#include "mediapipe/framework/calculator_framework.h"
|
||||
#include "mediapipe/framework/calculator_runner.h"
|
||||
#include "mediapipe/framework/port/canonical_errors.h"
|
||||
#include "mediapipe/framework/port/gmock.h"
|
||||
#include "mediapipe/framework/port/gtest.h"
|
||||
#include "mediapipe/framework/port/status.h"
|
||||
#include "mediapipe/framework/port/status_matchers.h"
|
||||
#include "mediapipe/framework/timestamp.h"
|
||||
#include "mediapipe/framework/tool/validate_type.h"
|
||||
#include "mediapipe/util/packet_test_util.h"
|
||||
#include "mediapipe/util/time_series_test_util.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
class MakePairCalculatorTest
|
||||
: public mediapipe::TimeSeriesCalculatorTest<mediapipe::NoOptions> {
|
||||
protected:
|
||||
void SetUp() override {
|
||||
calculator_name_ = "MakePairCalculator";
|
||||
num_input_streams_ = 2;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_F(MakePairCalculatorTest, ProducesExpectedPairs) {
|
||||
InitializeGraph();
|
||||
AppendInputPacket(new std::string("first packet"), Timestamp(1),
|
||||
/* input_index= */ 0);
|
||||
AppendInputPacket(new std::string("second packet"), Timestamp(5),
|
||||
/* input_index= */ 0);
|
||||
AppendInputPacket(new int(10), Timestamp(1), /* input_index= */ 1);
|
||||
AppendInputPacket(new int(20), Timestamp(5), /* input_index= */ 1);
|
||||
|
||||
MP_ASSERT_OK(RunGraph());
|
||||
|
||||
EXPECT_THAT(
|
||||
output().packets,
|
||||
::testing::ElementsAre(
|
||||
mediapipe::PacketContainsTimestampAndPayload<
|
||||
std::pair<Packet, Packet>>(
|
||||
Timestamp(1),
|
||||
::testing::Pair(
|
||||
mediapipe::PacketContainsTimestampAndPayload<std::string>(
|
||||
Timestamp(1), std::string("first packet")),
|
||||
mediapipe::PacketContainsTimestampAndPayload<int>(
|
||||
Timestamp(1), 10))),
|
||||
mediapipe::PacketContainsTimestampAndPayload<
|
||||
std::pair<Packet, Packet>>(
|
||||
Timestamp(5),
|
||||
::testing::Pair(
|
||||
mediapipe::PacketContainsTimestampAndPayload<std::string>(
|
||||
Timestamp(5), std::string("second packet")),
|
||||
mediapipe::PacketContainsTimestampAndPayload<int>(
|
||||
Timestamp(5), 20)))));
|
||||
}
|
||||
|
||||
} // namespace mediapipe
|
|
@ -29,6 +29,9 @@
|
|||
namespace mediapipe {
|
||||
namespace {
|
||||
|
||||
constexpr char kMinuendTag[] = "MINUEND";
|
||||
constexpr char kSubtrahendTag[] = "SUBTRAHEND";
|
||||
|
||||
// A 3x4 Matrix of random integers in [0,1000).
|
||||
const char kMatrixText[] =
|
||||
"rows: 3\n"
|
||||
|
@ -104,12 +107,13 @@ TEST(MatrixSubtractCalculatorTest, SubtractFromInput) {
|
|||
CalculatorRunner runner(node_config);
|
||||
Matrix* side_matrix = new Matrix();
|
||||
MatrixFromTextProto(kMatrixText, side_matrix);
|
||||
runner.MutableSidePackets()->Tag("SUBTRAHEND") = Adopt(side_matrix);
|
||||
runner.MutableSidePackets()->Tag(kSubtrahendTag) = Adopt(side_matrix);
|
||||
|
||||
Matrix* input_matrix = new Matrix();
|
||||
MatrixFromTextProto(kMatrixText2, input_matrix);
|
||||
runner.MutableInputs()->Tag("MINUEND").packets.push_back(
|
||||
Adopt(input_matrix).At(Timestamp(0)));
|
||||
runner.MutableInputs()
|
||||
->Tag(kMinuendTag)
|
||||
.packets.push_back(Adopt(input_matrix).At(Timestamp(0)));
|
||||
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
EXPECT_EQ(1, runner.Outputs().Index(0).packets.size());
|
||||
|
@ -133,12 +137,12 @@ TEST(MatrixSubtractCalculatorTest, SubtractFromSideMatrix) {
|
|||
CalculatorRunner runner(node_config);
|
||||
Matrix* side_matrix = new Matrix();
|
||||
MatrixFromTextProto(kMatrixText, side_matrix);
|
||||
runner.MutableSidePackets()->Tag("MINUEND") = Adopt(side_matrix);
|
||||
runner.MutableSidePackets()->Tag(kMinuendTag) = Adopt(side_matrix);
|
||||
|
||||
Matrix* input_matrix = new Matrix();
|
||||
MatrixFromTextProto(kMatrixText2, input_matrix);
|
||||
runner.MutableInputs()
|
||||
->Tag("SUBTRAHEND")
|
||||
->Tag(kSubtrahendTag)
|
||||
.packets.push_back(Adopt(input_matrix).At(Timestamp(0)));
|
||||
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
|
|
|
@@ -17,6 +17,9 @@

 namespace mediapipe {

+constexpr char kPresenceTag[] = "PRESENCE";
+constexpr char kPacketTag[] = "PACKET";
+
 // For each non empty input packet, emits a single output packet containing a
 // boolean value "true", "false" in response to empty packets (a.k.a. timestamp
 // bound updates) This can be used to "flag" the presence of an arbitrary packet

@@ -58,8 +61,8 @@ namespace mediapipe {
 class PacketPresenceCalculator : public CalculatorBase {
  public:
   static absl::Status GetContract(CalculatorContract* cc) {
-    cc->Inputs().Tag("PACKET").SetAny();
-    cc->Outputs().Tag("PRESENCE").Set<bool>();
+    cc->Inputs().Tag(kPacketTag).SetAny();
+    cc->Outputs().Tag(kPresenceTag).Set<bool>();
     // Process() function is invoked in response to input stream timestamp
     // bound updates.
     cc->SetProcessTimestampBounds(true);

@@ -73,8 +76,8 @@ class PacketPresenceCalculator : public CalculatorBase {

   absl::Status Process(CalculatorContext* cc) final {
     cc->Outputs()
-        .Tag("PRESENCE")
-        .AddPacket(MakePacket<bool>(!cc->Inputs().Tag("PACKET").IsEmpty())
+        .Tag(kPresenceTag)
+        .AddPacket(MakePacket<bool>(!cc->Inputs().Tag(kPacketTag).IsEmpty())
                        .At(cc->InputTimestamp()));
     return absl::OkStatus();
   }
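Because SetProcessTimestampBounds(true) makes Process() fire on timestamp-bound updates as well, PRESENCE carries an explicit false whenever the watched stream is empty at a timestamp, which is what makes it usable as a gate signal. A sketch of that pairing, with illustrative stream names:

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/port/parse_text_proto.h"

// Sketch: forward "image" packets only at timestamps where "detections"
// actually carried a packet.
mediapipe::CalculatorGraphConfig PresenceGatedConfig() {
  return mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
    input_stream: "detections"
    input_stream: "image"
    node {
      calculator: "PacketPresenceCalculator"
      input_stream: "PACKET:detections"
      output_stream: "PRESENCE:detections_present"
    }
    node {
      calculator: "GateCalculator"
      input_stream: "image"
      input_stream: "ALLOW:detections_present"
      output_stream: "gated_image"
    }
  )pb");
}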
|
|
@ -39,6 +39,11 @@ namespace mediapipe {
|
|||
|
||||
REGISTER_CALCULATOR(PacketResamplerCalculator);
|
||||
namespace {
|
||||
|
||||
constexpr char kSeedTag[] = "SEED";
|
||||
constexpr char kVideoHeaderTag[] = "VIDEO_HEADER";
|
||||
constexpr char kOptionsTag[] = "OPTIONS";
|
||||
|
||||
// Returns a TimestampDiff (assuming microseconds) corresponding to the
|
||||
// given time in seconds.
|
||||
TimestampDiff TimestampDiffFromSeconds(double seconds) {
|
||||
|
@ -50,16 +55,16 @@ TimestampDiff TimestampDiffFromSeconds(double seconds) {
|
|||
absl::Status PacketResamplerCalculator::GetContract(CalculatorContract* cc) {
|
||||
const auto& resampler_options =
|
||||
cc->Options<PacketResamplerCalculatorOptions>();
|
||||
if (cc->InputSidePackets().HasTag("OPTIONS")) {
|
||||
cc->InputSidePackets().Tag("OPTIONS").Set<CalculatorOptions>();
|
||||
if (cc->InputSidePackets().HasTag(kOptionsTag)) {
|
||||
cc->InputSidePackets().Tag(kOptionsTag).Set<CalculatorOptions>();
|
||||
}
|
||||
CollectionItemId input_data_id = cc->Inputs().GetId("DATA", 0);
|
||||
if (!input_data_id.IsValid()) {
|
||||
input_data_id = cc->Inputs().GetId("", 0);
|
||||
}
|
||||
cc->Inputs().Get(input_data_id).SetAny();
|
||||
if (cc->Inputs().HasTag("VIDEO_HEADER")) {
|
||||
cc->Inputs().Tag("VIDEO_HEADER").Set<VideoHeader>();
|
||||
if (cc->Inputs().HasTag(kVideoHeaderTag)) {
|
||||
cc->Inputs().Tag(kVideoHeaderTag).Set<VideoHeader>();
|
||||
}
|
||||
|
||||
CollectionItemId output_data_id = cc->Outputs().GetId("DATA", 0);
|
||||
|
@ -67,15 +72,15 @@ absl::Status PacketResamplerCalculator::GetContract(CalculatorContract* cc) {
|
|||
output_data_id = cc->Outputs().GetId("", 0);
|
||||
}
|
||||
cc->Outputs().Get(output_data_id).SetSameAs(&cc->Inputs().Get(input_data_id));
|
||||
if (cc->Outputs().HasTag("VIDEO_HEADER")) {
|
||||
cc->Outputs().Tag("VIDEO_HEADER").Set<VideoHeader>();
|
||||
if (cc->Outputs().HasTag(kVideoHeaderTag)) {
|
||||
cc->Outputs().Tag(kVideoHeaderTag).Set<VideoHeader>();
|
||||
}
|
||||
|
||||
if (resampler_options.jitter() != 0.0) {
|
||||
RET_CHECK_GT(resampler_options.jitter(), 0.0);
|
||||
RET_CHECK_LE(resampler_options.jitter(), 1.0);
|
||||
RET_CHECK(cc->InputSidePackets().HasTag("SEED"));
|
||||
cc->InputSidePackets().Tag("SEED").Set<std::string>();
|
||||
RET_CHECK(cc->InputSidePackets().HasTag(kSeedTag));
|
||||
cc->InputSidePackets().Tag(kSeedTag).Set<std::string>();
|
||||
}
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
@ -143,9 +148,9 @@ absl::Status PacketResamplerCalculator::Open(CalculatorContext* cc) {
|
|||
|
||||
absl::Status PacketResamplerCalculator::Process(CalculatorContext* cc) {
|
||||
if (cc->InputTimestamp() == Timestamp::PreStream() &&
|
||||
cc->Inputs().UsesTags() && cc->Inputs().HasTag("VIDEO_HEADER") &&
|
||||
!cc->Inputs().Tag("VIDEO_HEADER").IsEmpty()) {
|
||||
video_header_ = cc->Inputs().Tag("VIDEO_HEADER").Get<VideoHeader>();
|
||||
cc->Inputs().UsesTags() && cc->Inputs().HasTag(kVideoHeaderTag) &&
|
||||
!cc->Inputs().Tag(kVideoHeaderTag).IsEmpty()) {
|
||||
video_header_ = cc->Inputs().Tag(kVideoHeaderTag).Get<VideoHeader>();
|
||||
video_header_.frame_rate = frame_rate_;
|
||||
if (cc->Inputs().Get(input_data_id_).IsEmpty()) {
|
||||
return absl::OkStatus();
|
||||
|
@ -234,7 +239,7 @@ absl::Status LegacyJitterWithReflectionStrategy::Open(CalculatorContext* cc) {
|
|||
"ignored, because we are adding jitter.";
|
||||
}
|
||||
|
||||
const auto& seed = cc->InputSidePackets().Tag("SEED").Get<std::string>();
|
||||
const auto& seed = cc->InputSidePackets().Tag(kSeedTag).Get<std::string>();
|
||||
random_ = CreateSecureRandom(seed);
|
||||
if (random_ == nullptr) {
|
||||
return absl::InvalidArgumentError(
|
||||
|
@ -357,7 +362,7 @@ absl::Status ReproducibleJitterWithReflectionStrategy::Open(
|
|||
"ignored, because we are adding jitter.";
|
||||
}
|
||||
|
||||
const auto& seed = cc->InputSidePackets().Tag("SEED").Get<std::string>();
|
||||
const auto& seed = cc->InputSidePackets().Tag(kSeedTag).Get<std::string>();
|
||||
random_ = CreateSecureRandom(seed);
|
||||
if (random_ == nullptr) {
|
||||
return absl::InvalidArgumentError(
|
||||
|
@ -504,7 +509,7 @@ absl::Status JitterWithoutReflectionStrategy::Open(CalculatorContext* cc) {
|
|||
"ignored, because we are adding jitter.";
|
||||
}
|
||||
|
||||
const auto& seed = cc->InputSidePackets().Tag("SEED").Get<std::string>();
|
||||
const auto& seed = cc->InputSidePackets().Tag(kSeedTag).Get<std::string>();
|
||||
random_ = CreateSecureRandom(seed);
|
||||
if (random_ == nullptr) {
|
||||
return absl::InvalidArgumentError(
|
||||
|
@ -635,9 +640,9 @@ absl::Status NoJitterStrategy::Process(CalculatorContext* cc) {
|
|||
base_timestamp_ +
|
||||
TimestampDiffFromSeconds(first_index / calculator_->frame_rate_);
|
||||
}
|
||||
if (cc->Outputs().UsesTags() && cc->Outputs().HasTag("VIDEO_HEADER")) {
|
||||
if (cc->Outputs().UsesTags() && cc->Outputs().HasTag(kVideoHeaderTag)) {
|
||||
cc->Outputs()
|
||||
.Tag("VIDEO_HEADER")
|
||||
.Tag(kVideoHeaderTag)
|
||||
.Add(new VideoHeader(calculator_->video_header_),
|
||||
Timestamp::PreStream());
|
||||
}
|
||||
|
|
|
@ -32,6 +32,12 @@ namespace mediapipe {
|
|||
|
||||
using ::testing::ElementsAre;
|
||||
namespace {
|
||||
|
||||
constexpr char kOptionsTag[] = "OPTIONS";
|
||||
constexpr char kSeedTag[] = "SEED";
|
||||
constexpr char kVideoHeaderTag[] = "VIDEO_HEADER";
|
||||
constexpr char kDataTag[] = "DATA";
|
||||
|
||||
// A simple version of CalculatorRunner with built-in convenience
|
||||
// methods for setting inputs from a vector and checking outputs
|
||||
// against expected outputs (both timestamps and contents).
|
||||
|
@ -464,7 +470,7 @@ TEST(PacketResamplerCalculatorTest, SetVideoHeader) {
|
|||
)pb"));
|
||||
|
||||
for (const int64 ts : {0, 5000, 10010, 15001, 19990}) {
|
||||
runner.MutableInputs()->Tag("DATA").packets.push_back(
|
||||
runner.MutableInputs()->Tag(kDataTag).packets.push_back(
|
||||
Adopt(new std::string(absl::StrCat("Frame #", ts))).At(Timestamp(ts)));
|
||||
}
|
||||
VideoHeader video_header_in;
|
||||
|
@ -474,16 +480,16 @@ TEST(PacketResamplerCalculatorTest, SetVideoHeader) {
|
|||
video_header_in.duration = 1.0;
|
||||
video_header_in.format = ImageFormat::SRGB;
|
||||
runner.MutableInputs()
|
||||
->Tag("VIDEO_HEADER")
|
||||
->Tag(kVideoHeaderTag)
|
||||
.packets.push_back(
|
||||
Adopt(new VideoHeader(video_header_in)).At(Timestamp::PreStream()));
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
|
||||
ASSERT_EQ(1, runner.Outputs().Tag("VIDEO_HEADER").packets.size());
|
||||
ASSERT_EQ(1, runner.Outputs().Tag(kVideoHeaderTag).packets.size());
|
||||
EXPECT_EQ(Timestamp::PreStream(),
|
||||
runner.Outputs().Tag("VIDEO_HEADER").packets[0].Timestamp());
|
||||
runner.Outputs().Tag(kVideoHeaderTag).packets[0].Timestamp());
|
||||
const VideoHeader& video_header_out =
|
||||
runner.Outputs().Tag("VIDEO_HEADER").packets[0].Get<VideoHeader>();
|
||||
runner.Outputs().Tag(kVideoHeaderTag).packets[0].Get<VideoHeader>();
|
||||
EXPECT_EQ(video_header_in.width, video_header_out.width);
|
||||
EXPECT_EQ(video_header_in.height, video_header_out.height);
|
||||
EXPECT_DOUBLE_EQ(50.0, video_header_out.frame_rate);
|
||||
|
@ -725,7 +731,7 @@ TEST(PacketResamplerCalculatorTest, OptionsSidePacket) {
|
|||
[mediapipe.PacketResamplerCalculatorOptions.ext] {
|
||||
frame_rate: 30
|
||||
})pb"));
|
||||
runner.MutableSidePackets()->Tag("OPTIONS") = Adopt(options);
|
||||
runner.MutableSidePackets()->Tag(kOptionsTag) = Adopt(options);
|
||||
runner.SetInput({-222, 15000, 32000, 49999, 150000});
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
EXPECT_EQ(6, runner.Outputs().Index(0).packets.size());
|
||||
|
@ -740,7 +746,7 @@ TEST(PacketResamplerCalculatorTest, OptionsSidePacket) {
|
|||
frame_rate: 30
|
||||
base_timestamp: 0
|
||||
})pb"));
|
||||
runner.MutableSidePackets()->Tag("OPTIONS") = Adopt(options);
|
||||
runner.MutableSidePackets()->Tag(kOptionsTag) = Adopt(options);
|
||||
|
||||
runner.SetInput({-222, 15000, 32000, 49999, 150000});
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
|
|
|
@ -217,6 +217,7 @@ absl::Status PacketThinnerCalculator::Open(CalculatorContext* cc) {
|
|||
header->format = video_header.format;
|
||||
header->width = video_header.width;
|
||||
header->height = video_header.height;
|
||||
header->duration = video_header.duration;
|
||||
header->frame_rate = new_frame_rate;
|
||||
cc->Outputs().Index(0).SetHeader(Adopt(header.release()));
|
||||
} else {
|
||||
|
|
|
@ -29,6 +29,8 @@
|
|||
namespace mediapipe {
|
||||
namespace {
|
||||
|
||||
constexpr char kPeriodTag[] = "PERIOD";
|
||||
|
||||
// A simple version of CalculatorRunner with built-in convenience methods for
|
||||
// setting inputs from a vector and checking outputs against a vector of
|
||||
// expected outputs.
|
||||
|
@ -121,7 +123,7 @@ TEST(PacketThinnerCalculatorTest, ASyncUniformStreamThinningTestBySidePacket) {
|
|||
|
||||
SimpleRunner runner(node);
|
||||
runner.SetInput({2, 4, 6, 8, 10, 12, 14});
|
||||
runner.MutableSidePackets()->Tag("PERIOD") = MakePacket<int64>(5);
|
||||
runner.MutableSidePackets()->Tag(kPeriodTag) = MakePacket<int64>(5);
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
|
||||
const std::vector<int64> expected_timestamps = {2, 8, 14};
|
||||
|
@ -160,7 +162,7 @@ TEST(PacketThinnerCalculatorTest, SyncUniformStreamThinningTestBySidePacket1) {
|
|||
|
||||
SimpleRunner runner(node);
|
||||
runner.SetInput({2, 4, 6, 8, 10, 12, 14});
|
||||
runner.MutableSidePackets()->Tag("PERIOD") = MakePacket<int64>(5);
|
||||
runner.MutableSidePackets()->Tag(kPeriodTag) = MakePacket<int64>(5);
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
|
||||
const std::vector<int64> expected_timestamps = {2, 6, 10, 14};
|
||||
|
|
|
@ -39,6 +39,8 @@ using ::testing::Pair;
|
|||
using ::testing::Value;
|
||||
namespace {
|
||||
|
||||
constexpr char kDisallowTag[] = "DISALLOW";
|
||||
|
||||
// Returns the timestamp values for a vector of Packets.
|
||||
// TODO: puth this kind of test util in a common place.
|
||||
std::vector<int64> TimestampValues(const std::vector<Packet>& packets) {
|
||||
|
@ -702,14 +704,14 @@ class DroppingGateCalculator : public CalculatorBase {
|
|||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
cc->Inputs().Index(0).SetAny();
|
||||
cc->Inputs().Tag("DISALLOW").Set<bool>();
|
||||
cc->Inputs().Tag(kDisallowTag).Set<bool>();
|
||||
cc->Outputs().Index(0).SetSameAs(&cc->Inputs().Index(0));
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
absl::Status Process(CalculatorContext* cc) final {
|
||||
if (!cc->Inputs().Index(0).IsEmpty() &&
|
||||
!cc->Inputs().Tag("DISALLOW").Get<bool>()) {
|
||||
!cc->Inputs().Tag(kDisallowTag).Get<bool>()) {
|
||||
cc->Outputs().Index(0).AddPacket(cc->Inputs().Index(0).Value());
|
||||
}
|
||||
return absl::OkStatus();
|
||||
|
|
|
@@ -41,11 +41,14 @@
 // }
 namespace mediapipe {

+constexpr char kEncodedTag[] = "ENCODED";
+constexpr char kFloatVectorTag[] = "FLOAT_VECTOR";
+
 class QuantizeFloatVectorCalculator : public CalculatorBase {
  public:
   static absl::Status GetContract(CalculatorContract* cc) {
-    cc->Inputs().Tag("FLOAT_VECTOR").Set<std::vector<float>>();
-    cc->Outputs().Tag("ENCODED").Set<std::string>();
+    cc->Inputs().Tag(kFloatVectorTag).Set<std::vector<float>>();
+    cc->Outputs().Tag(kEncodedTag).Set<std::string>();
     return absl::OkStatus();
   }

@@ -70,7 +73,7 @@ class QuantizeFloatVectorCalculator : public CalculatorBase {

   absl::Status Process(CalculatorContext* cc) final {
     const std::vector<float>& float_vector =
-        cc->Inputs().Tag("FLOAT_VECTOR").Value().Get<std::vector<float>>();
+        cc->Inputs().Tag(kFloatVectorTag).Value().Get<std::vector<float>>();
     int feature_size = float_vector.size();
     std::string encoded_features;
     encoded_features.reserve(feature_size);

@@ -86,7 +89,9 @@ class QuantizeFloatVectorCalculator : public CalculatorBase {
           (old_value - min_quantized_value_) * (255.0 / range_));
       encoded_features += encoded;
     }
-    cc->Outputs().Tag("ENCODED").AddPacket(
+    cc->Outputs()
+        .Tag(kEncodedTag)
+        .AddPacket(
         MakePacket<std::string>(encoded_features).At(cc->InputTimestamp()));
     return absl::OkStatus();
   }
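The encoding above maps each float, clamped to the configured [min, max] range, linearly onto a single byte via (value - min) * 255 / range; with a range of [-64, 64], for instance, -64 encodes to 0x00, 0 to 0x7F and 64 to 0xFF. A standalone sketch of the same arithmetic (not the calculator itself):

#include <algorithm>
#include <string>
#include <vector>

// Standalone sketch of the quantization arithmetic: clamp each float to
// [min_value, max_value] and map it linearly onto one byte.
std::string QuantizeToBytes(const std::vector<float>& values, float min_value,
                            float max_value) {
  const float range = max_value - min_value;
  std::string encoded;
  encoded.reserve(values.size());
  for (float v : values) {
    const float clamped = std::min(max_value, std::max(min_value, v));
    // E.g. with min=-64, max=64: -64 -> 0x00, 0 -> 0x7F, 64 -> 0xFF.
    const auto byte =
        static_cast<unsigned char>((clamped - min_value) * (255.0f / range));
    encoded += static_cast<char>(byte);
  }
  return encoded;
}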
|
|
@ -25,6 +25,9 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kEncodedTag[] = "ENCODED";
|
||||
constexpr char kFloatVectorTag[] = "FLOAT_VECTOR";
|
||||
|
||||
TEST(QuantizeFloatVectorCalculatorTest, WrongConfig) {
|
||||
CalculatorGraphConfig::Node node_config =
|
||||
ParseTextProtoOrDie<CalculatorGraphConfig::Node>(R"pb(
|
||||
|
@ -40,7 +43,7 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> empty_vector;
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(empty_vector).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
|
@ -67,7 +70,7 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig2) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> empty_vector;
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(empty_vector).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
|
@ -94,7 +97,7 @@ TEST(QuantizeFloatVectorCalculatorTest, WrongConfig3) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> empty_vector;
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(empty_vector).At(Timestamp(0)));
|
||||
auto status = runner.Run();
|
||||
|
@ -121,11 +124,12 @@ TEST(QuantizeFloatVectorCalculatorTest, TestEmptyVector) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> empty_vector;
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(empty_vector).At(Timestamp(0)));
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
const std::vector<Packet>& outputs = runner.Outputs().Tag("ENCODED").packets;
|
||||
const std::vector<Packet>& outputs =
|
||||
runner.Outputs().Tag(kEncodedTag).packets;
|
||||
EXPECT_EQ(1, outputs.size());
|
||||
EXPECT_TRUE(outputs[0].Get<std::string>().empty());
|
||||
EXPECT_EQ(Timestamp(0), outputs[0].Timestamp());
|
||||
|
@ -147,11 +151,12 @@ TEST(QuantizeFloatVectorCalculatorTest, TestNonEmptyVector) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> vector = {0.0f, -64.0f, 64.0f, -32.0f, 32.0f};
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(vector).At(Timestamp(0)));
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
const std::vector<Packet>& outputs = runner.Outputs().Tag("ENCODED").packets;
|
||||
const std::vector<Packet>& outputs =
|
||||
runner.Outputs().Tag(kEncodedTag).packets;
|
||||
EXPECT_EQ(1, outputs.size());
|
||||
const std::string& result = outputs[0].Get<std::string>();
|
||||
ASSERT_FALSE(result.empty());
|
||||
|
@ -185,11 +190,12 @@ TEST(QuantizeFloatVectorCalculatorTest, TestSaturation) {
|
|||
CalculatorRunner runner(node_config);
|
||||
std::vector<float> vector = {-65.0f, 65.0f};
|
||||
runner.MutableInputs()
|
||||
->Tag("FLOAT_VECTOR")
|
||||
->Tag(kFloatVectorTag)
|
||||
.packets.push_back(
|
||||
MakePacket<std::vector<float>>(vector).At(Timestamp(0)));
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
const std::vector<Packet>& outputs = runner.Outputs().Tag("ENCODED").packets;
|
||||
const std::vector<Packet>& outputs =
|
||||
runner.Outputs().Tag(kEncodedTag).packets;
|
||||
EXPECT_EQ(1, outputs.size());
|
||||
const std::string& result = outputs[0].Get<std::string>();
|
||||
ASSERT_FALSE(result.empty());
|
||||
|
|
|
@ -23,6 +23,9 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
constexpr char kAllowTag[] = "ALLOW";
|
||||
constexpr char kMaxInFlightTag[] = "MAX_IN_FLIGHT";
|
||||
|
||||
// RealTimeFlowLimiterCalculator is used to limit the number of pipelined
|
||||
// processing operations in a section of the graph.
|
||||
//
|
||||
|
@ -86,11 +89,11 @@ class RealTimeFlowLimiterCalculator : public CalculatorBase {
|
|||
cc->Outputs().Get("", i).SetSameAs(&(cc->Inputs().Get("", i)));
|
||||
}
|
||||
cc->Inputs().Get("FINISHED", 0).SetAny();
|
||||
if (cc->InputSidePackets().HasTag("MAX_IN_FLIGHT")) {
|
||||
cc->InputSidePackets().Tag("MAX_IN_FLIGHT").Set<int>();
|
||||
if (cc->InputSidePackets().HasTag(kMaxInFlightTag)) {
|
||||
cc->InputSidePackets().Tag(kMaxInFlightTag).Set<int>();
|
||||
}
|
||||
if (cc->Outputs().HasTag("ALLOW")) {
|
||||
cc->Outputs().Tag("ALLOW").Set<bool>();
|
||||
if (cc->Outputs().HasTag(kAllowTag)) {
|
||||
cc->Outputs().Tag(kAllowTag).Set<bool>();
|
||||
}
|
||||
|
||||
cc->SetInputStreamHandler("ImmediateInputStreamHandler");
|
||||
|
@ -101,8 +104,8 @@ class RealTimeFlowLimiterCalculator : public CalculatorBase {
|
|||
absl::Status Open(CalculatorContext* cc) final {
|
||||
finished_id_ = cc->Inputs().GetId("FINISHED", 0);
|
||||
max_in_flight_ = 1;
|
||||
if (cc->InputSidePackets().HasTag("MAX_IN_FLIGHT")) {
|
||||
max_in_flight_ = cc->InputSidePackets().Tag("MAX_IN_FLIGHT").Get<int>();
|
||||
if (cc->InputSidePackets().HasTag(kMaxInFlightTag)) {
|
||||
max_in_flight_ = cc->InputSidePackets().Tag(kMaxInFlightTag).Get<int>();
|
||||
}
|
||||
RET_CHECK_GE(max_in_flight_, 1);
|
||||
num_in_flight_ = 0;
|
||||
|
|
|
@ -33,6 +33,9 @@
|
|||
namespace mediapipe {
|
||||
|
||||
namespace {
|
||||
|
||||
constexpr char kFinishedTag[] = "FINISHED";
|
||||
|
||||
// A simple Semaphore for synchronizing test threads.
|
||||
class AtomicSemaphore {
|
||||
public:
|
||||
|
@ -112,7 +115,7 @@ TEST(RealTimeFlowLimiterCalculator, BasicTest) {
|
|||
Timestamp timestamp =
|
||||
Timestamp((i + 1) * Timestamp::kTimestampUnitsPerSecond);
|
||||
runner.MutableInputs()
|
||||
->Tag("FINISHED")
|
||||
->Tag(kFinishedTag)
|
||||
.packets.push_back(MakePacket<bool>(true).At(timestamp));
|
||||
}
|
||||
|
||||
|
|
|
@ -22,6 +22,8 @@ namespace mediapipe {
|
|||
|
||||
namespace {
|
||||
|
||||
constexpr char kPacketOffsetTag[] = "PACKET_OFFSET";
|
||||
|
||||
// Adds packets containing integers equal to their original timestamp.
|
||||
void AddPackets(CalculatorRunner* runner) {
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
|
@ -111,7 +113,7 @@ TEST(SequenceShiftCalculatorTest, SidePacketOffset) {
|
|||
|
||||
CalculatorRunner runner(node);
|
||||
AddPackets(&runner);
|
||||
runner.MutableSidePackets()->Tag("PACKET_OFFSET") = Adopt(new int(-2));
|
||||
runner.MutableSidePackets()->Tag(kPacketOffsetTag) = Adopt(new int(-2));
|
||||
MP_ASSERT_OK(runner.Run());
|
||||
const std::vector<Packet>& input_packets =
|
||||
runner.MutableInputs()->Index(0).packets;
|
||||
|
|
|
@ -12,8 +12,8 @@
|
|||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#ifndef MEDIAPIPE_CALCULATORS_CORE_SPLIT_NORMALIZED_LANDMARK_LIST_CALCULATOR_H_ // NOLINT
|
||||
#define MEDIAPIPE_CALCULATORS_CORE_SPLIT_NORMALIZED_LANDMARK_LIST_CALCULATOR_H_ // NOLINT
|
||||
#ifndef MEDIAPIPE_CALCULATORS_CORE_SPLIT_LANDMARKS_CALCULATOR_H_ // NOLINT
|
||||
#define MEDIAPIPE_CALCULATORS_CORE_SPLIT_LANDMARKS_CALCULATOR_H_ // NOLINT
|
||||
|
||||
#include "mediapipe/calculators/core/split_vector_calculator.pb.h"
|
||||
#include "mediapipe/framework/calculator_framework.h"
|
||||
|
@ -24,29 +24,30 @@
|
|||
|
||||
namespace mediapipe {
|
||||
|
||||
// Splits an input packet with NormalizedLandmarkList into
|
||||
// multiple NormalizedLandmarkList output packets using the [begin, end) ranges
|
||||
// Splits an input packet with LandmarkListType into
|
||||
// multiple LandmarkListType output packets using the [begin, end) ranges
|
||||
// specified in SplitVectorCalculatorOptions. If the option "element_only" is
|
||||
// set to true, all ranges should be of size 1 and all outputs will be elements
|
||||
// of type NormalizedLandmark. If "element_only" is false, ranges can be
|
||||
// non-zero in size and all outputs will be of type NormalizedLandmarkList.
|
||||
// of type LandmarkType. If "element_only" is false, ranges can be
|
||||
// non-zero in size and all outputs will be of type LandmarkListType.
|
||||
// If the option "combine_outputs" is set to true, only one output stream can be
|
||||
// specified and all ranges of elements will be combined into one
|
||||
// NormalizedLandmarkList.
|
||||
class SplitNormalizedLandmarkListCalculator : public CalculatorBase {
|
||||
// LandmarkListType.
|
||||
template <typename LandmarkType, typename LandmarkListType>
|
||||
class SplitLandmarksCalculator : public CalculatorBase {
|
||||
public:
|
||||
static absl::Status GetContract(CalculatorContract* cc) {
|
||||
RET_CHECK(cc->Inputs().NumEntries() == 1);
|
||||
RET_CHECK(cc->Outputs().NumEntries() != 0);
|
||||
|
||||
cc->Inputs().Index(0).Set<NormalizedLandmarkList>();
|
||||
cc->Inputs().Index(0).Set<LandmarkListType>();
|
||||
|
||||
const auto& options =
|
||||
cc->Options<::mediapipe::SplitVectorCalculatorOptions>();
|
||||
|
||||
if (options.combine_outputs()) {
|
||||
RET_CHECK_EQ(cc->Outputs().NumEntries(), 1);
|
||||
cc->Outputs().Index(0).Set<NormalizedLandmarkList>();
|
||||
cc->Outputs().Index(0).Set<LandmarkListType>();
|
||||
for (int i = 0; i < options.ranges_size() - 1; ++i) {
|
||||
for (int j = i + 1; j < options.ranges_size(); ++j) {
|
||||
const auto& range_0 = options.ranges(i);
|
||||
|
@ -81,9 +82,9 @@ class SplitNormalizedLandmarkListCalculator : public CalculatorBase {
|
|||
return absl::InvalidArgumentError(
|
||||
"Since element_only is true, all ranges should be of size 1.");
|
||||
}
|
||||
cc->Outputs().Index(i).Set<NormalizedLandmark>();
|
||||
cc->Outputs().Index(i).Set<LandmarkType>();
|
||||
} else {
|
||||
cc->Outputs().Index(i).Set<NormalizedLandmarkList>();
|
||||
cc->Outputs().Index(i).Set<LandmarkListType>();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -110,40 +111,39 @@ class SplitNormalizedLandmarkListCalculator : public CalculatorBase {
|
|||
}
|
||||
|
||||
absl::Status Process(CalculatorContext* cc) override {
|
||||
const NormalizedLandmarkList& input =
|
||||
cc->Inputs().Index(0).Get<NormalizedLandmarkList>();
|
||||
const LandmarkListType& input =
|
||||
cc->Inputs().Index(0).Get<LandmarkListType>();
|
||||
RET_CHECK_GE(input.landmark_size(), max_range_end_)
|
||||
<< "Max range end " << max_range_end_ << " exceeds landmarks size "
|
||||
<< input.landmark_size();
|
||||
|
||||
if (combine_outputs_) {
|
||||
NormalizedLandmarkList output;
|
||||
LandmarkListType output;
|
||||
for (int i = 0; i < ranges_.size(); ++i) {
|
||||
for (int j = ranges_[i].first; j < ranges_[i].second; ++j) {
|
||||
const NormalizedLandmark& input_landmark = input.landmark(j);
|
||||
const LandmarkType& input_landmark = input.landmark(j);
|
||||
*output.add_landmark() = input_landmark;
|
||||
}
|
||||
}
|
||||
RET_CHECK_EQ(output.landmark_size(), total_elements_);
|
||||
cc->Outputs().Index(0).AddPacket(
|
||||
MakePacket<NormalizedLandmarkList>(output).At(cc->InputTimestamp()));
|
||||
MakePacket<LandmarkListType>(output).At(cc->InputTimestamp()));
|
||||
} else {
|
||||
if (element_only_) {
|
||||
for (int i = 0; i < ranges_.size(); ++i) {
|
||||
cc->Outputs().Index(i).AddPacket(
|
||||
MakePacket<NormalizedLandmark>(input.landmark(ranges_[i].first))
|
||||
MakePacket<LandmarkType>(input.landmark(ranges_[i].first))
|
||||
.At(cc->InputTimestamp()));
|
||||
}
|
||||
} else {
|
||||
for (int i = 0; i < ranges_.size(); ++i) {
|
||||
NormalizedLandmarkList output;
|
||||
LandmarkListType output;
|
||||
for (int j = ranges_[i].first; j < ranges_[i].second; ++j) {
|
||||
const NormalizedLandmark& input_landmark = input.landmark(j);
|
||||
const LandmarkType& input_landmark = input.landmark(j);
|
||||
*output.add_landmark() = input_landmark;
|
||||
}
|
||||
cc->Outputs().Index(i).AddPacket(
|
||||
MakePacket<NormalizedLandmarkList>(output).At(
|
||||
cc->InputTimestamp()));
|
||||
MakePacket<LandmarkListType>(output).At(cc->InputTimestamp()));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -159,9 +159,15 @@ class SplitNormalizedLandmarkListCalculator : public CalculatorBase {
|
|||
bool combine_outputs_ = false;
|
||||
};
|
||||
|
||||
typedef SplitLandmarksCalculator<NormalizedLandmark, NormalizedLandmarkList>
|
||||
SplitNormalizedLandmarkListCalculator;
|
||||
REGISTER_CALCULATOR(SplitNormalizedLandmarkListCalculator);
|
||||
|
||||
typedef SplitLandmarksCalculator<Landmark, LandmarkList>
|
||||
SplitLandmarkListCalculator;
|
||||
REGISTER_CALCULATOR(SplitLandmarkListCalculator);
|
||||
|
||||
} // namespace mediapipe
|
||||
|
||||
// NOLINTNEXTLINE
|
||||
#endif // MEDIAPIPE_CALCULATORS_CORE_SPLIT_NORMALIZED_LANDMARK_LIST_CALCULATOR_H_
|
||||
#endif // MEDIAPIPE_CALCULATORS_CORE_SPLIT_LANDMARKS_CALCULATOR_H_
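A minimal usage sketch of the newly registered SplitNormalizedLandmarkListCalculator typedef; the range indices and stream names are placeholders, not a real landmark mapping:

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/port/parse_text_proto.h"

// Splits one NormalizedLandmarkList into two smaller lists by index ranges.
mediapipe::CalculatorGraphConfig::Node MakeSplitLandmarksNode() {
  return mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig::Node>(
      R"pb(
        calculator: "SplitNormalizedLandmarkListCalculator"
        input_stream: "all_landmarks"
        output_stream: "subset_a"
        output_stream: "subset_b"
        options {
          [mediapipe.SplitVectorCalculatorOptions.ext] {
            ranges: { begin: 0 end: 4 }
            ranges: { begin: 4 end: 8 }
          }
        }
      )pb");
}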
|
|
@ -80,6 +80,16 @@ mediapipe_proto_library(
|
|||
],
|
||||
)
|
||||
|
||||
mediapipe_proto_library(
|
||||
name = "segmentation_smoothing_calculator_proto",
|
||||
srcs = ["segmentation_smoothing_calculator.proto"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
"//mediapipe/framework:calculator_options_proto",
|
||||
"//mediapipe/framework:calculator_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "color_convert_calculator",
|
||||
srcs = ["color_convert_calculator.cc"],
|
||||
|
@ -602,3 +612,187 @@ cc_test(
|
|||
"//mediapipe/framework/port:parse_text_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "segmentation_smoothing_calculator",
|
||||
srcs = ["segmentation_smoothing_calculator.cc"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
":segmentation_smoothing_calculator_cc_proto",
|
||||
"//mediapipe/framework:calculator_options_cc_proto",
|
||||
"//mediapipe/framework/formats:image_format_cc_proto",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/formats:image_frame_opencv",
|
||||
"//mediapipe/framework/formats:image",
|
||||
"//mediapipe/framework/formats:image_opencv",
|
||||
"//mediapipe/framework/port:logging",
|
||||
"//mediapipe/framework/port:opencv_core",
|
||||
"//mediapipe/framework/port:status",
|
||||
"//mediapipe/framework/port:vector",
|
||||
] + select({
|
||||
"//mediapipe/gpu:disable_gpu": [],
|
||||
"//conditions:default": [
|
||||
"//mediapipe/gpu:gl_calculator_helper",
|
||||
"//mediapipe/gpu:gl_simple_shaders",
|
||||
"//mediapipe/gpu:gl_quad_renderer",
|
||||
"//mediapipe/gpu:shader_util",
|
||||
],
|
||||
}),
|
||||
alwayslink = 1,
|
||||
)
|
||||
|
||||
cc_test(
|
||||
name = "segmentation_smoothing_calculator_test",
|
||||
srcs = ["segmentation_smoothing_calculator_test.cc"],
|
||||
deps = [
|
||||
":image_clone_calculator",
|
||||
":image_clone_calculator_cc_proto",
|
||||
":segmentation_smoothing_calculator",
|
||||
":segmentation_smoothing_calculator_cc_proto",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_runner",
|
||||
"//mediapipe/framework/deps:file_path",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/formats:image_opencv",
|
||||
"//mediapipe/framework/port:gtest_main",
|
||||
"//mediapipe/framework/port:opencv_imgcodecs",
|
||||
"//mediapipe/framework/port:opencv_imgproc",
|
||||
"//mediapipe/framework/port:parse_text_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "affine_transformation",
|
||||
hdrs = ["affine_transformation.h"],
|
||||
deps = ["@com_google_absl//absl/status:statusor"],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "affine_transformation_runner_gl",
|
||||
srcs = ["affine_transformation_runner_gl.cc"],
|
||||
hdrs = ["affine_transformation_runner_gl.h"],
|
||||
deps = [
|
||||
":affine_transformation",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework/port:ret_check",
|
||||
"//mediapipe/gpu:gl_calculator_helper",
|
||||
"//mediapipe/gpu:gl_simple_shaders",
|
||||
"//mediapipe/gpu:gpu_buffer",
|
||||
"//mediapipe/gpu:gpu_origin_cc_proto",
|
||||
"//mediapipe/gpu:shader_util",
|
||||
"@com_google_absl//absl/memory",
|
||||
"@com_google_absl//absl/status",
|
||||
"@com_google_absl//absl/status:statusor",
|
||||
"@eigen_archive//:eigen3",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "affine_transformation_runner_opencv",
|
||||
srcs = ["affine_transformation_runner_opencv.cc"],
|
||||
hdrs = ["affine_transformation_runner_opencv.h"],
|
||||
deps = [
|
||||
":affine_transformation",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/formats:image_frame_opencv",
|
||||
"//mediapipe/framework/port:opencv_core",
|
||||
"//mediapipe/framework/port:opencv_imgproc",
|
||||
"//mediapipe/framework/port:ret_check",
|
||||
"@com_google_absl//absl/memory",
|
||||
"@com_google_absl//absl/status:statusor",
|
||||
"@eigen_archive//:eigen3",
|
||||
],
|
||||
)
|
||||
|
||||
mediapipe_proto_library(
|
||||
name = "warp_affine_calculator_proto",
|
||||
srcs = ["warp_affine_calculator.proto"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
"//mediapipe/framework:calculator_options_proto",
|
||||
"//mediapipe/framework:calculator_proto",
|
||||
"//mediapipe/gpu:gpu_origin_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "warp_affine_calculator",
|
||||
srcs = ["warp_affine_calculator.cc"],
|
||||
hdrs = ["warp_affine_calculator.h"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
":affine_transformation",
|
||||
":affine_transformation_runner_opencv",
|
||||
":warp_affine_calculator_cc_proto",
|
||||
"@com_google_absl//absl/status",
|
||||
"@com_google_absl//absl/status:statusor",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework/api2:node",
|
||||
"//mediapipe/framework/api2:port",
|
||||
"//mediapipe/framework/formats:image",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/port:ret_check",
|
||||
"//mediapipe/framework/port:status",
|
||||
] + select({
|
||||
"//mediapipe/gpu:disable_gpu": [],
|
||||
"//conditions:default": [
|
||||
"//mediapipe/gpu:gl_calculator_helper",
|
||||
"//mediapipe/gpu:gpu_buffer",
|
||||
":affine_transformation_runner_gl",
|
||||
],
|
||||
}),
|
||||
alwayslink = 1,
|
||||
)
|
||||
|
||||
cc_test(
|
||||
name = "warp_affine_calculator_test",
|
||||
srcs = ["warp_affine_calculator_test.cc"],
|
||||
data = [
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/input.jpg",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect_keep_aspect.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect_keep_aspect_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect_keep_aspect_with_rotation.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/large_sub_rect_keep_aspect_with_rotation_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_keep_aspect.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_keep_aspect_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_keep_aspect_with_rotation.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_keep_aspect_with_rotation_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_with_rotation.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/medium_sub_rect_with_rotation_border_zero.png",
|
||||
"//mediapipe/calculators/tensor:testdata/image_to_tensor/noop_except_range.png",
|
||||
],
|
||||
tags = ["desktop_only_test"],
|
||||
deps = [
|
||||
":affine_transformation",
|
||||
":warp_affine_calculator",
|
||||
"//mediapipe/calculators/image:image_transformation_calculator",
|
||||
"//mediapipe/calculators/tensor:image_to_tensor_converter",
|
||||
"//mediapipe/calculators/tensor:image_to_tensor_utils",
|
||||
"//mediapipe/calculators/util:from_image_calculator",
|
||||
"//mediapipe/calculators/util:to_image_calculator",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:calculator_runner",
|
||||
"//mediapipe/framework/deps:file_path",
|
||||
"//mediapipe/framework/formats:image",
|
||||
"//mediapipe/framework/formats:image_format_cc_proto",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/formats:image_frame_opencv",
|
||||
"//mediapipe/framework/formats:rect_cc_proto",
|
||||
"//mediapipe/framework/formats:tensor",
|
||||
"//mediapipe/framework/port:gtest_main",
|
||||
"//mediapipe/framework/port:integral_types",
|
||||
"//mediapipe/framework/port:opencv_core",
|
||||
"//mediapipe/framework/port:opencv_imgcodecs",
|
||||
"//mediapipe/framework/port:opencv_imgproc",
|
||||
"//mediapipe/framework/port:parse_text_proto",
|
||||
"//mediapipe/gpu:gpu_buffer_to_image_frame_calculator",
|
||||
"//mediapipe/gpu:image_frame_to_gpu_buffer_calculator",
|
||||
"@com_google_absl//absl/flags:flag",
|
||||
"@com_google_absl//absl/memory",
|
||||
"@com_google_absl//absl/strings",
|
||||
],
|
||||
)
|
||||
|
|
55
mediapipe/calculators/image/affine_transformation.h
Normal file
|
@ -0,0 +1,55 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#ifndef MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_H_
|
||||
#define MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_H_
|
||||
|
||||
#include <array>
|
||||
|
||||
#include "absl/status/statusor.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
class AffineTransformation {
|
||||
public:
|
||||
// Pixel extrapolation method.
|
||||
// When converting image to tensor it may happen that tensor needs to read
|
||||
// pixels outside image boundaries. Border mode helps to specify how such
|
||||
// pixels will be calculated.
|
||||
enum class BorderMode { kZero, kReplicate };
|
||||
|
||||
struct Size {
|
||||
int width;
|
||||
int height;
|
||||
};
|
||||
|
||||
template <typename InputT, typename OutputT>
|
||||
class Runner {
|
||||
public:
|
||||
virtual ~Runner() = default;
|
||||
|
||||
// Transforms input into output using @matrix as following:
|
||||
// output(x, y) = input(matrix[0] * x + matrix[1] * y + matrix[3],
|
||||
// matrix[4] * x + matrix[5] * y + matrix[7])
|
||||
// where x and y ranges are defined by @output_size.
|
||||
virtual absl::StatusOr<OutputT> Run(const InputT& input,
|
||||
const std::array<float, 16>& matrix,
|
||||
const Size& output_size,
|
||||
BorderMode border_mode) = 0;
|
||||
};
|
||||
};
|
||||
|
||||
} // namespace mediapipe
|
||||
|
||||
#endif // MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_H_
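A standalone sketch, not part of the change, restating the matrix convention documented in Runner::Run: the row-major 4x4 matrix maps normalized output coordinates to normalized input coordinates, with rows 2 and 3 assumed to be identity for a 2D warp.

#include <array>
#include <utility>

// Source coordinate sampled for destination pixel (x, y) under matrix m.
std::pair<float, float> MapOutputToInput(const std::array<float, 16>& m,
                                         float x, float y) {
  const float in_x = m[0] * x + m[1] * y + m[3];
  const float in_y = m[4] * x + m[5] * y + m[7];
  return {in_x, in_y};
}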
|
354
mediapipe/calculators/image/affine_transformation_runner_gl.cc
Normal file
|
@ -0,0 +1,354 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#include "mediapipe/calculators/image/affine_transformation_runner_gl.h"
|
||||
|
||||
#include <memory>
|
||||
#include <optional>
|
||||
|
||||
#include "Eigen/Core"
|
||||
#include "Eigen/Geometry"
|
||||
#include "Eigen/LU"
|
||||
#include "absl/memory/memory.h"
|
||||
#include "absl/status/status.h"
|
||||
#include "absl/status/statusor.h"
|
||||
#include "mediapipe/calculators/image/affine_transformation.h"
|
||||
#include "mediapipe/framework/calculator_framework.h"
|
||||
#include "mediapipe/framework/port/ret_check.h"
|
||||
#include "mediapipe/gpu/gl_calculator_helper.h"
|
||||
#include "mediapipe/gpu/gl_simple_shaders.h"
|
||||
#include "mediapipe/gpu/gpu_buffer.h"
|
||||
#include "mediapipe/gpu/gpu_origin.pb.h"
|
||||
#include "mediapipe/gpu/shader_util.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
namespace {
|
||||
|
||||
using mediapipe::GlCalculatorHelper;
|
||||
using mediapipe::GlhCreateProgram;
|
||||
using mediapipe::GlTexture;
|
||||
using mediapipe::GpuBuffer;
|
||||
using mediapipe::GpuOrigin;
|
||||
|
||||
bool IsMatrixVerticalFlipNeeded(GpuOrigin::Mode gpu_origin) {
|
||||
switch (gpu_origin) {
|
||||
case GpuOrigin::DEFAULT:
|
||||
case GpuOrigin::CONVENTIONAL:
|
||||
#ifdef __APPLE__
|
||||
return false;
|
||||
#else
|
||||
return true;
|
||||
#endif // __APPLE__
|
||||
case GpuOrigin::TOP_LEFT:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
#ifdef __APPLE__
|
||||
#define GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED 0
|
||||
#else
|
||||
#define GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED 1
|
||||
#endif // __APPLE__
|
||||
|
||||
bool IsGlClampToBorderSupported(const mediapipe::GlContext& gl_context) {
|
||||
return gl_context.gl_major_version() > 3 ||
|
||||
(gl_context.gl_major_version() == 3 &&
|
||||
gl_context.gl_minor_version() >= 2);
|
||||
}
|
||||
|
||||
constexpr int kAttribVertex = 0;
|
||||
constexpr int kAttribTexturePosition = 1;
|
||||
constexpr int kNumAttributes = 2;
|
||||
|
||||
class GlTextureWarpAffineRunner
|
||||
: public AffineTransformation::Runner<GpuBuffer,
|
||||
std::unique_ptr<GpuBuffer>> {
|
||||
public:
|
||||
GlTextureWarpAffineRunner(std::shared_ptr<GlCalculatorHelper> gl_helper,
|
||||
GpuOrigin::Mode gpu_origin)
|
||||
: gl_helper_(gl_helper), gpu_origin_(gpu_origin) {}
|
||||
absl::Status Init() {
|
||||
return gl_helper_->RunInGlContext([this]() -> absl::Status {
|
||||
const GLint attr_location[kNumAttributes] = {
|
||||
kAttribVertex,
|
||||
kAttribTexturePosition,
|
||||
};
|
||||
const GLchar* attr_name[kNumAttributes] = {
|
||||
"position",
|
||||
"texture_coordinate",
|
||||
};
|
||||
|
||||
constexpr GLchar kVertShader[] = R"(
|
||||
in vec4 position;
|
||||
in mediump vec4 texture_coordinate;
|
||||
out mediump vec2 sample_coordinate;
|
||||
uniform mat4 transform_matrix;
|
||||
|
||||
void main() {
|
||||
gl_Position = position;
|
||||
vec4 tc = transform_matrix * texture_coordinate;
|
||||
sample_coordinate = tc.xy;
|
||||
}
|
||||
)";
|
||||
|
||||
constexpr GLchar kFragShader[] = R"(
|
||||
DEFAULT_PRECISION(mediump, float)
|
||||
in vec2 sample_coordinate;
|
||||
uniform sampler2D input_texture;
|
||||
|
||||
#ifdef GL_ES
|
||||
#define fragColor gl_FragColor
|
||||
#else
|
||||
out vec4 fragColor;
|
||||
#endif // defined(GL_ES);
|
||||
|
||||
void main() {
|
||||
vec4 color = texture2D(input_texture, sample_coordinate);
|
||||
#ifdef CUSTOM_ZERO_BORDER_MODE
|
||||
float out_of_bounds =
|
||||
float(sample_coordinate.x < 0.0 || sample_coordinate.x > 1.0 ||
|
||||
sample_coordinate.y < 0.0 || sample_coordinate.y > 1.0);
|
||||
color = mix(color, vec4(0.0, 0.0, 0.0, 0.0), out_of_bounds);
|
||||
#endif // defined(CUSTOM_ZERO_BORDER_MODE)
|
||||
fragColor = color;
|
||||
}
|
||||
)";
|
||||
|
||||
// Create program and set parameters.
|
||||
auto create_fn = [&](const std::string& vs,
|
||||
const std::string& fs) -> absl::StatusOr<Program> {
|
||||
GLuint program = 0;
|
||||
GlhCreateProgram(vs.c_str(), fs.c_str(), kNumAttributes, &attr_name[0],
|
||||
attr_location, &program);
|
||||
|
||||
RET_CHECK(program) << "Problem initializing warp affine program.";
|
||||
glUseProgram(program);
|
||||
glUniform1i(glGetUniformLocation(program, "input_texture"), 1);
|
||||
GLint matrix_id = glGetUniformLocation(program, "transform_matrix");
|
||||
return Program{.id = program, .matrix_id = matrix_id};
|
||||
};
|
||||
|
||||
const std::string vert_src =
|
||||
absl::StrCat(mediapipe::kMediaPipeVertexShaderPreamble, kVertShader);
|
||||
|
||||
const std::string frag_src = absl::StrCat(
|
||||
mediapipe::kMediaPipeFragmentShaderPreamble, kFragShader);
|
||||
|
||||
ASSIGN_OR_RETURN(program_, create_fn(vert_src, frag_src));
|
||||
|
||||
auto create_custom_zero_fn = [&]() -> absl::StatusOr<Program> {
|
||||
std::string custom_zero_border_mode_def = R"(
|
||||
#define CUSTOM_ZERO_BORDER_MODE
|
||||
)";
|
||||
const std::string frag_custom_zero_src =
|
||||
absl::StrCat(mediapipe::kMediaPipeFragmentShaderPreamble,
|
||||
custom_zero_border_mode_def, kFragShader);
|
||||
return create_fn(vert_src, frag_custom_zero_src);
|
||||
};
|
||||
#if GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED
|
||||
if (!IsGlClampToBorderSupported(gl_helper_->GetGlContext())) {
|
||||
ASSIGN_OR_RETURN(program_custom_zero_, create_custom_zero_fn());
|
||||
}
|
||||
#else
|
||||
ASSIGN_OR_RETURN(program_custom_zero_, create_custom_zero_fn());
|
||||
#endif // GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED
|
||||
|
||||
glGenFramebuffers(1, &framebuffer_);
|
||||
|
||||
// vertex storage
|
||||
glGenBuffers(2, vbo_);
|
||||
glGenVertexArrays(1, &vao_);
|
||||
|
||||
// vbo 0
|
||||
glBindBuffer(GL_ARRAY_BUFFER, vbo_[0]);
|
||||
glBufferData(GL_ARRAY_BUFFER, sizeof(mediapipe::kBasicSquareVertices),
|
||||
mediapipe::kBasicSquareVertices, GL_STATIC_DRAW);
|
||||
|
||||
// vbo 1
|
||||
glBindBuffer(GL_ARRAY_BUFFER, vbo_[1]);
|
||||
glBufferData(GL_ARRAY_BUFFER, sizeof(mediapipe::kBasicTextureVertices),
|
||||
mediapipe::kBasicTextureVertices, GL_STATIC_DRAW);
|
||||
|
||||
glBindBuffer(GL_ARRAY_BUFFER, 0);
|
||||
|
||||
return absl::OkStatus();
|
||||
});
|
||||
}
|
||||
|
||||
absl::StatusOr<std::unique_ptr<GpuBuffer>> Run(
|
||||
const GpuBuffer& input, const std::array<float, 16>& matrix,
|
||||
const AffineTransformation::Size& size,
|
||||
AffineTransformation::BorderMode border_mode) override {
|
||||
std::unique_ptr<GpuBuffer> gpu_buffer;
|
||||
MP_RETURN_IF_ERROR(
|
||||
gl_helper_->RunInGlContext([this, &input, &matrix, &size, &border_mode,
|
||||
&gpu_buffer]() -> absl::Status {
|
||||
auto input_texture = gl_helper_->CreateSourceTexture(input);
|
||||
auto output_texture = gl_helper_->CreateDestinationTexture(
|
||||
size.width, size.height, input.format());
|
||||
|
||||
MP_RETURN_IF_ERROR(
|
||||
RunInternal(input_texture, matrix, border_mode, &output_texture));
|
||||
gpu_buffer = output_texture.GetFrame<GpuBuffer>();
|
||||
return absl::OkStatus();
|
||||
}));
|
||||
|
||||
return gpu_buffer;
|
||||
}
|
||||
|
||||
absl::Status RunInternal(const GlTexture& texture,
|
||||
const std::array<float, 16>& matrix,
|
||||
AffineTransformation::BorderMode border_mode,
|
||||
GlTexture* output) {
|
||||
glDisable(GL_DEPTH_TEST);
|
||||
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer_);
|
||||
glViewport(0, 0, output->width(), output->height());
|
||||
|
||||
glActiveTexture(GL_TEXTURE0);
|
||||
glBindTexture(GL_TEXTURE_2D, output->name());
|
||||
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
|
||||
output->name(), 0);
|
||||
|
||||
glActiveTexture(GL_TEXTURE1);
|
||||
glBindTexture(texture.target(), texture.name());
|
||||
|
||||
// a) Filtering.
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
|
||||
|
||||
// b) Clamping.
|
||||
std::optional<Program> program = program_;
|
||||
switch (border_mode) {
|
||||
case AffineTransformation::BorderMode::kReplicate: {
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
|
||||
break;
|
||||
}
|
||||
case AffineTransformation::BorderMode::kZero: {
|
||||
#if GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED
|
||||
if (program_custom_zero_) {
|
||||
program = program_custom_zero_;
|
||||
} else {
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_BORDER);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_BORDER);
|
||||
glTexParameterfv(GL_TEXTURE_2D, GL_TEXTURE_BORDER_COLOR,
|
||||
std::array<float, 4>{0.0f, 0.0f, 0.0f, 0.0f}.data());
|
||||
}
|
||||
#else
|
||||
RET_CHECK(program_custom_zero_)
|
||||
<< "Program must have been initialized.";
|
||||
program = program_custom_zero_;
|
||||
#endif // GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED
|
||||
break;
|
||||
}
|
||||
}
|
||||
glUseProgram(program->id);
|
||||
|
||||
Eigen::Matrix<float, 4, 4, Eigen::RowMajor> eigen_mat(matrix.data());
|
||||
if (IsMatrixVerticalFlipNeeded(gpu_origin_)) {
|
||||
// @matrix describes affine transformation in terms of TOP LEFT origin, so
|
||||
// in some cases/on some platforms an extra flipping should be done before
|
||||
// and after.
|
||||
const Eigen::Matrix<float, 4, 4, Eigen::RowMajor> flip_y(
|
||||
{{1.0f, 0.0f, 0.0f, 0.0f},
|
||||
{0.0f, -1.0f, 0.0f, 1.0f},
|
||||
{0.0f, 0.0f, 1.0f, 0.0f},
|
||||
{0.0f, 0.0f, 0.0f, 1.0f}});
|
||||
eigen_mat = flip_y * eigen_mat * flip_y;
|
||||
}
|
||||
|
||||
// If GL context is ES2, then GL_FALSE must be used for 'transpose'
|
||||
// GLboolean in glUniformMatrix4fv, or else INVALID_VALUE error is reported.
|
||||
// Hence, transposing the matrix and always passing transposed.
|
||||
eigen_mat.transposeInPlace();
|
||||
glUniformMatrix4fv(program->matrix_id, 1, GL_FALSE, eigen_mat.data());
|
||||
|
||||
// vao
|
||||
glBindVertexArray(vao_);
|
||||
|
||||
// vbo 0
|
||||
glBindBuffer(GL_ARRAY_BUFFER, vbo_[0]);
|
||||
glEnableVertexAttribArray(kAttribVertex);
|
||||
glVertexAttribPointer(kAttribVertex, 2, GL_FLOAT, 0, 0, nullptr);
|
||||
|
||||
// vbo 1
|
||||
glBindBuffer(GL_ARRAY_BUFFER, vbo_[1]);
|
||||
glEnableVertexAttribArray(kAttribTexturePosition);
|
||||
glVertexAttribPointer(kAttribTexturePosition, 2, GL_FLOAT, 0, 0, nullptr);
|
||||
|
||||
// draw
|
||||
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
|
||||
|
||||
// Resetting to MediaPipe texture param defaults.
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
|
||||
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
|
||||
|
||||
glDisableVertexAttribArray(kAttribVertex);
|
||||
glDisableVertexAttribArray(kAttribTexturePosition);
|
||||
glBindBuffer(GL_ARRAY_BUFFER, 0);
|
||||
glBindVertexArray(0);
|
||||
|
||||
glActiveTexture(GL_TEXTURE1);
|
||||
glBindTexture(GL_TEXTURE_2D, 0);
|
||||
glActiveTexture(GL_TEXTURE0);
|
||||
glBindTexture(GL_TEXTURE_2D, 0);
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
~GlTextureWarpAffineRunner() override {
|
||||
gl_helper_->RunInGlContext([this]() {
|
||||
// Release OpenGL resources.
|
||||
if (framebuffer_ != 0) glDeleteFramebuffers(1, &framebuffer_);
|
||||
if (program_.id != 0) glDeleteProgram(program_.id);
|
||||
if (program_custom_zero_ && program_custom_zero_->id != 0) {
|
||||
glDeleteProgram(program_custom_zero_->id);
|
||||
}
|
||||
if (vao_ != 0) glDeleteVertexArrays(1, &vao_);
|
||||
glDeleteBuffers(2, vbo_);
|
||||
});
|
||||
}
|
||||
|
||||
private:
|
||||
struct Program {
|
||||
GLuint id;
|
||||
GLint matrix_id;
|
||||
};
|
||||
std::shared_ptr<GlCalculatorHelper> gl_helper_;
|
||||
GpuOrigin::Mode gpu_origin_;
|
||||
GLuint vao_ = 0;
|
||||
GLuint vbo_[2] = {0, 0};
|
||||
Program program_;
|
||||
std::optional<Program> program_custom_zero_;
|
||||
GLuint framebuffer_ = 0;
|
||||
};
|
||||
|
||||
#undef GL_CLAMP_TO_BORDER_MAY_BE_SUPPORTED
|
||||
|
||||
} // namespace
|
||||
|
||||
absl::StatusOr<std::unique_ptr<
|
||||
AffineTransformation::Runner<GpuBuffer, std::unique_ptr<GpuBuffer>>>>
|
||||
CreateAffineTransformationGlRunner(
|
||||
std::shared_ptr<GlCalculatorHelper> gl_helper, GpuOrigin::Mode gpu_origin) {
|
||||
auto runner =
|
||||
absl::make_unique<GlTextureWarpAffineRunner>(gl_helper, gpu_origin);
|
||||
MP_RETURN_IF_ERROR(runner->Init());
|
||||
return runner;
|
||||
}
|
||||
|
||||
} // namespace mediapipe
|
|
@ -0,0 +1,36 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#ifndef MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_GL_H_
|
||||
#define MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_GL_H_
|
||||
|
||||
#include <memory>
|
||||
|
||||
#include "absl/status/statusor.h"
|
||||
#include "mediapipe/calculators/image/affine_transformation.h"
|
||||
#include "mediapipe/gpu/gl_calculator_helper.h"
|
||||
#include "mediapipe/gpu/gpu_buffer.h"
|
||||
#include "mediapipe/gpu/gpu_origin.pb.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
absl::StatusOr<std::unique_ptr<AffineTransformation::Runner<
|
||||
mediapipe::GpuBuffer, std::unique_ptr<mediapipe::GpuBuffer>>>>
|
||||
CreateAffineTransformationGlRunner(
|
||||
std::shared_ptr<mediapipe::GlCalculatorHelper> gl_helper,
|
||||
mediapipe::GpuOrigin::Mode gpu_origin);
|
||||
|
||||
} // namespace mediapipe
|
||||
|
||||
#endif // MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_GL_H_
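A hedged usage sketch of the factory declared above; gl_helper, input_gpu_buffer, matrix, and the 256x256 output size are assumed to exist in the surrounding calculator, and the usual MediaPipe status macros are assumed to be available:

// Create once (e.g. in Open()) and reuse per frame.
ASSIGN_OR_RETURN(auto runner,
                 CreateAffineTransformationGlRunner(
                     gl_helper, mediapipe::GpuOrigin::TOP_LEFT));
ASSIGN_OR_RETURN(
    std::unique_ptr<mediapipe::GpuBuffer> warped,
    runner->Run(input_gpu_buffer, matrix,
                {/*width=*/256, /*height=*/256},
                mediapipe::AffineTransformation::BorderMode::kZero));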
|
|
@ -0,0 +1,160 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#include "mediapipe/calculators/image/affine_transformation_runner_opencv.h"
|
||||
|
||||
#include <memory>
|
||||
|
||||
#include "absl/memory/memory.h"
|
||||
#include "absl/status/statusor.h"
|
||||
#include "mediapipe/calculators/image/affine_transformation.h"
|
||||
#include "mediapipe/framework/formats/image_frame.h"
|
||||
#include "mediapipe/framework/formats/image_frame_opencv.h"
|
||||
#include "mediapipe/framework/port/opencv_core_inc.h"
|
||||
#include "mediapipe/framework/port/opencv_imgproc_inc.h"
|
||||
#include "mediapipe/framework/port/ret_check.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
namespace {
|
||||
|
||||
cv::BorderTypes GetBorderModeForOpenCv(
|
||||
AffineTransformation::BorderMode border_mode) {
|
||||
switch (border_mode) {
|
||||
case AffineTransformation::BorderMode::kZero:
|
||||
return cv::BORDER_CONSTANT;
|
||||
case AffineTransformation::BorderMode::kReplicate:
|
||||
return cv::BORDER_REPLICATE;
|
||||
}
|
||||
}
|
||||
|
||||
class OpenCvRunner
|
||||
: public AffineTransformation::Runner<ImageFrame, ImageFrame> {
|
||||
public:
|
||||
absl::StatusOr<ImageFrame> Run(
|
||||
const ImageFrame& input, const std::array<float, 16>& matrix,
|
||||
const AffineTransformation::Size& size,
|
||||
AffineTransformation::BorderMode border_mode) override {
|
||||
// OpenCV warpAffine works in absolute coordinates, so the transform (which
|
||||
// accepts and produces relative coordinates) should be adjusted to first
|
||||
// normalize coordinates and then scale them.
|
||||
// clang-format off
|
||||
cv::Matx44f normalize_dst_coordinate({
|
||||
1.0f / size.width, 0.0f, 0.0f, 0.0f,
|
||||
0.0f, 1.0f / size.height, 0.0f, 0.0f,
|
||||
0.0f, 0.0f, 1.0f, 0.0f,
|
||||
0.0f, 0.0f, 0.0f, 1.0f});
|
||||
cv::Matx44f scale_src_coordinate({
|
||||
1.0f * input.Width(), 0.0f, 0.0f, 0.0f,
|
||||
0.0f, 1.0f * input.Height(), 0.0f, 0.0f,
|
||||
0.0f, 0.0f, 1.0f, 0.0f,
|
||||
0.0f, 0.0f, 0.0f, 1.0f});
|
||||
// clang-format on
|
||||
cv::Matx44f adjust_dst_coordinate;
|
||||
cv::Matx44f adjust_src_coordinate;
|
||||
// TODO: update to always use accurate implementation.
|
||||
constexpr bool kOpenCvCompatibility = true;
|
||||
if (kOpenCvCompatibility) {
|
||||
adjust_dst_coordinate = normalize_dst_coordinate;
|
||||
adjust_src_coordinate = scale_src_coordinate;
|
||||
} else {
|
||||
// To do an accurate affine image transformation and make "on-cpu" and
|
||||
// "on-gpu" calculations aligned - extra offset is required to select
|
||||
// correct pixels.
|
||||
//
|
||||
// Each destination pixel corresponds to some pixels region from source
|
||||
// image.(In case of downscaling there can be more than one pixel.) The
|
||||
// offset for x and y is calculated in the way, so pixel in the middle of
|
||||
// the region is selected.
|
||||
//
|
||||
// For simplicity's sake, let's consider downscaling from 100x50 to 10x10
|
||||
// without a rotation:
|
||||
// 1. Each destination pixel corresponds to 10x5 region
|
||||
// X range: [0, .. , 9]
|
||||
// Y range: [0, .. , 4]
|
||||
// 2. Considering we have __discrete__ pixels, the center of the region is
|
||||
// between (4, 2) and (5, 2) pixels, let's assume it's a "pixel"
|
||||
// (4.5, 2).
|
||||
// 3. When using the above as an offset for every pixel select while
|
||||
// downscaling, resulting pixels are:
|
||||
// (4.5, 2), (14.5, 2), .. , (94.5, 2)
|
||||
// (4.5, 7), (14.5, 7), .. , (94.5, 7)
|
||||
// ..
|
||||
// (4.5, 47), (14.5, 47), .., (94.5, 47)
|
||||
// instead of:
|
||||
// (0, 0), (10, 0), .. , (90, 0)
|
||||
// (0, 5), (10, 5), .. , (90, 5)
|
||||
// ..
|
||||
// (0, 45), (10, 45), .., (90, 45)
|
||||
// The latter looks shifted.
|
||||
//
|
||||
// Offsets are needed, so that __discrete__ pixel at (0, 0) corresponds to
|
||||
// the same pixel as would __non discrete__ pixel at (0.5, 0.5). Hence,
|
||||
// transformation matrix should shift coordinates by (0.5, 0.5) as the
|
||||
// very first step.
|
||||
//
|
||||
// Due to the above shift, transformed coordinates would be valid for
|
||||
// float coordinates where pixel (0, 0) spans [0.0, 1.0) x [0.0, 1.0).
|
||||
// To make it valid for __discrete__ pixels, transformation matrix should
|
||||
// shift coordinate by (-0.5f, -0.5f) as the very last step. (E.g. if we
|
||||
// get (0.5f, 0.5f), then it's (0, 0) __discrete__ pixel.)
|
||||
// clang-format off
|
||||
cv::Matx44f shift_dst({1.0f, 0.0f, 0.0f, 0.5f,
|
||||
0.0f, 1.0f, 0.0f, 0.5f,
|
||||
0.0f, 0.0f, 1.0f, 0.0f,
|
||||
0.0f, 0.0f, 0.0f, 1.0f});
|
||||
cv::Matx44f shift_src({1.0f, 0.0f, 0.0f, -0.5f,
|
||||
0.0f, 1.0f, 0.0f, -0.5f,
|
||||
0.0f, 0.0f, 1.0f, 0.0f,
|
||||
0.0f, 0.0f, 0.0f, 1.0f});
|
||||
// clang-format on
|
||||
adjust_dst_coordinate = normalize_dst_coordinate * shift_dst;
|
||||
adjust_src_coordinate = shift_src * scale_src_coordinate;
|
||||
}
|
||||
|
||||
cv::Matx44f transform(matrix.data());
|
||||
cv::Matx44f transform_absolute =
|
||||
adjust_src_coordinate * transform * adjust_dst_coordinate;
|
||||
|
||||
cv::Mat in_mat = formats::MatView(&input);
|
||||
|
||||
cv::Mat cv_affine_transform(2, 3, CV_32F);
|
||||
cv_affine_transform.at<float>(0, 0) = transform_absolute.val[0];
|
||||
cv_affine_transform.at<float>(0, 1) = transform_absolute.val[1];
|
||||
cv_affine_transform.at<float>(0, 2) = transform_absolute.val[3];
|
||||
cv_affine_transform.at<float>(1, 0) = transform_absolute.val[4];
|
||||
cv_affine_transform.at<float>(1, 1) = transform_absolute.val[5];
|
||||
cv_affine_transform.at<float>(1, 2) = transform_absolute.val[7];
|
||||
|
||||
ImageFrame out_image(input.Format(), size.width, size.height);
|
||||
cv::Mat out_mat = formats::MatView(&out_image);
|
||||
|
||||
cv::warpAffine(in_mat, out_mat, cv_affine_transform,
|
||||
cv::Size(out_mat.cols, out_mat.rows),
|
||||
/*flags=*/cv::INTER_LINEAR | cv::WARP_INVERSE_MAP,
|
||||
GetBorderModeForOpenCv(border_mode));
|
||||
|
||||
return out_image;
|
||||
}
|
||||
};
|
||||
|
||||
} // namespace
|
||||
|
||||
absl::StatusOr<
|
||||
std::unique_ptr<AffineTransformation::Runner<ImageFrame, ImageFrame>>>
|
||||
CreateAffineTransformationOpenCvRunner() {
|
||||
return absl::make_unique<OpenCvRunner>();
|
||||
}
|
||||
|
||||
} // namespace mediapipe
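Restating the coordinate bookkeeping above as one equation (OpenCV-compatibility path): with A_dst = diag(1/size.width, 1/size.height, 1, 1) normalizing destination pixels and A_src = diag(input.Width(), input.Height(), 1, 1) scaling back to source pixels, the absolute-coordinate transform is M_abs = A_src * M * A_dst, and the 2x3 matrix handed to cv::warpAffine (with cv::WARP_INVERSE_MAP, so it is read as a destination-to-source map) is rows 0-1, columns 0, 1 and 3 of M_abs.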
|
|
@ -0,0 +1,32 @@
|
|||
// Copyright 2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#ifndef MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_OPENCV_H_
|
||||
#define MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_OPENCV_H_
|
||||
|
||||
#include <memory>
|
||||
|
||||
#include "absl/status/statusor.h"
|
||||
#include "mediapipe/calculators/image/affine_transformation.h"
|
||||
#include "mediapipe/framework/formats/image_frame.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
absl::StatusOr<
|
||||
std::unique_ptr<AffineTransformation::Runner<ImageFrame, ImageFrame>>>
|
||||
CreateAffineTransformationOpenCvRunner();
|
||||
|
||||
} // namespace mediapipe
|
||||
|
||||
#endif // MEDIAPIPE_CALCULATORS_IMAGE_AFFINE_TRANSFORMATION_RUNNER_OPENCV_H_
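A hedged CPU-side usage sketch mirroring the GL runner; input_frame, matrix, and the 256x256 output size are assumed, as are the MediaPipe status macros:

ASSIGN_OR_RETURN(auto runner, CreateAffineTransformationOpenCvRunner());
ASSIGN_OR_RETURN(
    mediapipe::ImageFrame warped,
    runner->Run(input_frame, matrix, {/*width=*/256, /*height=*/256},
                mediapipe::AffineTransformation::BorderMode::kReplicate));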
|
|
@ -240,7 +240,7 @@ absl::Status BilateralFilterCalculator::RenderCpu(CalculatorContext* cc) {
|
|||
auto input_mat = mediapipe::formats::MatView(&input_frame);
|
||||
|
||||
// Only 1 or 3 channel images supported by OpenCV.
|
||||
if ((input_mat.channels() == 1 || input_mat.channels() == 3)) {
|
||||
if (!(input_mat.channels() == 1 || input_mat.channels() == 3)) {
|
||||
return absl::InternalError(
|
||||
"CPU filtering supports only 1 or 3 channel input images.");
|
||||
}
|
||||
|
|
|
@ -36,7 +36,7 @@ using GpuBuffer = mediapipe::GpuBuffer;
|
|||
// stored on the target storage (CPU vs GPU) specified in the calculator option.
|
||||
//
|
||||
// The clone shares ownership of the input pixel data on the existing storage.
|
||||
// If the target storage is diffrent from the existing one, then the data is
|
||||
// If the target storage is different from the existing one, then the data is
|
||||
// further copied there.
|
||||
//
|
||||
// Example usage:
|
||||
|
|
|
@ -102,6 +102,10 @@ mediapipe::ScaleMode_Mode ParseScaleMode(
|
|||
// IMAGE: ImageFrame representing the input image.
|
||||
// IMAGE_GPU: GpuBuffer representing the input image.
|
||||
//
|
||||
// OUTPUT_DIMENSIONS (optional): The output width and height in pixels as
|
||||
// pair<int, int>. If set, it will override corresponding field in calculator
|
||||
// options and input side packet.
|
||||
//
|
||||
// ROTATION_DEGREES (optional): The counterclockwise rotation angle in
|
||||
// degrees. This allows different rotation angles for different frames. It has
|
||||
// to be a multiple of 90 degrees. If provided, it overrides the
|
||||
|
@ -221,6 +225,10 @@ absl::Status ImageTransformationCalculator::GetContract(
|
|||
}
|
||||
#endif // !MEDIAPIPE_DISABLE_GPU
|
||||
|
||||
if (cc->Inputs().HasTag("OUTPUT_DIMENSIONS")) {
|
||||
cc->Inputs().Tag("OUTPUT_DIMENSIONS").Set<std::pair<int, int>>();
|
||||
}
|
||||
|
||||
if (cc->Inputs().HasTag("ROTATION_DEGREES")) {
|
||||
cc->Inputs().Tag("ROTATION_DEGREES").Set<int>();
|
||||
}
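A hedged driver-side sketch of feeding the new per-frame OUTPUT_DIMENSIONS stream declared above; the graph object, the "output_dimensions" stream name, and the timestamp are assumed, and 640x480 is an arbitrary example size:

MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
    "output_dimensions",
    mediapipe::MakePacket<std::pair<int, int>>(640, 480).At(timestamp)));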
|
||||
|
@ -329,6 +337,13 @@ absl::Status ImageTransformationCalculator::Process(CalculatorContext* cc) {
|
|||
!cc->Inputs().Tag("FLIP_VERTICALLY").IsEmpty()) {
|
||||
flip_vertically_ = cc->Inputs().Tag("FLIP_VERTICALLY").Get<bool>();
|
||||
}
|
||||
if (cc->Inputs().HasTag("OUTPUT_DIMENSIONS") &&
|
||||
!cc->Inputs().Tag("OUTPUT_DIMENSIONS").IsEmpty()) {
|
||||
const auto& image_size =
|
||||
cc->Inputs().Tag("OUTPUT_DIMENSIONS").Get<std::pair<int, int>>();
|
||||
output_width_ = image_size.first;
|
||||
output_height_ = image_size.second;
|
||||
}
|
||||
|
||||
if (use_gpu_) {
|
||||
#if !MEDIAPIPE_DISABLE_GPU
|
||||
|
|
|
@ -37,6 +37,22 @@ constexpr char kImageFrameTag[] = "IMAGE";
|
|||
constexpr char kMaskCpuTag[] = "MASK";
|
||||
constexpr char kGpuBufferTag[] = "IMAGE_GPU";
|
||||
constexpr char kMaskGpuTag[] = "MASK_GPU";
|
||||
|
||||
inline cv::Vec3b Blend(const cv::Vec3b& color1, const cv::Vec3b& color2,
|
||||
float weight, int invert_mask,
|
||||
int adjust_with_luminance) {
|
||||
weight = (1 - invert_mask) * weight + invert_mask * (1.0f - weight);
|
||||
|
||||
float luminance =
|
||||
(1 - adjust_with_luminance) * 1.0f +
|
||||
adjust_with_luminance *
|
||||
(color1[0] * 0.299 + color1[1] * 0.587 + color1[2] * 0.114) / 255;
|
||||
|
||||
float mix_value = weight * luminance;
|
||||
|
||||
return color1 * (1.0 - mix_value) + color2 * mix_value;
|
||||
}
|
||||
|
||||
} // namespace
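Written out as formulas (a direct restatement of Blend above, with i = invert_mask and a = adjust_with_luminance as 0/1 flags): w' = (1 - i) * w + i * (1 - w); L = (1 - a) + a * (0.299 R + 0.587 G + 0.114 B) / 255; mix = w' * L; out = (1 - mix) * color1 + mix * color2. In other words, the Rec. 601 luma of the input pixel optionally scales the blending weight.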
|
||||
|
||||
namespace mediapipe {
|
||||
|
@ -44,15 +60,14 @@ namespace mediapipe {
|
|||
// A calculator to recolor a masked area of an image to a specified color.
|
||||
//
|
||||
// A mask image is used to specify where to overlay a user defined color.
|
||||
// The luminance of the input image is used to adjust the blending weight,
|
||||
// to help preserve image textures.
|
||||
//
|
||||
// Inputs:
|
||||
// One of the following IMAGE tags:
|
||||
// IMAGE: An ImageFrame input image, RGB or RGBA.
|
||||
// IMAGE: An ImageFrame input image in ImageFormat::SRGB.
|
||||
// IMAGE_GPU: A GpuBuffer input image, RGBA.
|
||||
// One of the following MASK tags:
|
||||
// MASK: An ImageFrame input mask, Gray, RGB or RGBA.
|
||||
// MASK: An ImageFrame input mask in ImageFormat::GRAY8, SRGB, SRGBA, or
|
||||
// VEC32F1
|
||||
// MASK_GPU: A GpuBuffer input mask, RGBA.
|
||||
// Output:
|
||||
// One of the following IMAGE tags:
|
||||
|
@ -98,10 +113,12 @@ class RecolorCalculator : public CalculatorBase {
|
|||
void GlRender();
|
||||
|
||||
bool initialized_ = false;
|
||||
std::vector<float> color_;
|
||||
std::vector<uint8> color_;
|
||||
mediapipe::RecolorCalculatorOptions::MaskChannel mask_channel_;
|
||||
|
||||
bool use_gpu_ = false;
|
||||
bool invert_mask_ = false;
|
||||
bool adjust_with_luminance_ = false;
|
||||
#if !MEDIAPIPE_DISABLE_GPU
|
||||
mediapipe::GlCalculatorHelper gpu_helper_;
|
||||
GLuint program_ = 0;
|
||||
|
@ -233,11 +250,15 @@ absl::Status RecolorCalculator::RenderCpu(CalculatorContext* cc) {
|
|||
}
|
||||
cv::Mat mask_full;
|
||||
cv::resize(mask_mat, mask_full, input_mat.size());
|
||||
const cv::Vec3b recolor = {color_[0], color_[1], color_[2]};
|
||||
|
||||
auto output_img = absl::make_unique<ImageFrame>(
|
||||
input_img.Format(), input_mat.cols, input_mat.rows);
|
||||
cv::Mat output_mat = mediapipe::formats::MatView(output_img.get());
|
||||
|
||||
const int invert_mask = invert_mask_ ? 1 : 0;
|
||||
const int adjust_with_luminance = adjust_with_luminance_ ? 1 : 0;
|
||||
|
||||
// From GPU shader:
|
||||
/*
|
||||
vec4 weight = texture2D(mask, sample_coordinate);
|
||||
|
@ -249,18 +270,23 @@ absl::Status RecolorCalculator::RenderCpu(CalculatorContext* cc) {
|
|||
|
||||
fragColor = mix(color1, color2, mix_value);
|
||||
*/
|
||||
if (mask_img.Format() == ImageFormat::VEC32F1) {
|
||||
for (int i = 0; i < output_mat.rows; ++i) {
|
||||
for (int j = 0; j < output_mat.cols; ++j) {
|
||||
float weight = mask_full.at<uchar>(i, j) * (1.0 / 255.0);
|
||||
cv::Vec3f color1 = input_mat.at<cv::Vec3b>(i, j);
|
||||
cv::Vec3f color2 = {color_[0], color_[1], color_[2]};
|
||||
|
||||
float luminance =
|
||||
(color1[0] * 0.299 + color1[1] * 0.587 + color1[2] * 0.114) / 255;
|
||||
float mix_value = weight * luminance;
|
||||
|
||||
cv::Vec3b mix_color = color1 * (1.0 - mix_value) + color2 * mix_value;
|
||||
output_mat.at<cv::Vec3b>(i, j) = mix_color;
|
||||
const float weight = mask_full.at<float>(i, j);
|
||||
output_mat.at<cv::Vec3b>(i, j) =
|
||||
Blend(input_mat.at<cv::Vec3b>(i, j), recolor, weight, invert_mask,
|
||||
adjust_with_luminance);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
for (int i = 0; i < output_mat.rows; ++i) {
|
||||
for (int j = 0; j < output_mat.cols; ++j) {
|
||||
const float weight = mask_full.at<uchar>(i, j) * (1.0 / 255.0);
|
||||
output_mat.at<cv::Vec3b>(i, j) =
|
||||
Blend(input_mat.at<cv::Vec3b>(i, j), recolor, weight, invert_mask,
|
||||
adjust_with_luminance);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -385,6 +411,9 @@ absl::Status RecolorCalculator::LoadOptions(CalculatorContext* cc) {
|
|||
color_.push_back(options.color().g());
|
||||
color_.push_back(options.color().b());
|
||||
|
||||
invert_mask_ = options.invert_mask();
|
||||
adjust_with_luminance_ = options.adjust_with_luminance();
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
|
@ -435,13 +464,20 @@ absl::Status RecolorCalculator::InitGpu(CalculatorContext* cc) {
|
|||
uniform sampler2D frame;
|
||||
uniform sampler2D mask;
|
||||
uniform vec3 recolor;
|
||||
uniform float invert_mask;
|
||||
uniform float adjust_with_luminance;
|
||||
|
||||
void main() {
|
||||
vec4 weight = texture2D(mask, sample_coordinate);
|
||||
vec4 color1 = texture2D(frame, sample_coordinate);
|
||||
vec4 color2 = vec4(recolor, 1.0);
|
||||
|
||||
float luminance = dot(color1.rgb, vec3(0.299, 0.587, 0.114));
|
||||
weight = mix(weight, 1.0 - weight, invert_mask);
|
||||
|
||||
float luminance = mix(1.0,
|
||||
dot(color1.rgb, vec3(0.299, 0.587, 0.114)),
|
||||
adjust_with_luminance);
|
||||
|
||||
float mix_value = weight.MASK_COMPONENT * luminance;
|
||||
|
||||
fragColor = mix(color1, color2, mix_value);
|
||||
|
@ -458,6 +494,10 @@ absl::Status RecolorCalculator::InitGpu(CalculatorContext* cc) {
|
|||
glUniform1i(glGetUniformLocation(program_, "mask"), 2);
|
||||
glUniform3f(glGetUniformLocation(program_, "recolor"), color_[0] / 255.0,
|
||||
color_[1] / 255.0, color_[2] / 255.0);
|
||||
glUniform1f(glGetUniformLocation(program_, "invert_mask"),
|
||||
invert_mask_ ? 1.0f : 0.0f);
|
||||
glUniform1f(glGetUniformLocation(program_, "adjust_with_luminance"),
|
||||
adjust_with_luminance_ ? 1.0f : 0.0f);
|
||||
#endif // !MEDIAPIPE_DISABLE_GPU
|
||||
|
||||
return absl::OkStatus();
|
||||
|
|
|
@ -36,4 +36,11 @@ message RecolorCalculatorOptions {
|
|||
// Color to blend into input image where mask is > 0.
|
||||
// The blending is based on the input image luminosity.
|
||||
optional Color color = 2;
|
||||
|
||||
// Swap the meaning of mask values for foreground/background.
|
||||
optional bool invert_mask = 3 [default = false];
|
||||
|
||||
// Whether to use the luminance of the input image to further adjust the
|
||||
// blending weight, to help preserve image textures.
|
||||
optional bool adjust_with_luminance = 4 [default = true];
|
||||
}
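A hedged node sketch exercising the two new fields alongside the existing color option; the color, flag values, and stream names are arbitrary examples:

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/port/parse_text_proto.h"

// Recolors the masked region blue, with the mask meaning inverted and no
// luminance adjustment.
mediapipe::CalculatorGraphConfig::Node MakeRecolorNode() {
  return mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig::Node>(
      R"pb(
        calculator: "RecolorCalculator"
        input_stream: "IMAGE:input_image"
        input_stream: "MASK:mask_image"
        output_stream: "IMAGE:output_image"
        options {
          [mediapipe.RecolorCalculatorOptions.ext] {
            color { r: 0 g: 125 b: 255 }
            invert_mask: true
            adjust_with_luminance: false
          }
        }
      )pb");
}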
|
||||
|
|
|
@ -262,6 +262,7 @@ absl::Status ScaleImageCalculator::InitializeFrameInfo(CalculatorContext* cc) {
|
|||
scale_image::FindOutputDimensions(crop_width_, crop_height_, //
|
||||
options_.target_width(), //
|
||||
options_.target_height(), //
|
||||
options_.target_max_area(), //
|
||||
options_.preserve_aspect_ratio(), //
|
||||
options_.scale_to_multiple_of(), //
|
||||
&output_width_, &output_height_));
|
||||
|
|
|
@ -28,6 +28,11 @@ message ScaleImageCalculatorOptions {
|
|||
optional int32 target_width = 1;
|
||||
optional int32 target_height = 2;
|
||||
|
||||
// If set, then automatically calculates a target_width and target_height that
|
||||
// have an area below the target max area. Aspect ratio preservation cannot be
|
||||
// disabled.
|
||||
optional int32 target_max_area = 15;
|
||||
|
||||
// If true, the image is scaled up or down proportionally so that it
|
||||
// fits inside the box represented by target_width and target_height.
|
||||
// Otherwise it is scaled to fit target_width and target_height
|
||||
|
|
|
@ -92,12 +92,21 @@ absl::Status FindOutputDimensions(int input_width, //
|
|||
int input_height, //
|
||||
int target_width, //
|
||||
int target_height, //
|
||||
int target_max_area, //
|
||||
bool preserve_aspect_ratio, //
|
||||
int scale_to_multiple_of, //
|
||||
int* output_width, int* output_height) {
|
||||
CHECK(output_width);
|
||||
CHECK(output_height);
|
||||
|
||||
if (target_max_area > 0 && input_width * input_height > target_max_area) {
|
||||
preserve_aspect_ratio = true;
|
||||
target_height = static_cast<int>(sqrt(static_cast<double>(target_max_area) /
|
||||
(static_cast<double>(input_width) /
|
||||
static_cast<double>(input_height))));
|
||||
target_width = -1; // Resize width to preserve aspect ratio.
|
||||
}
|
||||
|
||||
if (preserve_aspect_ratio) {
|
||||
RET_CHECK(scale_to_multiple_of == 2)
|
||||
<< "FindOutputDimensions always outputs width and height that are "
|
||||
|
@ -164,5 +173,17 @@ absl::Status FindOutputDimensions(int input_width, //
|
|||
<< "Unable to set output dimensions based on target dimensions.";
|
||||
}
|
||||
|
||||
absl::Status FindOutputDimensions(int input_width, //
|
||||
int input_height, //
|
||||
int target_width, //
|
||||
int target_height, //
|
||||
bool preserve_aspect_ratio, //
|
||||
int scale_to_multiple_of, //
|
||||
int* output_width, int* output_height) {
|
||||
return FindOutputDimensions(
|
||||
input_width, input_height, target_width, target_height, -1,
|
||||
preserve_aspect_ratio, scale_to_multiple_of, output_width, output_height);
|
||||
}
|
||||
|
||||
} // namespace scale_image
|
||||
} // namespace mediapipe
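A worked example of the new max-area path, using the same numbers as the MaxArea test below (figures are approximate): for a 200x100 input with target_max_area = 9000 the aspect ratio is 2, so target_height = floor(sqrt(9000 / 2)), roughly 67, and target_width is set to -1 so the width follows the aspect ratio at roughly 134; after the multiple-of-2 rounding the result keeps a ratio close to 2:1 with an area no larger than 9000, which is exactly what the test asserts.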
|
||||
|
|
|
@ -34,15 +34,25 @@ absl::Status FindCropDimensions(int input_width, int input_height, //
|
|||
int* crop_width, int* crop_height, //
|
||||
int* col_start, int* row_start);
|
||||
|
||||
// Given an input width and height, a target width and height, whether to
|
||||
// preserve the aspect ratio, and whether to round-down to the multiple of a
|
||||
// given number nearest to the targets, determine the output width and height.
|
||||
// If target_width or target_height is non-positive, then they will be set to
|
||||
// the input_width and input_height respectively. If scale_to_multiple_of is
|
||||
// less than 1, it will be treated like 1. The output_width and
|
||||
// output_height will be reduced as necessary to preserve_aspect_ratio if the
|
||||
// option is specified. If preserving the aspect ratio is desired, you must set
|
||||
// scale_to_multiple_of to 2.
|
||||
// Given an input width and height, a target width and height or max area,
|
||||
// whether to preserve the aspect ratio, and whether to round-down to the
|
||||
// multiple of a given number nearest to the targets, determine the output width
|
||||
// and height. If target_width or target_height is non-positive, then they will
|
||||
// be set to the input_width and input_height respectively. If target_area is
|
||||
// non-positive, then it will be ignored. If scale_to_multiple_of is less than
|
||||
// 1, it will be treated like 1. The output_width and output_height will be
|
||||
// reduced as necessary to preserve_aspect_ratio if the option is specified. If
|
||||
// preserving the aspect ratio is desired, you must set scale_to_multiple_of
|
||||
// to 2.
|
||||
absl::Status FindOutputDimensions(int input_width, int input_height, //
|
||||
int target_width,
|
||||
int target_height, //
|
||||
int target_max_area, //
|
||||
bool preserve_aspect_ratio, //
|
||||
int scale_to_multiple_of, //
|
||||
int* output_width, int* output_height);
|
||||
|
||||
// Backwards compatible helper.
|
||||
absl::Status FindOutputDimensions(int input_width, int input_height, //
|
||||
int target_width,
|
||||
int target_height, //
|
||||
|
|
|
@ -79,49 +79,49 @@ TEST(ScaleImageUtilsTest, FindOutputDimensionsPreserveRatio) {
|
|||
int output_width;
|
||||
int output_height;
|
||||
// No scaling.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(200, output_width);
|
||||
EXPECT_EQ(100, output_height);
|
||||
// No scaling with odd input size.
|
||||
MP_ASSERT_OK(FindOutputDimensions(201, 101, -1, -1, false, 1, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(201, 101, -1, -1, -1, false, 1,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(201, output_width);
|
||||
EXPECT_EQ(101, output_height);
|
||||
// Scale down by 1/2.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(100, output_width);
|
||||
EXPECT_EQ(50, output_height);
|
||||
// Scale up, doubling dimensions.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, 200, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, 200, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(400, output_width);
|
||||
EXPECT_EQ(200, output_height);
|
||||
// Fits a 2:1 image into a 150 x 150 box. Output dimensions are always
|
||||
// divisible by 2.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 150, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 150, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(150, output_width);
|
||||
EXPECT_EQ(74, output_height);
|
||||
// Fits a 2:1 image into a 400 x 50 box.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 400, 50, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 400, 50, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(100, output_width);
|
||||
EXPECT_EQ(50, output_height);
|
||||
// Scale to a multiple of 2 with an odd target size.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 101, -1, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 101, -1, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(100, output_width);
|
||||
EXPECT_EQ(50, output_height);
|
||||
// Scale to a multiple of 2 with an odd target size.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 101, -1, true, 2, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 101, -1, -1, true, 2,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(100, output_width);
|
||||
EXPECT_EQ(50, output_height);
|
||||
// Scale to odd size.
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 151, 101, false, 1, &output_width,
|
||||
&output_height));
|
||||
MP_ASSERT_OK(FindOutputDimensions(200, 100, 151, 101, -1, false, 1,
|
||||
&output_width, &output_height));
|
||||
EXPECT_EQ(151, output_width);
|
||||
EXPECT_EQ(101, output_height);
|
||||
}
@@ -131,18 +131,18 @@ TEST(ScaleImageUtilsTest, FindOutputDimensionsNoAspectRatio) {
  int output_width;
  int output_height;
  // Scale width only.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, false, 2, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, -1, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(100, output_width);
  EXPECT_EQ(100, output_height);
  // Scale height only.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, 200, false, 2, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, 200, -1, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(200, output_width);
  EXPECT_EQ(200, output_height);
  // Scale both dimensions.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 200, false, 2, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 200, -1, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(150, output_width);
  EXPECT_EQ(200, output_height);
}
@@ -152,41 +152,78 @@ TEST(ScaleImageUtilsTest, FindOutputDimensionsDownScaleToMultipleOf) {
  int output_width;
  int output_height;
  // Set no targets, downscale to a multiple of 8.
  MP_ASSERT_OK(FindOutputDimensions(100, 100, -1, -1, false, 8, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(100, 100, -1, -1, -1, false, 8,
                                    &output_width, &output_height));
  EXPECT_EQ(96, output_width);
  EXPECT_EQ(96, output_height);
  // Set width target, downscale to a multiple of 8.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, false, 8, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 100, -1, -1, false, 8,
                                    &output_width, &output_height));
  EXPECT_EQ(96, output_width);
  EXPECT_EQ(96, output_height);
  // Set height target, downscale to a multiple of 8.
  MP_ASSERT_OK(FindOutputDimensions(201, 101, -1, 201, false, 8, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(201, 101, -1, 201, -1, false, 8,
                                    &output_width, &output_height));
  EXPECT_EQ(200, output_width);
  EXPECT_EQ(200, output_height);
  // Set both targets, downscale to a multiple of 8.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 200, false, 8, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 150, 200, -1, false, 8,
                                    &output_width, &output_height));
  EXPECT_EQ(144, output_width);
  EXPECT_EQ(200, output_height);
  // No error if keep-aspect is true and the downscale multiple is 2.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 400, 200, true, 2, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 400, 200, -1, true, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(400, output_width);
  EXPECT_EQ(200, output_height);
  // Returns an error if keep-aspect is true but the downscale multiple is not 2.
  ASSERT_THAT(FindOutputDimensions(200, 100, 400, 200, true, 4, &output_width,
                                   &output_height),
  ASSERT_THAT(FindOutputDimensions(200, 100, 400, 200, -1, true, 4,
                                   &output_width, &output_height),
              testing::Not(testing::status::IsOk()));
  // Downscaling to a multiple is ignored if the multiple is less than 2.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 401, 201, false, 1, &output_width,
                                    &output_height));
  MP_ASSERT_OK(FindOutputDimensions(200, 100, 401, 201, -1, false, 1,
                                    &output_width, &output_height));
  EXPECT_EQ(401, output_width);
  EXPECT_EQ(201, output_height);
}
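The expectations above (100 -> 96, 150 -> 144, 201 -> 200 with a multiple of 8) are consistent with rounding each dimension down to the nearest multiple after any target scaling. A minimal sketch of that step, assuming this is how the helper rounds (the function name here is illustrative, not from the library):

// Round a dimension down to the nearest multiple; ignored when multiple < 2,
// matching the "less than 2" case in the test above. Assumed behavior,
// inferred from the expectations rather than from scale_image_utils.cc.
int RoundDownToMultiple(int value, int multiple) {
  return multiple < 2 ? value : (value / multiple) * multiple;
}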

// Tests scaling down to a maximum output area.
TEST(ScaleImageUtilsTest, FindOutputDimensionsMaxArea) {
  int output_width;
  int output_height;
  // Smaller area.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, 9000, false, 2,
                                    &output_width, &output_height));
  EXPECT_NEAR(
      200.0 / 100.0,
      static_cast<double>(output_width) / static_cast<double>(output_height),
      0.1f);
  EXPECT_LE(output_width * output_height, 9000);
  // Close to original area.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, 19999, false, 2,
                                    &output_width, &output_height));
  EXPECT_NEAR(
      200.0 / 100.0,
      static_cast<double>(output_width) / static_cast<double>(output_height),
      0.1f);
  EXPECT_LE(output_width * output_height, 19999);
  // Don't scale with larger area.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, 20001, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(200, output_width);
  EXPECT_EQ(100, output_height);
  // Don't scale with equal area.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, 20000, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(200, output_width);
  EXPECT_EQ(100, output_height);
  // Don't scale at all.
  MP_ASSERT_OK(FindOutputDimensions(200, 100, -1, -1, -1, false, 2,
                                    &output_width, &output_height));
  EXPECT_EQ(200, output_width);
  EXPECT_EQ(100, output_height);
}
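The max-area cases only pin the behavior down loosely: the aspect ratio stays within 0.1 of the input and the output area never exceeds the cap, with no change when the cap is at or above the input area. One way such a clamp can be computed, shown as a sketch under those assumptions rather than as the helper's actual code (the function name is illustrative):

#include <cmath>

// Shrink (w, h) so that w * h <= max_area while roughly preserving the
// aspect ratio; -1 (or any non-positive value) disables the cap. Sketch only:
// the real helper also applies target sizes and scale_to_multiple_of rounding.
void ClampToMaxArea(int max_area, int* w, int* h) {
  if (max_area <= 0 || (*w) * (*h) <= max_area) return;
  const double scale = std::sqrt(static_cast<double>(max_area) /
                                 (static_cast<double>(*w) * (*h)));
  *w = static_cast<int>(*w * scale);
  *h = static_cast<int>(*h * scale);
}

For example, 200 x 100 with max_area 9000 scales by sqrt(0.45) to 134 x 67, whose area (8978) is under the cap and whose ratio is still 2:1, consistent with the expectations above.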

}  // namespace
}  // namespace scale_image
}  // namespace mediapipe

429
mediapipe/calculators/image/segmentation_smoothing_calculator.cc
Normal file

@@ -0,0 +1,429 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <algorithm>
#include <memory>

#include "mediapipe/calculators/image/segmentation_smoothing_calculator.pb.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/calculator_options.pb.h"
#include "mediapipe/framework/formats/image.h"
#include "mediapipe/framework/formats/image_format.pb.h"
#include "mediapipe/framework/formats/image_frame.h"
#include "mediapipe/framework/formats/image_frame_opencv.h"
#include "mediapipe/framework/formats/image_opencv.h"
#include "mediapipe/framework/port/logging.h"
#include "mediapipe/framework/port/opencv_core_inc.h"
#include "mediapipe/framework/port/status.h"
#include "mediapipe/framework/port/vector.h"

#if !MEDIAPIPE_DISABLE_GPU
#include "mediapipe/gpu/gl_calculator_helper.h"
#include "mediapipe/gpu/gl_simple_shaders.h"
#include "mediapipe/gpu/shader_util.h"
#endif  // !MEDIAPIPE_DISABLE_GPU

namespace mediapipe {

namespace {
constexpr char kCurrentMaskTag[] = "MASK";
constexpr char kPreviousMaskTag[] = "MASK_PREVIOUS";
constexpr char kOutputMaskTag[] = "MASK_SMOOTHED";

enum { ATTRIB_VERTEX, ATTRIB_TEXTURE_POSITION, NUM_ATTRIBUTES };
}  // namespace

// A calculator for mixing two segmentation masks together,
// based on an uncertainty probability estimate.
//
// Inputs:
//   MASK - Image containing the new/current mask.
//          [ImageFormat::VEC32F1, or
//           GpuBufferFormat::kBGRA32/kRGB24/kGrayHalf16/kGrayFloat32]
//   MASK_PREVIOUS - Image containing the previous mask.
//                   [Same format as MASK]
//   * If the input has more than one channel, only the first channel (R)
//     is used as the mask.
//
// Output:
//   MASK_SMOOTHED - Blended mask.
//                   [Same format as MASK]
//   * The resulting filtered mask is stored in the R channel,
//     and duplicated in A if there are 4 channels.
//
// Options:
//   combine_with_previous_ratio - Amount of the previous mask to blend with
//                                 the current one.
//
// Example:
//  node {
//    calculator: "SegmentationSmoothingCalculator"
//    input_stream: "MASK:mask"
//    input_stream: "MASK_PREVIOUS:mask_previous"
//    output_stream: "MASK_SMOOTHED:mask_smoothed"
//    options: {
//      [mediapipe.SegmentationSmoothingCalculatorOptions.ext] {
//        combine_with_previous_ratio: 0.9
//      }
//    }
//  }
//
class SegmentationSmoothingCalculator : public CalculatorBase {
 public:
  SegmentationSmoothingCalculator() = default;

  static absl::Status GetContract(CalculatorContract* cc);

  // From Calculator.
  absl::Status Open(CalculatorContext* cc) override;
  absl::Status Process(CalculatorContext* cc) override;
  absl::Status Close(CalculatorContext* cc) override;

 private:
  absl::Status RenderGpu(CalculatorContext* cc);
  absl::Status RenderCpu(CalculatorContext* cc);

  absl::Status GlSetup(CalculatorContext* cc);
  void GlRender(CalculatorContext* cc);

  float combine_with_previous_ratio_;

  bool gpu_initialized_ = false;
#if !MEDIAPIPE_DISABLE_GPU
  mediapipe::GlCalculatorHelper gpu_helper_;
  GLuint program_ = 0;
#endif  // !MEDIAPIPE_DISABLE_GPU
};
REGISTER_CALCULATOR(SegmentationSmoothingCalculator);

absl::Status SegmentationSmoothingCalculator::GetContract(
    CalculatorContract* cc) {
  CHECK_GE(cc->Inputs().NumEntries(), 1);

  cc->Inputs().Tag(kCurrentMaskTag).Set<Image>();
  cc->Inputs().Tag(kPreviousMaskTag).Set<Image>();
  cc->Outputs().Tag(kOutputMaskTag).Set<Image>();

#if !MEDIAPIPE_DISABLE_GPU
  MP_RETURN_IF_ERROR(mediapipe::GlCalculatorHelper::UpdateContract(cc));
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

absl::Status SegmentationSmoothingCalculator::Open(CalculatorContext* cc) {
  cc->SetOffset(TimestampDiff(0));

  auto options =
      cc->Options<mediapipe::SegmentationSmoothingCalculatorOptions>();
  combine_with_previous_ratio_ = options.combine_with_previous_ratio();

#if !MEDIAPIPE_DISABLE_GPU
  MP_RETURN_IF_ERROR(gpu_helper_.Open(cc));
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

absl::Status SegmentationSmoothingCalculator::Process(CalculatorContext* cc) {
  if (cc->Inputs().Tag(kCurrentMaskTag).IsEmpty()) {
    return absl::OkStatus();
  }
  if (cc->Inputs().Tag(kPreviousMaskTag).IsEmpty()) {
    // Pass through the current image if the previous mask is not available.
    cc->Outputs()
        .Tag(kOutputMaskTag)
        .AddPacket(cc->Inputs().Tag(kCurrentMaskTag).Value());
    return absl::OkStatus();
  }

  // Run on GPU if the incoming data is on GPU.
  const bool use_gpu = cc->Inputs().Tag(kCurrentMaskTag).Get<Image>().UsesGpu();

  if (use_gpu) {
#if !MEDIAPIPE_DISABLE_GPU
    MP_RETURN_IF_ERROR(gpu_helper_.RunInGlContext([this, cc]() -> absl::Status {
      if (!gpu_initialized_) {
        MP_RETURN_IF_ERROR(GlSetup(cc));
        gpu_initialized_ = true;
      }
      MP_RETURN_IF_ERROR(RenderGpu(cc));
      return absl::OkStatus();
    }));
#else
    return absl::InternalError("GPU processing is disabled.");
#endif  // !MEDIAPIPE_DISABLE_GPU
  } else {
    MP_RETURN_IF_ERROR(RenderCpu(cc));
  }

  return absl::OkStatus();
}

absl::Status SegmentationSmoothingCalculator::Close(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  gpu_helper_.RunInGlContext([this] {
    if (program_) glDeleteProgram(program_);
    program_ = 0;
  });
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

absl::Status SegmentationSmoothingCalculator::RenderCpu(CalculatorContext* cc) {
  // Setup source images.
  const auto& current_frame = cc->Inputs().Tag(kCurrentMaskTag).Get<Image>();
  const cv::Mat current_mat = mediapipe::formats::MatView(&current_frame);
  RET_CHECK_EQ(current_mat.type(), CV_32FC1)
      << "Only 1-channel float input image is supported.";

  const auto& previous_frame = cc->Inputs().Tag(kPreviousMaskTag).Get<Image>();
  const cv::Mat previous_mat = mediapipe::formats::MatView(&previous_frame);
  RET_CHECK_EQ(previous_mat.type(), current_mat.type())
      << "Warning: mixing input format types: " << previous_mat.type()
      << " != " << current_mat.type();

  RET_CHECK_EQ(current_mat.rows, previous_mat.rows);
  RET_CHECK_EQ(current_mat.cols, previous_mat.cols);

  // Setup destination image.
  auto output_frame = std::make_shared<ImageFrame>(
      current_frame.image_format(), current_mat.cols, current_mat.rows);
  cv::Mat output_mat = mediapipe::formats::MatView(output_frame.get());
  output_mat.setTo(cv::Scalar(0));

  // Blending function.
  const auto blending_fn = [&](const float prev_mask_value,
                               const float new_mask_value) {
    /*
     * Assume p := new_mask_value
     * H(p) := 1 + (p * log(p) + (1-p) * log(1-p)) / log(2)
     * uncertainty alpha(p) =
     *   Clamp(1 - (1 - H(p)) * (1 - H(p)), 0, 1) [squaring the uncertainty]
     *
     * The following polynomial approximates uncertainty alpha as a function
     * of (p + 0.5):
     */
    const float c1 = 5.68842;
    const float c2 = -0.748699;
    const float c3 = -57.8051;
    const float c4 = 291.309;
    const float c5 = -624.717;
    const float t = new_mask_value - 0.5f;
    const float x = t * t;

    const float uncertainty =
        1.0f -
        std::min(1.0f, x * (c1 + x * (c2 + x * (c3 + x * (c4 + x * c5)))));

    return new_mask_value + (prev_mask_value - new_mask_value) *
                                (uncertainty * combine_with_previous_ratio_);
  };
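  // Illustration (added for this write-up, not in the original file):
  //   new_mask_value == 0.5: t = 0, x = 0, the polynomial evaluates to 0 and
  //     uncertainty = 1, so the output moves toward prev_mask_value by the
  //     full combine_with_previous_ratio_.
  //   new_mask_value near 0 or 1: x ~= 0.25, the polynomial evaluates to ~1
  //     and uncertainty ~= 0, so a confident new value passes through
  //     essentially unchanged.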

  // Write directly to the first channel of output.
  for (int i = 0; i < output_mat.rows; ++i) {
    float* out_ptr = output_mat.ptr<float>(i);
    const float* curr_ptr = current_mat.ptr<float>(i);
    const float* prev_ptr = previous_mat.ptr<float>(i);
    for (int j = 0; j < output_mat.cols; ++j) {
      const float new_mask_value = curr_ptr[j];
      const float prev_mask_value = prev_ptr[j];
      out_ptr[j] = blending_fn(prev_mask_value, new_mask_value);
    }
  }

  cc->Outputs()
      .Tag(kOutputMaskTag)
      .AddPacket(MakePacket<Image>(output_frame).At(cc->InputTimestamp()));

  return absl::OkStatus();
}

absl::Status SegmentationSmoothingCalculator::RenderGpu(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  // Setup source textures.
  const auto& current_frame = cc->Inputs().Tag(kCurrentMaskTag).Get<Image>();
  RET_CHECK(
      (current_frame.format() == mediapipe::GpuBufferFormat::kBGRA32 ||
       current_frame.format() == mediapipe::GpuBufferFormat::kGrayHalf16 ||
       current_frame.format() == mediapipe::GpuBufferFormat::kGrayFloat32 ||
       current_frame.format() == mediapipe::GpuBufferFormat::kRGB24))
      << "Only RGBA, RGB, or 1-channel Float input image supported.";

  auto current_texture = gpu_helper_.CreateSourceTexture(current_frame);

  const auto& previous_frame = cc->Inputs().Tag(kPreviousMaskTag).Get<Image>();
  if (previous_frame.format() != current_frame.format()) {
    LOG(ERROR) << "Warning: mixing input format types. ";
  }
  auto previous_texture = gpu_helper_.CreateSourceTexture(previous_frame);

  // Setup destination texture.
  const int width = current_frame.width(), height = current_frame.height();
  auto output_texture = gpu_helper_.CreateDestinationTexture(
      width, height, current_frame.format());

  // Process shader.
  {
    gpu_helper_.BindFramebuffer(output_texture);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, current_texture.name());
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D, previous_texture.name());
    GlRender(cc);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, 0);
  }
  glFlush();

  // Send out image as GPU packet.
  auto output_frame = output_texture.GetFrame<Image>();
  cc->Outputs()
      .Tag(kOutputMaskTag)
      .Add(output_frame.release(), cc->InputTimestamp());
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

void SegmentationSmoothingCalculator::GlRender(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  static const GLfloat square_vertices[] = {
      -1.0f, -1.0f,  // bottom left
      1.0f,  -1.0f,  // bottom right
      -1.0f, 1.0f,   // top left
      1.0f,  1.0f,   // top right
  };
  static const GLfloat texture_vertices[] = {
      0.0f, 0.0f,  // bottom left
      1.0f, 0.0f,  // bottom right
      0.0f, 1.0f,  // top left
      1.0f, 1.0f,  // top right
  };

  // program
  glUseProgram(program_);

  // vertex storage
  GLuint vbo[2];
  glGenBuffers(2, vbo);
  GLuint vao;
  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);

  // vbo 0
  glBindBuffer(GL_ARRAY_BUFFER, vbo[0]);
  glBufferData(GL_ARRAY_BUFFER, 4 * 2 * sizeof(GLfloat), square_vertices,
               GL_STATIC_DRAW);
  glEnableVertexAttribArray(ATTRIB_VERTEX);
  glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, 0, 0, nullptr);

  // vbo 1
  glBindBuffer(GL_ARRAY_BUFFER, vbo[1]);
  glBufferData(GL_ARRAY_BUFFER, 4 * 2 * sizeof(GLfloat), texture_vertices,
               GL_STATIC_DRAW);
  glEnableVertexAttribArray(ATTRIB_TEXTURE_POSITION);
  glVertexAttribPointer(ATTRIB_TEXTURE_POSITION, 2, GL_FLOAT, 0, 0, nullptr);

  // draw
  glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

  // cleanup
  glDisableVertexAttribArray(ATTRIB_VERTEX);
  glDisableVertexAttribArray(ATTRIB_TEXTURE_POSITION);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindVertexArray(0);
  glDeleteVertexArrays(1, &vao);
  glDeleteBuffers(2, vbo);

#endif  // !MEDIAPIPE_DISABLE_GPU
}

absl::Status SegmentationSmoothingCalculator::GlSetup(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  const GLint attr_location[NUM_ATTRIBUTES] = {
      ATTRIB_VERTEX,
      ATTRIB_TEXTURE_POSITION,
  };
  const GLchar* attr_name[NUM_ATTRIBUTES] = {
      "position",
      "texture_coordinate",
  };

  // Shader to blend in the previous mask based on the computed uncertainty
  // probability.
  const std::string frag_src =
      absl::StrCat(std::string(mediapipe::kMediaPipeFragmentShaderPreamble),
                   R"(
  DEFAULT_PRECISION(mediump, float)

#ifdef GL_ES
#define fragColor gl_FragColor
#else
  out vec4 fragColor;
#endif  // defined(GL_ES)

  in vec2 sample_coordinate;
  uniform sampler2D current_mask;
  uniform sampler2D previous_mask;
  uniform float combine_with_previous_ratio;

  void main() {
    vec4 current_pix = texture2D(current_mask, sample_coordinate);
    vec4 previous_pix = texture2D(previous_mask, sample_coordinate);
    float new_mask_value = current_pix.r;
    float prev_mask_value = previous_pix.r;

    // Assume p := new_mask_value
    // H(p) := 1 + (p * log(p) + (1-p) * log(1-p)) / log(2)
    // uncertainty alpha(p) =
    //   Clamp(1 - (1 - H(p)) * (1 - H(p)), 0, 1) [squaring the uncertainty]
    //
    // The following polynomial approximates uncertainty alpha as a function
    // of (p + 0.5):
    const float c1 = 5.68842;
    const float c2 = -0.748699;
    const float c3 = -57.8051;
    const float c4 = 291.309;
    const float c5 = -624.717;
    float t = new_mask_value - 0.5;
    float x = t * t;

    float uncertainty =
        1.0 - min(1.0, x * (c1 + x * (c2 + x * (c3 + x * (c4 + x * c5)))));

    new_mask_value +=
        (prev_mask_value - new_mask_value) *
        (uncertainty * combine_with_previous_ratio);

    fragColor = vec4(new_mask_value, 0.0, 0.0, new_mask_value);
  }
)");

  // Create the shader program and set parameters.
  mediapipe::GlhCreateProgram(mediapipe::kBasicVertexShader, frag_src.c_str(),
                              NUM_ATTRIBUTES, (const GLchar**)&attr_name[0],
                              attr_location, &program_);
  RET_CHECK(program_) << "Problem initializing the program.";
  glUseProgram(program_);
  glUniform1i(glGetUniformLocation(program_, "current_mask"), 1);
  glUniform1i(glGetUniformLocation(program_, "previous_mask"), 2);
  glUniform1f(glGetUniformLocation(program_, "combine_with_previous_ratio"),
              combine_with_previous_ratio_);

#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

}  // namespace mediapipe
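For orientation, the example node from the header comment can be embedded in a graph config built from C++. This is a sketch, not part of the commit; it assumes the commonly used ParseTextProtoOrDie helper from "mediapipe/framework/port/parse_text_proto.h" and that the surrounding graph supplies the "mask" and "mask_previous" streams:

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/port/parse_text_proto.h"

// Builds a one-node graph config that smooths "mask" against "mask_previous".
mediapipe::CalculatorGraphConfig MakeSmoothingGraphConfig() {
  return mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
    input_stream: "mask"
    input_stream: "mask_previous"
    output_stream: "mask_smoothed"
    node {
      calculator: "SegmentationSmoothingCalculator"
      input_stream: "MASK:mask"
      input_stream: "MASK_PREVIOUS:mask_previous"
      output_stream: "MASK_SMOOTHED:mask_smoothed"
      options {
        [mediapipe.SegmentationSmoothingCalculatorOptions.ext] {
          combine_with_previous_ratio: 0.9
        }
      }
    }
  )pb");
}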

@@ -0,0 +1,35 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

syntax = "proto2";

package mediapipe;

import "mediapipe/framework/calculator.proto";

message SegmentationSmoothingCalculatorOptions {
  extend CalculatorOptions {
    optional SegmentationSmoothingCalculatorOptions ext = 377425128;
  }

  // How much of the previous mask to blend in, based on a probability estimate.
  // Range: [0, 1]
  // 0 = Use only the current frame (no blending).
  // 1 = Blend in the previous mask based on the uncertainty estimate.
  // With the ratio at 1, the uncertainty estimate is trusted completely.
  // When uncertainty is high, the previous mask is given higher weight.
  // Therefore, if both the ratio and the uncertainty are 1, only the old mask
  // is used. A pixel is 'uncertain' if its value is close to the middle
  // (0.5, or 127 for 8-bit masks).
  optional float combine_with_previous_ratio = 1 [default = 0.0];
}
Some files were not shown because too many files have changed in this diff.