Project import generated by Copybara.
GitOrigin-RevId: 08c2016a4df5aef571b464a4d4491f38c6b2af10
This commit is contained in: parent ae05ad04b3, commit 8b57bf879b
.github/ISSUE_TEMPLATE/00-build-installation-issue.md (vendored, new file, 21 lines)
@@ -0,0 +1,21 @@
<em>Please make sure that this is a build/installation issue and also refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) documentation before raising any issues.</em>

**System information** (Please provide as much relevant information as possible)
- OS Platform and Distribution (e.g. Linux Ubuntu 16.04, Android 11, iOS 14.4):
- Compiler version (e.g. gcc/g++ 8, Apple clang version 12.0.0):
- Programming language and version (e.g. C++14, Python 3.6, Java):
- Installed using virtualenv? pip? Conda? (if Python):
- [MediaPipe version](https://github.com/google/mediapipe/releases):
- Bazel version:
- Xcode and Tulsi versions (if iOS):
- Android SDK and NDK versions (if Android):
- Android [AAR](https://google.github.io/mediapipe/getting_started/android_archive_library.html) (if Android):
- OpenCV version (if running on desktop):

**Describe the problem:**

**[Provide the exact sequence of commands / steps that you executed before running into the problem](https://google.github.io/mediapipe/getting_started/getting_started.html):**

**Complete logs:**
Include complete log information or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached:
.github/ISSUE_TEMPLATE/10-solution-issue.md (vendored, new file, 20 lines)
@@ -0,0 +1,20 @@
<em>Please make sure that this is a [solution](https://google.github.io/mediapipe/solutions/solutions.html) issue.</em>

**System information** (Please provide as much relevant information as possible)
- Have I written custom code (as opposed to using a stock example script provided in MediaPipe):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4):
- [MediaPipe version](https://github.com/google/mediapipe/releases):
- Bazel version:
- Solution (e.g. FaceMesh, Pose, Holistic):
- Programming language and version (e.g. C++, Python, Java):

**Describe the expected behavior:**

**Standalone code you may have used to try to get what you need:**

If there is a problem, provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to a Colab, a repo, or any notebook:

**Other info / Complete logs:**
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached:
.github/ISSUE_TEMPLATE/20-documentation-issue.md (vendored, new file, 45 lines)
@@ -0,0 +1,45 @@
Thank you for submitting a MediaPipe documentation issue.
The MediaPipe docs are open source! To get involved, read the documentation Contributor Guide.

## URL(s) with the issue:

Please provide a link to the documentation entry, for example: https://github.com/google/mediapipe/blob/master/docs/solutions/face_mesh.md#models

## Description of issue (what needs changing):

Kinds of documentation problems:

### Clear description

For example, why should someone use this method? How is it useful?

### Correct links

Is the link to the source code correct?

### Parameters defined

Are all parameters defined and formatted correctly?

### Returns defined

Are return values defined?

### Raises listed and defined

Are the errors defined? For example,

### Usage example

Is there a usage example?

See the API guide on how to write testable usage examples.

### Request visuals, if applicable

Are there currently visuals? If not, would adding them clarify the content?

### Submit a pull request?

Are you planning to also submit a pull request to fix the issue? See the docs:
https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md
.github/bot_config.yml (vendored, new file, 18 lines)
@@ -0,0 +1,18 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================

# A list of assignees
assignees:
  - sgowroji
.github/stale.yml (vendored, new file, 34 lines)
@@ -0,0 +1,34 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
#
# This file was assembled from multiple pieces, whose use is documented
# throughout. Please refer to the TensorFlow dockerfiles documentation
# for more information.

# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 7
# Number of days of inactivity before a stale Issue or Pull Request is closed
daysUntilClose: 7
# Only issues or pull requests with all of these labels are checked if stale. Defaults to `[]` (disabled)
onlyLabels:
  - stat:awaiting response
# Comment to post when marking as stale. Set to `false` to disable
markComment: >
  This issue has been automatically marked as stale because it has not had
  recent activity. It will be closed if no further activity occurs. Thank you.
# Comment to post when removing the stale label. Set to `false` to disable
unmarkComment: false
closeComment: >
  Closing as stale. Please reopen if you'd like to work on this further.
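The two timeouts in this config compose: an issue with the `stat:awaiting response` label goes stale after 7 inactive days and is closed 7 days after that. A minimal sketch of that date arithmetic (the stale bot computes this internally; the function below is purely illustrative):

```python
from datetime import date, timedelta

def stale_and_close_dates(last_activity: date,
                          days_until_stale: int = 7,
                          days_until_close: int = 7) -> tuple[date, date]:
    """Return the dates an inactive issue is marked stale and then closed."""
    stale_on = last_activity + timedelta(days=days_until_stale)
    closed_on = stale_on + timedelta(days=days_until_close)
    return stale_on, closed_on

# Example: last activity on 2021-06-01 -> stale 2021-06-08, closed 2021-06-15.
stale, closed = stale_and_close_dates(date(2021, 6, 1))
```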
@@ -40,11 +40,12 @@ Hair Segmentation
 [Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
+[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
 [Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
 [Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
 [Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
 [Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
 [KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
 [AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
 [MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
WORKSPACE (10 lines changed)
@@ -71,8 +71,8 @@ http_archive(
 # Google Benchmark library.
 http_archive(
     name = "com_google_benchmark",
-    urls = ["https://github.com/google/benchmark/archive/master.zip"],
-    strip_prefix = "benchmark-master",
+    urls = ["https://github.com/google/benchmark/archive/main.zip"],
+    strip_prefix = "benchmark-main",
     build_file = "@//third_party:benchmark.BUILD",
 )
@@ -369,9 +369,9 @@ http_archive(
 )

 # Tensorflow repo should always go after the other external dependencies.
-# 2021-04-30
-_TENSORFLOW_GIT_COMMIT = "5bd3c57ef184543d22e34e36cff9d9bea608e06d"
-_TENSORFLOW_SHA256= "9a45862834221aafacf6fb275f92b3876bc89443cbecc51be93f13839a6609f0"
+# 2021-05-27
+_TENSORFLOW_GIT_COMMIT = "d6bfcdb0926173dbb7aa02ceba5aae6250b8aaa6"
+_TENSORFLOW_SHA256 = "ec40e1462239d8783d02f76a43412c8f80bac71ea20e41e1b7729b990aad6923"
 http_archive(
     name = "org_tensorflow",
     urls = [
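The `_TENSORFLOW_SHA256` value pins the archive contents: Bazel downloads the zip and compares its SHA-256 digest against this string, failing the build on any mismatch. A hedged sketch of the same check in Python (the payload below is a placeholder, not the real TensorFlow archive):

```python
import hashlib

def verify_archive(data: bytes, expected_sha256: str) -> bool:
    """Return True if the downloaded bytes match the pinned hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

# Placeholder payload standing in for a downloaded release zip.
payload = b"example archive bytes"
digest = hashlib.sha256(payload).hexdigest()
assert verify_archive(payload, digest)
```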
@@ -97,6 +97,7 @@ for app in ${apps}; do
     if [[ ${target_name} == "holistic_tracking" ||
           ${target_name} == "iris_tracking" ||
           ${target_name} == "pose_tracking" ||
+          ${target_name} == "selfie_segmentation" ||
           ${target_name} == "upper_body_pose_tracking" ]]; then
       graph_suffix="cpu"
     else
@@ -248,12 +248,58 @@ absl::Status MyCalculator::Process() {
}
```

## Calculator options

Calculators accept processing parameters through (1) input stream packets, (2)
input side packets, and (3) calculator options. Calculator options, if
specified, appear as literal values in the `node_options` field of the
`CalculatorGraphConfiguration.Node` message.

```
node {
  calculator: "TfLiteInferenceCalculator"
  input_stream: "TENSORS:main_model_input"
  output_stream: "TENSORS:main_model_output"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] {
      model_path: "mediapipe/models/active_speaker_detection/audio_visual_model.tflite"
    }
  }
}
```

The `node_options` field accepts the proto3 syntax. Alternatively, calculator
options can be specified in the `options` field using proto2 syntax.

```
node: {
  calculator: "IntervalFilterCalculator"
  node_options: {
    [type.googleapis.com/mediapipe.IntervalFilterCalculatorOptions] {
      intervals {
        start_us: 20000
        end_us: 40000
      }
    }
  }
}
```

Not all calculators accept calculator options. In order to accept options, a
calculator will normally define a new protobuf message type to represent its
options, such as `IntervalFilterCalculatorOptions`. The calculator will then
read that protobuf message in its `CalculatorBase::Open` method, and possibly
also in the `CalculatorBase::GetContract` function or its
`CalculatorBase::Process` method. Normally, the new protobuf message type will
be defined as a protobuf schema using a ".proto" file and a
`mediapipe_proto_library()` build rule.

## Example calculator

This section discusses the implementation of `PacketClonerCalculator`, which
does a relatively simple job, and is used in many calculator graphs.
`PacketClonerCalculator` simply produces a copy of its most recent input packets
on demand.

`PacketClonerCalculator` is useful when the timestamps of arriving data packets
are not aligned perfectly. Suppose we have a room with a microphone, light
@@ -279,8 +325,8 @@ input streams:
   imageframe of video data representing video collected from camera in the
   room with timestamp.

Below is the implementation of the `PacketClonerCalculator`. You can see the
`GetContract()`, `Open()`, and `Process()` methods as well as the instance
variable `current_` which holds the most recent input packets.

```c++
@@ -401,6 +447,6 @@ node {
The diagram below shows how the `PacketClonerCalculator` defines its output
packets (bottom) based on its series of input packets (top).

![Graph using PacketClonerCalculator](../images/packet_cloner_calculator.png) |
:--------------------------------------------------------------------------: |
*Each time it receives a packet on its TICK input stream, the PacketClonerCalculator outputs the most recent packet from each of its input streams. The sequence of output packets (bottom) is determined by the sequence of input packets (top) and their timestamps. The timestamps are shown along the right side of the diagram.* |
@@ -111,11 +111,11 @@ component known as an InputStreamHandler.

See [Synchronization](synchronization.md) for more details.

### Real-time streams

MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. Normally, each Calculator runs as soon as
all of its input packets for a given timestamp become available. Calculators
used in real-time graphs need to define output timestamp bounds based on input
timestamp bounds in order to allow downstream calculators to be scheduled
promptly. See [Real-time Streams](realtime_streams.md) for details.
@@ -1,29 +1,28 @@
---
layout: default
title: Real-time Streams
parent: Framework Concepts
nav_order: 6
has_children: true
has_toc: false
---

# Real-time Streams
{: .no_toc }

1. TOC
{:toc}
---

## Real-time timestamps

MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. The MediaPipe framework requires only that
successive packets be assigned monotonically increasing timestamps. By
convention, real-time calculators and graphs use the recording time or the
presentation time of each frame as its timestamp, with each timestamp indicating
the microseconds since `Jan/1/1970:00:00:00`. This allows packets from various
sources to be processed in a globally consistent sequence.
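Under that convention a frame's timestamp is just its wall-clock capture time expressed in microseconds since the Unix epoch. A small illustrative sketch (the function name is ours, not a MediaPipe API):

```python
from datetime import datetime, timezone

def to_microsecond_timestamp(capture_time: datetime) -> int:
    """Convert a capture time to microseconds since 1970-01-01 00:00:00 UTC."""
    return int(capture_time.timestamp() * 1_000_000)

# One second past the epoch -> 1,000,000 microseconds.
ts = to_microsecond_timestamp(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc))
```

Later frames map to strictly larger values, which is exactly the monotonicity the framework requires.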
## Real-time scheduling

Normally, each Calculator runs as soon as all of its input packets for a given
timestamp become available. Normally, this happens when the calculator has

@@ -38,7 +37,7 @@ When a calculator does not produce any output packets for a given timestamp, it
can instead output a "timestamp bound" indicating that no packet will be
produced for that timestamp. This indication is necessary to allow downstream
calculators to run at that timestamp, even though no packet has arrived for
certain streams for that timestamp. This is especially important for real-time
graphs in interactive applications, where it is crucial that each calculator
begin processing as soon as possible.
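The scheduling rule above can be sketched as a tiny simulation: a downstream node may fire at timestamp `t` once every input stream's timestamp bound has moved past `t`, whether a real packet or only a bound advanced it. (Illustrative sketch only; names are ours, not MediaPipe's API.)

```python
def downstream_ready(input_bounds: dict[str, int], t: int) -> bool:
    """True once every input stream's timestamp bound has passed t."""
    return all(bound > t for bound in input_bounds.values())

# Video delivered a packet at t=40000 (bound moves to 40001);
# audio has produced nothing since t=20000, so the node is blocked.
bounds = {"video": 40001, "audio": 20000}
assert not downstream_ready(bounds, 40000)

# The audio calculator emits a timestamp bound instead of a packet,
# and the downstream node can now run at t=40000.
bounds["audio"] = 40001
assert downstream_ready(bounds, 40000)
```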
@@ -83,12 +82,12 @@ For example, `Timestamp(1).NextAllowedInStream() == Timestamp(2)`.

## Propagating timestamp bounds

Calculators that will be used in real-time graphs need to define output
timestamp bounds based on input timestamp bounds in order to allow downstream
calculators to be scheduled promptly. A common pattern is for calculators to
output packets with the same timestamps as their input packets. In this case,
simply outputting a packet on every call to `Calculator::Process` is sufficient
to define output timestamp bounds.

However, calculators are not required to follow this common pattern for output
timestamps, they are only required to choose monotonically increasing output
@@ -16,13 +16,14 @@ nav_order: 4

MediaPipe currently offers the following solutions:

Solution                    | NPM Package                             | Example
--------------------------- | --------------------------------------- | -------
[Face Mesh][F-pg]           | [@mediapipe/face_mesh][F-npm]           | [mediapipe.dev/demo/face_mesh][F-demo]
[Face Detection][Fd-pg]     | [@mediapipe/face_detection][Fd-npm]     | [mediapipe.dev/demo/face_detection][Fd-demo]
[Hands][H-pg]               | [@mediapipe/hands][H-npm]               | [mediapipe.dev/demo/hands][H-demo]
[Holistic][Ho-pg]           | [@mediapipe/holistic][Ho-npm]           | [mediapipe.dev/demo/holistic][Ho-demo]
[Pose][P-pg]                | [@mediapipe/pose][P-npm]                | [mediapipe.dev/demo/pose][P-demo]
[Selfie Segmentation][S-pg] | [@mediapipe/selfie_segmentation][S-npm] | [mediapipe.dev/demo/selfie_segmentation][S-demo]

Click on a solution link above for more information, including API and code
snippets.
@@ -67,11 +68,13 @@ affecting your work, restrict your request to a `<minor>` number. e.g.,
[Fd-pg]: ../solutions/face_detection#javascript-solution-api
[H-pg]: ../solutions/hands#javascript-solution-api
[P-pg]: ../solutions/pose#javascript-solution-api
[S-pg]: ../solutions/selfie_segmentation#javascript-solution-api
[Ho-npm]: https://www.npmjs.com/package/@mediapipe/holistic
[F-npm]: https://www.npmjs.com/package/@mediapipe/face_mesh
[Fd-npm]: https://www.npmjs.com/package/@mediapipe/face_detection
[H-npm]: https://www.npmjs.com/package/@mediapipe/hands
[P-npm]: https://www.npmjs.com/package/@mediapipe/pose
[S-npm]: https://www.npmjs.com/package/@mediapipe/selfie_segmentation
[draw-npm]: https://www.npmjs.com/package/@mediapipe/drawing_utils
[cam-npm]: https://www.npmjs.com/package/@mediapipe/camera_utils
[ctrl-npm]: https://www.npmjs.com/package/@mediapipe/control_utils
@@ -80,15 +83,18 @@ affecting your work, restrict your request to a `<minor>` number. e.g.,
[Fd-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/face_detection
[H-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/hands
[P-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/pose
[S-jsd]: https://www.jsdelivr.com/package/npm/@mediapipe/selfie_segmentation
[Ho-pen]: https://code.mediapipe.dev/codepen/holistic
[F-pen]: https://code.mediapipe.dev/codepen/face_mesh
[Fd-pen]: https://code.mediapipe.dev/codepen/face_detection
[H-pen]: https://code.mediapipe.dev/codepen/hands
[P-pen]: https://code.mediapipe.dev/codepen/pose
[S-pen]: https://code.mediapipe.dev/codepen/selfie_segmentation
[Ho-demo]: https://mediapipe.dev/demo/holistic
[F-demo]: https://mediapipe.dev/demo/face_mesh
[Fd-demo]: https://mediapipe.dev/demo/face_detection
[H-demo]: https://mediapipe.dev/demo/hands
[P-demo]: https://mediapipe.dev/demo/pose
[S-demo]: https://mediapipe.dev/demo/selfie_segmentation
[npm]: https://www.npmjs.com/package/@mediapipe
[codepen]: https://code.mediapipe.dev/codepen
@@ -51,6 +51,7 @@ details in each solution via the links below:
 * [MediaPipe Holistic](../solutions/holistic#python-solution-api)
 * [MediaPipe Objectron](../solutions/objectron#python-solution-api)
 * [MediaPipe Pose](../solutions/pose#python-solution-api)
+* [MediaPipe Selfie Segmentation](../solutions/selfie_segmentation#python-solution-api)

 ## MediaPipe on Google Colab

@@ -62,6 +63,7 @@ details in each solution via the links below:
 * [MediaPipe Pose Colab](https://mediapipe.page.link/pose_py_colab)
 * [MediaPipe Pose Classification Colab (Basic)](https://mediapipe.page.link/pose_classification_basic)
 * [MediaPipe Pose Classification Colab (Extended)](https://mediapipe.page.link/pose_classification_extended)
+* [MediaPipe Selfie Segmentation Colab](https://mediapipe.page.link/selfie_segmentation_py_colab)

 ## MediaPipe Python Framework
docs/images/selfie_segmentation_web.mp4 (new binary file, not shown)
@@ -40,11 +40,12 @@ Hair Segmentation
 [Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
+[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
 [Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
 [Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
 [Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
 [Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
 [Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
 [KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
 [AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
 [MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
@@ -2,7 +2,7 @@
 layout: default
 title: AutoFlip (Saliency-aware Video Cropping)
 parent: Solutions
-nav_order: 13
+nav_order: 14
 ---

 # AutoFlip: Saliency-aware Video Cropping
@@ -2,7 +2,7 @@
 layout: default
 title: Box Tracking
 parent: Solutions
-nav_order: 9
+nav_order: 10
 ---

 # MediaPipe Box Tracking
@@ -68,7 +68,7 @@ normalized to `[0.0, 1.0]` by the image width and height respectively.

Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
[Python Colab](#resources) and the usage example below.

Supported configuration options:
|
|||
mp_drawing = mp.solutions.drawing_utils
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_face_detection.FaceDetection(
|
||||
min_detection_confidence=0.5) as face_detection:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB and process it with MediaPipe Face Detection.
|
||||
results = face_detection.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
|
|
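The `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` call in the snippet above only reorders channels; on a NumPy image array it is equivalent to reversing the last axis. A small sketch of that equivalence (NumPy only, no OpenCV required):

```python
import numpy as np

# A tiny fake 2x2 "BGR" image; pixel values are arbitrary.
bgr = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# Reversing the channel axis swaps B and R, matching cv2.COLOR_BGR2RGB.
rgb = bgr[..., ::-1]

assert rgb[0, 0, 0] == bgr[0, 0, 2]  # R in rgb came from the B slot in bgr
```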
@@ -265,7 +265,7 @@ magnitude of `z` uses roughly the same scale as `x`.

Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
[Python Colab](#resources) and the usage example below.

Supported configuration options:
|
|||
mp_face_mesh = mp.solutions.face_mesh
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
|
||||
with mp_face_mesh.FaceMesh(
|
||||
static_image_mode=True,
|
||||
max_num_faces=1,
|
||||
min_detection_confidence=0.5) as face_mesh:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB before processing.
|
||||
results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
|
|
@@ -2,7 +2,7 @@
 layout: default
 title: Hair Segmentation
 parent: Solutions
-nav_order: 7
+nav_order: 8
 ---

 # MediaPipe Hair Segmentation
@@ -206,7 +206,7 @@ is not the case, please swap the handedness output in the application.

Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
[Python Colab](#resources) and the usage example below.

Supported configuration options:
|
|||
mp_hands = mp.solutions.hands
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_hands.Hands(
|
||||
static_image_mode=True,
|
||||
max_num_hands=2,
|
||||
min_detection_confidence=0.5) as hands:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
# Read an image, flip it around y-axis for correct handedness output (see
|
||||
# above).
|
||||
image = cv2.flip(cv2.imread(file), 1)
|
||||
|
|
|
@@ -201,7 +201,7 @@ A list of 21 hand landmarks on the right hand, in the same representation as

Please first follow general [instructions](../getting_started/python.md) to
install MediaPipe Python package, then learn more in the companion
[Python Colab](#resources) and the usage example below.

Supported configuration options:
|
|||
mp_holistic = mp.solutions.holistic
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_holistic.Holistic(
|
||||
static_image_mode=True,
|
||||
model_complexity=2) as holistic:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
|
|
|
@@ -2,7 +2,7 @@
 layout: default
 title: Instant Motion Tracking
 parent: Solutions
-nav_order: 10
+nav_order: 11
 ---

 # MediaPipe Instant Motion Tracking
@@ -2,7 +2,7 @@
 layout: default
 title: KNIFT (Template-based Feature Matching)
 parent: Solutions
-nav_order: 12
+nav_order: 13
 ---

 # MediaPipe KNIFT
@@ -2,7 +2,7 @@
 layout: default
 title: Dataset Preparation with MediaSequence
 parent: Solutions
-nav_order: 14
+nav_order: 15
 ---

 # Dataset Preparation with MediaSequence
@@ -16,10 +16,15 @@ nav_order: 30

* Face detection model for front-facing/selfie camera:
  [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite),
  [TFLite model quantized for EdgeTPU/Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/models/face-detector-quantized_edgetpu.tflite),
  [Model card](https://mediapipe.page.link/blazeface-mc)
* Face detection model for back-facing camera:
  [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_back.tflite),
  [Model card](https://mediapipe.page.link/blazeface-back-mc)
* Face detection model for back-facing camera (sparse):
  [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_back_sparse.tflite),
  [Model card](https://mediapipe.page.link/blazeface-back-sparse-mc)

### [Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh)
@@ -60,6 +65,12 @@ nav_order: 30
 * Hand recrop model:
   [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/holistic_landmark/hand_recrop.tflite)

+### [Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation)
+
+* [TFLite model (general)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite)
+* [TFLite model (landscape)](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite)
+* [Model card](https://mediapipe.page.link/selfiesegmentation-mc)
+
 ### [Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation)

 * [TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/models/hair_segmentation.tflite)
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Object Detection
|
||||
parent: Solutions
|
||||
nav_order: 8
|
||||
nav_order: 9
|
||||
---
|
||||
|
||||
# MediaPipe Object Detection
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Objectron (3D Object Detection)
|
||||
parent: Solutions
|
||||
nav_order: 11
|
||||
nav_order: 12
|
||||
---
|
||||
|
||||
# MediaPipe Objectron
|
||||
|
@ -277,7 +277,7 @@ following:
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the following usage example.
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -297,11 +297,12 @@ mp_drawing = mp.solutions.drawing_utils
|
|||
mp_objectron = mp.solutions.objectron
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_objectron.Objectron(static_image_mode=True,
|
||||
max_num_objects=5,
|
||||
min_detection_confidence=0.5,
|
||||
model_name='Shoe') as objectron:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
# Convert the BGR image to RGB and process it with MediaPipe Objectron.
|
||||
results = objectron.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
|
|
@ -187,7 +187,7 @@ Naming style may differ slightly across platforms/languages.
|
|||
|
||||
#### pose_landmarks
|
||||
|
||||
A list of pose landmarks. Each lanmark consists of the following:
|
||||
A list of pose landmarks. Each landmark consists of the following:
|
||||
|
||||
* `x` and `y`: Landmark coordinates normalized to `[0.0, 1.0]` by the image
|
||||
width and height respectively.
|
||||
|
@ -202,7 +202,7 @@ A list of pose landmarks. Each lanmark consists of the following:
|
|||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the following usage example.
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
|
@ -219,11 +219,12 @@ mp_drawing = mp.solutions.drawing_utils
|
|||
mp_pose = mp.solutions.pose
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
with mp_pose.Pose(
|
||||
static_image_mode=True,
|
||||
model_complexity=2,
|
||||
min_detection_confidence=0.5) as pose:
|
||||
for idx, file in enumerate(file_list):
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
|
|
286
docs/solutions/selfie_segmentation.md
Normal file
|
@ -0,0 +1,286 @@
|
|||
---
|
||||
layout: default
|
||||
title: Selfie Segmentation
|
||||
parent: Solutions
|
||||
nav_order: 7
|
||||
---
|
||||
|
||||
# MediaPipe Selfie Segmentation
|
||||
{: .no_toc }
|
||||
|
||||
<details close markdown="block">
|
||||
<summary>
|
||||
Table of contents
|
||||
</summary>
|
||||
{: .text-delta }
|
||||
1. TOC
|
||||
{:toc}
|
||||
</details>
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
*Fig 1. Example of MediaPipe Selfie Segmentation.* |
|
||||
:------------------------------------------------: |
|
||||
<video autoplay muted loop preload style="height: auto; width: 480px"><source src="../images/selfie_segmentation_web.mp4" type="video/mp4"></video> |
|
||||
|
||||
MediaPipe Selfie Segmentation segments the prominent humans in the scene. It can
|
||||
run in real-time on both smartphones and laptops. The intended use cases include
|
||||
selfie effects and video conferencing, where the person is close (< 2m) to the
|
||||
camera.
|
||||
|
||||
## Models
|
||||
|
||||
In this solution, we provide two models: general and landscape. Both models are
|
||||
based on
|
||||
[MobileNetV3](https://ai.googleblog.com/2019/11/introducing-next-generation-on-device.html),
|
||||
with modifications to make them more efficient. The general model operates on a
|
||||
256x256x3 (HWC) tensor, and outputs a 256x256x1 tensor representing the
|
||||
segmentation mask. The landscape model is similar to the general model, but
|
||||
operates on a 144x256x3 (HWC) tensor. It has fewer FLOPs than the general model,
|
||||
and therefore runs faster. Note that MediaPipe Selfie Segmentation
|
||||
automatically resizes the input image to the desired tensor dimension before
|
||||
feeding it into the ML models.
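The FLOPs difference follows directly from the input tensor sizes; a quick back-of-the-envelope check (a rough proxy only, since actual FLOPs depend on the full architecture):

```python
# Input tensor element counts for the two models (HWC layout).
general = 256 * 256 * 3    # general model input: 256x256x3
landscape = 144 * 256 * 3  # landscape model input: 144x256x3
ratio = landscape / general  # landscape processes ~56% as many input elements
```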
|
||||
|
||||
The general model is also powering [ML Kit](https://developers.google.com/ml-kit/vision/selfie-segmentation),
|
||||
and a variant of the landscape model is powering [Google Meet](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html).
|
||||
Please find more detail about the models in the [model card](./models.md#selfie_segmentation).
|
||||
|
||||
## ML Pipeline
|
||||
|
||||
The pipeline is implemented as a MediaPipe
|
||||
[graph](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
that uses a
|
||||
[selfie segmentation subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
from the
|
||||
[selfie segmentation module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation).
|
||||
|
||||
Note: To visualize a graph, copy the graph and paste it into
|
||||
[MediaPipe Visualizer](https://viz.mediapipe.dev/). For more information on how
|
||||
to visualize its associated subgraphs, please see
|
||||
[visualizer documentation](../tools/visualizer.md).
|
||||
|
||||
## Solution APIs
|
||||
|
||||
### Cross-platform Configuration Options
|
||||
|
||||
Naming style and availability may differ slightly across platforms/languages.
|
||||
|
||||
#### model_selection
|
||||
|
||||
An integer index `0` or `1`. Use `0` to select the general model, and `1` to
|
||||
select the landscape model (see details in [Models](#models)). Defaults to `0` if
|
||||
not specified.
|
||||
|
||||
### Output
|
||||
|
||||
Naming style may differ slightly across platforms/languages.
|
||||
|
||||
#### segmentation_mask
|
||||
|
||||
The output segmentation mask, which has the same dimension as the input image.
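Since the mask is a float map aligned with the input image, it can be thresholded and broadcast across channels to composite a foreground over a background. A minimal NumPy sketch with illustrative values (a 4x4 stand-in frame, threshold 0.1 as in the usage example):

```python
import numpy as np

# Hypothetical 4x4 RGB frame and a float segmentation mask in [0.0, 1.0].
image = np.full((4, 4, 3), 200, dtype=np.uint8)   # stand-in camera frame
bg_image = np.zeros((4, 4, 3), dtype=np.uint8)    # solid black background
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 0.9                              # "person" region

# Threshold and broadcast the mask to 3 channels, then pick per-pixel
# between foreground and background.
condition = np.stack((mask,) * 3, axis=-1) > 0.1
output = np.where(condition, image, bg_image)
```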
|
||||
|
||||
### Python Solution API
|
||||
|
||||
Please first follow general [instructions](../getting_started/python.md) to
|
||||
install MediaPipe Python package, then learn more in the companion
|
||||
[Python Colab](#resources) and the usage example below.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [model_selection](#model_selection)
|
||||
|
||||
```python
|
||||
import cv2
|
||||
import mediapipe as mp
import numpy as np
||||
mp_drawing = mp.solutions.drawing_utils
|
||||
mp_selfie_segmentation = mp.solutions.selfie_segmentation
|
||||
|
||||
# For static images:
|
||||
IMAGE_FILES = []
|
||||
BG_COLOR = (192, 192, 192) # gray
|
||||
MASK_COLOR = (255, 255, 255) # white
|
||||
with mp_selfie_segmentation.SelfieSegmentation(
|
||||
model_selection=0) as selfie_segmentation:
|
||||
for idx, file in enumerate(IMAGE_FILES):
|
||||
image = cv2.imread(file)
|
||||
image_height, image_width, _ = image.shape
|
||||
# Convert the BGR image to RGB before processing.
|
||||
results = selfie_segmentation.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
|
||||
|
||||
# Draw selfie segmentation on the background image.
|
||||
# To improve segmentation around boundaries, consider applying a joint
|
||||
# bilateral filter to "results.segmentation_mask" with "image".
|
||||
condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
|
||||
# Generate solid color images for showing the output selfie segmentation mask.
|
||||
fg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
fg_image[:] = MASK_COLOR
|
||||
bg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
bg_image[:] = BG_COLOR
|
||||
output_image = np.where(condition, fg_image, bg_image)
|
||||
cv2.imwrite('/tmp/selfie_segmentation_output' + str(idx) + '.png', output_image)
|
||||
|
||||
# For webcam input:
|
||||
BG_COLOR = (192, 192, 192) # gray
|
||||
cap = cv2.VideoCapture(0)
|
||||
with mp_selfie_segmentation.SelfieSegmentation(
|
||||
model_selection=1) as selfie_segmentation:
|
||||
bg_image = None
|
||||
while cap.isOpened():
|
||||
success, image = cap.read()
|
||||
if not success:
|
||||
print("Ignoring empty camera frame.")
|
||||
# If loading a video, use 'break' instead of 'continue'.
|
||||
continue
|
||||
|
||||
# Flip the image horizontally for a later selfie-view display, and convert
|
||||
# the BGR image to RGB.
|
||||
image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
|
||||
# To improve performance, optionally mark the image as not writeable to
|
||||
# pass by reference.
|
||||
image.flags.writeable = False
|
||||
results = selfie_segmentation.process(image)
|
||||
|
||||
image.flags.writeable = True
|
||||
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
|
||||
|
||||
# Draw selfie segmentation on the background image.
|
||||
# To improve segmentation around boundaries, consider applying a joint
|
||||
# bilateral filter to "results.segmentation_mask" with "image".
|
||||
condition = np.stack(
|
||||
(results.segmentation_mask,) * 3, axis=-1) > 0.1
|
||||
# The background can be customized.
|
||||
# a) Load an image (with the same width and height of the input image) to
|
||||
# be the background, e.g., bg_image = cv2.imread('/path/to/image/file')
|
||||
# b) Blur the input image by applying image filtering, e.g.,
|
||||
# bg_image = cv2.GaussianBlur(image,(55,55),0)
|
||||
if bg_image is None:
|
||||
bg_image = np.zeros(image.shape, dtype=np.uint8)
|
||||
bg_image[:] = BG_COLOR
|
||||
output_image = np.where(condition, image, bg_image)
|
||||
|
||||
cv2.imshow('MediaPipe Selfie Segmentation', output_image)
|
||||
if cv2.waitKey(5) & 0xFF == 27:
|
||||
break
|
||||
cap.release()
|
||||
```
|
||||
|
||||
### JavaScript Solution API
|
||||
|
||||
Please first see general [introduction](../getting_started/javascript.md) on
|
||||
MediaPipe in JavaScript, then learn more in the companion [web demo](#resources)
|
||||
and the following usage example.
|
||||
|
||||
Supported configuration options:
|
||||
|
||||
* [modelSelection](#model_selection)
|
||||
|
||||
```html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/selfie_segmentation.js" crossorigin="anonymous"></script>
|
||||
</head>
|
||||
|
||||
<body>
|
||||
<div class="container">
|
||||
<video class="input_video"></video>
|
||||
<canvas class="output_canvas" width="1280px" height="720px"></canvas>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
```javascript
|
||||
<script type="module">
|
||||
const videoElement = document.getElementsByClassName('input_video')[0];
|
||||
const canvasElement = document.getElementsByClassName('output_canvas')[0];
|
||||
const canvasCtx = canvasElement.getContext('2d');
|
||||
|
||||
function onResults(results) {
|
||||
canvasCtx.save();
|
||||
canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
canvasCtx.drawImage(results.segmentationMask, 0, 0,
|
||||
canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite existing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'source-in';
|
||||
canvasCtx.fillStyle = '#00FF00';
|
||||
canvasCtx.fillRect(0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
// Only overwrite missing pixels.
|
||||
canvasCtx.globalCompositeOperation = 'destination-atop';
|
||||
canvasCtx.drawImage(
|
||||
results.image, 0, 0, canvasElement.width, canvasElement.height);
|
||||
|
||||
canvasCtx.restore();
|
||||
}
|
||||
|
||||
const selfieSegmentation = new SelfieSegmentation({locateFile: (file) => {
|
||||
return `https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/${file}`;
|
||||
}});
|
||||
selfieSegmentation.setOptions({
|
||||
modelSelection: 1,
|
||||
});
|
||||
selfieSegmentation.onResults(onResults);
|
||||
|
||||
const camera = new Camera(videoElement, {
|
||||
onFrame: async () => {
|
||||
await selfieSegmentation.send({image: videoElement});
|
||||
},
|
||||
width: 1280,
|
||||
height: 720
|
||||
});
|
||||
camera.start();
|
||||
</script>
|
||||
```
|
||||
|
||||
## Example Apps
|
||||
|
||||
Please first see general instructions for
|
||||
[Android](../getting_started/android.md), [iOS](../getting_started/ios.md), and
|
||||
[desktop](../getting_started/cpp.md) on how to build MediaPipe examples.
|
||||
|
||||
Note: To visualize a graph, copy the graph and paste it into
|
||||
[MediaPipe Visualizer](https://viz.mediapipe.dev/). For more information on how
|
||||
to visualize its associated subgraphs, please see
|
||||
[visualizer documentation](../tools/visualizer.md).
|
||||
|
||||
### Mobile
|
||||
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
* Android target:
|
||||
[(or download prebuilt ARM64 APK)](https://drive.google.com/file/d/1DoeyGzMmWUsjfVgZfGGecrn7GKzYcEAo/view?usp=sharing)
|
||||
[`mediapipe/examples/android/src/java/com/google/mediapipe/apps/selfiesegmentationgpu:selfiesegmentationgpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/selfiesegmentationgpu/BUILD)
|
||||
* iOS target:
|
||||
[`mediapipe/examples/ios/selfiesegmentationgpu:SelfieSegmentationGpuApp`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/ios/selfiesegmentationgpu/BUILD)
|
||||
|
||||
### Desktop
|
||||
|
||||
Please first see general instructions for [desktop](../getting_started/cpp.md)
|
||||
on how to build MediaPipe examples.
|
||||
|
||||
* Running on CPU
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_cpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_cpu.pbtxt)
|
||||
* Target:
|
||||
[`mediapipe/examples/desktop/selfie_segmentation:selfie_segmentation_cpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/selfie_segmentation/BUILD)
|
||||
* Running on GPU
|
||||
* Graph:
|
||||
[`mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt`](https://github.com/google/mediapipe/tree/master/mediapipe/graphs/selfie_segmentation/selfie_segmentation_gpu.pbtxt)
|
||||
* Target:
|
||||
[`mediapipe/examples/desktop/selfie_segmentation:selfie_segmentation_gpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/selfie_segmentation/BUILD)
|
||||
|
||||
## Resources
|
||||
|
||||
* Google AI Blog:
|
||||
[Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
|
||||
* [ML Kit Selfie Segmentation API](https://developers.google.com/ml-kit/vision/selfie-segmentation)
|
||||
* [Models and model cards](./models.md#selfie_segmentation)
|
||||
* [Web demo](https://code.mediapipe.dev/codepen/selfie_segmentation)
|
||||
* [Python Colab](https://mediapipe.page.link/selfie_segmentation_py_colab)
|
|
@ -24,11 +24,12 @@ has_toc: false
|
|||
[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
|
||||
[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
|
||||
[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
|
||||
[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
|
||||
[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
|
||||
[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | |
|
||||
[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
|
||||
[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
|
||||
[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: YouTube-8M Feature Extraction and Model Inference
|
||||
parent: Solutions
|
||||
nav_order: 15
|
||||
nav_order: 16
|
||||
---
|
||||
|
||||
# YouTube-8M Feature Extraction and Model Inference
|
||||
|
|
|
@ -16,6 +16,7 @@
|
|||
"mediapipe/examples/ios/objectdetectiongpu/BUILD",
|
||||
"mediapipe/examples/ios/objectdetectiontrackinggpu/BUILD",
|
||||
"mediapipe/examples/ios/posetrackinggpu/BUILD",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu/BUILD",
|
||||
"mediapipe/framework/BUILD",
|
||||
"mediapipe/gpu/BUILD",
|
||||
"mediapipe/objc/BUILD",
|
||||
|
@ -35,6 +36,7 @@
|
|||
"//mediapipe/examples/ios/objectdetectiongpu:ObjectDetectionGpuApp",
|
||||
"//mediapipe/examples/ios/objectdetectiontrackinggpu:ObjectDetectionTrackingGpuApp",
|
||||
"//mediapipe/examples/ios/posetrackinggpu:PoseTrackingGpuApp",
|
||||
"//mediapipe/examples/ios/selfiesegmentationgpu:SelfieSegmentationGpuApp",
|
||||
"//mediapipe/objc:mediapipe_framework_ios"
|
||||
],
|
||||
"optionSet" : {
|
||||
|
@ -103,6 +105,7 @@
|
|||
"mediapipe/examples/ios/objectdetectioncpu",
|
||||
"mediapipe/examples/ios/objectdetectiongpu",
|
||||
"mediapipe/examples/ios/posetrackinggpu",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu",
|
||||
"mediapipe/framework",
|
||||
"mediapipe/framework/deps",
|
||||
"mediapipe/framework/formats",
|
||||
|
@ -120,6 +123,7 @@
|
|||
"mediapipe/graphs/hand_tracking",
|
||||
"mediapipe/graphs/object_detection",
|
||||
"mediapipe/graphs/pose_tracking",
|
||||
"mediapipe/graphs/selfie_segmentation",
|
||||
"mediapipe/models",
|
||||
"mediapipe/modules",
|
||||
"mediapipe/objc",
|
||||
|
|
|
@ -22,6 +22,7 @@
|
|||
"mediapipe/examples/ios/objectdetectiongpu",
|
||||
"mediapipe/examples/ios/objectdetectiontrackinggpu",
|
||||
"mediapipe/examples/ios/posetrackinggpu",
|
||||
"mediapipe/examples/ios/selfiesegmentationgpu",
|
||||
"mediapipe/objc"
|
||||
],
|
||||
"projectName" : "Mediapipe",
|
||||
|
|
|
@ -37,6 +37,22 @@ constexpr char kImageFrameTag[] = "IMAGE";
|
|||
constexpr char kMaskCpuTag[] = "MASK";
|
||||
constexpr char kGpuBufferTag[] = "IMAGE_GPU";
|
||||
constexpr char kMaskGpuTag[] = "MASK_GPU";
|
||||
|
||||
inline cv::Vec3b Blend(const cv::Vec3b& color1, const cv::Vec3b& color2,
|
||||
float weight, int invert_mask,
|
||||
int adjust_with_luminance) {
|
||||
weight = (1 - invert_mask) * weight + invert_mask * (1.0f - weight);
|
||||
|
||||
float luminance =
|
||||
(1 - adjust_with_luminance) * 1.0f +
|
||||
adjust_with_luminance *
|
||||
(color1[0] * 0.299 + color1[1] * 0.587 + color1[2] * 0.114) / 255;
|
||||
|
||||
float mix_value = weight * luminance;
|
||||
|
||||
return color1 * (1.0 - mix_value) + color2 * mix_value;
|
||||
}
|
||||
|
||||
} // namespace
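The branchless `Blend` helper above folds the `invert_mask` and `adjust_with_luminance` flags into arithmetic rather than conditionals. A Python sketch of the same weight math (mirroring the C++ channel order; values illustrative):

```python
def blend(color1, color2, weight, invert_mask, adjust_with_luminance):
    """Python sketch of the C++ Blend helper: mix color1 toward color2."""
    # Optionally flip the mask meaning: w' = (1 - i) * w + i * (1 - w).
    weight = (1 - invert_mask) * weight + invert_mask * (1.0 - weight)
    # Optionally scale by the pixel's luminance (Rec. 601 weights, same
    # channel order as the C++ code); with the flag off this factor is 1.0.
    luminance = ((1 - adjust_with_luminance) * 1.0 +
                 adjust_with_luminance *
                 (color1[0] * 0.299 + color1[1] * 0.587 + color1[2] * 0.114)
                 / 255)
    mix_value = weight * luminance
    return tuple(c1 * (1.0 - mix_value) + c2 * mix_value
                 for c1, c2 in zip(color1, color2))
```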
|
||||
|
||||
namespace mediapipe {
|
||||
|
@ -44,15 +60,14 @@ namespace mediapipe {
|
|||
// A calculator to recolor a masked area of an image to a specified color.
|
||||
//
|
||||
// A mask image is used to specify where to overlay a user defined color.
|
||||
// The luminance of the input image is used to adjust the blending weight,
|
||||
// to help preserve image textures.
|
||||
//
|
||||
// Inputs:
|
||||
// One of the following IMAGE tags:
|
||||
// IMAGE: An ImageFrame input image, RGB or RGBA.
|
||||
// IMAGE: An ImageFrame input image in ImageFormat::SRGB.
|
||||
// IMAGE_GPU: A GpuBuffer input image, RGBA.
|
||||
// One of the following MASK tags:
|
||||
// MASK: An ImageFrame input mask, Gray, RGB or RGBA.
|
||||
// MASK: An ImageFrame input mask in ImageFormat::GRAY8, SRGB, SRGBA, or
|
||||
// VEC32F1
|
||||
// MASK_GPU: A GpuBuffer input mask, RGBA.
|
||||
// Output:
|
||||
// One of the following IMAGE tags:
|
||||
|
@ -98,10 +113,12 @@ class RecolorCalculator : public CalculatorBase {
|
|||
void GlRender();
|
||||
|
||||
bool initialized_ = false;
|
||||
std::vector<float> color_;
|
||||
std::vector<uint8> color_;
|
||||
mediapipe::RecolorCalculatorOptions::MaskChannel mask_channel_;
|
||||
|
||||
bool use_gpu_ = false;
|
||||
bool invert_mask_ = false;
|
||||
bool adjust_with_luminance_ = false;
|
||||
#if !MEDIAPIPE_DISABLE_GPU
|
||||
mediapipe::GlCalculatorHelper gpu_helper_;
|
||||
GLuint program_ = 0;
|
||||
|
@ -233,11 +250,15 @@ absl::Status RecolorCalculator::RenderCpu(CalculatorContext* cc) {
|
|||
}
|
||||
cv::Mat mask_full;
|
||||
cv::resize(mask_mat, mask_full, input_mat.size());
|
||||
const cv::Vec3b recolor = {color_[0], color_[1], color_[2]};
|
||||
|
||||
auto output_img = absl::make_unique<ImageFrame>(
|
||||
input_img.Format(), input_mat.cols, input_mat.rows);
|
||||
cv::Mat output_mat = mediapipe::formats::MatView(output_img.get());
|
||||
|
||||
const int invert_mask = invert_mask_ ? 1 : 0;
|
||||
const int adjust_with_luminance = adjust_with_luminance_ ? 1 : 0;
|
||||
|
||||
// From GPU shader:
|
||||
/*
|
||||
vec4 weight = texture2D(mask, sample_coordinate);
|
||||
|
@ -249,18 +270,23 @@ absl::Status RecolorCalculator::RenderCpu(CalculatorContext* cc) {
|
|||
|
||||
fragColor = mix(color1, color2, mix_value);
|
||||
*/
|
||||
for (int i = 0; i < output_mat.rows; ++i) {
|
||||
for (int j = 0; j < output_mat.cols; ++j) {
|
||||
float weight = mask_full.at<uchar>(i, j) * (1.0 / 255.0);
|
||||
cv::Vec3f color1 = input_mat.at<cv::Vec3b>(i, j);
|
||||
cv::Vec3f color2 = {color_[0], color_[1], color_[2]};
|
||||
|
||||
float luminance =
|
||||
(color1[0] * 0.299 + color1[1] * 0.587 + color1[2] * 0.114) / 255;
|
||||
float mix_value = weight * luminance;
|
||||
|
||||
cv::Vec3b mix_color = color1 * (1.0 - mix_value) + color2 * mix_value;
|
||||
output_mat.at<cv::Vec3b>(i, j) = mix_color;
|
||||
if (mask_img.Format() == ImageFormat::VEC32F1) {
|
||||
for (int i = 0; i < output_mat.rows; ++i) {
|
||||
for (int j = 0; j < output_mat.cols; ++j) {
|
||||
const float weight = mask_full.at<float>(i, j);
|
||||
output_mat.at<cv::Vec3b>(i, j) =
|
||||
Blend(input_mat.at<cv::Vec3b>(i, j), recolor, weight, invert_mask,
|
||||
adjust_with_luminance);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
for (int i = 0; i < output_mat.rows; ++i) {
|
||||
for (int j = 0; j < output_mat.cols; ++j) {
|
||||
const float weight = mask_full.at<uchar>(i, j) * (1.0 / 255.0);
|
||||
output_mat.at<cv::Vec3b>(i, j) =
|
||||
Blend(input_mat.at<cv::Vec3b>(i, j), recolor, weight, invert_mask,
|
||||
adjust_with_luminance);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -385,6 +411,9 @@ absl::Status RecolorCalculator::LoadOptions(CalculatorContext* cc) {
|
|||
color_.push_back(options.color().g());
|
||||
color_.push_back(options.color().b());
|
||||
|
||||
invert_mask_ = options.invert_mask();
|
||||
adjust_with_luminance_ = options.adjust_with_luminance();
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
|
@ -435,13 +464,20 @@ absl::Status RecolorCalculator::InitGpu(CalculatorContext* cc) {
|
|||
uniform sampler2D frame;
|
||||
uniform sampler2D mask;
|
||||
uniform vec3 recolor;
|
||||
uniform float invert_mask;
|
||||
uniform float adjust_with_luminance;
|
||||
|
||||
void main() {
|
||||
vec4 weight = texture2D(mask, sample_coordinate);
|
||||
vec4 color1 = texture2D(frame, sample_coordinate);
|
||||
vec4 color2 = vec4(recolor, 1.0);
|
||||
|
||||
float luminance = dot(color1.rgb, vec3(0.299, 0.587, 0.114));
|
||||
weight = mix(weight, 1.0 - weight, invert_mask);
|
||||
|
||||
float luminance = mix(1.0,
|
||||
dot(color1.rgb, vec3(0.299, 0.587, 0.114)),
|
||||
adjust_with_luminance);
|
||||
|
||||
float mix_value = weight.MASK_COMPONENT * luminance;
|
||||
|
||||
fragColor = mix(color1, color2, mix_value);
|
||||
|
@ -458,6 +494,10 @@ absl::Status RecolorCalculator::InitGpu(CalculatorContext* cc) {
|
|||
glUniform1i(glGetUniformLocation(program_, "mask"), 2);
|
||||
glUniform3f(glGetUniformLocation(program_, "recolor"), color_[0] / 255.0,
|
||||
color_[1] / 255.0, color_[2] / 255.0);
|
||||
glUniform1f(glGetUniformLocation(program_, "invert_mask"),
|
||||
invert_mask_ ? 1.0f : 0.0f);
|
||||
glUniform1f(glGetUniformLocation(program_, "adjust_with_luminance"),
|
||||
adjust_with_luminance_ ? 1.0f : 0.0f);
|
||||
#endif // !MEDIAPIPE_DISABLE_GPU
|
||||
|
||||
return absl::OkStatus();
|
||||
|
|
|
@ -36,4 +36,11 @@ message RecolorCalculatorOptions {
|
|||
// Color to blend into input image where mask is > 0.
|
||||
// The blending is based on the input image luminosity.
|
||||
optional Color color = 2;
|
||||
|
||||
// Swap the meaning of mask values for foreground/background.
|
||||
optional bool invert_mask = 3 [default = false];
|
||||
|
||||
// Whether to use the luminance of the input image to further adjust the
|
||||
// blending weight, to help preserve image textures.
|
||||
optional bool adjust_with_luminance = 4 [default = true];
|
||||
}
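In a graph config, the new fields could be set alongside the existing `color` and `mask_channel` options, for example (stream names here are illustrative):

```
node {
  calculator: "RecolorCalculator"
  input_stream: "IMAGE:input_video"        # illustrative stream names
  input_stream: "MASK:segmentation_mask"
  output_stream: "IMAGE:output_video"
  options: {
    [mediapipe.RecolorCalculatorOptions.ext] {
      color { r: 0 g: 125 b: 0 }
      mask_channel: RED
      invert_mask: true                    # treat mask as background
      adjust_with_luminance: false         # skip luminance weighting
    }
  }
}
```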
|
||||
|
|
|
@ -753,3 +753,76 @@ cc_test(
|
|||
"//mediapipe/framework/port:gtest_main",
|
||||
],
|
||||
)
|
||||
|
||||
# Copied from /mediapipe/calculators/tflite/BUILD
|
||||
selects.config_setting_group(
|
||||
name = "gpu_inference_disabled",
|
||||
match_any = [
|
||||
"//mediapipe/gpu:disable_gpu",
|
||||
],
|
||||
)
|
||||
|
||||
mediapipe_proto_library(
|
||||
name = "tensors_to_segmentation_calculator_proto",
|
||||
srcs = ["tensors_to_segmentation_calculator.proto"],
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
"//mediapipe/framework:calculator_options_proto",
|
||||
"//mediapipe/framework:calculator_proto",
|
||||
"//mediapipe/gpu:gpu_origin_proto",
|
||||
],
|
||||
)
|
||||
|
||||
cc_library(
|
||||
name = "tensors_to_segmentation_calculator",
|
||||
srcs = ["tensors_to_segmentation_calculator.cc"],
|
||||
copts = select({
|
||||
"//mediapipe:apple": [
|
||||
"-x objective-c++",
|
||||
"-fobjc-arc", # enable reference-counting
|
||||
],
|
||||
"//conditions:default": [],
|
||||
}),
|
||||
visibility = ["//visibility:public"],
|
||||
deps = [
|
||||
":tensors_to_segmentation_calculator_cc_proto",
|
||||
"@com_google_absl//absl/strings:str_format",
|
||||
"@com_google_absl//absl/strings",
|
||||
"@com_google_absl//absl/types:span",
|
||||
"//mediapipe/framework/formats:image",
|
||||
"//mediapipe/framework/formats:image_frame",
|
||||
"//mediapipe/framework/formats:image_opencv",
|
||||
"//mediapipe/framework/formats:tensor",
|
||||
"//mediapipe/framework/port:opencv_imgproc",
|
||||
"//mediapipe/framework/port:ret_check",
|
||||
"//mediapipe/framework:calculator_context",
|
||||
"//mediapipe/framework:calculator_framework",
|
||||
"//mediapipe/framework:port",
|
||||
"//mediapipe/util:resource_util",
|
||||
"@org_tensorflow//tensorflow/lite:framework",
|
||||
"//mediapipe/gpu:gpu_origin_cc_proto",
|
||||
"//mediapipe/framework/port:statusor",
|
||||
] + selects.with_or({
|
||||
"//mediapipe/gpu:disable_gpu": [],
|
||||
"//conditions:default": [
|
||||
"//mediapipe/gpu:gl_calculator_helper",
|
||||
"//mediapipe/gpu:gl_simple_shaders",
|
||||
"//mediapipe/gpu:gpu_buffer",
|
||||
"//mediapipe/gpu:shader_util",
|
||||
],
|
||||
}) + selects.with_or({
|
||||
":gpu_inference_disabled": [],
|
||||
"//mediapipe:ios": [
|
||||
"//mediapipe/gpu:MPPMetalUtil",
|
||||
"//mediapipe/gpu:MPPMetalHelper",
|
||||
],
|
||||
"//conditions:default": [
|
||||
"@org_tensorflow//tensorflow/lite/delegates/gpu:gl_delegate",
|
||||
"@org_tensorflow//tensorflow/lite/delegates/gpu/gl:gl_program",
|
||||
"@org_tensorflow//tensorflow/lite/delegates/gpu/gl:gl_shader",
|
||||
"@org_tensorflow//tensorflow/lite/delegates/gpu/gl:gl_texture",
|
||||
"@org_tensorflow//tensorflow/lite/delegates/gpu/gl/converters:util",
|
||||
],
|
||||
}),
|
||||
alwayslink = 1,
|
||||
)
|
||||
|
|
|
@ -105,6 +105,15 @@ void ConvertAnchorsToRawValues(const std::vector<Anchor>& anchors,
|
|||
// for anchors (e.g. for SSD models) depend on the outputs of the
|
||||
// detection model. The size of anchor tensor must be (num_boxes *
|
||||
// 4).
|
||||
//
|
||||
// Input side packet:
|
||||
// ANCHORS (optional) - The anchors used for decoding the bounding boxes, as a
|
||||
// vector of `Anchor` protos. Not required if post-processing is built-in
|
||||
// the model.
|
||||
// IGNORE_CLASSES (optional) - The list of class ids that should be ignored, as
|
||||
// a vector of integers. It overrides the corresponding field in the
|
||||
// calculator options.
|
||||
//
|
||||
// Output:
|
||||
// DETECTIONS - Result MediaPipe detections.
|
||||
//
|
||||
|
@ -132,8 +141,11 @@ class TensorsToDetectionsCalculator : public Node {
|
|||
static constexpr Input<std::vector<Tensor>> kInTensors{"TENSORS"};
|
||||
static constexpr SideInput<std::vector<Anchor>>::Optional kInAnchors{
|
||||
"ANCHORS"};
|
||||
static constexpr SideInput<std::vector<int>>::Optional kSideInIgnoreClasses{
|
||||
"IGNORE_CLASSES"};
|
||||
static constexpr Output<std::vector<Detection>> kOutDetections{"DETECTIONS"};
|
||||
MEDIAPIPE_NODE_CONTRACT(kInTensors, kInAnchors, kOutDetections);
|
||||
MEDIAPIPE_NODE_CONTRACT(kInTensors, kInAnchors, kSideInIgnoreClasses,
|
||||
kOutDetections);
|
||||
static absl::Status UpdateContract(CalculatorContract* cc);
|
||||
|
||||
absl::Status Open(CalculatorContext* cc) override;
|
||||
|
@ -566,8 +578,15 @@ absl::Status TensorsToDetectionsCalculator::LoadOptions(CalculatorContext* cc) {
|
|||
kNumCoordsPerBox,
|
||||
num_coords_);
|
||||
|
||||
for (int i = 0; i < options_.ignore_classes_size(); ++i) {
|
||||
ignore_classes_.insert(options_.ignore_classes(i));
|
||||
if (kSideInIgnoreClasses(cc).IsConnected()) {
|
||||
RET_CHECK(!kSideInIgnoreClasses(cc).IsEmpty());
|
||||
for (int ignore_class : *kSideInIgnoreClasses(cc)) {
|
||||
ignore_classes_.insert(ignore_class);
|
||||
}
|
||||
} else {
|
||||
for (int i = 0; i < options_.ignore_classes_size(); ++i) {
|
||||
ignore_classes_.insert(options_.ignore_classes(i));
|
||||
}
|
||||
}
|
||||
|
||||
return absl::OkStatus();
|
||||
|
|
|
@@ -56,7 +56,7 @@ message TensorsToDetectionsCalculatorOptions {
   // [x_center, y_center, w, h].
   optional bool reverse_output_order = 14 [default = false];
   // The ids of classes that should be ignored during decoding the score for
-  // each predicted box.
+  // each predicted box. Can be overridden with IGNORE_CLASSES side packet.
   repeated int32 ignore_classes = 8;

   optional bool sigmoid_score = 15 [default = false];
@@ -0,0 +1,885 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <vector>

#include "absl/strings/str_format.h"
#include "absl/types/span.h"
#include "mediapipe/calculators/tensor/tensors_to_segmentation_calculator.pb.h"
#include "mediapipe/framework/calculator_context.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/image.h"
#include "mediapipe/framework/formats/image_opencv.h"
#include "mediapipe/framework/formats/tensor.h"
#include "mediapipe/framework/port.h"
#include "mediapipe/framework/port/opencv_imgproc_inc.h"
#include "mediapipe/framework/port/ret_check.h"
#include "mediapipe/framework/port/statusor.h"
#include "mediapipe/gpu/gpu_origin.pb.h"
#include "mediapipe/util/resource_util.h"
#include "tensorflow/lite/interpreter.h"

#if !MEDIAPIPE_DISABLE_GPU
#include "mediapipe/gpu/gl_calculator_helper.h"
#include "mediapipe/gpu/gl_simple_shaders.h"
#include "mediapipe/gpu/gpu_buffer.h"
#include "mediapipe/gpu/shader_util.h"
#endif  // !MEDIAPIPE_DISABLE_GPU

#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
#include "tensorflow/lite/delegates/gpu/gl/converters/util.h"
#include "tensorflow/lite/delegates/gpu/gl/gl_program.h"
#include "tensorflow/lite/delegates/gpu/gl/gl_shader.h"
#include "tensorflow/lite/delegates/gpu/gl/gl_texture.h"
#include "tensorflow/lite/delegates/gpu/gl_delegate.h"
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

#if MEDIAPIPE_METAL_ENABLED
#import <CoreVideo/CoreVideo.h>
#import <Metal/Metal.h>
#import <MetalKit/MetalKit.h>

#import "mediapipe/gpu/MPPMetalHelper.h"
#include "mediapipe/gpu/MPPMetalUtil.h"
#endif  // MEDIAPIPE_METAL_ENABLED

namespace {
constexpr int kWorkgroupSize = 8;  // Block size for GPU shader.
enum { ATTRIB_VERTEX, ATTRIB_TEXTURE_POSITION, NUM_ATTRIBUTES };

// Commonly used to compute the number of blocks to launch in a kernel.
int NumGroups(const int size, const int group_size) {  // NOLINT
  return (size + group_size - 1) / group_size;
}

bool CanUseGpu() {
#if !MEDIAPIPE_DISABLE_GPU || MEDIAPIPE_METAL_ENABLED
  // TODO: Configure GPU usage policy in individual calculators.
  constexpr bool kAllowGpuProcessing = true;
  return kAllowGpuProcessing;
#else
  return false;
#endif  // !MEDIAPIPE_DISABLE_GPU || MEDIAPIPE_METAL_ENABLED
}

constexpr char kTensorsTag[] = "TENSORS";
constexpr char kOutputSizeTag[] = "OUTPUT_SIZE";
constexpr char kMaskTag[] = "MASK";

absl::StatusOr<std::tuple<int, int, int>> GetHwcFromDims(
    const std::vector<int>& dims) {
  if (dims.size() == 3) {
    return std::make_tuple(dims[0], dims[1], dims[2]);
  } else if (dims.size() == 4) {
    // BHWC format check B == 1
    RET_CHECK_EQ(1, dims[0]) << "Expected batch to be 1 for BHWC heatmap";
    return std::make_tuple(dims[1], dims[2], dims[3]);
  } else {
    RET_CHECK(false) << "Invalid shape for segmentation tensor " << dims.size();
  }
}
}  // namespace
namespace mediapipe {

#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
using ::tflite::gpu::gl::GlProgram;
using ::tflite::gpu::gl::GlShader;
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

// Converts Tensors from a tflite segmentation model to an image mask.
//
// Performs optional upscale to OUTPUT_SIZE dimensions if provided,
// otherwise the mask is the same size as input tensor.
//
// If at least one input tensor is already on GPU, processing happens on GPU and
// the output mask is also stored on GPU. Otherwise, processing and the output
// mask are both on CPU.
//
// On GPU, the mask is an RGBA image, in both the R & A channels, scaled 0-1.
// On CPU, the mask is a ImageFormat::VEC32F1 image, with values scaled 0-1.
//
// Inputs:
//   One of the following TENSORS tags:
//   TENSORS: Vector of Tensor,
//            The tensor dimensions are specified in this calculator's options.
//   OUTPUT_SIZE(optional): std::pair<int, int>,
//                          If provided, the size to upscale mask to.
//
// Output:
//   MASK: An Image output mask, RGBA(GPU) / VEC32F1(CPU).
//
// Options:
//   See tensors_to_segmentation_calculator.proto
//
// Usage example:
// node {
//   calculator: "TensorsToSegmentationCalculator"
//   input_stream: "TENSORS:tensors"
//   input_stream: "OUTPUT_SIZE:size"
//   output_stream: "MASK:hair_mask"
//   node_options: {
//     [mediapipe.TensorsToSegmentationCalculatorOptions] {
//       output_layer_index: 1
//       # gpu_origin: CONVENTIONAL # or TOP_LEFT
//     }
//   }
// }
//
// Currently only OpenGLES 3.1 and CPU backends supported.
// TODO Refactor and add support for other backends/platforms.
//
class TensorsToSegmentationCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc);

  absl::Status Open(CalculatorContext* cc) override;
  absl::Status Process(CalculatorContext* cc) override;
  absl::Status Close(CalculatorContext* cc) override;

 private:
  absl::Status LoadOptions(CalculatorContext* cc);
  absl::Status InitGpu(CalculatorContext* cc);
  absl::Status ProcessGpu(CalculatorContext* cc);
  absl::Status ProcessCpu(CalculatorContext* cc);
  void GlRender();

  bool DoesGpuTextureStartAtBottom() {
    return options_.gpu_origin() != mediapipe::GpuOrigin_Mode_TOP_LEFT;
  }

  template <class T>
  absl::Status ApplyActivation(cv::Mat& tensor_mat, cv::Mat* small_mask_mat);

  ::mediapipe::TensorsToSegmentationCalculatorOptions options_;

#if !MEDIAPIPE_DISABLE_GPU
  mediapipe::GlCalculatorHelper gpu_helper_;
  GLuint upsample_program_;
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
  std::unique_ptr<GlProgram> mask_program_31_;
#else
  GLuint mask_program_20_;
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
#if MEDIAPIPE_METAL_ENABLED
  MPPMetalHelper* metal_helper_ = nullptr;
  id<MTLComputePipelineState> mask_program_;
#endif  // MEDIAPIPE_METAL_ENABLED
#endif  // !MEDIAPIPE_DISABLE_GPU
};
REGISTER_CALCULATOR(TensorsToSegmentationCalculator);

// static
absl::Status TensorsToSegmentationCalculator::GetContract(
    CalculatorContract* cc) {
  RET_CHECK(!cc->Inputs().GetTags().empty());
  RET_CHECK(!cc->Outputs().GetTags().empty());

  // Inputs.
  cc->Inputs().Tag(kTensorsTag).Set<std::vector<Tensor>>();
  if (cc->Inputs().HasTag(kOutputSizeTag)) {
    cc->Inputs().Tag(kOutputSizeTag).Set<std::pair<int, int>>();
  }

  // Outputs.
  cc->Outputs().Tag(kMaskTag).Set<Image>();

  if (CanUseGpu()) {
#if !MEDIAPIPE_DISABLE_GPU
    MP_RETURN_IF_ERROR(mediapipe::GlCalculatorHelper::UpdateContract(cc));
#if MEDIAPIPE_METAL_ENABLED
    MP_RETURN_IF_ERROR([MPPMetalHelper updateContract:cc]);
#endif  // MEDIAPIPE_METAL_ENABLED
#endif  // !MEDIAPIPE_DISABLE_GPU
  }

  return absl::OkStatus();
}

absl::Status TensorsToSegmentationCalculator::Open(CalculatorContext* cc) {
  cc->SetOffset(TimestampDiff(0));
  bool use_gpu = false;

  if (CanUseGpu()) {
#if !MEDIAPIPE_DISABLE_GPU
    use_gpu = true;
    MP_RETURN_IF_ERROR(gpu_helper_.Open(cc));
#if MEDIAPIPE_METAL_ENABLED
    metal_helper_ = [[MPPMetalHelper alloc] initWithCalculatorContext:cc];
    RET_CHECK(metal_helper_);
#endif  // MEDIAPIPE_METAL_ENABLED
#endif  // !MEDIAPIPE_DISABLE_GPU
  }

  MP_RETURN_IF_ERROR(LoadOptions(cc));

  if (use_gpu) {
#if !MEDIAPIPE_DISABLE_GPU
    MP_RETURN_IF_ERROR(InitGpu(cc));
#else
    RET_CHECK_FAIL() << "GPU processing disabled.";
#endif  // !MEDIAPIPE_DISABLE_GPU
  }

  return absl::OkStatus();
}

absl::Status TensorsToSegmentationCalculator::Process(CalculatorContext* cc) {
  if (cc->Inputs().Tag(kTensorsTag).IsEmpty()) {
    return absl::OkStatus();
  }

  const auto& input_tensors =
      cc->Inputs().Tag(kTensorsTag).Get<std::vector<Tensor>>();

  bool use_gpu = false;
  if (CanUseGpu()) {
    // Use GPU processing only if at least one input tensor is already on GPU.
    for (const auto& tensor : input_tensors) {
      if (tensor.ready_on_gpu()) {
        use_gpu = true;
        break;
      }
    }
  }

  // Validate tensor channels and activation type.
  {
    RET_CHECK(!input_tensors.empty());
    ASSIGN_OR_RETURN(auto hwc, GetHwcFromDims(input_tensors[0].shape().dims));
    int tensor_channels = std::get<2>(hwc);
    typedef mediapipe::TensorsToSegmentationCalculatorOptions Options;
    switch (options_.activation()) {
      case Options::NONE:
        RET_CHECK_EQ(tensor_channels, 1);
        break;
      case Options::SIGMOID:
        RET_CHECK_EQ(tensor_channels, 1);
        break;
      case Options::SOFTMAX:
        RET_CHECK_EQ(tensor_channels, 2);
        break;
    }
  }

  if (use_gpu) {
#if !MEDIAPIPE_DISABLE_GPU
    MP_RETURN_IF_ERROR(gpu_helper_.RunInGlContext([this, cc]() -> absl::Status {
      MP_RETURN_IF_ERROR(ProcessGpu(cc));
      return absl::OkStatus();
    }));
#else
    RET_CHECK_FAIL() << "GPU processing disabled.";
#endif  // !MEDIAPIPE_DISABLE_GPU
  } else {
    MP_RETURN_IF_ERROR(ProcessCpu(cc));
  }

  return absl::OkStatus();
}

absl::Status TensorsToSegmentationCalculator::Close(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  gpu_helper_.RunInGlContext([this] {
    if (upsample_program_) glDeleteProgram(upsample_program_);
    upsample_program_ = 0;
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
    mask_program_31_.reset();
#else
    if (mask_program_20_) glDeleteProgram(mask_program_20_);
    mask_program_20_ = 0;
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
#if MEDIAPIPE_METAL_ENABLED
    mask_program_ = nil;
#endif  // MEDIAPIPE_METAL_ENABLED
  });
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

absl::Status TensorsToSegmentationCalculator::ProcessCpu(
    CalculatorContext* cc) {
  // Get input streams, and dimensions.
  const auto& input_tensors =
      cc->Inputs().Tag(kTensorsTag).Get<std::vector<Tensor>>();
  ASSIGN_OR_RETURN(auto hwc, GetHwcFromDims(input_tensors[0].shape().dims));
  auto [tensor_height, tensor_width, tensor_channels] = hwc;
  int output_width = tensor_width, output_height = tensor_height;
  if (cc->Inputs().HasTag(kOutputSizeTag)) {
    const auto& size =
        cc->Inputs().Tag(kOutputSizeTag).Get<std::pair<int, int>>();
    output_width = size.first;
    output_height = size.second;
  }

  // Create initial working mask.
  cv::Mat small_mask_mat(cv::Size(tensor_width, tensor_height), CV_32FC1);

  // Wrap input tensor.
  auto raw_input_tensor = &input_tensors[0];
  auto raw_input_view = raw_input_tensor->GetCpuReadView();
  const float* raw_input_data = raw_input_view.buffer<float>();
  cv::Mat tensor_mat(cv::Size(tensor_width, tensor_height),
                     CV_MAKETYPE(CV_32F, tensor_channels),
                     const_cast<float*>(raw_input_data));

  // Process mask tensor and apply activation function.
  if (tensor_channels == 2) {
    MP_RETURN_IF_ERROR(ApplyActivation<cv::Vec2f>(tensor_mat, &small_mask_mat));
  } else if (tensor_channels == 1) {
    RET_CHECK(mediapipe::TensorsToSegmentationCalculatorOptions::SOFTMAX !=
              options_.activation());  // Requires 2 channels.
    if (mediapipe::TensorsToSegmentationCalculatorOptions::NONE ==
        options_.activation())  // Pass-through optimization.
      tensor_mat.copyTo(small_mask_mat);
    else
      MP_RETURN_IF_ERROR(ApplyActivation<float>(tensor_mat, &small_mask_mat));
  } else {
    RET_CHECK_FAIL() << "Unsupported number of tensor channels "
                     << tensor_channels;
  }

  // Send out image as CPU packet.
  std::shared_ptr<ImageFrame> mask_frame = std::make_shared<ImageFrame>(
      ImageFormat::VEC32F1, output_width, output_height);
  std::unique_ptr<Image> output_mask = absl::make_unique<Image>(mask_frame);
  cv::Mat output_mat = formats::MatView(output_mask.get());
  // Upsample small mask into output.
  cv::resize(small_mask_mat, output_mat, cv::Size(output_width, output_height));
  cc->Outputs().Tag(kMaskTag).Add(output_mask.release(), cc->InputTimestamp());

  return absl::OkStatus();
}
template <class T>
absl::Status TensorsToSegmentationCalculator::ApplyActivation(
    cv::Mat& tensor_mat, cv::Mat* small_mask_mat) {
  // Configure activation function.
  const int output_layer_index = options_.output_layer_index();
  typedef mediapipe::TensorsToSegmentationCalculatorOptions Options;
  const auto activation_fn = [&](const cv::Vec2f& mask_value) {
    float new_mask_value = 0;
    // TODO consider moving switch out of the loop,
    // and also avoid float/Vec2f casting.
    switch (options_.activation()) {
      case Options::NONE: {
        new_mask_value = mask_value[0];
        break;
      }
      case Options::SIGMOID: {
        const float pixel0 = mask_value[0];
        new_mask_value = 1.0 / (std::exp(-pixel0) + 1.0);
        break;
      }
      case Options::SOFTMAX: {
        const float pixel0 = mask_value[0];
        const float pixel1 = mask_value[1];
        const float max_pixel = std::max(pixel0, pixel1);
        const float min_pixel = std::min(pixel0, pixel1);
        const float softmax_denom =
            /*exp(max_pixel - max_pixel)=*/1.0f +
            std::exp(min_pixel - max_pixel);
        new_mask_value = std::exp(mask_value[output_layer_index] - max_pixel) /
                         softmax_denom;
        break;
      }
    }
    return new_mask_value;
  };

  // Process mask tensor.
  for (int i = 0; i < tensor_mat.rows; ++i) {
    for (int j = 0; j < tensor_mat.cols; ++j) {
      const T& input_pix = tensor_mat.at<T>(i, j);
      const float mask_value = activation_fn(input_pix);
      small_mask_mat->at<float>(i, j) = mask_value;
    }
  }

  return absl::OkStatus();
}
// Steps:
// 1. receive tensor
// 2. process segmentation tensor into small mask
// 3. upsample small mask into output mask to be same size as input image
absl::Status TensorsToSegmentationCalculator::ProcessGpu(
    CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  // Get input streams, and dimensions.
  const auto& input_tensors =
      cc->Inputs().Tag(kTensorsTag).Get<std::vector<Tensor>>();
  ASSIGN_OR_RETURN(auto hwc, GetHwcFromDims(input_tensors[0].shape().dims));
  auto [tensor_height, tensor_width, tensor_channels] = hwc;
  int output_width = tensor_width, output_height = tensor_height;
  if (cc->Inputs().HasTag(kOutputSizeTag)) {
    const auto& size =
        cc->Inputs().Tag(kOutputSizeTag).Get<std::pair<int, int>>();
    output_width = size.first;
    output_height = size.second;
  }

  // Create initial working mask texture.
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
  tflite::gpu::gl::GlTexture small_mask_texture;
#else
  mediapipe::GlTexture small_mask_texture;
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

  // Run shader, process mask tensor.
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
  {
    MP_RETURN_IF_ERROR(CreateReadWriteRgbaImageTexture(
        tflite::gpu::DataType::UINT8,  // GL_RGBA8
        {tensor_width, tensor_height}, &small_mask_texture));

    const int output_index = 0;
    glBindImageTexture(output_index, small_mask_texture.id(), 0, GL_FALSE, 0,
                       GL_WRITE_ONLY, GL_RGBA8);

    auto read_view = input_tensors[0].GetOpenGlBufferReadView();
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, read_view.name());

    const tflite::gpu::uint3 workgroups = {
        NumGroups(tensor_width, kWorkgroupSize),
        NumGroups(tensor_height, kWorkgroupSize), 1};

    glUseProgram(mask_program_31_->id());
    glUniform2i(glGetUniformLocation(mask_program_31_->id(), "out_size"),
                tensor_width, tensor_height);

    MP_RETURN_IF_ERROR(mask_program_31_->Dispatch(workgroups));
  }
#elif MEDIAPIPE_METAL_ENABLED
  {
    id<MTLCommandBuffer> command_buffer = [metal_helper_ commandBuffer];
    command_buffer.label = @"SegmentationKernel";
    id<MTLComputeCommandEncoder> command_encoder =
        [command_buffer computeCommandEncoder];
    [command_encoder setComputePipelineState:mask_program_];

    auto read_view = input_tensors[0].GetMtlBufferReadView(command_buffer);
    [command_encoder setBuffer:read_view.buffer() offset:0 atIndex:0];

    mediapipe::GpuBuffer small_mask_buffer = [metal_helper_
        mediapipeGpuBufferWithWidth:tensor_width
                             height:tensor_height
                             format:mediapipe::GpuBufferFormat::kBGRA32];
    id<MTLTexture> small_mask_texture_metal =
        [metal_helper_ metalTextureWithGpuBuffer:small_mask_buffer];
    [command_encoder setTexture:small_mask_texture_metal atIndex:1];

    unsigned int out_size[] = {static_cast<unsigned int>(tensor_width),
                               static_cast<unsigned int>(tensor_height)};
    [command_encoder setBytes:&out_size length:sizeof(out_size) atIndex:2];

    MTLSize threads_per_group = MTLSizeMake(kWorkgroupSize, kWorkgroupSize, 1);
    MTLSize threadgroups =
        MTLSizeMake(NumGroups(tensor_width, kWorkgroupSize),
                    NumGroups(tensor_height, kWorkgroupSize), 1);
    [command_encoder dispatchThreadgroups:threadgroups
                    threadsPerThreadgroup:threads_per_group];
    [command_encoder endEncoding];
    [command_buffer commit];

    small_mask_texture = gpu_helper_.CreateSourceTexture(small_mask_buffer);
  }
#else
  {
    small_mask_texture = gpu_helper_.CreateDestinationTexture(
        tensor_width, tensor_height,
        mediapipe::GpuBufferFormat::kBGRA32);  // actually GL_RGBA8

    // Go through CPU if not already texture 2D (no direct conversion yet).
    // Tensor::GetOpenGlTexture2dReadView() doesn't automatically convert types.
    if (!input_tensors[0].ready_as_opengl_texture_2d()) {
      (void)input_tensors[0].GetCpuReadView();
    }

    auto read_view = input_tensors[0].GetOpenGlTexture2dReadView();

    gpu_helper_.BindFramebuffer(small_mask_texture);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, read_view.name());
    glUseProgram(mask_program_20_);
    GlRender();
    glBindTexture(GL_TEXTURE_2D, 0);
    glFlush();
  }
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

  // Upsample small mask into output.
  mediapipe::GlTexture output_texture = gpu_helper_.CreateDestinationTexture(
      output_width, output_height,
      mediapipe::GpuBufferFormat::kBGRA32);  // actually GL_RGBA8

  // Run shader, upsample result.
  {
    gpu_helper_.BindFramebuffer(output_texture);
    glActiveTexture(GL_TEXTURE1);
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
    glBindTexture(GL_TEXTURE_2D, small_mask_texture.id());
#else
    glBindTexture(GL_TEXTURE_2D, small_mask_texture.name());
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
    glUseProgram(upsample_program_);
    GlRender();
    glBindTexture(GL_TEXTURE_2D, 0);
    glFlush();
  }

  // Send out image as GPU packet.
  auto output_image = output_texture.GetFrame<Image>();
  cc->Outputs().Tag(kMaskTag).Add(output_image.release(), cc->InputTimestamp());

  // Cleanup
  output_texture.Release();
#endif  // !MEDIAPIPE_DISABLE_GPU

  return absl::OkStatus();
}

void TensorsToSegmentationCalculator::GlRender() {
#if !MEDIAPIPE_DISABLE_GPU
  static const GLfloat square_vertices[] = {
      -1.0f, -1.0f,  // bottom left
      1.0f,  -1.0f,  // bottom right
      -1.0f, 1.0f,   // top left
      1.0f,  1.0f,   // top right
  };
  static const GLfloat texture_vertices[] = {
      0.0f, 0.0f,  // bottom left
      1.0f, 0.0f,  // bottom right
      0.0f, 1.0f,  // top left
      1.0f, 1.0f,  // top right
  };

  // vertex storage
  GLuint vbo[2];
  glGenBuffers(2, vbo);
  GLuint vao;
  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);

  // vbo 0
  glBindBuffer(GL_ARRAY_BUFFER, vbo[0]);
  glBufferData(GL_ARRAY_BUFFER, 4 * 2 * sizeof(GLfloat), square_vertices,
               GL_STATIC_DRAW);
  glEnableVertexAttribArray(ATTRIB_VERTEX);
  glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, 0, 0, nullptr);

  // vbo 1
  glBindBuffer(GL_ARRAY_BUFFER, vbo[1]);
  glBufferData(GL_ARRAY_BUFFER, 4 * 2 * sizeof(GLfloat), texture_vertices,
               GL_STATIC_DRAW);
  glEnableVertexAttribArray(ATTRIB_TEXTURE_POSITION);
  glVertexAttribPointer(ATTRIB_TEXTURE_POSITION, 2, GL_FLOAT, 0, 0, nullptr);

  // draw
  glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

  // cleanup
  glDisableVertexAttribArray(ATTRIB_VERTEX);
  glDisableVertexAttribArray(ATTRIB_TEXTURE_POSITION);
  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glBindVertexArray(0);
  glDeleteVertexArrays(1, &vao);
  glDeleteBuffers(2, vbo);
#endif  // !MEDIAPIPE_DISABLE_GPU
}

absl::Status TensorsToSegmentationCalculator::LoadOptions(
    CalculatorContext* cc) {
  // Get calculator options specified in the graph.
  options_ = cc->Options<::mediapipe::TensorsToSegmentationCalculatorOptions>();

  return absl::OkStatus();
}

absl::Status TensorsToSegmentationCalculator::InitGpu(CalculatorContext* cc) {
#if !MEDIAPIPE_DISABLE_GPU
  MP_RETURN_IF_ERROR(gpu_helper_.RunInGlContext([this]() -> absl::Status {
  // A shader to process a segmentation tensor into an output mask.
  // Currently uses 4 channels for output, and sets R+A channels as mask value.
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
    // GLES 3.1
    const tflite::gpu::uint3 workgroup_size = {kWorkgroupSize, kWorkgroupSize,
                                               1};
    const std::string shader_header =
        absl::StrCat(tflite::gpu::gl::GetShaderHeader(workgroup_size), R"(
precision highp float;

layout(rgba8, binding = 0) writeonly uniform highp image2D output_texture;

uniform ivec2 out_size;
)");
    /* Shader defines will be inserted here. */

    const std::string shader_src_main = R"(
layout(std430, binding = 2) readonly buffer B0 {
#ifdef TWO_CHANNEL_INPUT
  vec2 elements[];
#else
  float elements[];
#endif // TWO_CHANNEL_INPUT
} input_data;   // data tensor

void main() {
  int out_width = out_size.x;
  int out_height = out_size.y;

  ivec2 gid = ivec2(gl_GlobalInvocationID.xy);
  if (gid.x >= out_width || gid.y >= out_height) { return; }
  int linear_index = gid.y * out_width + gid.x;

#ifdef TWO_CHANNEL_INPUT
  vec2 input_value = input_data.elements[linear_index];
#else
  vec2 input_value = vec2(input_data.elements[linear_index], 0.0);
#endif // TWO_CHANNEL_INPUT

  // Run activation function.
  // One and only one of FN_SOFTMAX,FN_SIGMOID,FN_NONE will be defined.
#ifdef FN_SOFTMAX
  // Only two channel input tensor is supported.
  vec2 input_px = input_value.rg;
  float shift = max(input_px.r, input_px.g);
  float softmax_denom = exp(input_px.r - shift) + exp(input_px.g - shift);
  float new_mask_value =
      exp(input_px[OUTPUT_LAYER_INDEX] - shift) / softmax_denom;
#endif // FN_SOFTMAX

#ifdef FN_SIGMOID
  float new_mask_value = 1.0 / (exp(-input_value.r) + 1.0);
#endif // FN_SIGMOID

#ifdef FN_NONE
  float new_mask_value = input_value.r;
#endif // FN_NONE

#ifdef FLIP_Y_COORD
  int y_coord = out_height - gid.y - 1;
#else
  int y_coord = gid.y;
#endif // defined(FLIP_Y_COORD)
  ivec2 output_coordinate = ivec2(gid.x, y_coord);

  vec4 out_value = vec4(new_mask_value, 0.0, 0.0, new_mask_value);
  imageStore(output_texture, output_coordinate, out_value);
})";

#elif MEDIAPIPE_METAL_ENABLED
    // METAL
    const std::string shader_header = R"(
#include <metal_stdlib>
using namespace metal;
)";
    /* Shader defines will be inserted here. */

    const std::string shader_src_main = R"(
kernel void segmentationKernel(
#ifdef TWO_CHANNEL_INPUT
    device float2* elements        [[ buffer(0) ]],
#else
    device float* elements         [[ buffer(0) ]],
#endif // TWO_CHANNEL_INPUT
    texture2d<float, access::write> output_texture [[ texture(1) ]],
    constant uint* out_size        [[ buffer(2) ]],
    uint2 gid                      [[ thread_position_in_grid ]])
{
  uint out_width = out_size[0];
  uint out_height = out_size[1];

  if (gid.x >= out_width || gid.y >= out_height) { return; }
  uint linear_index = gid.y * out_width + gid.x;

#ifdef TWO_CHANNEL_INPUT
  float2 input_value = elements[linear_index];
#else
  float2 input_value = float2(elements[linear_index], 0.0);
#endif // TWO_CHANNEL_INPUT

  // Run activation function.
  // One and only one of FN_SOFTMAX,FN_SIGMOID,FN_NONE will be defined.
#ifdef FN_SOFTMAX
  // Only two channel input tensor is supported.
  float2 input_px = input_value.xy;
  float shift = max(input_px.x, input_px.y);
  float softmax_denom = exp(input_px.r - shift) + exp(input_px.g - shift);
  float new_mask_value =
      exp(input_px[OUTPUT_LAYER_INDEX] - shift) / softmax_denom;
#endif // FN_SOFTMAX

#ifdef FN_SIGMOID
  float new_mask_value = 1.0 / (exp(-input_value.x) + 1.0);
#endif // FN_SIGMOID

#ifdef FN_NONE
  float new_mask_value = input_value.x;
#endif // FN_NONE

#ifdef FLIP_Y_COORD
  int y_coord = out_height - gid.y - 1;
#else
  int y_coord = gid.y;
#endif // defined(FLIP_Y_COORD)
  uint2 output_coordinate = uint2(gid.x, y_coord);

  float4 out_value = float4(new_mask_value, 0.0, 0.0, new_mask_value);
  output_texture.write(out_value, output_coordinate);
}
)";

#else
    // GLES 2.0
    const std::string shader_header = absl::StrCat(
        std::string(mediapipe::kMediaPipeFragmentShaderPreamble), R"(
DEFAULT_PRECISION(mediump, float)
)");
    /* Shader defines will be inserted here. */

    const std::string shader_src_main = R"(
in vec2 sample_coordinate;

uniform sampler2D input_texture;

#ifdef GL_ES
#define fragColor gl_FragColor
#else
out vec4 fragColor;
#endif  // defined(GL_ES);

void main() {

  vec4 input_value = texture2D(input_texture, sample_coordinate);
  vec2 gid = sample_coordinate;

  // Run activation function.
  // One and only one of FN_SOFTMAX,FN_SIGMOID,FN_NONE will be defined.

#ifdef FN_SOFTMAX
  // Only two channel input tensor is supported.
  vec2 input_px = input_value.rg;
  float shift = max(input_px.r, input_px.g);
  float softmax_denom = exp(input_px.r - shift) + exp(input_px.g - shift);
  float new_mask_value =
      exp(mix(input_px.r, input_px.g, float(OUTPUT_LAYER_INDEX)) - shift) / softmax_denom;
#endif // FN_SOFTMAX

#ifdef FN_SIGMOID
  float new_mask_value = 1.0 / (exp(-input_value.r) + 1.0);
#endif // FN_SIGMOID

#ifdef FN_NONE
  float new_mask_value = input_value.r;
#endif // FN_NONE

#ifdef FLIP_Y_COORD
  float y_coord = 1.0 - gid.y;
#else
  float y_coord = gid.y;
#endif  // defined(FLIP_Y_COORD)
  vec2 output_coordinate = vec2(gid.x, y_coord);

  vec4 out_value = vec4(new_mask_value, 0.0, 0.0, new_mask_value);
  fragColor = out_value;
})";
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

    // Shader defines.
    typedef mediapipe::TensorsToSegmentationCalculatorOptions Options;
    const std::string output_layer_index =
        "\n#define OUTPUT_LAYER_INDEX int(" +
        std::to_string(options_.output_layer_index()) + ")";
    const std::string flip_y_coord =
        DoesGpuTextureStartAtBottom() ? "\n#define FLIP_Y_COORD" : "";
    const std::string fn_none =
        options_.activation() == Options::NONE ? "\n#define FN_NONE" : "";
    const std::string fn_sigmoid =
        options_.activation() == Options::SIGMOID ? "\n#define FN_SIGMOID" : "";
    const std::string fn_softmax =
        options_.activation() == Options::SOFTMAX ? "\n#define FN_SOFTMAX" : "";
    const std::string two_channel = options_.activation() == Options::SOFTMAX
                                        ? "\n#define TWO_CHANNEL_INPUT"
                                        : "";
    const std::string shader_defines =
        absl::StrCat(output_layer_index, flip_y_coord, fn_softmax, fn_sigmoid,
                     fn_none, two_channel);

    // Build full shader.
    const std::string shader_src_no_previous =
        absl::StrCat(shader_header, shader_defines, shader_src_main);
    // Vertex shader attributes.
    const GLint attr_location[NUM_ATTRIBUTES] = {
        ATTRIB_VERTEX,
        ATTRIB_TEXTURE_POSITION,
    };
    const GLchar* attr_name[NUM_ATTRIBUTES] = {
        "position",
        "texture_coordinate",
    };

    // Main shader program & parameters
#if MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31
    GlShader shader_without_previous;
    MP_RETURN_IF_ERROR(GlShader::CompileShader(
        GL_COMPUTE_SHADER, shader_src_no_previous, &shader_without_previous));
    mask_program_31_ = absl::make_unique<GlProgram>();
    MP_RETURN_IF_ERROR(GlProgram::CreateWithShader(shader_without_previous,
                                                   mask_program_31_.get()));
#elif MEDIAPIPE_METAL_ENABLED
    id<MTLDevice> device = metal_helper_.mtlDevice;
    NSString* library_source =
        [NSString stringWithUTF8String:shader_src_no_previous.c_str()];
    NSError* error = nil;
    id<MTLLibrary> library = [device newLibraryWithSource:library_source
                                                  options:nullptr
                                                    error:&error];
    RET_CHECK(library != nil) << "Couldn't create shader library "
                              << [[error localizedDescription] UTF8String];
    id<MTLFunction> kernel_func = nil;
    kernel_func = [library newFunctionWithName:@"segmentationKernel"];
    RET_CHECK(kernel_func != nil) << "Couldn't create kernel function.";
    mask_program_ =
        [device newComputePipelineStateWithFunction:kernel_func error:&error];
    RET_CHECK(mask_program_ != nil) << "Couldn't create pipeline state " <<
        [[error localizedDescription] UTF8String];
#else
    mediapipe::GlhCreateProgram(
        mediapipe::kBasicVertexShader, shader_src_no_previous.c_str(),
        NUM_ATTRIBUTES, &attr_name[0], attr_location, &mask_program_20_);
    RET_CHECK(mask_program_20_) << "Problem initializing the program.";
    glUseProgram(mask_program_20_);
    glUniform1i(glGetUniformLocation(mask_program_20_, "input_texture"), 1);
#endif  // MEDIAPIPE_OPENGL_ES_VERSION >= MEDIAPIPE_OPENGL_ES_31

    // Simple pass-through program, used for hardware upsampling.
    mediapipe::GlhCreateProgram(
        mediapipe::kBasicVertexShader, mediapipe::kBasicTexturedFragmentShader,
        NUM_ATTRIBUTES, &attr_name[0], attr_location, &upsample_program_);
|
||||
RET_CHECK(upsample_program_) << "Problem initializing the program.";
|
||||
glUseProgram(upsample_program_);
|
||||
glUniform1i(glGetUniformLocation(upsample_program_, "video_frame"), 1);
|
||||
|
||||
return absl::OkStatus();
|
||||
}));
|
||||
#endif // !MEDIAPIPE_DISABLE_GPU
|
||||
|
||||
return absl::OkStatus();
|
||||
}
|
||||
|
||||
} // namespace mediapipe
|
@@ -0,0 +1,46 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

syntax = "proto2";

package mediapipe;

import "mediapipe/framework/calculator.proto";
import "mediapipe/gpu/gpu_origin.proto";

message TensorsToSegmentationCalculatorOptions {
  extend mediapipe.CalculatorOptions {
    optional TensorsToSegmentationCalculatorOptions ext = 374311106;
  }

  // For CONVENTIONAL mode in OpenGL, textures start at the bottom and need
  // to be flipped vertically as tensors are expected to start at the top.
  // (DEFAULT or unset is interpreted as CONVENTIONAL.)
  optional GpuOrigin.Mode gpu_origin = 1;

  // Supported activation functions for filtering.
  enum Activation {
    NONE = 0;     // Assumes 1-channel input tensor.
    SIGMOID = 1;  // Assumes 1-channel input tensor.
    SOFTMAX = 2;  // Assumes 2-channel input tensor.
  }
  // Activation function to apply to input tensor.
  // Softmax requires a 2-channel tensor, see output_layer_index below.
  optional Activation activation = 2 [default = NONE];

  // Channel to use for processing tensor.
  // Only applies when using activation=SOFTMAX.
  // Works on two-channel input tensors only.
  optional int32 output_layer_index = 3 [default = 1];
}
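The Activation options above correspond to the FN_NONE / FN_SIGMOID / FN_SOFTMAX branches in the shader earlier in this diff. As a quick illustration of what each mode computes per mask value (plain Python, not MediaPipe code; the `activate` helper is made up here):

```python
import math

def activate(value, mode="NONE", second_channel=None):
    """Illustrative per-pixel mask activation mirroring the Activation enum."""
    if mode == "NONE":       # raw 1-channel score, passed through
        return value
    if mode == "SIGMOID":    # 1-channel logit -> probability
        return 1.0 / (1.0 + math.exp(-value))
    if mode == "SOFTMAX":    # 2-channel logits -> probability of chosen layer
        e0, e1 = math.exp(value), math.exp(second_channel)
        return e1 / (e0 + e1)
    raise ValueError(mode)

print(activate(0.0, "SIGMOID"))  # 0.5
```

A zero logit maps to 0.5 under sigmoid, and equal two-channel logits likewise give 0.5 under softmax, which is why NONE and SIGMOID only need one channel while SOFTMAX needs two.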
@@ -859,6 +859,7 @@ cc_library(
"//mediapipe/framework:calculator_framework",
"//mediapipe/framework:timestamp",
"//mediapipe/framework/formats:landmark_cc_proto",
"//mediapipe/framework/formats:rect_cc_proto",
"//mediapipe/framework/port:ret_check",
"//mediapipe/util/filtering:one_euro_filter",
"//mediapipe/util/filtering:relative_velocity_filter",
@@ -323,7 +323,7 @@ absl::Status DetectionsToRectsCalculator::ComputeRotation(
DetectionSpec DetectionsToRectsCalculator::GetDetectionSpec(
    const CalculatorContext* cc) {
  absl::optional<std::pair<int, int>> image_size;
  if (cc->Inputs().HasTag(kImageSizeTag)) {
  if (HasTagValue(cc->Inputs(), kImageSizeTag)) {
    image_size = cc->Inputs().Tag(kImageSizeTag).Get<std::pair<int, int>>();
  }
@@ -157,6 +157,12 @@ TEST(DetectionsToRectsCalculatorTest, DetectionKeyPointsToRect) {
/*image_size=*/{640, 480});
  MP_ASSERT_OK(status_or_value);
  EXPECT_THAT(status_or_value.value(), RectEq(480, 360, 320, 240));

  status_or_value = RunDetectionKeyPointsToRectCalculation(
      /*detection=*/DetectionWithKeyPoints({{0.25f, 0.25f}, {0.75f, 0.75f}}),
      /*image_size=*/{0, 0});
  MP_ASSERT_OK(status_or_value);
  EXPECT_THAT(status_or_value.value(), RectEq(0, 0, 0, 0));
}

TEST(DetectionsToRectsCalculatorTest, DetectionToNormalizedRect) {
@@ -18,6 +18,7 @@
#include "mediapipe/calculators/util/landmarks_smoothing_calculator.pb.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/landmark.pb.h"
#include "mediapipe/framework/formats/rect.pb.h"
#include "mediapipe/framework/port/ret_check.h"
#include "mediapipe/framework/timestamp.h"
#include "mediapipe/util/filtering/one_euro_filter.h"
@@ -30,6 +31,7 @@ namespace {
constexpr char kNormalizedLandmarksTag[] = "NORM_LANDMARKS";
constexpr char kLandmarksTag[] = "LANDMARKS";
constexpr char kImageSizeTag[] = "IMAGE_SIZE";
constexpr char kObjectScaleRoiTag[] = "OBJECT_SCALE_ROI";
constexpr char kNormalizedFilteredLandmarksTag[] = "NORM_FILTERED_LANDMARKS";
constexpr char kFilteredLandmarksTag[] = "FILTERED_LANDMARKS";
@@ -94,6 +96,18 @@ float GetObjectScale(const LandmarkList& landmarks) {
return (object_width + object_height) / 2.0f;
}

float GetObjectScale(const NormalizedRect& roi, const int image_width,
                     const int image_height) {
  const float object_width = roi.width() * image_width;
  const float object_height = roi.height() * image_height;

  return (object_width + object_height) / 2.0f;
}

float GetObjectScale(const Rect& roi) {
  return (roi.width() + roi.height()) / 2.0f;
}

// Abstract class for various landmarks filters.
class LandmarksFilter {
 public:
@@ -103,6 +117,7 @@ class LandmarksFilter {

virtual absl::Status Apply(const LandmarkList& in_landmarks,
                             const absl::Duration& timestamp,
                             const absl::optional<float> object_scale_opt,
                             LandmarkList* out_landmarks) = 0;
};
@@ -111,6 +126,7 @@ class NoFilter : public LandmarksFilter {
public:
  absl::Status Apply(const LandmarkList& in_landmarks,
                     const absl::Duration& timestamp,
                     const absl::optional<float> object_scale_opt,
                     LandmarkList* out_landmarks) override {
    *out_landmarks = in_landmarks;
    return absl::OkStatus();
@@ -136,13 +152,15 @@ class VelocityFilter : public LandmarksFilter {

absl::Status Apply(const LandmarkList& in_landmarks,
                     const absl::Duration& timestamp,
                     const absl::optional<float> object_scale_opt,
                     LandmarkList* out_landmarks) override {
    // Get value scale as inverse value of the object scale.
    // If value is too small smoothing will be disabled and landmarks will be
    // returned as is.
    float value_scale = 1.0f;
    if (!disable_value_scaling_) {
      const float object_scale = GetObjectScale(in_landmarks);
      const float object_scale =
          object_scale_opt ? *object_scale_opt : GetObjectScale(in_landmarks);
      if (object_scale < min_allowed_object_scale_) {
        *out_landmarks = in_landmarks;
        return absl::OkStatus();
@@ -205,12 +223,14 @@ class VelocityFilter : public LandmarksFilter {
class OneEuroFilterImpl : public LandmarksFilter {
 public:
  OneEuroFilterImpl(double frequency, double min_cutoff, double beta,
                    double derivate_cutoff, float min_allowed_object_scale)
                    double derivate_cutoff, float min_allowed_object_scale,
                    bool disable_value_scaling)
      : frequency_(frequency),
        min_cutoff_(min_cutoff),
        beta_(beta),
        derivate_cutoff_(derivate_cutoff),
        min_allowed_object_scale_(min_allowed_object_scale) {}
        min_allowed_object_scale_(min_allowed_object_scale),
        disable_value_scaling_(disable_value_scaling) {}

  absl::Status Reset() override {
    x_filters_.clear();
@@ -221,16 +241,24 @@ class OneEuroFilterImpl : public LandmarksFilter {

absl::Status Apply(const LandmarkList& in_landmarks,
                     const absl::Duration& timestamp,
                     const absl::optional<float> object_scale_opt,
                     LandmarkList* out_landmarks) override {
    // Initialize filters once.
    MP_RETURN_IF_ERROR(InitializeFiltersIfEmpty(in_landmarks.landmark_size()));

    const float object_scale = GetObjectScale(in_landmarks);
    if (object_scale < min_allowed_object_scale_) {
      *out_landmarks = in_landmarks;
      return absl::OkStatus();
    // Get value scale as inverse value of the object scale.
    // If value is too small smoothing will be disabled and landmarks will be
    // returned as is.
    float value_scale = 1.0f;
    if (!disable_value_scaling_) {
      const float object_scale =
          object_scale_opt ? *object_scale_opt : GetObjectScale(in_landmarks);
      if (object_scale < min_allowed_object_scale_) {
        *out_landmarks = in_landmarks;
        return absl::OkStatus();
      }
      value_scale = 1.0f / object_scale;
    }
    const float value_scale = 1.0f / object_scale;

    // Filter landmarks. Every axis of every landmark is filtered separately.
    for (int i = 0; i < in_landmarks.landmark_size(); ++i) {
@@ -277,6 +305,7 @@ class OneEuroFilterImpl : public LandmarksFilter {
double beta_;
  double derivate_cutoff_;
  double min_allowed_object_scale_;
  bool disable_value_scaling_;

  std::vector<OneEuroFilter> x_filters_;
  std::vector<OneEuroFilter> y_filters_;
@@ -292,6 +321,10 @@ class OneEuroFilterImpl : public LandmarksFilter {
// IMAGE_SIZE: A std::pair<int, int> representation of image width and height.
//   Required to perform all computations in absolute coordinates to avoid any
//   influence of normalized values.
// OBJECT_SCALE_ROI (optional): A NormRect or Rect (depending on the format of
//   input landmarks) used to determine the object scale for some of the
//   filters. If not provided - object scale will be calculated from
//   landmarks.
//
// Outputs:
//   NORM_FILTERED_LANDMARKS: A NormalizedLandmarkList of smoothed landmarks.
@@ -301,6 +334,7 @@ class OneEuroFilterImpl : public LandmarksFilter {
// calculator: "LandmarksSmoothingCalculator"
//   input_stream: "NORM_LANDMARKS:pose_landmarks"
//   input_stream: "IMAGE_SIZE:image_size"
//   input_stream: "OBJECT_SCALE_ROI:roi"
//   output_stream: "NORM_FILTERED_LANDMARKS:pose_landmarks_filtered"
//   options: {
//     [mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
@@ -330,9 +364,17 @@ absl::Status LandmarksSmoothingCalculator::GetContract(CalculatorContract* cc) {
cc->Outputs()
        .Tag(kNormalizedFilteredLandmarksTag)
        .Set<NormalizedLandmarkList>();

    if (cc->Inputs().HasTag(kObjectScaleRoiTag)) {
      cc->Inputs().Tag(kObjectScaleRoiTag).Set<NormalizedRect>();
    }
  } else {
    cc->Inputs().Tag(kLandmarksTag).Set<LandmarkList>();
    cc->Outputs().Tag(kFilteredLandmarksTag).Set<LandmarkList>();

    if (cc->Inputs().HasTag(kObjectScaleRoiTag)) {
      cc->Inputs().Tag(kObjectScaleRoiTag).Set<Rect>();
    }
  }

  return absl::OkStatus();
@@ -357,7 +399,8 @@ absl::Status LandmarksSmoothingCalculator::Open(CalculatorContext* cc) {
options.one_euro_filter().min_cutoff(),
        options.one_euro_filter().beta(),
        options.one_euro_filter().derivate_cutoff(),
        options.one_euro_filter().min_allowed_object_scale());
        options.one_euro_filter().min_allowed_object_scale(),
        options.one_euro_filter().disable_value_scaling());
  } else {
    RET_CHECK_FAIL()
        << "Landmarks filter is either not specified or not supported";
@@ -389,13 +432,20 @@ absl::Status LandmarksSmoothingCalculator::Process(CalculatorContext* cc) {
std::tie(image_width, image_height) =
        cc->Inputs().Tag(kImageSizeTag).Get<std::pair<int, int>>();

    absl::optional<float> object_scale;
    if (cc->Inputs().HasTag(kObjectScaleRoiTag) &&
        !cc->Inputs().Tag(kObjectScaleRoiTag).IsEmpty()) {
      auto& roi = cc->Inputs().Tag(kObjectScaleRoiTag).Get<NormalizedRect>();
      object_scale = GetObjectScale(roi, image_width, image_height);
    }

    auto in_landmarks = absl::make_unique<LandmarkList>();
    NormalizedLandmarksToLandmarks(in_norm_landmarks, image_width, image_height,
                                   in_landmarks.get());

    auto out_landmarks = absl::make_unique<LandmarkList>();
    MP_RETURN_IF_ERROR(landmarks_filter_->Apply(*in_landmarks, timestamp,
                                                out_landmarks.get()));
    MP_RETURN_IF_ERROR(landmarks_filter_->Apply(
        *in_landmarks, timestamp, object_scale, out_landmarks.get()));

    auto out_norm_landmarks = absl::make_unique<NormalizedLandmarkList>();
    LandmarksToNormalizedLandmarks(*out_landmarks, image_width, image_height,
@@ -408,9 +458,16 @@ absl::Status LandmarksSmoothingCalculator::Process(CalculatorContext* cc) {
const auto& in_landmarks =
        cc->Inputs().Tag(kLandmarksTag).Get<LandmarkList>();

    absl::optional<float> object_scale;
    if (cc->Inputs().HasTag(kObjectScaleRoiTag) &&
        !cc->Inputs().Tag(kObjectScaleRoiTag).IsEmpty()) {
      auto& roi = cc->Inputs().Tag(kObjectScaleRoiTag).Get<Rect>();
      object_scale = GetObjectScale(roi);
    }

    auto out_landmarks = absl::make_unique<LandmarkList>();
    MP_RETURN_IF_ERROR(
        landmarks_filter_->Apply(in_landmarks, timestamp, out_landmarks.get()));
    MP_RETURN_IF_ERROR(landmarks_filter_->Apply(
        in_landmarks, timestamp, object_scale, out_landmarks.get()));

    cc->Outputs()
        .Tag(kFilteredLandmarksTag)
@@ -41,9 +41,9 @@ message LandmarksSmoothingCalculatorOptions {
optional float min_allowed_object_scale = 3 [default = 1e-6];

  // Disable value scaling based on object size and use `1.0` instead.
  // Value scale is calculated as inverse value of object size. Object size is
  // calculated as maximum side of rectangular bounding box of the object in
  // XY plane.
  // If not disabled, value scale is calculated as inverse value of object
  // size. Object size is calculated as maximum side of rectangular bounding
  // box of the object in XY plane.
  optional bool disable_value_scaling = 4 [default = false];
}
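The value-scaling behavior described by these options (inverse of object scale, a minimum-scale cutoff below which smoothing is skipped, and an opt-out flag) can be sketched as follows. The `(width + height) / 2` object-scale estimate and the option names follow the calculator code in this diff; the `value_scale` helper itself is illustrative, not a MediaPipe API:

```python
def object_scale(width, height):
    # Object scale is the mean of the bounding-box width and height.
    return (width + height) / 2.0

def value_scale(width, height, min_allowed_object_scale=1e-6,
                disable_value_scaling=False):
    """Inverse object scale used to normalize filter inputs."""
    if disable_value_scaling:
        return 1.0
    scale = object_scale(width, height)
    if scale < min_allowed_object_scale:
        return None  # smoothing disabled; landmarks passed through as is
    return 1.0 / scale

print(value_scale(200.0, 100.0))  # 1/150
```

Scaling by the inverse object size makes the filter's cutoff parameters behave consistently for near (large) and far (small) subjects.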
@@ -72,6 +72,12 @@ message LandmarksSmoothingCalculatorOptions {
// If calculated object scale is less than given value smoothing will be
  // disabled and landmarks will be returned as is.
  optional float min_allowed_object_scale = 5 [default = 1e-6];

  // Disable value scaling based on object size and use `1.0` instead.
  // If not disabled, value scale is calculated as inverse value of object
  // size. Object size is calculated as maximum side of rectangular bounding
  // box of the object in XY plane.
  optional bool disable_value_scaling = 6 [default = false];
}

oneof filter_options {
@@ -40,7 +40,7 @@ constexpr char kRectTag[] = "NORM_RECT";
// Input:
//   LANDMARKS: A LandmarkList representing world landmarks in the rectangle.
//   NORM_RECT: A NormalizedRect representing a normalized rectangle in image
//   coordinates.
//   coordinates. (Optional)
//
// Output:
//   LANDMARKS: A LandmarkList representing world landmarks projected (rotated
@@ -59,7 +59,9 @@ class WorldLandmarkProjectionCalculator : public CalculatorBase {
public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag(kLandmarksTag).Set<LandmarkList>();
    cc->Inputs().Tag(kRectTag).Set<NormalizedRect>();
    if (cc->Inputs().HasTag(kRectTag)) {
      cc->Inputs().Tag(kRectTag).Set<NormalizedRect>();
    }
    cc->Outputs().Tag(kLandmarksTag).Set<LandmarkList>();

    return absl::OkStatus();
@@ -74,13 +76,24 @@ class WorldLandmarkProjectionCalculator : public CalculatorBase {
absl::Status Process(CalculatorContext* cc) override {
    // Check that landmarks and rect are not empty.
    if (cc->Inputs().Tag(kLandmarksTag).IsEmpty() ||
        cc->Inputs().Tag(kRectTag).IsEmpty()) {
        (cc->Inputs().HasTag(kRectTag) &&
         cc->Inputs().Tag(kRectTag).IsEmpty())) {
      return absl::OkStatus();
    }

    const auto& in_landmarks =
        cc->Inputs().Tag(kLandmarksTag).Get<LandmarkList>();
    const auto& in_rect = cc->Inputs().Tag(kRectTag).Get<NormalizedRect>();
    std::function<void(const Landmark&, Landmark*)> rotate_fn;
    if (cc->Inputs().HasTag(kRectTag)) {
      const auto& in_rect = cc->Inputs().Tag(kRectTag).Get<NormalizedRect>();
      const float cosa = std::cos(in_rect.rotation());
      const float sina = std::sin(in_rect.rotation());
      rotate_fn = [cosa, sina](const Landmark& in_landmark,
                               Landmark* out_landmark) {
        out_landmark->set_x(cosa * in_landmark.x() - sina * in_landmark.y());
        out_landmark->set_y(sina * in_landmark.x() + cosa * in_landmark.y());
      };
    }

    auto out_landmarks = absl::make_unique<LandmarkList>();
    for (int i = 0; i < in_landmarks.landmark_size(); ++i) {
@@ -89,11 +102,9 @@ class WorldLandmarkProjectionCalculator : public CalculatorBase {
Landmark* out_landmark = out_landmarks->add_landmark();
      *out_landmark = in_landmark;

      const float angle = in_rect.rotation();
      out_landmark->set_x(std::cos(angle) * in_landmark.x() -
                          std::sin(angle) * in_landmark.y());
      out_landmark->set_y(std::sin(angle) * in_landmark.x() +
                          std::cos(angle) * in_landmark.y());
      if (rotate_fn) {
        rotate_fn(in_landmark, out_landmark);
      }
    }

    cc->Outputs()
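The `rotate_fn` in the calculator above applies a standard 2D rotation by the ROI's rotation angle to each world landmark's x/y. A minimal sketch of that transform (illustrative Python, not a MediaPipe API):

```python
import math

def project_world_landmark(x, y, rotation_rad):
    """Rotate a landmark's x/y by the ROI rotation, as rotate_fn does."""
    cosa, sina = math.cos(rotation_rad), math.sin(rotation_rad)
    return (cosa * x - sina * y, sina * x + cosa * y)

# A 90-degree rotation maps (1, 0) onto (0, 1), up to floating-point error.
x, y = project_world_landmark(1.0, 0.0, math.pi / 2)
```

Precomputing `cosa`/`sina` once per frame, as the C++ lambda capture does, avoids re-evaluating the trigonometric functions for every landmark.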
@@ -0,0 +1,60 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

licenses(["notice"])

package(default_visibility = ["//visibility:private"])

cc_binary(
    name = "libmediapipe_jni.so",
    linkshared = 1,
    linkstatic = 1,
    deps = [
        "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_gpu_deps",
        "//mediapipe/java/com/google/mediapipe/framework/jni:mediapipe_framework_jni",
    ],
)

cc_library(
    name = "mediapipe_jni_lib",
    srcs = [":libmediapipe_jni.so"],
    alwayslink = 1,
)

android_binary(
    name = "selfiesegmentationgpu",
    srcs = glob(["*.java"]),
    assets = [
        "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_gpu.binarypb",
        "//mediapipe/modules/selfie_segmentation:selfie_segmentation.tflite",
    ],
    assets_dir = "",
    manifest = "//mediapipe/examples/android/src/java/com/google/mediapipe/apps/basic:AndroidManifest.xml",
    manifest_values = {
        "applicationId": "com.google.mediapipe.apps.selfiesegmentationgpu",
        "appName": "Selfie Segmentation",
        "mainActivity": "com.google.mediapipe.apps.basic.MainActivity",
        "cameraFacingFront": "True",
        "binaryGraphName": "selfie_segmentation_gpu.binarypb",
        "inputVideoStreamName": "input_video",
        "outputVideoStreamName": "output_video",
        "flipFramesVertically": "True",
        "converterNumBuffers": "2",
    },
    multidex = "native",
    deps = [
        ":mediapipe_jni_lib",
        "//mediapipe/examples/android/src/java/com/google/mediapipe/apps/basic:basic_lib",
    ],
)
@@ -49,7 +49,23 @@ message FaceBoxAdjusterCalculatorOptions {
optional float ipd_face_box_height_ratio = 7 [default = 0.3131];

  // The max look up angle before considering the eye distance unstable.
  optional float max_head_tilt_angle_deg = 8 [default = 12.0];
  optional float max_head_tilt_angle_deg = 8 [default = 5.0];
  // The min look up angle (i.e. looking down) before considering the eye
  // distance unstable.
  optional float min_head_tilt_angle_deg = 10 [default = -18.0];
  // The max look right angle before considering the eye distance unstable.
  optional float max_head_pan_angle_deg = 11 [default = 25.0];
  // The min look right angle (i.e. looking left) before considering the eye
  // distance unstable.
  optional float min_head_pan_angle_deg = 12 [default = -25.0];

  // Update rate for motion history, valid values [0.0, 1.0].
  optional float motion_history_alpha = 13 [default = 0.5];

  // Max value of head motion (max of current or history) to be considered
  // still stable.
  optional float head_motion_threshold = 14 [default = 10.0];

  // The max amount of time to use an old eye distance when the face look angle
  // is unstable.
  optional int32 max_facesize_history_us = 9 [default = 8000000];
@@ -14,8 +14,8 @@ node: {
output_stream: "LETTERBOX_PADDING:letterbox_padding"
  options: {
    [mediapipe.ImageTransformationCalculatorOptions.ext] {
      output_width: 256
      output_height: 256
      output_width: 192
      output_height: 192
      scale_mode: FIT
    }
  }
@@ -50,19 +50,17 @@ node {
output_side_packet: "anchors"
  options: {
    [mediapipe.SsdAnchorsCalculatorOptions.ext] {
      num_layers: 4
      min_scale: 0.15625
      num_layers: 1
      min_scale: 0.1484375
      max_scale: 0.75
      input_size_height: 256
      input_size_width: 256
      input_size_height: 192
      input_size_width: 192
      anchor_offset_x: 0.5
      anchor_offset_y: 0.5
      strides: 16
      strides: 32
      strides: 32
      strides: 32
      strides: 4
      aspect_ratios: 1.0
      fixed_anchor_size: true
      interpolated_scale_aspect_ratio: 0.0
    }
  }
}
@@ -78,7 +76,7 @@ node {
options: {
    [mediapipe.TfLiteTensorsToDetectionsCalculatorOptions.ext] {
      num_classes: 1
      num_boxes: 896
      num_boxes: 2304
      num_coords: 16
      box_coord_offset: 0
      keypoint_coord_offset: 4
@@ -87,11 +85,11 @@ node {
sigmoid_score: true
      score_clipping_thresh: 100.0
      reverse_output_order: true
      x_scale: 256.0
      y_scale: 256.0
      h_scale: 256.0
      w_scale: 256.0
      min_score_thresh: 0.65
      x_scale: 192.0
      y_scale: 192.0
      h_scale: 192.0
      w_scale: 192.0
      min_score_thresh: 0.6
    }
  }
}
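The graph edits above move the detector from a 256x256 input with a four-layer anchor configuration (896 boxes) to a 192x192 input with a single stride-4 layer, and `num_boxes` changes from 896 to 2304 to match. A quick sanity check of those counts (illustrative sketch; it assumes one square grid cell per stride step, one anchor per cell in the new layout, and two anchor scales per cell in the old one):

```python
def num_anchors(input_size, strides, anchors_per_cell=1):
    """Count SSD anchors: (input_size / stride)^2 cells per layer."""
    total = 0
    for s in strides:
        cells = (input_size // s) ** 2
        total += cells * anchors_per_cell
    return total

# New single-layer layout: 192x192 input, stride 4, one anchor per cell.
print(num_anchors(192, [4]))  # 2304, matching num_boxes above
# Old layout: 256x256 input, strides 16/32/32/32, two anchor scales per cell.
print(num_anchors(256, [16, 32, 32, 32], anchors_per_cell=2))  # 896
```

Keeping `num_boxes` in sync with the anchor configuration is essential: the tensors-to-detections step indexes the model output by anchor, so a mismatch silently misaligns boxes and scores.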
34
mediapipe/examples/desktop/selfie_segmentation/BUILD
Normal file
@@ -0,0 +1,34 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

licenses(["notice"])

package(default_visibility = ["//mediapipe/examples:__subpackages__"])

cc_binary(
    name = "selfie_segmentation_cpu",
    deps = [
        "//mediapipe/examples/desktop:demo_run_graph_main",
        "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_cpu_deps",
    ],
)

# Linux only
cc_binary(
    name = "selfie_segmentation_gpu",
    deps = [
        "//mediapipe/examples/desktop:demo_run_graph_main_gpu",
        "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_gpu_deps",
    ],
)
69
mediapipe/examples/ios/selfiesegmentationgpu/BUILD
Normal file
@@ -0,0 +1,69 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

load(
    "@build_bazel_rules_apple//apple:ios.bzl",
    "ios_application",
)
load(
    "//mediapipe/examples/ios:bundle_id.bzl",
    "BUNDLE_ID_PREFIX",
    "example_provisioning",
)

licenses(["notice"])

MIN_IOS_VERSION = "10.0"

alias(
    name = "selfiesegmentationgpu",
    actual = "SelfieSegmentationGpuApp",
)

ios_application(
    name = "SelfieSegmentationGpuApp",
    app_icons = ["//mediapipe/examples/ios/common:AppIcon"],
    bundle_id = BUNDLE_ID_PREFIX + ".SelfieSegmentationGpu",
    families = [
        "iphone",
        "ipad",
    ],
    infoplists = [
        "//mediapipe/examples/ios/common:Info.plist",
        "Info.plist",
    ],
    minimum_os_version = MIN_IOS_VERSION,
    provisioning_profile = example_provisioning(),
    deps = [
        ":SelfieSegmentationGpuAppLibrary",
        "@ios_opencv//:OpencvFramework",
    ],
)

objc_library(
    name = "SelfieSegmentationGpuAppLibrary",
    data = [
        "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_gpu.binarypb",
        "//mediapipe/modules/selfie_segmentation:selfie_segmentation.tflite",
    ],
    deps = [
        "//mediapipe/examples/ios/common:CommonMediaPipeAppLibrary",
    ] + select({
        "//mediapipe:ios_i386": [],
        "//mediapipe:ios_x86_64": [],
        "//conditions:default": [
            "//mediapipe/graphs/selfie_segmentation:selfie_segmentation_gpu_deps",
        ],
    }),
)
14
mediapipe/examples/ios/selfiesegmentationgpu/Info.plist
Normal file
@@ -0,0 +1,14 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>CameraPosition</key>
  <string>front</string>
  <key>GraphOutputStream</key>
  <string>output_video</string>
  <key>GraphInputStream</key>
  <string>input_video</string>
  <key>GraphName</key>
  <string>selfie_segmentation_gpu</string>
</dict>
</plist>
@@ -225,7 +225,7 @@ cc_library(
"//mediapipe/framework:stream_handler_cc_proto",
        "//mediapipe/framework/port:any_proto",
        "//mediapipe/framework/port:status",
        "//mediapipe/framework/tool:options_util",
        "//mediapipe/framework/tool:options_map",
        "//mediapipe/framework/tool:tag_map",
        "@com_google_absl//absl/memory",
    ],
@@ -473,7 +473,7 @@ cc_library(
"//mediapipe/framework:calculator_cc_proto",
        "//mediapipe/framework/port:any_proto",
        "//mediapipe/framework/port:logging",
        "//mediapipe/framework/tool:options_util",
        "//mediapipe/framework/tool:options_map",
        "@com_google_absl//absl/base:core_headers",
        "@com_google_absl//absl/strings",
    ],
@@ -1,4 +1,4 @@
# Experimental new APIs
# New MediaPipe APIs

This directory defines new APIs for MediaPipe:
@@ -6,13 +6,12 @@ This directory defines new APIs for MediaPipe:
- Builder API, for assembling CalculatorGraphConfigs with C++, as an alternative
  to using the proto API directly.

The code is working, and the new APIs interoperate fully with the existing
framework code. They are considered a work in progress, but are being released
now so we can begin adopting them in our calculators.
The new APIs interoperate fully with the existing framework code, and we are
adopting them in our calculators. We are still making improvements, and the
placement of this code under the `mediapipe::api2` namespace is not final.

Developers are welcome to try out these APIs as early adopters, but should
expect breaking changes. The placement of this code under the `mediapipe::api2`
namespace is not final.
Developers are welcome to try out these APIs as early adopters, but there may be
breaking changes.

## Node API
@@ -29,7 +29,7 @@
#include "mediapipe/framework/port.h"
#include "mediapipe/framework/port/any_proto.h"
#include "mediapipe/framework/status_handler.pb.h"
#include "mediapipe/framework/tool/options_util.h"
#include "mediapipe/framework/tool/options_map.h"

namespace mediapipe {
@ -32,7 +32,7 @@
|
|||
#include "mediapipe/framework/packet_set.h"
|
||||
#include "mediapipe/framework/port.h"
|
||||
#include "mediapipe/framework/port/any_proto.h"
|
||||
#include "mediapipe/framework/tool/options_util.h"
|
||||
#include "mediapipe/framework/tool/options_map.h"
|
||||
|
||||
namespace mediapipe {
|
||||
|
||||
|
|
|
@@ -154,12 +154,25 @@ cc_test(
    ],
)

cc_library(
    name = "options_map",
    hdrs = ["options_map.h"],
    visibility = ["//mediapipe/framework:mediapipe_internal"],
    deps = [
        "//mediapipe/framework:calculator_cc_proto",
        "//mediapipe/framework/port:any_proto",
        "//mediapipe/framework/port:status",
        "//mediapipe/framework/tool:type_util",
    ],
)

cc_library(
    name = "options_util",
    srcs = ["options_util.cc"],
    hdrs = ["options_util.h"],
    visibility = ["//mediapipe/framework:mediapipe_internal"],
    deps = [
        ":options_map",
        ":proto_util_lite",
        "//mediapipe/framework:calculator_cc_proto",
        "//mediapipe/framework:collection",

@@ -199,17 +212,6 @@ mediapipe_cc_test(
    ],
)

cc_library(
    name = "packet_util",
    hdrs = ["packet_util.h"],
    visibility = ["//visibility:public"],
    deps = [
        "//mediapipe/framework:packet",
        "//mediapipe/framework/port:statusor",
        "@org_tensorflow//tensorflow/core:protos_all_cc",
    ],
)

cc_library(
    name = "proto_util_lite",
    srcs = ["proto_util_lite.cc"],
@@ -681,6 +683,7 @@ cc_library(
        "//mediapipe/framework/port:logging",
        "//mediapipe/framework/port:ret_check",
        "//mediapipe/framework/port:status",
        "//mediapipe/framework/stream_handler:immediate_input_stream_handler",
        "//mediapipe/framework/tool:switch_container_cc_proto",
        "@com_google_absl//absl/strings",
    ],

@@ -706,6 +709,7 @@ cc_library(
        "//mediapipe/framework/port:logging",
        "//mediapipe/framework/port:ret_check",
        "//mediapipe/framework/port:status",
        "//mediapipe/framework/stream_handler:immediate_input_stream_handler",
        "//mediapipe/framework/tool:switch_container_cc_proto",
        "@com_google_absl//absl/strings",
    ],
mediapipe/framework/tool/options_map.h (new file, 107 lines)
@@ -0,0 +1,107 @@
#ifndef MEDIAPIPE_FRAMEWORK_TOOL_OPTIONS_MAP_H_
#define MEDIAPIPE_FRAMEWORK_TOOL_OPTIONS_MAP_H_

#include <map>
#include <memory>
#include <type_traits>

#include "mediapipe/framework/calculator.pb.h"
#include "mediapipe/framework/port/any_proto.h"
#include "mediapipe/framework/port/status.h"
#include "mediapipe/framework/tool/type_util.h"

namespace mediapipe {

namespace tool {

// A compile-time detector for the constant |T::ext|.
template <typename T>
struct IsExtension {
 private:
  template <typename U>
  static char test(decltype(&U::ext));

  template <typename>
  static int test(...);

 public:
  static constexpr bool value = (sizeof(test<T>(0)) == sizeof(char));
};

template <class T,
          typename std::enable_if<IsExtension<T>::value, int>::type = 0>
void GetExtension(const CalculatorOptions& options, T* result) {
  if (options.HasExtension(T::ext)) {
    *result = options.GetExtension(T::ext);
  }
}

template <class T,
          typename std::enable_if<!IsExtension<T>::value, int>::type = 0>
void GetExtension(const CalculatorOptions& options, T* result) {}

template <class T>
void GetNodeOptions(const CalculatorGraphConfig::Node& node_config, T* result) {
#if defined(MEDIAPIPE_PROTO_LITE) && defined(MEDIAPIPE_PROTO_THIRD_PARTY)
  // protobuf::Any is unavailable with third_party/protobuf:protobuf-lite.
#else
  for (const mediapipe::protobuf::Any& options : node_config.node_options()) {
    if (options.Is<T>()) {
      options.UnpackTo(result);
    }
  }
#endif
}

// A map from object type to object.
class TypeMap {
 public:
  template <class T>
  bool Has() const {
    return content_.count(TypeId<T>()) > 0;
  }
  template <class T>
  T* Get() const {
    if (!Has<T>()) {
      content_[TypeId<T>()] = std::make_shared<T>();
    }
    return static_cast<T*>(content_[TypeId<T>()].get());
  }

 private:
  mutable std::map<TypeIndex, std::shared_ptr<void>> content_;
};

// Extracts the options message of a specified type from a
// CalculatorGraphConfig::Node.
class OptionsMap {
 public:
  OptionsMap& Initialize(const CalculatorGraphConfig::Node& node_config) {
    node_config_ = &node_config;
    return *this;
  }

  // Returns the options data for a CalculatorGraphConfig::Node, from
  // either "options" or "node_options" using either GetExtension or UnpackTo.
  template <class T>
  const T& Get() const {
    if (options_.Has<T>()) {
      return *options_.Get<T>();
    }
    T* result = options_.Get<T>();
    if (node_config_->has_options()) {
      GetExtension(node_config_->options(), result);
    } else {
      GetNodeOptions(*node_config_, result);
    }
    return *result;
  }

  const CalculatorGraphConfig::Node* node_config_;
  TypeMap options_;
};

}  // namespace tool
}  // namespace mediapipe

#endif  // MEDIAPIPE_FRAMEWORK_TOOL_OPTIONS_MAP_H_
@@ -20,6 +20,7 @@
#include "mediapipe/framework/packet.h"
#include "mediapipe/framework/packet_set.h"
#include "mediapipe/framework/port/any_proto.h"
#include "mediapipe/framework/tool/options_map.h"
#include "mediapipe/framework/tool/type_util.h"

namespace mediapipe {

@@ -34,64 +35,6 @@ inline T MergeOptions(const T& base, const T& options) {
  return result;
}

// A compile-time detector for the constant |T::ext|.
template <typename T>
struct IsExtension {
 private:
  template <typename U>
  static char test(decltype(&U::ext));

  template <typename>
  static int test(...);

 public:
  static constexpr bool value = (sizeof(test<T>(0)) == sizeof(char));
};

// A map from object type to object.
class TypeMap {
 public:
  template <class T>
  bool Has() const {
    return content_.count(TypeId<T>()) > 0;
  }
  template <class T>
  T* Get() const {
    if (!Has<T>()) {
      content_[TypeId<T>()] = std::make_shared<T>();
    }
    return static_cast<T*>(content_[TypeId<T>()].get());
  }

 private:
  mutable std::map<TypeIndex, std::shared_ptr<void>> content_;
};

template <class T,
          typename std::enable_if<IsExtension<T>::value, int>::type = 0>
void GetExtension(const CalculatorOptions& options, T* result) {
  if (options.HasExtension(T::ext)) {
    *result = options.GetExtension(T::ext);
  }
}

template <class T,
          typename std::enable_if<!IsExtension<T>::value, int>::type = 0>
void GetExtension(const CalculatorOptions& options, T* result) {}

template <class T>
void GetNodeOptions(const CalculatorGraphConfig::Node& node_config, T* result) {
#if defined(MEDIAPIPE_PROTO_LITE) && defined(MEDIAPIPE_PROTO_THIRD_PARTY)
  // protobuf::Any is unavailable with third_party/protobuf:protobuf-lite.
#else
  for (const mediapipe::protobuf::Any& options : node_config.node_options()) {
    if (options.Is<T>()) {
      options.UnpackTo(result);
    }
  }
#endif
}

// Combine a base options message with an optional side packet. The specified
// packet can hold either the specified options type T or CalculatorOptions.
// Fields are either replaced or merged depending on field merge_fields.

@@ -132,35 +75,6 @@ inline T RetrieveOptions(const T& base, const InputStreamShardSet& stream_set,
  return base;
}

// Extracts the options message of a specified type from a
// CalculatorGraphConfig::Node.
class OptionsMap {
 public:
  OptionsMap& Initialize(const CalculatorGraphConfig::Node& node_config) {
    node_config_ = &node_config;
    return *this;
  }

  // Returns the options data for a CalculatorGraphConfig::Node, from
  // either "options" or "node_options" using either GetExtension or UnpackTo.
  template <class T>
  const T& Get() const {
    if (options_.Has<T>()) {
      return *options_.Get<T>();
    }
    T* result = options_.Get<T>();
    if (node_config_->has_options()) {
      GetExtension(node_config_->options(), result);
    } else {
      GetNodeOptions(*node_config_, result);
    }
    return *result;
  }

  const CalculatorGraphConfig::Node* node_config_;
  TypeMap options_;
};

// Finds the descriptor for a protobuf.
const proto_ns::Descriptor* GetProtobufDescriptor(const std::string& type_name);
@@ -1,57 +0,0 @@
// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef MEDIAPIPE_FRAMEWORK_TOOL_PACKET_UTIL_H_
#define MEDIAPIPE_FRAMEWORK_TOOL_PACKET_UTIL_H_

#include "mediapipe/framework/packet.h"
#include "tensorflow/core/example/example.pb.h"

namespace mediapipe {
namespace tool {
// The CLIF-friendly util functions to create and access a typed MediaPipe
// Packet from MediaPipe Python interface.

// Functions for SequenceExample Packets.

// Make a SequenceExample packet from a serialized SequenceExample.
// The SequenceExample in the Packet is owned by the C++ packet.
Packet CreateSequenceExamplePacketFromString(std::string* serialized_content) {
  tensorflow::SequenceExample sequence_example;
  sequence_example.ParseFromString(*serialized_content);
  return MakePacket<tensorflow::SequenceExample>(sequence_example);
}

// Get a serialized SequenceExample std::string from a Packet.
// The ownership of the returned std::string will be transferred to the Python
// object.
std::unique_ptr<std::string> GetSerializedSequenceExample(Packet* packet) {
  return absl::make_unique<std::string>(
      packet->Get<tensorflow::SequenceExample>().SerializeAsString());
}

// Make a String packet
Packet CreateStringPacket(std::string* input_string) {
  return MakePacket<std::string>(*input_string);
}

// Get the std::string from a Packet<std::string>
std::unique_ptr<std::string> GetString(Packet* packet) {
  return absl::make_unique<std::string>(packet->Get<std::string>());
}

}  // namespace tool
}  // namespace mediapipe

#endif  // MEDIAPIPE_FRAMEWORK_TOOL_PACKET_UTIL_H_
@@ -16,8 +16,6 @@
#define MEDIAPIPE_FRAMEWORK_TOOL_TYPE_UTIL_H_

#include <cstddef>
#include <string>
#include <typeindex>
#include <typeinfo>

#include "mediapipe/framework/port.h"
@@ -142,7 +142,7 @@ def _metal_library_impl(ctx):
    if ctx.files.hdrs:
        additional_params["header"] = depset([f for f in ctx.files.hdrs])
    objc_provider = apple_common.new_objc_provider(
        providers = [x.objc for x in ctx.attr.deps if hasattr(x, "objc")],
        providers = [x[apple_common.Objc] for x in ctx.attr.deps if apple_common.Objc in x],
        **additional_params
    )

@@ -169,7 +169,7 @@ def _metal_library_impl(ctx):
METAL_LIBRARY_ATTRS = dicts.add(apple_support.action_required_attrs(), {
    "srcs": attr.label_list(allow_files = [".metal"], allow_empty = False),
    "hdrs": attr.label_list(allow_files = [".h"]),
    "deps": attr.label_list(providers = [["objc", CcInfo]]),
    "deps": attr.label_list(providers = [["objc", CcInfo], [apple_common.Objc, CcInfo]]),
    "copts": attr.string_list(),
    "minimum_os_version": attr.string(),
})
@@ -40,8 +40,8 @@ node: {
  output_stream: "LETTERBOX_PADDING:letterbox_padding"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 256
      output_height: 256
      output_width: 192
      output_height: 192
      scale_mode: FIT
    }
  }

@@ -76,19 +76,17 @@ node {
  output_side_packet: "anchors"
  node_options: {
    [type.googleapis.com/mediapipe.SsdAnchorsCalculatorOptions] {
      num_layers: 4
      min_scale: 0.15625
      num_layers: 1
      min_scale: 0.1484375
      max_scale: 0.75
      input_size_height: 256
      input_size_width: 256
      input_size_height: 192
      input_size_width: 192
      anchor_offset_x: 0.5
      anchor_offset_y: 0.5
      strides: 16
      strides: 32
      strides: 32
      strides: 32
      strides: 4
      aspect_ratios: 1.0
      fixed_anchor_size: true
      interpolated_scale_aspect_ratio: 0.0
    }
  }
}

@@ -104,7 +102,7 @@ node {
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteTensorsToDetectionsCalculatorOptions] {
      num_classes: 1
      num_boxes: 896
      num_boxes: 2304
      num_coords: 16
      box_coord_offset: 0
      keypoint_coord_offset: 4

@@ -113,11 +111,11 @@ node {
      sigmoid_score: true
      score_clipping_thresh: 100.0
      reverse_output_order: true
      x_scale: 256.0
      y_scale: 256.0
      h_scale: 256.0
      w_scale: 256.0
      min_score_thresh: 0.65
      x_scale: 192.0
      y_scale: 192.0
      h_scale: 192.0
      w_scale: 192.0
      min_score_thresh: 0.6
    }
  }
}
@@ -41,8 +41,8 @@ node: {
  output_stream: "LETTERBOX_PADDING:letterbox_padding"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 256
      output_height: 256
      output_width: 192
      output_height: 192
      scale_mode: FIT
    }
  }

@@ -77,19 +77,17 @@ node {
  output_side_packet: "anchors"
  node_options: {
    [type.googleapis.com/mediapipe.SsdAnchorsCalculatorOptions] {
      num_layers: 4
      min_scale: 0.15625
      num_layers: 1
      min_scale: 0.1484375
      max_scale: 0.75
      input_size_height: 256
      input_size_width: 256
      input_size_height: 192
      input_size_width: 192
      anchor_offset_x: 0.5
      anchor_offset_y: 0.5
      strides: 16
      strides: 32
      strides: 32
      strides: 32
      strides: 4
      aspect_ratios: 1.0
      fixed_anchor_size: true
      interpolated_scale_aspect_ratio: 0.0
    }
  }
}

@@ -105,7 +103,7 @@ node {
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteTensorsToDetectionsCalculatorOptions] {
      num_classes: 1
      num_boxes: 896
      num_boxes: 2304
      num_coords: 16
      box_coord_offset: 0
      keypoint_coord_offset: 4

@@ -114,11 +112,11 @@ node {
      sigmoid_score: true
      score_clipping_thresh: 100.0
      reverse_output_order: true
      x_scale: 256.0
      y_scale: 256.0
      h_scale: 256.0
      w_scale: 256.0
      min_score_thresh: 0.65
      x_scale: 192.0
      y_scale: 192.0
      h_scale: 192.0
      w_scale: 192.0
      min_score_thresh: 0.6
    }
  }
}
@@ -15,9 +15,9 @@
#include <cmath>
#include <memory>

#include "Eigen/Core"
#include "Eigen/Dense"
#include "Eigen/src/Core/util/Constants.h"
#include "Eigen/src/Geometry/Quaternion.h"
#include "Eigen/Geometry"
#include "absl/memory/memory.h"
#include "absl/strings/str_cat.h"
#include "absl/strings/str_join.h"

@@ -14,9 +14,9 @@
#include <memory>

#include "Eigen/Core"
#include "Eigen/Dense"
#include "Eigen/src/Core/util/Constants.h"
#include "Eigen/src/Geometry/Quaternion.h"
#include "Eigen/Geometry"
#include "absl/memory/memory.h"
#include "absl/strings/str_cat.h"
#include "absl/strings/str_join.h"
mediapipe/graphs/selfie_segmentation/BUILD (new file, 54 lines)
@@ -0,0 +1,54 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

load(
    "//mediapipe/framework/tool:mediapipe_graph.bzl",
    "mediapipe_binary_graph",
)

licenses(["notice"])

package(default_visibility = ["//visibility:public"])

cc_library(
    name = "selfie_segmentation_gpu_deps",
    deps = [
        "//mediapipe/calculators/core:flow_limiter_calculator",
        "//mediapipe/calculators/image:recolor_calculator",
        "//mediapipe/modules/selfie_segmentation:selfie_segmentation_gpu",
    ],
)

mediapipe_binary_graph(
    name = "selfie_segmentation_gpu_binary_graph",
    graph = "selfie_segmentation_gpu.pbtxt",
    output_name = "selfie_segmentation_gpu.binarypb",
    deps = [":selfie_segmentation_gpu_deps"],
)

cc_library(
    name = "selfie_segmentation_cpu_deps",
    deps = [
        "//mediapipe/calculators/core:flow_limiter_calculator",
        "//mediapipe/calculators/image:recolor_calculator",
        "//mediapipe/modules/selfie_segmentation:selfie_segmentation_cpu",
    ],
)

mediapipe_binary_graph(
    name = "selfie_segmentation_cpu_binary_graph",
    graph = "selfie_segmentation_cpu.pbtxt",
    output_name = "selfie_segmentation_cpu.binarypb",
    deps = [":selfie_segmentation_cpu_deps"],
)
@@ -0,0 +1,52 @@
# MediaPipe graph that performs selfie segmentation with TensorFlow Lite on CPU.

# CPU buffer. (ImageFrame)
input_stream: "input_video"

# Output image with rendered results. (ImageFrame)
output_stream: "output_video"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most parts of the graph
# to 1. This prevents the downstream nodes from queuing up incoming images and
# data excessively, which leads to increased latency and memory usage, unwanted
# in real-time mobile applications. It also eliminates unnecessary computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Subgraph that performs selfie segmentation.
node {
  calculator: "SelfieSegmentationCpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "SEGMENTATION_MASK:segmentation_mask"
}

# Colors the selfie segmentation with the color specified in the option.
node {
  calculator: "RecolorCalculator"
  input_stream: "IMAGE:throttled_input_video"
  input_stream: "MASK:segmentation_mask"
  output_stream: "IMAGE:output_video"
  node_options: {
    [type.googleapis.com/mediapipe.RecolorCalculatorOptions] {
      color { r: 0 g: 0 b: 255 }
      mask_channel: RED
      invert_mask: true
      adjust_with_luminance: false
    }
  }
}
@@ -0,0 +1,52 @@
# MediaPipe graph that performs selfie segmentation with TensorFlow Lite on GPU.

# GPU buffer. (GpuBuffer)
input_stream: "input_video"

# Output image with rendered results. (GpuBuffer)
output_stream: "output_video"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most parts of the graph
# to 1. This prevents the downstream nodes from queuing up incoming images and
# data excessively, which leads to increased latency and memory usage, unwanted
# in real-time mobile applications. It also eliminates unnecessary computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Subgraph that performs selfie segmentation.
node {
  calculator: "SelfieSegmentationGpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "SEGMENTATION_MASK:segmentation_mask"
}

# Colors the selfie segmentation with the color specified in the option.
node {
  calculator: "RecolorCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "MASK_GPU:segmentation_mask"
  output_stream: "IMAGE_GPU:output_video"
  node_options: {
    [type.googleapis.com/mediapipe.RecolorCalculatorOptions] {
      color { r: 0 g: 0 b: 255 }
      mask_channel: RED
      invert_mask: true
      adjust_with_luminance: false
    }
  }
}
@ -0,0 +1,223 @@
|
|||
// Copyright 2019-2021 The MediaPipe Authors.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
package com.google.mediapipe.components;
|
||||
|
||||
import android.graphics.SurfaceTexture;
|
||||
import android.opengl.GLES11Ext;
|
||||
import android.opengl.GLES20;
|
||||
import android.opengl.GLSurfaceView;
|
||||
import android.opengl.Matrix;
|
||||
import android.util.Log;
|
||||
import com.google.mediapipe.framework.TextureFrame;
|
||||
import com.google.mediapipe.glutil.CommonShaders;
|
||||
import com.google.mediapipe.glutil.ShaderUtil;
|
||||
import java.nio.FloatBuffer;
|
||||
import java.util.HashMap;
|
||||
import java.util.Map;
|
||||
import java.util.concurrent.atomic.AtomicReference;
|
||||
import javax.microedition.khronos.egl.EGLConfig;
|
||||
import javax.microedition.khronos.opengles.GL10;
|
||||
|
||||
/**
|
||||
* Renderer for a {@link GLSurfaceView}. It displays a texture. The texture is scaled and cropped as
|
||||
* necessary to fill the view, while maintaining its aspect ratio.
|
||||
*
|
||||
* <p>It can render both textures bindable to the normal {@link GLES20#GL_TEXTURE_2D} target as well
|
||||
* as textures bindable to {@link GLES11Ext#GL_TEXTURE_EXTERNAL_OES}, which is used for Android
|
||||
* surfaces. Call {@link #setTextureTarget(int)} to choose the correct target.
|
||||
*
|
||||
* <p>It can display a {@link SurfaceTexture} (call {@link #setSurfaceTexture(SurfaceTexture)}) or a
|
||||
* {@link TextureFrame} (call {@link #setNextFrame(TextureFrame)}).
|
||||
*/
|
||||
public class GlSurfaceViewRenderer implements GLSurfaceView.Renderer {
|
||||
private static final String TAG = "DemoRenderer";
|
||||
private static final int ATTRIB_POSITION = 1;
|
||||
private static final int ATTRIB_TEXTURE_COORDINATE = 2;
|
||||
|
||||
private int surfaceWidth;
|
||||
private int surfaceHeight;
|
||||
private int frameWidth = 0;
|
||||
private int frameHeight = 0;
|
||||
private int program = 0;
|
||||
private int frameUniform;
|
||||
private int textureTarget = GLES11Ext.GL_TEXTURE_EXTERNAL_OES;
|
||||
private int textureTransformUniform;
|
||||
// Controls the alignment between frame size and surface size, 0.5f default is centered.
|
||||
private float alignmentHorizontal = 0.5f;
|
||||
private float alignmentVertical = 0.5f;
|
||||
private float[] textureTransformMatrix = new float[16];
|
||||
private SurfaceTexture surfaceTexture = null;
|
||||
private final AtomicReference<TextureFrame> nextFrame = new AtomicReference<>();
|
||||
|
||||
@Override
|
||||
public void onSurfaceCreated(GL10 gl, EGLConfig config) {
|
||||
if (surfaceTexture == null) {
|
||||
Matrix.setIdentityM(textureTransformMatrix, 0 /* offset */);
|
||||
}
|
||||
Map<String, Integer> attributeLocations = new HashMap<>();
|
||||
attributeLocations.put("position", ATTRIB_POSITION);
|
||||
attributeLocations.put("texture_coordinate", ATTRIB_TEXTURE_COORDINATE);
|
||||
Log.d(TAG, "external texture: " + isExternalTexture());
|
||||
program =
|
||||
ShaderUtil.createProgram(
|
||||
CommonShaders.VERTEX_SHADER,
|
||||
isExternalTexture()
|
||||
? CommonShaders.FRAGMENT_SHADER_EXTERNAL
|
||||
: CommonShaders.FRAGMENT_SHADER,
|
||||
attributeLocations);
|
||||
frameUniform = GLES20.glGetUniformLocation(program, "video_frame");
|
||||
textureTransformUniform = GLES20.glGetUniformLocation(program, "texture_transform");
|
||||
ShaderUtil.checkGlError("glGetUniformLocation");
|
||||
|
||||
GLES20.glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void onSurfaceChanged(GL10 gl, int width, int height) {
|
||||
surfaceWidth = width;
|
||||
surfaceHeight = height;
|
||||
GLES20.glViewport(0, 0, width, height);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void onDrawFrame(GL10 gl) {
|
||||
TextureFrame frame = nextFrame.getAndSet(null);
|
||||
|
||||
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
|
||||
ShaderUtil.checkGlError("glClear");
|
||||
|
||||
if (surfaceTexture == null && frame == null) {
|
||||
return;
|
||||
}
|
||||
|
||||
GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
|
||||
ShaderUtil.checkGlError("glActiveTexture");
|
||||
if (surfaceTexture != null) {
|
||||
surfaceTexture.updateTexImage();
|
||||
surfaceTexture.getTransformMatrix(textureTransformMatrix);
|
||||
} else {
|
||||
GLES20.glBindTexture(textureTarget, frame.getTextureName());
|
||||
ShaderUtil.checkGlError("glBindTexture");
|
||||
}
|
||||
GLES20.glTexParameteri(textureTarget, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
|
||||
GLES20.glTexParameteri(textureTarget, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
|
||||
GLES20.glTexParameteri(textureTarget, GLES20.GL_TEXTURE_WRAP_S, GLES20.GL_CLAMP_TO_EDGE);
|
||||
GLES20.glTexParameteri(textureTarget, GLES20.GL_TEXTURE_WRAP_T, GLES20.GL_CLAMP_TO_EDGE);
|
||||
ShaderUtil.checkGlError("texture setup");
|
||||
|
||||
GLES20.glUseProgram(program);
|
||||
GLES20.glUniform1i(frameUniform, 0);
|
||||
GLES20.glUniformMatrix4fv(textureTransformUniform, 1, false, textureTransformMatrix, 0);
|
||||
ShaderUtil.checkGlError("glUniformMatrix4fv");
|
||||
GLES20.glEnableVertexAttribArray(ATTRIB_POSITION);
|
||||
GLES20.glVertexAttribPointer(
|
||||
ATTRIB_POSITION, 2, GLES20.GL_FLOAT, false, 0, CommonShaders.SQUARE_VERTICES);
|
||||
|
||||
// TODO: compute scale from surfaceTexture size.
|
||||
float scaleWidth = frameWidth > 0 ? (float) surfaceWidth / (float) frameWidth : 1.0f;
|
||||
    float scaleHeight = frameHeight > 0 ? (float) surfaceHeight / (float) frameHeight : 1.0f;
    // Whichever of the two scales is greater corresponds to the dimension where the image
    // is proportionally smaller than the view. Dividing both scales by that number results
    // in that dimension having scale 1.0, and thus touching the edges of the view, while the
    // other is cropped proportionally.
    float maxScale = Math.max(scaleWidth, scaleHeight);
    scaleWidth /= maxScale;
    scaleHeight /= maxScale;

    // Alignment controls where the visible section is placed within the full camera frame, with
    // (0, 0) being the bottom left, and (1, 1) being the top right.
    float textureLeft = (1.0f - scaleWidth) * alignmentHorizontal;
    float textureRight = textureLeft + scaleWidth;
    float textureBottom = (1.0f - scaleHeight) * alignmentVertical;
    float textureTop = textureBottom + scaleHeight;

    // Unlike on iOS, there is no need to flip the surfaceTexture here.
    // But for regular textures, we will need to flip them.
    final FloatBuffer passThroughTextureVertices =
        ShaderUtil.floatBuffer(
            textureLeft, textureBottom,
            textureRight, textureBottom,
            textureLeft, textureTop,
            textureRight, textureTop);
    GLES20.glEnableVertexAttribArray(ATTRIB_TEXTURE_COORDINATE);
    GLES20.glVertexAttribPointer(
        ATTRIB_TEXTURE_COORDINATE, 2, GLES20.GL_FLOAT, false, 0, passThroughTextureVertices);
    ShaderUtil.checkGlError("program setup");

    GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4);
    ShaderUtil.checkGlError("glDrawArrays");
    GLES20.glBindTexture(textureTarget, 0);
    ShaderUtil.checkGlError("unbind surfaceTexture");

    // We must flush before releasing the frame.
    GLES20.glFlush();

    if (frame != null) {
      frame.release();
    }
  }

  public void setTextureTarget(int target) {
    if (program != 0) {
      throw new IllegalStateException(
          "setTextureTarget must be called before the surface is created");
    }
    textureTarget = target;
  }

  public void setSurfaceTexture(SurfaceTexture texture) {
    if (!isExternalTexture()) {
      throw new IllegalStateException(
          "to use a SurfaceTexture, the texture target must be GL_TEXTURE_EXTERNAL_OES");
    }
    TextureFrame oldFrame = nextFrame.getAndSet(null);
    if (oldFrame != null) {
      oldFrame.release();
    }
    surfaceTexture = texture;
  }

  // Use this when the texture is not a SurfaceTexture.
  public void setNextFrame(TextureFrame frame) {
    if (surfaceTexture != null) {
      Matrix.setIdentityM(textureTransformMatrix, 0 /* offset */);
    }
    TextureFrame oldFrame = nextFrame.getAndSet(frame);
    if (oldFrame != null
        && (frame == null || (oldFrame.getTextureName() != frame.getTextureName()))) {
      oldFrame.release();
    }
    surfaceTexture = null;
  }

  public void setFrameSize(int width, int height) {
    frameWidth = width;
    frameHeight = height;
  }

  /**
   * When the aspect ratios between the camera frame and the surface size are mismatched, this
   * controls how the image is aligned. 0.0 means aligning the left/bottom edges; 1.0 means aligning
   * the right/top edges; 0.5 (default) means aligning the centers.
   */
  public void setAlignment(float horizontal, float vertical) {
    alignmentHorizontal = horizontal;
    alignmentVertical = vertical;
  }

  private boolean isExternalTexture() {
    return textureTarget == GLES11Ext.GL_TEXTURE_EXTERNAL_OES;
  }
}
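The center-crop arithmetic in the renderer above can be exercised standalone. The class name and the frame/surface sizes below are illustrative, not part of MediaPipe; the math mirrors the scale and alignment computation verbatim.

```java
/** Standalone sketch of the center-crop texture-coordinate math above (illustrative only). */
public class CropMath {
  /** Returns {left, right, bottom, top} texture coordinates. */
  static float[] cropCoords(
      int frameWidth, int frameHeight, int surfaceWidth, int surfaceHeight,
      float alignH, float alignV) {
    float scaleWidth = frameWidth > 0 ? (float) surfaceWidth / frameWidth : 1.0f;
    float scaleHeight = frameHeight > 0 ? (float) surfaceHeight / frameHeight : 1.0f;
    // The larger scale marks the dimension where the image is proportionally smaller than the
    // view; dividing both by it pins that dimension to scale 1.0 and crops the other.
    float maxScale = Math.max(scaleWidth, scaleHeight);
    scaleWidth /= maxScale;
    scaleHeight /= maxScale;
    // Alignment slides the visible window inside the cropped dimension.
    float left = (1.0f - scaleWidth) * alignH;
    float bottom = (1.0f - scaleHeight) * alignV;
    return new float[] {left, left + scaleWidth, bottom, bottom + scaleHeight};
  }

  public static void main(String[] args) {
    // A 640x480 frame in a 480x480 surface, centered: the middle 75% is visible horizontally.
    float[] c = cropCoords(640, 480, 480, 480, 0.5f, 0.5f);
    System.out.printf("left=%.3f right=%.3f bottom=%.3f top=%.3f%n", c[0], c[1], c[2], c[3]);
  }
}
```

With centered alignment the crop is symmetric; alignment 0.0 or 1.0 pins the window to one edge instead.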
@@ -16,6 +16,7 @@ package com.google.mediapipe.framework;

import android.graphics.Bitmap;
import java.nio.ByteBuffer;
import java.util.List;

// TODO: use Preconditions in this file.
/**
@@ -444,8 +444,16 @@ JNIEXPORT jlong JNICALL PACKET_GETTER_METHOD(nativeGetGpuBuffer)(JNIEnv* env,
      mediapipe::android::Graph::GetPacketFromHandle(packet);
  mediapipe::GlTextureBufferSharedPtr ptr;
  if (mediapipe_packet.ValidateAsType<mediapipe::Image>().ok()) {
    const mediapipe::Image& buffer = mediapipe_packet.Get<mediapipe::Image>();
    ptr = buffer.GetGlTextureBufferSharedPtr();
    auto mediapipe_graph =
        mediapipe::android::Graph::GetContextFromHandle(packet);
    auto gl_context = mediapipe_graph->GetGpuResources()->gl_context();
    auto status =
        gl_context->Run([gl_context, mediapipe_packet, &ptr]() -> absl::Status {
          const mediapipe::Image& buffer =
              mediapipe_packet.Get<mediapipe::Image>();
          ptr = buffer.GetGlTextureBufferSharedPtr();
          return absl::OkStatus();
        });
  } else {
    const mediapipe::GpuBuffer& buffer =
        mediapipe_packet.Get<mediapipe::GpuBuffer>();
67
mediapipe/java/com/google/mediapipe/solutionbase/BUILD
Normal file
@@ -0,0 +1,67 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

package(default_visibility = ["//visibility:public"])

licenses(["notice"])

android_library(
    name = "solution_base",
    srcs = glob(
        ["*.java"],
        exclude = [
            "CameraInput.java",
        ],
    ),
    visibility = ["//visibility:public"],
    deps = [
        "//mediapipe/java/com/google/mediapipe/framework:android_framework",
        "//mediapipe/java/com/google/mediapipe/glutil",
        "//third_party:autovalue",
        "@maven//:com_google_code_findbugs_jsr305",
        "@maven//:com_google_guava_guava",
    ],
)

android_library(
    name = "camera_input",
    srcs = ["CameraInput.java"],
    visibility = ["//visibility:public"],
    deps = [
        "//mediapipe/java/com/google/mediapipe/components:android_camerax_helper",
        "//mediapipe/java/com/google/mediapipe/components:android_components",
        "//mediapipe/java/com/google/mediapipe/framework:android_framework",
        "@maven//:com_google_guava_guava",
    ],
)

# Native dependencies of all MediaPipe solutions.
cc_binary(
    name = "libmediapipe_jni.so",
    linkshared = 1,
    linkstatic = 1,
    # TODO: Add more calculators to support other top-level solutions.
    deps = [
        "//mediapipe/java/com/google/mediapipe/framework/jni:mediapipe_framework_jni",
        "//mediapipe/modules/hand_landmark:hand_landmark_tracking_gpu_image",
    ],
)

# Converts the .so cc_binary into a cc_library, to be consumed in an android_binary.
cc_library(
    name = "mediapipe_jni_lib",
    srcs = [":libmediapipe_jni.so"],
    visibility = ["//visibility:public"],
    alwayslink = 1,
)
@@ -0,0 +1,109 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import android.app.Activity;
import com.google.mediapipe.components.CameraHelper;
import com.google.mediapipe.components.CameraXPreviewHelper;
import com.google.mediapipe.components.ExternalTextureConverter;
import com.google.mediapipe.components.PermissionHelper;
import com.google.mediapipe.components.TextureFrameConsumer;
import com.google.mediapipe.framework.MediaPipeException;
import com.google.mediapipe.framework.TextureFrame;
import javax.microedition.khronos.egl.EGLContext;

/**
 * The camera component that takes the camera input and produces MediaPipe {@link TextureFrame}
 * objects.
 */
public class CameraInput {
  private static final String TAG = "CameraInput";

  /** Represents the direction the camera faces relative to the device screen. */
  public static enum CameraFacing {
    FRONT,
    BACK
  }

  private final CameraXPreviewHelper cameraHelper;
  private TextureFrameConsumer cameraNewFrameListener;
  private ExternalTextureConverter converter;

  /**
   * Initializes CameraInput and requests camera permissions.
   *
   * @param activity an Android {@link Activity}.
   */
  public CameraInput(Activity activity) {
    cameraHelper = new CameraXPreviewHelper();
    PermissionHelper.checkAndRequestCameraPermissions(activity);
  }

  /**
   * Sets a callback to be invoked when new frames become available.
   *
   * @param listener the callback.
   */
  public void setCameraNewFrameListener(TextureFrameConsumer listener) {
    cameraNewFrameListener = listener;
  }

  /**
   * Sets up the external texture converter and starts the camera.
   *
   * @param activity an Android {@link Activity}.
   * @param eglContext an OpenGL {@link EGLContext}.
   * @param cameraFacing the direction the camera faces relative to the device screen.
   * @param width the desired width of the converted texture.
   * @param height the desired height of the converted texture.
   */
  public void start(
      Activity activity, EGLContext eglContext, CameraFacing cameraFacing, int width, int height) {
    if (!PermissionHelper.cameraPermissionsGranted(activity)) {
      return;
    }
    if (converter == null) {
      converter = new ExternalTextureConverter(eglContext, 2);
    }
    if (cameraNewFrameListener == null) {
      throw new MediaPipeException(
          MediaPipeException.StatusCode.FAILED_PRECONDITION.ordinal(),
          "cameraNewFrameListener is not set.");
    }
    converter.setConsumer(cameraNewFrameListener);
    cameraHelper.setOnCameraStartedListener(
        surfaceTexture ->
            converter.setSurfaceTextureAndAttachToGLContext(surfaceTexture, width, height));
    cameraHelper.startCamera(
        activity,
        cameraFacing == CameraFacing.FRONT
            ? CameraHelper.CameraFacing.FRONT
            : CameraHelper.CameraFacing.BACK,
        /*unusedSurfaceTexture=*/ null,
        null);
  }

  /** Stops the camera input. */
  public void stop() {
    if (converter != null) {
      converter.close();
    }
  }

  /** Returns true if the camera is in portrait mode, false if in landscape mode. */
  public boolean isCameraRotated() {
    return cameraHelper.isCameraRotated();
  }
}
@@ -0,0 +1,20 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

/** Interface for the customizable MediaPipe solution error listener. */
public interface ErrorListener {
  void onError(String message, RuntimeException e);
}
@@ -0,0 +1,174 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import android.content.Context;
import android.graphics.Bitmap;
import android.util.Log;
import com.google.mediapipe.framework.MediaPipeException;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.framework.TextureFrame;
import com.google.mediapipe.glutil.EglManager;
import java.util.concurrent.atomic.AtomicInteger;
import javax.microedition.khronos.egl.EGLContext;

/** The base class of the MediaPipe image solutions. */
// TODO: Consolidate the "send" methods into a single "send(MlImage image)".
public class ImageSolutionBase extends SolutionBase {
  public static final String TAG = "ImageSolutionBase";
  protected boolean staticImageMode;
  private EglManager eglManager;
  // Internal fake timestamp for static images.
  private final AtomicInteger staticImageTimestamp = new AtomicInteger(0);

  /**
   * Initializes the MediaPipe image solution base with an Android context, solution-specific
   * settings, and a solution result handler.
   *
   * @param context an Android {@link Context}.
   * @param solutionInfo a {@link SolutionInfo} that contains the binary graph file path and the
   *     graph input and output stream names.
   * @param outputHandler an {@link OutputHandler} that handles the solution graph output packets
   *     and runtime exceptions.
   */
  @Override
  public synchronized void initialize(
      Context context,
      SolutionInfo solutionInfo,
      OutputHandler<? extends SolutionResult> outputHandler) {
    staticImageMode = solutionInfo.staticImageMode();
    try {
      super.initialize(context, solutionInfo, outputHandler);
      eglManager = new EglManager(/*parentContext=*/ null);
      solutionGraph.setParentGlContext(eglManager.getNativeContext());
    } catch (MediaPipeException e) {
      throwException("Error occurs when creating MediaPipe image solution graph. ", e);
    }
  }

  /** Returns the managed {@link EGLContext} to share the OpenGL context with other components. */
  public EGLContext getGlContext() {
    return eglManager.getContext();
  }

  /** Returns the OpenGL major version number. */
  public int getGlMajorVersion() {
    return eglManager.getGlMajorVersion();
  }

  /** Sends a {@link TextureFrame} into the solution graph for processing. */
  public void send(TextureFrame textureFrame) {
    if (!staticImageMode && textureFrame.getTimestamp() == Long.MIN_VALUE) {
      throwException(
          "Error occurs when calling the solution send method. ",
          new MediaPipeException(
              MediaPipeException.StatusCode.FAILED_PRECONDITION.ordinal(),
              "TextureFrame's timestamp needs to be explicitly set if not in static image mode."));
      return;
    }
    long timestampUs =
        staticImageMode ? staticImageTimestamp.getAndIncrement() : textureFrame.getTimestamp();
    sendImage(textureFrame, timestampUs);
  }

  /**
   * Sends a {@link Bitmap} with a timestamp into the solution graph for processing. In static image
   * mode, the timestamp is ignored.
   */
  public void send(Bitmap inputBitmap, long timestamp) {
    if (staticImageMode) {
      Log.w(TAG, "In static image mode, the MediaPipe solution ignores the input timestamp.");
    }
    sendImage(inputBitmap, staticImageMode ? staticImageTimestamp.getAndIncrement() : timestamp);
  }

  /** Sends a {@link Bitmap} (static image) into the solution graph for processing. */
  public void send(Bitmap inputBitmap) {
    if (!staticImageMode) {
      throwException(
          "Error occurs when calling the solution send method. ",
          new MediaPipeException(
              MediaPipeException.StatusCode.FAILED_PRECONDITION.ordinal(),
              "When not in static image mode, a timestamp associated with the image is required."
                  + " Use send(Bitmap inputBitmap, long timestamp) instead."));
      return;
    }
    sendImage(inputBitmap, staticImageTimestamp.getAndIncrement());
  }

  /** Internal implementation of sending a Bitmap/TextureFrame into the MediaPipe solution. */
  private synchronized <T> void sendImage(T imageObj, long timestamp) {
    if (lastTimestamp >= timestamp) {
      throwException(
          "The received frame has a smaller timestamp than the processed timestamp.",
          new MediaPipeException(
              MediaPipeException.StatusCode.FAILED_PRECONDITION.ordinal(),
              "Receiving a frame with invalid timestamp."));
      return;
    }
    lastTimestamp = timestamp;
    Packet imagePacket = null;
    try {
      if (imageObj instanceof TextureFrame) {
        imagePacket = packetCreator.createImage((TextureFrame) imageObj);
        imageObj = null;
      } else if (imageObj instanceof Bitmap) {
        imagePacket = packetCreator.createRgbaImage((Bitmap) imageObj);
      } else {
        throwException(
            "The input image type is not supported. ",
            new MediaPipeException(
                MediaPipeException.StatusCode.UNIMPLEMENTED.ordinal(),
                "The input image type is not supported."));
      }

      try {
        // addConsumablePacketToInputStream allows the graph to take exclusive ownership of the
        // packet, which may allow for more memory optimizations.
        solutionGraph.addConsumablePacketToInputStream(
            imageInputStreamName, imagePacket, timestamp);
        // If addConsumablePacket succeeded, we don't need to release the packet ourselves.
        imagePacket = null;
      } catch (MediaPipeException e) {
        // TODO: do not suppress exceptions here!
        if (errorListener == null) {
          Log.e(TAG, "Mediapipe error: ", e);
        } else {
          throw e;
        }
      }
    } catch (RuntimeException e) {
      if (errorListener != null) {
        errorListener.onError("Mediapipe error: ", e);
      } else {
        throw e;
      }
    } finally {
      if (imagePacket != null) {
        // In case of error, addConsumablePacketToInputStream will not release the packet, so we
        // have to release it ourselves. (We could also re-try adding, but we don't.)
        imagePacket.release();
      }
      if (imageObj instanceof TextureFrame) {
        // imagePacket will release the frame if it has been created, but if not, we need to
        // release it here. (instanceof already guarantees imageObj is non-null.)
        ((TextureFrame) imageObj).release();
      }
    }
  }
}
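The timestamp handling in `sendImage` above (reject non-increasing timestamps; in static image mode, substitute an internal counter) can be sketched without any MediaPipe dependency. All names below are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Dependency-free sketch of the timestamp policy in ImageSolutionBase.sendImage above. */
public class TimestampPolicy {
  private long lastTimestamp = Long.MIN_VALUE;
  private final AtomicInteger staticImageTimestamp = new AtomicInteger(0);
  private final boolean staticImageMode;

  TimestampPolicy(boolean staticImageMode) {
    this.staticImageMode = staticImageMode;
  }

  /** Returns the timestamp to use, or -1 if the frame must be rejected as out of order. */
  long admit(long frameTimestamp) {
    // Static image mode ignores the caller's timestamp and uses an internal fake counter.
    long ts = staticImageMode ? staticImageTimestamp.getAndIncrement() : frameTimestamp;
    if (lastTimestamp >= ts) {
      return -1; // mirrors the throwException branch above: timestamps must strictly increase
    }
    lastTimestamp = ts;
    return ts;
  }
}
```

In live (video) mode the caller's timestamps drive the graph; in static image mode every `send` advances the fake clock by one, so repeated sends of the same image still have increasing timestamps.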
@@ -0,0 +1,59 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import android.graphics.Bitmap;
import com.google.mediapipe.framework.AndroidPacketGetter;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.framework.PacketGetter;
import com.google.mediapipe.framework.TextureFrame;

/**
 * The base class of any MediaPipe image solution result. The base class contains the common parts
 * across all image solution results, including the input timestamp and the input image data. A new
 * MediaPipe image solution result class should extend ImageSolutionResult.
 */
public class ImageSolutionResult implements SolutionResult {
  protected long timestamp;
  protected Packet imagePacket;
  private Bitmap cachedBitmap;

  // Result timestamp, which is set to the timestamp of the corresponding input image. May return
  // Long.MIN_VALUE if the input image is not associated with a timestamp.
  @Override
  public long timestamp() {
    return timestamp;
  }

  // Returns the corresponding input image as a {@link Bitmap}.
  public Bitmap inputBitmap() {
    if (cachedBitmap != null) {
      return cachedBitmap;
    }
    cachedBitmap = AndroidPacketGetter.getBitmapFromRgba(imagePacket);
    return cachedBitmap;
  }

  // Returns the corresponding input image as a {@link TextureFrame}. The caller must release the
  // acquired {@link TextureFrame} after use.
  public TextureFrame acquireTextureFrame() {
    return PacketGetter.getTextureFrame(imagePacket);
  }

  // Releases the image packet and the underlying data.
  void releaseImagePacket() {
    imagePacket.release();
  }
}
@@ -0,0 +1,86 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import android.util.Log;
import com.google.mediapipe.framework.MediaPipeException;
import com.google.mediapipe.framework.Packet;
import java.util.List;

/** Handles MediaPipe solution graph outputs. */
public class OutputHandler<T extends SolutionResult> {
  private static final String TAG = "OutputHandler";

  /** Interface for converting output packet lists to solution result objects. */
  public interface OutputConverter<T extends SolutionResult> {
    public abstract T convert(List<Packet> packets);
  }
  // A solution-specific graph output converter that should be implemented by the solution.
  private OutputConverter<T> outputConverter;
  // The user-defined solution result listener.
  private ResultListener<T> customResultListener;
  // The user-defined error listener.
  private ErrorListener customErrorListener;

  /**
   * Sets a callback to be invoked to convert a packet list to a solution result object.
   *
   * @param converter the solution-defined {@link OutputConverter} callback.
   */
  public void setOutputConverter(OutputConverter<T> converter) {
    this.outputConverter = converter;
  }

  /**
   * Sets a callback to be invoked when solution result objects become available.
   *
   * @param listener the user-defined {@link ResultListener} callback.
   */
  public void setResultListener(ResultListener<T> listener) {
    this.customResultListener = listener;
  }

  /**
   * Sets a callback to be invoked when exceptions are thrown in the solution.
   *
   * @param listener the user-defined {@link ErrorListener} callback.
   */
  public void setErrorListener(ErrorListener listener) {
    this.customErrorListener = listener;
  }

  /** Handles a list of output packets. Invoked when packet lists become available. */
  public void run(List<Packet> packets) {
    T solutionResult = null;
    try {
      solutionResult = outputConverter.convert(packets);
      customResultListener.run(solutionResult);
    } catch (MediaPipeException e) {
      if (customErrorListener != null) {
        customErrorListener.onError("Error occurs when getting MediaPipe solution result. ", e);
      } else {
        Log.e(TAG, "Error occurs when getting MediaPipe solution result. " + e);
      }
    } finally {
      for (Packet packet : packets) {
        packet.release();
      }
      if (solutionResult instanceof ImageSolutionResult) {
        ImageSolutionResult imageSolutionResult = (ImageSolutionResult) solutionResult;
        imageSolutionResult.releaseImagePacket();
      }
    }
  }
}
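The convert-then-notify flow that `OutputHandler.run` implements can be sketched with plain JDK types. The class and stand-in types below are illustrative, not MediaPipe API; `String` stands in for `Packet` so the sketch stays dependency-free:

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Dependency-free sketch of the convert-then-notify flow in OutputHandler.run above. */
public class HandlerSketch<T> {
  private final Function<List<String>, T> converter; // stands in for OutputConverter
  private final Consumer<T> resultListener;          // stands in for ResultListener

  HandlerSketch(Function<List<String>, T> converter, Consumer<T> resultListener) {
    this.converter = converter;
    this.resultListener = resultListener;
  }

  void run(List<String> packets) {
    // Convert raw graph outputs into a typed result, then hand it to the listener.
    T result = converter.apply(packets);
    resultListener.accept(result);
  }

  public static void main(String[] args) {
    StringBuilder sink = new StringBuilder();
    HandlerSketch<Integer> handler =
        new HandlerSketch<>(List::size, n -> sink.append("got ").append(n));
    handler.run(List.of("a", "b", "c"));
    System.out.println(sink); // prints "got 3"
  }
}
```

The real handler adds what the sketch omits: error routing to an `ErrorListener` and unconditional packet release in a `finally` block, so packets are never leaked even when conversion throws.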
@@ -0,0 +1,20 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

/** Interface for the customizable MediaPipe solution result listener. */
public interface ResultListener<T> {
  void run(T result);
}
@@ -0,0 +1,150 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import static java.util.concurrent.TimeUnit.MICROSECONDS;
import static java.util.concurrent.TimeUnit.MILLISECONDS;

import android.content.Context;
import android.os.SystemClock;
import android.util.Log;
import com.google.common.collect.ImmutableList;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.AndroidPacketCreator;
import com.google.mediapipe.framework.Graph;
import com.google.mediapipe.framework.MediaPipeException;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.framework.PacketGetter;
import com.google.protobuf.Parser;
import java.io.File;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.annotation.Nullable;

/** The base class of the MediaPipe solutions. */
public class SolutionBase {
  private static final String TAG = "SolutionBase";
  protected Graph solutionGraph;
  protected AndroidPacketCreator packetCreator;
  protected ErrorListener errorListener;
  protected String imageInputStreamName;
  protected long lastTimestamp = Long.MIN_VALUE;
  protected final AtomicBoolean solutionGraphStarted = new AtomicBoolean(false);

  static {
    // Load all native libraries needed by the app.
    System.loadLibrary("mediapipe_jni");
    System.loadLibrary("opencv_java3");
  }

  /**
   * Initializes the solution base with an Android context, solution-specific settings, and a
   * solution result handler.
   *
   * @param context an Android {@link Context}.
   * @param solutionInfo a {@link SolutionInfo} that contains the binary graph file path and the
   *     graph input and output stream names.
   * @param outputHandler an {@link OutputHandler} that handles both the solution result object and
   *     runtime exceptions.
   */
  public synchronized void initialize(
      Context context,
      SolutionInfo solutionInfo,
      OutputHandler<? extends SolutionResult> outputHandler) {
    this.imageInputStreamName = solutionInfo.imageInputStreamName();
    try {
      AndroidAssetUtil.initializeNativeAssetManager(context);
      solutionGraph = new Graph();
      if (new File(solutionInfo.binaryGraphPath()).isAbsolute()) {
        solutionGraph.loadBinaryGraph(solutionInfo.binaryGraphPath());
      } else {
        solutionGraph.loadBinaryGraph(
            AndroidAssetUtil.getAssetBytes(context.getAssets(), solutionInfo.binaryGraphPath()));
      }
      solutionGraph.addMultiStreamCallback(
          solutionInfo.outputStreamNames(), outputHandler::run, /*observeTimestampBounds=*/ true);
      packetCreator = new AndroidPacketCreator(solutionGraph);
    } catch (MediaPipeException e) {
      throwException("Error occurs when creating the MediaPipe solution graph. ", e);
    }
  }

  /** Reports the error to the error listener if one is set; otherwise logs it. */
  protected void throwException(String message, MediaPipeException e) {
    if (errorListener != null) {
      errorListener.onError(message, e);
    } else {
      Log.e(TAG, message, e);
    }
  }

  /**
   * A convenience method to get a proto list from a packet. If the packet is empty, returns an
   * empty list.
   */
  protected <T> List<T> getProtoVector(Packet packet, Parser<T> messageParser) {
    return packet.isEmpty()
        ? ImmutableList.<T>of()
        : PacketGetter.getProtoVector(packet, messageParser);
  }

  /** Gets the current timestamp in microseconds. */
  protected long getCurrentTimestampUs() {
    return MICROSECONDS.convert(SystemClock.elapsedRealtime(), MILLISECONDS);
  }

  /** Starts the solution graph, taking an optional input side packet map. */
  public synchronized void start(@Nullable Map<String, Packet> inputSidePackets) {
    try {
      if (inputSidePackets != null) {
        solutionGraph.setInputSidePackets(inputSidePackets);
      }
      if (!solutionGraphStarted.getAndSet(true)) {
        solutionGraph.startRunningGraph();
      }
    } catch (MediaPipeException e) {
      throwException("Error occurs when starting the MediaPipe solution graph. ", e);
    }
  }

  /** A blocking call that waits until the solution finishes processing all pending tasks. */
  public void waitUntilIdle() {
    try {
      solutionGraph.waitUntilGraphIdle();
    } catch (MediaPipeException e) {
      throwException("Error occurs when waiting until the MediaPipe graph becomes idle. ", e);
    }
  }

  /** Closes and cleans up the solution graph. */
  public void close() {
    if (solutionGraphStarted.get()) {
      try {
        solutionGraph.closeAllPacketSources();
        solutionGraph.waitUntilGraphDone();
      } catch (MediaPipeException e) {
        // Note: errors during Process are reported at the earliest opportunity,
        // which may be addPacket or waitUntilDone, depending on timing. For consistency,
        // we want to always report them using the same async handler if installed.
        throwException("Error occurs when closing the Mediapipe solution graph. ", e);
      }
      try {
        solutionGraph.tearDown();
      } catch (MediaPipeException e) {
        throwException("Error occurs when closing the Mediapipe solution graph. ", e);
      }
    }
  }
}
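The unit conversion in `getCurrentTimestampUs` above is plain `TimeUnit` arithmetic and can be checked in isolation; the literal elapsed value below is illustrative, standing in for `SystemClock.elapsedRealtime()`:

```java
import static java.util.concurrent.TimeUnit.MICROSECONDS;
import static java.util.concurrent.TimeUnit.MILLISECONDS;

/** Sketch of the milliseconds-to-microseconds conversion in getCurrentTimestampUs above. */
public class TimestampDemo {
  public static void main(String[] args) {
    long elapsedMs = 1234L; // stand-in for SystemClock.elapsedRealtime()
    // MICROSECONDS.convert(d, MILLISECONDS) multiplies by 1000 without overflow surprises.
    long us = MICROSECONDS.convert(elapsedMs, MILLISECONDS);
    System.out.println(us); // prints 1234000
  }
}
```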
@@ -0,0 +1,48 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

import com.google.auto.value.AutoValue;
import com.google.common.collect.ImmutableList;

/** SolutionInfo contains all needed information to initialize a MediaPipe solution graph. */
@AutoValue
public abstract class SolutionInfo {
  public abstract String binaryGraphPath();

  public abstract String imageInputStreamName();

  public abstract ImmutableList<String> outputStreamNames();

  public abstract boolean staticImageMode();

  public static Builder builder() {
    return new AutoValue_SolutionInfo.Builder();
  }

  /** Builder for {@link SolutionInfo}. */
  @AutoValue.Builder
  public abstract static class Builder {
    public abstract Builder setBinaryGraphPath(String value);

    public abstract Builder setImageInputStreamName(String value);

    public abstract Builder setOutputStreamNames(ImmutableList<String> value);

    public abstract Builder setStaticImageMode(boolean value);

    public abstract SolutionInfo build();
  }
}
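AutoValue generates the immutable value class and builder for the abstract declarations above. As an illustrative analogue only (the Java API is the real interface), a frozen Python dataclass carrying the same four fields; the graph path used below is a hypothetical example, not a path shipped by this commit:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class SolutionInfo:
    """Plain-Python analogue of the AutoValue SolutionInfo above.

    Illustrative sketch only: field names mirror the Java accessors, and
    frozen=True gives the same immutability the AutoValue class provides."""
    binary_graph_path: str
    image_input_stream_name: str
    output_stream_names: Tuple[str, ...]
    static_image_mode: bool

# Hypothetical configuration for a segmentation-style solution graph.
info = SolutionInfo(
    binary_graph_path="example/selfie_segmentation_cpu.binarypb",  # hypothetical path
    image_input_stream_name="image",
    output_stream_names=("segmentation_mask",),
    static_image_mode=False,
)
print(info.static_image_mode)  # False
```

As with the AutoValue class, any attempt to mutate a field after construction raises an error, so a `SolutionInfo` can be shared safely across the solution's threads.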
@@ -0,0 +1,23 @@
// Copyright 2021 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package com.google.mediapipe.solutionbase;

/**
 * Interface of the MediaPipe solution result. Any MediaPipe solution-specific result class should
 * implement SolutionResult.
 */
public interface SolutionResult {
  long timestamp();
}
@@ -83,6 +83,8 @@ mediapipe_simple_subgraph(

exports_files(
    srcs = [
        "face_detection_back.tflite",
        "face_detection_back_sparse.tflite",
        "face_detection_front.tflite",
    ],
)
Binary file not shown.
BIN mediapipe/modules/face_detection/face_detection_back_sparse.tflite (Executable file)
Binary file not shown.
@@ -109,7 +109,7 @@ node {
  output_stream: "ensured_landmark_tensors"
}

-# Decodes the landmark tensors into a vector of lanmarks, where the landmark
+# Decodes the landmark tensors into a vector of landmarks, where the landmark
# coordinates are normalized by the size of the input image to the model.
node {
  calculator: "TensorsToLandmarksCalculator"

@@ -109,7 +109,7 @@ node {
  output_stream: "ensured_landmark_tensors"
}

-# Decodes the landmark tensors into a vector of lanmarks, where the landmark
+# Decodes the landmark tensors into a vector of landmarks, where the landmark
# coordinates are normalized by the size of the input image to the model.
node {
  calculator: "TensorsToLandmarksCalculator"
@@ -14,7 +14,7 @@

#include "mediapipe/modules/objectron/calculators/box.h"

-#include "Eigen/src/Core/util/Constants.h"
+#include "Eigen/Core"
#include "mediapipe/framework/port/logging.h"

namespace mediapipe {
@@ -78,7 +78,9 @@ mediapipe_simple_subgraph(
    graph = "pose_landmark_filtering.pbtxt",
    register_as = "PoseLandmarkFiltering",
    deps = [
+        "//mediapipe/calculators/util:alignment_points_to_rects_calculator",
        "//mediapipe/calculators/util:landmarks_smoothing_calculator",
+        "//mediapipe/calculators/util:landmarks_to_detection_calculator",
        "//mediapipe/calculators/util:visibility_smoothing_calculator",
        "//mediapipe/framework/tool:switch_container",
    ],
@@ -29,6 +29,29 @@ output_stream: "FILTERED_NORM_LANDMARKS:filtered_landmarks"
# Filtered auxiliary set of normalized landmarks. (NormalizedLandmarkList)
output_stream: "FILTERED_AUX_NORM_LANDMARKS:filtered_aux_landmarks"

+# Converts landmarks to a detection that tightly encloses all landmarks.
+node {
+  calculator: "LandmarksToDetectionCalculator"
+  input_stream: "NORM_LANDMARKS:aux_landmarks"
+  output_stream: "DETECTION:aux_detection"
+}
+
+# Converts the detection into a rectangle based on center and scale alignment
+# points.
+node {
+  calculator: "AlignmentPointsRectsCalculator"
+  input_stream: "DETECTION:aux_detection"
+  input_stream: "IMAGE_SIZE:image_size"
+  output_stream: "NORM_RECT:roi"
+  options: {
+    [mediapipe.DetectionsToRectsCalculatorOptions.ext] {
+      rotation_vector_start_keypoint_index: 0
+      rotation_vector_end_keypoint_index: 1
+      rotation_vector_target_angle_degrees: 90
+    }
+  }
+}
+
# Smoothes pose landmark visibilities to reduce jitter.
node {
  calculator: "SwitchContainer"
@@ -66,6 +89,7 @@ node {
  input_side_packet: "ENABLE:enable"
  input_stream: "NORM_LANDMARKS:filtered_visibility"
  input_stream: "IMAGE_SIZE:image_size"
+  input_stream: "OBJECT_SCALE_ROI:roi"
  output_stream: "NORM_FILTERED_LANDMARKS:filtered_landmarks"
  options: {
    [mediapipe.SwitchContainerOptions.ext] {
@@ -83,12 +107,12 @@ node {
  options: {
    [mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
      one_euro_filter {
-        # Min cutoff 0.1 results into ~0.02 alpha in landmark EMA filter
-        # when landmark is static.
-        min_cutoff: 0.1
-        # Beta 40.0 in combination with min_cutoff 0.1 results into ~0.8
-        # alpha in landmark EMA filter when landmark is moving fast.
-        beta: 40.0
+        # Min cutoff 0.05 results into ~0.01 alpha in landmark EMA filter
+        # when landmark is static.
+        min_cutoff: 0.05
+        # Beta 80.0 in combination with min_cutoff 0.05 results into
+        # ~0.94 alpha in landmark EMA filter when landmark is moving fast.
+        beta: 80.0
        # Derivative cutoff 1.0 results into ~0.17 alpha in landmark
        # velocity EMA filter.
        derivate_cutoff: 1.0
@@ -119,6 +143,7 @@ node {
  calculator: "LandmarksSmoothingCalculator"
  input_stream: "NORM_LANDMARKS:filtered_aux_visibility"
  input_stream: "IMAGE_SIZE:image_size"
+  input_stream: "OBJECT_SCALE_ROI:roi"
  output_stream: "NORM_FILTERED_LANDMARKS:filtered_aux_landmarks"
  options: {
    [mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
@@ -127,12 +152,12 @@ node {
        # object is not moving but responsive enough in case of sudden
        # movements.
        one_euro_filter {
-          # Min cutoff 0.01 results into ~ 0.002 alpha in landmark EMA
+          # Min cutoff 0.01 results into ~0.002 alpha in landmark EMA
          # filter when landmark is static.
          min_cutoff: 0.01
-          # Beta 1.0 in combination with min_cutoff 0.01 results into ~0.2
+          # Beta 10.0 in combination with min_cutoff 0.01 results into ~0.68
          # alpha in landmark EMA filter when landmark is moving fast.
-          beta: 1.0
+          beta: 10.0
          # Derivative cutoff 1.0 results into ~0.17 alpha in landmark
          # velocity EMA filter.
          derivate_cutoff: 1.0
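The alpha figures quoted in these comments follow from the one-euro filter's EMA smoothing factor, alpha = 1 / (1 + tau * rate) with tau = 1 / (2 * pi * cutoff). The small check below assumes a 30 Hz landmark stream (a frame rate this file does not state), which reproduces the static-landmark figures:

```python
import math

def ema_alpha(cutoff_hz: float, rate_hz: float) -> float:
    """Smoothing factor of the one-euro filter's exponential moving average
    for a given cutoff frequency and sampling rate."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
    return 1.0 / (1.0 + tau * rate_hz)

# Static-landmark alphas quoted in the comments, assuming a 30 Hz stream:
print(round(ema_alpha(0.1, 30.0), 3))   # ~0.02  (old min_cutoff)
print(round(ema_alpha(0.05, 30.0), 3))  # ~0.01  (new min_cutoff)
print(round(ema_alpha(0.01, 30.0), 3))  # ~0.002 (auxiliary landmarks)
```

The fast-motion alphas (~0.94, ~0.68) additionally depend on the landmark speed through beta, so they cannot be checked from the cutoff alone.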
73 mediapipe/modules/selfie_segmentation/BUILD (Normal file)
@@ -0,0 +1,73 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

load(
    "//mediapipe/framework/tool:mediapipe_graph.bzl",
    "mediapipe_simple_subgraph",
)

licenses(["notice"])

package(default_visibility = ["//visibility:public"])

mediapipe_simple_subgraph(
    name = "selfie_segmentation_model_loader",
    graph = "selfie_segmentation_model_loader.pbtxt",
    register_as = "SelfieSegmentationModelLoader",
    deps = [
        "//mediapipe/calculators/core:constant_side_packet_calculator",
        "//mediapipe/calculators/tflite:tflite_model_calculator",
        "//mediapipe/calculators/util:local_file_contents_calculator",
        "//mediapipe/framework/tool:switch_container",
    ],
)

mediapipe_simple_subgraph(
    name = "selfie_segmentation_cpu",
    graph = "selfie_segmentation_cpu.pbtxt",
    register_as = "SelfieSegmentationCpu",
    deps = [
        ":selfie_segmentation_model_loader",
        "//mediapipe/calculators/image:image_properties_calculator",
        "//mediapipe/calculators/tensor:image_to_tensor_calculator",
        "//mediapipe/calculators/tensor:inference_calculator",
        "//mediapipe/calculators/tensor:tensors_to_segmentation_calculator",
        "//mediapipe/calculators/tflite:tflite_custom_op_resolver_calculator",
        "//mediapipe/calculators/util:from_image_calculator",
        "//mediapipe/framework/tool:switch_container",
    ],
)

mediapipe_simple_subgraph(
    name = "selfie_segmentation_gpu",
    graph = "selfie_segmentation_gpu.pbtxt",
    register_as = "SelfieSegmentationGpu",
    deps = [
        ":selfie_segmentation_model_loader",
        "//mediapipe/calculators/image:image_properties_calculator",
        "//mediapipe/calculators/tensor:image_to_tensor_calculator",
        "//mediapipe/calculators/tensor:inference_calculator",
        "//mediapipe/calculators/tensor:tensors_to_segmentation_calculator",
        "//mediapipe/calculators/tflite:tflite_custom_op_resolver_calculator",
        "//mediapipe/calculators/util:from_image_calculator",
        "//mediapipe/framework/tool:switch_container",
    ],
)

exports_files(
    srcs = [
        "selfie_segmentation.tflite",
        "selfie_segmentation_landscape.tflite",
    ],
)
6 mediapipe/modules/selfie_segmentation/README.md (Normal file)
@@ -0,0 +1,6 @@
# selfie_segmentation

Subgraphs|Details
:--- | :---
[`SelfieSegmentationCpu`](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_cpu.pbtxt)| Segments the person from background in a selfie image. (CPU input, and inference is executed on CPU.)
[`SelfieSegmentationGpu`](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation/selfie_segmentation_gpu.pbtxt)| Segments the person from background in a selfie image. (GPU input, and inference is executed on GPU.)

BIN mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite (Normal file)
Binary file not shown.
@@ -0,0 +1,131 @@
# MediaPipe graph to perform selfie segmentation. (CPU input, and all processing
# and inference are also performed on CPU)
#
# It is required that "selfie_segmentation.tflite" or
# "selfie_segmentation_landscape.tflite" is available at
# "mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite"
# or
# "mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite"
# path respectively during execution, depending on the specification in the
# MODEL_SELECTION input side packet.
#
# EXAMPLE:
#   node {
#     calculator: "SelfieSegmentationCpu"
#     input_side_packet: "MODEL_SELECTION:model_selection"
#     input_stream: "IMAGE:image"
#     output_stream: "SEGMENTATION_MASK:segmentation_mask"
#   }

type: "SelfieSegmentationCpu"

# CPU image. (ImageFrame)
input_stream: "IMAGE:image"

# An integer 0 or 1. Use 0 to select a general-purpose model (operating on a
# 256x256 tensor), and 1 to select a model (operating on a 256x144 tensor) more
# optimized for landscape images. If unspecified, functions as set to 0. (int)
input_side_packet: "MODEL_SELECTION:model_selection"

# Segmentation mask. (ImageFrame in ImageFormat::VEC32F1)
output_stream: "SEGMENTATION_MASK:segmentation_mask"

# Resizes the input image into a tensor with a dimension desired by the model.
node {
  calculator: "SwitchContainer"
  input_side_packet: "SELECT:model_selection"
  input_stream: "IMAGE:image"
  output_stream: "TENSORS:input_tensors"
  options: {
    [mediapipe.SwitchContainerOptions.ext] {
      select: 0
      contained_node: {
        calculator: "ImageToTensorCalculator"
        options: {
          [mediapipe.ImageToTensorCalculatorOptions.ext] {
            output_tensor_width: 256
            output_tensor_height: 256
            keep_aspect_ratio: false
            output_tensor_float_range {
              min: 0.0
              max: 1.0
            }
            border_mode: BORDER_ZERO
          }
        }
      }
      contained_node: {
        calculator: "ImageToTensorCalculator"
        options: {
          [mediapipe.ImageToTensorCalculatorOptions.ext] {
            output_tensor_width: 256
            output_tensor_height: 144
            keep_aspect_ratio: false
            output_tensor_float_range {
              min: 0.0
              max: 1.0
            }
            border_mode: BORDER_ZERO
          }
        }
      }
    }
  }
}

# Generates a single side packet containing a TensorFlow Lite op resolver that
# supports custom ops needed by the model used in this graph.
node {
  calculator: "TfLiteCustomOpResolverCalculator"
  output_side_packet: "op_resolver"
}

# Loads the selfie segmentation TF Lite model.
node {
  calculator: "SelfieSegmentationModelLoader"
  input_side_packet: "MODEL_SELECTION:model_selection"
  output_side_packet: "MODEL:model"
}

# Runs model inference on CPU.
node {
  calculator: "InferenceCalculator"
  input_stream: "TENSORS:input_tensors"
  output_stream: "TENSORS:output_tensors"
  input_side_packet: "MODEL:model"
  input_side_packet: "CUSTOM_OP_RESOLVER:op_resolver"
  options: {
    [mediapipe.InferenceCalculatorOptions.ext] {
      delegate { xnnpack {} }
    }
  }
}

# Retrieves the size of the input image.
node {
  calculator: "ImagePropertiesCalculator"
  input_stream: "IMAGE_CPU:image"
  output_stream: "SIZE:input_size"
}

# Processes the output tensors into a segmentation mask that has the same size
# as the input image into the graph.
node {
  calculator: "TensorsToSegmentationCalculator"
  input_stream: "TENSORS:output_tensors"
  input_stream: "OUTPUT_SIZE:input_size"
  output_stream: "MASK:mask_image"
  options: {
    [mediapipe.TensorsToSegmentationCalculatorOptions.ext] {
      activation: NONE
    }
  }
}

# Converts the incoming Image into the corresponding ImageFrame type.
node: {
  calculator: "FromImageCalculator"
  input_stream: "IMAGE:mask_image"
  output_stream: "IMAGE_CPU:segmentation_mask"
}
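The `output_tensor_float_range { min: 0.0 max: 1.0 }` option above asks the calculator to map 8-bit pixel values linearly into the model's expected float range. A plain-Python sketch of that per-pixel mapping (an illustration of the option's contract, not MediaPipe code):

```python
def to_float_range(px: int, out_min: float = 0.0, out_max: float = 1.0) -> float:
    """Linearly maps an 8-bit pixel value [0, 255] into [out_min, out_max],
    mirroring the output_tensor_float_range option in the graph above."""
    return out_min + (out_max - out_min) * (px / 255.0)

print(to_float_range(0))    # 0.0
print(to_float_range(255))  # 1.0
```

With `min: 0.0 max: 1.0` this is simply px / 255, but the same formula covers models that expect, say, [-1, 1] inputs.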
@@ -0,0 +1,133 @@
# MediaPipe graph to perform selfie segmentation. (GPU input, and all processing
# and inference are also performed on GPU)
#
# It is required that "selfie_segmentation.tflite" or
# "selfie_segmentation_landscape.tflite" is available at
# "mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite"
# or
# "mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite"
# path respectively during execution, depending on the specification in the
# MODEL_SELECTION input side packet.
#
# EXAMPLE:
#   node {
#     calculator: "SelfieSegmentationGpu"
#     input_side_packet: "MODEL_SELECTION:model_selection"
#     input_stream: "IMAGE:image"
#     output_stream: "SEGMENTATION_MASK:segmentation_mask"
#   }

type: "SelfieSegmentationGpu"

# GPU image. (GpuBuffer)
input_stream: "IMAGE:image"

# An integer 0 or 1. Use 0 to select a general-purpose model (operating on a
# 256x256 tensor), and 1 to select a model (operating on a 256x144 tensor) more
# optimized for landscape images. If unspecified, functions as set to 0. (int)
input_side_packet: "MODEL_SELECTION:model_selection"

# Segmentation mask. (GpuBuffer in RGBA, with the same mask values in R and A)
output_stream: "SEGMENTATION_MASK:segmentation_mask"

# Resizes the input image into a tensor with a dimension desired by the model.
node {
  calculator: "SwitchContainer"
  input_side_packet: "SELECT:model_selection"
  input_stream: "IMAGE_GPU:image"
  output_stream: "TENSORS:input_tensors"
  options: {
    [mediapipe.SwitchContainerOptions.ext] {
      select: 0
      contained_node: {
        calculator: "ImageToTensorCalculator"
        options: {
          [mediapipe.ImageToTensorCalculatorOptions.ext] {
            output_tensor_width: 256
            output_tensor_height: 256
            keep_aspect_ratio: false
            output_tensor_float_range {
              min: 0.0
              max: 1.0
            }
            border_mode: BORDER_ZERO
            gpu_origin: TOP_LEFT
          }
        }
      }
      contained_node: {
        calculator: "ImageToTensorCalculator"
        options: {
          [mediapipe.ImageToTensorCalculatorOptions.ext] {
            output_tensor_width: 256
            output_tensor_height: 144
            keep_aspect_ratio: false
            output_tensor_float_range {
              min: 0.0
              max: 1.0
            }
            border_mode: BORDER_ZERO
            gpu_origin: TOP_LEFT
          }
        }
      }
    }
  }
}

# Generates a single side packet containing a TensorFlow Lite op resolver that
# supports custom ops needed by the model used in this graph.
node {
  calculator: "TfLiteCustomOpResolverCalculator"
  output_side_packet: "op_resolver"
  options: {
    [mediapipe.TfLiteCustomOpResolverCalculatorOptions.ext] {
      use_gpu: true
    }
  }
}

# Loads the selfie segmentation TF Lite model.
node {
  calculator: "SelfieSegmentationModelLoader"
  input_side_packet: "MODEL_SELECTION:model_selection"
  output_side_packet: "MODEL:model"
}

# Runs model inference on GPU.
node {
  calculator: "InferenceCalculator"
  input_stream: "TENSORS:input_tensors"
  output_stream: "TENSORS:output_tensors"
  input_side_packet: "MODEL:model"
  input_side_packet: "CUSTOM_OP_RESOLVER:op_resolver"
}

# Retrieves the size of the input image.
node {
  calculator: "ImagePropertiesCalculator"
  input_stream: "IMAGE_GPU:image"
  output_stream: "SIZE:input_size"
}

# Processes the output tensors into a segmentation mask that has the same size
# as the input image into the graph.
node {
  calculator: "TensorsToSegmentationCalculator"
  input_stream: "TENSORS:output_tensors"
  input_stream: "OUTPUT_SIZE:input_size"
  output_stream: "MASK:mask_image"
  options: {
    [mediapipe.TensorsToSegmentationCalculatorOptions.ext] {
      activation: NONE
      gpu_origin: TOP_LEFT
    }
  }
}

# Converts the incoming Image into the corresponding GpuBuffer type.
node: {
  calculator: "FromImageCalculator"
  input_stream: "IMAGE:mask_image"
  output_stream: "IMAGE_GPU:segmentation_mask"
}
BIN mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite (Executable file)
Binary file not shown.
@@ -0,0 +1,63 @@
# MediaPipe graph to load a selected selfie segmentation TF Lite model.

type: "SelfieSegmentationModelLoader"

# An integer 0 or 1. Use 0 to select a general-purpose model (operating on a
# 256x256 tensor), and 1 to select a model (operating on a 256x144 tensor) more
# optimized for landscape images. If unspecified, functions as set to 0. (int)
input_side_packet: "MODEL_SELECTION:model_selection"

# TF Lite model represented as a FlatBuffer.
# (std::unique_ptr<tflite::FlatBufferModel, std::function<void(tflite::FlatBufferModel*)>>)
output_side_packet: "MODEL:model"

# Determines the path to the desired selfie segmentation model file.
node {
  calculator: "SwitchContainer"
  input_side_packet: "SELECT:model_selection"
  output_side_packet: "PACKET:model_path"
  options: {
    [mediapipe.SwitchContainerOptions.ext] {
      select: 0
      contained_node: {
        calculator: "ConstantSidePacketCalculator"
        options: {
          [mediapipe.ConstantSidePacketCalculatorOptions.ext]: {
            packet {
              string_value: "mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite"
            }
          }
        }
      }
      contained_node: {
        calculator: "ConstantSidePacketCalculator"
        options: {
          [mediapipe.ConstantSidePacketCalculatorOptions.ext]: {
            packet {
              string_value: "mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite"
            }
          }
        }
      }
    }
  }
}

# Loads the file in the specified path into a blob.
node {
  calculator: "LocalFileContentsCalculator"
  input_side_packet: "FILE_PATH:model_path"
  output_side_packet: "CONTENTS:model_blob"
  options: {
    [mediapipe.LocalFileContentsCalculatorOptions.ext]: {
      text_mode: false
    }
  }
}

# Converts the input blob into a TF Lite model.
node {
  calculator: "TfLiteModelCalculator"
  input_side_packet: "MODEL_BLOB:model_blob"
  output_side_packet: "MODEL:model"
}
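Functionally, the SwitchContainer in this loader graph is just an index into two constant model paths keyed by MODEL_SELECTION, defaulting to 0 when unspecified. A Python sketch of the equivalent selection logic (illustrative only, not MediaPipe code):

```python
# The two paths mirror the ConstantSidePacketCalculator branches above.
_MODEL_PATHS = (
    "mediapipe/modules/selfie_segmentation/selfie_segmentation.tflite",
    "mediapipe/modules/selfie_segmentation/selfie_segmentation_landscape.tflite",
)

def select_model_path(model_selection: int = 0) -> str:
    """Returns the model path for MODEL_SELECTION; like the graph's
    `select: 0` default, an unspecified selection picks the first model."""
    return _MODEL_PATHS[model_selection]

print(select_model_path(1))
```

Selecting index 1 yields the landscape-optimized model path, matching the second contained node of the SwitchContainer.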
26 mediapipe/opensource_only/ISSUE_TEMPLATE/30-bug-issue.md (Normal file)
@@ -0,0 +1,26 @@
<em>Please make sure that this is a bug and also refer to the [troubleshooting](https://google.github.io/mediapipe/getting_started/troubleshooting.html) and FAQ documentation before raising any issues.</em>

**System information** (Please provide as much relevant information as possible)

- Have I written custom code (as opposed to using a stock example script provided in MediaPipe):
- OS Platform and Distribution (e.g. Linux Ubuntu 16.04, Android 11, iOS 14.4):
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on a mobile device:
- Browser and version (e.g. Google Chrome, Safari) if the issue happens in a browser:
- Programming Language and version (e.g. C++, Python, Java):
- [MediaPipe version](https://github.com/google/mediapipe/releases):
- Bazel version (if compiling from source):
- Solution (e.g. FaceMesh, Pose, Holistic):
- Android Studio, NDK, SDK versions (if issue is related to building in the Android environment):
- Xcode & Tulsi version (if issue is related to building for iOS):

**Describe the current behavior:**

**Describe the expected behavior:**

**Standalone code to reproduce the issue:**
Provide a reproducible test case that is the bare minimum necessary to replicate the problem. If possible, please share a link to a Colab notebook, a repo, or any other notebook:

**Other info / Complete Logs:**
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
@@ -0,0 +1,18 @@
<em>Please make sure that this is a feature request.</em>

**System information** (Please provide as much relevant information as possible)

- MediaPipe Solution (you are using):
- Programming language: C++ / TypeScript / Python / Objective-C / Android Java
- Are you willing to contribute it (Yes/No):

**Describe the feature and the current behavior/state:**

**Will this change the current API? How?**

**Who will benefit from this feature?**

**Please specify the use cases for this feature:**

**Any other info:**
Some files were not shown because too many files have changed in this diff.