Project import generated by Copybara.

GitOrigin-RevId: 2146b10f0a498f665f246e16033b686c7947b92d
This commit is contained in:
MediaPipe Team 2021-05-10 12:19:00 -07:00 committed by chuoling
parent a9b643e0f5
commit 017c1dc7ea
52 changed files with 708 additions and 298 deletions

View File

@ -14,3 +14,6 @@ exclude mediapipe/modules/objectron/object_detection_3d_sneakers.tflite
exclude mediapipe/modules/objectron/object_detection_3d_chair.tflite
exclude mediapipe/modules/objectron/object_detection_3d_camera.tflite
exclude mediapipe/modules/objectron/object_detection_3d_cup.tflite
exclude mediapipe/modules/objectron/object_detection_ssd_mobilenetv2_oidv4_fp16.tflite
exclude mediapipe/modules/pose_landmark/pose_landmark_lite.tflite
exclude mediapipe/modules/pose_landmark/pose_landmark_heavy.tflite

View File

@ -110,3 +110,12 @@ Other policies are also available, implemented using a separate kind of
component known as an InputStreamHandler.
See [Synchronization](synchronization.md) for more details.
### Realtime data streams
MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. Normally, each Calculator runs as soon as
all of its input packets for a given timestamp become available. Calculators
used in realtime graphs need to define output timestamp bounds based on input
timestamp bounds in order to allow downstream calculators to be scheduled
promptly. See [Realtime data streams](realtime.md) for details.

View File

@ -0,0 +1,187 @@
---
layout: default
title: Processing real-time data streams
nav_order: 6
has_children: true
has_toc: false
---
# Processing real-time data streams
{: .no_toc }
1. TOC
{:toc}
---
## Realtime timestamps
MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. The MediaPipe framework requires only that
successive packets be assigned monotonically increasing timestamps. By
convention, realtime calculators and graphs use the recording time or the
presentation time of each frame as its timestamp, with each timestamp indicating
the microseconds since `Jan/1/1970:00:00:00`. This allows packets from various
sources to be processed in a globally consistent sequence.
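For illustration, a minimal sketch of feeding a frame into a running graph with an epoch-based timestamp might look as follows (the graph object, the `input_video` stream name, and the capture-time value are assumptions for illustration, not taken from a specific MediaPipe example):
```
// Hedged sketch: assign a recording-time timestamp, in microseconds since the
// Unix epoch, to an input packet. Assumes "graph" is a CalculatorGraph that
// has been initialized and started with an input stream named "input_video".
int64_t capture_time_us = 1620677940000000;  // e.g. reported by the camera driver
mediapipe::Timestamp frame_timestamp(capture_time_us);
MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
    "input_video",
    mediapipe::MakePacket<mediapipe::ImageFrame>().At(frame_timestamp)));
```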
## Realtime scheduling
Normally, each Calculator runs as soon as all of its input packets for a given
timestamp become available. Typically, this happens when the calculator has
finished processing the previous frame and each of the calculators producing
its inputs has finished processing the current frame. The MediaPipe scheduler
invokes each calculator as soon as these conditions are met. See
[Synchronization](synchronization.md) for more details.
## Timestamp bounds
When a calculator does not produce any output packets for a given timestamp, it
can instead output a "timestamp bound" indicating that no packet will be
produced for that timestamp. This indication is necessary to allow downstream
calculators to run at that timestamp, even though no packet has arrived for
certain streams for that timestamp. This is especially important for realtime
graphs in interactive applications, where it is crucial that each calculator
begin processing as soon as possible.
Consider a graph like the following:
```
node {
  calculator: "A"
  input_stream: "alpha_in"
  output_stream: "alpha"
}
node {
  calculator: "B"
  input_stream: "alpha"
  input_stream: "foo"
  output_stream: "beta"
}
```
Suppose that, at timestamp `T`, node `A` doesn't send a packet in its output stream
`alpha`. Node `B` gets a packet in `foo` at timestamp `T` and is waiting for a
packet in `alpha` at timestamp `T`. If `A` doesn't send `B` a timestamp bound
update for `alpha`, `B` will keep waiting for a packet to arrive in `alpha`.
Meanwhile, the packet queue of `foo` will accumulate packets at `T`, `T+1` and
so on.
To output a packet on a stream, a calculator uses the API functions
`CalculatorContext::Outputs` and `OutputStream::Add`. To instead output a
timestamp bound on a stream, a calculator can use the API functions
`CalculatorContext::Outputs` and `OutputStream::SetNextTimestampBound`. The
specified bound is the lowest allowable timestamp for the next packet on the
specified output stream. When no packet is output, a calculator will typically
do something like:
```
cc->Outputs().Tag("output_frame").SetNextTimestampBound(
    cc->InputTimestamp().NextAllowedInStream());
```
The function `Timestamp::NextAllowedInStream` returns the successive timestamp.
For example, `Timestamp(1).NextAllowedInStream() == Timestamp(2)`.
## Propagating timestamp bounds
Calculators that will be used in realtime graphs need to define output timestamp
bounds based on input timestamp bounds in order to allow downstream calculators
to be scheduled promptly. A common pattern is for calculators to output packets
with the same timestamps as their input packets. In this case, simply outputting
a packet on every call to `Calculator::Process` is sufficient to define output
timestamp bounds.
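As a minimal sketch of this common pattern (the stream indices and payload type are only illustrative), a pass-through style `Process` method might look like:
```
absl::Status Process(CalculatorContext* cc) override {
  // Re-emitting the input packet at the input timestamp implicitly advances
  // the output timestamp bound for downstream calculators.
  cc->Outputs().Index(0).AddPacket(
      cc->Inputs().Index(0).Value().At(cc->InputTimestamp()));
  return absl::OkStatus();
}
```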
However, calculators are not required to follow this common pattern for output
timestamps; they are only required to choose monotonically increasing output
timestamps. As a result, certain calculators must calculate timestamp bounds
explicitly. MediaPipe provides several tools for computing appropriate timestamp
bounds for each calculator.
1\. **SetNextTimestampBound()** can be used to specify the timestamp bound, `t +
1`, for an output stream.
```
cc->Outputs().Tag("OUT").SetNextTimestampBound(t.NextAllowedInStream());
```
Alternatively, an empty packet with timestamp `t` can be produced to specify the
timestamp bound `t + 1`.
```
cc->Outputs().Tag("OUT").Add(Packet(), t);
```
The timestamp bound of an input stream is indicated by the packet or the empty
packet on the input stream.
```
Timestamp bound = cc->Inputs().Tag("IN").Value().Timestamp();
```
2\. **TimestampOffset()** can be specified in order to automatically copy the
timestamp bound from input streams to output streams.
```
cc->SetTimestampOffset(0);
```
This setting has the advantage of propagating timestamp bounds automatically,
even when only timestamp bounds arrive and `Calculator::Process` is not invoked.
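For context, a hedged sketch of where this call typically lives is shown below; the `IN`/`OUT` tags and packet type are illustrative assumptions, and the offset is declared on the contract in `GetContract`:
```
static absl::Status GetContract(CalculatorContract* cc) {
  cc->Inputs().Tag("IN").Set<int>();
  cc->Outputs().Tag("OUT").Set<int>();
  // Declare that output timestamp bounds track input timestamp bounds.
  cc->SetTimestampOffset(TimestampDiff(0));
  return absl::OkStatus();
}
```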
3\. **ProcessTimestampBounds()** can be specified in order to invoke
`Calculator::Process` for each new "settled timestamp", where the "settled
timestamp" is the new highest timestamp below the current timestamp bounds.
Without `ProcessTimestampBounds()`, `Calculator::Process` is invoked only when
one or more input packets arrive.
```
cc->SetProcessTimestampBounds(true);
```
This setting allows a calculator to perform its own timestamp bounds calculation
and propagation, even when only input timestamps are updated. It can be used to
replicate the effect of `TimestampOffset()`, but it can also be used to
calculate a timestamp bound that takes into account additional factors.
For example, in order to replicate `SetTimestampOffset(0)`, a calculator could
do the following:
```
absl::Status Open(CalculatorContext* cc) {
  cc->SetProcessTimestampBounds(true);
  return absl::OkStatus();
}

absl::Status Process(CalculatorContext* cc) {
  cc->Outputs().Tag("OUT").SetNextTimestampBound(
      cc->InputTimestamp().NextAllowedInStream());
  return absl::OkStatus();
}
```
## Scheduling of Calculator::Open and Calculator::Close
`Calculator::Open` is invoked when all required input side-packets have been
produced. Input side-packets can be provided by the enclosing application or by
"side-packet calculators" inside the graph. Side-packets can be specified from
outside the graph using the API's `CalculatorGraph::Initialize` and
`CalculatorGraph::StartRun`. Side packets can be specified by calculators within
the graph using `CalculatorGraphConfig::OutputSidePackets` and
`OutputSidePacket::Set`.
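For instance, a minimal sketch of providing input side-packets from the enclosing application (the side-packet name `model_complexity` and its value are illustrative assumptions) might look like:
```
// Assumes "config" holds a valid CalculatorGraphConfig.
CalculatorGraph graph;
MP_RETURN_IF_ERROR(graph.Initialize(config));
std::map<std::string, Packet> input_side_packets;
input_side_packets["model_complexity"] = MakePacket<int>(1);
MP_RETURN_IF_ERROR(graph.StartRun(input_side_packets));
```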
`Calculator::Close` is invoked when all of the input streams have become `Done` by
being closed or reaching timestamp bound `Timestamp::Done`.
**Note:** If the graph finishes all pending calculator execution and becomes
`Done` before some streams become `Done`, then MediaPipe will invoke the
remaining calls to `Calculator::Close`, so that every calculator can produce its
final outputs.
The use of `TimestampOffset` has some implications for `Calculator::Close`. A
calculator specifying `SetTimestampOffset(0)` will by design signal that all of
its output streams have reached `Timestamp::Done` when all of its input streams
have reached `Timestamp::Done`, and therefore no further outputs are possible.
This prevents such a calculator from emitting any packets during
`Calculator::Close`. If a calculator needs to produce a summary packet during
`Calculator::Close`, `Calculator::Process` must specify timestamp bounds such
that at least one timestamp (such as `Timestamp::Max`) remains available during
`Calculator::Close`. This means that such a calculator normally cannot rely upon
`SetTimestampOffset(0)` and must instead specify timestamp bounds explicitly
using `SetNextTimestampBound()`.
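A minimal sketch of such a summarizing calculator (hypothetical, not from the MediaPipe codebase; `sum_` is assumed to be a member variable and the stream payloads are assumed to be ints) could keep `Timestamp::Max()` available and emit its summary during `Close`:
```
absl::Status Process(CalculatorContext* cc) override {
  sum_ += cc->Inputs().Index(0).Get<int>();
  // Keep Timestamp::Max() available so a packet can still be emitted in Close().
  cc->Outputs().Index(0).SetNextTimestampBound(Timestamp::Max());
  return absl::OkStatus();
}

absl::Status Close(CalculatorContext* cc) override {
  cc->Outputs().Index(0).AddPacket(
      MakePacket<int>(sum_).At(Timestamp::Max()));
  return absl::OkStatus();
}
```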

Binary file not shown (new image, 56 KiB).

View File

@ -79,19 +79,32 @@ to visualize its associated subgraphs, please see
## Pose Estimation Quality
To evaluate the quality of our [models](./models.md#pose) against other
well-performing publicly available solutions, we use a validation dataset,
consisting of 1k images with diverse Yoga, HIIT, and Dance postures. Each image
well-performing publicly available solutions, we use three different validation
datasets, representing different verticals: Yoga, Dance and HIIT. Each image
contains only a single person located 2-4 meters from the camera. To be
consistent with other solutions, we perform evaluation only for 17 keypoints
from [COCO topology](https://cocodataset.org/#keypoints-2020).
Method | [mAP](https://cocodataset.org/#keypoints-eval) | [PCK@0.2](https://github.com/cbsudux/Human-Pose-Estimation-101) | [FPS](https://en.wikipedia.org/wiki/Frame_rate), Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | [FPS](https://en.wikipedia.org/wiki/Frame_rate), MacBook Pro (15-inch, 2017)
----------------------------------------------------------------------------------------------------- | ---------------------------------------------: | --------------------------------------------------------------: | ------------------------------------------------------------------------------------------------------------------------------: | ---------------------------------------------------------------------------:
BlazePose.Lite | 49.1 | 91.7 | 49 | 40
BlazePose.Full | 64.5 | 95.8 | 40 | 37
BlazePose.Heavy | 70.9 | 97.0 | 19 | 26
[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 57.6 | 93.1 | N/A | N/A
[Apple Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 37.0 | 85.3 | N/A | N/A
Method | Yoga <br/> [`mAP`] | Yoga <br/> [`PCK@0.2`] | Dance <br/> [`mAP`] | Dance <br/> [`PCK@0.2`] | HIIT <br/> [`mAP`] | HIIT <br/> [`PCK@0.2`]
----------------------------------------------------------------------------------------------------- | -----------------: | ---------------------: | ------------------: | ----------------------: | -----------------: | ---------------------:
BlazePose.Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
BlazePose.Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
BlazePose.Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
[Apple.Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**
![pose_tracking_pck_chart.png](../images/mobile/pose_tracking_pck_chart.png) |
:--------------------------------------------------------------------------: |
*Fig 2. Quality evaluation in [`PCK@0.2`].* |
We designed our models specifically for live perception use cases, so all of
them work in real-time on the majority of modern devices.
Method | Latency <br/> Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | Latency <br/> MacBook Pro (15-inch 2017)
--------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
BlazePose.Heavy | 53 ms | 38 ms
BlazePose.Full | 25 ms | 27 ms
BlazePose.Lite | 20 ms | 25 ms
## Models
@ -109,7 +122,7 @@ hip midpoints.
![pose_tracking_detector_vitruvian_man.png](../images/mobile/pose_tracking_detector_vitruvian_man.png) |
:----------------------------------------------------------------------------------------------------: |
*Fig 2. Vitruvian man aligned via two virtual keypoints predicted by BlazePose detector in addition to the face bounding box.* |
*Fig 3. Vitruvian man aligned via two virtual keypoints predicted by BlazePose detector in addition to the face bounding box.* |
### Pose Landmark Model (BlazePose GHUM 3D)
@ -124,7 +137,7 @@ this [paper](https://arxiv.org/abs/2006.10204) and
![pose_tracking_full_body_landmarks.png](../images/mobile/pose_tracking_full_body_landmarks.png) |
:----------------------------------------------------------------------------------------------: |
*Fig 3. 33 pose landmarks.* |
*Fig 4. 33 pose landmarks.* |
## Solution APIs
@ -384,3 +397,6 @@ on how to build MediaPipe examples.
* [Models and model cards](./models.md#pose)
* [Web demo](https://code.mediapipe.dev/codepen/pose)
* [Python Colab](https://mediapipe.page.link/pose_py_colab)
[`mAP`]: https://cocodataset.org/#keypoints-eval
[`PCK@0.2`]: https://github.com/cbsudux/Human-Pose-Estimation-101

View File

@ -233,6 +233,22 @@ cc_test(
],
)
cc_library(
name = "concatenate_vector_calculator_hdr",
hdrs = ["concatenate_vector_calculator.h"],
visibility = ["//visibility:public"],
deps = [
":concatenate_vector_calculator_cc_proto",
"//mediapipe/framework:calculator_framework",
"//mediapipe/framework/api2:node",
"//mediapipe/framework/api2:port",
"//mediapipe/framework/port:integral_types",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",
],
alwayslink = 1,
)
cc_library(
name = "concatenate_vector_calculator",
srcs = ["concatenate_vector_calculator.cc"],

View File

@ -71,7 +71,8 @@ absl::Status DefaultSidePacketCalculator::GetContract(CalculatorContract* cc) {
if (cc->InputSidePackets().HasTag(kOptionalValueTag)) {
cc->InputSidePackets()
.Tag(kOptionalValueTag)
.SetSameAs(&cc->InputSidePackets().Tag(kDefaultValueTag));
.SetSameAs(&cc->InputSidePackets().Tag(kDefaultValueTag))
.Optional();
}
RET_CHECK(cc->OutputSidePackets().HasTag(kValueTag));

View File

@ -410,7 +410,9 @@ cc_library(
srcs = ["image_properties_calculator.cc"],
visibility = ["//visibility:public"],
deps = [
"//mediapipe/framework/api2:node",
"//mediapipe/framework:calculator_framework",
"//mediapipe/framework/formats:image",
"//mediapipe/framework/formats:image_frame",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",

View File

@ -12,25 +12,32 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#include "mediapipe/framework/api2/node.h"
#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/formats/image.h"
#include "mediapipe/framework/formats/image_frame.h"
#if !MEDIAPIPE_DISABLE_GPU
#include "mediapipe/gpu/gpu_buffer.h"
#endif // !MEDIAPIPE_DISABLE_GPU
namespace {
constexpr char kImageFrameTag[] = "IMAGE";
constexpr char kGpuBufferTag[] = "IMAGE_GPU";
} // namespace
namespace mediapipe {
namespace api2 {
#if MEDIAPIPE_DISABLE_GPU
// Just a placeholder to not have to depend on mediapipe::GpuBuffer.
using GpuBuffer = AnyType;
#else
using GpuBuffer = mediapipe::GpuBuffer;
#endif // MEDIAPIPE_DISABLE_GPU
// Extracts image properties from the input image and outputs the properties.
// Currently only supports image size.
// Input:
// One of the following:
// IMAGE: An ImageFrame
// IMAGE: An Image or ImageFrame (for backward compatibility with existing
// graphs that use IMAGE for ImageFrame input)
// IMAGE_CPU: An ImageFrame
// IMAGE_GPU: A GpuBuffer
//
// Output:
@ -42,59 +49,64 @@ namespace mediapipe {
// input_stream: "IMAGE:image"
// output_stream: "SIZE:size"
// }
class ImagePropertiesCalculator : public CalculatorBase {
class ImagePropertiesCalculator : public Node {
public:
static absl::Status GetContract(CalculatorContract* cc) {
RET_CHECK(cc->Inputs().HasTag(kImageFrameTag) ^
cc->Inputs().HasTag(kGpuBufferTag));
if (cc->Inputs().HasTag(kImageFrameTag)) {
cc->Inputs().Tag(kImageFrameTag).Set<ImageFrame>();
}
#if !MEDIAPIPE_DISABLE_GPU
if (cc->Inputs().HasTag(kGpuBufferTag)) {
cc->Inputs().Tag(kGpuBufferTag).Set<::mediapipe::GpuBuffer>();
}
#endif // !MEDIAPIPE_DISABLE_GPU
static constexpr Input<
OneOf<mediapipe::Image, mediapipe::ImageFrame>>::Optional kIn{"IMAGE"};
// IMAGE_CPU, dedicated to ImageFrame input, is only needed in some top-level
// graphs for the Python Solution APIs to figure out the type of input stream
// without running into ambiguities from IMAGE.
// TODO: Remove IMAGE_CPU once Python Solution APIs adopt Image.
static constexpr Input<mediapipe::ImageFrame>::Optional kInCpu{"IMAGE_CPU"};
static constexpr Input<GpuBuffer>::Optional kInGpu{"IMAGE_GPU"};
static constexpr Output<std::pair<int, int>> kOut{"SIZE"};
if (cc->Outputs().HasTag("SIZE")) {
cc->Outputs().Tag("SIZE").Set<std::pair<int, int>>();
}
MEDIAPIPE_NODE_CONTRACT(kIn, kInCpu, kInGpu, kOut);
return absl::OkStatus();
}
static absl::Status UpdateContract(CalculatorContract* cc) {
RET_CHECK_EQ(kIn(cc).IsConnected() + kInCpu(cc).IsConnected() +
kInGpu(cc).IsConnected(),
1)
<< "One and only one of IMAGE, IMAGE_CPU and IMAGE_GPU input is "
"expected.";
absl::Status Open(CalculatorContext* cc) override {
cc->SetOffset(TimestampDiff(0));
return absl::OkStatus();
}
absl::Status Process(CalculatorContext* cc) override {
int width;
int height;
std::pair<int, int> size;
if (cc->Inputs().HasTag(kImageFrameTag) &&
!cc->Inputs().Tag(kImageFrameTag).IsEmpty()) {
const auto& image = cc->Inputs().Tag(kImageFrameTag).Get<ImageFrame>();
width = image.Width();
height = image.Height();
if (kIn(cc).IsConnected()) {
kIn(cc).Visit(
[&size](const mediapipe::Image& value) {
size.first = value.width();
size.second = value.height();
},
[&size](const mediapipe::ImageFrame& value) {
size.first = value.Width();
size.second = value.Height();
});
}
if (kInCpu(cc).IsConnected()) {
const auto& image = *kInCpu(cc);
size.first = image.Width();
size.second = image.Height();
}
#if !MEDIAPIPE_DISABLE_GPU
if (cc->Inputs().HasTag(kGpuBufferTag) &&
!cc->Inputs().Tag(kGpuBufferTag).IsEmpty()) {
const auto& image =
cc->Inputs().Tag(kGpuBufferTag).Get<mediapipe::GpuBuffer>();
width = image.width();
height = image.height();
if (kInGpu(cc).IsConnected()) {
const auto& image = *kInGpu(cc);
size.first = image.width();
size.second = image.height();
}
#endif // !MEDIAPIPE_DISABLE_GPU
cc->Outputs().Tag("SIZE").AddPacket(
MakePacket<std::pair<int, int>>(width, height)
.At(cc->InputTimestamp()));
kOut(cc).Send(size);
return absl::OkStatus();
}
};
REGISTER_CALCULATOR(ImagePropertiesCalculator);
MEDIAPIPE_REGISTER_NODE(ImagePropertiesCalculator);
} // namespace api2
} // namespace mediapipe

View File

@ -585,6 +585,7 @@ cc_library(
],
"//conditions:default": [],
}),
visibility = ["//visibility:public"],
deps = [
":image_to_tensor_utils",
"//mediapipe/framework/formats:image",

View File

@ -312,7 +312,7 @@ class GlProcessor : public ImageToTensorConverter {
return absl::OkStatus();
}));
return tensor;
return std::move(tensor);
}
~GlProcessor() override {

View File

@ -383,7 +383,7 @@ class MetalProcessor : public ImageToTensorConverter {
tflite::gpu::HW(output_dims.height, output_dims.width),
command_buffer, buffer_view.buffer()));
[command_buffer commit];
return tensor;
return std::move(tensor);
}
}

View File

@ -103,7 +103,7 @@ class OpenCvProcessor : public ImageToTensorConverter {
GetValueRangeTransformation(kInputImageRangeMin, kInputImageRangeMax,
range_min, range_max));
transformed.convertTo(dst, CV_32FC3, transform.scale, transform.offset);
return tensor;
return std::move(tensor);
}
private:

View File

@ -205,11 +205,12 @@ class VelocityFilter : public LandmarksFilter {
class OneEuroFilterImpl : public LandmarksFilter {
public:
OneEuroFilterImpl(double frequency, double min_cutoff, double beta,
double derivate_cutoff)
double derivate_cutoff, float min_allowed_object_scale)
: frequency_(frequency),
min_cutoff_(min_cutoff),
beta_(beta),
derivate_cutoff_(derivate_cutoff) {}
derivate_cutoff_(derivate_cutoff),
min_allowed_object_scale_(min_allowed_object_scale) {}
absl::Status Reset() override {
x_filters_.clear();
@ -224,15 +225,25 @@ class OneEuroFilterImpl : public LandmarksFilter {
// Initialize filters once.
MP_RETURN_IF_ERROR(InitializeFiltersIfEmpty(in_landmarks.landmark_size()));
const float object_scale = GetObjectScale(in_landmarks);
if (object_scale < min_allowed_object_scale_) {
*out_landmarks = in_landmarks;
return absl::OkStatus();
}
const float value_scale = 1.0f / object_scale;
// Filter landmarks. Every axis of every landmark is filtered separately.
for (int i = 0; i < in_landmarks.landmark_size(); ++i) {
const auto& in_landmark = in_landmarks.landmark(i);
auto* out_landmark = out_landmarks->add_landmark();
*out_landmark = in_landmark;
out_landmark->set_x(x_filters_[i].Apply(timestamp, in_landmark.x()));
out_landmark->set_y(y_filters_[i].Apply(timestamp, in_landmark.y()));
out_landmark->set_z(z_filters_[i].Apply(timestamp, in_landmark.z()));
out_landmark->set_x(
x_filters_[i].Apply(timestamp, value_scale, in_landmark.x()));
out_landmark->set_y(
y_filters_[i].Apply(timestamp, value_scale, in_landmark.y()));
out_landmark->set_z(
z_filters_[i].Apply(timestamp, value_scale, in_landmark.z()));
}
return absl::OkStatus();
@ -265,6 +276,7 @@ class OneEuroFilterImpl : public LandmarksFilter {
double min_cutoff_;
double beta_;
double derivate_cutoff_;
double min_allowed_object_scale_;
std::vector<OneEuroFilter> x_filters_;
std::vector<OneEuroFilter> y_filters_;
@ -344,7 +356,8 @@ absl::Status LandmarksSmoothingCalculator::Open(CalculatorContext* cc) {
options.one_euro_filter().frequency(),
options.one_euro_filter().min_cutoff(),
options.one_euro_filter().beta(),
options.one_euro_filter().derivate_cutoff());
options.one_euro_filter().derivate_cutoff(),
options.one_euro_filter().min_allowed_object_scale());
} else {
RET_CHECK_FAIL()
<< "Landmarks filter is either not specified or not supported";

View File

@ -50,9 +50,9 @@ message LandmarksSmoothingCalculatorOptions {
// For the details of the filter implementation and the procedure of its
// configuration please check http://cristal.univ-lille.fr/~casiez/1euro/
message OneEuroFilter {
// Frequency of incoming frames defined in seconds. Used only if it can't be
// calculated from provided events (e.g. on the very first frame).
optional float frequency = 1 [default = 0.033];
// Frequency of incoming frames defined in frames per second. Used only if it
// can't be calculated from provided events (e.g. on the very first frame).
optional float frequency = 1 [default = 30.0];
// Minimum cutoff frequency. Start by tuning this parameter while keeping
// `beta = 0` to reduce jittering to the desired level. 1Hz (the default
@ -68,6 +68,10 @@ message LandmarksSmoothingCalculatorOptions {
// algorithm, but can be tuned to further smooth the speed (i.e. derivate)
// on the object.
optional float derivate_cutoff = 4 [default = 1.0];
// If calculated object scale is less than given value smoothing will be
// disabled and landmarks will be returned as is.
optional float min_allowed_object_scale = 5 [default = 1e-6];
}
oneof filter_options {

View File

@ -77,10 +77,12 @@ class RefineLandmarksFromHeatmapCalculatorImpl
const auto& options =
cc->Options<mediapipe::RefineLandmarksFromHeatmapCalculatorOptions>();
ASSIGN_OR_RETURN(auto out_lms, RefineLandmarksFromHeatMap(
in_lms, hm_raw, hm_tensor.shape().dims,
options.kernel_size(),
options.min_confidence_to_refine()));
ASSIGN_OR_RETURN(
auto out_lms,
RefineLandmarksFromHeatMap(
in_lms, hm_raw, hm_tensor.shape().dims, options.kernel_size(),
options.min_confidence_to_refine(), options.refine_presence(),
options.refine_visibility()));
kOutLandmarks(cc).Send(std::move(out_lms));
return absl::OkStatus();
@ -104,7 +106,8 @@ class RefineLandmarksFromHeatmapCalculatorImpl
absl::StatusOr<mediapipe::NormalizedLandmarkList> RefineLandmarksFromHeatMap(
const mediapipe::NormalizedLandmarkList& in_lms,
const float* heatmap_raw_data, const std::vector<int>& heatmap_dims,
int kernel_size, float min_confidence_to_refine) {
int kernel_size, float min_confidence_to_refine, bool refine_presence,
bool refine_visibility) {
ASSIGN_OR_RETURN(auto hm_dims, GetHwcFromDims(heatmap_dims));
auto [hm_height, hm_width, hm_channels] = hm_dims;
@ -136,7 +139,7 @@ absl::StatusOr<mediapipe::NormalizedLandmarkList> RefineLandmarksFromHeatMap(
float sum = 0;
float weighted_col = 0;
float weighted_row = 0;
float max_value = 0;
float max_confidence_value = 0;
// Main loop. Go over kernel and calculate weighted sum of coordinates,
// sum of weights and max weights.
@ -150,15 +153,33 @@ absl::StatusOr<mediapipe::NormalizedLandmarkList> RefineLandmarksFromHeatMap(
// options.
float confidence = Sigmoid(heatmap_raw_data[idx]);
sum += confidence;
max_value = std::max(max_value, confidence);
max_confidence_value = std::max(max_confidence_value, confidence);
weighted_col += col * confidence;
weighted_row += row * confidence;
}
}
if (max_value >= min_confidence_to_refine && sum > 0) {
if (max_confidence_value >= min_confidence_to_refine && sum > 0) {
out_lms.mutable_landmark(lm_index)->set_x(weighted_col / hm_width / sum);
out_lms.mutable_landmark(lm_index)->set_y(weighted_row / hm_height / sum);
}
if (refine_presence && sum > 0 &&
out_lms.landmark(lm_index).has_presence()) {
// We assume confidence in heatmaps describes landmark presence.
// If landmark is not confident in heatmaps, probably it is not present.
const float presence = out_lms.landmark(lm_index).presence();
const float new_presence = std::min(presence, max_confidence_value);
out_lms.mutable_landmark(lm_index)->set_presence(new_presence);
}
if (refine_visibility && sum > 0 &&
out_lms.landmark(lm_index).has_visibility()) {
// We assume confidence in heatmaps describes landmark presence.
// As visibility = (not occluded but still present) -> that means that if
// landmark is not present, it is not visible as well.
// I.e. visibility confidence cannot be bigger than presence confidence.
const float visibility = out_lms.landmark(lm_index).visibility();
const float new_visibility = std::min(visibility, max_confidence_value);
out_lms.mutable_landmark(lm_index)->set_visibility(new_visibility);
}
}
return out_lms;
}

View File

@ -43,7 +43,8 @@ class RefineLandmarksFromHeatmapCalculator : public NodeIntf {
absl::StatusOr<mediapipe::NormalizedLandmarkList> RefineLandmarksFromHeatMap(
const mediapipe::NormalizedLandmarkList& in_lms,
const float* heatmap_raw_data, const std::vector<int>& heatmap_dims,
int kernel_size, float min_confidence_to_refine);
int kernel_size, float min_confidence_to_refine, bool refine_presence,
bool refine_visibility);
} // namespace mediapipe

View File

@ -24,4 +24,6 @@ message RefineLandmarksFromHeatmapCalculatorOptions {
}
optional int32 kernel_size = 1 [default = 9];
optional float min_confidence_to_refine = 2 [default = 0.5];
optional bool refine_presence = 3 [default = false];
optional bool refine_visibility = 4 [default = false];
}

View File

@ -70,8 +70,8 @@ TEST(RefineLandmarksFromHeatmapTest, Smoke) {
z, z, z};
// clang-format on
auto ret_or_error = RefineLandmarksFromHeatMap(vec_to_lms({{0.5, 0.5}}),
hm.data(), {3, 3, 1}, 3, 0.1);
auto ret_or_error = RefineLandmarksFromHeatMap(
vec_to_lms({{0.5, 0.5}}), hm.data(), {3, 3, 1}, 3, 0.1, true, true);
MP_EXPECT_OK(ret_or_error);
EXPECT_THAT(lms_to_vec(*ret_or_error),
ElementsAre(Pair(FloatEq(0), FloatEq(1 / 3.))));
@ -94,7 +94,7 @@ TEST(RefineLandmarksFromHeatmapTest, MultiLayer) {
auto ret_or_error = RefineLandmarksFromHeatMap(
vec_to_lms({{0.5, 0.5}, {0.5, 0.5}, {0.5, 0.5}}), hm.data(), {3, 3, 3}, 3,
0.1);
0.1, true, true);
MP_EXPECT_OK(ret_or_error);
EXPECT_THAT(lms_to_vec(*ret_or_error),
ElementsAre(Pair(FloatEq(0), FloatEq(1 / 3.)),
@ -119,7 +119,7 @@ TEST(RefineLandmarksFromHeatmapTest, KeepIfNotSure) {
auto ret_or_error = RefineLandmarksFromHeatMap(
vec_to_lms({{0.5, 0.5}, {0.5, 0.5}, {0.5, 0.5}}), hm.data(), {3, 3, 3}, 3,
0.6);
0.6, true, true);
MP_EXPECT_OK(ret_or_error);
EXPECT_THAT(lms_to_vec(*ret_or_error),
ElementsAre(Pair(FloatEq(0.5), FloatEq(0.5)),
@ -140,8 +140,9 @@ TEST(RefineLandmarksFromHeatmapTest, Border) {
z, z, 0}, 3, 3, 2);
// clang-format on
auto ret_or_error = RefineLandmarksFromHeatMap(
vec_to_lms({{0.0, 0.0}, {0.9, 0.9}}), hm.data(), {3, 3, 2}, 3, 0.1);
auto ret_or_error =
RefineLandmarksFromHeatMap(vec_to_lms({{0.0, 0.0}, {0.9, 0.9}}),
hm.data(), {3, 3, 2}, 3, 0.1, true, true);
MP_EXPECT_OK(ret_or_error);
EXPECT_THAT(lms_to_vec(*ret_or_error),
ElementsAre(Pair(FloatEq(0), FloatEq(1 / 3.)),

View File

@ -1638,6 +1638,8 @@ cc_test(
":calculator_contract_test_cc_proto",
":calculator_framework",
":graph_validation",
"//mediapipe/calculators/core:constant_side_packet_calculator",
"//mediapipe/calculators/core:default_side_packet_calculator",
"//mediapipe/calculators/core:pass_through_calculator",
"//mediapipe/framework:calculator_cc_proto",
"//mediapipe/framework:packet_generator_cc_proto",

View File

@ -236,7 +236,8 @@ inline int Image::channels() const {
inline int Image::step() const {
if (use_gpu_)
return width() * ImageFrame::ByteDepthForFormat(image_format());
return width() * channels() *
ImageFrame::ByteDepthForFormat(image_format());
else
return image_frame_->WidthStep();
}

View File

@ -499,5 +499,55 @@ TEST(GraphValidationTest, OptionalInputsForGraph) {
MP_EXPECT_OK(graph_1.WaitUntilDone());
}
// Shows a calculator graph and DefaultSidePacketCalculator running with and
// without one optional side packet.
TEST(GraphValidationTest, DefaultOptionalInputsForGraph) {
// A subgraph defining one optional input-side-packet.
auto config_1 = ParseTextProtoOrDie<CalculatorGraphConfig>(R"pb(
type: "PassThroughGraph"
input_side_packet: "side_input_0"
output_side_packet: "OUTPUT:output_0"
node {
calculator: "ConstantSidePacketCalculator"
options: {
[mediapipe.ConstantSidePacketCalculatorOptions.ext]: {
packet { int_value: 2 }
}
}
output_side_packet: "PACKET:int_packet"
}
node {
calculator: "DefaultSidePacketCalculator"
input_side_packet: "OPTIONAL_VALUE:side_input_0"
input_side_packet: "DEFAULT_VALUE:int_packet"
output_side_packet: "VALUE:side_output_0"
}
)pb");
GraphValidation validation_1;
MP_EXPECT_OK(validation_1.Validate({config_1}, {}));
CalculatorGraph graph_1;
MP_EXPECT_OK(graph_1.Initialize({config_1}, {}));
// Run the graph specifying the optional side packet.
std::map<std::string, Packet> side_packets;
side_packets.insert({"side_input_0", MakePacket<int>(33)});
MP_EXPECT_OK(graph_1.StartRun(side_packets));
MP_EXPECT_OK(graph_1.CloseAllPacketSources());
MP_EXPECT_OK(graph_1.WaitUntilDone());
// The specified side packet value is used.
auto side_packet_0 = graph_1.GetOutputSidePacket("side_output_0");
EXPECT_EQ(side_packet_0->Get<int>(), 33);
// Run the graph omitting the optional inputs.
MP_EXPECT_OK(graph_1.StartRun({}));
MP_EXPECT_OK(graph_1.CloseAllPacketSources());
MP_EXPECT_OK(graph_1.WaitUntilDone());
// The default side packet value is used.
side_packet_0 = graph_1.GetOutputSidePacket("side_output_0");
EXPECT_EQ(side_packet_0->Get<int>(), 2);
}
} // namespace
} // namespace mediapipe

View File

@ -604,7 +604,6 @@ absl::Status GraphProfiler::CaptureProfile(GraphProfile* result) {
*result->mutable_calculator_profiles()->Add() = std::move(p);
}
this->Reset();
AssignNodeNames(result);
return status;
}

View File

@ -681,6 +681,7 @@ cc_library(
"//mediapipe/framework/port:logging",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",
"//mediapipe/framework/tool:switch_container_cc_proto",
"@com_google_absl//absl/strings",
],
alwayslink = 1,
@ -705,6 +706,7 @@ cc_library(
"//mediapipe/framework/port:logging",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",
"//mediapipe/framework/tool:switch_container_cc_proto",
"@com_google_absl//absl/strings",
],
alwayslink = 1,

View File

@ -82,6 +82,8 @@ CalculatorGraphConfig::Node* BuildDemuxNode(
CalculatorGraphConfig* config) {
CalculatorGraphConfig::Node* result = config->add_node();
*result->mutable_calculator() = "SwitchDemuxCalculator";
*result->mutable_input_stream_handler()->mutable_input_stream_handler() =
"ImmediateInputStreamHandler";
return result;
}
@ -91,9 +93,42 @@ CalculatorGraphConfig::Node* BuildMuxNode(
CalculatorGraphConfig* config) {
CalculatorGraphConfig::Node* result = config->add_node();
*result->mutable_calculator() = "SwitchMuxCalculator";
*result->mutable_input_stream_handler()->mutable_input_stream_handler() =
"ImmediateInputStreamHandler";
return result;
}
// Copies options from one node to another.
void CopyOptions(const CalculatorGraphConfig::Node& source,
CalculatorGraphConfig::Node* dest) {
if (source.has_options()) {
*dest->mutable_options() = source.options();
}
*dest->mutable_node_options() = source.node_options();
}
// Clears options that are consumed by the container and not forwarded.
void ClearContainerOptions(SwitchContainerOptions* result) {
result->clear_contained_node();
}
// Clears options that are consumed by the container and not forwarded.
void ClearContainerOptions(CalculatorGraphConfig::Node* dest) {
if (dest->has_options() &&
dest->mutable_options()->HasExtension(SwitchContainerOptions::ext)) {
ClearContainerOptions(
dest->mutable_options()->MutableExtension(SwitchContainerOptions::ext));
}
for (google::protobuf::Any& a : *dest->mutable_node_options()) {
if (a.Is<SwitchContainerOptions>()) {
SwitchContainerOptions extension;
a.UnpackTo(&extension);
ClearContainerOptions(&extension);
a.PackFrom(extension);
}
}
}
// Returns an unused name similar to a specified name.
std::string UniqueName(std::string name, std::set<std::string>* names) {
CHECK(names != nullptr);
@ -199,12 +234,16 @@ absl::StatusOr<CalculatorGraphConfig> SwitchContainer::GetConfig(
// Add a graph node for the demux, mux.
auto demux = BuildDemuxNode(input_tags, &config);
CopyOptions(container_node, demux);
ClearContainerOptions(demux);
demux->add_input_stream("SELECT:gate_select");
demux->add_input_stream("ENABLE:gate_enable");
demux->add_input_side_packet("SELECT:gate_select");
demux->add_input_side_packet("ENABLE:gate_enable");
auto mux = BuildMuxNode(output_tags, &config);
CopyOptions(container_node, mux);
ClearContainerOptions(mux);
mux->add_input_stream("SELECT:gate_select");
mux->add_input_stream("ENABLE:gate_enable");
mux->add_input_side_packet("SELECT:gate_select");

View File

@ -225,6 +225,12 @@ TEST(SwitchContainerTest, ApplyToSubnodes) {
input_stream: "foo"
output_stream: "C0__:switchcontainer__c0__foo"
output_stream: "C1__:switchcontainer__c1__foo"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
name: "switchcontainer__TripleIntCalculator"
@ -245,6 +251,12 @@ TEST(SwitchContainerTest, ApplyToSubnodes) {
input_stream: "C0__:switchcontainer__c0__bar"
input_stream: "C1__:switchcontainer__c1__bar"
output_stream: "bar"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
calculator: "PassThroughCalculator"
@ -270,6 +282,75 @@ TEST(SwitchContainerTest, RunsWithSubnodes) {
RunTestContainer(supergraph);
}
// Shows the SwitchContainer does not allow input_stream_handler overwrite.
TEST(SwitchContainerTest, ValidateInputStreamHandler) {
EXPECT_TRUE(SubgraphRegistry::IsRegistered("SwitchContainer"));
CalculatorGraph graph;
CalculatorGraphConfig supergraph = SideSubnodeContainerExample();
*supergraph.mutable_input_stream_handler()->mutable_input_stream_handler() =
"DefaultInputStreamHandler";
MP_ASSERT_OK(graph.Initialize(supergraph, {}));
CalculatorGraphConfig expected_graph = mediapipe::ParseTextProtoOrDie<
CalculatorGraphConfig>(R"pb(
node {
name: "switchcontainer__SwitchDemuxCalculator"
calculator: "SwitchDemuxCalculator"
input_side_packet: "ENABLE:enable"
input_side_packet: "foo"
output_side_packet: "C0__:switchcontainer__c0__foo"
output_side_packet: "C1__:switchcontainer__c1__foo"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
name: "switchcontainer__TripleIntCalculator"
calculator: "TripleIntCalculator"
input_side_packet: "switchcontainer__c0__foo"
output_side_packet: "switchcontainer__c0__bar"
input_stream_handler { input_stream_handler: "DefaultInputStreamHandler" }
}
node {
name: "switchcontainer__PassThroughCalculator"
calculator: "PassThroughCalculator"
input_side_packet: "switchcontainer__c1__foo"
output_side_packet: "switchcontainer__c1__bar"
input_stream_handler { input_stream_handler: "DefaultInputStreamHandler" }
}
node {
name: "switchcontainer__SwitchMuxCalculator"
calculator: "SwitchMuxCalculator"
input_side_packet: "ENABLE:enable"
input_side_packet: "C0__:switchcontainer__c0__bar"
input_side_packet: "C1__:switchcontainer__c1__bar"
output_side_packet: "bar"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
calculator: "PassThroughCalculator"
input_side_packet: "foo"
input_side_packet: "bar"
output_side_packet: "output_foo"
output_side_packet: "output_bar"
input_stream_handler { input_stream_handler: "DefaultInputStreamHandler" }
}
input_stream_handler { input_stream_handler: "DefaultInputStreamHandler" }
executor {}
input_side_packet: "foo"
input_side_packet: "enable"
output_side_packet: "output_bar"
)pb");
EXPECT_THAT(graph.Config(), mediapipe::EqualsProto(expected_graph));
}
// Shows the SwitchContainer container applied to a pair of simple subnodes.
TEST(SwitchContainerTest, ApplyToSideSubnodes) {
EXPECT_TRUE(SubgraphRegistry::IsRegistered("SwitchContainer"));
@ -286,6 +367,12 @@ TEST(SwitchContainerTest, ApplyToSideSubnodes) {
input_side_packet: "foo"
output_side_packet: "C0__:switchcontainer__c0__foo"
output_side_packet: "C1__:switchcontainer__c1__foo"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
name: "switchcontainer__TripleIntCalculator"
@ -306,6 +393,12 @@ TEST(SwitchContainerTest, ApplyToSideSubnodes) {
input_side_packet: "C0__:switchcontainer__c0__bar"
input_side_packet: "C1__:switchcontainer__c1__bar"
output_side_packet: "bar"
options {
[mediapipe.SwitchContainerOptions.ext] {}
}
input_stream_handler {
input_stream_handler: "ImmediateInputStreamHandler"
}
}
node {
calculator: "PassThroughCalculator"

View File

@ -70,17 +70,11 @@ REGISTER_CALCULATOR(SwitchDemuxCalculator);
absl::Status SwitchDemuxCalculator::GetContract(CalculatorContract* cc) {
// Allow any one of kSelectTag, kEnableTag.
if (cc->Inputs().HasTag(kSelectTag)) {
cc->Inputs().Tag(kSelectTag).Set<int>();
} else if (cc->Inputs().HasTag(kEnableTag)) {
cc->Inputs().Tag(kEnableTag).Set<bool>();
}
cc->Inputs().Tag(kSelectTag).Set<int>().Optional();
cc->Inputs().Tag(kEnableTag).Set<bool>().Optional();
// Allow any one of kSelectTag, kEnableTag.
if (cc->InputSidePackets().HasTag(kSelectTag)) {
cc->InputSidePackets().Tag(kSelectTag).Set<int>();
} else if (cc->InputSidePackets().HasTag(kEnableTag)) {
cc->InputSidePackets().Tag(kEnableTag).Set<bool>();
}
cc->InputSidePackets().Tag(kSelectTag).Set<int>().Optional();
cc->InputSidePackets().Tag(kEnableTag).Set<bool>().Optional();
// Set the types for all output channels to corresponding input types.
std::set<std::string> channel_tags = ChannelTags(cc->Outputs().TagMap());

View File

@ -73,17 +73,11 @@ REGISTER_CALCULATOR(SwitchMuxCalculator);
absl::Status SwitchMuxCalculator::GetContract(CalculatorContract* cc) {
// Allow any one of kSelectTag, kEnableTag.
if (cc->Inputs().HasTag(kSelectTag)) {
cc->Inputs().Tag(kSelectTag).Set<int>();
} else if (cc->Inputs().HasTag(kEnableTag)) {
cc->Inputs().Tag(kEnableTag).Set<bool>();
}
cc->Inputs().Tag(kSelectTag).Set<int>().Optional();
cc->Inputs().Tag(kEnableTag).Set<bool>().Optional();
// Allow any one of kSelectTag, kEnableTag.
if (cc->InputSidePackets().HasTag(kSelectTag)) {
cc->InputSidePackets().Tag(kSelectTag).Set<int>();
} else if (cc->InputSidePackets().HasTag(kEnableTag)) {
cc->InputSidePackets().Tag(kEnableTag).Set<bool>();
}
cc->InputSidePackets().Tag(kSelectTag).Set<int>().Optional();
cc->InputSidePackets().Tag(kEnableTag).Set<bool>().Optional();
// Set the types for all input channels to corresponding output types.
std::set<std::string> channel_tags = ChannelTags(cc->Inputs().TagMap());

View File

@ -44,7 +44,6 @@ cc_library(
name = "holistic_tracking_gpu_deps",
deps = [
":holistic_tracking_to_render_data",
"//mediapipe/calculators/core:constant_side_packet_calculator",
"//mediapipe/calculators/core:flow_limiter_calculator",
"//mediapipe/calculators/image:image_properties_calculator",
"//mediapipe/calculators/util:annotation_overlay_calculator",
@ -63,7 +62,6 @@ cc_library(
name = "holistic_tracking_cpu_graph_deps",
deps = [
":holistic_tracking_to_render_data",
"//mediapipe/calculators/core:constant_side_packet_calculator",
"//mediapipe/calculators/core:flow_limiter_calculator",
"//mediapipe/calculators/image:image_properties_calculator",
"//mediapipe/calculators/util:annotation_overlay_calculator",

View File

@ -36,23 +36,9 @@ node {
}
}
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:0:model_complexity"
output_side_packet: "PACKET:1:smooth_landmarks"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
packet { bool_value: true }
}
}
}
node {
calculator: "HolisticLandmarkCpu"
input_stream: "IMAGE:throttled_input_video"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
output_stream: "POSE_LANDMARKS:pose_landmarks"
output_stream: "POSE_ROI:pose_roi"
output_stream: "POSE_DETECTION:pose_detection"

View File

@ -36,23 +36,9 @@ node {
}
}
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:0:model_complexity"
output_side_packet: "PACKET:1:smooth_landmarks"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
packet { bool_value: true }
}
}
}
node {
calculator: "HolisticLandmarkGpu"
input_stream: "IMAGE:throttled_input_video"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
output_stream: "POSE_LANDMARKS:pose_landmarks"
output_stream: "POSE_ROI:pose_roi"
output_stream: "POSE_DETECTION:pose_detection"

View File

@ -24,10 +24,7 @@ package(default_visibility = ["//visibility:public"])
cc_library(
name = "pose_tracking_gpu_deps",
deps = [
"//mediapipe/calculators/core:constant_side_packet_calculator",
"//mediapipe/calculators/core:flow_limiter_calculator",
"//mediapipe/calculators/image:image_properties_calculator",
"//mediapipe/calculators/util:landmarks_smoothing_calculator",
"//mediapipe/graphs/pose_tracking/subgraphs:pose_renderer_gpu",
"//mediapipe/modules/pose_landmark:pose_landmark_gpu",
],
@ -43,10 +40,7 @@ mediapipe_binary_graph(
cc_library(
name = "pose_tracking_cpu_deps",
deps = [
"//mediapipe/calculators/core:constant_side_packet_calculator",
"//mediapipe/calculators/core:flow_limiter_calculator",
"//mediapipe/calculators/image:image_properties_calculator",
"//mediapipe/calculators/util:landmarks_smoothing_calculator",
"//mediapipe/graphs/pose_tracking/subgraphs:pose_renderer_cpu",
"//mediapipe/modules/pose_landmark:pose_landmark_cpu",
],

View File

@ -29,54 +29,20 @@ node {
output_stream: "throttled_input_video"
}
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:model_complexity"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
}
}
}
# Subgraph that detects poses and corresponding landmarks.
node {
calculator: "PoseLandmarkCpu"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_stream: "IMAGE:throttled_input_video"
output_stream: "LANDMARKS:pose_landmarks"
output_stream: "DETECTION:pose_detection"
output_stream: "ROI_FROM_LANDMARKS:roi_from_landmarks"
}
# Calculates size of the image.
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:throttled_input_video"
output_stream: "SIZE:image_size"
}
# Smoothes pose landmarks in order to reduce jitter.
node {
calculator: "LandmarksSmoothingCalculator"
input_stream: "NORM_LANDMARKS:pose_landmarks"
input_stream: "IMAGE_SIZE:image_size"
output_stream: "NORM_FILTERED_LANDMARKS:pose_landmarks_smoothed"
node_options: {
[type.googleapis.com/mediapipe.LandmarksSmoothingCalculatorOptions] {
velocity_filter: {
window_size: 5
velocity_scale: 10.0
}
}
}
}
# Subgraph that renders pose-landmark annotation onto the input image.
node {
calculator: "PoseRendererCpu"
input_stream: "IMAGE:throttled_input_video"
input_stream: "LANDMARKS:pose_landmarks_smoothed"
input_stream: "LANDMARKS:pose_landmarks"
input_stream: "ROI:roi_from_landmarks"
input_stream: "DETECTION:pose_detection"
output_stream: "IMAGE:output_video"

View File

@ -29,54 +29,20 @@ node {
output_stream: "throttled_input_video"
}
node {
calculator: "ConstantSidePacketCalculator"
output_side_packet: "PACKET:model_complexity"
node_options: {
[type.googleapis.com/mediapipe.ConstantSidePacketCalculatorOptions]: {
packet { int_value: 1 }
}
}
}
# Subgraph that detects poses and corresponding landmarks.
node {
calculator: "PoseLandmarkGpu"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_stream: "IMAGE:throttled_input_video"
output_stream: "LANDMARKS:pose_landmarks"
output_stream: "DETECTION:pose_detection"
output_stream: "ROI_FROM_LANDMARKS:roi_from_landmarks"
}
# Calculates size of the image.
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE_GPU:throttled_input_video"
output_stream: "SIZE:image_size"
}
# Smoothes pose landmarks in order to reduce jitter.
node {
calculator: "LandmarksSmoothingCalculator"
input_stream: "NORM_LANDMARKS:pose_landmarks"
input_stream: "IMAGE_SIZE:image_size"
output_stream: "NORM_FILTERED_LANDMARKS:pose_landmarks_smoothed"
node_options: {
[type.googleapis.com/mediapipe.LandmarksSmoothingCalculatorOptions] {
velocity_filter: {
window_size: 5
velocity_scale: 10.0
}
}
}
}
# Subgraph that renders pose-landmark annotation onto the input image.
node {
calculator: "PoseRendererGpu"
input_stream: "IMAGE:throttled_input_video"
input_stream: "LANDMARKS:pose_landmarks_smoothed"
input_stream: "LANDMARKS:pose_landmarks"
input_stream: "ROI:roi_from_landmarks"
input_stream: "DETECTION:pose_detection"
output_stream: "IMAGE:output_video"

View File

@ -154,7 +154,7 @@ node {
# Extracts image size.
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:image"
input_stream: "IMAGE_CPU:image"
output_stream: "SIZE:image_size"
}

View File

@ -52,7 +52,7 @@ input_stream: "IMAGE:image"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Whether to filter landmarks across different input images to reduce jitter.

View File

@ -52,7 +52,7 @@ input_stream: "IMAGE:image"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Whether to filter landmarks across different input images to reduce jitter.

View File

@ -113,7 +113,7 @@ node {
# Extracts image size from the input images.
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:image"
input_stream: "IMAGE_CPU:image"
output_stream: "SIZE:image_size"
}

View File

@ -28,7 +28,7 @@ input_stream: "ROI:roi"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Pose landmarks within the given ROI. (NormalizedLandmarkList)

View File

@ -28,7 +28,7 @@ input_stream: "ROI:roi"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Pose landmarks within the given ROI. (NormalizedLandmarkList)

View File

@ -29,12 +29,12 @@ type: "PoseLandmarkCpu"
input_stream: "IMAGE:image"
# Whether to filter landmarks across different input images to reduce jitter.
# If unspecified, functions as set to false. (bool)
# If unspecified, functions as set to true. (bool)
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Pose landmarks within the given ROI. (NormalizedLandmarkList)
@ -117,7 +117,7 @@ node: {
# Calculates size of the image.
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:image"
input_stream: "IMAGE_CPU:image"
output_stream: "SIZE:image_size"
}

View File

@ -14,7 +14,7 @@
type: "PoseLandmarkFiltering"
# Whether to enable filtering. If unspecified, functions as not enabled. (bool)
# Whether to enable filtering. If unspecified, functions as enabled. (bool)
input_side_packet: "ENABLE:enable"
# Size of the image (width & height) where the landmarks are estimated from.
@ -37,6 +37,7 @@ node {
output_stream: "NORM_FILTERED_LANDMARKS:filtered_visibility"
options: {
[mediapipe.SwitchContainerOptions.ext] {
enable: true
contained_node: {
calculator: "VisibilitySmoothingCalculator"
options: {
@ -68,6 +69,7 @@ node {
output_stream: "NORM_FILTERED_LANDMARKS:filtered_landmarks"
options: {
[mediapipe.SwitchContainerOptions.ext] {
enable: true
contained_node: {
calculator: "LandmarksSmoothingCalculator"
options: {
@ -80,9 +82,16 @@ node {
calculator: "LandmarksSmoothingCalculator"
options: {
[mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
velocity_filter: {
window_size: 5
velocity_scale: 10.0
one_euro_filter {
# Min cutoff 0.1 results into ~ 0.02 alpha in landmark EMA filter
# when landmark is static.
min_cutoff: 0.1
# Beta 40.0 in combination with min_cutoff 0.1 results into ~0.8
# alpha in landmark EMA filter when landmark is moving fast.
beta: 40.0
# Derivative cutoff 1.0 results into ~0.17 alpha in landmark
# velocity EMA filter.
derivate_cutoff: 1.0
}
}
}
@ -93,29 +102,13 @@ node {
# Smoothes pose landmark visibilities to reduce jitter.
node {
calculator: "SwitchContainer"
input_side_packet: "ENABLE:enable"
calculator: "VisibilitySmoothingCalculator"
input_stream: "NORM_LANDMARKS:aux_landmarks"
output_stream: "NORM_FILTERED_LANDMARKS:filtered_aux_visibility"
options: {
[mediapipe.SwitchContainerOptions.ext] {
contained_node: {
calculator: "VisibilitySmoothingCalculator"
options: {
[mediapipe.VisibilitySmoothingCalculatorOptions.ext] {
no_filter: {}
}
}
}
contained_node: {
calculator: "VisibilitySmoothingCalculator"
options: {
[mediapipe.VisibilitySmoothingCalculatorOptions.ext] {
low_pass_filter {
alpha: 0.1
}
}
}
[mediapipe.VisibilitySmoothingCalculatorOptions.ext] {
low_pass_filter {
alpha: 0.1
}
}
}
@ -123,31 +116,26 @@ node {
# Smoothes auxiliary landmarks to reduce jitter.
node {
calculator: "SwitchContainer"
input_side_packet: "ENABLE:enable"
calculator: "LandmarksSmoothingCalculator"
input_stream: "NORM_LANDMARKS:filtered_aux_visibility"
input_stream: "IMAGE_SIZE:image_size"
output_stream: "NORM_FILTERED_LANDMARKS:filtered_aux_landmarks"
options: {
[mediapipe.SwitchContainerOptions.ext] {
contained_node: {
calculator: "LandmarksSmoothingCalculator"
options: {
[mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
no_filter: {}
}
}
}
contained_node: {
calculator: "LandmarksSmoothingCalculator"
options: {
[mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
velocity_filter: {
window_size: 5
velocity_scale: 10.0
}
}
}
[mediapipe.LandmarksSmoothingCalculatorOptions.ext] {
# Auxiliary landmarks are smoothed heavier than main landmarks to
# make ROI crop for pose landmarks prediction very stable when
# object is not moving but responsive enough in case of sudden
# movements.
one_euro_filter {
# Min cutoff 0.01 results into ~ 0.002 alpha in landmark EMA
# filter when landmark is static.
min_cutoff: 0.01
# Beta 1.0 in combination with min_cutoff 0.01 results into ~0.2
# alpha in landmark EMA filter when landmark is moving fast.
beta: 1.0
# Derivative cutoff 1.0 results into ~0.17 alpha in landmark
# velocity EMA filter.
derivate_cutoff: 1.0
}
}
}

View File

@ -29,12 +29,12 @@ type: "PoseLandmarkGpu"
input_stream: "IMAGE:image"
# Whether to filter landmarks across different input images to reduce jitter.
# If unspecified, functions as set to false. (bool)
# If unspecified, functions as set to true. (bool)
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# Pose landmarks within the given ROI. (NormalizedLandmarkList)

View File

@ -4,7 +4,7 @@ type: "PoseLandmarkModelLoader"
# Complexity of the pose landmark model: 0, 1 or 2. Landmark accuracy as well as
# inference latency generally go up with the model complexity. If unspecified,
# functions as set to 0. (int)
# functions as set to 1. (int)
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
# TF Lite model represented as a FlatBuffer.
@ -18,6 +18,7 @@ node {
output_side_packet: "PACKET:model_path"
options: {
[mediapipe.SwitchContainerOptions.ext] {
select: 1
contained_node: {
calculator: "ConstantSidePacketCalculator"
options: {

View File

@ -0,0 +1,37 @@
# Copyright 2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""MediaPipe Downloading utils."""
import os
import shutil
import urllib.request
_OSS_URL_PREFIX = 'https://github.com/google/mediapipe/raw/master/'
def download_oss_model(model_path: str):
"""Downloads the oss model from the MediaPipe GitHub repo if it doesn't exist in the package."""
mp_root_path = os.sep.join(os.path.abspath(__file__).split(os.sep)[:-4])
model_abspath = os.path.join(mp_root_path, model_path)
if os.path.exists(model_abspath):
return
model_url = _OSS_URL_PREFIX + model_path
print('Downloading model to ' + model_abspath)
with urllib.request.urlopen(model_url) as response, open(model_abspath,
'wb') as out_file:
if response.code != 200:
raise ConnectionError('Cannot download ' + model_path +
' from the MediaPipe Github repo.')
shutil.copyfileobj(response, out_file)

View File

@ -44,7 +44,7 @@ class HandLandmark(enum.IntEnum):
WRIST = 0
THUMB_CMC = 1
THUMB_MCP = 2
THUMB_DIP = 3
THUMB_IP = 3
THUMB_TIP = 4
INDEX_FINGER_MCP = 5
INDEX_FINGER_PIP = 6
@ -68,8 +68,8 @@ BINARYPB_FILE_PATH = 'mediapipe/modules/hand_landmark/hand_landmark_tracking_cpu
HAND_CONNECTIONS = frozenset([
(HandLandmark.WRIST, HandLandmark.THUMB_CMC),
(HandLandmark.THUMB_CMC, HandLandmark.THUMB_MCP),
(HandLandmark.THUMB_MCP, HandLandmark.THUMB_DIP),
(HandLandmark.THUMB_DIP, HandLandmark.THUMB_TIP),
(HandLandmark.THUMB_MCP, HandLandmark.THUMB_IP),
(HandLandmark.THUMB_IP, HandLandmark.THUMB_TIP),
(HandLandmark.WRIST, HandLandmark.INDEX_FINGER_MCP),
(HandLandmark.INDEX_FINGER_MCP, HandLandmark.INDEX_FINGER_PIP),
(HandLandmark.INDEX_FINGER_PIP, HandLandmark.INDEX_FINGER_DIP),

View File

@@ -1,4 +1,4 @@
# Copyright 2020 The MediaPipe Authors.
# Copyright 2020-2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -18,6 +18,8 @@ from typing import NamedTuple
import numpy as np
from mediapipe.calculators.core import constant_side_packet_calculator_pb2
# The following imports are needed because python pb2 silently discards
# unknown protobuf fields.
# pylint: disable=unused-import
from mediapipe.calculators.core import gate_calculator_pb2
from mediapipe.calculators.core import split_vector_calculator_pb2
@@ -32,9 +34,12 @@ from mediapipe.calculators.util import landmark_projection_calculator_pb2
from mediapipe.calculators.util import local_file_contents_calculator_pb2
from mediapipe.calculators.util import non_max_suppression_calculator_pb2
from mediapipe.calculators.util import rect_transformation_calculator_pb2
from mediapipe.framework.tool import switch_container_pb2
from mediapipe.modules.holistic_landmark.calculators import roi_tracking_calculator_pb2
# pylint: enable=unused-import
from mediapipe.python.solution_base import SolutionBase
from mediapipe.python.solutions import download_utils
# pylint: disable=unused-import
from mediapipe.python.solutions.face_mesh import FACE_CONNECTIONS
from mediapipe.python.solutions.hands import HAND_CONNECTIONS
@@ -46,6 +51,17 @@ from mediapipe.python.solutions.pose import PoseLandmark
BINARYPB_FILE_PATH = 'mediapipe/modules/holistic_landmark/holistic_landmark_cpu.binarypb'


def _download_oss_pose_landmark_model(model_complexity):
  """Downloads the pose landmark lite/heavy model from the MediaPipe Github repo if it doesn't exist in the package."""
  if model_complexity == 0:
    download_utils.download_oss_model(
        'mediapipe/modules/pose_landmark/pose_landmark_lite.tflite')
  elif model_complexity == 2:
    download_utils.download_oss_model(
        'mediapipe/modules/pose_landmark/pose_landmark_heavy.tflite')


class Holistic(SolutionBase):
  """MediaPipe Holistic.
@@ -81,6 +97,7 @@ class Holistic(SolutionBase):
        pose landmarks to be considered tracked successfully. See details in
        https://solutions.mediapipe.dev/holistic#min_tracking_confidence.
    """
    _download_oss_pose_landmark_model(model_complexity)
    super().__init__(
        binary_graph_path=BINARYPB_FILE_PATH,
        side_inputs={
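
With the download hook wired into the constructor, choosing a non-default complexity now pulls the matching pose model on first use; a hedged usage sketch, where the image path is a placeholder and OpenCV is only used for I/O:

```
import cv2
import mediapipe as mp

# model_complexity=2 makes _download_oss_pose_landmark_model fetch
# pose_landmark_heavy.tflite if it is not already in the package.
with mp.solutions.holistic.Holistic(model_complexity=2) as holistic:
  image = cv2.imread('example.jpg')  # placeholder input image
  results = holistic.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
  print(results.pose_landmarks)
```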

View File

@@ -1,4 +1,4 @@
# Copyright 2020 The MediaPipe Authors.
# Copyright 2020-2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -15,10 +15,7 @@
"""MediaPipe Objectron."""
import enum
import os
import shutil
from typing import List, Tuple, NamedTuple, Optional
import urllib.request
import attr
import numpy as np
@@ -48,6 +45,7 @@ from mediapipe.modules.objectron.calculators import frame_annotation_to_rect_cal
from mediapipe.modules.objectron.calculators import lift_2d_frame_annotation_to_3d_calculator_pb2
# pylint: enable=unused-import
from mediapipe.python.solution_base import SolutionBase
from mediapipe.python.solutions import download_utils
class BoxLandmark(enum.IntEnum):
@@ -92,23 +90,6 @@ BOX_CONNECTIONS = frozenset([
    (BoxLandmark.FRONT_BOTTOM_RIGHT, BoxLandmark.FRONT_TOP_RIGHT),
    (BoxLandmark.BACK_TOP_RIGHT, BoxLandmark.FRONT_TOP_RIGHT),
])

_OSS_URL_PREFIX = 'https://github.com/google/mediapipe/raw/master/'


def _download_oss_model(model_path: str):
  """Download the objectron oss model from GitHub if it doesn't exist in the package."""
  mp_root_path = os.sep.join(os.path.abspath(__file__).split(os.sep)[:-4])
  model_abspath = os.path.join(mp_root_path, model_path)
  if os.path.exists(model_abspath):
    return
  model_url = _OSS_URL_PREFIX + model_path
  with urllib.request.urlopen(model_url) as response, open(model_abspath,
                                                            'wb') as out_file:
    if response.code != 200:
      raise ConnectionError('Cannot download ' + model_path +
                            ' from the MediaPipe Github repo.')
    shutil.copyfileobj(response, out_file)


@attr.s(auto_attribs=True)
@@ -152,10 +133,19 @@ _MODEL_DICT = {
}


def _download_oss_objectron_models(objectron_model: str):
  """Downloads the objectron models from the MediaPipe Github repo if they don't exist in the package."""
  download_utils.download_oss_model(
      'mediapipe/modules/objectron/object_detection_ssd_mobilenetv2_oidv4_fp16.tflite'
  )
  download_utils.download_oss_model(objectron_model)


def get_model_by_name(name: str) -> ObjectronModel:
  if name not in _MODEL_DICT:
    raise ValueError(f'{name} is not a valid model name for Objectron.')
  _download_oss_model(_MODEL_DICT[name].model_path)
  _download_oss_objectron_models(_MODEL_DICT[name].model_path)
  return _MODEL_DICT[name]
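
Model lookup now funnels through the shared downloader: the category-independent 2D detector plus the per-category 3D model are fetched on demand. A small sketch, where 'Shoe' is assumed to be a key of _MODEL_DICT (the dictionary itself is outside this hunk):

```
from mediapipe.python.solutions import objectron

try:
  model = objectron.get_model_by_name('Shoe')  # assumed category name
  print(model.model_path)
except ValueError as err:
  # Raised for names that are not present in _MODEL_DICT.
  print(err)
```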

View File

@@ -1,4 +1,4 @@
# Copyright 2020 The MediaPipe Authors.
# Copyright 2020-2021 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -20,6 +20,8 @@ from typing import NamedTuple
import numpy as np
from mediapipe.calculators.core import constant_side_packet_calculator_pb2
# The following imports are needed because python pb2 silently discards
# unknown protobuf fields.
# pylint: disable=unused-import
from mediapipe.calculators.core import gate_calculator_pb2
from mediapipe.calculators.core import split_vector_calculator_pb2
@@ -37,8 +39,11 @@ from mediapipe.calculators.util import non_max_suppression_calculator_pb2
from mediapipe.calculators.util import rect_transformation_calculator_pb2
from mediapipe.calculators.util import thresholding_calculator_pb2
from mediapipe.calculators.util import visibility_smoothing_calculator_pb2
from mediapipe.framework.tool import switch_container_pb2
# pylint: enable=unused-import
from mediapipe.python.solution_base import SolutionBase
from mediapipe.python.solutions import download_utils
class PoseLandmark(enum.IntEnum):
@@ -117,6 +122,17 @@ POSE_CONNECTIONS = frozenset([
])


def _download_oss_pose_landmark_model(model_complexity):
  """Downloads the pose landmark lite/heavy model from the MediaPipe Github repo if it doesn't exist in the package."""
  if model_complexity == 0:
    download_utils.download_oss_model(
        'mediapipe/modules/pose_landmark/pose_landmark_lite.tflite')
  elif model_complexity == 2:
    download_utils.download_oss_model(
        'mediapipe/modules/pose_landmark/pose_landmark_heavy.tflite')


class Pose(SolutionBase):
  """MediaPipe Pose.
@@ -151,6 +167,7 @@ class Pose(SolutionBase):
        pose landmarks to be considered tracked successfully. See details in
        https://solutions.mediapipe.dev/pose#min_tracking_confidence.
    """
    _download_oss_pose_landmark_model(model_complexity)
    super().__init__(
        binary_graph_path=BINARYPB_FILE_PATH,
        side_inputs={
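
The standalone Pose solution gets the same pre-download hook; a sketch assuming a local test image, with complexity 0 selecting the lite model that is no longer shipped in the wheel:

```
import cv2
import mediapipe as mp

# model_complexity=0 fetches pose_landmark_lite.tflite on first use; the
# default complexity of 1 stays bundled with the package.
with mp.solutions.pose.Pose(static_image_mode=True, model_complexity=0) as pose:
  frame = cv2.imread('person.jpg')  # placeholder input image
  results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
  if results.pose_landmarks:
    print(results.pose_landmarks.landmark[mp.solutions.pose.PoseLandmark.NOSE])
```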

View File

@@ -21,7 +21,8 @@ OneEuroFilter::OneEuroFilter(double frequency, double min_cutoff, double beta,
  last_time_ = 0;
}

double OneEuroFilter::Apply(absl::Duration timestamp, double value) {
double OneEuroFilter::Apply(absl::Duration timestamp, double value_scale,
                            double value) {
  int64_t new_timestamp = absl::ToInt64Nanoseconds(timestamp);
  if (last_time_ >= new_timestamp) {
    // Results are unpredictable in this case, so nothing to do but
@@ -39,7 +40,7 @@ double OneEuroFilter::Apply(absl::Duration timestamp, double value) {
  // estimate the current variation per second
  double dvalue = x_->HasLastRawValue()
                      ? (value - x_->LastRawValue()) * frequency_
                      ? (value - x_->LastRawValue()) * value_scale * frequency_
                      : 0.0;  // FIXME: 0.0 or value?
  double edvalue = dx_->ApplyWithAlpha(dvalue, GetAlpha(derivate_cutoff_));

  // use it to update the cutoff frequency
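
The new value_scale argument multiplies the raw derivative before the speed-dependent cutoff is computed, so callers can express landmark velocity in a normalized unit (for example relative to ROI size) independent of the input coordinate range. Below is a compact Python sketch of the same idea with a fixed sampling frequency; it mirrors the structure of the C++ filter but is not the implementation itself:

```
import math

class OneEuroFilter:
  """Minimal one-euro filter sketch with a value_scale hook."""

  def __init__(self, frequency, min_cutoff, beta, derivate_cutoff):
    self.frequency = frequency
    self.min_cutoff = min_cutoff
    self.beta = beta
    self.derivate_cutoff = derivate_cutoff
    self.last_raw = None   # last raw input value
    self.x_hat = None      # filtered value
    self.dx_hat = 0.0      # filtered derivative

  def _alpha(self, cutoff):
    # alpha = 1 / (1 + tau / Te), with Te = 1 / frequency.
    tau = 1.0 / (2.0 * math.pi * cutoff)
    return 1.0 / (1.0 + tau * self.frequency)

  def apply(self, value, value_scale=1.0):
    # Per-second variation, rescaled into the caller's preferred unit.
    dvalue = 0.0 if self.last_raw is None else (
        (value - self.last_raw) * value_scale * self.frequency)
    a_d = self._alpha(self.derivate_cutoff)
    self.dx_hat = a_d * dvalue + (1.0 - a_d) * self.dx_hat
    # Faster motion -> larger cutoff -> less smoothing (less lag).
    cutoff = self.min_cutoff + self.beta * abs(self.dx_hat)
    a = self._alpha(cutoff)
    self.x_hat = value if self.x_hat is None else (
        a * value + (1.0 - a) * self.x_hat)
    self.last_raw = value
    return self.x_hat
```

A caller smoothing normalized landmarks could, for instance, pass value_scale as the inverse of the ROI diagonal so that beta acts on roughly unit-scaled velocities.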

View File

@@ -13,7 +13,7 @@ class OneEuroFilter {
  OneEuroFilter(double frequency, double min_cutoff, double beta,
                double derivate_cutoff);

  double Apply(absl::Duration timestamp, double value);
  double Apply(absl::Duration timestamp, double value_scale, double value);

 private:
  double GetAlpha(double cutoff);

View File

@@ -1,4 +1,4 @@
"""Copyright 2020 The MediaPipe Authors.
"""Copyright 2020-2021 The MediaPipe Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -436,9 +436,9 @@ setuptools.setup(
        'Operating System :: MacOS :: MacOS X',
        'Operating System :: Microsoft :: Windows',
        'Operating System :: POSIX :: Linux',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3 :: Only',
        'Topic :: Scientific/Engineering',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',