diff --git a/README.md b/README.md
index a82c88ab1..cb3d56de6 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,6 @@ title: Home
nav_order: 1
---
-![MediaPipe](https://mediapipe.dev/images/mediapipe_small.png)
-
----
**Attention:** *Thanks for your interest in MediaPipe! We have moved to
@@ -14,86 +12,111 @@ as the primary developer documentation site for MediaPipe as of April 3, 2023.*
*This notice and web page will be removed on June 1, 2023.*
-----
+![MediaPipe](https://developers.google.com/static/mediapipe/images/home/hero_01_1920.png)
-
-
-
+**Attention**: MediaPipe Solutions Preview is an early release. [Learn
+more](https://developers.google.com/mediapipe/solutions/about#notice).
---------------------------------------------------------------------------------
+**On-device machine learning for everyone**
-## Live ML anywhere
+Delight your customers with innovative machine learning features. MediaPipe
+contains everything that you need to customize and deploy to mobile (Android,
+iOS), web, desktop, edge devices, and IoT, effortlessly.
-[MediaPipe](https://google.github.io/mediapipe/) offers cross-platform, customizable
-ML solutions for live and streaming media.
+* [See demos](https://goo.gle/mediapipe-studio)
+* [Learn more](https://developers.google.com/mediapipe/solutions)
-![accelerated.png](https://mediapipe.dev/images/accelerated_small.png) | ![cross_platform.png](https://mediapipe.dev/images/cross_platform_small.png)
-:------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------:
-***End-to-End acceleration***: *Built-in fast ML inference and processing accelerated even on common hardware* | ***Build once, deploy anywhere***: *Unified solution works across Android, iOS, desktop/cloud, web and IoT*
-![ready_to_use.png](https://mediapipe.dev/images/ready_to_use_small.png) | ![open_source.png](https://mediapipe.dev/images/open_source_small.png)
-***Ready-to-use solutions***: *Cutting-edge ML solutions demonstrating full power of the framework* | ***Free and open source***: *Framework and solutions both under Apache 2.0, fully extensible and customizable*
+## Get started
-----
+You can get started with MediaPipe Solutions by checking out any of the
+developer guides for
+[vision](https://developers.google.com/mediapipe/solutions/vision/object_detector),
+[text](https://developers.google.com/mediapipe/solutions/text/text_classifier),
+and
+[audio](https://developers.google.com/mediapipe/solutions/audio/audio_classifier)
+tasks. If you need help setting up a development environment for use with
+MediaPipe Tasks, check out the setup guides for
+[Android](https://developers.google.com/mediapipe/solutions/setup_android), [web
+apps](https://developers.google.com/mediapipe/solutions/setup_web), and
+[Python](https://developers.google.com/mediapipe/solutions/setup_python).
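+
+For a quick feel of the Tasks API, here is a minimal, illustrative Python
+sketch (not an official snippet from the guides above) that runs the object
+detection task on a single image. The model file and image path are
+placeholders for assets you download or supply yourself:
+
+```python
+# Illustrative sketch: run the MediaPipe Tasks object detector on one image.
+# 'efficientdet_lite0.tflite' and 'image.jpg' are placeholder file names.
+import mediapipe as mp
+from mediapipe.tasks import python as mp_tasks
+from mediapipe.tasks.python import vision
+
+options = vision.ObjectDetectorOptions(
+    base_options=mp_tasks.BaseOptions(
+        model_asset_path='efficientdet_lite0.tflite'),
+    score_threshold=0.5,
+)
+with vision.ObjectDetector.create_from_options(options) as detector:
+    image = mp.Image.create_from_file('image.jpg')
+    result = detector.detect(image)
+    for detected_object in result.detections:
+        print(detected_object.categories[0].category_name,
+              detected_object.bounding_box)
+```
+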
-## ML solutions in MediaPipe
+## Solutions
-Face Detection | Face Mesh | Iris | Hands | Pose | Holistic
-:----------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------: | :------:
-[![face_detection](https://mediapipe.dev/images/mobile/face_detection_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/face_detection) | [![face_mesh](https://mediapipe.dev/images/mobile/face_mesh_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/face_mesh) | [![iris](https://mediapipe.dev/images/mobile/iris_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/iris) | [![hand](https://mediapipe.dev/images/mobile/hand_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/hands) | [![pose](https://mediapipe.dev/images/mobile/pose_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/pose) | [![hair_segmentation](https://mediapipe.dev/images/mobile/holistic_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/holistic)
+MediaPipe Solutions provides a suite of libraries and tools for you to quickly
+apply artificial intelligence (AI) and machine learning (ML) techniques in your
+applications. You can plug these solutions into your applications immediately,
+customize them to your needs, and use them across multiple development
+platforms. MediaPipe Solutions is part of the MediaPipe [open source
+project](https://github.com/google/mediapipe), so you can further customize the
+solutions code to meet your application needs.
-Hair Segmentation | Object Detection | Box Tracking | Instant Motion Tracking | Objectron | KNIFT
-:-------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------: | :---:
-[![hair_segmentation](https://mediapipe.dev/images/mobile/hair_segmentation_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/hair_segmentation) | [![object_detection](https://mediapipe.dev/images/mobile/object_detection_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/object_detection) | [![box_tracking](https://mediapipe.dev/images/mobile/object_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/box_tracking) | [![instant_motion_tracking](https://mediapipe.dev/images/mobile/instant_motion_tracking_android_small.gif)](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | [![objectron](https://mediapipe.dev/images/mobile/objectron_chair_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/objectron) | [![knift](https://mediapipe.dev/images/mobile/template_matching_android_cpu_small.gif)](https://google.github.io/mediapipe/solutions/knift)
+These libraries and resources provide the core functionality for each MediaPipe
+Solution:
-
-
+* **MediaPipe Tasks**: Cross-platform APIs and libraries for deploying
+ solutions. [Learn
+ more](https://developers.google.com/mediapipe/solutions/tasks).
+* **MediaPipe models**: Pre-trained, ready-to-run models for use with each
+ solution.
-[]() | [Android](https://google.github.io/mediapipe/getting_started/android) | [iOS](https://google.github.io/mediapipe/getting_started/ios) | [C++](https://google.github.io/mediapipe/getting_started/cpp) | [Python](https://google.github.io/mediapipe/getting_started/python) | [JS](https://google.github.io/mediapipe/getting_started/javascript) | [Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/README.md)
-:---------------------------------------------------------------------------------------- | :-------------------------------------------------------------: | :-----------------------------------------------------: | :-----------------------------------------------------: | :-----------------------------------------------------------: | :-----------------------------------------------------------: | :--------------------------------------------------------------------:
-[Face Detection](https://google.github.io/mediapipe/solutions/face_detection) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅
-[Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Iris](https://google.github.io/mediapipe/solutions/iris) | ✅ | ✅ | ✅ | | |
-[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
-[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
-[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
-[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
-[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
-[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
-[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
-[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
-[YouTube 8M](https://google.github.io/mediapipe/solutions/youtube_8m) | | | ✅ | | |
+These tools let you customize and evaluate solutions:
-See also
-[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
-for ML models released in MediaPipe.
+* **MediaPipe Model Maker**: Customize models for solutions with your data
+  (see the sketch after this list). [Learn
+  more](https://developers.google.com/mediapipe/solutions/model_maker).
+* **MediaPipe Studio**: Visualize, evaluate, and benchmark solutions in your
+ browser. [Learn
+ more](https://developers.google.com/mediapipe/solutions/studio).
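+
+For example, here is a rough, illustrative sketch of retraining an image
+classifier with Model Maker on a folder of labeled images. It assumes the
+separately installed `mediapipe-model-maker` package; the dataset path and
+export directory are placeholders:
+
+```python
+# Rough sketch: customize an image classifier with MediaPipe Model Maker.
+# 'flower_photos/' and 'exported_model' are placeholder paths.
+from mediapipe_model_maker import image_classifier
+
+data = image_classifier.Dataset.from_folder('flower_photos/')
+train_data, rest = data.split(0.8)
+validation_data, test_data = rest.split(0.5)
+
+options = image_classifier.ImageClassifierOptions(
+    supported_model=image_classifier.SupportedModels.MOBILENET_V2,
+    hparams=image_classifier.HParams(export_dir='exported_model'),
+)
+model = image_classifier.ImageClassifier.create(
+    train_data=train_data, validation_data=validation_data, options=options)
+loss, accuracy = model.evaluate(test_data)
+model.export_model()  # Writes a TFLite model with metadata to export_dir.
+```
+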
-## Getting started
+### Legacy solutions
-To start using MediaPipe
-[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
-lines code, see example code and demos in
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
-[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+We have ended support for [these MediaPipe Legacy Solutions](https://developers.google.com/mediapipe/solutions/guide#legacy)
+as of March 1, 2023. All other MediaPipe Legacy Solutions will be upgraded to
+a new MediaPipe Solution. See the [Solutions guide](https://developers.google.com/mediapipe/solutions/guide#legacy)
+for details. The [code repository](https://github.com/google/mediapipe/tree/master/mediapipe)
+and prebuilt binaries for all MediaPipe Legacy Solutions will continue to be
+provided on an as-is basis.
-To use MediaPipe in C++, Android and iOS, which allow further customization of
-the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
-building your own, learn how to
-[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
-start building example applications in
-[C++](https://google.github.io/mediapipe/getting_started/cpp),
-[Android](https://google.github.io/mediapipe/getting_started/android) and
-[iOS](https://google.github.io/mediapipe/getting_started/ios).
+For more on the legacy solutions, see the [documentation](https://github.com/google/mediapipe/tree/master/docs/solutions).
-The source code is hosted in the
-[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
-run code search using
-[Google Open Source Code Search](https://cs.opensource.google/mediapipe/mediapipe).
+## Framework
-## Publications
+To start using MediaPipe Framework, [install MediaPipe
+Framework](https://developers.google.com/mediapipe/framework/getting_started/install)
+and start building example applications in C++, Android, and iOS.
+
+[MediaPipe Framework](https://developers.google.com/mediapipe/framework) is the
+low-level component used to build efficient on-device machine learning
+pipelines, similar to the premade MediaPipe Solutions.
+
+Before using MediaPipe Framework, familiarize yourself with the following key
+[Framework
+concepts](https://developers.google.com/mediapipe/framework/framework_concepts/overview.md):
+
+* [Packets](https://developers.google.com/mediapipe/framework/framework_concepts/packets.md)
+* [Graphs](https://developers.google.com/mediapipe/framework/framework_concepts/graphs.md)
+* [Calculators](https://developers.google.com/mediapipe/framework/framework_concepts/calculators.md)
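+
+To see how these concepts fit together, here is a small, illustrative Python
+sketch (treat names and details as indicative rather than normative) that
+builds a one-node graph with a `PassThroughCalculator` and sends a single
+string packet through it:
+
+```python
+# Illustrative sketch: a one-node MediaPipe graph driven from Python.
+import mediapipe as mp
+
+graph_config = """
+  input_stream: 'in'
+  output_stream: 'out'
+  node {
+    calculator: 'PassThroughCalculator'
+    input_stream: 'in'
+    output_stream: 'out'
+  }
+"""
+
+graph = mp.CalculatorGraph(graph_config=graph_config)
+outputs = []
+graph.observe_output_stream(
+    'out',
+    lambda stream_name, packet: outputs.append(
+        mp.packet_getter.get_str(packet)))
+graph.start_run()
+graph.add_packet_to_input_stream(
+    'in', mp.packet_creator.create_string('hello world').at(0))
+graph.wait_until_idle()
+graph.close()
+print(outputs)  # ['hello world']
+```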
+
+## Community
+
+* [Slack community](https://mediapipe.page.link/joinslack) for MediaPipe
+ users.
+* [Discuss](https://groups.google.com/forum/#!forum/mediapipe) - General
+ community discussion around MediaPipe.
+* [Awesome MediaPipe](https://mediapipe.page.link/awesome-mediapipe) - A
+ curated list of awesome MediaPipe related frameworks, libraries and
+ software.
+
+## Contributing
+
+We welcome contributions. Please follow these
+[guidelines](https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md).
+
+We use GitHub issues for tracking requests and bugs. Please post questions to
+Stack Overflow with a `mediapipe` tag.
+
+## Resources
+
+### Publications
* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
in Google Developers Blog
@@ -102,7 +125,8 @@ run code search using
* [SignAll SDK: Sign language interface using MediaPipe is now available for
developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
in Google Developers Blog
-* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
+* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on
+ Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
in Google AI Blog
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
in Google AI Blog
@@ -130,43 +154,6 @@ run code search using
in Google AI Blog
* [MediaPipe: A Framework for Building Perception Pipelines](https://arxiv.org/abs/1906.08172)
-## Videos
+### Videos
* [YouTube Channel](https://www.youtube.com/c/MediaPipe)
-
-## Events
-
-* [MediaPipe Seattle Meetup, Google Building Waterside, 13 Feb 2020](https://mediapipe.page.link/seattle2020)
-* [AI Nextcon 2020, 12-16 Feb 2020, Seattle](http://aisea20.xnextcon.com/)
-* [MediaPipe Madrid Meetup, 16 Dec 2019](https://www.meetup.com/Madrid-AI-Developers-Group/events/266329088/)
-* [MediaPipe London Meetup, Google 123 Building, 12 Dec 2019](https://www.meetup.com/London-AI-Tech-Talk/events/266329038)
-* [ML Conference, Berlin, 11 Dec 2019](https://mlconference.ai/machine-learning-advanced-development/mediapipe-building-real-time-cross-platform-mobile-web-edge-desktop-video-audio-ml-pipelines/)
-* [MediaPipe Berlin Meetup, Google Berlin, 11 Dec 2019](https://www.meetup.com/Berlin-AI-Tech-Talk/events/266328794/)
-* [The 3rd Workshop on YouTube-8M Large Scale Video Understanding Workshop,
- Seoul, Korea ICCV
- 2019](https://research.google.com/youtube8m/workshop2019/index.html)
-* [AI DevWorld 2019, 10 Oct 2019, San Jose, CA](https://aidevworld.com)
-* [Google Industry Workshop at ICIP 2019, 24 Sept 2019, Taipei, Taiwan](http://2019.ieeeicip.org/?action=page4&id=14#Google)
- ([presentation](https://docs.google.com/presentation/d/e/2PACX-1vRIBBbO_LO9v2YmvbHHEt1cwyqH6EjDxiILjuT0foXy1E7g6uyh4CesB2DkkEwlRDO9_lWfuKMZx98T/pub?start=false&loop=false&delayms=3000&slide=id.g556cc1a659_0_5))
-* [Open sourced at CVPR 2019, 17~20 June, Long Beach, CA](https://sites.google.com/corp/view/perception-cv4arvr/mediapipe)
-
-## Community
-
-* [Awesome MediaPipe](https://mediapipe.page.link/awesome-mediapipe) - A
- curated list of awesome MediaPipe related frameworks, libraries and software
-* [Slack community](https://mediapipe.page.link/joinslack) for MediaPipe users
-* [Discuss](https://groups.google.com/forum/#!forum/mediapipe) - General
- community discussion around MediaPipe
-
-## Alpha disclaimer
-
-MediaPipe is currently in alpha at v0.7. We may be still making breaking API
-changes and expect to get to stable APIs by v1.0.
-
-## Contributing
-
-We welcome contributions. Please follow these
-[guidelines](https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md).
-
-We use GitHub issues for tracking requests and bugs. Please post questions to
-the MediaPipe Stack Overflow with a `mediapipe` tag.
diff --git a/WORKSPACE b/WORKSPACE
index 760898185..ee2506ed7 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -375,6 +375,18 @@ http_archive(
url = "https://github.com/opencv/opencv/releases/download/3.2.0/opencv-3.2.0-ios-framework.zip",
)
+# Building an opencv.xcframework from the OpenCV 4.5.1 sources is necessary for
+# the MediaPipe iOS Task Libraries to support arm64 (M1) Macs. A prebuilt
+# `opencv.xcframework` archive has not been released, so it is built from
+# source using a build script that OpenCV provides from version 4.5.0 onwards.
+http_archive(
+ name = "ios_opencv_source",
+ sha256 = "5fbc26ee09e148a4d494b225d04217f7c913ca1a4d46115b70cca3565d7bbe05",
+ build_file = "@//third_party:opencv_ios_source.BUILD",
+ type = "zip",
+ url = "https://github.com/opencv/opencv/archive/refs/tags/4.5.1.zip",
+)
+
http_archive(
name = "stblib",
strip_prefix = "stb-b42009b3b9d4ca35bc703f5310eedc74f584be58",
diff --git a/docs/index.md b/docs/index.md
index a82c88ab1..cb3d56de6 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -4,8 +4,6 @@ title: Home
nav_order: 1
---
-![MediaPipe](https://mediapipe.dev/images/mediapipe_small.png)
-
----
**Attention:** *Thanks for your interest in MediaPipe! We have moved to
@@ -14,86 +12,111 @@ as the primary developer documentation site for MediaPipe as of April 3, 2023.*
*This notice and web page will be removed on June 1, 2023.*
-----
+![MediaPipe](https://developers.google.com/static/mediapipe/images/home/hero_01_1920.png)
-
-
-
+**Attention**: MediaPipe Solutions Preview is an early release. [Learn
+more](https://developers.google.com/mediapipe/solutions/about#notice).
---------------------------------------------------------------------------------
+**On-device machine learning for everyone**
-## Live ML anywhere
+Delight your customers with innovative machine learning features. MediaPipe
+contains everything that you need to customize and deploy to mobile (Android,
+iOS), web, desktop, edge devices, and IoT, effortlessly.
-[MediaPipe](https://google.github.io/mediapipe/) offers cross-platform, customizable
-ML solutions for live and streaming media.
+* [See demos](https://goo.gle/mediapipe-studio)
+* [Learn more](https://developers.google.com/mediapipe/solutions)
-![accelerated.png](https://mediapipe.dev/images/accelerated_small.png) | ![cross_platform.png](https://mediapipe.dev/images/cross_platform_small.png)
-:------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------:
-***End-to-End acceleration***: *Built-in fast ML inference and processing accelerated even on common hardware* | ***Build once, deploy anywhere***: *Unified solution works across Android, iOS, desktop/cloud, web and IoT*
-![ready_to_use.png](https://mediapipe.dev/images/ready_to_use_small.png) | ![open_source.png](https://mediapipe.dev/images/open_source_small.png)
-***Ready-to-use solutions***: *Cutting-edge ML solutions demonstrating full power of the framework* | ***Free and open source***: *Framework and solutions both under Apache 2.0, fully extensible and customizable*
+## Get started
-----
+You can get started with MediaPipe Solutions by checking out any of the
+developer guides for
+[vision](https://developers.google.com/mediapipe/solutions/vision/object_detector),
+[text](https://developers.google.com/mediapipe/solutions/text/text_classifier),
+and
+[audio](https://developers.google.com/mediapipe/solutions/audio/audio_classifier)
+tasks. If you need help setting up a development environment for use with
+MediaPipe Tasks, check out the setup guides for
+[Android](https://developers.google.com/mediapipe/solutions/setup_android), [web
+apps](https://developers.google.com/mediapipe/solutions/setup_web), and
+[Python](https://developers.google.com/mediapipe/solutions/setup_python).
-## ML solutions in MediaPipe
+## Solutions
-Face Detection | Face Mesh | Iris | Hands | Pose | Holistic
-:----------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------: | :------:
-[![face_detection](https://mediapipe.dev/images/mobile/face_detection_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/face_detection) | [![face_mesh](https://mediapipe.dev/images/mobile/face_mesh_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/face_mesh) | [![iris](https://mediapipe.dev/images/mobile/iris_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/iris) | [![hand](https://mediapipe.dev/images/mobile/hand_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/hands) | [![pose](https://mediapipe.dev/images/mobile/pose_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/pose) | [![hair_segmentation](https://mediapipe.dev/images/mobile/holistic_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/holistic)
+MediaPipe Solutions provides a suite of libraries and tools for you to quickly
+apply artificial intelligence (AI) and machine learning (ML) techniques in your
+applications. You can plug these solutions into your applications immediately,
+customize them to your needs, and use them across multiple development
+platforms. MediaPipe Solutions is part of the MediaPipe [open source
+project](https://github.com/google/mediapipe), so you can further customize the
+solutions code to meet your application needs.
-Hair Segmentation | Object Detection | Box Tracking | Instant Motion Tracking | Objectron | KNIFT
-:-------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------: | :---:
-[![hair_segmentation](https://mediapipe.dev/images/mobile/hair_segmentation_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/hair_segmentation) | [![object_detection](https://mediapipe.dev/images/mobile/object_detection_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/object_detection) | [![box_tracking](https://mediapipe.dev/images/mobile/object_tracking_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/box_tracking) | [![instant_motion_tracking](https://mediapipe.dev/images/mobile/instant_motion_tracking_android_small.gif)](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | [![objectron](https://mediapipe.dev/images/mobile/objectron_chair_android_gpu_small.gif)](https://google.github.io/mediapipe/solutions/objectron) | [![knift](https://mediapipe.dev/images/mobile/template_matching_android_cpu_small.gif)](https://google.github.io/mediapipe/solutions/knift)
+These libraries and resources provide the core functionality for each MediaPipe
+Solution:
-
-
+* **MediaPipe Tasks**: Cross-platform APIs and libraries for deploying
+ solutions. [Learn
+ more](https://developers.google.com/mediapipe/solutions/tasks).
+* **MediaPipe models**: Pre-trained, ready-to-run models for use with each
+ solution.
-[]() | [Android](https://google.github.io/mediapipe/getting_started/android) | [iOS](https://google.github.io/mediapipe/getting_started/ios) | [C++](https://google.github.io/mediapipe/getting_started/cpp) | [Python](https://google.github.io/mediapipe/getting_started/python) | [JS](https://google.github.io/mediapipe/getting_started/javascript) | [Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/README.md)
-:---------------------------------------------------------------------------------------- | :-------------------------------------------------------------: | :-----------------------------------------------------: | :-----------------------------------------------------: | :-----------------------------------------------------------: | :-----------------------------------------------------------: | :--------------------------------------------------------------------:
-[Face Detection](https://google.github.io/mediapipe/solutions/face_detection) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅
-[Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Iris](https://google.github.io/mediapipe/solutions/iris) | ✅ | ✅ | ✅ | | |
-[Hands](https://google.github.io/mediapipe/solutions/hands) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Pose](https://google.github.io/mediapipe/solutions/pose) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Holistic](https://google.github.io/mediapipe/solutions/holistic) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Selfie Segmentation](https://google.github.io/mediapipe/solutions/selfie_segmentation) | ✅ | ✅ | ✅ | ✅ | ✅ |
-[Hair Segmentation](https://google.github.io/mediapipe/solutions/hair_segmentation) | ✅ | | ✅ | | |
-[Object Detection](https://google.github.io/mediapipe/solutions/object_detection) | ✅ | ✅ | ✅ | | | ✅
-[Box Tracking](https://google.github.io/mediapipe/solutions/box_tracking) | ✅ | ✅ | ✅ | | |
-[Instant Motion Tracking](https://google.github.io/mediapipe/solutions/instant_motion_tracking) | ✅ | | | | |
-[Objectron](https://google.github.io/mediapipe/solutions/objectron) | ✅ | | ✅ | ✅ | ✅ |
-[KNIFT](https://google.github.io/mediapipe/solutions/knift) | ✅ | | | | |
-[AutoFlip](https://google.github.io/mediapipe/solutions/autoflip) | | | ✅ | | |
-[MediaSequence](https://google.github.io/mediapipe/solutions/media_sequence) | | | ✅ | | |
-[YouTube 8M](https://google.github.io/mediapipe/solutions/youtube_8m) | | | ✅ | | |
+These tools let you customize and evaluate solutions:
-See also
-[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
-for ML models released in MediaPipe.
+* **MediaPipe Model Maker**: Customize models for solutions with your data.
+ [Learn more](https://developers.google.com/mediapipe/solutions/model_maker).
+* **MediaPipe Studio**: Visualize, evaluate, and benchmark solutions in your
+ browser. [Learn
+ more](https://developers.google.com/mediapipe/solutions/studio).
-## Getting started
+### Legacy solutions
-To start using MediaPipe
-[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
-lines code, see example code and demos in
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
-[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+We have ended support for [these MediaPipe Legacy Solutions](https://developers.google.com/mediapipe/solutions/guide#legacy)
+as of March 1, 2023. All other MediaPipe Legacy Solutions will be upgraded to
+a new MediaPipe Solution. See the [Solutions guide](https://developers.google.com/mediapipe/solutions/guide#legacy)
+for details. The [code repository](https://github.com/google/mediapipe/tree/master/mediapipe)
+and prebuilt binaries for all MediaPipe Legacy Solutions will continue to be
+provided on an as-is basis.
-To use MediaPipe in C++, Android and iOS, which allow further customization of
-the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
-building your own, learn how to
-[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
-start building example applications in
-[C++](https://google.github.io/mediapipe/getting_started/cpp),
-[Android](https://google.github.io/mediapipe/getting_started/android) and
-[iOS](https://google.github.io/mediapipe/getting_started/ios).
+For more on the legacy solutions, see the [documentation](https://github.com/google/mediapipe/tree/master/docs/solutions).
-The source code is hosted in the
-[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
-run code search using
-[Google Open Source Code Search](https://cs.opensource.google/mediapipe/mediapipe).
+## Framework
-## Publications
+To start using MediaPipe Framework, [install MediaPipe
+Framework](https://developers.google.com/mediapipe/framework/getting_started/install)
+and start building example applications in C++, Android, and iOS.
+
+[MediaPipe Framework](https://developers.google.com/mediapipe/framework) is the
+low-level component used to build efficient on-device machine learning
+pipelines, similar to the premade MediaPipe Solutions.
+
+Before using MediaPipe Framework, familiarize yourself with the following key
+[Framework
+concepts](https://developers.google.com/mediapipe/framework/framework_concepts/overview.md):
+
+* [Packets](https://developers.google.com/mediapipe/framework/framework_concepts/packets.md)
+* [Graphs](https://developers.google.com/mediapipe/framework/framework_concepts/graphs.md)
+* [Calculators](https://developers.google.com/mediapipe/framework/framework_concepts/calculators.md)
+
+## Community
+
+* [Slack community](https://mediapipe.page.link/joinslack) for MediaPipe
+ users.
+* [Discuss](https://groups.google.com/forum/#!forum/mediapipe) - General
+ community discussion around MediaPipe.
+* [Awesome MediaPipe](https://mediapipe.page.link/awesome-mediapipe) - A
+ curated list of awesome MediaPipe related frameworks, libraries and
+ software.
+
+## Contributing
+
+We welcome contributions. Please follow these
+[guidelines](https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md).
+
+We use GitHub issues for tracking requests and bugs. Please post questions to
+Stack Overflow with a `mediapipe` tag.
+
+## Resources
+
+### Publications
* [Bringing artworks to life with AR](https://developers.googleblog.com/2021/07/bringing-artworks-to-life-with-ar.html)
in Google Developers Blog
@@ -102,7 +125,8 @@ run code search using
* [SignAll SDK: Sign language interface using MediaPipe is now available for
developers](https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-available.html)
in Google Developers Blog
-* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
+* [MediaPipe Holistic - Simultaneous Face, Hand and Pose Prediction, on
+ Device](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html)
in Google AI Blog
* [Background Features in Google Meet, Powered by Web ML](https://ai.googleblog.com/2020/10/background-features-in-google-meet.html)
in Google AI Blog
@@ -130,43 +154,6 @@ run code search using
in Google AI Blog
* [MediaPipe: A Framework for Building Perception Pipelines](https://arxiv.org/abs/1906.08172)
-## Videos
+### Videos
* [YouTube Channel](https://www.youtube.com/c/MediaPipe)
-
-## Events
-
-* [MediaPipe Seattle Meetup, Google Building Waterside, 13 Feb 2020](https://mediapipe.page.link/seattle2020)
-* [AI Nextcon 2020, 12-16 Feb 2020, Seattle](http://aisea20.xnextcon.com/)
-* [MediaPipe Madrid Meetup, 16 Dec 2019](https://www.meetup.com/Madrid-AI-Developers-Group/events/266329088/)
-* [MediaPipe London Meetup, Google 123 Building, 12 Dec 2019](https://www.meetup.com/London-AI-Tech-Talk/events/266329038)
-* [ML Conference, Berlin, 11 Dec 2019](https://mlconference.ai/machine-learning-advanced-development/mediapipe-building-real-time-cross-platform-mobile-web-edge-desktop-video-audio-ml-pipelines/)
-* [MediaPipe Berlin Meetup, Google Berlin, 11 Dec 2019](https://www.meetup.com/Berlin-AI-Tech-Talk/events/266328794/)
-* [The 3rd Workshop on YouTube-8M Large Scale Video Understanding Workshop,
- Seoul, Korea ICCV
- 2019](https://research.google.com/youtube8m/workshop2019/index.html)
-* [AI DevWorld 2019, 10 Oct 2019, San Jose, CA](https://aidevworld.com)
-* [Google Industry Workshop at ICIP 2019, 24 Sept 2019, Taipei, Taiwan](http://2019.ieeeicip.org/?action=page4&id=14#Google)
- ([presentation](https://docs.google.com/presentation/d/e/2PACX-1vRIBBbO_LO9v2YmvbHHEt1cwyqH6EjDxiILjuT0foXy1E7g6uyh4CesB2DkkEwlRDO9_lWfuKMZx98T/pub?start=false&loop=false&delayms=3000&slide=id.g556cc1a659_0_5))
-* [Open sourced at CVPR 2019, 17~20 June, Long Beach, CA](https://sites.google.com/corp/view/perception-cv4arvr/mediapipe)
-
-## Community
-
-* [Awesome MediaPipe](https://mediapipe.page.link/awesome-mediapipe) - A
- curated list of awesome MediaPipe related frameworks, libraries and software
-* [Slack community](https://mediapipe.page.link/joinslack) for MediaPipe users
-* [Discuss](https://groups.google.com/forum/#!forum/mediapipe) - General
- community discussion around MediaPipe
-
-## Alpha disclaimer
-
-MediaPipe is currently in alpha at v0.7. We may be still making breaking API
-changes and expect to get to stable APIs by v1.0.
-
-## Contributing
-
-We welcome contributions. Please follow these
-[guidelines](https://github.com/google/mediapipe/blob/master/CONTRIBUTING.md).
-
-We use GitHub issues for tracking requests and bugs. Please post questions to
-the MediaPipe Stack Overflow with a `mediapipe` tag.
diff --git a/mediapipe/BUILD b/mediapipe/BUILD
index 3187c0cf7..fd0cbab36 100644
--- a/mediapipe/BUILD
+++ b/mediapipe/BUILD
@@ -141,6 +141,7 @@ config_setting(
"ios_armv7",
"ios_arm64",
"ios_arm64e",
+ "ios_sim_arm64",
]
]
diff --git a/mediapipe/framework/BUILD b/mediapipe/framework/BUILD
index ae788ed58..126261c90 100644
--- a/mediapipe/framework/BUILD
+++ b/mediapipe/framework/BUILD
@@ -33,7 +33,9 @@ bzl_library(
srcs = [
"transitive_protos.bzl",
],
- visibility = ["//mediapipe/framework:__subpackages__"],
+ visibility = [
+ "//mediapipe/framework:__subpackages__",
+ ],
)
bzl_library(
diff --git a/mediapipe/framework/calculator_options.proto b/mediapipe/framework/calculator_options.proto
index 747e9c4af..3bc9f6615 100644
--- a/mediapipe/framework/calculator_options.proto
+++ b/mediapipe/framework/calculator_options.proto
@@ -23,15 +23,13 @@ package mediapipe;
option java_package = "com.google.mediapipe.proto";
option java_outer_classname = "CalculatorOptionsProto";
-// Options for Calculators. Each Calculator implementation should
-// have its own options proto, which should look like this:
+// Options for Calculators, DEPRECATED. New calculators are encouraged to use
+// proto3 syntax options:
//
// message MyCalculatorOptions {
-// extend CalculatorOptions {
-// optional MyCalculatorOptions ext = ;
-// }
-// optional string field_needed_by_my_calculator = 1;
-// optional int32 another_field = 2;
+// // proto3 does not expect "optional"
+// string field_needed_by_my_calculator = 1;
+// int32 another_field = 2;
// // etc
// }
message CalculatorOptions {
diff --git a/mediapipe/framework/profiler/testing/BUILD b/mediapipe/framework/profiler/testing/BUILD
index 0b0d256e5..67668ef7d 100644
--- a/mediapipe/framework/profiler/testing/BUILD
+++ b/mediapipe/framework/profiler/testing/BUILD
@@ -15,9 +15,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe/framework:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe/framework:__subpackages__"])
cc_library(
name = "simple_calculator",
diff --git a/mediapipe/framework/tool/template_parser.cc b/mediapipe/framework/tool/template_parser.cc
index f012ac418..ad799c34f 100644
--- a/mediapipe/framework/tool/template_parser.cc
+++ b/mediapipe/framework/tool/template_parser.cc
@@ -974,7 +974,7 @@ class TemplateParser::Parser::ParserImpl {
}
// Consumes an identifier and saves its value in the identifier parameter.
- // Returns false if the token is not of type IDENTFIER.
+ // Returns false if the token is not of type IDENTIFIER.
bool ConsumeIdentifier(std::string* identifier) {
if (LookingAtType(io::Tokenizer::TYPE_IDENTIFIER)) {
*identifier = tokenizer_.current().text;
@@ -1672,7 +1672,9 @@ class TemplateParser::Parser::MediaPipeParserImpl
if (field_type == ProtoUtilLite::FieldType::TYPE_MESSAGE) {
*args = {""};
} else {
- MEDIAPIPE_CHECK_OK(ProtoUtilLite::Serialize({"1"}, field_type, args));
+ constexpr char kPlaceholderValue[] = "1";
+ MEDIAPIPE_CHECK_OK(
+ ProtoUtilLite::Serialize({kPlaceholderValue}, field_type, args));
}
}
diff --git a/mediapipe/model_maker/models/gesture_recognizer/BUILD b/mediapipe/model_maker/models/gesture_recognizer/BUILD
index 947508f1b..c57d7a2c9 100644
--- a/mediapipe/model_maker/models/gesture_recognizer/BUILD
+++ b/mediapipe/model_maker/models/gesture_recognizer/BUILD
@@ -19,9 +19,7 @@ load(
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe/model_maker/python/vision/gesture_recognizer:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe/model_maker/python/vision/gesture_recognizer:__subpackages__"])
mediapipe_files(
srcs = [
diff --git a/mediapipe/model_maker/models/text_classifier/BUILD b/mediapipe/model_maker/models/text_classifier/BUILD
index d9d55048d..460d6cfd1 100644
--- a/mediapipe/model_maker/models/text_classifier/BUILD
+++ b/mediapipe/model_maker/models/text_classifier/BUILD
@@ -19,9 +19,7 @@ load(
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe/model_maker/python/text/text_classifier:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe/model_maker/python/text/text_classifier:__subpackages__"])
mediapipe_files(
srcs = [
diff --git a/mediapipe/model_maker/python/core/BUILD b/mediapipe/model_maker/python/core/BUILD
index 6331e638e..0ed20a2fe 100644
--- a/mediapipe/model_maker/python/core/BUILD
+++ b/mediapipe/model_maker/python/core/BUILD
@@ -14,9 +14,7 @@
# Placeholder for internal Python strict library and test compatibility macro.
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
licenses(["notice"])
diff --git a/mediapipe/model_maker/python/core/data/BUILD b/mediapipe/model_maker/python/core/data/BUILD
index cc0381f60..1c2fb7a44 100644
--- a/mediapipe/model_maker/python/core/data/BUILD
+++ b/mediapipe/model_maker/python/core/data/BUILD
@@ -17,9 +17,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
py_library(
name = "data_util",
diff --git a/mediapipe/model_maker/python/core/tasks/BUILD b/mediapipe/model_maker/python/core/tasks/BUILD
index 6a3e60c97..818d78feb 100644
--- a/mediapipe/model_maker/python/core/tasks/BUILD
+++ b/mediapipe/model_maker/python/core/tasks/BUILD
@@ -15,9 +15,7 @@
# Placeholder for internal Python strict library and test compatibility macro.
# Placeholder for internal Python strict test compatibility macro.
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
licenses(["notice"])
diff --git a/mediapipe/model_maker/python/core/utils/BUILD b/mediapipe/model_maker/python/core/utils/BUILD
index e86cbb1e3..ef9cab290 100644
--- a/mediapipe/model_maker/python/core/utils/BUILD
+++ b/mediapipe/model_maker/python/core/utils/BUILD
@@ -17,9 +17,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
py_library(
name = "test_util",
diff --git a/mediapipe/model_maker/python/text/core/BUILD b/mediapipe/model_maker/python/text/core/BUILD
index 3ba4e8e6e..d99f46b77 100644
--- a/mediapipe/model_maker/python/text/core/BUILD
+++ b/mediapipe/model_maker/python/text/core/BUILD
@@ -14,9 +14,7 @@
# Placeholder for internal Python strict library and test compatibility macro.
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
licenses(["notice"])
diff --git a/mediapipe/model_maker/python/text/text_classifier/BUILD b/mediapipe/model_maker/python/text/text_classifier/BUILD
index 1ae3e2873..9fe96849b 100644
--- a/mediapipe/model_maker/python/text/text_classifier/BUILD
+++ b/mediapipe/model_maker/python/text/text_classifier/BUILD
@@ -15,9 +15,7 @@
# Placeholder for internal Python strict library and test compatibility macro.
# Placeholder for internal Python strict test compatibility macro.
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
licenses(["notice"])
diff --git a/mediapipe/model_maker/python/vision/BUILD b/mediapipe/model_maker/python/vision/BUILD
index 4410d859f..b7d0d13a6 100644
--- a/mediapipe/model_maker/python/vision/BUILD
+++ b/mediapipe/model_maker/python/vision/BUILD
@@ -12,8 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
licenses(["notice"])
diff --git a/mediapipe/model_maker/python/vision/gesture_recognizer/BUILD b/mediapipe/model_maker/python/vision/gesture_recognizer/BUILD
index 578723fb0..27f8934b3 100644
--- a/mediapipe/model_maker/python/vision/gesture_recognizer/BUILD
+++ b/mediapipe/model_maker/python/vision/gesture_recognizer/BUILD
@@ -17,9 +17,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
# TODO: Remove the unnecessary test data once the demo data are moved to an open-sourced
# directory.
diff --git a/mediapipe/model_maker/python/vision/image_classifier/BUILD b/mediapipe/model_maker/python/vision/image_classifier/BUILD
index 3b6d7551a..73d1d2f7c 100644
--- a/mediapipe/model_maker/python/vision/image_classifier/BUILD
+++ b/mediapipe/model_maker/python/vision/image_classifier/BUILD
@@ -17,9 +17,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
######################################################################
# Public target of the MediaPipe Model Maker ImageClassifier APIs.
diff --git a/mediapipe/model_maker/python/vision/object_detector/BUILD b/mediapipe/model_maker/python/vision/object_detector/BUILD
index f9c3f00fc..75c08dbc8 100644
--- a/mediapipe/model_maker/python/vision/object_detector/BUILD
+++ b/mediapipe/model_maker/python/vision/object_detector/BUILD
@@ -17,9 +17,7 @@
licenses(["notice"])
-package(
- default_visibility = ["//mediapipe:__subpackages__"],
-)
+package(default_visibility = ["//mediapipe:__subpackages__"])
py_library(
name = "object_detector_import",
@@ -88,6 +86,17 @@ py_test(
],
)
+py_library(
+ name = "detection",
+ srcs = ["detection.py"],
+)
+
+py_test(
+ name = "detection_test",
+ srcs = ["detection_test.py"],
+ deps = [":detection"],
+)
+
py_library(
name = "hyperparameters",
srcs = ["hyperparameters.py"],
@@ -116,6 +125,7 @@ py_library(
name = "model",
srcs = ["model.py"],
deps = [
+ ":detection",
":model_options",
":model_spec",
],
@@ -163,6 +173,7 @@ py_library(
"//mediapipe/model_maker/python/core/tasks:classifier",
"//mediapipe/model_maker/python/core/utils:model_util",
"//mediapipe/model_maker/python/core/utils:quantization",
+ "//mediapipe/tasks/python/metadata/metadata_writers:metadata_info",
"//mediapipe/tasks/python/metadata/metadata_writers:metadata_writer",
"//mediapipe/tasks/python/metadata/metadata_writers:object_detector",
],
diff --git a/mediapipe/model_maker/python/vision/object_detector/__init__.py b/mediapipe/model_maker/python/vision/object_detector/__init__.py
index 4670b343c..3e0a62bf8 100644
--- a/mediapipe/model_maker/python/vision/object_detector/__init__.py
+++ b/mediapipe/model_maker/python/vision/object_detector/__init__.py
@@ -32,6 +32,7 @@ ObjectDetectorOptions = object_detector_options.ObjectDetectorOptions
# Remove duplicated and non-public API
del dataset
del dataset_util # pylint: disable=undefined-variable
+del detection # pylint: disable=undefined-variable
del hyperparameters
del model # pylint: disable=undefined-variable
del model_options
diff --git a/mediapipe/model_maker/python/vision/object_detector/dataset.py b/mediapipe/model_maker/python/vision/object_detector/dataset.py
index 6899d8612..c18a071b2 100644
--- a/mediapipe/model_maker/python/vision/object_detector/dataset.py
+++ b/mediapipe/model_maker/python/vision/object_detector/dataset.py
@@ -106,7 +106,7 @@ class Dataset(classification_dataset.ClassificationDataset):
...
Each .xml annotation file should have the following format:
- file0.jpg
+ file0.jpg
diff --git a/mediapipe/model_maker/python/vision/object_detector/detection.py b/mediapipe/model_maker/python/vision/object_detector/detection.py
new file mode 100644
index 000000000..769189b24
--- /dev/null
+++ b/mediapipe/model_maker/python/vision/object_detector/detection.py
@@ -0,0 +1,34 @@
+# Copyright 2023 The MediaPipe Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Custom Detection export module for Object Detection."""
+
+from typing import Any, Mapping
+
+from official.vision.serving import detection
+
+
+class DetectionModule(detection.DetectionModule):
+ """A serving detection module for exporting the model.
+
+ This module overrides the tensorflow_models DetectionModule by only outputting
+ the pre-nms detection_boxes and detection_scores.
+ """
+
+ def serve(self, images) -> Mapping[str, Any]:
+ result = super().serve(images)
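+    # Keep only detection_boxes and detection_scores from the parent's outputs.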
+ final_outputs = {
+ 'detection_boxes': result['detection_boxes'],
+ 'detection_scores': result['detection_scores'],
+ }
+ return final_outputs
diff --git a/mediapipe/model_maker/python/vision/object_detector/detection_test.py b/mediapipe/model_maker/python/vision/object_detector/detection_test.py
new file mode 100644
index 000000000..34f16c21c
--- /dev/null
+++ b/mediapipe/model_maker/python/vision/object_detector/detection_test.py
@@ -0,0 +1,73 @@
+# Copyright 2023 The MediaPipe Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the 'License');
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an 'AS IS' BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from unittest import mock
+import tensorflow as tf
+
+from mediapipe.model_maker.python.vision.object_detector import detection
+from official.core import config_definitions as cfg
+from official.vision import configs
+from official.vision.serving import detection as detection_module
+
+
+class ObjectDetectorTest(tf.test.TestCase):
+
+ @mock.patch.object(detection_module.DetectionModule, 'serve', autospec=True)
+ def test_detection_module(self, mock_serve):
+ mock_serve.return_value = {
+ 'detection_boxes': 1,
+ 'detection_scores': 2,
+ 'detection_classes': 3,
+ 'num_detections': 4,
+ }
+ model_config = configs.retinanet.RetinaNet(
+ min_level=3,
+ max_level=7,
+ num_classes=10,
+ input_size=[256, 256, 3],
+ anchor=configs.retinanet.Anchor(
+ num_scales=3, aspect_ratios=[0.5, 1.0, 2.0], anchor_size=3
+ ),
+ backbone=configs.backbones.Backbone(
+ type='mobilenet', mobilenet=configs.backbones.MobileNet()
+ ),
+ decoder=configs.decoders.Decoder(
+ type='fpn',
+ fpn=configs.decoders.FPN(
+ num_filters=128, use_separable_conv=True, use_keras_layer=True
+ ),
+ ),
+ head=configs.retinanet.RetinaNetHead(
+ num_filters=128, use_separable_conv=True
+ ),
+ detection_generator=configs.retinanet.DetectionGenerator(),
+ norm_activation=configs.common.NormActivation(activation='relu6'),
+ )
+ task_config = configs.retinanet.RetinaNetTask(model=model_config)
+ params = cfg.ExperimentConfig(
+ task=task_config,
+ )
+ detection_instance = detection.DetectionModule(
+ params=params, batch_size=1, input_image_size=[256, 256]
+ )
+ outputs = detection_instance.serve(0)
+ expected_outputs = {
+ 'detection_boxes': 1,
+ 'detection_scores': 2,
+ }
+ self.assertAllEqual(outputs, expected_outputs)
+
+
+if __name__ == '__main__':
+ tf.test.main()
diff --git a/mediapipe/model_maker/python/vision/object_detector/hyperparameters.py b/mediapipe/model_maker/python/vision/object_detector/hyperparameters.py
index 1bc7514f2..35fb630ae 100644
--- a/mediapipe/model_maker/python/vision/object_detector/hyperparameters.py
+++ b/mediapipe/model_maker/python/vision/object_detector/hyperparameters.py
@@ -27,8 +27,6 @@ class HParams(hp.BaseHParams):
learning_rate: Learning rate to use for gradient descent training.
batch_size: Batch size for training.
epochs: Number of training iterations over the dataset.
- do_fine_tuning: If true, the base module is trained together with the
- classification layer on top.
cosine_decay_epochs: The number of epochs for cosine decay learning rate.
See
https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/schedules/CosineDecay
@@ -39,13 +37,13 @@ class HParams(hp.BaseHParams):
"""
# Parameters from BaseHParams class.
- learning_rate: float = 0.003
- batch_size: int = 32
- epochs: int = 10
+ learning_rate: float = 0.3
+ batch_size: int = 8
+ epochs: int = 30
# Parameters for cosine learning rate decay
cosine_decay_epochs: Optional[int] = None
- cosine_decay_alpha: float = 0.0
+ cosine_decay_alpha: float = 1.0
@dataclasses.dataclass
@@ -67,8 +65,8 @@ class QATHParams:
for more information.
"""
- learning_rate: float = 0.03
- batch_size: int = 32
- epochs: int = 10
- decay_steps: int = 231
+ learning_rate: float = 0.3
+ batch_size: int = 8
+ epochs: int = 15
+ decay_steps: int = 8
decay_rate: float = 0.96
diff --git a/mediapipe/model_maker/python/vision/object_detector/model.py b/mediapipe/model_maker/python/vision/object_detector/model.py
index e3eb3a651..70e63d5b5 100644
--- a/mediapipe/model_maker/python/vision/object_detector/model.py
+++ b/mediapipe/model_maker/python/vision/object_detector/model.py
@@ -18,6 +18,7 @@ from typing import Mapping, Optional, Sequence, Union
import tensorflow as tf
+from mediapipe.model_maker.python.vision.object_detector import detection
from mediapipe.model_maker.python.vision.object_detector import model_options as model_opt
from mediapipe.model_maker.python.vision.object_detector import model_spec as ms
from official.core import config_definitions as cfg
@@ -29,7 +30,6 @@ from official.vision.losses import loss_utils
from official.vision.modeling import factory
from official.vision.modeling import retinanet_model
from official.vision.modeling.layers import detection_generator
-from official.vision.serving import detection
class ObjectDetectorModel(tf.keras.Model):
@@ -199,6 +199,7 @@ class ObjectDetectorModel(tf.keras.Model):
max_detections=10,
max_classes_per_detection=1,
normalize_anchor_coordinates=True,
+ omit_nms=True,
),
)
tflite_post_processing_config = (
diff --git a/mediapipe/model_maker/python/vision/object_detector/object_detector.py b/mediapipe/model_maker/python/vision/object_detector/object_detector.py
index 746eef1b3..486c3ffa9 100644
--- a/mediapipe/model_maker/python/vision/object_detector/object_detector.py
+++ b/mediapipe/model_maker/python/vision/object_detector/object_detector.py
@@ -28,6 +28,7 @@ from mediapipe.model_maker.python.vision.object_detector import model_options as
from mediapipe.model_maker.python.vision.object_detector import model_spec as ms
from mediapipe.model_maker.python.vision.object_detector import object_detector_options
from mediapipe.model_maker.python.vision.object_detector import preprocessor
+from mediapipe.tasks.python.metadata.metadata_writers import metadata_info
from mediapipe.tasks.python.metadata.metadata_writers import metadata_writer
from mediapipe.tasks.python.metadata.metadata_writers import object_detector as object_detector_writer
from official.vision.evaluation import coco_evaluator
@@ -264,6 +265,27 @@ class ObjectDetector(classifier.Classifier):
coco_metrics = coco_eval.result()
return losses, coco_metrics
+ def _create_fixed_anchor(
+ self, anchor_box: List[float]
+ ) -> object_detector_writer.FixedAnchor:
+ """Helper function to create FixedAnchor objects from an anchor box array.
+
+ Args:
+      anchor_box: List of anchor box coordinates in the format of [y_min, x_min,
+        y_max, x_max].
+
+ Returns:
+ A FixedAnchor object representing the anchor_box.
+ """
+ image_shape = self._model_spec.input_image_shape[:2]
+ y_center_norm = (anchor_box[0] + anchor_box[2]) / (2 * image_shape[0])
+ x_center_norm = (anchor_box[1] + anchor_box[3]) / (2 * image_shape[1])
+ height_norm = (anchor_box[2] - anchor_box[0]) / image_shape[0]
+ width_norm = (anchor_box[3] - anchor_box[1]) / image_shape[1]
+ return object_detector_writer.FixedAnchor(
+ x_center_norm, y_center_norm, width_norm, height_norm
+ )
+
def export_model(
self,
model_name: str = 'model.tflite',
@@ -328,11 +350,40 @@ class ObjectDetector(classifier.Classifier):
converter.target_spec.supported_ops = (tf.lite.OpsSet.TFLITE_BUILTINS,)
tflite_model = converter.convert()
- writer = object_detector_writer.MetadataWriter.create_for_models_with_nms(
+ # Build anchors
+ raw_anchor_boxes = self._preprocessor.anchor_boxes
+ anchors = []
+ for _, anchor_boxes in raw_anchor_boxes.items():
+ anchor_boxes_reshaped = anchor_boxes.numpy().reshape((-1, 4))
+ for ab in anchor_boxes_reshaped:
+ anchors.append(self._create_fixed_anchor(ab))
+
+ ssd_anchors_options = object_detector_writer.SsdAnchorsOptions(
+ object_detector_writer.FixedAnchorsSchema(anchors)
+ )
+
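+    # Options for decoding the raw detection tensors of the NMS-free model.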
+ tensor_decoding_options = object_detector_writer.TensorsDecodingOptions(
+ num_classes=self._num_classes,
+ num_boxes=len(anchors),
+ num_coords=4,
+ keypoint_coord_offset=0,
+ num_keypoints=0,
+ num_values_per_keypoint=2,
+ x_scale=1,
+ y_scale=1,
+ w_scale=1,
+ h_scale=1,
+ apply_exponential_on_box_size=True,
+ sigmoid_score=False,
+ )
+ writer = object_detector_writer.MetadataWriter.create_for_models_without_nms(
tflite_model,
self._model_spec.mean_rgb,
self._model_spec.stddev_rgb,
labels=metadata_writer.Labels().add(list(self._label_names)),
+ ssd_anchors_options=ssd_anchors_options,
+ tensors_decoding_options=tensor_decoding_options,
+ output_tensors_order=metadata_info.RawDetectionOutputTensorsOrder.LOCATION_SCORE,
)
tflite_model_with_metadata, metadata_json = writer.populate()
model_util.save_tflite(tflite_model_with_metadata, tflite_file)
diff --git a/mediapipe/model_maker/python/vision/object_detector/preprocessor.py b/mediapipe/model_maker/python/vision/object_detector/preprocessor.py
index b4e08f997..ebea6a07b 100644
--- a/mediapipe/model_maker/python/vision/object_detector/preprocessor.py
+++ b/mediapipe/model_maker/python/vision/object_detector/preprocessor.py
@@ -44,6 +44,26 @@ class Preprocessor(object):
self._aug_scale_max = 2.0
self._max_num_instances = 100
+ self._padded_size = preprocess_ops.compute_padded_size(
+ self._output_size, 2**self._max_level
+ )
+
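+    # Generate anchors once here rather than per example in __call__.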
+ input_anchor = anchor.build_anchor_generator(
+ min_level=self._min_level,
+ max_level=self._max_level,
+ num_scales=self._num_scales,
+ aspect_ratios=self._aspect_ratios,
+ anchor_size=self._anchor_size,
+ )
+ self._anchor_boxes = input_anchor(image_size=self._output_size)
+ self._anchor_labeler = anchor.AnchorLabeler(
+ self._match_threshold, self._unmatched_threshold
+ )
+
+ @property
+ def anchor_boxes(self):
+ return self._anchor_boxes
+
def __call__(
self, data: Mapping[str, Any], is_training: bool = True
) -> Tuple[tf.Tensor, Mapping[str, Any]]:
@@ -90,13 +110,10 @@ class Preprocessor(object):
image, image_info = preprocess_ops.resize_and_crop_image(
image,
self._output_size,
- padded_size=preprocess_ops.compute_padded_size(
- self._output_size, 2**self._max_level
- ),
+ padded_size=self._padded_size,
aug_scale_min=(self._aug_scale_min if is_training else 1.0),
aug_scale_max=(self._aug_scale_max if is_training else 1.0),
)
- image_height, image_width, _ = image.get_shape().as_list()
# Resize and crop boxes.
image_scale = image_info[2, :]
@@ -110,20 +127,9 @@ class Preprocessor(object):
classes = tf.gather(classes, indices)
# Assign anchors.
- input_anchor = anchor.build_anchor_generator(
- min_level=self._min_level,
- max_level=self._max_level,
- num_scales=self._num_scales,
- aspect_ratios=self._aspect_ratios,
- anchor_size=self._anchor_size,
- )
- anchor_boxes = input_anchor(image_size=(image_height, image_width))
- anchor_labeler = anchor.AnchorLabeler(
- self._match_threshold, self._unmatched_threshold
- )
(cls_targets, box_targets, _, cls_weights, box_weights) = (
- anchor_labeler.label_anchors(
- anchor_boxes, boxes, tf.expand_dims(classes, axis=1)
+ self._anchor_labeler.label_anchors(
+ self.anchor_boxes, boxes, tf.expand_dims(classes, axis=1)
)
)
@@ -134,7 +140,7 @@ class Preprocessor(object):
labels = {
'cls_targets': cls_targets,
'box_targets': box_targets,
- 'anchor_boxes': anchor_boxes,
+ 'anchor_boxes': self.anchor_boxes,
'cls_weights': cls_weights,
'box_weights': box_weights,
'image_info': image_info,
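The preprocessor change above moves anchor generation and labeling into `__init__`, so the anchors are built once per configuration and exposed through an `anchor_boxes` property instead of being regenerated on every call. The sketch below illustrates that caching pattern only; the class and callable names are hypothetical stand-ins, not the real MediaPipe `Preprocessor` API.

```python
# Illustrative caching pattern: anchors depend only on construction-time
# configuration, so compute them once and reuse them in every __call__.
class CachingPreprocessor:
    def __init__(self, output_size, build_anchors):
        self._output_size = output_size
        # Computed once; reused by __call__ and by the metadata writer later.
        self._anchor_boxes = build_anchors(image_size=output_size)

    @property
    def anchor_boxes(self):
        return self._anchor_boxes

    def __call__(self, data):
        # Label targets against the cached anchors instead of regenerating them.
        return {'anchor_boxes': self._anchor_boxes, **data}


preproc = CachingPreprocessor((256, 256), build_anchors=lambda image_size: [image_size])
print(preproc.anchor_boxes)
```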
diff --git a/mediapipe/tasks/cc/vision/face_stylizer/face_stylizer_graph.cc b/mediapipe/tasks/cc/vision/face_stylizer/face_stylizer_graph.cc
index cb49ef59d..d7265a146 100644
--- a/mediapipe/tasks/cc/vision/face_stylizer/face_stylizer_graph.cc
+++ b/mediapipe/tasks/cc/vision/face_stylizer/face_stylizer_graph.cc
@@ -361,9 +361,10 @@ class FaceStylizerGraph : public core::ModelTaskGraph {
auto& tensors_to_image =
graph.AddNode("mediapipe.tasks.TensorsToImageCalculator");
- ConfigureTensorsToImageCalculator(
- image_to_tensor_options,
- &tensors_to_image.GetOptions());
+ auto& tensors_to_image_options =
+ tensors_to_image.GetOptions();
+ tensors_to_image_options.mutable_input_tensor_float_range()->set_min(-1);
+ tensors_to_image_options.mutable_input_tensor_float_range()->set_max(1);
face_alignment_image >> tensors_to_image.In(kTensorsTag);
face_alignment = tensors_to_image.Out(kImageTag).Cast();
diff --git a/mediapipe/tasks/cc/vision/image_segmenter/BUILD b/mediapipe/tasks/cc/vision/image_segmenter/BUILD
index 183b1bb86..fc977c0b5 100644
--- a/mediapipe/tasks/cc/vision/image_segmenter/BUILD
+++ b/mediapipe/tasks/cc/vision/image_segmenter/BUILD
@@ -63,6 +63,8 @@ cc_library(
"//mediapipe/calculators/image:image_properties_calculator",
"//mediapipe/calculators/image:image_transformation_calculator",
"//mediapipe/calculators/image:image_transformation_calculator_cc_proto",
+ "//mediapipe/calculators/image:set_alpha_calculator",
+ "//mediapipe/calculators/image:set_alpha_calculator_cc_proto",
"//mediapipe/calculators/tensor:image_to_tensor_calculator",
"//mediapipe/calculators/tensor:image_to_tensor_calculator_cc_proto",
"//mediapipe/calculators/tensor:inference_calculator",
diff --git a/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc b/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc
index 5b212069f..311f8d6aa 100644
--- a/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc
+++ b/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc
@@ -188,7 +188,7 @@ void main() {
// Special argmax shader for N=1 classes. We don't need to worry about softmax
// activation (it is assumed softmax requires N > 1 classes), but this should
// occur after SIGMOID activation if specified. Instead of a true argmax, we
-// simply use 0.5 as the cutoff, assigning 1 (foreground) or 0 (background)
+// simply use 0.5 as the cutoff, assigning 0 (foreground) or 255 (background)
// based on whether the confidence value reaches this cutoff or not,
// respectively.
static constexpr char kArgmaxOneClassShader[] = R"(
@@ -199,12 +199,12 @@ uniform sampler2D input_texture;
void main() {
float input_val = texture2D(input_texture, sample_coordinate).x;
// Category is just value rounded to nearest integer; then we map to either
- // 0 or 1/255 accordingly. If the input has been activated properly, then the
+ // 0 or 1 accordingly. If the input has been activated properly, then the
// values should always be in the range [0, 1]. But just in case it hasn't, to
// avoid category overflow issues when the activation function is not properly
// chosen, we add an extra clamp here, as performance hit is minimal.
- float category = clamp(floor(input_val + 0.5), 0.0, 1.0);
- gl_FragColor = vec4(category / 255.0, 0.0, 0.0, 1.0);
+ float category = clamp(floor(1.5 - input_val), 0.0, 1.0);
+ gl_FragColor = vec4(category, 0.0, 0.0, 1.0);
})";
// Softmax is in 3 steps:
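To make the one-class shader change concrete, here is a NumPy sketch of the per-pixel mapping it now computes: confidences strictly above the 0.5 cutoff become category 0 (foreground), everything else becomes category 1, which an 8-bit output buffer stores as 255 (the unlabeled value). The function is illustrative only; it mirrors `clamp(floor(1.5 - x), 0, 1)` from the shader.

```python
import numpy as np

def one_class_category_mask(confidence):
    # clamp(floor(1.5 - x), 0, 1), then scale to the 8-bit output range.
    category = np.clip(np.floor(1.5 - confidence), 0.0, 1.0)
    return (category * 255).astype(np.uint8)

confidence = np.array([[0.9, 0.51], [0.49, 0.1]], dtype=np.float32)
print(one_class_category_mask(confidence))
# [[  0   0]
#  [255 255]]
```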
diff --git a/mediapipe/tasks/cc/vision/image_segmenter/calculators/tensors_to_segmentation_calculator.cc b/mediapipe/tasks/cc/vision/image_segmenter/calculators/tensors_to_segmentation_calculator.cc
index c2d1520dd..660dc59b7 100644
--- a/mediapipe/tasks/cc/vision/image_segmenter/calculators/tensors_to_segmentation_calculator.cc
+++ b/mediapipe/tasks/cc/vision/image_segmenter/calculators/tensors_to_segmentation_calculator.cc
@@ -61,6 +61,8 @@ using ::mediapipe::tasks::vision::GetImageLikeTensorShape;
using ::mediapipe::tasks::vision::Shape;
using ::mediapipe::tasks::vision::image_segmenter::proto::SegmenterOptions;
+constexpr uint8_t kUnLabeledPixelValue = 255;
+
void StableSoftmax(absl::Span values,
absl::Span activated_values) {
float max_value = *std::max_element(values.begin(), values.end());
@@ -153,9 +155,11 @@ Image ProcessForCategoryMaskCpu(const Shape& input_shape,
}
if (input_channels == 1) {
// if the input tensor is a single mask, it is assumed to be a binary
- // foreground segmentation mask. For such a mask, we make foreground
- // category 1, and background category 0.
- pixel = static_cast(confidence_scores[0] > 0.5f);
+ // foreground segmentation mask. For such a mask, instead of a true
+ // argmax, we simply use 0.5 as the cutoff, assigning 0 (foreground) or
+ // 255 (background) based on whether the confidence value reaches this
+ // cutoff or not, respectively.
+ pixel = confidence_scores[0] > 0.5f ? 0 : kUnLabeledPixelValue;
} else {
const int maximum_category_idx =
std::max_element(confidence_scores.begin(), confidence_scores.end()) -
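The CPU path follows the same convention. Below is a hedged NumPy sketch of the category-mask rule described in the comment above, assuming a confidence tensor of shape `(H, W, C)` with values in `[0, 1]`; the helper name and constant are stand-ins that mirror `kUnLabeledPixelValue`.

```python
import numpy as np

UNLABELED_PIXEL_VALUE = 255  # mirrors kUnLabeledPixelValue

def category_mask(confidences: np.ndarray) -> np.ndarray:
    if confidences.shape[-1] == 1:
        # Single-channel binary mask: 0 (foreground) above the 0.5 cutoff,
        # otherwise the unlabeled value 255 (background).
        return np.where(confidences[..., 0] > 0.5, 0, UNLABELED_PIXEL_VALUE).astype(np.uint8)
    # Multi-channel mask: the category is the index of the highest confidence.
    return np.argmax(confidences, axis=-1).astype(np.uint8)

binary = np.array([[[0.9], [0.2]]], dtype=np.float32)
multi = np.array([[[0.1, 0.7, 0.2]]], dtype=np.float32)
print(category_mask(binary))  # [[  0 255]]
print(category_mask(multi))   # [[1]]
```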
diff --git a/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_graph.cc b/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_graph.cc
index a52d3fa9a..6ecfa3685 100644
--- a/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_graph.cc
+++ b/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_graph.cc
@@ -23,6 +23,7 @@ limitations under the License.
#include "absl/strings/str_format.h"
#include "mediapipe/calculators/image/image_clone_calculator.pb.h"
#include "mediapipe/calculators/image/image_transformation_calculator.pb.h"
+#include "mediapipe/calculators/image/set_alpha_calculator.pb.h"
#include "mediapipe/calculators/tensor/tensor_converter_calculator.pb.h"
#include "mediapipe/framework/api2/builder.h"
#include "mediapipe/framework/api2/port.h"
@@ -249,7 +250,8 @@ void ConfigureTensorConverterCalculator(
// the tflite model.
absl::StatusOr ConvertImageToTensors(
Source image_in, Source norm_rect_in, bool use_gpu,
- const core::ModelResources& model_resources, Graph& graph) {
+ bool is_hair_segmentation, const core::ModelResources& model_resources,
+ Graph& graph) {
ASSIGN_OR_RETURN(const tflite::Tensor* tflite_input_tensor,
GetInputTensor(model_resources));
if (tflite_input_tensor->shape()->size() != 4) {
@@ -294,9 +296,17 @@ absl::StatusOr ConvertImageToTensors(
// Convert from Image to legacy ImageFrame or GpuBuffer.
auto& from_image = graph.AddNode("FromImageCalculator");
image_on_device >> from_image.In(kImageTag);
- auto image_cpu_or_gpu =
+ Source image_cpu_or_gpu =
from_image.Out(use_gpu ? kImageGpuTag : kImageCpuTag);
+ if (is_hair_segmentation) {
+ auto& set_alpha = graph.AddNode("SetAlphaCalculator");
+ set_alpha.GetOptions()
+ .set_alpha_value(0);
+ image_cpu_or_gpu >> set_alpha.In(use_gpu ? kImageGpuTag : kImageTag);
+ image_cpu_or_gpu = set_alpha.Out(use_gpu ? kImageGpuTag : kImageTag);
+ }
+
// Resize the input image to the model input size.
auto& image_transformation = graph.AddNode("ImageTransformationCalculator");
ConfigureImageTransformationCalculator(
@@ -461,22 +471,41 @@ class ImageSegmenterGraph : public core::ModelTaskGraph {
bool use_gpu =
components::processors::DetermineImagePreprocessingGpuBackend(
task_options.base_options().acceleration());
- ASSIGN_OR_RETURN(auto image_and_tensors,
- ConvertImageToTensors(image_in, norm_rect_in, use_gpu,
- model_resources, graph));
- // Adds inference subgraph and connects its input stream to the output
- // tensors produced by the ImageToTensorCalculator.
- auto& inference = AddInference(
- model_resources, task_options.base_options().acceleration(), graph);
- image_and_tensors.tensors >> inference.In(kTensorsTag);
- // Adds segmentation calculators for output streams.
+ // Adds segmentation calculators for output streams. Add this calculator
+ // first to get the labels.
auto& tensor_to_images =
graph.AddNode("mediapipe.tasks.TensorsToSegmentationCalculator");
RET_CHECK_OK(ConfigureTensorsToSegmentationCalculator(
task_options, model_resources,
&tensor_to_images
.GetOptions()));
+ const auto& tensor_to_images_options =
+ tensor_to_images.GetOptions();
+
+ // TODO: remove special logic for hair segmentation model.
+ // The alpha channel of the hair segmentation model indicates the area of
+ // interest. The model was designed for live stream mode, where the mask of
+ // the previous frame is used as the indicator for the next frame. For the
+ // first frame, it expects the alpha channel to be empty. To consolidate
+ // IMAGE, VIDEO and LIVE_STREAM modes in mediapipe tasks, we forcibly set the
+ // alpha channel to be empty if we find that the model is the hair
+ // segmentation model.
+ bool is_hair_segmentation = false;
+ if (tensor_to_images_options.label_items_size() == 2 &&
+ tensor_to_images_options.label_items().at(1).name() == "hair") {
+ is_hair_segmentation = true;
+ }
+
+ ASSIGN_OR_RETURN(
+ auto image_and_tensors,
+ ConvertImageToTensors(image_in, norm_rect_in, use_gpu,
+ is_hair_segmentation, model_resources, graph));
+ // Adds inference subgraph and connects its input stream to the output
+ // tensors produced by the ImageToTensorCalculator.
+ auto& inference = AddInference(
+ model_resources, task_options.base_options().acceleration(), graph);
+ image_and_tensors.tensors >> inference.In(kTensorsTag);
inference.Out(kTensorsTag) >> tensor_to_images.In(kTensorsTag);
// Adds image property calculator for output size.
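The graph change above keys the special case off the label map (exactly two labels, the second named "hair") and, when it matches, inserts a `SetAlphaCalculator` so the model sees an empty previous-frame mask. The Python sketch below only illustrates that decision and the alpha-clearing effect; the function names and label dictionary are hypothetical, not MediaPipe APIs.

```python
import numpy as np

def is_hair_segmentation(label_items: dict) -> bool:
    # Mirrors the check on label_items_size() == 2 and label name "hair".
    return len(label_items) == 2 and label_items.get(1) == "hair"

def clear_alpha(rgba_image: np.ndarray) -> np.ndarray:
    out = rgba_image.copy()
    out[..., 3] = 0  # what SetAlphaCalculator does with alpha_value = 0
    return out

labels = {0: "background", 1: "hair"}
frame = np.full((2, 2, 4), 255, dtype=np.uint8)
if is_hair_segmentation(labels):
    frame = clear_alpha(frame)
print(frame[..., 3])  # all zeros
```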
diff --git a/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc b/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc
index 339ec1424..656ed0715 100644
--- a/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc
+++ b/mediapipe/tasks/cc/vision/image_segmenter/image_segmenter_test.cc
@@ -30,6 +30,7 @@ limitations under the License.
#include "mediapipe/framework/port/opencv_imgcodecs_inc.h"
#include "mediapipe/framework/port/opencv_imgproc_inc.h"
#include "mediapipe/framework/port/status_matchers.h"
+#include "mediapipe/framework/tool/test_util.h"
#include "mediapipe/tasks/cc/components/containers/rect.h"
#include "mediapipe/tasks/cc/core/base_options.h"
#include "mediapipe/tasks/cc/core/proto/base_options.pb.h"
@@ -425,6 +426,28 @@ TEST_F(ImageModeTest, SucceedsSelfie144x256Segmentations) {
SimilarToFloatMask(expected_mask_float, kGoldenMaskSimilarity));
}
+TEST_F(ImageModeTest, SucceedsSelfieSegmentationSingleLabel) {
+ auto options = std::make_unique();
+ options->base_options.model_asset_path =
+ JoinPath("./", kTestDataDirectory, kSelfieSegmentation);
+ MP_ASSERT_OK_AND_ASSIGN(std::unique_ptr segmenter,
+ ImageSegmenter::Create(std::move(options)));
+ ASSERT_EQ(segmenter->GetLabels().size(), 1);
+ EXPECT_EQ(segmenter->GetLabels()[0], "selfie");
+ MP_ASSERT_OK(segmenter->Close());
+}
+
+TEST_F(ImageModeTest, SucceedsSelfieSegmentationLandscapeSingleLabel) {
+ auto options = std::make_unique();
+ options->base_options.model_asset_path =
+ JoinPath("./", kTestDataDirectory, kSelfieSegmentationLandscape);
+ MP_ASSERT_OK_AND_ASSIGN(std::unique_ptr segmenter,
+ ImageSegmenter::Create(std::move(options)));
+ ASSERT_EQ(segmenter->GetLabels().size(), 1);
+ EXPECT_EQ(segmenter->GetLabels()[0], "selfie");
+ MP_ASSERT_OK(segmenter->Close());
+}
+
TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationConfidenceMask) {
Image image =
GetSRGBImage(JoinPath("./", kTestDataDirectory, "portrait.jpg"));
@@ -464,6 +487,9 @@ TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationCategoryMask) {
EXPECT_TRUE(result.category_mask.has_value());
MP_ASSERT_OK(segmenter->Close());
+ MP_EXPECT_OK(
+ SavePngTestOutput(*result.category_mask->GetImageFrameSharedPtr(),
+ "portrait_selfie_segmentation_expected_category_mask"));
cv::Mat selfie_mask = mediapipe::formats::MatView(
result.category_mask->GetImageFrameSharedPtr().get());
cv::Mat expected_mask = cv::imread(
@@ -471,7 +497,7 @@ TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationCategoryMask) {
"portrait_selfie_segmentation_expected_category_mask.jpg"),
cv::IMREAD_GRAYSCALE);
EXPECT_THAT(selfie_mask,
- SimilarToUint8Mask(expected_mask, kGoldenMaskSimilarity, 255));
+ SimilarToUint8Mask(expected_mask, kGoldenMaskSimilarity, 1));
}
TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationLandscapeCategoryMask) {
@@ -487,6 +513,9 @@ TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationLandscapeCategoryMask) {
EXPECT_TRUE(result.category_mask.has_value());
MP_ASSERT_OK(segmenter->Close());
+ MP_EXPECT_OK(SavePngTestOutput(
+ *result.category_mask->GetImageFrameSharedPtr(),
+ "portrait_selfie_segmentation_landscape_expected_category_mask"));
cv::Mat selfie_mask = mediapipe::formats::MatView(
result.category_mask->GetImageFrameSharedPtr().get());
cv::Mat expected_mask = cv::imread(
@@ -495,7 +524,7 @@ TEST_F(ImageModeTest, SucceedsPortraitSelfieSegmentationLandscapeCategoryMask) {
"portrait_selfie_segmentation_landscape_expected_category_mask.jpg"),
cv::IMREAD_GRAYSCALE);
EXPECT_THAT(selfie_mask,
- SimilarToUint8Mask(expected_mask, kGoldenMaskSimilarity, 255));
+ SimilarToUint8Mask(expected_mask, kGoldenMaskSimilarity, 1));
}
TEST_F(ImageModeTest, SucceedsHairSegmentation) {
diff --git a/mediapipe/tasks/cc/vision/object_detector/object_detector.cc b/mediapipe/tasks/cc/vision/object_detector/object_detector.cc
index 01fd3eb7b..152ee3273 100644
--- a/mediapipe/tasks/cc/vision/object_detector/object_detector.cc
+++ b/mediapipe/tasks/cc/vision/object_detector/object_detector.cc
@@ -129,9 +129,17 @@ absl::StatusOr> ObjectDetector::Create(
if (status_or_packets.value()[kImageOutStreamName].IsEmpty()) {
return;
}
+ Packet image_packet = status_or_packets.value()[kImageOutStreamName];
Packet detections_packet =
status_or_packets.value()[kDetectionsOutStreamName];
- Packet image_packet = status_or_packets.value()[kImageOutStreamName];
+ if (detections_packet.IsEmpty()) {
+ Packet empty_packet =
+ status_or_packets.value()[kDetectionsOutStreamName];
+ result_callback(
+ {ConvertToDetectionResult({})}, image_packet.Get(),
+ empty_packet.Timestamp().Value() / kMicroSecondsPerMilliSecond);
+ return;
+ }
result_callback(ConvertToDetectionResult(
detections_packet.Get>()),
image_packet.Get(),
@@ -165,6 +173,9 @@ absl::StatusOr ObjectDetector::Detect(
ProcessImageData(
{{kImageInStreamName, MakePacket(std::move(image))},
{kNormRectName, MakePacket(std::move(norm_rect))}}));
+ if (output_packets[kDetectionsOutStreamName].IsEmpty()) {
+ return {ConvertToDetectionResult({})};
+ }
return ConvertToDetectionResult(
output_packets[kDetectionsOutStreamName].Get>());
}
@@ -190,6 +201,9 @@ absl::StatusOr ObjectDetector::DetectForVideo(
{kNormRectName,
MakePacket(std::move(norm_rect))
.At(Timestamp(timestamp_ms * kMicroSecondsPerMilliSecond))}}));
+ if (output_packets[kDetectionsOutStreamName].IsEmpty()) {
+ return {ConvertToDetectionResult({})};
+ }
return ConvertToDetectionResult(
output_packets[kDetectionsOutStreamName].Get>());
}
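The object detector changes above guard every read of the DETECTIONS stream: when the packet is empty (for example, every candidate was filtered out by the score threshold), the task returns an empty result instead of dereferencing the packet. A simplified Python sketch of that guard pattern is shown below; the packet and result types are stand-ins, not the C++ task runner API.

```python
def to_detection_result(output_packets: dict) -> list:
    detections_packet = output_packets.get("detections")
    if not detections_packet:  # empty packet -> empty result
        return []
    return list(detections_packet)

print(to_detection_result({"detections": None}))            # []
print(to_detection_result({"detections": ["cat", "dog"]}))  # ['cat', 'dog']
```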
diff --git a/mediapipe/tasks/cc/vision/object_detector/object_detector_test.cc b/mediapipe/tasks/cc/vision/object_detector/object_detector_test.cc
index 8642af7c4..e66fc19bb 100644
--- a/mediapipe/tasks/cc/vision/object_detector/object_detector_test.cc
+++ b/mediapipe/tasks/cc/vision/object_detector/object_detector_test.cc
@@ -499,6 +499,22 @@ TEST_F(ImageModeTest, SucceedsEfficientDetNoNmsModel) {
})pb")}));
}
+TEST_F(ImageModeTest, SucceedsNoObjectDetected) {
+ MP_ASSERT_OK_AND_ASSIGN(Image image,
+ DecodeImageFromFile(JoinPath("./", kTestDataDirectory,
+ "cats_and_dogs.jpg")));
+ auto options = std::make_unique();
+ options->max_results = 4;
+ options->score_threshold = 1.0f;
+ options->base_options.model_asset_path =
+ JoinPath("./", kTestDataDirectory, kEfficientDetWithoutNms);
+ MP_ASSERT_OK_AND_ASSIGN(std::unique_ptr object_detector,
+ ObjectDetector::Create(std::move(options)));
+ MP_ASSERT_OK_AND_ASSIGN(auto results, object_detector->Detect(image));
+ MP_ASSERT_OK(object_detector->Close());
+ EXPECT_THAT(results.detections, testing::IsEmpty());
+}
+
TEST_F(ImageModeTest, SucceedsWithoutImageResizing) {
MP_ASSERT_OK_AND_ASSIGN(Image image, DecodeImageFromFile(JoinPath(
"./", kTestDataDirectory,
diff --git a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker.cc b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker.cc
index f421c7376..797e71488 100644
--- a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker.cc
+++ b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker.cc
@@ -63,8 +63,6 @@ constexpr char kNormLandmarksTag[] = "NORM_LANDMARKS";
constexpr char kNormLandmarksStreamName[] = "norm_landmarks";
constexpr char kPoseWorldLandmarksTag[] = "WORLD_LANDMARKS";
constexpr char kPoseWorldLandmarksStreamName[] = "world_landmarks";
-constexpr char kPoseAuxiliaryLandmarksTag[] = "AUXILIARY_LANDMARKS";
-constexpr char kPoseAuxiliaryLandmarksStreamName[] = "auxiliary_landmarks";
constexpr int kMicroSecondsPerMilliSecond = 1000;
// Creates a MediaPipe graph config that contains a subgraph node of
@@ -83,9 +81,6 @@ CalculatorGraphConfig CreateGraphConfig(
graph.Out(kNormLandmarksTag);
subgraph.Out(kPoseWorldLandmarksTag).SetName(kPoseWorldLandmarksStreamName) >>
graph.Out(kPoseWorldLandmarksTag);
- subgraph.Out(kPoseAuxiliaryLandmarksTag)
- .SetName(kPoseAuxiliaryLandmarksStreamName) >>
- graph.Out(kPoseAuxiliaryLandmarksTag);
subgraph.Out(kImageTag).SetName(kImageOutStreamName) >> graph.Out(kImageTag);
if (output_segmentation_masks) {
subgraph.Out(kSegmentationMaskTag).SetName(kSegmentationMaskStreamName) >>
@@ -163,8 +158,6 @@ absl::StatusOr> PoseLandmarker::Create(
status_or_packets.value()[kNormLandmarksStreamName];
Packet pose_world_landmarks_packet =
status_or_packets.value()[kPoseWorldLandmarksStreamName];
- Packet pose_auxiliary_landmarks_packet =
- status_or_packets.value()[kPoseAuxiliaryLandmarksStreamName];
std::optional> segmentation_mask = std::nullopt;
if (output_segmentation_masks) {
segmentation_mask = segmentation_mask_packet.Get>();
@@ -175,9 +168,7 @@ absl::StatusOr> PoseLandmarker::Create(
/* pose_landmarks= */
pose_landmarks_packet.Get>(),
/* pose_world_landmarks= */
- pose_world_landmarks_packet.Get>(),
- pose_auxiliary_landmarks_packet
- .Get>()),
+ pose_world_landmarks_packet.Get>()),
image_packet.Get(),
pose_landmarks_packet.Timestamp().Value() /
kMicroSecondsPerMilliSecond);
@@ -234,10 +225,7 @@ absl::StatusOr PoseLandmarker::Detect(
.Get>(),
/* pose_world_landmarks */
output_packets[kPoseWorldLandmarksStreamName]
- .Get>(),
- /*pose_auxiliary_landmarks= */
- output_packets[kPoseAuxiliaryLandmarksStreamName]
- .Get>());
+ .Get>());
}
absl::StatusOr PoseLandmarker::DetectForVideo(
@@ -277,10 +265,7 @@ absl::StatusOr PoseLandmarker::DetectForVideo(
.Get>(),
/* pose_world_landmarks */
output_packets[kPoseWorldLandmarksStreamName]
- .Get>(),
- /* pose_auxiliary_landmarks= */
- output_packets[kPoseAuxiliaryLandmarksStreamName]
- .Get>());
+ .Get>());
}
absl::Status PoseLandmarker::DetectAsync(
diff --git a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.cc b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.cc
index 77f374d1e..da4c630b3 100644
--- a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.cc
+++ b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.cc
@@ -27,15 +27,12 @@ namespace pose_landmarker {
PoseLandmarkerResult ConvertToPoseLandmarkerResult(
std::optional> segmentation_masks,
const std::vector& pose_landmarks_proto,
- const std::vector& pose_world_landmarks_proto,
- const std::vector&
- pose_auxiliary_landmarks_proto) {
+ const std::vector& pose_world_landmarks_proto) {
PoseLandmarkerResult result;
result.segmentation_masks = segmentation_masks;
result.pose_landmarks.resize(pose_landmarks_proto.size());
result.pose_world_landmarks.resize(pose_world_landmarks_proto.size());
- result.pose_auxiliary_landmarks.resize(pose_auxiliary_landmarks_proto.size());
std::transform(pose_landmarks_proto.begin(), pose_landmarks_proto.end(),
result.pose_landmarks.begin(),
components::containers::ConvertToNormalizedLandmarks);
@@ -43,10 +40,6 @@ PoseLandmarkerResult ConvertToPoseLandmarkerResult(
pose_world_landmarks_proto.end(),
result.pose_world_landmarks.begin(),
components::containers::ConvertToLandmarks);
- std::transform(pose_auxiliary_landmarks_proto.begin(),
- pose_auxiliary_landmarks_proto.end(),
- result.pose_auxiliary_landmarks.begin(),
- components::containers::ConvertToNormalizedLandmarks);
return result;
}
diff --git a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.h b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.h
index f45994837..8978e5147 100644
--- a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.h
+++ b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result.h
@@ -37,17 +37,12 @@ struct PoseLandmarkerResult {
std::vector pose_landmarks;
// Detected pose landmarks in world coordinates.
std::vector pose_world_landmarks;
- // Detected auxiliary landmarks, used for deriving ROI for next frame.
- std::vector
- pose_auxiliary_landmarks;
};
PoseLandmarkerResult ConvertToPoseLandmarkerResult(
std::optional> segmentation_mask,
const std::vector& pose_landmarks_proto,
- const std::vector& pose_world_landmarks_proto,
- const std::vector&
- pose_auxiliary_landmarks_proto);
+ const std::vector& pose_world_landmarks_proto);
} // namespace pose_landmarker
} // namespace vision
diff --git a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result_test.cc b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result_test.cc
index 14916215c..05e83b655 100644
--- a/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result_test.cc
+++ b/mediapipe/tasks/cc/vision/pose_landmarker/pose_landmarker_result_test.cc
@@ -47,13 +47,6 @@ TEST(ConvertFromProto, Succeeds) {
landmark_proto.set_y(5.2);
landmark_proto.set_z(4.3);
- mediapipe::NormalizedLandmarkList auxiliary_landmark_list_proto;
- mediapipe::NormalizedLandmark& auxiliary_landmark_proto =
- *auxiliary_landmark_list_proto.add_landmark();
- auxiliary_landmark_proto.set_x(0.5);
- auxiliary_landmark_proto.set_y(0.5);
- auxiliary_landmark_proto.set_z(0.5);
-
std::vector segmentation_masks_lists = {segmentation_mask};
std::vector normalized_landmarks_lists = {
@@ -62,12 +55,9 @@ TEST(ConvertFromProto, Succeeds) {
std::vector world_landmarks_lists = {
world_landmark_list_proto};
- std::vector auxiliary_landmarks_lists = {
- auxiliary_landmark_list_proto};
-
PoseLandmarkerResult pose_landmarker_result = ConvertToPoseLandmarkerResult(
segmentation_masks_lists, normalized_landmarks_lists,
- world_landmarks_lists, auxiliary_landmarks_lists);
+ world_landmarks_lists);
EXPECT_EQ(pose_landmarker_result.pose_landmarks.size(), 1);
EXPECT_EQ(pose_landmarker_result.pose_landmarks[0].landmarks.size(), 1);
@@ -82,14 +72,6 @@ TEST(ConvertFromProto, Succeeds) {
testing::FieldsAre(testing::FloatEq(3.1), testing::FloatEq(5.2),
testing::FloatEq(4.3), std::nullopt,
std::nullopt, std::nullopt));
-
- EXPECT_EQ(pose_landmarker_result.pose_auxiliary_landmarks.size(), 1);
- EXPECT_EQ(pose_landmarker_result.pose_auxiliary_landmarks[0].landmarks.size(),
- 1);
- EXPECT_THAT(pose_landmarker_result.pose_auxiliary_landmarks[0].landmarks[0],
- testing::FieldsAre(testing::FloatEq(0.5), testing::FloatEq(0.5),
- testing::FloatEq(0.5), std::nullopt,
- std::nullopt, std::nullopt));
}
} // namespace pose_landmarker
diff --git a/mediapipe/tasks/ios/common/utils/sources/NSString+Helpers.mm b/mediapipe/tasks/ios/common/utils/sources/NSString+Helpers.mm
index dfc7749be..5f484fce5 100644
--- a/mediapipe/tasks/ios/common/utils/sources/NSString+Helpers.mm
+++ b/mediapipe/tasks/ios/common/utils/sources/NSString+Helpers.mm
@@ -24,7 +24,7 @@
return [NSString stringWithCString:text.c_str() encoding:[NSString defaultCStringEncoding]];
}
-+ (NSString *)uuidString{
++ (NSString *)uuidString {
return [[NSUUID UUID] UUIDString];
}
diff --git a/mediapipe/tasks/ios/components/containers/sources/MPPDetection.m b/mediapipe/tasks/ios/components/containers/sources/MPPDetection.m
index c245478db..c61cf0b39 100644
--- a/mediapipe/tasks/ios/components/containers/sources/MPPDetection.m
+++ b/mediapipe/tasks/ios/components/containers/sources/MPPDetection.m
@@ -28,7 +28,12 @@
return self;
}
-// TODO: Implement hash
+- (NSUInteger)hash {
+ NSUInteger nonNullPropertiesHash =
+ @(self.location.x).hash ^ @(self.location.y).hash ^ @(self.score).hash;
+
+ return self.label ? nonNullPropertiesHash ^ self.label.hash : nonNullPropertiesHash;
+}
- (BOOL)isEqual:(nullable id)object {
if (!object) {
diff --git a/mediapipe/tasks/ios/test/vision/image_classifier/MPPImageClassifierTests.m b/mediapipe/tasks/ios/test/vision/image_classifier/MPPImageClassifierTests.m
index 7eb93df8e..8db71a11b 100644
--- a/mediapipe/tasks/ios/test/vision/image_classifier/MPPImageClassifierTests.m
+++ b/mediapipe/tasks/ios/test/vision/image_classifier/MPPImageClassifierTests.m
@@ -452,13 +452,14 @@ static NSString *const kLiveStreamTestsDictExpectationKey = @"expectation";
[self
assertCreateImageClassifierWithOptions:options
failsWithExpectedError:
- [NSError errorWithDomain:kExpectedErrorDomain
- code:MPPTasksErrorCodeInvalidArgumentError
- userInfo:@{
- NSLocalizedDescriptionKey :
- @"The vision task is in image or video mode. The "
- @"delegate must not be set in the task's options."
- }]];
+ [NSError
+ errorWithDomain:kExpectedErrorDomain
+ code:MPPTasksErrorCodeInvalidArgumentError
+ userInfo:@{
+ NSLocalizedDescriptionKey :
+ @"The vision task is in image or video mode. The "
+ @"delegate must not be set in the task's options."
+ }]];
}
}
@@ -469,15 +470,15 @@ static NSString *const kLiveStreamTestsDictExpectationKey = @"expectation";
[self assertCreateImageClassifierWithOptions:options
failsWithExpectedError:
- [NSError
- errorWithDomain:kExpectedErrorDomain
- code:MPPTasksErrorCodeInvalidArgumentError
- userInfo:@{
- NSLocalizedDescriptionKey :
- @"The vision task is in live stream mode. An object "
- @"must be set as the delegate of the task in its "
- @"options to ensure asynchronous delivery of results."
- }]];
+ [NSError errorWithDomain:kExpectedErrorDomain
+ code:MPPTasksErrorCodeInvalidArgumentError
+ userInfo:@{
+ NSLocalizedDescriptionKey :
+ @"The vision task is in live stream mode. An "
+ @"object must be set as the delegate of the task "
+ @"in its options to ensure asynchronous delivery "
+ @"of results."
+ }]];
}
- (void)testClassifyFailsWithCallingWrongApiInImageMode {
diff --git a/mediapipe/tasks/ios/test/vision/object_detector/MPPObjectDetectorTests.m b/mediapipe/tasks/ios/test/vision/object_detector/MPPObjectDetectorTests.m
index d3b81703b..700df65a5 100644
--- a/mediapipe/tasks/ios/test/vision/object_detector/MPPObjectDetectorTests.m
+++ b/mediapipe/tasks/ios/test/vision/object_detector/MPPObjectDetectorTests.m
@@ -25,6 +25,8 @@ static NSDictionary *const kCatsAndDogsRotatedImage =
static NSString *const kExpectedErrorDomain = @"com.google.mediapipe.tasks";
static const float pixelDifferenceTolerance = 10.0f;
static const float scoreDifferenceTolerance = 0.02f;
+static NSString *const kLiveStreamTestsDictObjectDetectorKey = @"object_detector";
+static NSString *const kLiveStreamTestsDictExpectationKey = @"expectation";
#define AssertEqualErrors(error, expectedError) \
XCTAssertNotNil(error); \
@@ -58,7 +60,10 @@ static const float scoreDifferenceTolerance = 0.02f;
XCTAssertEqualWithAccuracy(boundingBox.size.height, expectedBoundingBox.size.height, \
pixelDifferenceTolerance, @"index i = %d", idx);
-@interface MPPObjectDetectorTests : XCTestCase
+@interface MPPObjectDetectorTests : XCTestCase {
+ NSDictionary *liveStreamSucceedsTestDict;
+ NSDictionary *outOfOrderTimestampTestDict;
+}
@end
@implementation MPPObjectDetectorTests
@@ -446,31 +451,28 @@ static const float scoreDifferenceTolerance = 0.02f;
#pragma mark Running Mode Tests
-- (void)testCreateObjectDetectorFailsWithResultListenerInNonLiveStreamMode {
+- (void)testCreateObjectDetectorFailsWithDelegateInNonLiveStreamMode {
MPPRunningMode runningModesToTest[] = {MPPRunningModeImage, MPPRunningModeVideo};
for (int i = 0; i < sizeof(runningModesToTest) / sizeof(runningModesToTest[0]); i++) {
MPPObjectDetectorOptions *options = [self objectDetectorOptionsWithModelName:kModelName];
options.runningMode = runningModesToTest[i];
- options.completion =
- ^(MPPObjectDetectionResult *result, NSInteger timestampInMilliseconds, NSError *error) {
- };
+ options.objectDetectorLiveStreamDelegate = self;
[self
assertCreateObjectDetectorWithOptions:options
failsWithExpectedError:
- [NSError
- errorWithDomain:kExpectedErrorDomain
- code:MPPTasksErrorCodeInvalidArgumentError
- userInfo:@{
- NSLocalizedDescriptionKey :
- @"The vision task is in image or video mode, a "
- @"user-defined result callback should not be provided."
- }]];
+ [NSError errorWithDomain:kExpectedErrorDomain
+ code:MPPTasksErrorCodeInvalidArgumentError
+ userInfo:@{
+ NSLocalizedDescriptionKey :
+ @"The vision task is in image or video mode. The "
+ @"delegate must not be set in the task's options."
+ }]];
}
}
-- (void)testCreateObjectDetectorFailsWithMissingResultListenerInLiveStreamMode {
+- (void)testCreateObjectDetectorFailsWithMissingDelegateInLiveStreamMode {
MPPObjectDetectorOptions *options = [self objectDetectorOptionsWithModelName:kModelName];
options.runningMode = MPPRunningModeLiveStream;
@@ -481,8 +483,10 @@ static const float scoreDifferenceTolerance = 0.02f;
code:MPPTasksErrorCodeInvalidArgumentError
userInfo:@{
NSLocalizedDescriptionKey :
- @"The vision task is in live stream mode, a "
- @"user-defined result callback must be provided."
+ @"The vision task is in live stream mode. An "
+ @"object must be set as the delegate of the task "
+ @"in its options to ensure asynchronous delivery "
+ @"of results."
}]];
}
@@ -563,10 +567,7 @@ static const float scoreDifferenceTolerance = 0.02f;
MPPObjectDetectorOptions *options = [self objectDetectorOptionsWithModelName:kModelName];
options.runningMode = MPPRunningModeLiveStream;
- options.completion =
- ^(MPPObjectDetectionResult *result, NSInteger timestampInMilliseconds, NSError *error) {
-
- };
+ options.objectDetectorLiveStreamDelegate = self;
MPPObjectDetector *objectDetector = [self objectDetectorWithOptionsSucceeds:options];
@@ -631,23 +632,17 @@ static const float scoreDifferenceTolerance = 0.02f;
options.maxResults = maxResults;
options.runningMode = MPPRunningModeLiveStream;
+ options.objectDetectorLiveStreamDelegate = self;
XCTestExpectation *expectation = [[XCTestExpectation alloc]
initWithDescription:@"detectWithOutOfOrderTimestampsAndLiveStream"];
expectation.expectedFulfillmentCount = 1;
- options.completion =
- ^(MPPObjectDetectionResult *result, NSInteger timestampInMilliseconds, NSError *error) {
- [self assertObjectDetectionResult:result
- isEqualToExpectedResult:
- [MPPObjectDetectorTests
- expectedDetectionResultForCatsAndDogsImageWithTimestampInMilliseconds:
- timestampInMilliseconds]
- expectedDetectionsCount:maxResults];
- [expectation fulfill];
- };
-
MPPObjectDetector *objectDetector = [self objectDetectorWithOptionsSucceeds:options];
+ liveStreamSucceedsTestDict = @{
+ kLiveStreamTestsDictObjectDetectorKey : objectDetector,
+ kLiveStreamTestsDictExpectationKey : expectation
+ };
MPPImage *image = [self imageWithFileInfo:kCatsAndDogsImage];
@@ -695,19 +690,15 @@ static const float scoreDifferenceTolerance = 0.02f;
expectation.expectedFulfillmentCount = iterationCount + 1;
expectation.inverted = YES;
- options.completion =
- ^(MPPObjectDetectionResult *result, NSInteger timestampInMilliseconds, NSError *error) {
- [self assertObjectDetectionResult:result
- isEqualToExpectedResult:
- [MPPObjectDetectorTests
- expectedDetectionResultForCatsAndDogsImageWithTimestampInMilliseconds:
- timestampInMilliseconds]
- expectedDetectionsCount:maxResults];
- [expectation fulfill];
- };
+ options.objectDetectorLiveStreamDelegate = self;
MPPObjectDetector *objectDetector = [self objectDetectorWithOptionsSucceeds:options];
+ liveStreamSucceedsTestDict = @{
+ kLiveStreamTestsDictObjectDetectorKey : objectDetector,
+ kLiveStreamTestsDictExpectationKey : expectation
+ };
+
// TODO: Mimic initialization from CMSampleBuffer as live stream mode is most likely to be used
// with the iOS camera. AVCaptureVideoDataOutput sample buffer delegates provide frames of type
// `CMSampleBuffer`.
@@ -721,4 +712,24 @@ static const float scoreDifferenceTolerance = 0.02f;
[self waitForExpectations:@[ expectation ] timeout:timeout];
}
+#pragma mark MPPObjectDetectorLiveStreamDelegate Methods
+- (void)objectDetector:(MPPObjectDetector *)objectDetector
+ didFinishDetectionWithResult:(MPPObjectDetectionResult *)objectDetectionResult
+ timestampInMilliseconds:(NSInteger)timestampInMilliseconds
+ error:(NSError *)error {
+ NSInteger maxResults = 4;
+ [self assertObjectDetectionResult:objectDetectionResult
+ isEqualToExpectedResult:
+ [MPPObjectDetectorTests
+ expectedDetectionResultForCatsAndDogsImageWithTimestampInMilliseconds:
+ timestampInMilliseconds]
+ expectedDetectionsCount:maxResults];
+
+ if (objectDetector == outOfOrderTimestampTestDict[kLiveStreamTestsDictObjectDetectorKey]) {
+ [outOfOrderTimestampTestDict[kLiveStreamTestsDictExpectationKey] fulfill];
+ } else if (objectDetector == liveStreamSucceedsTestDict[kLiveStreamTestsDictObjectDetectorKey]) {
+ [liveStreamSucceedsTestDict[kLiveStreamTestsDictExpectationKey] fulfill];
+ }
+}
+
@end
diff --git a/mediapipe/tasks/ios/vision/core/BUILD b/mediapipe/tasks/ios/vision/core/BUILD
index 328d9e892..fe0fba0ef 100644
--- a/mediapipe/tasks/ios/vision/core/BUILD
+++ b/mediapipe/tasks/ios/vision/core/BUILD
@@ -63,5 +63,10 @@ objc_library(
"//third_party/apple_frameworks:UIKit",
"@com_google_absl//absl/status:statusor",
"@ios_opencv//:OpencvFramework",
- ],
+ ] + select({
+ "@//third_party:opencv_ios_sim_arm64_source_build": ["@ios_opencv_source//:opencv_xcframework"],
+ "@//third_party:opencv_ios_sim_fat_source_build": ["@ios_opencv_source//:opencv_xcframework"],
+ "@//third_party:opencv_ios_arm64_source_build": ["@ios_opencv_source//:opencv_xcframework"],
+ "//conditions:default": [],
+ }),
)
diff --git a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.h b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.h
index 4443f56d1..ae18bf58d 100644
--- a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.h
+++ b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.h
@@ -96,6 +96,15 @@ NS_SWIFT_NAME(ObjectDetector)
* `MPPImage`. Only use this method when the `MPPObjectDetector` is created with
* `MPPRunningModeImage`.
*
+ * This method supports performing object detection on RGBA images. If your `MPPImage` has a source type of
+ * `MPPImageSourceTypePixelBuffer` or `MPPImageSourceTypeSampleBuffer`, the underlying pixel buffer
+ * must have one of the following pixel format types:
+ * 1. kCVPixelFormatType_32BGRA
+ * 2. kCVPixelFormatType_32RGBA
+ *
+ * If your `MPPImage` has a source type of `MPPImageSourceTypeImage` ensure that the color space is
+ * RGB with an Alpha channel.
+ *
* @param image The `MPPImage` on which object detection is to be performed.
* @param error An optional error parameter populated when there is an error in performing object
* detection on the input image.
@@ -115,6 +124,15 @@ NS_SWIFT_NAME(ObjectDetector)
* the provided `MPPImage`. Only use this method when the `MPPObjectDetector` is created with
* `MPPRunningModeVideo`.
*
+ * This method supports performing object detection on RGBA images. If your `MPPImage` has a source type of
+ * `MPPImageSourceTypePixelBuffer` or `MPPImageSourceTypeSampleBuffer`, the underlying pixel buffer
+ * must have one of the following pixel format types:
+ * 1. kCVPixelFormatType_32BGRA
+ * 2. kCVPixelFormatType_32RGBA
+ *
+ * If your `MPPImage` has a source type of `MPPImageSourceTypeImage` ensure that the color space is
+ * RGB with an Alpha channel.
+ *
* @param image The `MPPImage` on which object detection is to be performed.
* @param timestampInMilliseconds The video frame's timestamp (in milliseconds). The input
* timestamps must be monotonically increasing.
@@ -135,12 +153,28 @@ NS_SWIFT_NAME(ObjectDetector)
* Sends live stream image data of type `MPPImage` to perform object detection using the whole
* image as region of interest. Rotation will be applied according to the `orientation` property of
* the provided `MPPImage`. Only use this method when the `MPPObjectDetector` is created with
- * `MPPRunningModeLiveStream`. Results are provided asynchronously via the `completion` callback
- * provided in the `MPPObjectDetectorOptions`.
+ * `MPPRunningModeLiveStream`.
+ *
+ * The object which needs to be continuously notified of the available results of object
+ * detection must conform to the `MPPObjectDetectorLiveStreamDelegate` protocol and implement the
+ * `objectDetector:didFinishDetectionWithResult:timestampInMilliseconds:error:` delegate method.
*
* It's required to provide a timestamp (in milliseconds) to indicate when the input image is sent
* to the object detector. The input timestamps must be monotonically increasing.
*
+ * This method supports performing object detection on RGBA images. If your `MPPImage` has a source type of
+ * `MPPImageSourceTypePixelBuffer` or `MPPImageSourceTypeSampleBuffer`, the underlying pixel buffer
+ * must have one of the following pixel format types:
+ * 1. kCVPixelFormatType_32BGRA
+ * 2. kCVPixelFormatType_32RGBA
+ *
+ * If the input `MPPImage` has a source type of `MPPImageSourceTypeImage` ensure that the color
+ * space is RGB with an Alpha channel.
+ *
+ * If this method is used for detecting objects in live camera frames using `AVFoundation`, ensure that you
+ * request `AVCaptureVideoDataOutput` to output frames in `kCMPixelFormat_32RGBA` using its
+ * `videoSettings` property.
+ *
* @param image A live stream image data of type `MPPImage` on which object detection is to be
* performed.
* @param timestampInMilliseconds The timestamp (in milliseconds) which indicates when the input
diff --git a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.mm b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.mm
index f0914cdb1..a5b4077be 100644
--- a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.mm
+++ b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetector.mm
@@ -37,8 +37,8 @@ static NSString *const kImageOutStreamName = @"image_out";
static NSString *const kImageTag = @"IMAGE";
static NSString *const kNormRectStreamName = @"norm_rect_in";
static NSString *const kNormRectTag = @"NORM_RECT";
-
static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorGraph";
+static NSString *const kTaskName = @"objectDetector";
#define InputPacketMap(imagePacket, normalizedRectPacket) \
{ \
@@ -51,6 +51,7 @@ static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorG
/** iOS Vision Task Runner */
MPPVisionTaskRunner *_visionTaskRunner;
}
+@property(nonatomic, weak) id objectDetectorLiveStreamDelegate;
@end
@implementation MPPObjectDetector
@@ -78,11 +79,37 @@ static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorG
PacketsCallback packetsCallback = nullptr;
- if (options.completion) {
+ if (options.objectDetectorLiveStreamDelegate) {
+ _objectDetectorLiveStreamDelegate = options.objectDetectorLiveStreamDelegate;
+
+ // Capturing `self` as weak in order to avoid `self` being kept in memory
+ // and causing a retain cycle after `self` is set to `nil`.
+ MPPObjectDetector *__weak weakSelf = self;
+
+ // Create a private serial dispatch queue in which the delegate method will be called
+ // asynchronously. This is to ensure that if the client performs a long running operation in
+ // the delegate method, the queue on which the C++ callback is invoked is not blocked and is
+ // freed up to continue with its operations.
+ dispatch_queue_t callbackQueue = dispatch_queue_create(
+ [MPPVisionTaskRunner uniqueDispatchQueueNameWithSuffix:kTaskName], NULL);
packetsCallback = [=](absl::StatusOr statusOrPackets) {
+ if (!weakSelf) {
+ return;
+ }
+ if (![weakSelf.objectDetectorLiveStreamDelegate
+ respondsToSelector:@selector
+ (objectDetector:didFinishDetectionWithResult:timestampInMilliseconds:error:)]) {
+ return;
+ }
+
NSError *callbackError = nil;
if (![MPPCommonUtils checkCppError:statusOrPackets.status() toError:&callbackError]) {
- options.completion(nil, Timestamp::Unset().Value(), callbackError);
+ dispatch_async(callbackQueue, ^{
+ [weakSelf.objectDetectorLiveStreamDelegate objectDetector:weakSelf
+ didFinishDetectionWithResult:nil
+ timestampInMilliseconds:Timestamp::Unset().Value()
+ error:callbackError];
+ });
return;
}
@@ -95,10 +122,15 @@ static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorG
objectDetectionResultWithDetectionsPacket:statusOrPackets.value()[kDetectionsStreamName
.cppString]];
- options.completion(result,
- outputPacketMap[kImageOutStreamName.cppString].Timestamp().Value() /
- kMicroSecondsPerMilliSecond,
- callbackError);
+ NSInteger timeStampInMilliseconds =
+ outputPacketMap[kImageOutStreamName.cppString].Timestamp().Value() /
+ kMicroSecondsPerMilliSecond;
+ dispatch_async(callbackQueue, ^{
+ [weakSelf.objectDetectorLiveStreamDelegate objectDetector:weakSelf
+ didFinishDetectionWithResult:result
+ timestampInMilliseconds:timeStampInMilliseconds
+ error:callbackError];
+ });
};
}
@@ -112,6 +144,7 @@ static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorG
return nil;
}
}
+
return self;
}
@@ -224,5 +257,4 @@ static NSString *const kTaskGraphName = @"mediapipe.tasks.vision.ObjectDetectorG
return [_visionTaskRunner processLiveStreamPacketMap:inputPacketMap.value() error:error];
}
-
@end
diff --git a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.h b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.h
index 79bc9baa6..c91e170c9 100644
--- a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.h
+++ b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.h
@@ -20,19 +20,70 @@
NS_ASSUME_NONNULL_BEGIN
+@class MPPObjectDetector;
+
+/**
+ * This protocol defines an interface for the delegates of `MPPObjectDetector` object to receive
+ * results of performing asynchronous object detection on images (i.e, when `runningMode` =
+ * `MPPRunningModeLiveStream`).
+ *
+ * The delegate of `MPPObjectDetector` must adopt `MPPObjectDetectorLiveStreamDelegate` protocol.
+ * The methods in this protocol are optional.
+ */
+NS_SWIFT_NAME(ObjectDetectorLiveStreamDelegate)
+@protocol MPPObjectDetectorLiveStreamDelegate
+
+@optional
+
+/**
+ * This method notifies a delegate that the results of asynchronous object detection of
+ * an image submitted to the `MPPObjectDetector` are available.
+ *
+ * This method is called on a private serial dispatch queue created by the `MPPObjectDetector`
+ * for performing the asynchronous delegate calls.
+ *
+ * @param objectDetector The object detector which performed the object detection.
+ * This is useful to test equality when there are multiple instances of `MPPObjectDetector`.
+ * @param result The `MPPObjectDetectionResult` object that contains a list of detections, each
+ * detection has a bounding box that is expressed in the unrotated input frame of reference
+ * coordinates system, i.e. in `[0,image_width) x [0,image_height)`, which are the dimensions of the
+ * underlying image data.
+ * @param timestampInMilliseconds The timestamp (in milliseconds) which indicates when the input
+ * image was sent to the object detector.
+ * @param error An optional error parameter populated when there is an error in performing object
+ * detection on the input live stream image data.
+ *
+ */
+- (void)objectDetector:(MPPObjectDetector *)objectDetector
+ didFinishDetectionWithResult:(nullable MPPObjectDetectionResult *)result
+ timestampInMilliseconds:(NSInteger)timestampInMilliseconds
+ error:(nullable NSError *)error
+ NS_SWIFT_NAME(objectDetector(_:didFinishDetection:timestampInMilliseconds:error:));
+@end
+
/** Options for setting up a `MPPObjectDetector`. */
NS_SWIFT_NAME(ObjectDetectorOptions)
@interface MPPObjectDetectorOptions : MPPTaskOptions
+/**
+ * Running mode of the object detector task. Defaults to `MPPRunningModeImage`.
+ * `MPPObjectDetector` can be created with one of the following running modes:
+ * 1. `MPPRunningModeImage`: The mode for performing object detection on single image inputs.
+ * 2. `MPPRunningModeVideo`: The mode for performing object detection on the decoded frames of a
+ * video.
+ * 3. `MPPRunningModeLiveStream`: The mode for performing object detection on a live stream of
+ * input data, such as from the camera.
+ */
@property(nonatomic) MPPRunningMode runningMode;
/**
- * The user-defined result callback for processing live stream data. The result callback should only
- * be specified when the running mode is set to the live stream mode.
- * TODO: Add parameter `MPPImage` in the callback.
+ * An object that conforms to the `MPPObjectDetectorLiveStreamDelegate` protocol. This object must
+ * implement `objectDetector:didFinishDetectionWithResult:timestampInMilliseconds:error:` to receive
+ * the results of performing asynchronous object detection on images (i.e, when `runningMode` =
+ * `MPPRunningModeLiveStream`).
*/
-@property(nonatomic, copy) void (^completion)
- (MPPObjectDetectionResult *__nullable result, NSInteger timestampMs, NSError *error);
+@property(nonatomic, weak, nullable) id
+ objectDetectorLiveStreamDelegate;
/**
* The locale to use for display names specified through the TFLite Model Metadata, if any. Defaults
diff --git a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.m b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.m
index 73f8ce5b5..b93a6b30b 100644
--- a/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.m
+++ b/mediapipe/tasks/ios/vision/object_detector/sources/MPPObjectDetectorOptions.m
@@ -33,7 +33,7 @@
objectDetectorOptions.categoryDenylist = self.categoryDenylist;
objectDetectorOptions.categoryAllowlist = self.categoryAllowlist;
objectDetectorOptions.displayNamesLocale = self.displayNamesLocale;
- objectDetectorOptions.completion = self.completion;
+ objectDetectorOptions.objectDetectorLiveStreamDelegate = self.objectDetectorLiveStreamDelegate;
return objectDetectorOptions;
}
diff --git a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetector.java b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetector.java
index 5287ba325..d9a36cce7 100644
--- a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetector.java
+++ b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetector.java
@@ -39,6 +39,7 @@ import com.google.mediapipe.formats.proto.DetectionProto.Detection;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
+import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
@@ -170,6 +171,13 @@ public final class ObjectDetector extends BaseVisionTaskApi {
new OutputHandler.OutputPacketConverter() {
@Override
public ObjectDetectionResult convertToTaskResult(List packets) {
+ // If there is no object detected in the image, just returns empty lists.
+ if (packets.get(DETECTIONS_OUT_STREAM_INDEX).isEmpty()) {
+ return ObjectDetectionResult.create(
+ new ArrayList<>(),
+ BaseVisionTaskApi.generateResultTimestampMs(
+ detectorOptions.runningMode(), packets.get(DETECTIONS_OUT_STREAM_INDEX)));
+ }
return ObjectDetectionResult.create(
PacketGetter.getProtoVector(
packets.get(DETECTIONS_OUT_STREAM_INDEX), Detection.parser()),
diff --git a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarker.java b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarker.java
index 2ebdc0732..fa2d3da17 100644
--- a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarker.java
+++ b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarker.java
@@ -79,8 +79,7 @@ public final class PoseLandmarker extends BaseVisionTaskApi {
private static final int LANDMARKS_OUT_STREAM_INDEX = 0;
private static final int WORLD_LANDMARKS_OUT_STREAM_INDEX = 1;
- private static final int AUXILIARY_LANDMARKS_OUT_STREAM_INDEX = 2;
- private static final int IMAGE_OUT_STREAM_INDEX = 3;
+ private static final int IMAGE_OUT_STREAM_INDEX = 2;
private static int segmentationMasksOutStreamIndex = -1;
private static final String TASK_GRAPH_NAME =
"mediapipe.tasks.vision.pose_landmarker.PoseLandmarkerGraph";
@@ -145,7 +144,6 @@ public final class PoseLandmarker extends BaseVisionTaskApi {
List outputStreams = new ArrayList<>();
outputStreams.add("NORM_LANDMARKS:pose_landmarks");
outputStreams.add("WORLD_LANDMARKS:world_landmarks");
- outputStreams.add("AUXILIARY_LANDMARKS:auxiliary_landmarks");
outputStreams.add("IMAGE:image_out");
if (landmarkerOptions.outputSegmentationMasks()) {
outputStreams.add("SEGMENTATION_MASK:segmentation_masks");
@@ -161,7 +159,6 @@ public final class PoseLandmarker extends BaseVisionTaskApi {
// If there is no poses detected in the image, just returns empty lists.
if (packets.get(LANDMARKS_OUT_STREAM_INDEX).isEmpty()) {
return PoseLandmarkerResult.create(
- new ArrayList<>(),
new ArrayList<>(),
new ArrayList<>(),
Optional.empty(),
@@ -179,9 +176,6 @@ public final class PoseLandmarker extends BaseVisionTaskApi {
packets.get(LANDMARKS_OUT_STREAM_INDEX), NormalizedLandmarkList.parser()),
PacketGetter.getProtoVector(
packets.get(WORLD_LANDMARKS_OUT_STREAM_INDEX), LandmarkList.parser()),
- PacketGetter.getProtoVector(
- packets.get(AUXILIARY_LANDMARKS_OUT_STREAM_INDEX),
- NormalizedLandmarkList.parser()),
segmentedMasks,
BaseVisionTaskApi.generateResultTimestampMs(
landmarkerOptions.runningMode(), packets.get(LANDMARKS_OUT_STREAM_INDEX)));
diff --git a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerResult.java b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerResult.java
index 488f2a556..389e78266 100644
--- a/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerResult.java
+++ b/mediapipe/tasks/java/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerResult.java
@@ -40,7 +40,6 @@ public abstract class PoseLandmarkerResult implements TaskResult {
static PoseLandmarkerResult create(
List landmarksProto,
List worldLandmarksProto,
- List auxiliaryLandmarksProto,
Optional> segmentationMasksData,
long timestampMs) {
@@ -52,7 +51,6 @@ public abstract class PoseLandmarkerResult implements TaskResult {
List> multiPoseLandmarks = new ArrayList<>();
List> multiPoseWorldLandmarks = new ArrayList<>();
- List> multiPoseAuxiliaryLandmarks = new ArrayList<>();
for (LandmarkProto.NormalizedLandmarkList poseLandmarksProto : landmarksProto) {
List poseLandmarks = new ArrayList<>();
multiPoseLandmarks.add(poseLandmarks);
@@ -75,24 +73,10 @@ public abstract class PoseLandmarkerResult implements TaskResult {
poseWorldLandmarkProto.getZ()));
}
}
- for (LandmarkProto.NormalizedLandmarkList poseAuxiliaryLandmarksProto :
- auxiliaryLandmarksProto) {
- List poseAuxiliaryLandmarks = new ArrayList<>();
- multiPoseAuxiliaryLandmarks.add(poseAuxiliaryLandmarks);
- for (LandmarkProto.NormalizedLandmark poseAuxiliaryLandmarkProto :
- poseAuxiliaryLandmarksProto.getLandmarkList()) {
- poseAuxiliaryLandmarks.add(
- NormalizedLandmark.create(
- poseAuxiliaryLandmarkProto.getX(),
- poseAuxiliaryLandmarkProto.getY(),
- poseAuxiliaryLandmarkProto.getZ()));
- }
- }
return new AutoValue_PoseLandmarkerResult(
timestampMs,
Collections.unmodifiableList(multiPoseLandmarks),
Collections.unmodifiableList(multiPoseWorldLandmarks),
- Collections.unmodifiableList(multiPoseAuxiliaryLandmarks),
multiPoseSegmentationMasks);
}
@@ -105,9 +89,6 @@ public abstract class PoseLandmarkerResult implements TaskResult {
/** Pose landmarks in world coordniates of detected poses. */
public abstract List> worldLandmarks();
- /** Pose auxiliary landmarks. */
- public abstract List> auxiliaryLandmarks();
-
/** Pose segmentation masks. */
public abstract Optional> segmentationMasks();
}
diff --git a/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetectorTest.java b/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetectorTest.java
index 33aa025d2..20ddfcef6 100644
--- a/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetectorTest.java
+++ b/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/objectdetector/ObjectDetectorTest.java
@@ -45,6 +45,7 @@ import org.junit.runners.Suite.SuiteClasses;
@SuiteClasses({ObjectDetectorTest.General.class, ObjectDetectorTest.RunningModeTest.class})
public class ObjectDetectorTest {
private static final String MODEL_FILE = "coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.tflite";
+ private static final String NO_NMS_MODEL_FILE = "efficientdet_lite0_fp16_no_nms.tflite";
private static final String CAT_AND_DOG_IMAGE = "cats_and_dogs.jpg";
private static final String CAT_AND_DOG_ROTATED_IMAGE = "cats_and_dogs_rotated.jpg";
private static final int IMAGE_WIDTH = 1200;
@@ -109,6 +110,20 @@ public class ObjectDetectorTest {
assertContainsOnlyCat(results, CAT_BOUNDING_BOX, CAT_SCORE);
}
+ @Test
+ public void detect_succeedsWithNoObjectDetected() throws Exception {
+ ObjectDetectorOptions options =
+ ObjectDetectorOptions.builder()
+ .setBaseOptions(BaseOptions.builder().setModelAssetPath(NO_NMS_MODEL_FILE).build())
+ .setScoreThreshold(1.0f)
+ .build();
+ ObjectDetector objectDetector =
+ ObjectDetector.createFromOptions(ApplicationProvider.getApplicationContext(), options);
+ ObjectDetectionResult results = objectDetector.detect(getImageFromAsset(CAT_AND_DOG_IMAGE));
+ // The score threshold of 1.0 should filter out all detections.
+ assertThat(results.detections()).isEmpty();
+ }
+
@Test
public void detect_succeedsWithAllowListOption() throws Exception {
ObjectDetectorOptions options =
diff --git a/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerTest.java b/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerTest.java
index 1d0b1decd..7adef9e27 100644
--- a/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerTest.java
+++ b/mediapipe/tasks/javatests/com/google/mediapipe/tasks/vision/poselandmarker/PoseLandmarkerTest.java
@@ -330,7 +330,6 @@ public class PoseLandmarkerTest {
return PoseLandmarkerResult.create(
Arrays.asList(landmarksDetectionResultProto.getLandmarks()),
Arrays.asList(landmarksDetectionResultProto.getWorldLandmarks()),
- Arrays.asList(),
Optional.empty(),
/* timestampMs= */ 0);
}
diff --git a/mediapipe/tasks/python/test/vision/object_detector_test.py b/mediapipe/tasks/python/test/vision/object_detector_test.py
index 7878e7f52..adeddafd7 100644
--- a/mediapipe/tasks/python/test/vision/object_detector_test.py
+++ b/mediapipe/tasks/python/test/vision/object_detector_test.py
@@ -44,6 +44,7 @@ _ObjectDetectorOptions = object_detector.ObjectDetectorOptions
_RUNNING_MODE = running_mode_module.VisionTaskRunningMode
_MODEL_FILE = 'coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.tflite'
+_NO_NMS_MODEL_FILE = 'efficientdet_lite0_fp16_no_nms.tflite'
_IMAGE_FILE = 'cats_and_dogs.jpg'
_EXPECTED_DETECTION_RESULT = _DetectionResult(
detections=[
@@ -304,7 +305,7 @@ class ObjectDetectorTest(parameterized.TestCase):
with _ObjectDetector.create_from_options(options) as unused_detector:
pass
- def test_empty_detection_outputs(self):
+ def test_empty_detection_outputs_with_in_model_nms(self):
options = _ObjectDetectorOptions(
base_options=_BaseOptions(model_asset_path=self.model_path),
score_threshold=1,
@@ -314,6 +315,18 @@ class ObjectDetectorTest(parameterized.TestCase):
detection_result = detector.detect(self.test_image)
self.assertEmpty(detection_result.detections)
+ def test_empty_detection_outputs_without_in_model_nms(self):
+ options = _ObjectDetectorOptions(
+ base_options=_BaseOptions(
+ model_asset_path=test_utils.get_test_data_path(
+ os.path.join(_TEST_DATA_DIR, _NO_NMS_MODEL_FILE))),
+ score_threshold=1,
+ )
+ with _ObjectDetector.create_from_options(options) as detector:
+ # Performs object detection on the input.
+ detection_result = detector.detect(self.test_image)
+ self.assertEmpty(detection_result.detections)
+
def test_missing_result_callback(self):
options = _ObjectDetectorOptions(
base_options=_BaseOptions(model_asset_path=self.model_path),
diff --git a/mediapipe/tasks/python/test/vision/pose_landmarker_test.py b/mediapipe/tasks/python/test/vision/pose_landmarker_test.py
index 1b73ecdfb..fff6879cc 100644
--- a/mediapipe/tasks/python/test/vision/pose_landmarker_test.py
+++ b/mediapipe/tasks/python/test/vision/pose_landmarker_test.py
@@ -74,7 +74,6 @@ def _get_expected_pose_landmarker_result(
return PoseLandmarkerResult(
pose_landmarks=[landmarks_detection_result.landmarks],
pose_world_landmarks=[],
- pose_auxiliary_landmarks=[],
)
@@ -296,7 +295,6 @@ class PoseLandmarkerTest(parameterized.TestCase):
# Comparing results.
self.assertEmpty(detection_result.pose_landmarks)
self.assertEmpty(detection_result.pose_world_landmarks)
- self.assertEmpty(detection_result.pose_auxiliary_landmarks)
def test_missing_result_callback(self):
options = _PoseLandmarkerOptions(
@@ -391,7 +389,7 @@ class PoseLandmarkerTest(parameterized.TestCase):
True,
_get_expected_pose_landmarker_result(_POSE_LANDMARKS),
),
- (_BURGER_IMAGE, 0, False, PoseLandmarkerResult([], [], [])),
+ (_BURGER_IMAGE, 0, False, PoseLandmarkerResult([], [])),
)
def test_detect_for_video(
self, image_path, rotation, output_segmentation_masks, expected_result
@@ -473,7 +471,7 @@ class PoseLandmarkerTest(parameterized.TestCase):
True,
_get_expected_pose_landmarker_result(_POSE_LANDMARKS),
),
- (_BURGER_IMAGE, 0, False, PoseLandmarkerResult([], [], [])),
+ (_BURGER_IMAGE, 0, False, PoseLandmarkerResult([], [])),
)
def test_detect_async_calls(
self, image_path, rotation, output_segmentation_masks, expected_result
diff --git a/mediapipe/tasks/python/vision/object_detector.py b/mediapipe/tasks/python/vision/object_detector.py
index 3bdd1b5de..380d57c22 100644
--- a/mediapipe/tasks/python/vision/object_detector.py
+++ b/mediapipe/tasks/python/vision/object_detector.py
@@ -198,6 +198,15 @@ class ObjectDetector(base_vision_task_api.BaseVisionTaskApi):
def packets_callback(output_packets: Mapping[str, packet_module.Packet]):
if output_packets[_IMAGE_OUT_STREAM_NAME].is_empty():
return
+ image = packet_getter.get_image(output_packets[_IMAGE_OUT_STREAM_NAME])
+ if output_packets[_DETECTIONS_OUT_STREAM_NAME].is_empty():
+ empty_packet = output_packets[_DETECTIONS_OUT_STREAM_NAME]
+ options.result_callback(
+ ObjectDetectorResult([]),
+ image,
+ empty_packet.timestamp.value // _MICRO_SECONDS_PER_MILLISECOND,
+ )
+ return
detection_proto_list = packet_getter.get_proto_list(
output_packets[_DETECTIONS_OUT_STREAM_NAME]
)
@@ -207,7 +216,6 @@ class ObjectDetector(base_vision_task_api.BaseVisionTaskApi):
for result in detection_proto_list
]
)
- image = packet_getter.get_image(output_packets[_IMAGE_OUT_STREAM_NAME])
timestamp = output_packets[_IMAGE_OUT_STREAM_NAME].timestamp
options.result_callback(detection_result, image, timestamp)
@@ -266,6 +274,8 @@ class ObjectDetector(base_vision_task_api.BaseVisionTaskApi):
normalized_rect.to_pb2()
),
})
+ if output_packets[_DETECTIONS_OUT_STREAM_NAME].is_empty():
+ return ObjectDetectorResult([])
detection_proto_list = packet_getter.get_proto_list(
output_packets[_DETECTIONS_OUT_STREAM_NAME]
)
@@ -315,6 +325,8 @@ class ObjectDetector(base_vision_task_api.BaseVisionTaskApi):
normalized_rect.to_pb2()
).at(timestamp_ms * _MICRO_SECONDS_PER_MILLISECOND),
})
+ if output_packets[_DETECTIONS_OUT_STREAM_NAME].is_empty():
+ return ObjectDetectorResult([])
detection_proto_list = packet_getter.get_proto_list(
output_packets[_DETECTIONS_OUT_STREAM_NAME]
)
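
The `object_detector.py` changes above make the Python task return an empty `ObjectDetectorResult` (and pass one to the result callback) whenever the detections output stream is empty, which can happen with models that have no in-model NMS once the score threshold filters everything out. The sketch below is illustrative only and not part of the patch; the model and image file names are placeholders borrowed from the tests above.

```python
# Sketch only: exercising the empty-result path added above. The model and
# image paths are placeholders, not assets shipped with this patch.
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.ObjectDetectorOptions(
    base_options=python.BaseOptions(
        model_asset_path='efficientdet_lite0_fp16_no_nms.tflite'),
    score_threshold=1.0,  # Intentionally high so no detection survives.
)
with vision.ObjectDetector.create_from_options(options) as detector:
    image = mp.Image.create_from_file('cats_and_dogs.jpg')
    result = detector.detect(image)
    # With this patch, an empty detections stream yields ObjectDetectorResult([])
    # instead of failing when the proto list is fetched.
    print(f'{len(result.detections)} detections')
```
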
diff --git a/mediapipe/tasks/python/vision/pose_landmarker.py b/mediapipe/tasks/python/vision/pose_landmarker.py
index 3ff7edb0a..8f67e6739 100644
--- a/mediapipe/tasks/python/vision/pose_landmarker.py
+++ b/mediapipe/tasks/python/vision/pose_landmarker.py
@@ -49,8 +49,6 @@ _NORM_LANDMARKS_STREAM_NAME = 'norm_landmarks'
_NORM_LANDMARKS_TAG = 'NORM_LANDMARKS'
_POSE_WORLD_LANDMARKS_STREAM_NAME = 'world_landmarks'
_POSE_WORLD_LANDMARKS_TAG = 'WORLD_LANDMARKS'
-_POSE_AUXILIARY_LANDMARKS_STREAM_NAME = 'auxiliary_landmarks'
-_POSE_AUXILIARY_LANDMARKS_TAG = 'AUXILIARY_LANDMARKS'
_TASK_GRAPH_NAME = 'mediapipe.tasks.vision.pose_landmarker.PoseLandmarkerGraph'
_MICRO_SECONDS_PER_MILLISECOND = 1000
@@ -62,14 +60,11 @@ class PoseLandmarkerResult:
Attributes:
pose_landmarks: Detected pose landmarks in normalized image coordinates.
pose_world_landmarks: Detected pose landmarks in world coordinates.
- pose_auxiliary_landmarks: Detected auxiliary landmarks, used for deriving
- ROI for next frame.
segmentation_masks: Optional segmentation masks for pose.
"""
pose_landmarks: List[List[landmark_module.NormalizedLandmark]]
pose_world_landmarks: List[List[landmark_module.Landmark]]
- pose_auxiliary_landmarks: List[List[landmark_module.NormalizedLandmark]]
segmentation_masks: Optional[List[image_module.Image]] = None
@@ -77,7 +72,7 @@ def _build_landmarker_result(
output_packets: Mapping[str, packet_module.Packet]
) -> PoseLandmarkerResult:
"""Constructs a `PoseLandmarkerResult` from output packets."""
- pose_landmarker_result = PoseLandmarkerResult([], [], [])
+ pose_landmarker_result = PoseLandmarkerResult([], [])
if _SEGMENTATION_MASK_STREAM_NAME in output_packets:
pose_landmarker_result.segmentation_masks = packet_getter.get_image_list(
@@ -90,9 +85,6 @@ def _build_landmarker_result(
pose_world_landmarks_proto_list = packet_getter.get_proto_list(
output_packets[_POSE_WORLD_LANDMARKS_STREAM_NAME]
)
- pose_auxiliary_landmarks_proto_list = packet_getter.get_proto_list(
- output_packets[_POSE_AUXILIARY_LANDMARKS_STREAM_NAME]
- )
for proto in pose_landmarks_proto_list:
pose_landmarks = landmark_pb2.NormalizedLandmarkList()
@@ -116,19 +108,6 @@ def _build_landmarker_result(
pose_world_landmarks_list
)
- for proto in pose_auxiliary_landmarks_proto_list:
- pose_auxiliary_landmarks = landmark_pb2.NormalizedLandmarkList()
- pose_auxiliary_landmarks.MergeFrom(proto)
- pose_auxiliary_landmarks_list = []
- for pose_auxiliary_landmark in pose_auxiliary_landmarks.landmark:
- pose_auxiliary_landmarks_list.append(
- landmark_module.NormalizedLandmark.create_from_pb2(
- pose_auxiliary_landmark
- )
- )
- pose_landmarker_result.pose_auxiliary_landmarks.append(
- pose_auxiliary_landmarks_list
- )
return pose_landmarker_result
@@ -301,7 +280,7 @@ class PoseLandmarker(base_vision_task_api.BaseVisionTaskApi):
if output_packets[_NORM_LANDMARKS_STREAM_NAME].is_empty():
empty_packet = output_packets[_NORM_LANDMARKS_STREAM_NAME]
options.result_callback(
- PoseLandmarkerResult([], [], []),
+ PoseLandmarkerResult([], []),
image,
empty_packet.timestamp.value // _MICRO_SECONDS_PER_MILLISECOND,
)
@@ -320,10 +299,6 @@ class PoseLandmarker(base_vision_task_api.BaseVisionTaskApi):
':'.join(
[_POSE_WORLD_LANDMARKS_TAG, _POSE_WORLD_LANDMARKS_STREAM_NAME]
),
- ':'.join([
- _POSE_AUXILIARY_LANDMARKS_TAG,
- _POSE_AUXILIARY_LANDMARKS_STREAM_NAME,
- ]),
':'.join([_IMAGE_TAG, _IMAGE_OUT_STREAM_NAME]),
]
@@ -382,7 +357,7 @@ class PoseLandmarker(base_vision_task_api.BaseVisionTaskApi):
})
if output_packets[_NORM_LANDMARKS_STREAM_NAME].is_empty():
- return PoseLandmarkerResult([], [], [])
+ return PoseLandmarkerResult([], [])
return _build_landmarker_result(output_packets)
@@ -427,7 +402,7 @@ class PoseLandmarker(base_vision_task_api.BaseVisionTaskApi):
})
if output_packets[_NORM_LANDMARKS_STREAM_NAME].is_empty():
- return PoseLandmarkerResult([], [], [])
+ return PoseLandmarkerResult([], [])
return _build_landmarker_result(output_packets)
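
With the auxiliary-landmark plumbing removed above, `PoseLandmarkerResult` is now built from two landmark lists plus an optional list of segmentation masks. A minimal sketch of the reduced dataclass shape, assuming the module path shown in this diff:

```python
# Sketch only: the reduced PoseLandmarkerResult shape after this patch.
from mediapipe.tasks.python.vision.pose_landmarker import PoseLandmarkerResult

empty = PoseLandmarkerResult(pose_landmarks=[], pose_world_landmarks=[])
assert empty.segmentation_masks is None  # Optional field defaults to None.
```
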
diff --git a/mediapipe/tasks/web/vision/BUILD b/mediapipe/tasks/web/vision/BUILD
index 503db3252..10e98de8b 100644
--- a/mediapipe/tasks/web/vision/BUILD
+++ b/mediapipe/tasks/web/vision/BUILD
@@ -21,6 +21,7 @@ VISION_LIBS = [
"//mediapipe/tasks/web/core:fileset_resolver",
"//mediapipe/tasks/web/vision/core:drawing_utils",
"//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/tasks/web/vision/face_detector",
"//mediapipe/tasks/web/vision/face_landmarker",
"//mediapipe/tasks/web/vision/face_stylizer",
diff --git a/mediapipe/tasks/web/vision/core/BUILD b/mediapipe/tasks/web/vision/core/BUILD
index c53247ba7..325603353 100644
--- a/mediapipe/tasks/web/vision/core/BUILD
+++ b/mediapipe/tasks/web/vision/core/BUILD
@@ -41,7 +41,10 @@ mediapipe_ts_library(
mediapipe_ts_library(
name = "image",
- srcs = ["image.ts"],
+ srcs = [
+ "image.ts",
+ "image_shader_context.ts",
+ ],
)
mediapipe_ts_library(
@@ -56,12 +59,34 @@ jasmine_node_test(
deps = [":image_test_lib"],
)
+mediapipe_ts_library(
+ name = "mask",
+ srcs = ["mask.ts"],
+ deps = [":image"],
+)
+
+mediapipe_ts_library(
+ name = "mask_test_lib",
+ testonly = True,
+ srcs = ["mask.test.ts"],
+ deps = [
+ ":image",
+ ":mask",
+ ],
+)
+
+jasmine_node_test(
+ name = "mask_test",
+ deps = [":mask_test_lib"],
+)
+
mediapipe_ts_library(
name = "vision_task_runner",
srcs = ["vision_task_runner.ts"],
deps = [
":image",
":image_processing_options",
+ ":mask",
":vision_task_options",
"//mediapipe/framework/formats:rect_jspb_proto",
"//mediapipe/tasks/web/core",
@@ -91,7 +116,6 @@ mediapipe_ts_library(
mediapipe_ts_library(
name = "render_utils",
srcs = ["render_utils.ts"],
- deps = [":image"],
)
jasmine_node_test(
diff --git a/mediapipe/tasks/web/vision/core/image.test.ts b/mediapipe/tasks/web/vision/core/image.test.ts
index 73eb44240..e92debc2e 100644
--- a/mediapipe/tasks/web/vision/core/image.test.ts
+++ b/mediapipe/tasks/web/vision/core/image.test.ts
@@ -16,7 +16,8 @@
import 'jasmine';
-import {MPImage, MPImageShaderContext, MPImageType} from './image';
+import {MPImage} from './image';
+import {MPImageShaderContext} from './image_shader_context';
const WIDTH = 2;
const HEIGHT = 2;
@@ -40,8 +41,6 @@ const IMAGE_2_3 = [
class MPImageTestContext {
canvas!: OffscreenCanvas;
gl!: WebGL2RenderingContext;
- uint8ClampedArray!: Uint8ClampedArray;
- float32Array!: Float32Array;
imageData!: ImageData;
imageBitmap!: ImageBitmap;
webGLTexture!: WebGLTexture;
@@ -55,17 +54,11 @@ class MPImageTestContext {
const gl = this.gl;
- this.uint8ClampedArray = new Uint8ClampedArray(pixels.length / 4);
- this.float32Array = new Float32Array(pixels.length / 4);
- for (let i = 0; i < this.uint8ClampedArray.length; ++i) {
- this.uint8ClampedArray[i] = pixels[i * 4];
- this.float32Array[i] = pixels[i * 4] / 255;
- }
this.imageData =
new ImageData(new Uint8ClampedArray(pixels), width, height);
this.imageBitmap = await createImageBitmap(this.imageData);
- this.webGLTexture = gl.createTexture()!;
+ this.webGLTexture = gl.createTexture()!;
gl.bindTexture(gl.TEXTURE_2D, this.webGLTexture);
gl.texImage2D(
gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, this.imageBitmap);
@@ -74,10 +67,6 @@ class MPImageTestContext {
get(type: unknown) {
switch (type) {
- case Uint8ClampedArray:
- return this.uint8ClampedArray;
- case Float32Array:
- return this.float32Array;
case ImageData:
return this.imageData;
case ImageBitmap:
@@ -125,25 +114,22 @@ class MPImageTestContext {
gl.bindTexture(gl.TEXTURE_2D, null);
+ // Sanity check
+ expect(pixels.find(v => !!v)).toBeDefined();
+
return pixels;
}
function assertEquality(image: MPImage, expected: ImageType): void {
- if (expected instanceof Uint8ClampedArray) {
- const result = image.get(MPImageType.UINT8_CLAMPED_ARRAY);
- expect(result).toEqual(expected);
- } else if (expected instanceof Float32Array) {
- const result = image.get(MPImageType.FLOAT32_ARRAY);
- expect(result).toEqual(expected);
- } else if (expected instanceof ImageData) {
- const result = image.get(MPImageType.IMAGE_DATA);
+ if (expected instanceof ImageData) {
+ const result = image.getAsImageData();
expect(result).toEqual(expected);
} else if (expected instanceof ImageBitmap) {
- const result = image.get(MPImageType.IMAGE_BITMAP);
+ const result = image.getAsImageBitmap();
expect(readPixelsFromImageBitmap(result))
.toEqual(readPixelsFromImageBitmap(expected));
} else { // WebGLTexture
- const result = image.get(MPImageType.WEBGL_TEXTURE);
+ const result = image.getAsWebGLTexture();
expect(readPixelsFromWebGLTexture(result))
.toEqual(readPixelsFromWebGLTexture(expected));
}
@@ -177,9 +163,7 @@ class MPImageTestContext {
shaderContext.close();
}
- const sources = skip ?
- [] :
- [Uint8ClampedArray, Float32Array, ImageData, ImageBitmap, WebGLTexture];
+ const sources = skip ? [] : [ImageData, ImageBitmap, WebGLTexture];
for (let i = 0; i < sources.length; i++) {
for (let j = 0; j < sources.length; j++) {
@@ -202,11 +186,11 @@ class MPImageTestContext {
const shaderContext = new MPImageShaderContext();
const image = new MPImage(
- [context.webGLTexture],
- /* ownsImageBitmap= */ false, /* ownsWebGLTexture= */ false,
- context.canvas, shaderContext, WIDTH, HEIGHT);
+ [context.webGLTexture], /* ownsImageBitmap= */ false,
+ /* ownsWebGLTexture= */ false, context.canvas, shaderContext, WIDTH,
+ HEIGHT);
- const result = image.clone().get(MPImageType.IMAGE_DATA);
+ const result = image.clone().getAsImageData();
expect(result).toEqual(context.imageData);
shaderContext.close();
@@ -217,19 +201,19 @@ class MPImageTestContext {
const shaderContext = new MPImageShaderContext();
const image = new MPImage(
- [context.webGLTexture],
- /* ownsImageBitmap= */ false, /* ownsWebGLTexture= */ false,
- context.canvas, shaderContext, WIDTH, HEIGHT);
+ [context.webGLTexture], /* ownsImageBitmap= */ false,
+ /* ownsWebGLTexture= */ false, context.canvas, shaderContext, WIDTH,
+ HEIGHT);
// Verify that we can mix the different shader modes by running them out of
// order.
- let result = image.get(MPImageType.IMAGE_DATA);
+ let result = image.getAsImageData();
expect(result).toEqual(context.imageData);
- result = image.clone().get(MPImageType.IMAGE_DATA);
+ result = image.clone().getAsImageData();
expect(result).toEqual(context.imageData);
- result = image.get(MPImageType.IMAGE_DATA);
+ result = image.getAsImageData();
expect(result).toEqual(context.imageData);
shaderContext.close();
@@ -241,43 +225,21 @@ class MPImageTestContext {
const shaderContext = new MPImageShaderContext();
const image = createImage(shaderContext, context.imageData, WIDTH, HEIGHT);
- expect(image.has(MPImageType.IMAGE_DATA)).toBe(true);
- expect(image.has(MPImageType.UINT8_CLAMPED_ARRAY)).toBe(false);
- expect(image.has(MPImageType.FLOAT32_ARRAY)).toBe(false);
- expect(image.has(MPImageType.WEBGL_TEXTURE)).toBe(false);
- expect(image.has(MPImageType.IMAGE_BITMAP)).toBe(false);
+ expect(image.hasImageData()).toBe(true);
+ expect(image.hasWebGLTexture()).toBe(false);
+ expect(image.hasImageBitmap()).toBe(false);
- image.get(MPImageType.UINT8_CLAMPED_ARRAY);
+ image.getAsWebGLTexture();
- expect(image.has(MPImageType.IMAGE_DATA)).toBe(true);
- expect(image.has(MPImageType.UINT8_CLAMPED_ARRAY)).toBe(true);
- expect(image.has(MPImageType.FLOAT32_ARRAY)).toBe(false);
- expect(image.has(MPImageType.WEBGL_TEXTURE)).toBe(false);
- expect(image.has(MPImageType.IMAGE_BITMAP)).toBe(false);
+ expect(image.hasImageData()).toBe(true);
+ expect(image.hasWebGLTexture()).toBe(true);
+ expect(image.hasImageBitmap()).toBe(false);
- image.get(MPImageType.FLOAT32_ARRAY);
+ image.getAsImageBitmap();
- expect(image.has(MPImageType.IMAGE_DATA)).toBe(true);
- expect(image.has(MPImageType.UINT8_CLAMPED_ARRAY)).toBe(true);
- expect(image.has(MPImageType.FLOAT32_ARRAY)).toBe(true);
- expect(image.has(MPImageType.WEBGL_TEXTURE)).toBe(false);
- expect(image.has(MPImageType.IMAGE_BITMAP)).toBe(false);
-
- image.get(MPImageType.WEBGL_TEXTURE);
-
- expect(image.has(MPImageType.IMAGE_DATA)).toBe(true);
- expect(image.has(MPImageType.UINT8_CLAMPED_ARRAY)).toBe(true);
- expect(image.has(MPImageType.FLOAT32_ARRAY)).toBe(true);
- expect(image.has(MPImageType.WEBGL_TEXTURE)).toBe(true);
- expect(image.has(MPImageType.IMAGE_BITMAP)).toBe(false);
-
- image.get(MPImageType.IMAGE_BITMAP);
-
- expect(image.has(MPImageType.IMAGE_DATA)).toBe(true);
- expect(image.has(MPImageType.UINT8_CLAMPED_ARRAY)).toBe(true);
- expect(image.has(MPImageType.FLOAT32_ARRAY)).toBe(true);
- expect(image.has(MPImageType.WEBGL_TEXTURE)).toBe(true);
- expect(image.has(MPImageType.IMAGE_BITMAP)).toBe(true);
+ expect(image.hasImageData()).toBe(true);
+ expect(image.hasWebGLTexture()).toBe(true);
+ expect(image.hasImageBitmap()).toBe(true);
image.close();
shaderContext.close();
diff --git a/mediapipe/tasks/web/vision/core/image.ts b/mediapipe/tasks/web/vision/core/image.ts
index 7d6997d37..bcc6b7ca1 100644
--- a/mediapipe/tasks/web/vision/core/image.ts
+++ b/mediapipe/tasks/web/vision/core/image.ts
@@ -14,14 +14,10 @@
* limitations under the License.
*/
+import {assertNotNull, MPImageShaderContext} from '../../../../tasks/web/vision/core/image_shader_context';
+
/** The underlying type of the image. */
-export enum MPImageType {
- /** Represents the native `UInt8ClampedArray` type. */
- UINT8_CLAMPED_ARRAY,
- /**
- * Represents the native `Float32Array` type. Values range from [0.0, 1.0].
- */
- FLOAT32_ARRAY,
+enum MPImageType {
/** Represents the native `ImageData` type. */
IMAGE_DATA,
/** Represents the native `ImageBitmap` type. */
@@ -31,377 +27,16 @@ export enum MPImageType {
}
/** The supported image formats. For internal usage. */
-export type MPImageContainer =
- Uint8ClampedArray|Float32Array|ImageData|ImageBitmap|WebGLTexture;
-
-const VERTEX_SHADER = `
- attribute vec2 aVertex;
- attribute vec2 aTex;
- varying vec2 vTex;
- void main(void) {
- gl_Position = vec4(aVertex, 0.0, 1.0);
- vTex = aTex;
- }`;
-
-const FRAGMENT_SHADER = `
- precision mediump float;
- varying vec2 vTex;
- uniform sampler2D inputTexture;
- void main() {
- gl_FragColor = texture2D(inputTexture, vTex);
- }
- `;
-
-function assertNotNull<T>(value: T|null, msg: string): T {
- if (value === null) {
- throw new Error(`Unable to obtain required WebGL resource: ${msg}`);
- }
- return value;
-}
-
-// TODO: Move internal-only types to different module.
-
-/**
- * Utility class that encapsulates the buffers used by `MPImageShaderContext`.
- * For internal use only.
- */
-class MPImageShaderBuffers {
- constructor(
- private readonly gl: WebGL2RenderingContext,
- private readonly vertexArrayObject: WebGLVertexArrayObject,
- private readonly vertexBuffer: WebGLBuffer,
- private readonly textureBuffer: WebGLBuffer) {}
-
- bind() {
- this.gl.bindVertexArray(this.vertexArrayObject);
- }
-
- unbind() {
- this.gl.bindVertexArray(null);
- }
-
- close() {
- this.gl.deleteVertexArray(this.vertexArrayObject);
- this.gl.deleteBuffer(this.vertexBuffer);
- this.gl.deleteBuffer(this.textureBuffer);
- }
-}
-
-/**
- * A class that encapsulates the shaders used by an MPImage. Can be re-used
- * across MPImages that use the same WebGL2Rendering context.
- *
- * For internal use only.
- */
-export class MPImageShaderContext {
- private gl?: WebGL2RenderingContext;
- private framebuffer?: WebGLFramebuffer;
- private program?: WebGLProgram;
- private vertexShader?: WebGLShader;
- private fragmentShader?: WebGLShader;
- private aVertex?: GLint;
- private aTex?: GLint;
-
- /**
- * The shader buffers used for passthrough renders that don't modify the
- * input texture.
- */
- private shaderBuffersPassthrough?: MPImageShaderBuffers;
-
- /**
- * The shader buffers used for passthrough renders that flip the input texture
- * vertically before conversion to a different type. This is used to flip the
- * texture to the expected orientation for drawing in the browser.
- */
- private shaderBuffersFlipVertically?: MPImageShaderBuffers;
-
- private compileShader(source: string, type: number): WebGLShader {
- const gl = this.gl!;
- const shader =
- assertNotNull(gl.createShader(type), 'Failed to create WebGL shader');
- gl.shaderSource(shader, source);
- gl.compileShader(shader);
- if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
- const info = gl.getShaderInfoLog(shader);
- throw new Error(`Could not compile WebGL shader: ${info}`);
- }
- gl.attachShader(this.program!, shader);
- return shader;
- }
-
- private setupShaders(): void {
- const gl = this.gl!;
- this.program =
- assertNotNull(gl.createProgram()!, 'Failed to create WebGL program');
-
- this.vertexShader = this.compileShader(VERTEX_SHADER, gl.VERTEX_SHADER);
- this.fragmentShader =
- this.compileShader(FRAGMENT_SHADER, gl.FRAGMENT_SHADER);
-
- gl.linkProgram(this.program);
- const linked = gl.getProgramParameter(this.program, gl.LINK_STATUS);
- if (!linked) {
- const info = gl.getProgramInfoLog(this.program);
- throw new Error(`Error during program linking: ${info}`);
- }
-
- this.aVertex = gl.getAttribLocation(this.program, 'aVertex');
- this.aTex = gl.getAttribLocation(this.program, 'aTex');
- }
-
- private createBuffers(flipVertically: boolean): MPImageShaderBuffers {
- const gl = this.gl!;
- const vertexArrayObject =
- assertNotNull(gl.createVertexArray(), 'Failed to create vertex array');
- gl.bindVertexArray(vertexArrayObject);
-
- const vertexBuffer =
- assertNotNull(gl.createBuffer(), 'Failed to create buffer');
- gl.bindBuffer(gl.ARRAY_BUFFER, vertexBuffer);
- gl.enableVertexAttribArray(this.aVertex!);
- gl.vertexAttribPointer(this.aVertex!, 2, gl.FLOAT, false, 0, 0);
- gl.bufferData(
- gl.ARRAY_BUFFER, new Float32Array([-1, -1, -1, 1, 1, 1, 1, -1]),
- gl.STATIC_DRAW);
-
- const textureBuffer =
- assertNotNull(gl.createBuffer(), 'Failed to create buffer');
- gl.bindBuffer(gl.ARRAY_BUFFER, textureBuffer);
- gl.enableVertexAttribArray(this.aTex!);
- gl.vertexAttribPointer(this.aTex!, 2, gl.FLOAT, false, 0, 0);
-
- const bufferData =
- flipVertically ? [0, 1, 0, 0, 1, 0, 1, 1] : [0, 0, 0, 1, 1, 1, 1, 0];
- gl.bufferData(
- gl.ARRAY_BUFFER, new Float32Array(bufferData), gl.STATIC_DRAW);
-
- gl.bindBuffer(gl.ARRAY_BUFFER, null);
- gl.bindVertexArray(null);
-
- return new MPImageShaderBuffers(
- gl, vertexArrayObject, vertexBuffer, textureBuffer);
- }
-
- private getShaderBuffers(flipVertically: boolean): MPImageShaderBuffers {
- if (flipVertically) {
- if (!this.shaderBuffersFlipVertically) {
- this.shaderBuffersFlipVertically =
- this.createBuffers(/* flipVertically= */ true);
- }
- return this.shaderBuffersFlipVertically;
- } else {
- if (!this.shaderBuffersPassthrough) {
- this.shaderBuffersPassthrough =
- this.createBuffers(/* flipVertically= */ false);
- }
- return this.shaderBuffersPassthrough;
- }
- }
-
- private maybeInitGL(gl: WebGL2RenderingContext): void {
- if (!this.gl) {
- this.gl = gl;
- } else if (gl !== this.gl) {
- throw new Error('Cannot change GL context once initialized');
- }
- }
-
- /** Runs the callback using the shader. */
- run<T>(
- gl: WebGL2RenderingContext, flipVertically: boolean,
- callback: () => T): T {
- this.maybeInitGL(gl);
-
- if (!this.program) {
- this.setupShaders();
- }
-
- const shaderBuffers = this.getShaderBuffers(flipVertically);
- gl.useProgram(this.program!);
- shaderBuffers.bind();
- const result = callback();
- shaderBuffers.unbind();
-
- return result;
- }
-
- /**
- * Binds a framebuffer to the canvas. If the framebuffer does not yet exist,
- * creates it first. Binds the provided texture to the framebuffer.
- */
- bindFramebuffer(gl: WebGL2RenderingContext, texture: WebGLTexture): void {
- this.maybeInitGL(gl);
- if (!this.framebuffer) {
- this.framebuffer =
- assertNotNull(gl.createFramebuffer(), 'Failed to create framebuffer.');
- }
- gl.bindFramebuffer(gl.FRAMEBUFFER, this.framebuffer);
- gl.framebufferTexture2D(
- gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0);
- }
-
- unbindFramebuffer(): void {
- this.gl?.bindFramebuffer(this.gl.FRAMEBUFFER, null);
- }
-
- close() {
- if (this.program) {
- const gl = this.gl!;
- gl.deleteProgram(this.program);
- gl.deleteShader(this.vertexShader!);
- gl.deleteShader(this.fragmentShader!);
- }
- if (this.framebuffer) {
- this.gl!.deleteFramebuffer(this.framebuffer);
- }
- if (this.shaderBuffersPassthrough) {
- this.shaderBuffersPassthrough.close();
- }
- if (this.shaderBuffersFlipVertically) {
- this.shaderBuffersFlipVertically.close();
- }
- }
-}
-
-/** A four channel color with a red, green, blue and alpha values. */
-export type RGBAColor = [number, number, number, number];
-
-/**
- * An interface that can be used to provide custom conversion functions. These
- * functions are invoked to convert pixel values between different channel
- * counts and value ranges. Any conversion function that is not specified will
- * result in a default conversion.
- */
-export interface MPImageChannelConverter {
- /**
- * A conversion function to convert a number in the [0.0, 1.0] range to RGBA.
- * The output is an array with four elemeents whose values range from 0 to 255
- * inclusive.
- *
- * The default conversion function is `[v * 255, v * 255, v * 255, 255]`
- * and will log a warning if invoked.
- */
- floatToRGBAConverter?: (value: number) => RGBAColor;
-
- /*
- * A conversion function to convert a number in the [0, 255] range to RGBA.
- * The output is an array with four elemeents whose values range from 0 to 255
- * inclusive.
- *
- * The default conversion function is `[v, v , v , 255]` and will log a
- * warning if invoked.
- */
- uint8ToRGBAConverter?: (value: number) => RGBAColor;
-
- /**
- * A conversion function to convert an RGBA value in the range of 0 to 255 to
- * a single value in the [0.0, 1.0] range.
- *
- * The default conversion function is `(r / 3 + g / 3 + b / 3) / 255` and will
- * log a warning if invoked.
- */
- rgbaToFloatConverter?: (r: number, g: number, b: number, a: number) => number;
-
- /**
- * A conversion function to convert an RGBA value in the range of 0 to 255 to
- * a single value in the [0, 255] range.
- *
- * The default conversion function is `r / 3 + g / 3 + b / 3` and will log a
- * warning if invoked.
- */
- rgbaToUint8Converter?: (r: number, g: number, b: number, a: number) => number;
-
- /**
- * A conversion function to convert a single value in the 0.0 to 1.0 range to
- * [0, 255].
- *
- * The default conversion function is `r * 255` and will log a warning if
- * invoked.
- */
- floatToUint8Converter?: (value: number) => number;
-
- /**
- * A conversion function to convert a single value in the 0 to 255 range to
- * [0.0, 1.0] .
- *
- * The default conversion function is `r / 255` and will log a warning if
- * invoked.
- */
- uint8ToFloatConverter?: (value: number) => number;
-}
-/**
- * Color converter that falls back to a default implementation if the
- * user-provided converter does not specify a conversion.
- */
-class DefaultColorConverter implements Required<MPImageChannelConverter> {
- private static readonly WARNINGS_LOGGED = new Set<string>();
-
- constructor(private readonly customConverter: MPImageChannelConverter) {}
-
- floatToRGBAConverter(v: number): RGBAColor {
- if (this.customConverter.floatToRGBAConverter) {
- return this.customConverter.floatToRGBAConverter(v);
- }
- this.logWarningOnce('floatToRGBAConverter');
- return [v * 255, v * 255, v * 255, 255];
- }
-
- uint8ToRGBAConverter(v: number): RGBAColor {
- if (this.customConverter.uint8ToRGBAConverter) {
- return this.customConverter.uint8ToRGBAConverter(v);
- }
- this.logWarningOnce('uint8ToRGBAConverter');
- return [v, v, v, 255];
- }
-
- rgbaToFloatConverter(r: number, g: number, b: number, a: number): number {
- if (this.customConverter.rgbaToFloatConverter) {
- return this.customConverter.rgbaToFloatConverter(r, g, b, a);
- }
- this.logWarningOnce('rgbaToFloatConverter');
- return (r / 3 + g / 3 + b / 3) / 255;
- }
-
- rgbaToUint8Converter(r: number, g: number, b: number, a: number): number {
- if (this.customConverter.rgbaToUint8Converter) {
- return this.customConverter.rgbaToUint8Converter(r, g, b, a);
- }
- this.logWarningOnce('rgbaToUint8Converter');
- return r / 3 + g / 3 + b / 3;
- }
-
- floatToUint8Converter(v: number): number {
- if (this.customConverter.floatToUint8Converter) {
- return this.customConverter.floatToUint8Converter(v);
- }
- this.logWarningOnce('floatToUint8Converter');
- return v * 255;
- }
-
- uint8ToFloatConverter(v: number): number {
- if (this.customConverter.uint8ToFloatConverter) {
- return this.customConverter.uint8ToFloatConverter(v);
- }
- this.logWarningOnce('uint8ToFloatConverter');
- return v / 255;
- }
-
- private logWarningOnce(methodName: string): void {
- if (!DefaultColorConverter.WARNINGS_LOGGED.has(methodName)) {
- console.log(`Using default ${methodName}`);
- DefaultColorConverter.WARNINGS_LOGGED.add(methodName);
- }
- }
-}
+export type MPImageContainer = ImageData|ImageBitmap|WebGLTexture;
/**
* The wrapper class for MediaPipe Image objects.
*
* Images are stored as `ImageData`, `ImageBitmap` or `WebGLTexture` objects.
* You can convert the underlying type to any other type by passing the
- * desired type to `get()`. As type conversions can be expensive, it is
+ * desired type to `getAs...()`. As type conversions can be expensive, it is
* recommended to limit these conversions. You can verify what underlying
- * types are already available by invoking `has()`.
+ * types are already available by invoking `has...()`.
*
* Images that are returned from a MediaPipe Task are owned by the
* underlying C++ Task. If you need to extend the lifetime of these objects,
@@ -413,21 +48,10 @@ class DefaultColorConverter implements Required<MPImageChannelConverter> {
* initialized with an `OffscreenCanvas`. As we require WebGL2 support, this
* places some limitations on Browser support as outlined here:
* https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas/getContext
- *
- * Some MediaPipe tasks return single channel masks. These masks are stored
- * using an underlying `Uint8ClampedArray` an `Float32Array` (represented as
- * single-channel arrays). To convert these type to other formats a conversion
- * function is invoked to convert pixel values between single channel and four
- * channel RGBA values. To customize this conversion, you can specify these
- * conversion functions when you invoke `get()`. If you use the default
- * conversion function a warning will be logged to the console.
*/
export class MPImage {
private gl?: WebGL2RenderingContext;
- /** The underlying type of the image. */
- static TYPE = MPImageType;
-
/** @hideconstructor */
constructor(
private readonly containers: MPImageContainer[],
@@ -442,113 +66,60 @@ export class MPImage {
readonly height: number,
) {}
- /**
- * Returns whether this `MPImage` stores the image in the desired format.
- * This method can be called to reduce expensive conversion before invoking
- * `get()`.
- */
- has(type: MPImageType): boolean {
- return !!this.getContainer(type);
+ /** Returns whether this `MPImage` contains an image of type `ImageData`. */
+ hasImageData(): boolean {
+ return !!this.getContainer(MPImageType.IMAGE_DATA);
+ }
+
+ /** Returns whether this `MPImage` contains an image of type `ImageBitmap`. */
+ hasImageBitmap(): boolean {
+ return !!this.getContainer(MPImageType.IMAGE_BITMAP);
+ }
+
+ /** Returns whether this `MPImage` contains an image of type `WebGLTexture`. */
+ hasWebGLTexture(): boolean {
+ return !!this.getContainer(MPImageType.WEBGL_TEXTURE);
}
- /**
- * Returns the underlying image as a single channel `Uint8ClampedArray`. Note
- * that this involves an expensive GPU to CPU transfer if the current image is
- * only available as an `ImageBitmap` or `WebGLTexture`. If necessary, this
- * function converts RGBA data pixel-by-pixel to a single channel value by
- * invoking a conversion function (see class comment for detail).
- *
- * @param type The type of image to return.
- * @param converter A set of conversion functions that will be invoked to
- * convert the underlying pixel data if necessary. You may omit this
- * function if the requested conversion does not change the pixel format.
- * @return The current data as a Uint8ClampedArray.
- */
- get(type: MPImageType.UINT8_CLAMPED_ARRAY,
- converter?: MPImageChannelConverter): Uint8ClampedArray;
- /**
- * Returns the underlying image as a single channel `Float32Array`. Note
- * that this involves an expensive GPU to CPU transfer if the current image is
- * only available as an `ImageBitmap` or `WebGLTexture`. If necessary, this
- * function converts RGBA data pixel-by-pixel to a single channel value by
- * invoking a conversion function (see class comment for detail).
- *
- * @param type The type of image to return.
- * @param converter A set of conversion functions that will be invoked to
- * convert the underlying pixel data if necessary. You may omit this
- * function if the requested conversion does not change the pixel format.
- * @return The current image as a Float32Array.
- */
- get(type: MPImageType.FLOAT32_ARRAY,
- converter?: MPImageChannelConverter): Float32Array;
/**
* Returns the underlying image as an `ImageData` object. Note that this
* involves an expensive GPU to CPU transfer if the current image is only
- * available as an `ImageBitmap` or `WebGLTexture`. If necessary, this
- * function converts single channel pixel values to RGBA by invoking a
- * conversion function (see class comment for detail).
+ * available as an `ImageBitmap` or `WebGLTexture`.
*
* @return The current image as an ImageData object.
*/
- get(type: MPImageType.IMAGE_DATA,
- converter?: MPImageChannelConverter): ImageData;
+ getAsImageData(): ImageData {
+ return this.convertToImageData();
+ }
+
/**
* Returns the underlying image as an `ImageBitmap`. Note that
* conversions to `ImageBitmap` are expensive, especially if the data
- * currently resides on CPU. If necessary, this function first converts single
- * channel pixel values to RGBA by invoking a conversion function (see class
- * comment for detail).
+ * currently resides on CPU.
*
* Processing with `ImageBitmap`s requires that the MediaPipe Task was
* initialized with an `OffscreenCanvas` with WebGL2 support. See
* https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas/getContext
* for a list of supported platforms.
*
- * @param type The type of image to return.
- * @param converter A set of conversion functions that will be invoked to
- * convert the underlying pixel data if necessary. You may omit this
- * function if the requested conversion does not change the pixel format.
* @return The current image as an ImageBitmap object.
*/
- get(type: MPImageType.IMAGE_BITMAP,
- converter?: MPImageChannelConverter): ImageBitmap;
+ getAsImageBitmap(): ImageBitmap {
+ return this.convertToImageBitmap();
+ }
+
/**
* Returns the underlying image as a `WebGLTexture` object. Note that this
* involves a CPU to GPU transfer if the current image is only available as
* an `ImageData` object. The returned texture is bound to the current
* canvas (see `.canvas`).
*
- * @param type The type of image to return.
- * @param converter A set of conversion functions that will be invoked to
- * convert the underlying pixel data if necessary. You may omit this
- * function if the requested conversion does not change the pixel format.
* @return The current image as a WebGLTexture.
*/
- get(type: MPImageType.WEBGL_TEXTURE,
- converter?: MPImageChannelConverter): WebGLTexture;
- get(type?: MPImageType,
- converter?: MPImageChannelConverter): MPImageContainer {
- const internalConverter = new DefaultColorConverter(converter ?? {});
- switch (type) {
- case MPImageType.UINT8_CLAMPED_ARRAY:
- return this.convertToUint8ClampedArray(internalConverter);
- case MPImageType.FLOAT32_ARRAY:
- return this.convertToFloat32Array(internalConverter);
- case MPImageType.IMAGE_DATA:
- return this.convertToImageData(internalConverter);
- case MPImageType.IMAGE_BITMAP:
- return this.convertToImageBitmap(internalConverter);
- case MPImageType.WEBGL_TEXTURE:
- return this.convertToWebGLTexture(internalConverter);
- default:
- throw new Error(`Type is not supported: ${type}`);
- }
+ getAsWebGLTexture(): WebGLTexture {
+ return this.convertToWebGLTexture();
}
-
- private getContainer(type: MPImageType.UINT8_CLAMPED_ARRAY): Uint8ClampedArray
- |undefined;
- private getContainer(type: MPImageType.FLOAT32_ARRAY): Float32Array|undefined;
private getContainer(type: MPImageType.IMAGE_DATA): ImageData|undefined;
private getContainer(type: MPImageType.IMAGE_BITMAP): ImageBitmap|undefined;
private getContainer(type: MPImageType.WEBGL_TEXTURE): WebGLTexture|undefined;
@@ -556,16 +127,16 @@ export class MPImage {
/** Returns the container for the requested storage type iff it exists. */
private getContainer(type: MPImageType): MPImageContainer|undefined {
switch (type) {
- case MPImageType.UINT8_CLAMPED_ARRAY:
- return this.containers.find(img => img instanceof Uint8ClampedArray);
- case MPImageType.FLOAT32_ARRAY:
- return this.containers.find(img => img instanceof Float32Array);
case MPImageType.IMAGE_DATA:
return this.containers.find(img => img instanceof ImageData);
case MPImageType.IMAGE_BITMAP:
- return this.containers.find(img => img instanceof ImageBitmap);
+ return this.containers.find(
+ img => typeof ImageBitmap !== 'undefined' &&
+ img instanceof ImageBitmap);
case MPImageType.WEBGL_TEXTURE:
- return this.containers.find(img => img instanceof WebGLTexture);
+ return this.containers.find(
+ img => typeof WebGLTexture !== 'undefined' &&
+ img instanceof WebGLTexture);
default:
throw new Error(`Type is not supported: ${type}`);
}
@@ -586,11 +157,7 @@ export class MPImage {
for (const container of this.containers) {
let destinationContainer: MPImageContainer;
- if (container instanceof Uint8ClampedArray) {
- destinationContainer = new Uint8ClampedArray(container);
- } else if (container instanceof Float32Array) {
- destinationContainer = new Float32Array(container);
- } else if (container instanceof ImageData) {
+ if (container instanceof ImageData) {
destinationContainer =
new ImageData(container.data, this.width, this.height);
} else if (container instanceof WebGLTexture) {
@@ -619,7 +186,7 @@ export class MPImage {
this.unbindTexture();
} else if (container instanceof ImageBitmap) {
- this.convertToWebGLTexture(new DefaultColorConverter({}));
+ this.convertToWebGLTexture();
this.bindTexture();
destinationContainer = this.copyTextureToBitmap();
this.unbindTexture();
@@ -631,9 +198,8 @@ export class MPImage {
}
return new MPImage(
- destinationContainers, this.has(MPImageType.IMAGE_BITMAP),
- this.has(MPImageType.WEBGL_TEXTURE), this.canvas, this.shaderContext,
- this.width, this.height);
+ destinationContainers, this.hasImageBitmap(), this.hasWebGLTexture(),
+ this.canvas, this.shaderContext, this.width, this.height);
}
private getOffscreenCanvas(): OffscreenCanvas {
@@ -667,11 +233,10 @@ export class MPImage {
return this.shaderContext;
}
- private convertToImageBitmap(converter: Required<MPImageChannelConverter>):
- ImageBitmap {
+ private convertToImageBitmap(): ImageBitmap {
let imageBitmap = this.getContainer(MPImageType.IMAGE_BITMAP);
if (!imageBitmap) {
- this.convertToWebGLTexture(converter);
+ this.convertToWebGLTexture();
imageBitmap = this.convertWebGLTextureToImageBitmap();
this.containers.push(imageBitmap);
this.ownsImageBitmap = true;
@@ -680,115 +245,37 @@ export class MPImage {
return imageBitmap;
}
- private convertToImageData(converter: Required<MPImageChannelConverter>):
- ImageData {
+ private convertToImageData(): ImageData {
let imageData = this.getContainer(MPImageType.IMAGE_DATA);
if (!imageData) {
- if (this.has(MPImageType.UINT8_CLAMPED_ARRAY)) {
- const source = this.getContainer(MPImageType.UINT8_CLAMPED_ARRAY)!;
- const destination = new Uint8ClampedArray(this.width * this.height * 4);
- for (let i = 0; i < this.width * this.height; i++) {
- const rgba = converter.uint8ToRGBAConverter(source[i]);
- destination[i * 4] = rgba[0];
- destination[i * 4 + 1] = rgba[1];
- destination[i * 4 + 2] = rgba[2];
- destination[i * 4 + 3] = rgba[3];
- }
- imageData = new ImageData(destination, this.width, this.height);
- this.containers.push(imageData);
- } else if (this.has(MPImageType.FLOAT32_ARRAY)) {
- const source = this.getContainer(MPImageType.FLOAT32_ARRAY)!;
- const destination = new Uint8ClampedArray(this.width * this.height * 4);
- for (let i = 0; i < this.width * this.height; i++) {
- const rgba = converter.floatToRGBAConverter(source[i]);
- destination[i * 4] = rgba[0];
- destination[i * 4 + 1] = rgba[1];
- destination[i * 4 + 2] = rgba[2];
- destination[i * 4 + 3] = rgba[3];
- }
- imageData = new ImageData(destination, this.width, this.height);
- this.containers.push(imageData);
- } else if (
- this.has(MPImageType.IMAGE_BITMAP) ||
- this.has(MPImageType.WEBGL_TEXTURE)) {
- const gl = this.getGL();
- const shaderContext = this.getShaderContext();
- const pixels = new Uint8Array(this.width * this.height * 4);
+ const gl = this.getGL();
+ const shaderContext = this.getShaderContext();
+ const pixels = new Uint8Array(this.width * this.height * 4);
- // Create texture if needed
- const webGlTexture = this.convertToWebGLTexture(converter);
+ // Create texture if needed
+ const webGlTexture = this.convertToWebGLTexture();
- // Create a framebuffer from the texture and read back pixels
- shaderContext.bindFramebuffer(gl, webGlTexture);
- gl.readPixels(
- 0, 0, this.width, this.height, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
- shaderContext.unbindFramebuffer();
+ // Create a framebuffer from the texture and read back pixels
+ shaderContext.bindFramebuffer(gl, webGlTexture);
+ gl.readPixels(
+ 0, 0, this.width, this.height, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
+ shaderContext.unbindFramebuffer();
- imageData = new ImageData(
- new Uint8ClampedArray(pixels.buffer), this.width, this.height);
- this.containers.push(imageData);
- } else {
- throw new Error('Couldn\'t find backing image for ImageData conversion');
- }
+ imageData = new ImageData(
+ new Uint8ClampedArray(pixels.buffer), this.width, this.height);
+ this.containers.push(imageData);
}
return imageData;
}
- private convertToUint8ClampedArray(
- converter: Required<MPImageChannelConverter>): Uint8ClampedArray {
- let uint8ClampedArray = this.getContainer(MPImageType.UINT8_CLAMPED_ARRAY);
- if (!uint8ClampedArray) {
- if (this.has(MPImageType.FLOAT32_ARRAY)) {
- const source = this.getContainer(MPImageType.FLOAT32_ARRAY)!;
- uint8ClampedArray = new Uint8ClampedArray(
- source.map(v => converter.floatToUint8Converter(v)));
- } else {
- const source = this.convertToImageData(converter).data;
- uint8ClampedArray = new Uint8ClampedArray(this.width * this.height);
- for (let i = 0; i < this.width * this.height; i++) {
- uint8ClampedArray[i] = converter.rgbaToUint8Converter(
- source[i * 4], source[i * 4 + 1], source[i * 4 + 2],
- source[i * 4 + 3]);
- }
- }
- this.containers.push(uint8ClampedArray);
- }
-
- return uint8ClampedArray;
- }
-
- private convertToFloat32Array(converter: Required<MPImageChannelConverter>):
- Float32Array {
- let float32Array = this.getContainer(MPImageType.FLOAT32_ARRAY);
- if (!float32Array) {
- if (this.has(MPImageType.UINT8_CLAMPED_ARRAY)) {
- const source = this.getContainer(MPImageType.UINT8_CLAMPED_ARRAY)!;
- float32Array = new Float32Array(source).map(
- v => converter.uint8ToFloatConverter(v));
- } else {
- const source = this.convertToImageData(converter).data;
- float32Array = new Float32Array(this.width * this.height);
- for (let i = 0; i < this.width * this.height; i++) {
- float32Array[i] = converter.rgbaToFloatConverter(
- source[i * 4], source[i * 4 + 1], source[i * 4 + 2],
- source[i * 4 + 3]);
- }
- }
- this.containers.push(float32Array);
- }
-
- return float32Array;
- }
-
- private convertToWebGLTexture(converter: Required<MPImageChannelConverter>):
- WebGLTexture {
+ private convertToWebGLTexture(): WebGLTexture {
let webGLTexture = this.getContainer(MPImageType.WEBGL_TEXTURE);
if (!webGLTexture) {
const gl = this.getGL();
webGLTexture = this.bindTexture();
const source = this.getContainer(MPImageType.IMAGE_BITMAP) ||
- this.convertToImageData(converter);
+ this.convertToImageData();
gl.texImage2D(
gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, source);
this.unbindTexture();
diff --git a/mediapipe/tasks/web/vision/core/image_shader_context.ts b/mediapipe/tasks/web/vision/core/image_shader_context.ts
new file mode 100644
index 000000000..eb17d001a
--- /dev/null
+++ b/mediapipe/tasks/web/vision/core/image_shader_context.ts
@@ -0,0 +1,243 @@
+/**
+ * Copyright 2023 The MediaPipe Authors.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+const VERTEX_SHADER = `
+ attribute vec2 aVertex;
+ attribute vec2 aTex;
+ varying vec2 vTex;
+ void main(void) {
+ gl_Position = vec4(aVertex, 0.0, 1.0);
+ vTex = aTex;
+ }`;
+
+const FRAGMENT_SHADER = `
+ precision mediump float;
+ varying vec2 vTex;
+ uniform sampler2D inputTexture;
+ void main() {
+ gl_FragColor = texture2D(inputTexture, vTex);
+ }
+ `;
+
+/** Helper to assert that `value` is not null. */
+export function assertNotNull<T>(value: T|null, msg: string): T {
+ if (value === null) {
+ throw new Error(`Unable to obtain required WebGL resource: ${msg}`);
+ }
+ return value;
+}
+
+/**
+ * Utility class that encapsulates the buffers used by `MPImageShaderContext`.
+ * For internal use only.
+ */
+class MPImageShaderBuffers {
+ constructor(
+ private readonly gl: WebGL2RenderingContext,
+ private readonly vertexArrayObject: WebGLVertexArrayObject,
+ private readonly vertexBuffer: WebGLBuffer,
+ private readonly textureBuffer: WebGLBuffer) {}
+
+ bind() {
+ this.gl.bindVertexArray(this.vertexArrayObject);
+ }
+
+ unbind() {
+ this.gl.bindVertexArray(null);
+ }
+
+ close() {
+ this.gl.deleteVertexArray(this.vertexArrayObject);
+ this.gl.deleteBuffer(this.vertexBuffer);
+ this.gl.deleteBuffer(this.textureBuffer);
+ }
+}
+
+/**
+ * A class that encapsulates the shaders used by an MPImage. Can be re-used
+ * across MPImages that use the same WebGL2Rendering context.
+ *
+ * For internal use only.
+ */
+export class MPImageShaderContext {
+ private gl?: WebGL2RenderingContext;
+ private framebuffer?: WebGLFramebuffer;
+ private program?: WebGLProgram;
+ private vertexShader?: WebGLShader;
+ private fragmentShader?: WebGLShader;
+ private aVertex?: GLint;
+ private aTex?: GLint;
+
+ /**
+ * The shader buffers used for passthrough renders that don't modify the
+ * input texture.
+ */
+ private shaderBuffersPassthrough?: MPImageShaderBuffers;
+
+ /**
+ * The shader buffers used for passthrough renders that flip the input texture
+ * vertically before conversion to a different type. This is used to flip the
+ * texture to the expected orientation for drawing in the browser.
+ */
+ private shaderBuffersFlipVertically?: MPImageShaderBuffers;
+
+ private compileShader(source: string, type: number): WebGLShader {
+ const gl = this.gl!;
+ const shader =
+ assertNotNull(gl.createShader(type), 'Failed to create WebGL shader');
+ gl.shaderSource(shader, source);
+ gl.compileShader(shader);
+ if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
+ const info = gl.getShaderInfoLog(shader);
+ throw new Error(`Could not compile WebGL shader: ${info}`);
+ }
+ gl.attachShader(this.program!, shader);
+ return shader;
+ }
+
+ private setupShaders(): void {
+ const gl = this.gl!;
+ this.program =
+ assertNotNull(gl.createProgram()!, 'Failed to create WebGL program');
+
+ this.vertexShader = this.compileShader(VERTEX_SHADER, gl.VERTEX_SHADER);
+ this.fragmentShader =
+ this.compileShader(FRAGMENT_SHADER, gl.FRAGMENT_SHADER);
+
+ gl.linkProgram(this.program);
+ const linked = gl.getProgramParameter(this.program, gl.LINK_STATUS);
+ if (!linked) {
+ const info = gl.getProgramInfoLog(this.program);
+ throw new Error(`Error during program linking: ${info}`);
+ }
+
+ this.aVertex = gl.getAttribLocation(this.program, 'aVertex');
+ this.aTex = gl.getAttribLocation(this.program, 'aTex');
+ }
+
+ private createBuffers(flipVertically: boolean): MPImageShaderBuffers {
+ const gl = this.gl!;
+ const vertexArrayObject =
+ assertNotNull(gl.createVertexArray(), 'Failed to create vertex array');
+ gl.bindVertexArray(vertexArrayObject);
+
+ const vertexBuffer =
+ assertNotNull(gl.createBuffer(), 'Failed to create buffer');
+ gl.bindBuffer(gl.ARRAY_BUFFER, vertexBuffer);
+ gl.enableVertexAttribArray(this.aVertex!);
+ gl.vertexAttribPointer(this.aVertex!, 2, gl.FLOAT, false, 0, 0);
+ gl.bufferData(
+ gl.ARRAY_BUFFER, new Float32Array([-1, -1, -1, 1, 1, 1, 1, -1]),
+ gl.STATIC_DRAW);
+
+ const textureBuffer =
+ assertNotNull(gl.createBuffer(), 'Failed to create buffer');
+ gl.bindBuffer(gl.ARRAY_BUFFER, textureBuffer);
+ gl.enableVertexAttribArray(this.aTex!);
+ gl.vertexAttribPointer(this.aTex!, 2, gl.FLOAT, false, 0, 0);
+
+ const bufferData =
+ flipVertically ? [0, 1, 0, 0, 1, 0, 1, 1] : [0, 0, 0, 1, 1, 1, 1, 0];
+ gl.bufferData(
+ gl.ARRAY_BUFFER, new Float32Array(bufferData), gl.STATIC_DRAW);
+
+ gl.bindBuffer(gl.ARRAY_BUFFER, null);
+ gl.bindVertexArray(null);
+
+ return new MPImageShaderBuffers(
+ gl, vertexArrayObject, vertexBuffer, textureBuffer);
+ }
+
+ private getShaderBuffers(flipVertically: boolean): MPImageShaderBuffers {
+ if (flipVertically) {
+ if (!this.shaderBuffersFlipVertically) {
+ this.shaderBuffersFlipVertically =
+ this.createBuffers(/* flipVertically= */ true);
+ }
+ return this.shaderBuffersFlipVertically;
+ } else {
+ if (!this.shaderBuffersPassthrough) {
+ this.shaderBuffersPassthrough =
+ this.createBuffers(/* flipVertically= */ false);
+ }
+ return this.shaderBuffersPassthrough;
+ }
+ }
+
+ private maybeInitGL(gl: WebGL2RenderingContext): void {
+ if (!this.gl) {
+ this.gl = gl;
+ } else if (gl !== this.gl) {
+ throw new Error('Cannot change GL context once initialized');
+ }
+ }
+
+ /** Runs the callback using the shader. */
+ run<T>(
+ gl: WebGL2RenderingContext, flipVertically: boolean,
+ callback: () => T): T {
+ this.maybeInitGL(gl);
+
+ if (!this.program) {
+ this.setupShaders();
+ }
+
+ const shaderBuffers = this.getShaderBuffers(flipVertically);
+ gl.useProgram(this.program!);
+ shaderBuffers.bind();
+ const result = callback();
+ shaderBuffers.unbind();
+
+ return result;
+ }
+
+ /**
+ * Binds a framebuffer to the canvas. If the framebuffer does not yet exist,
+ * creates it first. Binds the provided texture to the framebuffer.
+ */
+ bindFramebuffer(gl: WebGL2RenderingContext, texture: WebGLTexture): void {
+ this.maybeInitGL(gl);
+ if (!this.framebuffer) {
+ this.framebuffer =
+ assertNotNull(gl.createFramebuffer(), 'Failed to create framebuffer.');
+ }
+ gl.bindFramebuffer(gl.FRAMEBUFFER, this.framebuffer);
+ gl.framebufferTexture2D(
+ gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0);
+ }
+
+ unbindFramebuffer(): void {
+ this.gl?.bindFramebuffer(this.gl.FRAMEBUFFER, null);
+ }
+
+ close() {
+ if (this.program) {
+ const gl = this.gl!;
+ gl.deleteProgram(this.program);
+ gl.deleteShader(this.vertexShader!);
+ gl.deleteShader(this.fragmentShader!);
+ }
+ if (this.framebuffer) {
+ this.gl!.deleteFramebuffer(this.framebuffer);
+ }
+ if (this.shaderBuffersPassthrough) {
+ this.shaderBuffersPassthrough.close();
+ }
+ if (this.shaderBuffersFlipVertically) {
+ this.shaderBuffersFlipVertically.close();
+ }
+ }
+}
diff --git a/mediapipe/tasks/web/vision/core/mask.test.ts b/mediapipe/tasks/web/vision/core/mask.test.ts
new file mode 100644
index 000000000..b632f2dc5
--- /dev/null
+++ b/mediapipe/tasks/web/vision/core/mask.test.ts
@@ -0,0 +1,269 @@
+/**
+ * Copyright 2022 The MediaPipe Authors.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import 'jasmine';
+
+import {MPImageShaderContext} from './image_shader_context';
+import {MPMask} from './mask';
+
+const WIDTH = 2;
+const HEIGHT = 2;
+
+const skip = typeof document === 'undefined';
+if (skip) {
+ console.log('These tests must be run in a browser.');
+}
+
+/** The mask types supported by MPMask. */
+type MaskType = Uint8Array|Float32Array|WebGLTexture;
+
+const MASK_2_1 = [1, 2];
+const MASK_2_2 = [1, 2, 3, 4];
+const MASK_2_3 = [1, 2, 3, 4, 5, 6];
+
+/** The test images and data to use for the unit tests below. */
+class MPMaskTestContext {
+ canvas!: OffscreenCanvas;
+ gl!: WebGL2RenderingContext;
+ uint8Array!: Uint8Array;
+ float32Array!: Float32Array;
+ webGLTexture!: WebGLTexture;
+
+ async init(pixels = MASK_2_2, width = WIDTH, height = HEIGHT): Promise<void> {
+ // Initialize a canvas with default dimensions. Note that the canvas size
+ // can be different from the mask size.
+ this.canvas = new OffscreenCanvas(WIDTH, HEIGHT);
+ this.gl = this.canvas.getContext('webgl2') as WebGL2RenderingContext;
+
+ const gl = this.gl;
+ if (!gl.getExtension('EXT_color_buffer_float')) {
+ throw new Error('Missing required EXT_color_buffer_float extension');
+ }
+
+ this.uint8Array = new Uint8Array(pixels);
+ this.float32Array = new Float32Array(pixels.length);
+ for (let i = 0; i < this.uint8Array.length; ++i) {
+ this.float32Array[i] = pixels[i] / 255;
+ }
+
+ this.webGLTexture = gl.createTexture()!;
+
+ gl.bindTexture(gl.TEXTURE_2D, this.webGLTexture);
+ gl.texImage2D(
+ gl.TEXTURE_2D, 0, gl.R32F, width, height, 0, gl.RED, gl.FLOAT,
+ new Float32Array(pixels).map(v => v / 255));
+ gl.bindTexture(gl.TEXTURE_2D, null);
+ }
+
+ get(type: unknown) {
+ switch (type) {
+ case Uint8Array:
+ return this.uint8Array;
+ case Float32Array:
+ return this.float32Array;
+ case WebGLTexture:
+ return this.webGLTexture;
+ default:
+ throw new Error(`Unsupported type: ${type}`);
+ }
+ }
+
+ close(): void {
+ this.gl.deleteTexture(this.webGLTexture);
+ }
+}
+
+(skip ? xdescribe : describe)('MPMask', () => {
+ const context = new MPMaskTestContext();
+
+ afterEach(() => {
+ context.close();
+ });
+
+ function readPixelsFromWebGLTexture(texture: WebGLTexture): Float32Array {
+ const pixels = new Float32Array(WIDTH * HEIGHT);
+
+ const gl = context.gl;
+ gl.bindTexture(gl.TEXTURE_2D, texture);
+
+ const framebuffer = gl.createFramebuffer()!;
+ gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
+ gl.framebufferTexture2D(
+ gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0);
+ gl.readPixels(0, 0, WIDTH, HEIGHT, gl.RED, gl.FLOAT, pixels);
+ gl.bindFramebuffer(gl.FRAMEBUFFER, null);
+ gl.deleteFramebuffer(framebuffer);
+
+ gl.bindTexture(gl.TEXTURE_2D, null);
+
+ // Sanity check values
+ expect(pixels[0]).not.toBe(0);
+
+ return pixels;
+ }
+
+ function assertEquality(mask: MPMask, expected: MaskType): void {
+ if (expected instanceof Uint8Array) {
+ const result = mask.getAsUint8Array();
+ expect(result).toEqual(expected);
+ } else if (expected instanceof Float32Array) {
+ const result = mask.getAsFloat32Array();
+ expect(result).toEqual(expected);
+ } else { // WebGLTexture
+ const result = mask.getAsWebGLTexture();
+ expect(readPixelsFromWebGLTexture(result))
+ .toEqual(readPixelsFromWebGLTexture(expected));
+ }
+ }
+
+ function createImage(
+ shaderContext: MPImageShaderContext, input: MaskType, width: number,
+ height: number): MPMask {
+ return new MPMask(
+ [input],
+ /* ownsWebGLTexture= */ false, context.canvas, shaderContext, width,
+ height);
+ }
+
+ function runConversionTest(
+ input: MaskType, output: MaskType, width = WIDTH, height = HEIGHT): void {
+ const shaderContext = new MPImageShaderContext();
+ const mask = createImage(shaderContext, input, width, height);
+ assertEquality(mask, output);
+ mask.close();
+ shaderContext.close();
+ }
+
+ function runCloneTest(input: MaskType): void {
+ const shaderContext = new MPImageShaderContext();
+ const mask = createImage(shaderContext, input, WIDTH, HEIGHT);
+ const clone = mask.clone();
+ assertEquality(clone, input);
+ clone.close();
+ shaderContext.close();
+ }
+
+ const sources = skip ? [] : [Uint8Array, Float32Array, WebGLTexture];
+
+ for (let i = 0; i < sources.length; i++) {
+ for (let j = 0; j < sources.length; j++) {
+ it(`converts from ${sources[i].name} to ${sources[j].name}`, async () => {
+ await context.init();
+ runConversionTest(context.get(sources[i]), context.get(sources[j]));
+ });
+ }
+ }
+
+ for (let i = 0; i < sources.length; i++) {
+ it(`clones ${sources[i].name}`, async () => {
+ await context.init();
+ runCloneTest(context.get(sources[i]));
+ });
+ }
+
+ it(`does not flip textures twice`, async () => {
+ await context.init();
+
+ const shaderContext = new MPImageShaderContext();
+ const mask = new MPMask(
+ [context.webGLTexture],
+ /* ownsWebGLTexture= */ false, context.canvas, shaderContext, WIDTH,
+ HEIGHT);
+
+ const result = mask.clone().getAsUint8Array();
+ expect(result).toEqual(context.uint8Array);
+ shaderContext.close();
+ });
+
+ it(`can clone and get mask`, async () => {
+ await context.init();
+
+ const shaderContext = new MPImageShaderContext();
+ const mask = new MPMask(
+ [context.webGLTexture],
+ /* ownsWebGLTexture= */ false, context.canvas, shaderContext, WIDTH,
+ HEIGHT);
+
+ // Verify that we can mix the different shader modes by running them out of
+ // order.
+ let result = mask.getAsUint8Array();
+ expect(result).toEqual(context.uint8Array);
+
+ result = mask.clone().getAsUint8Array();
+ expect(result).toEqual(context.uint8Array);
+
+ result = mask.getAsUint8Array();
+ expect(result).toEqual(context.uint8Array);
+
+ shaderContext.close();
+ });
+
+ it('supports has()', async () => {
+ await context.init();
+
+ const shaderContext = new MPImageShaderContext();
+ const mask = createImage(shaderContext, context.uint8Array, WIDTH, HEIGHT);
+
+ expect(mask.hasUint8Array()).toBe(true);
+ expect(mask.hasFloat32Array()).toBe(false);
+ expect(mask.hasWebGLTexture()).toBe(false);
+
+ mask.getAsFloat32Array();
+
+ expect(mask.hasUint8Array()).toBe(true);
+ expect(mask.hasFloat32Array()).toBe(true);
+ expect(mask.hasWebGLTexture()).toBe(false);
+
+ mask.getAsWebGLTexture();
+
+ expect(mask.hasUint8Array()).toBe(true);
+ expect(mask.hasFloat32Array()).toBe(true);
+ expect(mask.hasWebGLTexture()).toBe(true);
+
+ mask.close();
+ shaderContext.close();
+ });
+
+ it('supports mask that is smaller than the canvas', async () => {
+ await context.init(MASK_2_1, /* width= */ 2, /* height= */ 1);
+
+ runConversionTest(
+ context.uint8Array, context.webGLTexture, /* width= */ 2,
+ /* height= */ 1);
+ runConversionTest(
+ context.webGLTexture, context.float32Array, /* width= */ 2,
+ /* height= */ 1);
+ runConversionTest(
+ context.float32Array, context.uint8Array, /* width= */ 2,
+ /* height= */ 1);
+
+ context.close();
+ });
+
+ it('supports mask that is larger than the canvas', async () => {
+ await context.init(MASK_2_3, /* width= */ 2, /* height= */ 3);
+
+ runConversionTest(
+ context.uint8Array, context.webGLTexture, /* width= */ 2,
+ /* height= */ 3);
+ runConversionTest(
+ context.webGLTexture, context.float32Array, /* width= */ 2,
+ /* height= */ 3);
+ runConversionTest(
+ context.float32Array, context.uint8Array, /* width= */ 2,
+ /* height= */ 3);
+ });
+});
diff --git a/mediapipe/tasks/web/vision/core/mask.ts b/mediapipe/tasks/web/vision/core/mask.ts
new file mode 100644
index 000000000..da14f104f
--- /dev/null
+++ b/mediapipe/tasks/web/vision/core/mask.ts
@@ -0,0 +1,315 @@
+/**
+ * Copyright 2023 The MediaPipe Authors.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import {assertNotNull, MPImageShaderContext} from '../../../../tasks/web/vision/core/image_shader_context';
+
+/** The underlying type of the mask. */
+enum MPMaskType {
+  /** Represents the native `Uint8Array` type. */
+ UINT8_ARRAY,
+ /** Represents the native `Float32Array` type. */
+ FLOAT32_ARRAY,
+ /** Represents the native `WebGLTexture` type. */
+ WEBGL_TEXTURE
+}
+
+/** The supported mask formats. For internal usage. */
+export type MPMaskContainer = Uint8Array|Float32Array|WebGLTexture;
+
+/**
+ * The wrapper class for MediaPipe segmentation masks.
+ *
+ * Masks are stored as `Uint8Array`, `Float32Array` or `WebGLTexture` objects.
+ * You can convert the underlying type to any other type by passing the desired
+ * type to `getAs...()`. As type conversions can be expensive, it is recommended
+ * to limit these conversions. You can verify what underlying types are already
+ * available by invoking `has...()`.
+ *
+ * Masks that are returned from a MediaPipe Task are owned by the
+ * underlying C++ Task. If you need to extend the lifetime of these objects,
+ * you can invoke the `clone()` method. To free up the resources obtained
+ * during any clone or type conversion operation, it is important to invoke
+ * `close()` on the `MPMask` instance.
+ */
+export class MPMask {
+ private gl?: WebGL2RenderingContext;
+
+ /** @hideconstructor */
+ constructor(
+ private readonly containers: MPMaskContainer[],
+ private ownsWebGLTexture: boolean,
+ /** Returns the canvas element that the mask is bound to. */
+ readonly canvas: HTMLCanvasElement|OffscreenCanvas|undefined,
+ private shaderContext: MPImageShaderContext|undefined,
+ /** Returns the width of the mask. */
+ readonly width: number,
+ /** Returns the height of the mask. */
+ readonly height: number,
+ ) {}
+
+ /** Returns whether this `MPMask` contains a mask of type `Uint8Array`. */
+ hasUint8Array(): boolean {
+ return !!this.getContainer(MPMaskType.UINT8_ARRAY);
+ }
+
+ /** Returns whether this `MPMask` contains a mask of type `Float32Array`. */
+ hasFloat32Array(): boolean {
+ return !!this.getContainer(MPMaskType.FLOAT32_ARRAY);
+ }
+
+ /** Returns whether this `MPMask` contains a mask of type `WebGLTexture`. */
+ hasWebGLTexture(): boolean {
+ return !!this.getContainer(MPMaskType.WEBGL_TEXTURE);
+ }
+
+ /**
+   * Returns the underlying mask as a `Uint8Array`. Note that this involves an
+ * expensive GPU to CPU transfer if the current mask is only available as a
+ * `WebGLTexture`.
+ *
+ * @return The current data as a Uint8Array.
+ */
+ getAsUint8Array(): Uint8Array {
+ return this.convertToUint8Array();
+ }
+
+ /**
+ * Returns the underlying mask as a single channel `Float32Array`. Note that
+ * this involves an expensive GPU to CPU transfer if the current mask is only
+ * available as a `WebGLTexture`.
+ *
+ * @return The current mask as a Float32Array.
+ */
+ getAsFloat32Array(): Float32Array {
+ return this.convertToFloat32Array();
+ }
+
+ /**
+ * Returns the underlying mask as a `WebGLTexture` object. Note that this
+ * involves a CPU to GPU transfer if the current mask is only available as
+ * a CPU array. The returned texture is bound to the current canvas (see
+ * `.canvas`).
+ *
+ * @return The current mask as a WebGLTexture.
+ */
+ getAsWebGLTexture(): WebGLTexture {
+ return this.convertToWebGLTexture();
+ }
+
+ private getContainer(type: MPMaskType.UINT8_ARRAY): Uint8Array|undefined;
+ private getContainer(type: MPMaskType.FLOAT32_ARRAY): Float32Array|undefined;
+ private getContainer(type: MPMaskType.WEBGL_TEXTURE): WebGLTexture|undefined;
+ private getContainer(type: MPMaskType): MPMaskContainer|undefined;
+ /** Returns the container for the requested storage type iff it exists. */
+ private getContainer(type: MPMaskType): MPMaskContainer|undefined {
+ switch (type) {
+ case MPMaskType.UINT8_ARRAY:
+ return this.containers.find(img => img instanceof Uint8Array);
+ case MPMaskType.FLOAT32_ARRAY:
+ return this.containers.find(img => img instanceof Float32Array);
+ case MPMaskType.WEBGL_TEXTURE:
+ return this.containers.find(
+ img => typeof WebGLTexture !== 'undefined' &&
+ img instanceof WebGLTexture);
+ default:
+ throw new Error(`Type is not supported: ${type}`);
+ }
+ }
+
+ /**
+ * Creates a copy of the resources stored in this `MPMask`. You can
+ * invoke this method to extend the lifetime of a mask returned by a
+ * MediaPipe Task. Note that performance critical applications should aim to
+ * only use the `MPMask` within the MediaPipe Task callback so that
+ * copies can be avoided.
+ */
+ clone(): MPMask {
+ const destinationContainers: MPMaskContainer[] = [];
+
+    // TODO: We might only want to clone one backing data structure
+    // even if multiple are defined.
+ for (const container of this.containers) {
+ let destinationContainer: MPMaskContainer;
+
+ if (container instanceof Uint8Array) {
+ destinationContainer = new Uint8Array(container);
+ } else if (container instanceof Float32Array) {
+ destinationContainer = new Float32Array(container);
+ } else if (container instanceof WebGLTexture) {
+ const gl = this.getGL();
+ const shaderContext = this.getShaderContext();
+
+ // Create a new texture and use it to back a framebuffer
+ gl.activeTexture(gl.TEXTURE1);
+ destinationContainer =
+ assertNotNull(gl.createTexture(), 'Failed to create texture');
+ gl.bindTexture(gl.TEXTURE_2D, destinationContainer);
+ gl.texImage2D(
+ gl.TEXTURE_2D, 0, gl.R32F, this.width, this.height, 0, gl.RED,
+ gl.FLOAT, null);
+ gl.bindTexture(gl.TEXTURE_2D, null);
+
+ shaderContext.bindFramebuffer(gl, destinationContainer);
+ shaderContext.run(gl, /* flipVertically= */ false, () => {
+ this.bindTexture(); // This activates gl.TEXTURE0
+ gl.clearColor(0, 0, 0, 0);
+ gl.clear(gl.COLOR_BUFFER_BIT);
+ gl.drawArrays(gl.TRIANGLE_FAN, 0, 4);
+ this.unbindTexture();
+ });
+ shaderContext.unbindFramebuffer();
+
+ this.unbindTexture();
+ } else {
+ throw new Error(`Type is not supported: ${container}`);
+ }
+
+ destinationContainers.push(destinationContainer);
+ }
+
+ return new MPMask(
+ destinationContainers, this.hasWebGLTexture(), this.canvas,
+ this.shaderContext, this.width, this.height);
+ }
+
+ private getGL(): WebGL2RenderingContext {
+ if (!this.canvas) {
+ throw new Error(
+          'Conversion to different image formats requires that a canvas ' +
+          'is passed when initializing the image.');
+ }
+ if (!this.gl) {
+ this.gl = assertNotNull(
+ this.canvas.getContext('webgl2') as WebGL2RenderingContext | null,
+ 'You cannot use a canvas that is already bound to a different ' +
+ 'type of rendering context.');
+ }
+ const ext = this.gl.getExtension('EXT_color_buffer_float');
+ if (!ext) {
+ // TODO: Ensure this works on iOS
+ throw new Error('Missing required EXT_color_buffer_float extension');
+ }
+ return this.gl;
+ }
+
+ private getShaderContext(): MPImageShaderContext {
+ if (!this.shaderContext) {
+ this.shaderContext = new MPImageShaderContext();
+ }
+ return this.shaderContext;
+ }
+
+ private convertToFloat32Array(): Float32Array {
+ let float32Array = this.getContainer(MPMaskType.FLOAT32_ARRAY);
+ if (!float32Array) {
+ const uint8Array = this.getContainer(MPMaskType.UINT8_ARRAY);
+ if (uint8Array) {
+ float32Array = new Float32Array(uint8Array).map(v => v / 255);
+ } else {
+ const gl = this.getGL();
+ const shaderContext = this.getShaderContext();
+ float32Array = new Float32Array(this.width * this.height);
+
+ // Create texture if needed
+ const webGlTexture = this.convertToWebGLTexture();
+
+ // Create a framebuffer from the texture and read back pixels
+ shaderContext.bindFramebuffer(gl, webGlTexture);
+ gl.readPixels(
+ 0, 0, this.width, this.height, gl.RED, gl.FLOAT, float32Array);
+ shaderContext.unbindFramebuffer();
+ }
+ this.containers.push(float32Array);
+ }
+
+ return float32Array;
+ }
+
+ private convertToUint8Array(): Uint8Array {
+ let uint8Array = this.getContainer(MPMaskType.UINT8_ARRAY);
+ if (!uint8Array) {
+ const floatArray = this.convertToFloat32Array();
+ uint8Array = new Uint8Array(floatArray.map(v => 255 * v));
+ this.containers.push(uint8Array);
+ }
+ return uint8Array;
+ }
+
+ private convertToWebGLTexture(): WebGLTexture {
+ let webGLTexture = this.getContainer(MPMaskType.WEBGL_TEXTURE);
+ if (!webGLTexture) {
+ const gl = this.getGL();
+ webGLTexture = this.bindTexture();
+
+ const data = this.convertToFloat32Array();
+ // TODO: Add support for R16F to support iOS
+ gl.texImage2D(
+ gl.TEXTURE_2D, 0, gl.R32F, this.width, this.height, 0, gl.RED,
+ gl.FLOAT, data);
+ this.unbindTexture();
+ }
+
+ return webGLTexture;
+ }
+
+ /**
+ * Binds the backing texture to the canvas. If the texture does not yet
+ * exist, creates it first.
+ */
+ private bindTexture(): WebGLTexture {
+ const gl = this.getGL();
+
+ gl.viewport(0, 0, this.width, this.height);
+ gl.activeTexture(gl.TEXTURE0);
+
+ let webGLTexture = this.getContainer(MPMaskType.WEBGL_TEXTURE);
+ if (!webGLTexture) {
+ webGLTexture =
+ assertNotNull(gl.createTexture(), 'Failed to create texture');
+ this.containers.push(webGLTexture);
+ this.ownsWebGLTexture = true;
+ }
+
+ gl.bindTexture(gl.TEXTURE_2D, webGLTexture);
+ // TODO: Ideally, we would only set these once per texture and
+ // not once every frame.
+ gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
+ gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
+ gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
+ gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
+
+ return webGLTexture;
+ }
+
+ private unbindTexture(): void {
+ this.gl!.bindTexture(this.gl!.TEXTURE_2D, null);
+ }
+
+ /**
+ * Frees up any resources owned by this `MPMask` instance.
+ *
+ * Note that this method does not free masks that are owned by the C++
+ * Task, as these are freed automatically once you leave the MediaPipe
+ * callback. Additionally, some shared state is freed only once you invoke
+ * the Task's `close()` method.
+ */
+ close(): void {
+ if (this.ownsWebGLTexture) {
+ const gl = this.getGL();
+ gl.deleteTexture(this.getContainer(MPMaskType.WEBGL_TEXTURE)!);
+ }
+ }
+}
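For orientation, here is a minimal sketch of how the `MPMask` API introduced above is meant to be used from application code. The import path assumes the published npm bundle and `handleMask` is an illustrative helper, not part of this change.

```ts
import {MPMask} from '@mediapipe/tasks-vision';  // assumed npm package name

// `mask` is an MPMask received inside a MediaPipe Task callback.
function handleMask(mask: MPMask): void {
  // Prefer a container that already exists; conversions can be expensive.
  if (mask.hasFloat32Array()) {
    const confidences = mask.getAsFloat32Array();
    console.log(`confidence at (0, 0): ${confidences[0]}`);
  }

  // The task owns the mask; clone() it to keep the data beyond the callback.
  const copy = mask.clone();
  const values = copy.getAsUint8Array();  // may trigger a GPU-to-CPU transfer
  console.log(`mask is ${copy.width}x${copy.height}, ${values.length} values`);
  copy.close();  // frees resources created by clone() and conversions
}
```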
diff --git a/mediapipe/tasks/web/vision/core/render_utils.ts b/mediapipe/tasks/web/vision/core/render_utils.ts
index 05f2a4df1..ebb3be16a 100644
--- a/mediapipe/tasks/web/vision/core/render_utils.ts
+++ b/mediapipe/tasks/web/vision/core/render_utils.ts
@@ -16,8 +16,6 @@
* limitations under the License.
*/
-import {MPImageChannelConverter} from '../../../../tasks/web/vision/core/image';
-
// Pre-baked color table for a maximum of 12 classes.
const CM_ALPHA = 128;
const COLOR_MAP: Array<[number, number, number, number]> = [
@@ -35,8 +33,37 @@ const COLOR_MAP: Array<[number, number, number, number]> = [
[255, 255, 255, CM_ALPHA] // class 11 is white; could do black instead?
];
-/** The color converter we use in our demos. */
-export const RENDER_UTIL_CONVERTER: MPImageChannelConverter = {
- floatToRGBAConverter: v => [128, 0, 0, v * 255],
- uint8ToRGBAConverter: v => COLOR_MAP[v % COLOR_MAP.length],
-};
+
+/** Helper function to draw a confidence mask */
+export function drawConfidenceMask(
+ ctx: CanvasRenderingContext2D, image: Float32Array, width: number,
+ height: number): void {
+ const uint8Array = new Uint8ClampedArray(width * height * 4);
+ for (let i = 0; i < image.length; i++) {
+ uint8Array[4 * i] = 128;
+ uint8Array[4 * i + 1] = 0;
+ uint8Array[4 * i + 2] = 0;
+ uint8Array[4 * i + 3] = image[i] * 255;
+ }
+ ctx.putImageData(new ImageData(uint8Array, width, height), 0, 0);
+}
+
+/**
+ * Helper function to draw a category mask. For GPU, we only have F32Arrays
+ * for now.
+ */
+export function drawCategoryMask(
+ ctx: CanvasRenderingContext2D, image: Uint8Array|Float32Array,
+ width: number, height: number): void {
+ const rgbaArray = new Uint8ClampedArray(width * height * 4);
+ const isFloatArray = image instanceof Float32Array;
+ for (let i = 0; i < image.length; i++) {
+ const colorIndex = isFloatArray ? Math.round(image[i] * 255) : image[i];
+ const color = COLOR_MAP[colorIndex % COLOR_MAP.length];
+ rgbaArray[4 * i] = color[0];
+ rgbaArray[4 * i + 1] = color[1];
+ rgbaArray[4 * i + 2] = color[2];
+ rgbaArray[4 * i + 3] = color[3];
+ }
+ ctx.putImageData(new ImageData(rgbaArray, width, height), 0, 0);
+}
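A short sketch of how the new drawing helpers can be combined with `MPMask`. It assumes the two functions above are in scope and that `ImageSegmenterResult` (see the segmenter changes later in this patch) is available; `renderResult` is an illustrative name.

```ts
// Draws a segmentation result onto a 2D canvas using the helpers above.
function renderResult(
    canvas: HTMLCanvasElement, result: ImageSegmenterResult): void {
  const ctx = canvas.getContext('2d')!;

  const categoryMask = result.categoryMask;
  if (categoryMask) {
    drawCategoryMask(
        ctx, categoryMask.getAsUint8Array(), categoryMask.width,
        categoryMask.height);
  }

  const confidenceMask = result.confidenceMasks?.[0];
  if (confidenceMask) {
    drawConfidenceMask(
        ctx, confidenceMask.getAsFloat32Array(), confidenceMask.width,
        confidenceMask.height);
  }
}
```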
diff --git a/mediapipe/tasks/web/vision/core/types.d.ts b/mediapipe/tasks/web/vision/core/types.d.ts
index c985a9f36..64d67bc30 100644
--- a/mediapipe/tasks/web/vision/core/types.d.ts
+++ b/mediapipe/tasks/web/vision/core/types.d.ts
@@ -19,7 +19,10 @@ import {NormalizedKeypoint} from '../../../../tasks/web/components/containers/ke
/** A Region-Of-Interest (ROI) to represent a region within an image. */
export declare interface RegionOfInterest {
/** The ROI in keypoint format. */
- keypoint: NormalizedKeypoint;
+ keypoint?: NormalizedKeypoint;
+
+ /** The ROI as scribbles over the object that the user wants to segment. */
+ scribble?: NormalizedKeypoint[];
}
/** A connection between two landmarks. */
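The `RegionOfInterest` change above makes `keypoint` optional and adds `scribble`. A small sketch of both forms, assuming the interfaces are in scope and that `NormalizedKeypoint` only requires normalized `x` and `y` values:

```ts
// A single-point ROI, as before. Coordinates are normalized to [0, 1].
const pointRoi: RegionOfInterest = {
  keypoint: {x: 0.5, y: 0.5},
};

// A scribble ROI: a list of normalized keypoints traced over the object.
const scribbleRoi: RegionOfInterest = {
  scribble: [
    {x: 0.30, y: 0.30},
    {x: 0.35, y: 0.40},
    {x: 0.40, y: 0.50},
  ],
};
```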
diff --git a/mediapipe/tasks/web/vision/core/vision_task_runner.ts b/mediapipe/tasks/web/vision/core/vision_task_runner.ts
index 285dbf900..f8f7826d0 100644
--- a/mediapipe/tasks/web/vision/core/vision_task_runner.ts
+++ b/mediapipe/tasks/web/vision/core/vision_task_runner.ts
@@ -17,8 +17,10 @@
import {NormalizedRect} from '../../../../framework/formats/rect_pb';
import {TaskRunner} from '../../../../tasks/web/core/task_runner';
import {WasmFileset} from '../../../../tasks/web/core/wasm_fileset';
-import {MPImage, MPImageShaderContext} from '../../../../tasks/web/vision/core/image';
+import {MPImage} from '../../../../tasks/web/vision/core/image';
import {ImageProcessingOptions} from '../../../../tasks/web/vision/core/image_processing_options';
+import {MPImageShaderContext} from '../../../../tasks/web/vision/core/image_shader_context';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
import {GraphRunner, ImageSource, WasmMediaPipeConstructor} from '../../../../web/graph_runner/graph_runner';
import {SupportImage, WasmImage} from '../../../../web/graph_runner/graph_runner_image_lib';
import {isWebKit} from '../../../../web/graph_runner/platform_utils';
@@ -57,11 +59,6 @@ export abstract class VisionTaskRunner extends TaskRunner {
  protected static async createVisionInstance<T extends VisionTaskRunner>(
      type: WasmMediaPipeConstructor<T>, fileset: WasmFileset,
      options: VisionTaskOptions): Promise<T> {
- if (options.baseOptions?.delegate === 'GPU') {
- if (!options.canvas) {
- throw new Error('You must specify a canvas for GPU processing.');
- }
- }
const canvas = options.canvas ?? createCanvas();
return TaskRunner.createInstance(type, canvas, fileset, options);
}
@@ -225,19 +222,18 @@ export abstract class VisionTaskRunner extends TaskRunner {
/**
* Converts a WasmImage to an MPImage.
*
- * Converts the underlying Uint8ClampedArray-backed images to ImageData
+ * Converts the underlying Uint8Array-backed images to ImageData
* (adding an alpha channel if necessary), passes through WebGLTextures and
* throws for Float32Array-backed images.
*/
- protected convertToMPImage(wasmImage: WasmImage): MPImage {
+ protected convertToMPImage(wasmImage: WasmImage, shouldCopyData: boolean):
+ MPImage {
const {data, width, height} = wasmImage;
const pixels = width * height;
- let container: ImageData|WebGLTexture|Uint8ClampedArray;
- if (data instanceof Uint8ClampedArray) {
- if (data.length === pixels) {
- container = data; // Mask
- } else if (data.length === pixels * 3) {
+ let container: ImageData|WebGLTexture;
+ if (data instanceof Uint8Array) {
+ if (data.length === pixels * 3) {
// TODO: Convert in C++
const rgba = new Uint8ClampedArray(pixels * 4);
for (let i = 0; i < pixels; ++i) {
@@ -247,25 +243,48 @@ export abstract class VisionTaskRunner extends TaskRunner {
rgba[4 * i + 3] = 255;
}
container = new ImageData(rgba, width, height);
- } else if (data.length ===pixels * 4) {
- container = new ImageData(data, width, height);
+ } else if (data.length === pixels * 4) {
+ container = new ImageData(
+ new Uint8ClampedArray(data.buffer, data.byteOffset, data.length),
+ width, height);
} else {
throw new Error(`Unsupported channel count: ${data.length/pixels}`);
}
- } else if (data instanceof Float32Array) {
+ } else if (data instanceof WebGLTexture) {
+ container = data;
+ } else {
+ throw new Error(`Unsupported format: ${data.constructor.name}`);
+ }
+
+ const image = new MPImage(
+ [container], /* ownsImageBitmap= */ false,
+ /* ownsWebGLTexture= */ false, this.graphRunner.wasmModule.canvas!,
+ this.shaderContext, width, height);
+ return shouldCopyData ? image.clone() : image;
+ }
+
+ /** Converts a WasmImage to an MPMask. */
+ protected convertToMPMask(wasmImage: WasmImage, shouldCopyData: boolean):
+ MPMask {
+ const {data, width, height} = wasmImage;
+ const pixels = width * height;
+
+ let container: WebGLTexture|Uint8Array|Float32Array;
+ if (data instanceof Uint8Array || data instanceof Float32Array) {
if (data.length === pixels) {
- container = data; // Mask
+ container = data;
} else {
- throw new Error(`Unsupported channel count: ${data.length/pixels}`);
+ throw new Error(`Unsupported channel count: ${data.length / pixels}`);
}
- } else { // WebGLTexture
+ } else {
container = data;
}
- return new MPImage(
- [container], /* ownsImageBitmap= */ false, /* ownsWebGLTexture= */ false,
- this.graphRunner.wasmModule.canvas!, this.shaderContext, width,
- height);
+ const mask = new MPMask(
+ [container],
+ /* ownsWebGLTexture= */ false, this.graphRunner.wasmModule.canvas!,
+ this.shaderContext, width, height);
+ return shouldCopyData ? mask.clone() : mask;
}
/** Closes and cleans up the resources held by this task. */
diff --git a/mediapipe/tasks/web/vision/face_stylizer/BUILD b/mediapipe/tasks/web/vision/face_stylizer/BUILD
index 0c0167dbd..fe9146987 100644
--- a/mediapipe/tasks/web/vision/face_stylizer/BUILD
+++ b/mediapipe/tasks/web/vision/face_stylizer/BUILD
@@ -47,7 +47,6 @@ mediapipe_ts_library(
"//mediapipe/framework:calculator_jspb_proto",
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:task_runner_test_utils",
- "//mediapipe/tasks/web/vision/core:image",
"//mediapipe/web/graph_runner:graph_runner_image_lib_ts",
],
)
diff --git a/mediapipe/tasks/web/vision/face_stylizer/face_stylizer.ts b/mediapipe/tasks/web/vision/face_stylizer/face_stylizer.ts
index 2a9adb315..8169e6775 100644
--- a/mediapipe/tasks/web/vision/face_stylizer/face_stylizer.ts
+++ b/mediapipe/tasks/web/vision/face_stylizer/face_stylizer.ts
@@ -50,7 +50,8 @@ export type FaceStylizerCallback = (image: MPImage|null) => void;
/** Performs face stylization on images. */
export class FaceStylizer extends VisionTaskRunner {
- private userCallback: FaceStylizerCallback = () => {};
+ private userCallback?: FaceStylizerCallback;
+ private result?: MPImage|null;
private readonly options: FaceStylizerGraphOptionsProto;
/**
@@ -130,21 +131,58 @@ export class FaceStylizer extends VisionTaskRunner {
return super.applyOptions(options);
}
-
/**
- * Performs face stylization on the provided single image. The method returns
- * synchronously once the callback returns. Only use this method when the
- * FaceStylizer is created with the image running mode.
+ * Performs face stylization on the provided single image and invokes the
+ * callback with result. The method returns synchronously once the callback
+ * returns. Only use this method when the FaceStylizer is created with the
+ * image running mode.
*
* @param image An image to process.
- * @param callback The callback that is invoked with the stylized image. The
- * lifetime of the returned data is only guaranteed for the duration of the
- * callback.
+ * @param callback The callback that is invoked with the stylized image or
+ * `null` if no face was detected. The lifetime of the returned data is
+ * only guaranteed for the duration of the callback.
*/
stylize(image: ImageSource, callback: FaceStylizerCallback): void;
/**
- * Performs face stylization on the provided single image. The method returns
- * synchronously once the callback returns. Only use this method when the
+ * Performs face stylization on the provided single image and invokes the
+ * callback with result. The method returns synchronously once the callback
+ * returns. Only use this method when the FaceStylizer is created with the
+ * image running mode.
+ *
+ * The 'imageProcessingOptions' parameter can be used to specify one or all
+ * of:
+ * - the rotation to apply to the image before performing stylization, by
+ * setting its 'rotationDegrees' property.
+ * - the region-of-interest on which to perform stylization, by setting its
+ * 'regionOfInterest' property. If not specified, the full image is used.
+ * If both are specified, the crop around the region-of-interest is extracted
+ * first, then the specified rotation is applied to the crop.
+ *
+ * @param image An image to process.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input image before running inference.
+ * @param callback The callback that is invoked with the stylized image or
+ * `null` if no face was detected. The lifetime of the returned data is
+ * only guaranteed for the duration of the callback.
+ */
+ stylize(
+ image: ImageSource, imageProcessingOptions: ImageProcessingOptions,
+ callback: FaceStylizerCallback): void;
+ /**
+ * Performs face stylization on the provided single image and returns the
+ * result. This method creates a copy of the resulting image and should not be
+ * used in high-throughput applictions. Only use this method when the
+ * FaceStylizer is created with the image running mode.
+ *
+ * @param image An image to process.
+ * @return A stylized face or `null` if no face was detected. The result is
+ * copied to avoid lifetime issues.
+ */
+ stylize(image: ImageSource): MPImage|null;
+ /**
+ * Performs face stylization on the provided single image and returns the
+ * result. This method creates a copy of the resulting image and should not be
+   * used in high-throughput applications. Only use this method when the
* FaceStylizer is created with the image running mode.
*
* The 'imageProcessingOptions' parameter can be used to specify one or all
@@ -159,18 +197,16 @@ export class FaceStylizer extends VisionTaskRunner {
* @param image An image to process.
* @param imageProcessingOptions the `ImageProcessingOptions` specifying how
* to process the input image before running inference.
- * @param callback The callback that is invoked with the stylized image. The
- * lifetime of the returned data is only guaranteed for the duration of the
- * callback.
+ * @return A stylized face or `null` if no face was detected. The result is
+ * copied to avoid lifetime issues.
*/
- stylize(
- image: ImageSource, imageProcessingOptions: ImageProcessingOptions,
- callback: FaceStylizerCallback): void;
+ stylize(image: ImageSource, imageProcessingOptions: ImageProcessingOptions):
+ MPImage|null;
stylize(
image: ImageSource,
- imageProcessingOptionsOrCallback: ImageProcessingOptions|
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
FaceStylizerCallback,
- callback?: FaceStylizerCallback): void {
+ callback?: FaceStylizerCallback): MPImage|null|void {
const imageProcessingOptions =
typeof imageProcessingOptionsOrCallback !== 'function' ?
imageProcessingOptionsOrCallback :
@@ -178,14 +214,19 @@ export class FaceStylizer extends VisionTaskRunner {
this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
imageProcessingOptionsOrCallback :
- callback!;
+ callback;
this.processImageData(image, imageProcessingOptions ?? {});
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result;
+ }
}
/**
- * Performs face stylization on the provided video frame. Only use this method
- * when the FaceStylizer is created with the video running mode.
+ * Performs face stylization on the provided video frame and invokes the
+ * callback with result. The method returns synchronously once the callback
+ * returns. Only use this method when the FaceStylizer is created with the
+ * video running mode.
*
* The input frame can be of any size. It's required to provide the video
* frame's timestamp (in milliseconds). The input timestamps must be
@@ -193,16 +234,18 @@ export class FaceStylizer extends VisionTaskRunner {
*
* @param videoFrame A video frame to process.
* @param timestamp The timestamp of the current frame, in ms.
- * @param callback The callback that is invoked with the stylized image. The
- * lifetime of the returned data is only guaranteed for the duration of
- * the callback.
+ * @param callback The callback that is invoked with the stylized image or
+ * `null` if no face was detected. The lifetime of the returned data is only
+ * guaranteed for the duration of the callback.
*/
stylizeForVideo(
videoFrame: ImageSource, timestamp: number,
callback: FaceStylizerCallback): void;
/**
- * Performs face stylization on the provided video frame. Only use this
- * method when the FaceStylizer is created with the video running mode.
+ * Performs face stylization on the provided video frame and invokes the
+ * callback with result. The method returns synchronously once the callback
+ * returns. Only use this method when the FaceStylizer is created with the
+ * video running mode.
*
* The 'imageProcessingOptions' parameter can be used to specify one or all
* of:
@@ -218,34 +261,83 @@ export class FaceStylizer extends VisionTaskRunner {
* monotonically increasing.
*
* @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
* @param imageProcessingOptions the `ImageProcessingOptions` specifying how
* to process the input image before running inference.
- * @param timestamp The timestamp of the current frame, in ms.
- * @param callback The callback that is invoked with the stylized image. The
- * lifetime of the returned data is only guaranteed for the duration of
- * the callback.
+ * @param callback The callback that is invoked with the stylized image or
+ * `null` if no face was detected. The lifetime of the returned data is only
+ * guaranteed for the duration of the callback.
*/
stylizeForVideo(
- videoFrame: ImageSource, imageProcessingOptions: ImageProcessingOptions,
- timestamp: number, callback: FaceStylizerCallback): void;
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions,
+ callback: FaceStylizerCallback): void;
+ /**
+ * Performs face stylization on the provided video frame. This method creates
+ * a copy of the resulting image and should not be used in high-throughput
+   * applications. Only use this method when the FaceStylizer is created with the
+ * video running mode.
+ *
+ * The input frame can be of any size. It's required to provide the video
+ * frame's timestamp (in milliseconds). The input timestamps must be
+ * monotonically increasing.
+ *
+ * @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
+ * @return A stylized face or `null` if no face was detected. The result is
+ * copied to avoid lifetime issues.
+ */
+ stylizeForVideo(videoFrame: ImageSource, timestamp: number): MPImage|null;
+ /**
+ * Performs face stylization on the provided video frame. This method creates
+ * a copy of the resulting image and should not be used in high-throughput
+   * applications. Only use this method when the FaceStylizer is created with the
+ * video running mode.
+ *
+ * The 'imageProcessingOptions' parameter can be used to specify one or all
+ * of:
+ * - the rotation to apply to the image before performing stylization, by
+ * setting its 'rotationDegrees' property.
+ * - the region-of-interest on which to perform stylization, by setting its
+ * 'regionOfInterest' property. If not specified, the full image is used.
+ * If both are specified, the crop around the region-of-interest is
+ * extracted first, then the specified rotation is applied to the crop.
+ *
+ * The input frame can be of any size. It's required to provide the video
+ * frame's timestamp (in milliseconds). The input timestamps must be
+ * monotonically increasing.
+ *
+ * @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input image before running inference.
+ * @return A stylized face or `null` if no face was detected. The result is
+ * copied to avoid lifetime issues.
+ */
stylizeForVideo(
videoFrame: ImageSource,
- timestampOrImageProcessingOptions: number|ImageProcessingOptions,
- timestampOrCallback: number|FaceStylizerCallback,
- callback?: FaceStylizerCallback): void {
+ timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions,
+ ): MPImage|null;
+ stylizeForVideo(
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
+ FaceStylizerCallback,
+ callback?: FaceStylizerCallback): MPImage|null|void {
const imageProcessingOptions =
- typeof timestampOrImageProcessingOptions !== 'number' ?
- timestampOrImageProcessingOptions :
+ typeof imageProcessingOptionsOrCallback !== 'function' ?
+ imageProcessingOptionsOrCallback :
{};
- const timestamp = typeof timestampOrImageProcessingOptions === 'number' ?
- timestampOrImageProcessingOptions :
- timestampOrCallback as number;
- this.userCallback = typeof timestampOrCallback === 'function' ?
- timestampOrCallback :
- callback!;
+ this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
+ imageProcessingOptionsOrCallback :
+ callback;
this.processVideoData(videoFrame, imageProcessingOptions, timestamp);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result;
+ }
}
/** Updates the MediaPipe graph configuration. */
@@ -270,13 +362,20 @@ export class FaceStylizer extends VisionTaskRunner {
this.graphRunner.attachImageListener(
STYLIZED_IMAGE_STREAM, (wasmImage, timestamp) => {
- const mpImage = this.convertToMPImage(wasmImage);
- this.userCallback(mpImage);
+ const mpImage = this.convertToMPImage(
+ wasmImage, /* shouldCopyData= */ !this.userCallback);
+ this.result = mpImage;
+ if (this.userCallback) {
+ this.userCallback(mpImage);
+ }
this.setLatestOutputTimestamp(timestamp);
});
this.graphRunner.attachEmptyPacketListener(
STYLIZED_IMAGE_STREAM, timestamp => {
- this.userCallback(null);
+ this.result = null;
+ if (this.userCallback) {
+ this.userCallback(null);
+ }
this.setLatestOutputTimestamp(timestamp);
});
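To summarize the reworked FaceStylizer surface, a hedged sketch of both calling conventions. `faceStylizer` and `image` are illustrative names; the stylizer is assumed to have been created in image mode.

```ts
// Callback mode: the MPImage is only valid for the duration of the callback.
faceStylizer.stylize(image, stylized => {
  if (stylized) {
    console.log(`stylized ${stylized.width}x${stylized.height} image`);
  }
});

// Return mode: the result is a copy, so close it once you are done with it.
const stylized = faceStylizer.stylize(image);
if (stylized) {
  console.log(`stylized ${stylized.width}x${stylized.height} image`);
  stylized.close();
}
```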
diff --git a/mediapipe/tasks/web/vision/face_stylizer/face_stylizer_test.ts b/mediapipe/tasks/web/vision/face_stylizer/face_stylizer_test.ts
index 17764c9e5..c092bf0f8 100644
--- a/mediapipe/tasks/web/vision/face_stylizer/face_stylizer_test.ts
+++ b/mediapipe/tasks/web/vision/face_stylizer/face_stylizer_test.ts
@@ -19,7 +19,6 @@ import 'jasmine';
// Placeholder for internal dependency on encodeByteArray
import {CalculatorGraphConfig} from '../../../../framework/calculator_pb';
import {addJasmineCustomFloatEqualityTester, createSpyWasmModule, MediapipeTasksFake, SpyWasmModule, verifyGraph, verifyListenersRegistered} from '../../../../tasks/web/core/task_runner_test_utils';
-import {MPImage} from '../../../../tasks/web/vision/core/image';
import {WasmImage} from '../../../../web/graph_runner/graph_runner_image_lib';
import {FaceStylizer} from './face_stylizer';
@@ -99,6 +98,30 @@ describe('FaceStylizer', () => {
]);
});
+ it('returns result', () => {
+ if (typeof ImageData === 'undefined') {
+ console.log('ImageData tests are not supported on Node');
+ return;
+ }
+
+ // Pass the test data to our listener
+ faceStylizer.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ verifyListenersRegistered(faceStylizer);
+ faceStylizer.imageListener!
+ ({data: new Uint8Array([1, 1, 1, 1]), width: 1, height: 1},
+ /* timestamp= */ 1337);
+ });
+
+    // Invoke the face stylizer
+ const image = faceStylizer.stylize({} as HTMLImageElement);
+ expect(faceStylizer.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
+ expect(image).not.toBeNull();
+ expect(image!.hasImageData()).toBeTrue();
+ expect(image!.width).toEqual(1);
+ expect(image!.height).toEqual(1);
+ image!.close();
+ });
+
it('invokes callback', (done) => {
if (typeof ImageData === 'undefined') {
console.log('ImageData tests are not supported on Node');
@@ -110,7 +133,7 @@ describe('FaceStylizer', () => {
faceStylizer.fakeWasmModule._waitUntilIdle.and.callFake(() => {
verifyListenersRegistered(faceStylizer);
faceStylizer.imageListener!
- ({data: new Uint8ClampedArray([1, 1, 1, 1]), width: 1, height: 1},
+ ({data: new Uint8Array([1, 1, 1, 1]), width: 1, height: 1},
/* timestamp= */ 1337);
});
@@ -118,35 +141,14 @@ describe('FaceStylizer', () => {
faceStylizer.stylize({} as HTMLImageElement, image => {
expect(faceStylizer.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
expect(image).not.toBeNull();
- expect(image!.has(MPImage.TYPE.IMAGE_DATA)).toBeTrue();
+ expect(image!.hasImageData()).toBeTrue();
expect(image!.width).toEqual(1);
expect(image!.height).toEqual(1);
done();
});
});
- it('invokes callback even when no faes are detected', (done) => {
- if (typeof ImageData === 'undefined') {
- console.log('ImageData tests are not supported on Node');
- done();
- return;
- }
-
- // Pass the test data to our listener
- faceStylizer.fakeWasmModule._waitUntilIdle.and.callFake(() => {
- verifyListenersRegistered(faceStylizer);
- faceStylizer.emptyPacketListener!(/* timestamp= */ 1337);
- });
-
- // Invoke the face stylizeer
- faceStylizer.stylize({} as HTMLImageElement, image => {
- expect(faceStylizer.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
- expect(image).toBeNull();
- done();
- });
- });
-
- it('invokes callback even when no faes are detected', (done) => {
+ it('invokes callback even when no faces are detected', (done) => {
// Pass the test data to our listener
faceStylizer.fakeWasmModule._waitUntilIdle.and.callFake(() => {
verifyListenersRegistered(faceStylizer);
diff --git a/mediapipe/tasks/web/vision/image_segmenter/BUILD b/mediapipe/tasks/web/vision/image_segmenter/BUILD
index 6c1829bd3..1a008cc95 100644
--- a/mediapipe/tasks/web/vision/image_segmenter/BUILD
+++ b/mediapipe/tasks/web/vision/image_segmenter/BUILD
@@ -35,7 +35,7 @@ mediapipe_ts_declaration(
deps = [
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:classifier_options",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/tasks/web/vision/core:vision_task_options",
],
)
@@ -52,7 +52,7 @@ mediapipe_ts_library(
"//mediapipe/framework:calculator_jspb_proto",
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:task_runner_test_utils",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/web/graph_runner:graph_runner_image_lib_ts",
],
)
diff --git a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter.ts b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter.ts
index 60b965345..39e57d94e 100644
--- a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter.ts
+++ b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter.ts
@@ -60,7 +60,7 @@ export type ImageSegmenterCallback = (result: ImageSegmenterResult) => void;
export class ImageSegmenter extends VisionTaskRunner {
private result: ImageSegmenterResult = {};
private labels: string[] = [];
- private userCallback: ImageSegmenterCallback = () => {};
+ private userCallback?: ImageSegmenterCallback;
private outputCategoryMask = DEFAULT_OUTPUT_CATEGORY_MASK;
private outputConfidenceMasks = DEFAULT_OUTPUT_CONFIDENCE_MASKS;
private readonly options: ImageSegmenterGraphOptionsProto;
@@ -224,22 +224,51 @@ export class ImageSegmenter extends VisionTaskRunner {
segment(
image: ImageSource, imageProcessingOptions: ImageProcessingOptions,
callback: ImageSegmenterCallback): void;
+ /**
+ * Performs image segmentation on the provided single image and returns the
+ * segmentation result. This method creates a copy of the resulting masks and
+   * should not be used in high-throughput applications. Only use this method
+ * when the ImageSegmenter is created with running mode `image`.
+ *
+ * @param image An image to process.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * issues.
+ */
+ segment(image: ImageSource): ImageSegmenterResult;
+ /**
+ * Performs image segmentation on the provided single image and returns the
+ * segmentation result. This method creates a copy of the resulting masks and
+   * should not be used in high-throughput applications. Only use this method when
+ * the ImageSegmenter is created with running mode `image`.
+ *
+ * @param image An image to process.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input image before running inference.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * issues.
+ */
+ segment(image: ImageSource, imageProcessingOptions: ImageProcessingOptions):
+ ImageSegmenterResult;
segment(
image: ImageSource,
- imageProcessingOptionsOrCallback: ImageProcessingOptions|
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
ImageSegmenterCallback,
- callback?: ImageSegmenterCallback): void {
+ callback?: ImageSegmenterCallback): ImageSegmenterResult|void {
const imageProcessingOptions =
typeof imageProcessingOptionsOrCallback !== 'function' ?
imageProcessingOptionsOrCallback :
{};
+
this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
imageProcessingOptionsOrCallback :
- callback!;
+ callback;
this.reset();
this.processImageData(image, imageProcessingOptions);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result;
+ }
}
/**
@@ -264,35 +293,64 @@ export class ImageSegmenter extends VisionTaskRunner {
* created with running mode `video`.
*
* @param videoFrame A video frame to process.
- * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
- * to process the input image before running inference.
* @param timestamp The timestamp of the current frame, in ms.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input frame before running inference.
* @param callback The callback that is invoked with the segmented masks. The
* lifetime of the returned data is only guaranteed for the duration of the
* callback.
*/
segmentForVideo(
- videoFrame: ImageSource, imageProcessingOptions: ImageProcessingOptions,
- timestamp: number, callback: ImageSegmenterCallback): void;
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions,
+ callback: ImageSegmenterCallback): void;
+ /**
+ * Performs image segmentation on the provided video frame and returns the
+ * segmentation result. This method creates a copy of the resulting masks and
+   * should not be used in high-throughput applications. Only use this method
+ * when the ImageSegmenter is created with running mode `video`.
+ *
+ * @param videoFrame A video frame to process.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * issues.
+ */
+ segmentForVideo(videoFrame: ImageSource, timestamp: number):
+ ImageSegmenterResult;
+ /**
+ * Performs image segmentation on the provided video frame and returns the
+ * segmentation result. This method creates a copy of the resulting masks and
+   * should not be used in high-throughput applications. Only use this method when
+ * the ImageSegmenter is created with running mode `video`.
+ *
+ * @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input frame before running inference.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * issues.
+ */
segmentForVideo(
- videoFrame: ImageSource,
- timestampOrImageProcessingOptions: number|ImageProcessingOptions,
- timestampOrCallback: number|ImageSegmenterCallback,
- callback?: ImageSegmenterCallback): void {
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions): ImageSegmenterResult;
+ segmentForVideo(
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
+ ImageSegmenterCallback,
+ callback?: ImageSegmenterCallback): ImageSegmenterResult|void {
const imageProcessingOptions =
- typeof timestampOrImageProcessingOptions !== 'number' ?
- timestampOrImageProcessingOptions :
+ typeof imageProcessingOptionsOrCallback !== 'function' ?
+ imageProcessingOptionsOrCallback :
{};
- const timestamp = typeof timestampOrImageProcessingOptions === 'number' ?
- timestampOrImageProcessingOptions :
- timestampOrCallback as number;
- this.userCallback = typeof timestampOrCallback === 'function' ?
- timestampOrCallback :
- callback!;
+ this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
+ imageProcessingOptionsOrCallback :
+ callback;
this.reset();
this.processVideoData(videoFrame, imageProcessingOptions, timestamp);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result;
+ }
}
/**
@@ -323,7 +381,9 @@ export class ImageSegmenter extends VisionTaskRunner {
return;
}
- this.userCallback(this.result);
+ if (this.userCallback) {
+ this.userCallback(this.result);
+ }
}
/** Updates the MediaPipe graph configuration. */
@@ -351,8 +411,9 @@ export class ImageSegmenter extends VisionTaskRunner {
this.graphRunner.attachImageVectorListener(
CONFIDENCE_MASKS_STREAM, (masks, timestamp) => {
- this.result.confidenceMasks =
- masks.map(wasmImage => this.convertToMPImage(wasmImage));
+ this.result.confidenceMasks = masks.map(
+ wasmImage => this.convertToMPMask(
+ wasmImage, /* shouldCopyData= */ !this.userCallback));
this.setLatestOutputTimestamp(timestamp);
this.maybeInvokeCallback();
});
@@ -370,7 +431,8 @@ export class ImageSegmenter extends VisionTaskRunner {
this.graphRunner.attachImageListener(
CATEGORY_MASK_STREAM, (mask, timestamp) => {
- this.result.categoryMask = this.convertToMPImage(mask);
+ this.result.categoryMask = this.convertToMPMask(
+ mask, /* shouldCopyData= */ !this.userCallback);
this.setLatestOutputTimestamp(timestamp);
this.maybeInvokeCallback();
});
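A sketch of the two ImageSegmenter calling conventions after this change. `imageSegmenter` and `image` are illustrative names; the segmenter is assumed to output confidence masks (the default).

```ts
// Callback mode: masks are owned by the task and only valid inside the callback.
imageSegmenter.segment(image, result => {
  const mask = result.confidenceMasks?.[0];
  if (mask) {
    console.log(`confidence at (0, 0): ${mask.getAsFloat32Array()[0]}`);
  }
});

// Return mode: masks are copies; close them once you are done.
const result = imageSegmenter.segment(image);
result.confidenceMasks?.forEach(mask => mask.close());
result.categoryMask?.close();
```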
diff --git a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_result.d.ts b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_result.d.ts
index 454ec27ea..25962d57e 100644
--- a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_result.d.ts
+++ b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_result.d.ts
@@ -14,7 +14,7 @@
* limitations under the License.
*/
-import {MPImage} from '../../../../tasks/web/vision/core/image';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
/** The output result of ImageSegmenter. */
export declare interface ImageSegmenterResult {
@@ -23,12 +23,12 @@ export declare interface ImageSegmenterResult {
* `MPImage`s where, for each mask, each pixel represents the prediction
* confidence, usually in the [0, 1] range.
*/
- confidenceMasks?: MPImage[];
+ confidenceMasks?: MPMask[];
/**
* A category mask represented as a `Uint8ClampedArray` or
* `WebGLTexture`-backed `MPImage` where each pixel represents the class which
* the pixel in the original image was predicted to belong to.
*/
- categoryMask?: MPImage;
+ categoryMask?: MPMask;
}
diff --git a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_test.ts b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_test.ts
index c1ccd7997..f9172ecd3 100644
--- a/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_test.ts
+++ b/mediapipe/tasks/web/vision/image_segmenter/image_segmenter_test.ts
@@ -19,8 +19,8 @@ import 'jasmine';
// Placeholder for internal dependency on encodeByteArray
import {CalculatorGraphConfig} from '../../../../framework/calculator_pb';
import {addJasmineCustomFloatEqualityTester, createSpyWasmModule, MediapipeTasksFake, SpyWasmModule, verifyGraph} from '../../../../tasks/web/core/task_runner_test_utils';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
import {WasmImage} from '../../../../web/graph_runner/graph_runner_image_lib';
-import {MPImage} from '../../../../tasks/web/vision/core/image';
import {ImageSegmenter} from './image_segmenter';
import {ImageSegmenterOptions} from './image_segmenter_options';
@@ -165,7 +165,7 @@ describe('ImageSegmenter', () => {
});
it('supports category mask', async () => {
- const mask = new Uint8ClampedArray([1, 2, 3, 4]);
+ const mask = new Uint8Array([1, 2, 3, 4]);
await imageSegmenter.setOptions(
{outputCategoryMask: true, outputConfidenceMasks: false});
@@ -183,7 +183,7 @@ describe('ImageSegmenter', () => {
    return new Promise<void>(resolve => {
imageSegmenter.segment({} as HTMLImageElement, result => {
expect(imageSegmenter.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
- expect(result.categoryMask).toBeInstanceOf(MPImage);
+ expect(result.categoryMask).toBeInstanceOf(MPMask);
expect(result.confidenceMasks).not.toBeDefined();
expect(result.categoryMask!.width).toEqual(2);
expect(result.categoryMask!.height).toEqual(2);
@@ -216,18 +216,18 @@ describe('ImageSegmenter', () => {
expect(imageSegmenter.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
expect(result.categoryMask).not.toBeDefined();
- expect(result.confidenceMasks![0]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
expect(result.confidenceMasks![0].width).toEqual(2);
expect(result.confidenceMasks![0].height).toEqual(2);
- expect(result.confidenceMasks![1]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![1]).toBeInstanceOf(MPMask);
resolve();
});
});
});
it('supports combined category and confidence masks', async () => {
- const categoryMask = new Uint8ClampedArray([1]);
+ const categoryMask = new Uint8Array([1]);
const confidenceMask1 = new Float32Array([0.0]);
const confidenceMask2 = new Float32Array([1.0]);
@@ -252,19 +252,19 @@ describe('ImageSegmenter', () => {
// Invoke the image segmenter
imageSegmenter.segment({} as HTMLImageElement, result => {
expect(imageSegmenter.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
- expect(result.categoryMask).toBeInstanceOf(MPImage);
+ expect(result.categoryMask).toBeInstanceOf(MPMask);
expect(result.categoryMask!.width).toEqual(1);
expect(result.categoryMask!.height).toEqual(1);
- expect(result.confidenceMasks![0]).toBeInstanceOf(MPImage);
- expect(result.confidenceMasks![1]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
+ expect(result.confidenceMasks![1]).toBeInstanceOf(MPMask);
resolve();
});
});
});
- it('invokes listener once masks are avaiblae', async () => {
- const categoryMask = new Uint8ClampedArray([1]);
+ it('invokes listener once masks are available', async () => {
+ const categoryMask = new Uint8Array([1]);
const confidenceMask = new Float32Array([0.0]);
let listenerCalled = false;
@@ -292,4 +292,21 @@ describe('ImageSegmenter', () => {
});
});
});
+
+ it('returns result', () => {
+ const confidenceMask = new Float32Array([0.0]);
+
+ // Pass the test data to our listener
+ imageSegmenter.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ imageSegmenter.confidenceMasksListener!(
+ [
+ {data: confidenceMask, width: 1, height: 1},
+ ],
+ 1337);
+ });
+
+ const result = imageSegmenter.segment({} as HTMLImageElement);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
+ result.confidenceMasks![0].close();
+ });
});
diff --git a/mediapipe/tasks/web/vision/index.ts b/mediapipe/tasks/web/vision/index.ts
index 34c1206cc..5b643b84e 100644
--- a/mediapipe/tasks/web/vision/index.ts
+++ b/mediapipe/tasks/web/vision/index.ts
@@ -16,7 +16,8 @@
import {FilesetResolver as FilesetResolverImpl} from '../../../tasks/web/core/fileset_resolver';
import {DrawingUtils as DrawingUtilsImpl} from '../../../tasks/web/vision/core/drawing_utils';
-import {MPImage as MPImageImpl, MPImageType as MPImageTypeImpl} from '../../../tasks/web/vision/core/image';
+import {MPImage as MPImageImpl} from '../../../tasks/web/vision/core/image';
+import {MPMask as MPMaskImpl} from '../../../tasks/web/vision/core/mask';
import {FaceDetector as FaceDetectorImpl} from '../../../tasks/web/vision/face_detector/face_detector';
import {FaceLandmarker as FaceLandmarkerImpl, FaceLandmarksConnections as FaceLandmarksConnectionsImpl} from '../../../tasks/web/vision/face_landmarker/face_landmarker';
import {FaceStylizer as FaceStylizerImpl} from '../../../tasks/web/vision/face_stylizer/face_stylizer';
@@ -34,7 +35,7 @@ import {PoseLandmarker as PoseLandmarkerImpl} from '../../../tasks/web/vision/po
const DrawingUtils = DrawingUtilsImpl;
const FilesetResolver = FilesetResolverImpl;
const MPImage = MPImageImpl;
-const MPImageType = MPImageTypeImpl;
+const MPMask = MPMaskImpl;
const FaceDetector = FaceDetectorImpl;
const FaceLandmarker = FaceLandmarkerImpl;
const FaceLandmarksConnections = FaceLandmarksConnectionsImpl;
@@ -52,7 +53,7 @@ export {
DrawingUtils,
FilesetResolver,
MPImage,
- MPImageType,
+ MPMask,
FaceDetector,
FaceLandmarker,
FaceLandmarksConnections,
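Since `MPMask` is now exported from the public vision bundle alongside `MPImage`, a consumer can import it next to the task classes. A minimal sketch; the package name, CDN URL, and model path are assumptions for illustration only.

```ts
import {FilesetResolver, ImageSegmenter, MPMask} from '@mediapipe/tasks-vision';

async function segmentOnce(image: HTMLImageElement): Promise<MPMask|undefined> {
  const vision = await FilesetResolver.forVisionTasks(
      'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm');  // assumed URL
  const segmenter = await ImageSegmenter.createFromModelPath(
      vision, 'selfie_segmenter.tflite');  // assumed model path
  const result = segmenter.segment(image);  // return mode: masks are copied
  segmenter.close();
  return result.confidenceMasks?.[0];  // caller is responsible for close()
}
```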
diff --git a/mediapipe/tasks/web/vision/interactive_segmenter/BUILD b/mediapipe/tasks/web/vision/interactive_segmenter/BUILD
index c3be79ebf..57b0946a2 100644
--- a/mediapipe/tasks/web/vision/interactive_segmenter/BUILD
+++ b/mediapipe/tasks/web/vision/interactive_segmenter/BUILD
@@ -37,7 +37,7 @@ mediapipe_ts_declaration(
deps = [
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:classifier_options",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/tasks/web/vision/core:vision_task_options",
],
)
@@ -54,7 +54,7 @@ mediapipe_ts_library(
"//mediapipe/framework:calculator_jspb_proto",
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:task_runner_test_utils",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/util:render_data_jspb_proto",
"//mediapipe/web/graph_runner:graph_runner_image_lib_ts",
],
diff --git a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter.ts b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter.ts
index 67d6ec3f6..2a51a5fcf 100644
--- a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter.ts
+++ b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter.ts
@@ -86,7 +86,7 @@ export class InteractiveSegmenter extends VisionTaskRunner {
private result: InteractiveSegmenterResult = {};
private outputCategoryMask = DEFAULT_OUTPUT_CATEGORY_MASK;
private outputConfidenceMasks = DEFAULT_OUTPUT_CONFIDENCE_MASKS;
- private userCallback: InteractiveSegmenterCallback = () => {};
+ private userCallback?: InteractiveSegmenterCallback;
private readonly options: ImageSegmenterGraphOptionsProto;
private readonly segmenterOptions: SegmenterOptionsProto;
@@ -186,14 +186,9 @@ export class InteractiveSegmenter extends VisionTaskRunner {
/**
* Performs interactive segmentation on the provided single image and invokes
- * the callback with the response. The `roi` parameter is used to represent a
- * user's region of interest for segmentation.
- *
- * If the output_type is `CATEGORY_MASK`, the callback is invoked with vector
- * of images that represent per-category segmented image mask. If the
- * output_type is `CONFIDENCE_MASK`, the callback is invoked with a vector of
- * images that contains only one confidence image mask. The method returns
- * synchronously once the callback returns.
+ * the callback with the response. The method returns synchronously once the
+ * callback returns. The `roi` parameter is used to represent a user's region
+ * of interest for segmentation.
*
* @param image An image to process.
* @param roi The region of interest for segmentation.
@@ -206,8 +201,9 @@ export class InteractiveSegmenter extends VisionTaskRunner {
callback: InteractiveSegmenterCallback): void;
/**
* Performs interactive segmentation on the provided single image and invokes
- * the callback with the response. The `roi` parameter is used to represent a
- * user's region of interest for segmentation.
+ * the callback with the response. The method returns synchronously once the
+ * callback returns. The `roi` parameter is used to represent a user's region
+ * of interest for segmentation.
*
* The 'image_processing_options' parameter can be used to specify the
* rotation to apply to the image before performing segmentation, by setting
@@ -215,12 +211,6 @@ export class InteractiveSegmenter extends VisionTaskRunner {
* using the 'regionOfInterest' field is NOT supported and will result in an
* error.
*
- * If the output_type is `CATEGORY_MASK`, the callback is invoked with vector
- * of images that represent per-category segmented image mask. If the
- * output_type is `CONFIDENCE_MASK`, the callback is invoked with a vector of
- * images that contains only one confidence image mask. The method returns
- * synchronously once the callback returns.
- *
* @param image An image to process.
* @param roi The region of interest for segmentation.
* @param imageProcessingOptions the `ImageProcessingOptions` specifying how
@@ -233,23 +223,63 @@ export class InteractiveSegmenter extends VisionTaskRunner {
image: ImageSource, roi: RegionOfInterest,
imageProcessingOptions: ImageProcessingOptions,
callback: InteractiveSegmenterCallback): void;
+ /**
+ * Performs interactive segmentation on the provided single image and returns
+ * the segmentation result. This method creates a copy of the resulting masks
+ * and should not be used in high-throughput applications. The `roi` parameter
+ * is used to represent a user's region of interest for segmentation.
+ *
+ * @param image An image to process.
+ * @param roi The region of interest for segmentation.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * limits.
+ */
+ segment(image: ImageSource, roi: RegionOfInterest):
+ InteractiveSegmenterResult;
+ /**
+ * Performs interactive segmentation on the provided single image and returns
+ * the segmentation result. This method creates a copy of the resulting masks
+ * and should not be used in high-throughput applications. The `roi` parameter
+ * is used to represent a user's region of interest for segmentation.
+ *
+ * The 'image_processing_options' parameter can be used to specify the
+ * rotation to apply to the image before performing segmentation, by setting
+ * its 'rotationDegrees' field. Note that specifying a region-of-interest
+ * using the 'regionOfInterest' field is NOT supported and will result in an
+ * error.
+ *
+ * @param image An image to process.
+ * @param roi The region of interest for segmentation.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input image before running inference.
+ * @return The segmentation result. The data is copied to avoid lifetime
+ * limits.
+ */
segment(
image: ImageSource, roi: RegionOfInterest,
- imageProcessingOptionsOrCallback: ImageProcessingOptions|
+ imageProcessingOptions: ImageProcessingOptions):
+ InteractiveSegmenterResult;
+ segment(
+ image: ImageSource, roi: RegionOfInterest,
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
InteractiveSegmenterCallback,
- callback?: InteractiveSegmenterCallback): void {
+ callback?: InteractiveSegmenterCallback): InteractiveSegmenterResult|
+ void {
const imageProcessingOptions =
typeof imageProcessingOptionsOrCallback !== 'function' ?
imageProcessingOptionsOrCallback :
{};
this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
imageProcessingOptionsOrCallback :
- callback!;
+ callback;
this.reset();
this.processRenderData(roi, this.getSynctheticTimestamp());
this.processImageData(image, imageProcessingOptions);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result;
+ }
}
private reset(): void {
@@ -265,7 +295,9 @@ export class InteractiveSegmenter extends VisionTaskRunner {
return;
}
- this.userCallback(this.result);
+ if (this.userCallback) {
+ this.userCallback(this.result);
+ }
}
/** Updates the MediaPipe graph configuration. */
@@ -295,8 +327,9 @@ export class InteractiveSegmenter extends VisionTaskRunner {
this.graphRunner.attachImageVectorListener(
CONFIDENCE_MASKS_STREAM, (masks, timestamp) => {
- this.result.confidenceMasks =
- masks.map(wasmImage => this.convertToMPImage(wasmImage));
+ this.result.confidenceMasks = masks.map(
+ wasmImage => this.convertToMPMask(
+ wasmImage, /* shouldCopyData= */ !this.userCallback));
this.setLatestOutputTimestamp(timestamp);
this.maybeInvokeCallback();
});
@@ -314,7 +347,8 @@ export class InteractiveSegmenter extends VisionTaskRunner {
this.graphRunner.attachImageListener(
CATEGORY_MASK_STREAM, (mask, timestamp) => {
- this.result.categoryMask = this.convertToMPImage(mask);
+ this.result.categoryMask = this.convertToMPMask(
+ mask, /* shouldCopyData= */ !this.userCallback);
this.setLatestOutputTimestamp(timestamp);
this.maybeInvokeCallback();
});
@@ -338,16 +372,31 @@ export class InteractiveSegmenter extends VisionTaskRunner {
const renderData = new RenderDataProto();
const renderAnnotation = new RenderAnnotationProto();
-
const color = new ColorProto();
color.setR(255);
renderAnnotation.setColor(color);
- const point = new RenderAnnotationProto.Point();
- point.setNormalized(true);
- point.setX(roi.keypoint.x);
- point.setY(roi.keypoint.y);
- renderAnnotation.setPoint(point);
+ if (roi.keypoint && roi.scribble) {
+ throw new Error('Cannot provide both keypoint and scribble.');
+ } else if (roi.keypoint) {
+ const point = new RenderAnnotationProto.Point();
+ point.setNormalized(true);
+ point.setX(roi.keypoint.x);
+ point.setY(roi.keypoint.y);
+ renderAnnotation.setPoint(point);
+ } else if (roi.scribble) {
+ const scribble = new RenderAnnotationProto.Scribble();
+ for (const coord of roi.scribble) {
+ const point = new RenderAnnotationProto.Point();
+ point.setNormalized(true);
+ point.setX(coord.x);
+ point.setY(coord.y);
+ scribble.addPoint(point);
+ }
+ renderAnnotation.setScribble(scribble);
+ } else {
+ throw new Error('Must provide either a keypoint or a scribble.');
+ }
renderData.addRenderAnnotations(renderAnnotation);
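
For context, here is a minimal sketch of how the reworked `segment()` API could be called once this change is in place. The fileset URL and model path are placeholders, and the construction helpers (`FilesetResolver.forVisionTasks`, `InteractiveSegmenter.createFromModelPath`) are the standard MediaPipe Tasks entry points rather than anything introduced by this diff; only the keypoint/scribble ROI shapes and the callback vs. synchronous overloads come from the code above.

```ts
import {FilesetResolver, InteractiveSegmenter} from '@mediapipe/tasks-vision';

async function demoInteractiveSegmentation(image: HTMLImageElement) {
  // Placeholder asset locations.
  const vision = await FilesetResolver.forVisionTasks('/wasm');
  const segmenter = await InteractiveSegmenter.createFromModelPath(
      vision, '/models/magic_touch.tflite');

  // Callback overload: masks are only valid for the duration of the callback.
  segmenter.segment(image, {keypoint: {x: 0.5, y: 0.5}}, result => {
    console.log('confidence masks:', result.confidenceMasks?.length);
  });

  // Synchronous overload added in this change: the returned masks are copies,
  // so the caller owns them and should close them when done.
  const result = segmenter.segment(
      image, {scribble: [{x: 0.1, y: 0.2}, {x: 0.3, y: 0.4}]});
  result.confidenceMasks?.forEach(mask => mask.close());

  segmenter.close();
}
```
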
diff --git a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_result.d.ts b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_result.d.ts
index bc2962936..e773b5e64 100644
--- a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_result.d.ts
+++ b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_result.d.ts
@@ -14,7 +14,7 @@
* limitations under the License.
*/
-import {MPImage} from '../../../../tasks/web/vision/core/image';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
/** The output result of InteractiveSegmenter. */
export declare interface InteractiveSegmenterResult {
@@ -23,12 +23,12 @@ export declare interface InteractiveSegmenterResult {
* `MPImage`s where, for each mask, each pixel represents the prediction
* confidence, usually in the [0, 1] range.
*/
- confidenceMasks?: MPImage[];
+ confidenceMasks?: MPMask[];
/**
* A category mask represented as a `Uint8ClampedArray` or
* `WebGLTexture`-backed `MPImage` where each pixel represents the class which
* the pixel in the original image was predicted to belong to.
*/
- categoryMask?: MPImage;
+ categoryMask?: MPMask;
}
diff --git a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_test.ts b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_test.ts
index 84ecde00b..c5603c5c6 100644
--- a/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_test.ts
+++ b/mediapipe/tasks/web/vision/interactive_segmenter/interactive_segmenter_test.ts
@@ -19,17 +19,21 @@ import 'jasmine';
// Placeholder for internal dependency on encodeByteArray
import {CalculatorGraphConfig} from '../../../../framework/calculator_pb';
import {addJasmineCustomFloatEqualityTester, createSpyWasmModule, MediapipeTasksFake, SpyWasmModule, verifyGraph} from '../../../../tasks/web/core/task_runner_test_utils';
-import {MPImage} from '../../../../tasks/web/vision/core/image';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
import {RenderData as RenderDataProto} from '../../../../util/render_data_pb';
import {WasmImage} from '../../../../web/graph_runner/graph_runner_image_lib';
import {InteractiveSegmenter, RegionOfInterest} from './interactive_segmenter';
-const ROI: RegionOfInterest = {
+const KEYPOINT: RegionOfInterest = {
keypoint: {x: 0.1, y: 0.2}
};
+const SCRIBBLE: RegionOfInterest = {
+ scribble: [{x: 0.1, y: 0.2}, {x: 0.3, y: 0.4}]
+};
+
class InteractiveSegmenterFake extends InteractiveSegmenter implements
MediapipeTasksFake {
calculatorName =
@@ -134,26 +138,46 @@ describe('InteractiveSegmenter', () => {
it('doesn\'t support region of interest', () => {
expect(() => {
interactiveSegmenter.segment(
- {} as HTMLImageElement, ROI,
+ {} as HTMLImageElement, KEYPOINT,
{regionOfInterest: {left: 0, right: 0, top: 0, bottom: 0}}, () => {});
}).toThrowError('This task doesn\'t support region-of-interest.');
});
- it('sends region-of-interest', (done) => {
+ it('sends region-of-interest with keypoint', (done) => {
interactiveSegmenter.fakeWasmModule._waitUntilIdle.and.callFake(() => {
expect(interactiveSegmenter.lastRoi).toBeDefined();
expect(interactiveSegmenter.lastRoi!.toObject().renderAnnotationsList![0])
.toEqual(jasmine.objectContaining({
color: {r: 255, b: undefined, g: undefined},
+ point: {x: 0.1, y: 0.2, normalized: true},
}));
done();
});
- interactiveSegmenter.segment({} as HTMLImageElement, ROI, () => {});
+ interactiveSegmenter.segment({} as HTMLImageElement, KEYPOINT, () => {});
+ });
+
+ it('sends region-of-interest with scribble', (done) => {
+ interactiveSegmenter.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ expect(interactiveSegmenter.lastRoi).toBeDefined();
+ expect(interactiveSegmenter.lastRoi!.toObject().renderAnnotationsList![0])
+ .toEqual(jasmine.objectContaining({
+ color: {r: 255, b: undefined, g: undefined},
+ scribble: {
+ pointList: [
+ {x: 0.1, y: 0.2, normalized: true},
+ {x: 0.3, y: 0.4, normalized: true}
+ ]
+ },
+ }));
+ done();
+ });
+
+ interactiveSegmenter.segment({} as HTMLImageElement, SCRIBBLE, () => {});
});
it('supports category mask', async () => {
- const mask = new Uint8ClampedArray([1, 2, 3, 4]);
+ const mask = new Uint8Array([1, 2, 3, 4]);
await interactiveSegmenter.setOptions(
{outputCategoryMask: true, outputConfidenceMasks: false});
@@ -168,10 +192,10 @@ describe('InteractiveSegmenter', () => {
// Invoke the image segmenter
return new Promise(resolve => {
- interactiveSegmenter.segment({} as HTMLImageElement, ROI, result => {
+ interactiveSegmenter.segment({} as HTMLImageElement, KEYPOINT, result => {
expect(interactiveSegmenter.fakeWasmModule._waitUntilIdle)
.toHaveBeenCalled();
- expect(result.categoryMask).toBeInstanceOf(MPImage);
+ expect(result.categoryMask).toBeInstanceOf(MPMask);
expect(result.categoryMask!.width).toEqual(2);
expect(result.categoryMask!.height).toEqual(2);
expect(result.confidenceMasks).not.toBeDefined();
@@ -199,23 +223,23 @@ describe('InteractiveSegmenter', () => {
});
return new Promise(resolve => {
// Invoke the image segmenter
- interactiveSegmenter.segment({} as HTMLImageElement, ROI, result => {
+ interactiveSegmenter.segment({} as HTMLImageElement, KEYPOINT, result => {
expect(interactiveSegmenter.fakeWasmModule._waitUntilIdle)
.toHaveBeenCalled();
expect(result.categoryMask).not.toBeDefined();
- expect(result.confidenceMasks![0]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
expect(result.confidenceMasks![0].width).toEqual(2);
expect(result.confidenceMasks![0].height).toEqual(2);
- expect(result.confidenceMasks![1]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![1]).toBeInstanceOf(MPMask);
resolve();
});
});
});
it('supports combined category and confidence masks', async () => {
- const categoryMask = new Uint8ClampedArray([1]);
+ const categoryMask = new Uint8Array([1]);
const confidenceMask1 = new Float32Array([0.0]);
const confidenceMask2 = new Float32Array([1.0]);
@@ -239,22 +263,22 @@ describe('InteractiveSegmenter', () => {
return new Promise(resolve => {
// Invoke the image segmenter
interactiveSegmenter.segment(
- {} as HTMLImageElement, ROI, result => {
+ {} as HTMLImageElement, KEYPOINT, result => {
expect(interactiveSegmenter.fakeWasmModule._waitUntilIdle)
.toHaveBeenCalled();
- expect(result.categoryMask).toBeInstanceOf(MPImage);
+ expect(result.categoryMask).toBeInstanceOf(MPMask);
expect(result.categoryMask!.width).toEqual(1);
expect(result.categoryMask!.height).toEqual(1);
- expect(result.confidenceMasks![0]).toBeInstanceOf(MPImage);
- expect(result.confidenceMasks![1]).toBeInstanceOf(MPImage);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
+ expect(result.confidenceMasks![1]).toBeInstanceOf(MPMask);
resolve();
});
});
});
it('invokes listener once masks are avaiblae', async () => {
- const categoryMask = new Uint8ClampedArray([1]);
+ const categoryMask = new Uint8Array([1]);
const confidenceMask = new Float32Array([0.0]);
let listenerCalled = false;
@@ -276,10 +300,28 @@ describe('InteractiveSegmenter', () => {
});
return new Promise(resolve => {
- interactiveSegmenter.segment({} as HTMLImageElement, ROI, () => {
+ interactiveSegmenter.segment({} as HTMLImageElement, KEYPOINT, () => {
listenerCalled = true;
resolve();
});
});
});
+
+ it('returns result', () => {
+ const confidenceMask = new Float32Array([0.0]);
+
+ // Pass the test data to our listener
+ interactiveSegmenter.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ interactiveSegmenter.confidenceMasksListener!(
+ [
+ {data: confidenceMask, width: 1, height: 1},
+ ],
+ 1337);
+ });
+
+ const result =
+ interactiveSegmenter.segment({} as HTMLImageElement, KEYPOINT);
+ expect(result.confidenceMasks![0]).toBeInstanceOf(MPMask);
+ result.confidenceMasks![0].close();
+ });
});
diff --git a/mediapipe/tasks/web/vision/pose_landmarker/BUILD b/mediapipe/tasks/web/vision/pose_landmarker/BUILD
index 8d128ac1a..566513b40 100644
--- a/mediapipe/tasks/web/vision/pose_landmarker/BUILD
+++ b/mediapipe/tasks/web/vision/pose_landmarker/BUILD
@@ -45,7 +45,7 @@ mediapipe_ts_declaration(
"//mediapipe/tasks/web/components/containers:category",
"//mediapipe/tasks/web/components/containers:landmark",
"//mediapipe/tasks/web/core",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/tasks/web/vision/core:vision_task_options",
],
)
@@ -63,7 +63,7 @@ mediapipe_ts_library(
"//mediapipe/tasks/web/components/processors:landmark_result",
"//mediapipe/tasks/web/core",
"//mediapipe/tasks/web/core:task_runner_test_utils",
- "//mediapipe/tasks/web/vision/core:image",
+ "//mediapipe/tasks/web/vision/core:mask",
"//mediapipe/tasks/web/vision/core:vision_task_runner",
],
)
diff --git a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker.ts b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker.ts
index 2d72bf1dc..87fdacbc2 100644
--- a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker.ts
+++ b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker.ts
@@ -43,7 +43,6 @@ const IMAGE_STREAM = 'image_in';
const NORM_RECT_STREAM = 'norm_rect';
const NORM_LANDMARKS_STREAM = 'normalized_landmarks';
const WORLD_LANDMARKS_STREAM = 'world_landmarks';
-const AUXILIARY_LANDMARKS_STREAM = 'auxiliary_landmarks';
const SEGMENTATION_MASK_STREAM = 'segmentation_masks';
const POSE_LANDMARKER_GRAPH =
'mediapipe.tasks.vision.pose_landmarker.PoseLandmarkerGraph';
@@ -64,7 +63,7 @@ export type PoseLandmarkerCallback = (result: PoseLandmarkerResult) => void;
export class PoseLandmarker extends VisionTaskRunner {
private result: Partial<PoseLandmarkerResult> = {};
private outputSegmentationMasks = false;
- private userCallback: PoseLandmarkerCallback = () => {};
+ private userCallback?: PoseLandmarkerCallback;
private readonly options: PoseLandmarkerGraphOptions;
private readonly poseLandmarksDetectorGraphOptions:
PoseLandmarksDetectorGraphOptions;
@@ -200,21 +199,22 @@ export class PoseLandmarker extends VisionTaskRunner {
}
/**
- * Performs pose detection on the provided single image and waits
- * synchronously for the response. Only use this method when the
- * PoseLandmarker is created with running mode `image`.
+ * Performs pose detection on the provided single image and invokes the
+ * callback with the response. The method returns synchronously once the
+ * callback returns. Only use this method when the PoseLandmarker is created
+ * with running mode `image`.
*
* @param image An image to process.
* @param callback The callback that is invoked with the result. The
* lifetime of the returned masks is only guaranteed for the duration of
* the callback.
- * @return The detected pose landmarks.
*/
detect(image: ImageSource, callback: PoseLandmarkerCallback): void;
/**
- * Performs pose detection on the provided single image and waits
- * synchronously for the response. Only use this method when the
- * PoseLandmarker is created with running mode `image`.
+ * Performs pose detection on the provided single image and invokes the
+ * callback with the response. The method returns synchronously once the
+ * callback returns. Only use this method when the PoseLandmarker is created
+ * with running mode `image`.
*
* @param image An image to process.
* @param imageProcessingOptions the `ImageProcessingOptions` specifying how
@@ -222,16 +222,42 @@ export class PoseLandmarker extends VisionTaskRunner {
* @param callback The callback that is invoked with the result. The
* lifetime of the returned masks is only guaranteed for the duration of
* the callback.
- * @return The detected pose landmarks.
*/
detect(
image: ImageSource, imageProcessingOptions: ImageProcessingOptions,
callback: PoseLandmarkerCallback): void;
+ /**
+ * Performs pose detection on the provided single image and waits
+ * synchronously for the response. This method creates a copy of the resulting
+ * masks and should not be used in high-throughput applications. Only
+ * use this method when the PoseLandmarker is created with running mode
+ * `image`.
+ *
+ * @param image An image to process.
+ * @return The landmarker result. Any masks are copied to avoid lifetime
+ * limits.
+ */
+ detect(image: ImageSource): PoseLandmarkerResult;
+ /**
+ * Performs pose detection on the provided single image and waits
+ * synchronously for the response. This method creates a copy of the resulting
+ * masks and should not be used in high-throughput applications. Only
+ * use this method when the PoseLandmarker is created with running mode
+ * `image`.
+ *
+ * @param image An image to process.
+ * @return The landmarker result. Any masks are copied to avoid lifetime
+ * limits.
+ */
+ detect(image: ImageSource, imageProcessingOptions: ImageProcessingOptions):
+ PoseLandmarkerResult;
detect(
image: ImageSource,
- imageProcessingOptionsOrCallback: ImageProcessingOptions|
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
PoseLandmarkerCallback,
- callback?: PoseLandmarkerCallback): void {
+ callback?: PoseLandmarkerCallback): PoseLandmarkerResult|void {
const imageProcessingOptions =
typeof imageProcessingOptionsOrCallback !== 'function' ?
imageProcessingOptionsOrCallback :
@@ -242,59 +268,94 @@ export class PoseLandmarker extends VisionTaskRunner {
this.resetResults();
this.processImageData(image, imageProcessingOptions);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result as PoseLandmarkerResult;
+ }
}
/**
- * Performs pose detection on the provided video frame and waits
- * synchronously for the response. Only use this method when the
- * PoseLandmarker is created with running mode `video`.
+ * Performs pose detection on the provided video frame and invokes the
+ * callback with the response. The method returns synchronously once the
+ * callback returns. Only use this method when the PoseLandmarker is created
+ * with running mode `video`.
*
* @param videoFrame A video frame to process.
* @param timestamp The timestamp of the current frame, in ms.
* @param callback The callback that is invoked with the result. The
* lifetime of the returned masks is only guaranteed for the duration of
* the callback.
- * @return The detected pose landmarks.
*/
detectForVideo(
videoFrame: ImageSource, timestamp: number,
callback: PoseLandmarkerCallback): void;
/**
- * Performs pose detection on the provided video frame and waits
- * synchronously for the response. Only use this method when the
- * PoseLandmarker is created with running mode `video`.
+ * Performs pose detection on the provided video frame and invokes the
+ * callback with the response. The method returns synchronously once the
+ * callback returns. Only use this method when the PoseLandmarker is created
+ * with running mode `video`.
*
* @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
* @param imageProcessingOptions the `ImageProcessingOptions` specifying how
* to process the input image before running inference.
- * @param timestamp The timestamp of the current frame, in ms.
* @param callback The callback that is invoked with the result. The
* lifetime of the returned masks is only guaranteed for the duration of
* the callback.
- * @return The detected pose landmarks.
*/
detectForVideo(
- videoFrame: ImageSource, imageProcessingOptions: ImageProcessingOptions,
- timestamp: number, callback: PoseLandmarkerCallback): void;
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions,
+ callback: PoseLandmarkerCallback): void;
+ /**
+ * Performs pose detection on the provided video frame and returns the result.
+ * This method creates a copy of the resulting masks and should not be used
+ * in high-throughput applications. Only use this method when the
+ * PoseLandmarker is created with running mode `video`.
+ *
+ * @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
+ * @return The landmarker result. Any masks are copied to extend the
+ * lifetime of the returned data.
+ */
+ detectForVideo(videoFrame: ImageSource, timestamp: number):
+ PoseLandmarkerResult;
+ /**
+ * Performs pose detection on the provided video frame and returns the result.
+ * This method creates a copy of the resulting masks and should not be used
+ * in high-throughput applications. Only use this method when the
+ * PoseLandmarker is created
+ * with running mode `video`.
+ *
+ * @param videoFrame A video frame to process.
+ * @param timestamp The timestamp of the current frame, in ms.
+ * @param imageProcessingOptions the `ImageProcessingOptions` specifying how
+ * to process the input image before running inference.
+ * @return The landmarker result. Any masks are copied to extend the lifetime
+ * of the returned data.
+ */
detectForVideo(
- videoFrame: ImageSource,
- timestampOrImageProcessingOptions: number|ImageProcessingOptions,
- timestampOrCallback: number|PoseLandmarkerCallback,
- callback?: PoseLandmarkerCallback): void {
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptions: ImageProcessingOptions): PoseLandmarkerResult;
+ detectForVideo(
+ videoFrame: ImageSource, timestamp: number,
+ imageProcessingOptionsOrCallback?: ImageProcessingOptions|
+ PoseLandmarkerCallback,
+ callback?: PoseLandmarkerCallback): PoseLandmarkerResult|void {
const imageProcessingOptions =
- typeof timestampOrImageProcessingOptions !== 'number' ?
- timestampOrImageProcessingOptions :
+ typeof imageProcessingOptionsOrCallback !== 'function' ?
+ imageProcessingOptionsOrCallback :
{};
- const timestamp = typeof timestampOrImageProcessingOptions === 'number' ?
- timestampOrImageProcessingOptions :
- timestampOrCallback as number;
- this.userCallback = typeof timestampOrCallback === 'function' ?
- timestampOrCallback :
- callback!;
+ this.userCallback = typeof imageProcessingOptionsOrCallback === 'function' ?
+ imageProcessingOptionsOrCallback :
+ callback;
+
this.resetResults();
this.processVideoData(videoFrame, imageProcessingOptions, timestamp);
- this.userCallback = () => {};
+
+ if (!this.userCallback) {
+ return this.result as PoseLandmarkerResult;
+ }
}
private resetResults(): void {
@@ -309,13 +370,13 @@ export class PoseLandmarker extends VisionTaskRunner {
if (!('worldLandmarks' in this.result)) {
return;
}
- if (!('landmarks' in this.result)) {
- return;
- }
if (this.outputSegmentationMasks && !('segmentationMasks' in this.result)) {
return;
}
- this.userCallback(this.result as Required<PoseLandmarkerResult>);
+
+ if (this.userCallback) {
+ this.userCallback(this.result as Required<PoseLandmarkerResult>);
+ }
}
/** Sets the default values for the graph. */
@@ -332,10 +393,11 @@ export class PoseLandmarker extends VisionTaskRunner {
* Converts raw data into a landmark, and adds it to our landmarks list.
*/
private addJsLandmarks(data: Uint8Array[]): void {
+ this.result.landmarks = [];
for (const binaryProto of data) {
const poseLandmarksProto =
NormalizedLandmarkList.deserializeBinary(binaryProto);
- this.result.landmarks = convertToLandmarks(poseLandmarksProto);
+ this.result.landmarks.push(convertToLandmarks(poseLandmarksProto));
}
}
@@ -344,24 +406,12 @@ export class PoseLandmarker extends VisionTaskRunner {
* worldLandmarks list.
*/
private adddJsWorldLandmarks(data: Uint8Array[]): void {
+ this.result.worldLandmarks = [];
for (const binaryProto of data) {
const poseWorldLandmarksProto =
LandmarkList.deserializeBinary(binaryProto);
- this.result.worldLandmarks =
- convertToWorldLandmarks(poseWorldLandmarksProto);
- }
- }
-
- /**
- * Converts raw data into a landmark, and adds it to our auxilary
- * landmarks list.
- */
- private addJsAuxiliaryLandmarks(data: Uint8Array[]): void {
- for (const binaryProto of data) {
- const auxiliaryLandmarksProto =
- NormalizedLandmarkList.deserializeBinary(binaryProto);
- this.result.auxilaryLandmarks =
- convertToLandmarks(auxiliaryLandmarksProto);
+ this.result.worldLandmarks.push(
+ convertToWorldLandmarks(poseWorldLandmarksProto));
}
}
@@ -372,7 +422,6 @@ export class PoseLandmarker extends VisionTaskRunner {
graphConfig.addInputStream(NORM_RECT_STREAM);
graphConfig.addOutputStream(NORM_LANDMARKS_STREAM);
graphConfig.addOutputStream(WORLD_LANDMARKS_STREAM);
- graphConfig.addOutputStream(AUXILIARY_LANDMARKS_STREAM);
graphConfig.addOutputStream(SEGMENTATION_MASK_STREAM);
const calculatorOptions = new CalculatorOptions();
@@ -385,8 +434,6 @@ export class PoseLandmarker extends VisionTaskRunner {
landmarkerNode.addInputStream('NORM_RECT:' + NORM_RECT_STREAM);
landmarkerNode.addOutputStream('NORM_LANDMARKS:' + NORM_LANDMARKS_STREAM);
landmarkerNode.addOutputStream('WORLD_LANDMARKS:' + WORLD_LANDMARKS_STREAM);
- landmarkerNode.addOutputStream(
- 'AUXILIARY_LANDMARKS:' + AUXILIARY_LANDMARKS_STREAM);
landmarkerNode.setOptions(calculatorOptions);
graphConfig.addNode(landmarkerNode);
@@ -417,26 +464,14 @@ export class PoseLandmarker extends VisionTaskRunner {
this.maybeInvokeCallback();
});
- this.graphRunner.attachProtoVectorListener(
- AUXILIARY_LANDMARKS_STREAM, (binaryProto, timestamp) => {
- this.addJsAuxiliaryLandmarks(binaryProto);
- this.setLatestOutputTimestamp(timestamp);
- this.maybeInvokeCallback();
- });
- this.graphRunner.attachEmptyPacketListener(
- AUXILIARY_LANDMARKS_STREAM, timestamp => {
- this.result.auxilaryLandmarks = [];
- this.setLatestOutputTimestamp(timestamp);
- this.maybeInvokeCallback();
- });
-
if (this.outputSegmentationMasks) {
landmarkerNode.addOutputStream(
'SEGMENTATION_MASK:' + SEGMENTATION_MASK_STREAM);
this.graphRunner.attachImageVectorListener(
SEGMENTATION_MASK_STREAM, (masks, timestamp) => {
- this.result.segmentationMasks =
- masks.map(wasmImage => this.convertToMPImage(wasmImage));
+ this.result.segmentationMasks = masks.map(
+ wasmImage => this.convertToMPMask(
+ wasmImage, /* shouldCopyData= */ !this.userCallback));
this.setLatestOutputTimestamp(timestamp);
this.maybeInvokeCallback();
});
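
A hedged sketch of the updated `PoseLandmarker` calling conventions implied by the hunks above: the timestamp now precedes the `ImageProcessingOptions` in `detectForVideo()`, the callback is optional, and `landmarks`/`worldLandmarks` are now per-pose arrays. Asset paths and the video element are placeholders; `createFromModelPath` and `setOptions` are the usual Tasks API entry points, not part of this diff.

```ts
import {FilesetResolver, PoseLandmarker} from '@mediapipe/tasks-vision';

async function demoPoseLandmarker(video: HTMLVideoElement) {
  const vision = await FilesetResolver.forVisionTasks('/wasm');  // placeholder
  const landmarker = await PoseLandmarker.createFromModelPath(
      vision, '/models/pose_landmarker.task');                   // placeholder
  await landmarker.setOptions({runningMode: 'VIDEO', numPoses: 2});

  // New parameter order: timestamp before the image processing options.
  landmarker.detectForVideo(
      video, performance.now(), {rotationDegrees: 0}, result => {
        // `landmarks` is now one list of landmarks per detected pose.
        for (const pose of result.landmarks) {
          console.log('first landmark of this pose:', pose[0]);
        }
      });

  // Synchronous overload: no callback, any segmentation masks are copied.
  const result = landmarker.detectForVideo(video, performance.now());
  console.log('poses detected:', result.landmarks.length);

  landmarker.close();
}
```
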
diff --git a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_result.d.ts b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_result.d.ts
index 66d0498a6..96e698a85 100644
--- a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_result.d.ts
+++ b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_result.d.ts
@@ -16,7 +16,7 @@
import {Category} from '../../../../tasks/web/components/containers/category';
import {Landmark, NormalizedLandmark} from '../../../../tasks/web/components/containers/landmark';
-import {MPImage} from '../../../../tasks/web/vision/core/image';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
export {Category, Landmark, NormalizedLandmark};
@@ -26,14 +26,11 @@ export {Category, Landmark, NormalizedLandmark};
*/
export declare interface PoseLandmarkerResult {
/** Pose landmarks of detected poses. */
- landmarks: NormalizedLandmark[];
+ landmarks: NormalizedLandmark[][];
/** Pose landmarks in world coordinates of detected poses. */
- worldLandmarks: Landmark[];
-
- /** Detected auxiliary landmarks, used for deriving ROI for next frame. */
- auxilaryLandmarks: NormalizedLandmark[];
+ worldLandmarks: Landmark[][];
/** Segmentation mask for the detected pose. */
- segmentationMasks?: MPImage[];
+ segmentationMasks?: MPMask[];
}
diff --git a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_test.ts b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_test.ts
index 794df68b8..d4a49db97 100644
--- a/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_test.ts
+++ b/mediapipe/tasks/web/vision/pose_landmarker/pose_landmarker_test.ts
@@ -18,7 +18,7 @@ import 'jasmine';
import {CalculatorGraphConfig} from '../../../../framework/calculator_pb';
import {createLandmarks, createWorldLandmarks} from '../../../../tasks/web/components/processors/landmark_result_test_lib';
import {addJasmineCustomFloatEqualityTester, createSpyWasmModule, MediapipeTasksFake, SpyWasmModule, verifyGraph} from '../../../../tasks/web/core/task_runner_test_utils';
-import {MPImage} from '../../../../tasks/web/vision/core/image';
+import {MPMask} from '../../../../tasks/web/vision/core/mask';
import {VisionGraphRunner} from '../../../../tasks/web/vision/core/vision_task_runner';
import {PoseLandmarker} from './pose_landmarker';
@@ -45,8 +45,7 @@ class PoseLandmarkerFake extends PoseLandmarker implements MediapipeTasksFake {
this.attachListenerSpies[0] =
spyOn(this.graphRunner, 'attachProtoVectorListener')
.and.callFake((stream, listener) => {
- expect(stream).toMatch(
- /(normalized_landmarks|world_landmarks|auxiliary_landmarks)/);
+ expect(stream).toMatch(/(normalized_landmarks|world_landmarks)/);
this.listeners.set(stream, listener as PacketListener);
});
this.attachListenerSpies[1] =
@@ -80,23 +79,23 @@ describe('PoseLandmarker', () => {
it('initializes graph', async () => {
verifyGraph(poseLandmarker);
- expect(poseLandmarker.listeners).toHaveSize(3);
+ expect(poseLandmarker.listeners).toHaveSize(2);
});
it('reloads graph when settings are changed', async () => {
await poseLandmarker.setOptions({numPoses: 1});
verifyGraph(poseLandmarker, [['poseDetectorGraphOptions', 'numPoses'], 1]);
- expect(poseLandmarker.listeners).toHaveSize(3);
+ expect(poseLandmarker.listeners).toHaveSize(2);
await poseLandmarker.setOptions({numPoses: 5});
verifyGraph(poseLandmarker, [['poseDetectorGraphOptions', 'numPoses'], 5]);
- expect(poseLandmarker.listeners).toHaveSize(3);
+ expect(poseLandmarker.listeners).toHaveSize(2);
});
it('registers listener for segmentation masks', async () => {
- expect(poseLandmarker.listeners).toHaveSize(3);
+ expect(poseLandmarker.listeners).toHaveSize(2);
await poseLandmarker.setOptions({outputSegmentationMasks: true});
- expect(poseLandmarker.listeners).toHaveSize(4);
+ expect(poseLandmarker.listeners).toHaveSize(3);
});
it('merges options', async () => {
@@ -209,8 +208,6 @@ describe('PoseLandmarker', () => {
(landmarksProto, 1337);
poseLandmarker.listeners.get('world_landmarks')!
(worldLandmarksProto, 1337);
- poseLandmarker.listeners.get('auxiliary_landmarks')!
- (landmarksProto, 1337);
poseLandmarker.listeners.get('segmentation_masks')!(masks, 1337);
});
@@ -222,10 +219,9 @@ describe('PoseLandmarker', () => {
.toHaveBeenCalledTimes(1);
expect(poseLandmarker.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
- expect(result.landmarks).toEqual([{'x': 0, 'y': 0, 'z': 0}]);
- expect(result.worldLandmarks).toEqual([{'x': 0, 'y': 0, 'z': 0}]);
- expect(result.auxilaryLandmarks).toEqual([{'x': 0, 'y': 0, 'z': 0}]);
- expect(result.segmentationMasks![0]).toBeInstanceOf(MPImage);
+ expect(result.landmarks).toEqual([[{'x': 0, 'y': 0, 'z': 0}]]);
+ expect(result.worldLandmarks).toEqual([[{'x': 0, 'y': 0, 'z': 0}]]);
+ expect(result.segmentationMasks![0]).toBeInstanceOf(MPMask);
done();
});
});
@@ -240,8 +236,6 @@ describe('PoseLandmarker', () => {
(landmarksProto, 1337);
poseLandmarker.listeners.get('world_landmarks')!
(worldLandmarksProto, 1337);
- poseLandmarker.listeners.get('auxiliary_landmarks')!
- (landmarksProto, 1337);
});
// Invoke the pose landmarker twice
@@ -261,7 +255,39 @@ describe('PoseLandmarker', () => {
expect(landmarks1).toEqual(landmarks2);
});
- it('invokes listener once masks are avaiblae', (done) => {
+ it('supports multiple poses', (done) => {
+ const landmarksProto = [
+ createLandmarks(0.1, 0.2, 0.3).serializeBinary(),
+ createLandmarks(0.4, 0.5, 0.6).serializeBinary()
+ ];
+ const worldLandmarksProto = [
+ createWorldLandmarks(1, 2, 3).serializeBinary(),
+ createWorldLandmarks(4, 5, 6).serializeBinary()
+ ];
+
+ poseLandmarker.setOptions({numPoses: 1});
+
+ // Pass the test data to our listener
+ poseLandmarker.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ poseLandmarker.listeners.get('normalized_landmarks')!
+ (landmarksProto, 1337);
+ poseLandmarker.listeners.get('world_landmarks')!
+ (worldLandmarksProto, 1337);
+ });
+
+ // Invoke the pose landmarker
+ poseLandmarker.detect({} as HTMLImageElement, result => {
+ expect(result.landmarks).toEqual([
+ [{'x': 0.1, 'y': 0.2, 'z': 0.3}], [{'x': 0.4, 'y': 0.5, 'z': 0.6}]
+ ]);
+ expect(result.worldLandmarks).toEqual([
+ [{'x': 1, 'y': 2, 'z': 3}], [{'x': 4, 'y': 5, 'z': 6}]
+ ]);
+ done();
+ });
+ });
+
+ it('invokes listener once masks are available', (done) => {
const landmarksProto = [createLandmarks().serializeBinary()];
const worldLandmarksProto = [createWorldLandmarks().serializeBinary()];
const masks = [
@@ -281,8 +307,6 @@ describe('PoseLandmarker', () => {
poseLandmarker.listeners.get('world_landmarks')!
(worldLandmarksProto, 1337);
expect(listenerCalled).toBeFalse();
- poseLandmarker.listeners.get('auxiliary_landmarks')!
- (landmarksProto, 1337);
expect(listenerCalled).toBeFalse();
poseLandmarker.listeners.get('segmentation_masks')!(masks, 1337);
expect(listenerCalled).toBeTrue();
@@ -294,4 +318,23 @@ describe('PoseLandmarker', () => {
listenerCalled = true;
});
});
+
+ it('returns result', () => {
+ const landmarksProto = [createLandmarks().serializeBinary()];
+ const worldLandmarksProto = [createWorldLandmarks().serializeBinary()];
+
+ // Pass the test data to our listener
+ poseLandmarker.fakeWasmModule._waitUntilIdle.and.callFake(() => {
+ poseLandmarker.listeners.get('normalized_landmarks')!
+ (landmarksProto, 1337);
+ poseLandmarker.listeners.get('world_landmarks')!
+ (worldLandmarksProto, 1337);
+ });
+
+ // Invoke the pose landmarker
+ const result = poseLandmarker.detect({} as HTMLImageElement);
+ expect(poseLandmarker.fakeWasmModule._waitUntilIdle).toHaveBeenCalled();
+ expect(result.landmarks).toEqual([[{'x': 0, 'y': 0, 'z': 0}]]);
+ expect(result.worldLandmarks).toEqual([[{'x': 0, 'y': 0, 'z': 0}]]);
+ });
});
diff --git a/mediapipe/tasks/web/vision/types.ts b/mediapipe/tasks/web/vision/types.ts
index 164276bab..760b97b77 100644
--- a/mediapipe/tasks/web/vision/types.ts
+++ b/mediapipe/tasks/web/vision/types.ts
@@ -16,7 +16,8 @@
export * from '../../../tasks/web/core/fileset_resolver';
export * from '../../../tasks/web/vision/core/drawing_utils';
-export {MPImage, MPImageChannelConverter, MPImageType} from '../../../tasks/web/vision/core/image';
+export {MPImage} from '../../../tasks/web/vision/core/image';
+export {MPMask} from '../../../tasks/web/vision/core/mask';
export * from '../../../tasks/web/vision/face_detector/face_detector';
export * from '../../../tasks/web/vision/face_landmarker/face_landmarker';
export * from '../../../tasks/web/vision/face_stylizer/face_stylizer';
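
Both container types are now exported from the vision bundle, so downstream code can tell stylized images (`MPImage`) apart from segmentation output (`MPMask`). A small illustrative type guard, assuming the published bundle re-exports the names exactly as shown above:

```ts
import {MPImage, MPMask} from '@mediapipe/tasks-vision';

// Illustrative helper: both containers expose width/height, but only masks
// carry per-pixel category or confidence data.
function describeOutput(output: MPImage|MPMask): string {
  const kind = output instanceof MPMask ? 'mask' : 'image';
  return `${kind} (${output.width}x${output.height})`;
}
```
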
diff --git a/mediapipe/web/graph_runner/graph_runner_image_lib.ts b/mediapipe/web/graph_runner/graph_runner_image_lib.ts
index 8b491d891..d2d6e52a8 100644
--- a/mediapipe/web/graph_runner/graph_runner_image_lib.ts
+++ b/mediapipe/web/graph_runner/graph_runner_image_lib.ts
@@ -10,7 +10,7 @@ type LibConstructor = new (...args: any[]) => GraphRunner;
/** An image returned from a MediaPipe graph. */
export interface WasmImage {
- data: Uint8ClampedArray|Float32Array|WebGLTexture;
+ data: Uint8Array|Float32Array|WebGLTexture;
width: number;
height: number;
}
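
Since `WasmImage.data` no longer includes `Uint8ClampedArray`, CPU-side consumers can narrow the union with plain `instanceof` checks. A minimal sketch (the helper itself is hypothetical; only the `WasmImage` fields come from the interface above):

```ts
import {WasmImage} from './graph_runner_image_lib';  // path is illustrative

// Hypothetical helper: report how the image data is backed after this change.
function describeBacking(image: WasmImage): string {
  if (image.data instanceof Uint8Array) {
    return `uint8 mask, ${image.width}x${image.height}`;
  }
  if (image.data instanceof Float32Array) {
    return `float32 mask, ${image.width}x${image.height}`;
  }
  return 'GPU-backed WebGLTexture';
}
```
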
diff --git a/third_party/BUILD b/third_party/BUILD
index 7522bab1b..60fa73799 100644
--- a/third_party/BUILD
+++ b/third_party/BUILD
@@ -13,6 +13,9 @@
# limitations under the License.
#
+load("@rules_foreign_cc//tools/build_defs:cmake.bzl", "cmake_external")
+load("@bazel_skylib//:bzl_library.bzl", "bzl_library")
+
licenses(["notice"]) # Apache License 2.0
exports_files(["LICENSE"])
@@ -61,16 +64,73 @@ config_setting(
visibility = ["//visibility:public"],
)
+config_setting(
+ name = "opencv_ios_arm64_source_build",
+ define_values = {
+ "OPENCV": "source",
+ },
+ values = {
+ "apple_platform_type": "ios",
+ "cpu": "ios_arm64",
+ },
+)
+
+config_setting(
+ name = "opencv_ios_sim_arm64_source_build",
+ define_values = {
+ "OPENCV": "source",
+ },
+ values = {
+ "apple_platform_type": "ios",
+ "cpu": "ios_sim_arm64",
+ },
+)
+
+config_setting(
+ name = "opencv_ios_x86_64_source_build",
+ define_values = {
+ "OPENCV": "source",
+ },
+ values = {
+ "apple_platform_type": "ios",
+ "cpu": "ios_x86_64",
+ },
+)
+
+config_setting(
+ name = "opencv_ios_sim_fat_source_build",
+ define_values = {
+ "OPENCV": "source",
+ },
+ values = {
+ "apple_platform_type": "ios",
+ "ios_multi_cpus": "sim_arm64, x86_64",
+ },
+)
+
alias(
name = "opencv",
actual = select({
":opencv_source_build": ":opencv_cmake",
+ ":opencv_ios_sim_arm64_source_build": "@ios_opencv_source//:opencv",
+ ":opencv_ios_sim_fat_source_build": "@ios_opencv_source//:opencv",
+ ":opencv_ios_arm64_source_build": "@ios_opencv_source//:opencv",
"//conditions:default": ":opencv_binary",
}),
visibility = ["//visibility:public"],
)
-load("@rules_foreign_cc//tools/build_defs:cmake.bzl", "cmake_external")
+bzl_library(
+ name = "opencv_ios_xcframework_files_bzl",
+ srcs = ["opencv_ios_xcframework_files.bzl"],
+ visibility = ["//visibility:private"],
+)
+
+bzl_library(
+ name = "opencv_ios_source_bzl",
+ srcs = ["opencv_ios_source.bzl"],
+ visibility = ["//visibility:private"],
+)
# Note: this determines the order in which the libraries are passed to the
# linker, so if library A depends on library B, library B must come _after_.
diff --git a/third_party/external_files.bzl b/third_party/external_files.bzl
index af9361bb3..652a2947f 100644
--- a/third_party/external_files.bzl
+++ b/third_party/external_files.bzl
@@ -204,8 +204,8 @@ def external_files():
http_file(
name = "com_google_mediapipe_conv2d_input_channel_1_tflite",
- sha256 = "126edac445967799f3b8b124d15483b1506f6d6cb57a501c1636eb8f2fb3734f",
- urls = ["https://storage.googleapis.com/mediapipe-assets/conv2d_input_channel_1.tflite?generation=1678218348519744"],
+ sha256 = "ccb667092f3aed3a35a57fb3478fecc0c8f6360dbf477a9db9c24e5b3ec4273e",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/conv2d_input_channel_1.tflite?generation=1683252905577703"],
)
http_file(
@@ -246,8 +246,8 @@ def external_files():
http_file(
name = "com_google_mediapipe_dense_tflite",
- sha256 = "be9323068461b1cbf412692ee916be30dcb1a5fb59a9ee875d470bc340d9e869",
- urls = ["https://storage.googleapis.com/mediapipe-assets/dense.tflite?generation=1678218351373709"],
+ sha256 = "6795e7c3a263f44e97be048a5e1166e0921b453bfbaf037f4f69ac5c059ee945",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/dense.tflite?generation=1683252907920466"],
)
http_file(
@@ -960,8 +960,8 @@ def external_files():
http_file(
name = "com_google_mediapipe_portrait_selfie_segmentation_expected_category_mask_jpg",
- sha256 = "d8f20fa746e14067f668dd293f21bbc50ec81196d186386a6ded1278c3ec8f46",
- urls = ["https://storage.googleapis.com/mediapipe-assets/portrait_selfie_segmentation_expected_category_mask.jpg?generation=1678606935088873"],
+ sha256 = "1400c6fccf3805bfd1644d7ed9be98dfa4f900e1720838c566963f8d9f10f5d0",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/portrait_selfie_segmentation_expected_category_mask.jpg?generation=1683332555306471"],
)
http_file(
@@ -972,8 +972,8 @@ def external_files():
http_file(
name = "com_google_mediapipe_portrait_selfie_segmentation_landscape_expected_category_mask_jpg",
- sha256 = "f5c3fa3d93f8e7289b69b8a89c2519276dfa5014dcc50ed6e86e8cd4d4ae7f27",
- urls = ["https://storage.googleapis.com/mediapipe-assets/portrait_selfie_segmentation_landscape_expected_category_mask.jpg?generation=1678606939469429"],
+ sha256 = "a208aeeeb615fd40046d883e2c7982458e1b12edd6526e88c305c4053b0a9399",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/portrait_selfie_segmentation_landscape_expected_category_mask.jpg?generation=1683332557473435"],
)
http_file(
@@ -1158,14 +1158,14 @@ def external_files():
http_file(
name = "com_google_mediapipe_selfie_segmentation_landscape_tflite",
- sha256 = "28fb4c287d6295a2dba6c1f43b43315a37f927ddcd6693d635d625d176eef162",
- urls = ["https://storage.googleapis.com/mediapipe-assets/selfie_segmentation_landscape.tflite?generation=1678775102234495"],
+ sha256 = "a77d03f4659b9f6b6c1f5106947bf40e99d7655094b6527f214ea7d451106edd",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/selfie_segmentation_landscape.tflite?generation=1683332561312022"],
)
http_file(
name = "com_google_mediapipe_selfie_segmentation_tflite",
- sha256 = "b0e2ec6f95107795b952b27f3d92806b45f0bc069dac76dcd264cd1b90d61c6c",
- urls = ["https://storage.googleapis.com/mediapipe-assets/selfie_segmentation.tflite?generation=1678775104900954"],
+ sha256 = "9ee168ec7c8f2a16c56fe8e1cfbc514974cbbb7e434051b455635f1bd1462f5c",
+ urls = ["https://storage.googleapis.com/mediapipe-assets/selfie_segmentation.tflite?generation=1683332563830600"],
)
http_file(
diff --git a/third_party/opencv_ios_source.BUILD b/third_party/opencv_ios_source.BUILD
new file mode 100644
index 000000000..c0cb65908
--- /dev/null
+++ b/third_party/opencv_ios_source.BUILD
@@ -0,0 +1,125 @@
+# Description:
+# OpenCV xcframework for video/image processing on iOS.
+
+licenses(["notice"]) # BSD license
+
+exports_files(["LICENSE"])
+
+load(
+ "@build_bazel_rules_apple//apple:apple.bzl",
+ "apple_static_xcframework_import",
+)
+load(
+ "@//third_party:opencv_ios_source.bzl",
+ "select_headers",
+ "unzip_opencv_xcframework",
+)
+
+# Build opencv2.xcframework from source using a convenience script provided in
+# OPENCV sources and zip the xcframework. We only build the modules required by MediaPipe by specifying
+# the modules to be ignored as command line arguments.
+# We also specify the simulator and device architectures we are building for.
+# Currently we only support iOS arm64 (M1 Macs) and x86_64 (Intel Macs) simulators
+# and arm64 iOS devices.
+# Bitcode and Swift support are disabled. Swift support will be added when the
+# final binary for the MediaPipe iOS Task libraries is built. Shipping with OPENCV built with
+# Swift support throws linker errors when the MediaPipe framework is used from
+# an iOS project.
+genrule(
+ name = "build_opencv_xcframework",
+ srcs = glob(["opencv-4.5.1/**"]),
+ outs = ["opencv2.xcframework.zip"],
+ cmd = "&&".join([
+ "$(location opencv-4.5.1/platforms/apple/build_xcframework.py) \
+ --iphonesimulator_archs arm64,x86_64 \
+ --iphoneos_archs arm64 \
+ --without dnn \
+ --without ml \
+ --without stitching \
+ --without photo \
+ --without objdetect \
+ --without gapi \
+ --without flann \
+ --disable PROTOBUF \
+ --disable-bitcode \
+ --disable-swift \
+ --build_only_specified_archs \
+ --out $(@D)",
+ "cd $(@D)",
+ "zip --symlinks -r opencv2.xcframework.zip opencv2.xcframework",
+ ]),
+)
+
+# Unzips `opencv2.xcframework.zip` built from source by `build_opencv_xcframework`
+# genrule and returns an exhaustive list of all its files including symlinks.
+unzip_opencv_xcframework(
+ name = "opencv2_unzipped_xcframework_files",
+ zip_file = "opencv2.xcframework.zip",
+)
+
+# Imports the files of the unzipped `opencv2.xcframework` as an apple static
+# framework which can be linked to iOS targets.
+apple_static_xcframework_import(
+ name = "opencv_xcframework",
+ visibility = ["//visibility:public"],
+ xcframework_imports = [":opencv2_unzipped_xcframework_files"],
+)
+
+# Filters the headers for each platform in `opencv2.xcframework` which will be
+# used as headers in a `cc_library` that can be linked to C++ targets.
+select_headers(
+ name = "opencv_xcframework_device_headers",
+ srcs = [":opencv_xcframework"],
+ platform = "ios-arm64",
+)
+
+select_headers(
+ name = "opencv_xcframework_simulator_headers",
+ srcs = [":opencv_xcframework"],
+ platform = "ios-arm64_x86_64-simulator",
+)
+
+# `cc_library` that can be linked to C++ targets to import opencv headers.
+cc_library(
+ name = "opencv",
+ hdrs = select({
+ "@//mediapipe:ios_x86_64": [
+ ":opencv_xcframework_simulator_headers",
+ ],
+ "@//mediapipe:ios_sim_arm64": [
+ ":opencv_xcframework_simulator_headers",
+ ],
+ "@//mediapipe:ios_arm64": [
+ ":opencv_xcframework_simulator_headers",
+ ],
+    # A value from above is chosen arbitrarily.
+ "//conditions:default": [
+ ":opencv_xcframework_simulator_headers",
+ ],
+ }),
+ copts = [
+ "-std=c++11",
+ "-x objective-c++",
+ ],
+ include_prefix = "opencv2",
+ linkopts = [
+ "-framework AssetsLibrary",
+ "-framework CoreFoundation",
+ "-framework CoreGraphics",
+ "-framework CoreMedia",
+ "-framework Accelerate",
+ "-framework CoreImage",
+ "-framework AVFoundation",
+ "-framework CoreVideo",
+ "-framework QuartzCore",
+ ],
+ strip_include_prefix = select({
+ "@//mediapipe:ios_x86_64": "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers",
+ "@//mediapipe:ios_sim_arm64": "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers",
+ "@//mediapipe:ios_arm64": "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers",
+    # An arbitrary value is selected for the default case.
+ "//conditions:default": "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers",
+ }),
+ visibility = ["//visibility:public"],
+ deps = [":opencv_xcframework"],
+)
diff --git a/third_party/opencv_ios_source.bzl b/third_party/opencv_ios_source.bzl
new file mode 100644
index 000000000..e46fb4cac
--- /dev/null
+++ b/third_party/opencv_ios_source.bzl
@@ -0,0 +1,158 @@
+# Copyright 2023 The MediaPipe Authors. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Custom rules for building iOS OpenCV xcframework from sources."""
+
+load(
+ "@//third_party:opencv_ios_xcframework_files.bzl",
+ "OPENCV_XCFRAMEWORK_INFO_PLIST_PATH",
+ "OPENCV_XCFRAMEWORK_IOS_DEVICE_FILE_PATHS",
+ "OPENCV_XCFRAMEWORK_IOS_SIMULATOR_FILE_PATHS",
+)
+
+_OPENCV_XCFRAMEWORK_DIR_NAME = "opencv2.xcframework"
+_OPENCV_FRAMEWORK_DIR_NAME = "opencv2.framework"
+_OPENCV_SIMULATOR_PLATFORM_DIR_NAME = "ios-arm64_x86_64-simulator"
+_OPENCV_DEVICE_PLATFORM_DIR_NAME = "ios-arm64"
+
+def _select_headers_impl(ctx):
+ _files = [
+ f
+ for f in ctx.files.srcs
+ if (f.basename.endswith(".h") or f.basename.endswith(".hpp")) and
+ f.dirname.find(ctx.attr.platform) != -1
+ ]
+ return [DefaultInfo(files = depset(_files))]
+
+# This rule selects only the headers from an apple static xcframework filtered by
+# an input platform string.
+select_headers = rule(
+ implementation = _select_headers_impl,
+ attrs = {
+ "srcs": attr.label_list(mandatory = True, allow_files = True),
+ "platform": attr.string(mandatory = True),
+ },
+)
+
+# This function declares and returns symlinks to the directories within each platform
+# in `opencv2.xcframework` expected to be present.
+# The symlinks are created according to the structure stipulated by apple xcframeworks
+# so that they can be correctly consumed by the `apple_static_xcframework_import` rule.
+def _opencv2_directory_symlinks(ctx, platforms):
+ basenames = ["Resources", "Headers", "Modules", "Versions/Current"]
+ symlinks = []
+
+ for platform in platforms:
+ symlinks = symlinks + [
+ ctx.actions.declare_symlink(
+ _OPENCV_XCFRAMEWORK_DIR_NAME + "/{}/{}/{}".format(platform, _OPENCV_FRAMEWORK_DIR_NAME, name),
+ )
+ for name in basenames
+ ]
+
+ return symlinks
+
+# This function declares and returns all the files for each platform expected
+# to be present in `opencv2.xcframework` after the unzipping action is run.
+def _opencv2_file_list(ctx, platform_filepath_lists):
+ binary_name = "opencv2"
+ output_files = []
+ binaries_to_symlink = []
+
+ for (platform, filepaths) in platform_filepath_lists:
+ for path in filepaths:
+ file = ctx.actions.declare_file(path)
+ output_files.append(file)
+ if path.endswith(binary_name):
+ symlink_output = ctx.actions.declare_file(
+ _OPENCV_XCFRAMEWORK_DIR_NAME + "/{}/{}/{}".format(
+ platform,
+ _OPENCV_FRAMEWORK_DIR_NAME,
+ binary_name,
+ ),
+ )
+ binaries_to_symlink.append((symlink_output, file))
+
+ return output_files, binaries_to_symlink
+
+def _unzip_opencv_xcframework_impl(ctx):
+ # Array to iterate over the various platforms to declare output files and
+ # symlinks.
+ platform_filepath_lists = [
+ (_OPENCV_SIMULATOR_PLATFORM_DIR_NAME, OPENCV_XCFRAMEWORK_IOS_SIMULATOR_FILE_PATHS),
+ (_OPENCV_DEVICE_PLATFORM_DIR_NAME, OPENCV_XCFRAMEWORK_IOS_DEVICE_FILE_PATHS),
+ ]
+
+ # Gets an exhaustive list of output files which are present in the xcframework.
+    # Also gets an array of `(binary symlink, binary)` pairs which are to be symlinked
+ # using `ctx.actions.symlink()`.
+ output_files, binaries_to_symlink = _opencv2_file_list(ctx, platform_filepath_lists)
+ output_files.append(ctx.actions.declare_file(OPENCV_XCFRAMEWORK_INFO_PLIST_PATH))
+
+ # xcframeworks have a directory structure in which the `opencv2.framework` folders for each
+ # platform contain directories which are symlinked to the respective folders of the version
+ # in use. Simply unzipping the zip of the framework will not make Bazel treat these
+    # as symlinks. They have to be explicitly declared as symlinks using `ctx.actions.declare_symlink()`.
+ directory_symlinks = _opencv2_directory_symlinks(
+ ctx,
+ [_OPENCV_SIMULATOR_PLATFORM_DIR_NAME, _OPENCV_DEVICE_PLATFORM_DIR_NAME],
+ )
+
+ output_files = output_files + directory_symlinks
+
+ args = ctx.actions.args()
+
+ # Add the path of the zip file to be unzipped as an argument to be passed to
+ # `run_shell` action.
+ args.add(ctx.file.zip_file.path)
+
+ # Add the path to the directory in which the framework is to be unzipped to.
+ args.add(ctx.file.zip_file.dirname)
+
+ ctx.actions.run_shell(
+ inputs = [ctx.file.zip_file],
+ outputs = output_files,
+ arguments = [args],
+ progress_message = "Unzipping %s" % ctx.file.zip_file.short_path,
+ command = "unzip -qq $1 -d $2",
+ )
+
+ # The symlinks of the opencv2 binaries for each platform in the xcframework
+    # have to be created using `ctx.actions.symlink()`, unlike the directory
+    # symlinks which can be expected to be valid once unzipping is completed.
+    # Otherwise, when tests are run, the linker complains that the binary is
+ # not found.
+ binary_symlink_files = []
+ for (symlink_output, binary_file) in binaries_to_symlink:
+ ctx.actions.symlink(output = symlink_output, target_file = binary_file)
+ binary_symlink_files.append(symlink_output)
+
+ # Return all the declared output files and symlinks as the output of this
+ # rule.
+ return [DefaultInfo(files = depset(output_files + binary_symlink_files))]
+
+# This rule unzips an `opencv2.xcframework.zip` created by a genrule that
+# invokes a Python script in the OpenCV 4.5.1 GitHub archive.
+# It returns all the contents of opencv2.xcframework as a list of files in the
+# output. This rule works by explicitly declaring files at hardcoded
+# paths in the opencv2 xcframework bundle which are expected to be present when
+# the zip file is unzipped. This is a prerequisite since the outputs of this rule
+# will be consumed by apple_static_xcframework_import which can only take a list
+# of files as inputs.
+unzip_opencv_xcframework = rule(
+ implementation = _unzip_opencv_xcframework_impl,
+ attrs = {
+ "zip_file": attr.label(mandatory = True, allow_single_file = True),
+ },
+)
diff --git a/third_party/opencv_ios_xcframework_files.bzl b/third_party/opencv_ios_xcframework_files.bzl
new file mode 100644
index 000000000..f3ea23883
--- /dev/null
+++ b/third_party/opencv_ios_xcframework_files.bzl
@@ -0,0 +1,468 @@
+# Copyright 2023 The MediaPipe Authors. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""List of file paths in the `opencv2.xcframework` bundle."""
+
+OPENCV_XCFRAMEWORK_INFO_PLIST_PATH = "opencv2.xcframework/Info.plist"
+
+OPENCV_XCFRAMEWORK_IOS_SIMULATOR_FILE_PATHS = [
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Resources/Info.plist",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Moments.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfRect2d.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfFloat4.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfPoint2i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/video/tracking.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/video/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/video/background_segm.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/video/video.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Double3.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfByte.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Range.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Core.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Size2f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/world.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/opencv2-Swift.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/fast_math.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda_types.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/check.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cv_cpu_dispatch.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utility.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/softfloat.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cv_cpu_helper.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cvstd.inl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/msa_macros.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_rvv.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/simd_utils.impl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_wasm.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_neon.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_avx.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_avx512.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_vsx.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/interface.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_msa.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_cpp.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_forward.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_sse.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/intrin_sse_em.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/hal/hal.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/async.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/bufferpool.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/ovx.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/optim.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/va_intel.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cvdef.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/warp.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/filters.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/dynamic_smem.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/reduce.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/utility.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/warp_shuffle.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/border_interpolate.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/transform.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/saturate_cast.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/vec_math.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/functional.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/limits.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/type_traits.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/vec_distance.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/block.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/reduce.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/reduce_key_val.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/color_detail.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/type_traits_detail.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/vec_distance_detail.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/detail/transform_detail.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/emulation.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/color.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/datamov_utils.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/funcattrib.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/common.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/vec_traits.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/simd_functions.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/warp_reduce.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda/scan.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/traits.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opengl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cvstd_wrapper.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda.inl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/eigen.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda_stream_accessor.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/ocl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cuda.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/affine.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/mat.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/logger.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/allocator_stats.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/allocator_stats.impl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/logtag.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/filesystem.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/tls.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/trace.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/instrumentation.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/utils/logger.defines.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/quaternion.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/neon_utils.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/sse_utils.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/version.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/opencl_info.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_gl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_definitions.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_hsa_extension.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_clamdblas.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_core.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_20.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_core_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_gl_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_clamdfft.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_gl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_clamdblas.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_core.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_core_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_gl_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_clamdfft.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/ocl_defs.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/opencl/opencl_svm.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/ocl_genbase.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/detail/async_promise.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/detail/exception_ptr.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/simd_intrinsics.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/matx.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/directx.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/base.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/operations.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/vsx_utils.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/persistence.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/mat.inl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/types_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/cvstd.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/types.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/bindings_utils.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/quaternion.inl.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/saturate.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/core_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core/core.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Converters.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Mat.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Algorithm.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/opencv.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Mat+Converters.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/ByteVector.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/imgproc.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/imgproc_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/hal/interface.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/hal/hal.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/detail/gcgraph.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgproc/types_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/highgui.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/features2d.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point2f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/KeyPoint.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Rect2f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Float6.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfKeyPoint.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfRect2i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/FloatVector.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/TermCriteria.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/opencv2.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Int4.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfDMatch.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Scalar.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point3f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfDouble.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/IntVector.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/RotatedRect.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfFloat6.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/cvconfig.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/DoubleVector.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Size2d.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MinMaxLocResult.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfInt4.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Rect2i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point2i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfPoint3.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfRotatedRect.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/DMatch.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/TickMeter.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point3i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/video.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs/ios.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs/macosx.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs/imgcodecs.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs/imgcodecs_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/CvType.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/CVObjcUtil.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Size2i.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/imgcodecs.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Float4.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio/registry.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio/cap_ios.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio/videoio.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio/videoio_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfFloat.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Rect2d.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfPoint2f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point2d.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/highgui/highgui.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/highgui/highgui_c.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Double2.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/CvCamera2.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/features2d/hal/interface.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/features2d/features2d.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/videoio.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/opencv_modules.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/core.hpp",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfInt.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/ArrayUtil.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/MatOfPoint3f.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Headers/Point3d.h",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/x86_64-apple-ios-simulator.swiftinterface",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios-simulator.abi.json",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios-simulator.private.swiftinterface",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/x86_64-apple-ios-simulator.swiftdoc",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/Project/arm64-apple-ios-simulator.swiftsourceinfo",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/Project/x86_64-apple-ios-simulator.swiftsourceinfo",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios-simulator.swiftinterface",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/x86_64-apple-ios-simulator.private.swiftinterface",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios-simulator.swiftdoc",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/x86_64-apple-ios-simulator.abi.json",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/Modules/module.modulemap",
+ "opencv2.xcframework/ios-arm64_x86_64-simulator/opencv2.framework/Versions/A/opencv2",
+]
+
+OPENCV_XCFRAMEWORK_IOS_DEVICE_FILE_PATHS = [
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Resources/Info.plist",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Moments.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfRect2d.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfFloat4.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfPoint2i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/video/tracking.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/video/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/video/background_segm.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/video/video.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Double3.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfByte.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Range.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Core.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Size2f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/world.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/opencv2-Swift.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/fast_math.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda_types.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/check.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cv_cpu_dispatch.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utility.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/softfloat.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cv_cpu_helper.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cvstd.inl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/msa_macros.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_rvv.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/simd_utils.impl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_wasm.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_neon.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_avx.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_avx512.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_vsx.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/interface.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_msa.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_cpp.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_forward.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_sse.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/intrin_sse_em.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/hal/hal.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/async.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/bufferpool.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/ovx.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/optim.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/va_intel.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cvdef.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/warp.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/filters.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/dynamic_smem.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/reduce.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/utility.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/warp_shuffle.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/border_interpolate.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/transform.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/saturate_cast.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/vec_math.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/functional.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/limits.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/type_traits.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/vec_distance.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/block.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/reduce.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/reduce_key_val.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/color_detail.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/type_traits_detail.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/vec_distance_detail.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/detail/transform_detail.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/emulation.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/color.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/datamov_utils.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/funcattrib.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/common.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/vec_traits.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/simd_functions.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/warp_reduce.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda/scan.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/traits.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opengl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cvstd_wrapper.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda.inl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/eigen.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda_stream_accessor.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/ocl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cuda.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/affine.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/mat.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/logger.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/allocator_stats.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/allocator_stats.impl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/logtag.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/filesystem.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/tls.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/trace.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/instrumentation.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/utils/logger.defines.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/quaternion.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/neon_utils.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/sse_utils.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/version.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/opencl_info.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_gl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_definitions.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_hsa_extension.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_clamdblas.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_core.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_svm_20.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_core_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_gl_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/opencl_clamdfft.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_gl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_clamdblas.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_core.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_core_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_gl_wrappers.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/runtime/autogenerated/opencl_clamdfft.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/ocl_defs.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/opencl/opencl_svm.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/ocl_genbase.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/detail/async_promise.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/detail/exception_ptr.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/simd_intrinsics.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/matx.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/directx.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/base.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/operations.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/vsx_utils.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/persistence.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/mat.inl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/types_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/cvstd.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/types.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/bindings_utils.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/quaternion.inl.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/saturate.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/core_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core/core.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Converters.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Mat.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Algorithm.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/opencv.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Mat+Converters.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/ByteVector.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/imgproc.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/imgproc_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/hal/interface.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/hal/hal.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/detail/gcgraph.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgproc/types_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/highgui.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/features2d.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point2f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/KeyPoint.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Rect2f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Float6.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfKeyPoint.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfRect2i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/FloatVector.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/TermCriteria.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/opencv2.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Int4.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfDMatch.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Scalar.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point3f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfDouble.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/IntVector.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/RotatedRect.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfFloat6.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/cvconfig.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/DoubleVector.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Size2d.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MinMaxLocResult.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfInt4.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Rect2i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point2i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfPoint3.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfRotatedRect.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/DMatch.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/TickMeter.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point3i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/video.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs/ios.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs/macosx.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs/imgcodecs.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs/imgcodecs_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/CvType.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/CVObjcUtil.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Size2i.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/imgcodecs.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Float4.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio/registry.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio/cap_ios.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio/legacy/constants_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio/videoio.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio/videoio_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfFloat.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Rect2d.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfPoint2f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point2d.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/highgui/highgui.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/highgui/highgui_c.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Double2.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/CvCamera2.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/features2d/hal/interface.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/features2d/features2d.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/videoio.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/opencv_modules.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/core.hpp",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfInt.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/ArrayUtil.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/MatOfPoint3f.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Headers/Point3d.h",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios.swiftinterface",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios.swiftdoc",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/Project/arm64-apple-ios.swiftsourceinfo",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios.abi.json",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/opencv2.swiftmodule/arm64-apple-ios.private.swiftinterface",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/Modules/module.modulemap",
+ "opencv2.xcframework/ios-arm64/opencv2.framework/Versions/A/opencv2",
+]