mediapipe/mediapipe/calculators/image/scale_image_calculator.proto

// Options for ScaleImageCalculator.
syntax = "proto2";

package mediapipe;

import "mediapipe/framework/calculator.proto";
import "mediapipe/framework/formats/image_format.proto";

// Order of operations.
// 1) Crop the image to fit within min_aspect_ratio and max_aspect_ratio.
// 2) Scale and convert the image to fit inside target_width x target_height
//    using the specified scaling algorithm.  (maintaining the aspect
//    ratio if preserve_aspect_ratio is true).
// The output width and height will be divisible by 2, by default. It is
// possible to output width and height that are odd numbers when the output
// format is SRGB and the aspect ratio is left unpreserved. See
// scale_to_multiple_of for details.
message ScaleImageCalculatorOptions {
  extend CalculatorOptions {
    optional ScaleImageCalculatorOptions ext = 66237115;
  }

  // Target output width and height.  The final output's size may vary
  // depending on the other options below.  If unset, use the same width
  // or height as the input.  If only one is set then determine the other
  // from the aspect ratio (after cropping).  The output width and height
  // will be divisible by 2, by default.
  optional int32 target_width = 1;
  optional int32 target_height = 2;

  // If true, the image is scaled up or down proportionally so that it
  // fits inside the box represented by target_width and target_height.
  // Otherwise it is scaled to fit target_width and target_height
  // completely.  In any case, the aspect ratio that is preserved is
  // that after cropping to the minimum/maximum aspect ratio. Additionally, if
  // true, the output width and height will be divisible by 2.
  optional bool preserve_aspect_ratio = 3 [default = true];

  // If ratio is positive, crop the image to this minimum and maximum
  // aspect ratio (preserving the center of the frame).  This is done
  // before scaling.  The string must contain "/", so to disable cropping,
  // set both to "0/1".
  //   For example, for a min_aspect_ratio of "9/16" and max of "16/9" the
  //   following cropping will occur:
  //       1920x1080 (which is 16:9) is not cropped
  //       640x1024  (which is 10:16) is not cropped
  //       640x320   (which is 2:1) cropped to 568x320 (just under 16/9)
  //       96x480    (which is 1:5), cropped to 96x170 (just over 9/16)
  //   The resultant frame will always be between (or at) the
  //   min_aspect_ratio and max_aspect_ratio.
  optional string min_aspect_ratio = 4 [default = "9/16"];
  optional string max_aspect_ratio = 5 [default = "16/9"];

  // If unset, use the same format as the input.
  // NOTE: in the current implementation, the output format (either specified
  // in the output_format option or inherited from the input format) must be
  // SRGB. It can be YCBCR420P if the input_format is also the same.
  optional ImageFormat.Format output_format = 6;

  enum ScaleAlgorithm {
    DEFAULT = 0;
    LINEAR = 1;
    CUBIC = 2;
    AREA = 3;
    LANCZOS = 4;
    DEFAULT_WITHOUT_UPSCALE = 5;  // Option to disallow upscaling.
  }

  // The upscaling algorithm to use.  The default is to use CUBIC.  Note that
  // downscaling unconditionally uses DDA; see image_processing::
  // AffineGammaResizer for documentation.
  optional ScaleAlgorithm algorithm = 7 [default = DEFAULT];

  // The output image will have this alignment.  If set to zero, then
  // any alignment could be used.  If set to one, the output image will
  // be stored contiguously.
  optional int32 alignment_boundary = 8 [default = 16];

  // Set the alignment padding area to deterministic values (as opposed
  // to possibly leaving it as uninitialized memory).  The padding is
  // the space between the pixel values in a row and the end of the row
  // (which may be different due to alignment requirements on the length
  // of a row).
  optional bool set_alignment_padding = 9 [default = true];

  optional bool OBSOLETE_skip_linear_rgb_conversion = 10 [default = false];

  // Applies sharpening for downscaled images as post-processing.  See
  // image_processing::AffineGammaResizer for documentation.
  optional float post_sharpening_coefficient = 11 [default = 0.0];

  // If input_format is YCBCR420P, input packets contain a YUVImage. If
  // input_format is a format other than YCBCR420P or is unset, input packets
  // contain an ImageFrame.
  // NOTE: in the current implementation, the input format (either specified
  // in the input_format option or inferred from the input packets) must be
  // SRGB or YCBCR420P.
  optional ImageFormat.Format input_format = 12;

  // If set to 2, the target width and height will be rounded-down
  // to the nearest even number. If set to any positive value other than 2,
  // preserve_aspect_ratio must be false and the target width and height will be
  // rounded-down to multiples of the given value. If set to any value less than
  // 1, it will be treated like 1.
  // NOTE: If set to an odd number, the output format must be SRGB.
  optional int32 scale_to_multiple_of = 13 [default = 2];

  // If true, assume the input YUV is BT.709 (this is the HDTV standard, so most
  // content is likely using it). If false use the previous assumption of BT.601
  // (mid-80s standard). Ideally this information should be contained in the
  // input YUV Frame, but as of 02/06/2019, it's not. Once this info is baked
  // in, this flag becomes useless.
  optional bool use_bt709 = 14 [default = false];
}