Image to Video

Most of the parameters of this API are compatible with Kling’s format. Please refer to Kling’s official documentation for more details.

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
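
As a minimal sketch of the header described above, the following Python helper builds the request headers. The function name build_headers is hypothetical, and the Content-Type value is taken from the application/json body type this page lists; the endpoint URL is not given here, so none is assumed.

```python
def build_headers(token: str) -> dict:
    """Return headers for a JSON request using Bearer authentication.

    build_headers is an illustrative helper, not part of any SDK.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {
        # Header form documented above: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # The request body on this page is application/json
        "Content-Type": "application/json",
    }
```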

Body

application/json

model_name

enum<string>

default:kling-v2-1-master

Model Name

Available options:

kling-v2-1-master

image

string

Reference image. Accepts a Base64-encoded image.

Important: When using Base64 encoding, do not add any prefixes such as data:image/png;base64,. Provide only the Base64-encoded string itself.

Supported image formats: .jpg, .jpeg, .png
Image file size cannot exceed 10MB
Width and height dimensions must not be less than 300px
Aspect ratio must be between 1:2.5 and 2.5:1
At least one of image and image_tail must be provided
image+image_tail, dynamic_masks/static_mask, and camera_control cannot be used at the same time
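
The Base64 rule above is easy to get wrong when the string comes from a browser or a data URI. The sketch below shows one way to produce a prefix-free string with Python's standard base64 module. The helper names are hypothetical, and treating "10MB" as 10 * 1024 * 1024 bytes is an assumption, since the page does not say which definition applies.

```python
import base64

# Assumption: "10MB" is read as 10 MiB; the page does not specify the unit.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes with no data-URI prefix."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image file size cannot exceed 10MB")
    return base64.b64encode(data).decode("ascii")


def strip_data_uri_prefix(value: str) -> str:
    """Drop a leading 'data:image/png;base64,' style prefix if one is present."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value
```

In practice the bytes would come from reading a .jpg, .jpeg, or .png file in binary mode before calling encode_image_bytes.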

image_tail

string | null

Reference image for end-frame control. Accepts a Base64-encoded image.

Notice: This parameter is not supported by Kling v2.1 models.

Important: When using Base64 encoding, do not add any prefixes such as data:image/png;base64,. Provide only the Base64-encoded string itself.

Supported image formats: .jpg, .jpeg, .png
Image file size cannot exceed 10MB
Width and height dimensions must not be less than 300px
At least one of image and image_tail must be provided
image+image_tail, dynamic_masks/static_mask, and camera_control cannot be used at the same time
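
The two cross-field rules above can be checked on the client before sending a request. The following sketch is one possible reading of them: it treats image+image_tail (both frames supplied), dynamic_masks/static_mask, and camera_control as three feature groups of which at most one may be active. That grouping is an interpretation, and check_feature_rules is a hypothetical helper name.

```python
def check_feature_rules(body: dict) -> None:
    """Raise ValueError if the body breaks the image/image_tail rules above."""
    has_image = bool(body.get("image"))
    has_tail = bool(body.get("image_tail"))
    if not (has_image or has_tail):
        raise ValueError("at least one of image and image_tail must be provided")

    # Assumed grouping of the mutually exclusive features.
    groups = {
        "image+image_tail": has_image and has_tail,
        "dynamic_masks/static_mask": bool(body.get("dynamic_masks"))
        or bool(body.get("static_mask")),
        "camera_control": body.get("camera_control") is not None,
    }
    active = [name for name, used in groups.items() if used]
    if len(active) > 1:
        raise ValueError("cannot be used at the same time: " + ", ".join(active))
```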

prompt

string

Positive text prompt. Cannot exceed 2500 characters.

Maximum string length: 2500

negative_prompt

string

Negative text prompt. Cannot exceed 2500 characters.

Maximum string length: 2500

cfg_scale

number<float>

default:0.5

Flexibility in video generation. Higher value means lower flexibility and stronger relevance to the prompt. Range [0, 1].

Required range: 0 <= x <= 1

mode

enum<string>

default:std

Video generation mode

std: Standard Mode, which is cost-effective.
pro: Professional Mode, which takes longer to generate but produces higher-quality video output.

Available options:

std,

pro
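
To show how the prompt, negative_prompt, cfg_scale, and mode constraints above fit together, here is a minimal sketch that validates them and assembles a JSON-ready body. build_body is a hypothetical helper; counting "characters" with Python's len (Unicode code points) is an assumption, and omitting empty prompts from the body is a choice made for this sketch rather than a documented requirement.

```python
MAX_PROMPT_CHARS = 2500
MODES = ("std", "pro")


def build_body(image_b64: str, prompt: str = "", negative_prompt: str = "",
               cfg_scale: float = 0.5, mode: str = "std") -> dict:
    """Validate the documented limits and return a request body dict."""
    for name, text in (("prompt", prompt), ("negative_prompt", negative_prompt)):
        if len(text) > MAX_PROMPT_CHARS:
            raise ValueError(f"{name} cannot exceed {MAX_PROMPT_CHARS} characters")
    if not 0 <= cfg_scale <= 1:
        raise ValueError("cfg_scale must satisfy 0 <= x <= 1")
    if mode not in MODES:
        raise ValueError("mode must be one of: " + ", ".join(MODES))

    body = {
        "model_name": "kling-v2-1-master",  # the only option listed on this page
        "image": image_b64,
        "cfg_scale": cfg_scale,
        "mode": mode,
    }
    # Assumption: empty prompts are simply left out of the body.
    if prompt:
        body["prompt"] = prompt
    if negative_prompt:
        body["negative_prompt"] = negative_prompt
    return body
```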

static_mask

string | null

Static Brush Application Area (Mask image created by users using the motion brush).

The "Motion Brush" feature includes two types of brushes: Dynamic Brush (dynamic_masks) and Static Brush (static_mask).

Accepts a Base64-encoded image
Supported image formats: .jpg, .jpeg, .png
The aspect ratio of the mask image must match the input image; otherwise, the task will fail
The resolutions of the static_mask image and the dynamic_masks.mask image must be identical

dynamic_masks

object[] | null

Dynamic Brush Configuration List. Multiple configurations can be set up (up to 6 groups). Each group includes a "mask area" (mask) and a sequence of "motion trajectories" (trajectories).

Maximum array length: 6

dynamic_masks.mask

string | null

Dynamic Brush Application Area (Mask image created by users using the motion brush).

Accepts a Base64-encoded image
Supported image formats: .jpg, .jpeg, .png
The aspect ratio of the mask image must match the input image; otherwise, the task will fail
The resolutions of the static_mask image and the dynamic_masks.mask image must be identical

dynamic_masks.trajectories

object[]

Motion trajectory coordinate sequence. For a 5-second video, a trajectory must contain between 2 and 77 coordinates.

The coordinate system is based on the bottom-left corner of the image as the origin point.

Note-1: The more coordinates provided, the more precise the trajectory will be.
Note-2: The trajectory direction follows the input order. The first coordinate serves as the starting point.

Required array length: 2 - 77 elements

dynamic_masks.trajectories.x

integer

required

The horizontal coordinate (X-coordinate) of the trajectory point, where the bottom-left corner of the input image serves as the origin point (0, 0).

dynamic_masks.trajectories.y

integer

required

The vertical coordinate (Y-coordinate) of the trajectory point, where the bottom-left corner of the input image serves as the origin point (0, 0).
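
Many image tools report points with the origin at the top-left corner, whereas the trajectory coordinates above use the bottom-left corner. The sketch below converts such points and enforces the 2 to 77 length limit. to_trajectory is a hypothetical helper, and flipping with image_height - y (rather than image_height - 1 - y) is an assumption about how pixel edges are counted.

```python
def to_trajectory(points_top_left: list, image_height: int) -> list:
    """Convert (x, y) points with a top-left origin into trajectory objects.

    The output uses the bottom-left origin described above and keeps the
    input order, so the first point is the starting point.
    """
    if not 2 <= len(points_top_left) <= 77:
        raise ValueError("a trajectory must contain between 2 and 77 coordinates")
    return [
        # Assumption: the vertical flip is image_height - y.
        {"x": int(x), "y": int(image_height - y)}
        for x, y in points_top_left
    ]


# One dynamic_masks entry: a mask plus its trajectory (mask value is a placeholder).
dynamic_mask = {
    "mask": "<Base64 mask image>",
    "trajectories": to_trajectory([(10, 20), (30, 40)], image_height=100),
}
```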

camera_control

object

Settings for controlling camera movement. If not specified, the model chooses a camera movement automatically based on the input text and images.

camera_control.type

enum<string>

Predefined camera movement type.

simple: Camera movement. With this type, choose one of the six camera movement options in config.
down_back: Camera descends and moves backward (pan down and zoom out). With this type, the config parameter must be set to “None”.
forward_up: Camera moves forward and tilts up (zoom in and pan up). With this type, the config parameter must be set to “None”.
right_turn_forward: Camera rotates right and moves forward. With this type, the config parameter must be set to “None”.
left_turn_forward: Camera rotates left and moves forward. With this type, the config parameter must be set to “None”.

Available options:

simple,

down_back,

forward_up,

right_turn_forward,

left_turn_forward

camera_control.config

object

Contains 6 fields for camera movement; only one should be non-zero when type is simple. Must be omitted for other types.

camera_control.config.horizontal

number

Horizontal translation along x-axis. Negative=left, positive=right. Range [-10, 10].
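
The relationship between camera_control.type and camera_control.config can be sketched as follows. Only the horizontal field is described in this section, so the example handles that field alone; the other five config fields are not covered here. build_camera_control is a hypothetical helper, and sending only the single non-zero field (instead of all six with zeros) is an assumption.

```python
PRESET_TYPES = ("down_back", "forward_up", "right_turn_forward", "left_turn_forward")


def build_camera_control(movement_type: str, horizontal=None) -> dict:
    """Return a camera_control object for the given type."""
    if movement_type == "simple":
        if horizontal is None or not -10 <= horizontal <= 10:
            raise ValueError("horizontal must be in the range [-10, 10]")
        # Negative values move left, positive values move right.
        return {"type": "simple", "config": {"horizontal": horizontal}}
    if movement_type in PRESET_TYPES:
        if horizontal is not None:
            raise ValueError("config must be omitted for preset types")
        return {"type": movement_type}
    raise ValueError("unknown camera_control type: " + movement_type)
```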