mmflow.apis¶

mmflow.core¶

evaluation¶

class mmflow.core.evaluation.DistEvalHook(dataloader: torch.utils.data.dataloader.DataLoader, interval: int = 1, tmpdir: Optional[str] = None, gpu_collect: bool = False, by_epoch: bool = False, dataset_name: Optional[Union[str, Sequence[str]]] = None, **eval_kwargs: Any)[source]¶

Distributed evaluation hook.

Parameters

dataloader (DataLoader) – A PyTorch dataloader.
interval (int) – Evaluation interval (by epochs). Default: 1.
tmpdir (str | None) – Temporary directory to save the results of all processes. Default: None.
gpu_collect (bool) – Whether to use gpu or cpu to collect results. Default: False.
by_epoch (bool) – Whether to perform evaluation by epoch or by iteration. If True, evaluation is performed by epoch; otherwise, by iteration. Default: False.
dataset_name (str, list, optional) – The name of the dataset(s) this evaluation hook evaluates on.
eval_kwargs (any) – Evaluation arguments fed into the evaluate function of the dataset.

evaluate(runner: mmcv.runner.iter_based_runner.IterBasedRunner)[source]¶: Run evaluation by calling the online evaluation function.

class mmflow.core.evaluation.EvalHook(dataloader: torch.utils.data.dataloader.DataLoader, interval: int = 1, by_epoch: bool = False, dataset_name: Optional[Union[str, Sequence[str]]] = None, **eval_kwargs: Any)[source]¶

Evaluation hook.

Parameters

dataloader (DataLoader) – A PyTorch dataloader.
interval (int) – Evaluation interval (by epochs). Default: 1.
by_epoch (bool) – Whether to perform evaluation by epoch or by iteration. If True, evaluation is performed by epoch; otherwise, by iteration. Default: False.
dataset_name (str, list, optional) – The name of the dataset(s) this evaluation hook evaluates on.
eval_kwargs (any) – Evaluation arguments fed into the evaluate function of the dataset.

after_train_epoch(runner: mmcv.runner.iter_based_runner.IterBasedRunner) → None[source]¶: After train epoch.

after_train_iter(runner: mmcv.runner.iter_based_runner.IterBasedRunner) → None[source]¶: After train iteration.

evaluate(runner: mmcv.runner.iter_based_runner.IterBasedRunner) → None[source]¶: Run evaluation by calling the online evaluation function.

mmflow.core.evaluation.end_point_error(flow_pred: Sequence[numpy.ndarray], flow_gt: Sequence[numpy.ndarray], valid_gt: Sequence[numpy.ndarray]) → float[source]¶

Calculate end point errors between prediction and ground truth.

Parameters

flow_pred (list) – List of flow maps predicted by the flow estimator, each with the shape (H, W, 2).
flow_gt (list) – List of ground truth flow maps, each with the shape (H, W, 2).
valid_gt (list) – List of valid masks for the ground truth, each with the shape (H, W).

Returns

End point error of the predictions.

Return type

float
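The following is a minimal NumPy sketch of how a masked end point error over a list of samples can be computed. It is an illustration rather than mmflow's implementation: the helper name `mean_epe`, the 0.5 threshold on the valid mask, and averaging over all valid pixels of all samples are assumptions.

```python
import numpy as np


def mean_epe(flow_pred, flow_gt, valid_gt):
    """Average end point error over the valid pixels of all samples."""
    errors = []
    for pred, gt, valid in zip(flow_pred, flow_gt, valid_gt):
        # Per-pixel Euclidean distance between flow vectors, shape (H, W).
        epe = np.linalg.norm(pred - gt, axis=-1)
        # Keep only valid pixels (threshold is an assumption).
        errors.append(epe[valid >= 0.5])
    return float(np.concatenate(errors).mean())


pred = [np.zeros((2, 2, 2))]
gt = [np.full((2, 2, 2), 1.0)]
gt[0][0, 0] = (3.0, 4.0)
# The pixel with the (3, 4) error is masked out as invalid.
valid = [np.array([[0.0, 1.0], [1.0, 1.0]])]
result = mean_epe(pred, gt, valid)
```

Because the large error sits on an invalid pixel, only the three remaining pixels contribute to the average.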

mmflow.core.evaluation.end_point_error_map(flow_pred: numpy.ndarray, flow_gt: numpy.ndarray) → numpy.ndarray[source]¶

Calculate end point error map.

Parameters

flow_pred (ndarray) – The predicted optical flow with the shape (H, W, 2).
flow_gt (ndarray) – The ground truth of optical flow with the shape (H, W, 2).

Returns

End point error map with the shape (H , W).

Return type

ndarray
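A minimal NumPy sketch of a per-pixel end point error map, assuming it is the Euclidean norm of the difference between the predicted and ground truth flow vectors. The helper name `epe_map` is illustrative and not part of mmflow.

```python
import numpy as np


def epe_map(flow_pred: np.ndarray, flow_gt: np.ndarray) -> np.ndarray:
    """Per-pixel Euclidean distance between two (H, W, 2) flow maps."""
    diff = flow_pred - flow_gt
    return np.sqrt(np.sum(diff ** 2, axis=-1))


pred = np.zeros((2, 3, 2), dtype=np.float32)
gt = np.zeros((2, 3, 2), dtype=np.float32)
gt[0, 0] = (3.0, 4.0)  # one pixel is off by a (3, 4) displacement
err = epe_map(pred, gt)
print(err.shape)  # (2, 3)
```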

mmflow.core.evaluation.eval_metrics(results: Sequence[numpy.ndarray], flow_gt: Sequence[numpy.ndarray], valid_gt: Sequence[numpy.ndarray], metrics: Union[Sequence[str], str] = ['EPE']) → Dict[str, numpy.ndarray][source]¶

Calculate evaluation metrics.

Parameters

results (list) – List of predicted flow maps.
flow_gt (list) – List of ground truth flow maps.
valid_gt (list) – List of valid masks for the ground truth.
metrics (list, str) – Metrics to be evaluated. Defaults to [‘EPE’], end-point error.

Returns

metrics and their values.

Return type

dict

mmflow.core.evaluation.multi_gpu_online_evaluation(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader, metric: Union[str, Sequence[str]] = 'EPE', tmpdir: Optional[str] = None, gpu_collect: bool = False) → Dict[str, numpy.ndarray][source]¶

Evaluate model with multiple gpus online.

This function does not save the flow, so it performs no IO operations; online mode therefore generally gives faster evaluation. However, when using this function, img_metas must include the ground truth, e.g. flow_gt, or flow_fw_gt and flow_bw_gt.

Parameters

model (nn.Module) – The optical flow estimator model.
data_loader (DataLoader) – The test dataloader.
metric (str, list) – Metrics to be evaluated. Default: ‘EPE’.
tmpdir (str) – Path of directory to save the temporary results from different gpus under cpu mode.
gpu_collect (bool) – Option to use either gpu or cpu to collect results.

Returns

The evaluation result.

Return type

dict

mmflow.core.evaluation.online_evaluation(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader, metric: Union[str, Sequence[str]] = 'EPE', **kwargs: Any) → Dict[str, numpy.ndarray][source]¶

Evaluate model online.

Parameters

model (nn.Module) – The optical flow estimator model.
data_loader (DataLoader) – The test dataloader.
metric (str, list) – Metrics to be evaluated. Default: ‘EPE’.
kwargs (any) – Evaluation arguments fed into the evaluate function of the dataset.

Returns

The evaluation result.

Return type

dict

mmflow.core.evaluation.optical_flow_outliers(flow_pred: Sequence[numpy.ndarray], flow_gt: Sequence[numpy.ndarray], valid_gt: Sequence[numpy.ndarray]) → float[source]¶

Calculate percentage of optical flow outliers for KITTI dataset.

Parameters

flow_pred (list) – List of flow maps predicted by the flow estimator, each with the shape (H, W, 2).
flow_gt (list) – List of ground truth flow maps, each with the shape (H, W, 2).
valid_gt (list) – List of valid masks for the ground truth, each with the shape (H, W).

Returns

Percentage of optical flow outliers in the predictions.

Return type

float
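A minimal NumPy sketch of an outlier percentage, assuming the commonly used KITTI criterion in which a pixel counts as an outlier when its end point error exceeds 3 pixels and also exceeds 5% of the ground truth flow magnitude. The helper name `outlier_percentage`, the thresholds, and the 0.5 mask threshold are assumptions made for illustration.

```python
import numpy as np


def outlier_percentage(flow_pred, flow_gt, valid_gt):
    """Percentage of valid pixels whose flow error counts as an outlier."""
    flags = []
    for pred, gt, valid in zip(flow_pred, flow_gt, valid_gt):
        epe = np.linalg.norm(pred - gt, axis=-1)
        mag = np.linalg.norm(gt, axis=-1) + 1e-9  # avoid division by zero
        # Assumed criterion: error > 3 px and relative error > 5%.
        outlier = (epe > 3.0) & (epe / mag > 0.05)
        flags.append(outlier[valid >= 0.5])
    return float(np.concatenate(flags).mean() * 100)


gt = [np.tile(np.array([10.0, 0.0]), (2, 2, 1))]  # constant flow (10, 0)
pred = [gt[0].copy()]
pred[0][0, 0, 0] = 14.0  # one pixel out of four is off by 4 px
valid = [np.ones((2, 2))]
result = outlier_percentage(pred, gt, valid)
```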

mmflow.core.evaluation.single_gpu_online_evaluation(model: torch.nn.modules.module.Module, data_loader: torch.utils.data.dataloader.DataLoader, metric: Union[str, Sequence[str]] = 'EPE') → Dict[str, numpy.ndarray][source]¶

Evaluate model with single gpu online.

This function does not save the flow, so it performs no IO operations; online mode therefore generally gives faster evaluation. However, when using this function, img_metas must include the ground truth, e.g. flow_gt, or flow_fw_gt and flow_bw_gt.

Parameters

model (nn.Module) – The optical flow estimator model.
data_loader (DataLoader) – The test dataloader.
metric (str, list) – Metrics to be evaluated. Default: ‘EPE’.

Returns

The evaluation result.

Return type

dict

hooks¶

class mmflow.core.hooks.LiteFlowNetStageLoadHook(src_level: str, dst_level: str)[source]¶

Stage loading hook for LiteFlowNet.

This hook loads the weights from the previous stage into the additional stage introduced in this training.

Parameters

src_level (str) – The source level to be loaded.
dst_level (str) – The level that will load the weights.

before_run(runner: mmcv.runner.iter_based_runner.IterBasedRunner) → None[source]¶

Before running function of Hook.

Parameters: runner (IterBasedRunner) – The runner for this training. This hook has only been tested with IterBasedRunner.

class mmflow.core.hooks.MultiStageLrUpdaterHook(milestone_lrs: Sequence[float], milestone_iters: Sequence[int], steps: Sequence[Sequence[int]], gammas: Sequence[float], **kwargs: Any)[source]¶

Multi-Stage Learning Rate Hook.

Parameters

milestone_lrs (Sequence[float]) – The base LR for multi-stages.
milestone_iters (Sequence[int]) – The first iterations in different stages.
steps (Sequence[Sequence[int]]) – The steps to decay the LR in stages.
gammas (Sequence[float]) – The list of decay LR ratios.
kwargs (any) – The arguments of LrUpdaterHook.

get_lr(runner: mmcv.runner.iter_based_runner.IterBasedRunner, base_lr: float) → float[source]¶

Get current LR.

Parameters

runner (IterBasedRunner) – The runner to control the training workflow.
base_lr (float) – The base LR in training workflow.

Returns

The current LR.

Return type

float
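One plausible reading of the parameters above, sketched in plain Python: each stage starts at its entry in milestone_iters with the base LR from milestone_lrs, and the LR is multiplied by that stage's gamma every time an entry of that stage's steps has been reached. The function `multi_stage_lr`, the "reached means cur_iter >= step" rule, and the example values are assumptions, not mmflow's implementation.

```python
from bisect import bisect_right


def multi_stage_lr(cur_iter, base_lr, milestone_lrs, milestone_iters,
                   steps, gammas):
    """Return the LR at cur_iter under the assumed multi-stage schedule."""
    # Index of the last stage whose first iteration is <= cur_iter.
    stage = bisect_right(milestone_iters, cur_iter) - 1
    if stage < 0:
        # Before the first milestone, fall back to the runner's base LR.
        return base_lr
    # Number of decay steps of this stage that have been reached.
    num_decays = bisect_right(steps[stage], cur_iter)
    return milestone_lrs[stage] * gammas[stage] ** num_decays


schedule = dict(
    milestone_lrs=[1e-4, 5e-5],
    milestone_iters=[0, 1000],
    steps=[[400, 800], [1400, 1800]],
    gammas=[0.5, 0.5],
)
lrs = [multi_stage_lr(i, 1e-4, **schedule) for i in (0, 400, 999, 1000, 1800)]
```

With these hypothetical values the LR halves twice within the first stage, is reset to the second stage's base LR at iteration 1000, and then halves twice again.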

mmflow.datasets¶

datasets¶

class mmflow.datasets.ChairsSDHom(*args: Any, **kwargs: Any)[source]¶

ChairsSDHom dataset.

load_data_info() → None[source]¶: Load data information.

class mmflow.datasets.Collect(keys: collections.abc.Sequence, meta_keys: collections.abc.Sequence = ('filename1', 'filename2', 'ori_filename1', 'ori_filename2', 'filename_flow', 'ori_filename_flow', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'flip_direction', 'img_norm_cfg'))[source]¶

Collect data from the loader relevant to the specific task.

This is usually the last stage of the data loader pipeline. Typically keys is set to some subset of “img”, “flow_gt”.

The “img_meta” item is always populated. The contents of the “img_meta” dictionary depend on “meta_keys”. By default this includes:

“img_shape”: shape of the image input to the network as a tuple (h, w, c). Note that images may be zero padded on the bottom/right if the batch tensor is larger than this shape.

“scale_factor”: a float indicating the preprocessing scale

“flip”: a boolean indicating if image flip transform was used

“filename1”: path to the image1 file

“filename2”: path to the image2 file

“ori_filename1”: image1 file name

“ori_filename2”: image2 file name

“ori_shape”: original shape of the image as a tuple (h, w, c)

“pad_shape”: image shape after padding

“img_norm_cfg”: a dict of normalization information:

mean - per channel mean subtraction

std - per channel std divisor

to_rgb - bool indicating if bgr was converted to rgb

Parameters

keys (Sequence[str]) – Keys of results to be collected in data.
meta_keys (Sequence[str], optional) – Meta keys to be converted to mmcv.DataContainer and collected in data[img_metas]. Default: ('filename1', 'filename2', 'ori_filename1', 'ori_filename2', 'filename_flow', 'ori_filename_flow', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'flip_direction', 'img_norm_cfg')
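The collecting step can be pictured with the plain-Python sketch below. It is a simplification under stated assumptions: the helper `collect` is hypothetical, and the real transform wraps the meta information in an mmcv.DataContainer, which is omitted here.

```python
def collect(results, keys, meta_keys):
    """Keep only `keys` and gather `meta_keys` under 'img_metas'."""
    data = {key: results[key] for key in keys}
    # The real transform stores this dict in an mmcv.DataContainer.
    data["img_metas"] = {key: results[key] for key in meta_keys}
    return data


results = {
    "img1": "tensor-1",
    "img2": "tensor-2",
    "flow_gt": "tensor-flow",
    "filename1": "frame_0001.png",
    "img_shape": (436, 1024, 3),
    "scratch": "dropped by collect",
}
data = collect(
    results,
    keys=["img1", "img2", "flow_gt"],
    meta_keys=["filename1", "img_shape"],
)
```

Anything in the results dict that is listed in neither keys nor meta_keys, such as the "scratch" entry here, does not reach the model.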

class mmflow.datasets.ColorJitter(asymmetric_prob=0.0, brightness=0.0, contrast=0.0, saturation=0.0, hue=0.0)[source]¶

Randomly change the brightness, contrast, saturation and hue of an image.

Parameters

asymmetric_prob (float) – The probability of applying color jitter to the two images asymmetrically.
brightness (float, tuple) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non negative numbers.
contrast (float, tuple) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non negative numbers.
saturation (float, tuple) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non negative numbers.
hue (float, tuple) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0<= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
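The sampling ranges described above can be sketched in plain Python. The helper `factor_range` is hypothetical; it only turns a scalar or a (min, max) pair into the interval a factor is drawn from, following the formulas in the parameter descriptions.

```python
import random


def factor_range(value, center=1.0, lower_bound=0.0):
    """Interval a jitter factor is sampled from."""
    if isinstance(value, (tuple, list)):
        # An explicit [min, max] pair is used as given.
        return float(value[0]), float(value[1])
    return max(lower_bound, center - value), center + value


brightness_range = factor_range(0.4)          # about (0.6, 1.4)
strong_range = factor_range(1.5)              # lower end is clipped to 0
explicit_range = factor_range((0.8, 1.2))     # used as given
# Hue is centered on 0 and limited to [-0.5, 0.5].
hue_range = factor_range(0.3, center=0.0, lower_bound=-0.5)

brightness_factor = random.uniform(*brightness_range)
```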

class mmflow.datasets.Compose(transforms: Sequence)[source]¶

Compose multiple transforms sequentially.

Parameters: transforms (Sequence[dict | callable]) – Sequence of transform objects or config dicts to be composed.
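Sequential composition can be sketched in a few lines of plain Python. This sketch handles callables only; building transforms from config dicts is left out, and stopping early when a transform returns None is an assumption borrowed from common OpenMMLab pipelines. The name `compose` is illustrative.

```python
def compose(transforms):
    """Chain callables so each one receives the previous one's output."""
    def pipeline(results):
        for transform in transforms:
            results = transform(results)
            if results is None:
                # Assumed convention: a transform may drop the sample.
                return None
        return results
    return pipeline


add_a = lambda results: {**results, "a": 1}
add_b = lambda results: {**results, "b": results["a"] + 1}
drop = lambda results: None

out = compose([add_a, add_b])({})
dropped = compose([drop, add_a])({})
```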

class mmflow.datasets.ConcatDataset(datasets: Sequence[torch.utils.data.dataset.Dataset], separate_eval: bool = True)[source]¶

A wrapper of concatenated dataset.

Same as torch.utils.data.dataset.ConcatDataset, but concat the group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.
separate_eval (bool) – Whether to evaluate the results separately if it is used as validation dataset. Defaults to True.

evaluate(results: dict, logger: Optional[Union[str, logging.Logger]] = None, **kwargs: Any)[source]¶

Evaluate the results.

Parameters

results (list[list | tuple]) – Testing results of the dataset.
logger (logging.Logger | str | None) – Logger used for printing related information during evaluation. Default: None.

Returns

AP results of the total dataset or each separate dataset if self.separate_eval=True.

Return type

dict[str, float]

class mmflow.datasets.DefaultFormatBundle[source]¶

Default formatting bundle.

It simplifies the pipeline of formatting common fields, including “img” and “flow_gt”. These fields are formatted as follows.

img1: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
img2: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
flow_gt: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)

class mmflow.datasets.DistributedSampler(dataset: torch.utils.data.dataset.Dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed=0)[source]¶

DistributedSampler inheriting from torch.utils.data.DistributedSampler.

This distributed sampler is compatible with PyTorch==1.5, whose DistributedSampler has no seed argument.

Parameters

dataset (Dataset) – The dataset to be loaded.
num_replicas (int, optional) – Number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.
rank (int, optional) – Rank of the current process within num_replicas. By default, rank is retrieved from the current distributed group.
shuffle (bool) – If True (default), sampler will shuffle the indices.
seed (int) – random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Default: 0.

class mmflow.datasets.Erase(prob: float, bounds: Sequence = [50, 100], max_num: int = 3)[source]¶

Erase transform from RAFT, which randomly erases rectangular regions in img2 to simulate occlusions.

Parameters

prob (float) – the probability for erase transform.
bounds (list, tuple) – the bounds for erase regions (bound_x, bound_y).
max_num (int) – the max number of erase regions.

Returns

revised results, ‘img2’ and ‘erase_num’ are added into results.

Return type

dict

class mmflow.datasets.FlyingChairs(*args, split_file: str, **kwargs)[source]¶

FlyingChairs dataset.

Parameters: split_file (str) – File name of train-validation split file for FlyingChairs.

load_ann_info(filename: Sequence[str], filename_key: str) → None[source]¶

Load information of optical flow.

This function splits the dataset into two subsets, training subset and testing subset.

Parameters

filename (list) – ordered list of abstract file path of annotation.
filename_key (str) – the annotation key, e.g. ‘flow’.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

load_img_info(img1_filename: Sequence[str], img2_filename: Sequence[str]) → None[source]¶

Load information of image1 and image2.

Parameters

img1_filename (list) – ordered list of abstract file path of img1.
img2_filename (list) – ordered list of abstract file path of img2.

class mmflow.datasets.FlyingChairsOcc(*args, **kwargs)[source]¶

FlyingChairsOcc dataset.

load_ann_info(filename, filename_key)[source]¶

Load information of optical flow.

This function splits the dataset into two subsets, training subset and testing subset.

Parameters

filename (list) – ordered list of abstract file path of annotation.
filename_key (str) – the annotation key for FlyingChairsOcc dataset: ‘flow_fw’, ‘flow_bw’, ‘occ_fw’, or ‘occ_bw’.

load_data_info()[source]¶: Load data information, including file path of image1, image2 and optical flow.

load_img_info(img1_filename, img2_filename)[source]¶

Load information of image1 and image2.

Parameters

img1_filename (list) – ordered list of abstract file path of img1.
img2_filename (list) – ordered list of abstract file path of img2.

class mmflow.datasets.FlyingThings3D(*args, direction: Union[str, Sequence[str]] = ['forward', 'backward'], scene: Union[str, Sequence[str]] = 'left', pass_style: str = 'clean', **kwargs)[source]¶

FlyingThings3D dataset.

Parameters

direction (str) – Direction of flow. It has 4 options: ‘forward’, ‘backward’, ‘bidirection’ and [‘forward’, ‘backward’]. Default: [‘forward’, ‘backward’].
scene (list, str) – Scene in FlyingThings3D dataset. Default: ‘left’. This default is chosen for RAFT, since FlyingThings3D is very large and not often used, and only RAFT uses the ‘left’ data in it.
pass_style (str) – Pass style for FlyingThings3D dataset. It has 2 options: [‘clean’, ‘final’]. Default: ‘clean’.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

class mmflow.datasets.FlyingThings3DSubset(*args, direction: Union[str, Sequence[str]] = ['forward', 'backward'], scene: Optional[Union[str, Sequence[str]]] = None, **kwargs)[source]¶

FlyingThings3D subset dataset.

Parameters

direction (str) – Direction of flow. It has 4 options: ‘forward’, ‘backward’, ‘bidirection’ and [‘forward’, ‘backward’]. Default: [‘forward’, ‘backward’].
scene (list, str, optional) – Scene in FlyingThings3D dataset. If scene is None, data is collected from all scenes of the FlyingThings3D dataset. Default: None.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

class mmflow.datasets.GaussianNoise(sigma_range=(0, 0.04), clamp_range=(- inf, inf))[source]¶

Add Gaussian Noise to images.

Add Gaussian Noise, with mean 0 and std sigma uniformly sampled from sigma_range, to images. And then clamp the images to clamp_range.

Parameters

sigma_range (list(float) | tuple(float)) – Uniformly sample sigma of gaussian noise in sigma_range. Default: (0, 0.04)
clamp_range (list(float) | tuple(float)) – The min and max value to clamp the images after adding gaussian noise. Default: (float(‘-inf’), float(‘inf’)).
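A minimal NumPy sketch of the noise step described above. The helper `add_gaussian_noise` is hypothetical, and sharing one sampled sigma across all images of a pair is an assumption.

```python
import numpy as np


def add_gaussian_noise(imgs, sigma_range=(0.0, 0.04),
                       clamp_range=(-np.inf, np.inf), rng=None):
    """Add zero-mean Gaussian noise to each image, then clamp."""
    rng = np.random.default_rng() if rng is None else rng
    # One sigma is drawn uniformly and shared by all images (assumption).
    sigma = rng.uniform(*sigma_range)
    noisy = [img + rng.normal(0.0, sigma, size=img.shape) for img in imgs]
    return [np.clip(img, *clamp_range) for img in noisy]


rng = np.random.default_rng(0)
img1 = np.full((4, 4, 3), 0.5)
img2 = np.full((4, 4, 3), 0.5)
noisy1, noisy2 = add_gaussian_noise([img1, img2], clamp_range=(0.0, 1.0), rng=rng)
```

Passing clamp_range=(0.0, 1.0) keeps images that were scaled to [0, 1] inside that range after the noise is added.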

class mmflow.datasets.HD1K(*args, **kwargs)[source]¶

HD1K dataset.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

class mmflow.datasets.ImageToTensor(keys: collections.abc.Sequence)[source]¶

Convert image to torch.Tensor by given keys.

The dimension order of input image is (H, W, C). The pipeline will convert it to (C, H, W). If only 2 dimension (H, W) is given, the output would be (1, H, W).

Parameters: keys (Sequence[str]) – Key of images to be converted to Tensor.

class mmflow.datasets.InputPad(exponent, mode='edge', position='center', **kwargs)[source]¶

Pad images so that their dimensions are divisible by 2^n. Used in testing.

Parameters

exponent (int) – the exponent n of 2^n
mode (str) – mode for numpy.pad(). Defaults to ‘edge’.
position (str) – Position of the original image in the padded image. Options are ‘center’, ‘left’, ‘right’, ‘top’ and ‘down’. Defaults to ‘center’.
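A minimal NumPy sketch of padding to a multiple of 2^n, covering only the ‘center’ position. The helper `pad_to_multiple` and the way the padding is split between the two sides are assumptions made for illustration.

```python
import numpy as np


def pad_to_multiple(img, exponent, mode="edge"):
    """Pad H and W up to the next multiple of 2**exponent, centered."""
    multiple = 2 ** exponent
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # rows missing to reach the next multiple
    pad_w = (-w) % multiple
    top, left = pad_h // 2, pad_w // 2
    pad_width = [(top, pad_h - top), (left, pad_w - left)]
    pad_width += [(0, 0)] * (img.ndim - 2)  # never pad channels
    return np.pad(img, pad_width, mode=mode)


sintel_like = np.zeros((436, 1024, 3), dtype=np.uint8)
padded = pad_to_multiple(sintel_like, exponent=6)
print(padded.shape)  # (448, 1024, 3)
```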

class mmflow.datasets.InputResize(exponent)[source]¶

Resize images such that dimensions are divisible by 2^n.

Parameters: exponent (int) – the exponent n of 2^n

Returns

Resized results, ‘img_shape’, ‘scale_factor’ keys are added into result dict.

Return type

dict

class mmflow.datasets.KITTI2012(*args, **kwargs)[source]¶

KITTI flow 2012 dataset.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

class mmflow.datasets.KITTI2015(*args, **kwargs)[source]¶

KITTI flow 2015 dataset.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

class mmflow.datasets.LoadImageFromFile(to_float32: bool = False, color_type: str = 'color', file_client_args: dict = {'backend': 'disk'}, imdecode_backend: str = 'cv2')[source]¶

Load image1 and image2 from file.

Required keys are “img1_info” (dict that must contain the key “filename” and “filename2”). Added or updated keys are “img1”, “img2”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0, 1.0) and “img_norm_cfg” (means=0 and stds=1).

Parameters

to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
imdecode_backend (str) – Backend for mmcv.imdecode(). Default: ‘cv2’

class mmflow.datasets.MixedBatchDistributedSampler(datasets: Sequence[torch.utils.data.dataset.Dataset], sample_ratio: Sequence[float], num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0)[source]¶

Distributed Sampler for mixed data batch.

Parameters

datasets (list) – List of datasets to be loaded.
sample_ratio (list) – List of the ratio of each dataset in a batch. For example, with datasets=[DatasetA, DatasetB], sample_ratio=[0.25, 0.75], sample_per_gpu=1 and gpus=8, 2 gpus load DatasetA and 6 gpus load DatasetB. The length of datasets must be equal to the length of sample_ratio.
num_replicas (int, optional) – Number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.
rank (int, optional) – Rank of the current process within num_replicas. By default, rank is retrieved from the current distributed group.
shuffle (bool) – If True (default), sampler will shuffle the indices.
seed (int) – random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Default: 0.
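The arithmetic behind sample_ratio can be checked with the short sketch below. The helper `samples_per_dataset` is hypothetical and only reproduces the split of a global batch across datasets; it does not model the sampler's index generation.

```python
def samples_per_dataset(sample_ratio, samples_per_gpu, num_gpus):
    """Number of samples each dataset contributes to one global batch."""
    total = samples_per_gpu * num_gpus
    counts = [round(ratio * total) for ratio in sample_ratio]
    if sum(counts) != total:
        raise ValueError("sample_ratio does not split the batch evenly")
    return counts


# The example from the parameter description: 8 GPUs, 1 sample each.
counts = samples_per_dataset([0.25, 0.75], samples_per_gpu=1, num_gpus=8)
print(counts)  # [2, 6]
```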

set_epoch(epoch: int) → None[source]¶

Sets the epoch for this sampler. When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.

Parameters: epoch (int) – Epoch number.

class mmflow.datasets.Normalize(mean, std, to_rgb=True)[source]¶

Normalize the image.

Added key is “img_norm_cfg”.

Parameters

mean (sequence) – Mean values of 3 channels.
std (sequence) – Std values of 3 channels.
to_rgb (bool) – Whether to convert the image from BGR to RGB. Default: True.
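A minimal NumPy sketch of per-channel normalization. The helper `normalize` is hypothetical; converting BGR to RGB before subtracting the mean (so that mean and std are given in RGB order) is an assumption that mirrors common OpenMMLab behavior.

```python
import numpy as np


def normalize(img, mean, std, to_rgb=True):
    """Subtract the per-channel mean and divide by the per-channel std."""
    img = img.astype(np.float32)
    if to_rgb:
        img = img[..., ::-1]  # reverse the channel axis: BGR -> RGB
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (img - mean) / std


bgr_pixel = np.array([[[10, 20, 30]]], dtype=np.uint8)
centered = normalize(bgr_pixel, mean=(30, 20, 10), std=(1, 1, 1))
scaled = normalize(bgr_pixel, mean=(0, 0, 0), std=(10, 10, 10), to_rgb=False)
```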

class mmflow.datasets.PhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]¶

Apply photometric distortion to image sequentially. Every transformation is applied with a probability of 0.5. The position of random contrast is second or second to last.

1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
8. randomly swap channels

Parameters

brightness_delta (int) – delta of brightness.
contrast_range (tuple) – range of contrast.
saturation_range (tuple) – range of saturation.
hue_delta (int) – delta of hue.

brightness(img)[source]¶: Brightness distortion.

contrast(img)[source]¶: Contrast distortion.

convert(img, alpha=1, beta=0)[source]¶: Multiply by alpha and add beta, with clipping.

hue(img)[source]¶: Hue distortion.

saturation(img)[source]¶: Saturation distortion.

class mmflow.datasets.RandomAffine(global_transform: Optional[dict] = None, relative_transform: Optional[dict] = None, preserve_valid: bool = True, check_bound: bool = False)[source]¶

Random affine transformation of images, flow map and occlusion map (if available).

Keys of global_transform and relative_transform should be a subset of (‘translates’, ‘zoom’, ‘shear’, ‘rotate’). Each key and its corresponding value must satisfy the following rules:

translates: the translation ratios along x axis and y axis. Defaults to (0., 0.).

zoom: the min and max zoom ratios. Defaults to (1.0, 1.0).

shear: the min and max shear ratios. Defaults to (1.0, 1.0).

rotate: the min and max rotate degree. Defaults to (0., 0.).

Parameters

global_transform (dict) – A dict which contains keys: translates, zoom, shear, rotate. global_transform will transform both img1 and img2.
relative_transform (dict) – A dict which contains keys: translates, zoom, shear, rotate. relative_transform will only transform img2, after global_transform has been applied to both images.
preserve_valid (bool) – Whether to continue transforming until both images are valid. A valid affine transform is an affine transform which guarantees the transformed image covers the whole original picture frame. Defaults to True.
check_bound (bool) – Whether to check out of bound for transformed occlusion maps. If True, all pixels in borders of img1 but not in borders of img2 will be marked occluded. Defaults to False.

class mmflow.datasets.RandomCrop(crop_size)[source]¶

Randomly crop the image and flow.

Parameters: crop_size (tuple) – Expected size after cropping, (h, w).

crop(img, crop_bbox)[source]¶: Crop from img.

get_crop_bbox(img_shape)[source]¶: Randomly get a crop bounding box.

class mmflow.datasets.RandomFlip(prob, direction='horizontal')[source]¶

Flip the image and flow map.

Parameters

prob (float) – The flipping probability.
direction (str) – The flipping direction. Options are ‘horizontal’ and ‘vertical’. Default: ‘horizontal’.
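Flipping a flow map takes one step more than flipping an image: besides mirroring the array, the flow component along the flipped axis changes sign. A minimal NumPy sketch follows; the helper `flip_flow` is illustrative and not part of mmflow.

```python
import numpy as np


def flip_flow(flow, direction="horizontal"):
    """Mirror an (H, W, 2) flow map and negate the affected component."""
    if direction == "horizontal":
        flipped = flow[:, ::-1].copy()
        flipped[..., 0] *= -1  # x displacement reverses
    elif direction == "vertical":
        flipped = flow[::-1].copy()
        flipped[..., 1] *= -1  # y displacement reverses
    else:
        raise ValueError(f"unsupported direction: {direction}")
    return flipped


flow = np.array([[[1.0, 2.0], [3.0, 4.0]]])  # shape (1, 2, 2)
h_flipped = flip_flow(flow, "horizontal")
v_flipped = flip_flow(flow, "vertical")
```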

class mmflow.datasets.RandomRotation(prob, angle, auto_bound=False)[source]¶

Random rotation of the image and optical flow data from -angle to angle (in degrees).

Parameters

prob (float) – The rotation probability.
angle (float) – max angle of the rotation in the range from -180 to 180.
auto_bound (bool) – Whether to adjust the image size to cover the whole rotated image. Default: False

class mmflow.datasets.RandomTranslate(prob=0.0, x_offset=0.0, y_offset=0.0)[source]¶

Random translation of the images and optical flow data.

Parameters

prob (float) – the probability to do translation.
x_offset (float | tuple) – translate ratio on x axis, randomly chosen from [-x_offset, x_offset] or the given [min, max]. Default: 0.
y_offset (float | tuple) – translate ratio on y axis, randomly chosen from [-y_offset, y_offset] or the given [min, max]. Default: 0.

class mmflow.datasets.RepeatDataset(dataset, times)[source]¶

A wrapper of repeated dataset.

The length of repeated dataset will be times larger than the original dataset. This is useful when the data loading time is long but the dataset is small. Using RepeatDataset can reduce the data loading time between epochs.

Parameters

dataset (Dataset) – The dataset to be repeated.
times (int) – Repeat times.
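The wrapper idea can be sketched in plain Python as follows. The class name `Repeated` is hypothetical, and any sequence with `__len__` and `__getitem__` stands in for a dataset.

```python
class Repeated:
    """Present a dataset as `times` back-to-back copies of itself."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times
        self._ori_len = len(dataset)

    def __len__(self):
        return self.times * self._ori_len

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Indices wrap around the original dataset.
        return self.dataset[idx % self._ori_len]


repeated = Repeated([10, 20, 30], times=2)
items = [repeated[i] for i in range(len(repeated))]
print(items)  # [10, 20, 30, 10, 20, 30]
```

One epoch over the wrapper covers the original data `times` times, so the per-epoch dataloader restart happens less often.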

class mmflow.datasets.Rerange(min_value=0, max_value=255)[source]¶

Rerange the image pixel value.

Parameters

min_value (float or int) – Minimum value of the reranged image. Default: 0.
max_value (float or int) – Maximum value of the reranged image. Default: 255.

class mmflow.datasets.Sintel(*args, pass_style: str = 'clean', scene: Optional[Union[str, Sequence[str]]] = None, **kwargs)[source]¶

Sintel optical flow dataset.

Parameters

pass_style (str) – Pass style for Sintel dataset, and it has 2 options [‘clean’, ‘final’]. Default: ‘clean’.
scene (str, list, optional) – Scene in Sintel dataset, if scene is None, it means collecting data in all of scene of Sintel dataset. Default: None.

load_data_info() → None[source]¶: Load data information, including file path of image1, image2 and optical flow.

pre_pipeline(results: Sequence[dict]) → None[source]¶

Prepare results dict for pipeline.

For Sintel, there is an additional annotation, invalid.

class mmflow.datasets.SpacialTransform(spacial_prob: float, stretch_prob: float, crop_size: Sequence, min_scale: float = - 0.2, max_scale: float = 0.5, max_stretch: float = 0.2)[source]¶

Spacial Transform API for RAFT.

Parameters

spacial_prob (float) – probability to do spacial transform.
stretch_prob (float) – probability to do stretch.
crop_size (tuple, list) – the base size for resize.
min_scale (float) – the exponent for min scale. Defaults to -0.2.
max_scale (float) – the exponent for max scale. Defaults to 0.5.

Returns: Resized results, ‘img_shape’,
Return type: dict

resize_sparse_flow_map(flow: numpy.ndarray, valid: numpy.ndarray, fx: float = 1.0, fy: float = 1.0, x0: int = 0, y0: int = 0) → Sequence[numpy.ndarray][source]¶

Resize sparse optical flow function.

Parameters

flow (ndarray) – optical flow data will be resized.
valid (ndarray) – valid mask for sparse optical flow.
fx (float, optional) – horizontal scale factor. Defaults to 1.0.
fy (float, optional) – vertical scale factor. Defaults to 1.0.
x0 (int, optional) – abscissa of the left-top point where the flow map will be cropped from. Defaults to 0.
y0 (int, optional) – ordinate of the left-top point where the flow map will be cropped from. Defaults to 0.

Returns

the transformed flow map and valid mask.

Return type

Sequence[ndarray]

spacial_transform(imgs: numpy.ndarray) → Tuple[numpy.ndarray, float, float, int, int][source]¶

Spacial transform function.

Parameters

imgs (ndarray) – the images that will be transformed.

Returns

the transformed images, horizontal scale factor, vertical scale factor, and coordinates of the left-top point where the image maps will be cropped from.

Return type

Tuple[ndarray, float, float, int, int]

class mmflow.datasets.ToDataContainer(fields: collections.abc.Sequence = ({'key': 'img1', 'stack': True}, {'key': 'img2', 'stack': True}, {'key': 'flow_gt'}))[source]¶

Convert results to mmcv.DataContainer by given fields.

Parameters: fields (Sequence[dict]) – Each field is a dict like dict(key='xxx', **kwargs). The key in result will be converted to mmcv.DataContainer with **kwargs. Default: (dict(key='img1', stack=True), dict(key='img2', stack=True), dict(key='flow_gt')).

class mmflow.datasets.ToTensor(keys: collections.abc.Sequence)[source]¶

Convert some results to torch.Tensor by given keys.

Parameters: keys (Sequence[str]) – Keys that need to be converted to Tensor.

class mmflow.datasets.Transpose(keys: collections.abc.Sequence, order: collections.abc.Sequence)[source]¶

Transpose some results by given keys.

Parameters

keys (Sequence[str]) – Keys of results to be transposed.
order (Sequence[int]) – Order of transpose.

class mmflow.datasets.Validation(max_flow: Union[float, int])[source]¶

This Validation transform from RAFT returns a mask that marks where the flow is less than max_flow.

Parameters

max_flow (float, int) – the maximum flow value that is treated as valid.

Returns

Resized results, ‘valid’ and ‘max_flow’ keys are added into result dict.

Return type

dict
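A minimal NumPy sketch of such a validity mask. Comparing each flow component separately against max_flow follows the RAFT data loader and is an assumption here; comparing the flow magnitude instead would be an equally plausible reading. The helper `valid_mask` is hypothetical.

```python
import numpy as np


def valid_mask(flow_gt, max_flow):
    """Boolean (H, W) mask of pixels whose flow stays below max_flow."""
    u_ok = np.abs(flow_gt[..., 0]) < max_flow
    v_ok = np.abs(flow_gt[..., 1]) < max_flow
    return u_ok & v_ok


flow_gt = np.array([[[1.0, 1.0], [500.0, 0.0], [0.0, -500.0]]])  # (1, 3, 2)
mask = valid_mask(flow_gt, max_flow=400)
```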

mmflow.datasets.build_dataloader(dataset: torch.utils.data.dataset.Dataset, samples_per_gpu: int, workers_per_gpu: int, sample_ratio: Optional[Sequence] = None, num_gpus: int = 1, dist: bool = True, shuffle: bool = True, seed: Optional[int] = None, persistent_workers: bool = False, **kwargs)[source]¶

Build PyTorch DataLoader.

In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.

Parameters

dataset (Dataset) – A PyTorch dataset.
samples_per_gpu (int) – Number of training samples on each GPU, i.e., batch size of each GPU.
workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.
sample_ratio (list, optional) – The ratio of samples from each dataset in a mixed batch. The sum of sample_ratio must be equal to 1, and its length must be equal to the number of datasets. For example, with batch=8, sample_ratio=(0.5, 0.25, 0.25) means that one batch contains 4 samples from dataset1, 2 samples from dataset2 and 2 samples from dataset3.
num_gpus (int) – Number of GPUs. Only used in non-distributed training.
dist (bool) – Distributed training/test or not. Default: True.
shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.
seed (int, optional) – the seed for generating random numbers for data workers. Default to None.
persistent_workers (bool) – If True, the data loader will not shut down the worker processes after a dataset has been consumed once. This keeps the workers’ Dataset instances alive. The argument takes effect in PyTorch>=1.7.0. Default: False.
kwargs – any keyword argument to be used to initialize DataLoader

Returns

A PyTorch dataloader.

Return type

DataLoader

mmflow.datasets.build_dataset(cfg: Union[mmcv.utils.config.Config, Sequence[mmcv.utils.config.Config]], default_args: Optional[dict] = None) → torch.utils.data.dataset.Dataset[source]¶

Build PyTorch dataset.

Parameters

cfg (mmcv.Config) – Config dict of dataset or list of config dict. It should at least contain the key “type”.
default_args (dict, optional) – Default initialization arguments.

Note

If the input config is a list, this function will concatenate them automatically.

Returns: The built dataset based on the input config.
Return type: dataset

mmflow.datasets.read_flow(name: str) → numpy.ndarray[source]¶

Read flow file with the suffix ‘.flo’.

Parameters: name (str) – Optical flow file path.
Returns: Optical flow
Return type: ndarray
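For orientation, the sketch below reads and writes the widely used Middlebury ‘.flo’ layout with the standard library only. It assumes read_flow targets that layout (a float32 tag 202021.25, then int32 width and height, then interleaved u, v float32 values); the helper names `write_flo` and `read_flo` are hypothetical and are not the mmflow functions.

```python
import os
import struct
import tempfile
from typing import List, Tuple

FLO_MAGIC = 202021.25  # sanity-check tag at the start of a Middlebury .flo file


def write_flo(path: str, width: int, height: int, uv: List[float]) -> None:
    """Write interleaved (u, v) values, row by row, in little-endian float32."""
    if len(uv) != 2 * width * height:
        raise ValueError('uv must hold 2 * width * height values')
    with open(path, 'wb') as f:
        f.write(struct.pack('<f', FLO_MAGIC))
        f.write(struct.pack('<ii', width, height))
        f.write(struct.pack(f'<{len(uv)}f', *uv))


def read_flo(path: str) -> Tuple[int, int, List[float]]:
    """Return (width, height, interleaved u/v values)."""
    with open(path, 'rb') as f:
        (magic,) = struct.unpack('<f', f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError('not a valid .flo file')
        width, height = struct.unpack('<ii', f.read(8))
        count = 2 * width * height
        uv = list(struct.unpack(f'<{count}f', f.read(4 * count)))
    return width, height, uv


# Round trip for a 2x1 flow field.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.flo')
write_flo(demo_path, 2, 1, [1.5, -2.0, 0.25, 3.0])
width, height, uv = read_flo(demo_path)
```

In practice the values would be reshaped to an (H, W, 2) array; the flat list keeps the sketch dependency-free.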

mmflow.datasets.read_flow_kitti(name: str) → Tuple[numpy.ndarray, numpy.ndarray][source]¶

Read sparse flow file from KITTI dataset.

Parameters: name (str) – The flow file
Returns: flow and valid map
Return type: Tuple[ndarray, ndarray]
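The KITTI development kit stores sparse flow in 16-bit PNGs, where each flow component is encoded as `value * 64 + 2**15` and a third channel marks valid pixels. The sketch below shows that per-pixel conversion. It assumes read_flow_kitti and write_flow_kitti follow this convention, which this reference does not state; the helper names are hypothetical.

```python
from typing import Tuple


def decode_kitti(u_raw: int, v_raw: int, valid_raw: int) -> Tuple[float, float, bool]:
    """Convert raw 16-bit PNG values to a flow vector and a validity flag."""
    u = (u_raw - 2 ** 15) / 64.0
    v = (v_raw - 2 ** 15) / 64.0
    return u, v, bool(valid_raw)


def encode_kitti(u: float, v: float, valid: bool = True) -> Tuple[int, int, int]:
    """Convert a flow vector to raw 16-bit values, clamped to the uint16 range."""
    def to_raw(component: float) -> int:
        return min(max(int(round(component * 64.0 + 2 ** 15)), 0), 2 ** 16 - 1)

    return to_raw(u), to_raw(v), int(valid)


raw = encode_kitti(1.5, -2.25)
decoded = decode_kitti(*raw)
```

The fixed scale of 64 gives a resolution of 1/64 pixel, so values that are not multiples of 1/64 are rounded on encoding.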

mmflow.datasets.render_color_wheel(save_file: str = 'color_wheel.png') → numpy.ndarray[source]¶

Render color wheel.

Parameters: save_file (str) – The saved file name. Defaults to ‘color_wheel.png’.
Returns: color wheel image.
Return type: ndarray

mmflow.datasets.visualize_flow(flow: numpy.ndarray, save_file: Optional[str] = None) → numpy.ndarray[source]¶

Flow visualization function.

Parameters

flow (ndarray) – The flow to be rendered.
save_file (str, optional) – The file for saving the rendered flow map. Defaults to None.

Returns

flow map image with RGB order.

Return type

ndarray

mmflow.datasets.write_flow(flow: numpy.ndarray, flow_file: str) → None[source]¶

Write the flow to disk.

Parameters

flow (ndarray) – The optical flow that will be saved.
flow_file (str) – The file for saving optical flow.

mmflow.datasets.write_flow_kitti(uv: numpy.ndarray, filename: str)[source]¶

Write the flow to disk.

Parameters

uv (ndarray) – The optical flow that will be saved.
filename (str) – The file for saving optical flow.

pipelines¶

class mmflow.datasets.pipelines.Collect(keys: collections.abc.Sequence, meta_keys: collections.abc.Sequence = ('filename1', 'filename2', 'ori_filename1', 'ori_filename2', 'filename_flow', 'ori_filename_flow', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'flip_direction', 'img_norm_cfg'))[source]¶

Collect data from the loader relevant to the specific task.

This is usually the last stage of the data loader pipeline. Typically keys is set to some subset of “img”, “flow_gt”.

The “img_meta” item is always populated. The contents of the “img_meta” dictionary depend on “meta_keys”. By default this includes:

“img_shape”: shape of the image input to the network as a tuple (h, w, c). Note that images may be zero padded on the bottom/right if the batch tensor is larger than this shape.

“scale_factor”: a float indicating the preprocessing scale

“flip”: a boolean indicating if image flip transform was used

“filename1”: path to the image1 file

“filename2”: path to the image2 file

“ori_filename1”: image1 file name

“ori_filename2”: image2 file name

“ori_shape”: original shape of the image as a tuple (h, w, c)

“pad_shape”: image shape after padding

“img_norm_cfg”: a dict of normalization information:

mean - per channel mean subtraction

std - per channel std divisor

to_rgb - bool indicating if bgr was converted to rgb

Parameters

keys (Sequence[str]) – Keys of results to be collected in data.
meta_keys (Sequence[str], optional) – Meta keys to be converted to mmcv.DataContainer and collected in data[img_metas]. Default: ('filename1', 'filename2', 'ori_filename1', 'ori_filename2', 'filename_flow', 'ori_filename_flow', 'ori_shape', 'img_shape', 'pad_shape', 'scale_factor', 'flip', 'flip_direction', 'img_norm_cfg')
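The collection step can be pictured with a small stand-in function. This is a sketch only: `collect` is a hypothetical helper, the meta information is kept in a plain dict rather than an mmcv.DataContainer, and silently skipping meta keys that are absent is an assumption, not documented behaviour.

```python
from typing import Any, Dict, Sequence


def collect(results: Dict[str, Any], keys: Sequence[str],
            meta_keys: Sequence[str]) -> Dict[str, Any]:
    """Keep only the requested keys and gather meta information under 'img_metas'."""
    data = {key: results[key] for key in keys}
    # Assumption: meta keys missing from results are skipped.
    data['img_metas'] = {key: results[key] for key in meta_keys if key in results}
    return data


results = {
    'img1': 'image-1', 'img2': 'image-2', 'flow_gt': 'flow',
    'flip': False, 'ori_shape': (2, 2, 3), 'scratch': 'dropped',
}
data = collect(results, keys=['img1', 'img2', 'flow_gt'],
               meta_keys=('flip', 'ori_shape', 'pad_shape'))
```

Anything not named in keys or meta_keys, such as the `scratch` entry here, does not reach the model.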

class mmflow.datasets.pipelines.ColorJitter(asymmetric_prob=0.0, brightness=0.0, contrast=0.0, saturation=0.0, hue=0.0)[source]¶

Randomly change the brightness, contrast, saturation and hue of an image.

Parameters

asymmetric_prob (float) – the probability to do color jitter for two images asymmetrically.
brightness (float, tuple) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non negative numbers.
contrast (float, tuple) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non negative numbers.
saturation (float, tuple) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non negative numbers.
hue (float, tuple) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0<= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
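The sampling ranges for brightness, contrast and saturation follow directly from the formulas in the parameter list. A minimal sketch, with the hypothetical helpers `factor_range` and `sample_factor` standing in for whatever the class does internally:

```python
import random
from typing import Sequence, Tuple, Union


def factor_range(value: Union[float, Sequence[float]]) -> Tuple[float, float]:
    """Return the [low, high] interval a jitter factor is drawn from."""
    if isinstance(value, (tuple, list)):
        low, high = value  # an explicit [min, max] is used as given
        return (low, high)
    if value < 0:
        raise ValueError('jitter strength should be non-negative')
    return (max(0.0, 1 - value), 1 + value)


def sample_factor(value: Union[float, Sequence[float]], rng: random.Random) -> float:
    """Draw one factor uniformly from the interval."""
    low, high = factor_range(value)
    return rng.uniform(low, high)


rng = random.Random(0)
brightness_factor = sample_factor(0.5, rng)  # drawn from [0.5, 1.5]
```

Hue is the exception: its interval is centred on 0, i.e. [-hue, hue], so it would need its own helper.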

class mmflow.datasets.pipelines.Compose(transforms: Sequence)[source]¶

Compose multiple transforms sequentially.

Parameters: transforms (Sequence[dict | callable]) – Sequence of transform object or config dict to be composed.
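Sequential composition amounts to feeding the result dict through each transform in turn. The sketch below handles callables only; building transforms from config dicts, as the class also accepts, is left out, and returning early when a transform yields None is an assumption borrowed from similar OpenMMLab pipelines. `compose` is a hypothetical name.

```python
from typing import Callable, Optional, Sequence


def compose(transforms: Sequence[Callable[[dict], Optional[dict]]]) -> Callable:
    """Chain callables so each one receives the previous one's output."""
    def pipeline(results: dict) -> Optional[dict]:
        for transform in transforms:
            results = transform(results)
            if results is None:  # assumption: a transform may drop the sample
                return None
        return results
    return pipeline


def add_one(results: dict) -> dict:
    return {**results, 'x': results['x'] + 1}


def double(results: dict) -> dict:
    return {**results, 'x': results['x'] * 2}


pipeline = compose([add_one, double])
out = pipeline({'x': 3})
```

Order matters: swapping the two toy transforms changes the result.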

class mmflow.datasets.pipelines.DefaultFormatBundle[source]¶

Default formatting bundle.

It simplifies the pipeline of formatting common fields, including “img1”, “img2” and “flow_gt”. These fields are formatted as follows.

img1: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
img2: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
flow_gt: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)

class mmflow.datasets.pipelines.Erase(prob: float, bounds: Sequence = [50, 100], max_num: int = 3)[source]¶

Erase transform from RAFT, which randomly erases rectangular regions in img2 to simulate occlusions.

Parameters

prob (float) – the probability for erase transform.
bounds (list, tuple) – the bounds for erase regions (bound_x, bound_y).
max_num (int) – the max number of erase regions.

Returns

revised results, ‘img2’ and ‘erase_num’ are added into results.

Return type

dict

class mmflow.datasets.pipelines.GaussianNoise(sigma_range=(0, 0.04), clamp_range=(- inf, inf))[source]¶

Add Gaussian Noise to images.

Add Gaussian Noise, with mean 0 and std sigma uniformly sampled from sigma_range, to images. And then clamp the images to clamp_range.

Parameters

sigma_range (list(float) | tuple(float)) – Uniformly sample sigma of gaussian noise in sigma_range. Default: (0, 0.04)
clamp_range (list(float) | tuple(float)) – The min and max value to clamp the images after adding gaussian noise. Default: (float(‘-inf’), float(‘inf’)).
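The two steps described above (sample sigma, add noise, then clamp) can be sketched on a flat list of pixel values. `add_gaussian_noise` is a hypothetical helper, and drawing a single sigma for the whole input is an assumption.

```python
import random
from typing import List, Optional, Sequence


def add_gaussian_noise(pixels: Sequence[float],
                       sigma_range: Sequence[float] = (0, 0.04),
                       clamp_range: Sequence[float] = (float('-inf'), float('inf')),
                       rng: Optional[random.Random] = None) -> List[float]:
    """Add zero-mean Gaussian noise, then clamp to clamp_range."""
    rng = rng or random.Random()
    sigma = rng.uniform(*sigma_range)  # one std sampled uniformly from sigma_range
    low, high = clamp_range
    return [min(max(p + rng.gauss(0.0, sigma), low), high) for p in pixels]


noisy = add_gaussian_noise([0.0, 0.5, 1.0], clamp_range=(0.0, 1.0),
                           rng=random.Random(0))
```

With the default clamp_range of (-inf, inf) the clamp is a no-op, so images normalised to [0, 1] need an explicit range to stay in bounds.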

class mmflow.datasets.pipelines.ImageToTensor(keys: collections.abc.Sequence)[source]¶

Convert image to torch.Tensor by given keys.

The dimension order of the input image is (H, W, C). The pipeline will convert it to (C, H, W). If only 2 dimensions (H, W) are given, the output will be (1, H, W).

Parameters: keys (Sequence[str]) – Key of images to be converted to Tensor.
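The axis reordering can be shown on nested lists, without tensors. `hwc_to_chw` is a hypothetical helper that mirrors the two cases described above.

```python
from typing import List


def hwc_to_chw(img: List) -> List:
    """Reorder (H, W, C) to (C, H, W); a 2-D (H, W) input becomes (1, H, W)."""
    if not isinstance(img[0][0], (list, tuple)):
        return [[list(row) for row in img]]
    channels = len(img[0][0])
    return [[[pixel[c] for pixel in row] for row in img] for c in range(channels)]


# One row, two pixels, three channels: shape (1, 2, 3).
color = [[[1, 2, 3], [4, 5, 6]]]
chw = hwc_to_chw(color)  # shape (3, 1, 2)
```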

class mmflow.datasets.pipelines.InputPad(exponent, mode='edge', position='center', **kwargs)[source]¶

Pad images so that their dimensions are divisible by 2^n. Used at test time.

Parameters

exponent (int) – the exponent n of 2^n
mode (str) – mode for numpy.pad(). Defaults to ‘edge’.
position (str) – ‘center’, ‘left’, ‘right’, ‘top’ and ‘down’. Defaults to ‘center’
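The target size after padding is the next multiple of 2^n in each dimension. A short sketch of that arithmetic, using the hypothetical helper `padded_size`; how the extra rows and columns are distributed depends on position and is not modelled here.

```python
from typing import Tuple


def padded_size(height: int, width: int, exponent: int) -> Tuple[int, int]:
    """Round height and width up to the nearest multiple of 2 ** exponent."""
    multiple = 2 ** exponent
    pad_h = (-height) % multiple  # rows to add; 0 if already divisible
    pad_w = (-width) % multiple   # columns to add
    return height + pad_h, width + pad_w


size = padded_size(436, 1024, 3)
```

For a 436 x 1024 image and exponent 3, the height grows to 440 while the width is already divisible by 8.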

class mmflow.datasets.pipelines.InputResize(exponent)[source]¶

Resize images such that dimensions are divisible by 2^n.

Parameters: exponent (int) – the exponent n of 2^n.

Returns

Resized results; the ‘img_shape’ and ‘scale_factor’ keys are added to the result dict.

Return type

dict

class mmflow.datasets.pipelines.LoadAnnotations(with_occ: bool = False, sparse: bool = False, file_client_args: dict = {'backend': 'disk'})[source]¶

Load optical flow from file.

Parameters

with_occ (bool) – whether to parse and load occlusion mask. Default to False.
sparse (bool) – whether the flow is sparse. Default to False.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

class mmflow.datasets.pipelines.LoadImageFromFile(to_float32: bool = False, color_type: str = 'color', file_client_args: dict = {'backend': 'disk'}, imdecode_backend: str = 'cv2')[source]¶

Load image1 and image2 from file.

Required keys are “img1_info” (dict that must contain the keys “filename” and “filename2”). Added or updated keys are “img1”, “img2”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0, 1.0) and “img_norm_cfg” (means=0 and stds=1).

Parameters

to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
imdecode_backend (str) – Backend for mmcv.imdecode(). Default: ‘cv2’

class mmflow.datasets.pipelines.Normalize(mean, std, to_rgb=True)[source]¶

Normalize the image.

Added key is “img_norm_cfg”.

Parameters

mean (sequence) – Mean values of 3 channels.
std (sequence) – Std values of 3 channels.
to_rgb (bool) – Whether to convert the image from BGR to RGB. Default: True.
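Per pixel, normalization is a channel-wise `(value - mean) / std`. The sketch below assumes the BGR to RGB swap happens before the statistics are applied, as in mmcv's image normalization, so mean and std are given in RGB order when to_rgb is True. `normalize_pixel` is a hypothetical helper.

```python
from typing import List, Sequence


def normalize_pixel(pixel: Sequence[float], mean: Sequence[float],
                    std: Sequence[float], to_rgb: bool = True) -> List[float]:
    """Normalize one 3-channel pixel."""
    if to_rgb:
        pixel = pixel[::-1]  # BGR -> RGB
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# A BGR pixel (10, 20, 30) becomes RGB (30, 20, 10) before normalization.
normalized = normalize_pixel([10, 20, 30], mean=[10, 10, 10], std=[2, 5, 10])
```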

class mmflow.datasets.pipelines.PhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]¶

Apply photometric distortion to the image sequentially. Every transformation is applied with a probability of 0.5. Random contrast is applied either second or second to last.

1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
8. randomly swap channels

Parameters

brightness_delta (int) – delta of brightness.
contrast_range (tuple) – range of contrast.
saturation_range (tuple) – range of saturation.
hue_delta (int) – delta of hue.

brightness(img)[source]¶: Brightness distortion.

contrast(img)[source]¶: Contrast distortion.

convert(img, alpha=1, beta=0)[source]¶: Multiply by alpha and add beta, with clipping.

hue(img)[source]¶: Hue distortion.

saturation(img)[source]¶: Saturation distortion.
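The convert helper is the building block behind the brightness and contrast steps: a linear map followed by clipping. The sketch below assumes clipping to the 8-bit range [0, 255] with truncation to integers, which this reference does not spell out; it works on a flat list instead of an image array.

```python
from typing import List, Sequence


def convert(pixels: Sequence[float], alpha: float = 1, beta: float = 0) -> List[int]:
    """Compute pixel * alpha + beta and clip the result to [0, 255]."""
    return [int(min(max(p * alpha + beta, 0), 255)) for p in pixels]


# Contrast-like scaling (alpha) combined with a brightness shift (beta).
out = convert([100, 200, 10], alpha=1.5, beta=-20)
```

Values pushed outside the range saturate rather than wrap around, which is why the clip is needed before converting back to integers.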

class mmflow.datasets.pipelines.RandomAffine(global_transform: Optional[dict] = None, relative_transform: Optional[dict] = None, preserve_valid: bool = True, check_bound: bool = False)[source]¶

Random affine transformation of images, flow map and occlusion map (if available).

Keys of global_transform and relative_transform should be a subset of (‘translates’, ‘zoom’, ‘shear’, ‘rotate’). Each key and its corresponding values must satisfy the following rules:

translates: the translation ratios along x axis and y axis. Defaults to (0., 0.).

zoom: the min and max zoom ratios. Defaults to (1.0, 1.0).

shear: the min and max shear ratios. Defaults to (1.0, 1.0).

rotate: the min and max rotate degree. Defaults to (0., 0.).

Parameters

global_transform (dict) – A dict which contains keys: translates, zoom, shear, rotate. global_transform will transform both img1 and img2.
relative_transform (dict) – A dict which contains keys: translates, zoom, shear, rotate. relative_transform will only transform img2, after global_transform has been applied to both images.
preserve_valid (bool) – Whether to continue transforming until both images are valid. A valid affine transform is an affine transform which guarantees the transformed image covers the whole original picture frame. Defaults to True.
check_bound (bool) – Whether to check out of bound for transformed occlusion maps. If True, all pixels in borders of img1 but not in borders of img2 will be marked occluded. Defaults to False.

class mmflow.datasets.pipelines.RandomCrop(crop_size)[source]¶

Randomly crop the image and flow.

Parameters: crop_size (tuple) – Expected size after cropping, (h, w).

crop(img, crop_bbox)[source]¶: Crop a region from img.

get_crop_bbox(img_shape)[source]¶: Randomly get a crop bounding box.
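The two methods work as a pair: one draws a random window of crop_size, the other slices it out. A sketch on nested lists follows; the (y1, y2, x1, x2) layout of the bounding box is an assumption, and crop_size is passed explicitly because this stand-in has no instance state.

```python
import random
from typing import List, Optional, Sequence, Tuple


def get_crop_bbox(img_shape: Sequence[int], crop_size: Sequence[int],
                  rng: Optional[random.Random] = None) -> Tuple[int, int, int, int]:
    """Pick a random (y1, y2, x1, x2) window of crop_size inside the image."""
    rng = rng or random.Random()
    margin_h = max(img_shape[0] - crop_size[0], 0)
    margin_w = max(img_shape[1] - crop_size[1], 0)
    offset_h = rng.randint(0, margin_h)  # randint includes both endpoints
    offset_w = rng.randint(0, margin_w)
    return offset_h, offset_h + crop_size[0], offset_w, offset_w + crop_size[1]


def crop(img: List[List], crop_bbox: Sequence[int]) -> List[List]:
    """Slice the window out of a row-major image."""
    y1, y2, x1, x2 = crop_bbox
    return [row[x1:x2] for row in img[y1:y2]]


img = [[(y, x) for x in range(6)] for y in range(4)]  # 4 x 6 image
bbox = get_crop_bbox((4, 6), (2, 3), rng=random.Random(0))
patch = crop(img, bbox)
```

The same bounding box has to be applied to both images and the flow map so that they stay aligned.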

class mmflow.datasets.pipelines.RandomFlip(prob, direction='horizontal')[source]¶

Flip the image and flow map.

Parameters

prob (float) – The flipping probability.
direction (str) – The flipping direction. Options are ‘horizontal’ and ‘vertical’. Default: ‘horizontal’.
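Flipping a flow map is not just a mirror of the array: the flow component along the flipped axis also changes sign, because motion to the right becomes motion to the left. A sketch on nested lists of (u, v) tuples, with the hypothetical helper `flip_flow`:

```python
from typing import List, Tuple

Flow = List[List[Tuple[float, float]]]


def flip_flow(flow: Flow, direction: str = 'horizontal') -> Flow:
    """Mirror a flow map and negate the component along the flipped axis."""
    if direction == 'horizontal':
        return [[(-u, v) for (u, v) in reversed(row)] for row in flow]
    if direction == 'vertical':
        return [[(u, -v) for (u, v) in row] for row in reversed(flow)]
    raise ValueError("direction must be 'horizontal' or 'vertical'")


flow = [[(1.0, 2.0), (3.0, 4.0)]]
flipped = flip_flow(flow, 'horizontal')
```

Applying the same flip twice restores the original flow.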

class mmflow.datasets.pipelines.RandomRotation(prob, angle, auto_bound=False)[source]¶

Random rotation of the images and optical flow data from -angle to angle (in degrees).

Parameters

prob (float) – The rotation probability.
angle (float) – max angle of the rotation in the range from -180 to 180.
auto_bound (bool) – Whether to adjust the image size to cover the whole rotated image. Default: False

class mmflow.datasets.pipelines.RandomTranslate(prob=0.0, x_offset=0.0, y_offset=0.0)[source]¶

Random translation of the images and optical flow data.

Parameters

prob (float) – the probability to do translation.
x_offset (float | tuple) – translate ratio on x axis, randomly chosen from [-x_offset, x_offset] or the given [min, max]. Default: 0.
y_offset (float | tuple) – translate ratio on y axis, randomly chosen from [-y_offset, y_offset] or the given [min, max]. Default: 0.

class mmflow.datasets.pipelines.Rerange(min_value=0, max_value=255)[source]¶

Rerange the image pixel value.

Parameters

min_value (float or int) – Minimum value of the reranged image. Default: 0.
max_value (float or int) – Maximum value of the reranged image. Default: 255.
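Reranging is a linear min-max rescale. The sketch below assumes the input's own minimum and maximum are mapped onto min_value and max_value, as in similar OpenMMLab transforms; `rerange` here is a hypothetical stand-in that works on a flat list.

```python
from typing import List, Sequence


def rerange(pixels: Sequence[float], min_value: float = 0,
            max_value: float = 255) -> List[float]:
    """Linearly map [min(pixels), max(pixels)] onto [min_value, max_value]."""
    low, high = min(pixels), max(pixels)
    if low == high:
        raise ValueError('a constant image cannot be reranged')
    scale = (max_value - min_value) / (high - low)
    return [(p - low) * scale + min_value for p in pixels]


out = rerange([10, 20, 30], min_value=0, max_value=1)
```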

class mmflow.datasets.pipelines.SpacialTransform(spacial_prob: float, stretch_prob: float, crop_size: Sequence, min_scale: float = - 0.2, max_scale: float = 0.5, max_stretch: float = 0.2)[source]¶

Spacial Transform API for RAFT.

Parameters

spacial_prob (float) – probability to do spacial transform.
stretch_prob (float) – probability to do stretch.
crop_size (tuple, list) – the base size for resize.
min_scale (float) – the exponent for min scale. Defaults to -0.2.
max_scale (float) – the exponent for max scale. Defaults to 0.5.

Returns: Resized results, ‘img_shape’,
Return type: dict
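Because min_scale and max_scale are exponents, the actual resize factor is a power of two. The sketch below follows the sampling used in the original RAFT augmentation and assumes mmflow does the same; the lower bound RAFT applies so that the resized image still covers crop_size is omitted, and `sample_scales` is a hypothetical helper.

```python
import random
from typing import Optional, Tuple


def sample_scales(min_scale: float = -0.2, max_scale: float = 0.5,
                  max_stretch: float = 0.2, stretch_prob: float = 0.8,
                  rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Sample horizontal and vertical resize factors as powers of two."""
    rng = rng or random.Random()
    scale = 2 ** rng.uniform(min_scale, max_scale)
    scale_x = scale_y = scale
    if rng.random() < stretch_prob:
        # Independent stretch per axis changes the aspect ratio.
        scale_x *= 2 ** rng.uniform(-max_stretch, max_stretch)
        scale_y *= 2 ** rng.uniform(-max_stretch, max_stretch)
    return scale_x, scale_y


fx, fy = sample_scales(stretch_prob=0.0, rng=random.Random(0))
```

With the defaults, the unstretched factor lies between 2^-0.2 (about 0.87) and 2^0.5 (about 1.41).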

resize_sparse_flow_map(flow: numpy.ndarray, valid: numpy.ndarray, fx: float = 1.0, fy: float = 1.0, x0: int = 0, y0: int = 0) → Sequence[numpy.ndarray][source]¶

Resize sparse optical flow function.

Parameters

flow (ndarray) – optical flow data to be resized.
valid (ndarray) – valid mask for sparse optical flow.
fx (float, optional) – horizontal scale factor. Defaults to 1.0.
fy (float, optional) – vertical scale factor. Defaults to 1.0.
x0 (int, optional) – abscissa of the left-top point where the flow map will be cropped from. Defaults to 0.
y0 (int, optional) – ordinate of the left-top point where the flow map will be cropped from. Defaults to 0.

Returns

the transformed flow map and valid mask.

Return type

Sequence[ndarray]

spacial_transform(imgs: numpy.ndarray) → Tuple[numpy.ndarray, float, float, int, int][source]¶

Spacial transform function.

Parameters

imgs (ndarray) – the images that will be transformed.

Returns

the transformed images, the horizontal scale factor, the vertical scale factor, and the coordinates of the left-top point where the images will be cropped from.

Return type

Tuple[ndarray, float, float, int, int]

class mmflow.datasets.pipelines.TestFormatBundle[source]¶

Default formatting bundle.

It simplifies the pipeline of formatting common fields, including “img1” and “img2”. These fields are formatted as follows.

img1: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
img2: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)

class mmflow.datasets.pipelines.ToDataContainer(fields: collections.abc.Sequence = ({'key': 'img1', 'stack': True}, {'key': 'img2', 'stack': True}, {'key': 'flow_gt'}))[source]¶

Convert results to mmcv.DataContainer by given fields.

Parameters: fields (Sequence[dict]) – Each field is a dict like dict(key='xxx', **kwargs). The key in result will be converted to mmcv.DataContainer with **kwargs. Default: (dict(key='img1', stack=True), dict(key='img2', stack=True), dict(key='flow_gt')).

class mmflow.datasets.pipelines.ToTensor(keys: collections.abc.Sequence)[source]¶

Convert some results to torch.Tensor by given keys.

Parameters: keys (Sequence[str]) – Keys that need to be converted to Tensor.

class mmflow.datasets.pipelines.Transpose(keys: collections.abc.Sequence, order: collections.abc.Sequence)[source]¶

Transpose some results by given keys.

Parameters

keys (Sequence[str]) – Keys of results to be transposed.
order (Sequence[int]) – Order of transpose.

class mmflow.datasets.pipelines.Validation(max_flow: Union[float, int])[source]¶

Validation transform from RAFT that returns a mask marking where the flow is less than max_flow.

Parameters

max_flow (float, int) – the max flow for validated flow.

Returns

Resized results; the ‘valid’ and ‘max_flow’ keys are added to the result dict.
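A minimal sketch of such a mask, assuming "less than max_flow" refers to the Euclidean magnitude of each flow vector, as in RAFT's training code. `valid_mask` is a hypothetical helper operating on nested lists of (u, v) tuples.

```python
import math
from typing import List, Tuple


def valid_mask(flow: List[List[Tuple[float, float]]],
               max_flow: float) -> List[List[bool]]:
    """Mark pixels whose flow magnitude is strictly below max_flow."""
    return [[math.hypot(u, v) < max_flow for (u, v) in row] for row in flow]


flow = [[(3.0, 4.0), (300.0, 400.0)]]  # magnitudes 5 and 500
mask = valid_mask(flow, max_flow=400)
```

Masking out very large displacements keeps outliers from dominating the training loss.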
