torch_frame.datasets.FakeDataset

class FakeDataset(num_rows: int, with_nan: bool = False, stypes: Optional[list[torch_frame._stype.stype]] = None, create_split: bool = False, task_type: TaskType = TaskType.REGRESSION, col_to_text_embedder_cfg: Optional[Union[dict[str, torch_frame.config.text_embedder.TextEmbedderConfig], TextEmbedderConfig]] = None, col_to_text_tokenizer_cfg: Optional[Union[dict[str, torch_frame.config.text_tokenizer.TextTokenizerConfig], TextTokenizerConfig]] = None, col_to_image_embedder_cfg: Optional[Union[dict[str, torch_frame.config.image_embedder.ImageEmbedderConfig], ImageEmbedderConfig]] = None, tmp_path: Optional[str] = None)

Bases: Dataset

A fake dataset for testing purposes.

Parameters:
  • num_rows (int) – Number of rows.

  • with_nan (bool) – Whether to include NaN values in the dataset. (default: False)

  • stypes (List[stype]) – List of stypes for which columns are included in the dataset. Particularly useful when you want to create a dataset with only numerical or only categorical feature columns. (default: [stype.categorical, stype.numerical])

  • create_split (bool) – Whether to create a train, val and test split for the fake dataset. (default: False)

  • task_type (TaskType) – The task type of the dataset. (default: TaskType.REGRESSION)

  • tmp_path (str, optional) – Temporary path to save created images.