torch_frame.data.TensorFrame
- class TensorFrame(feat_dict: dict[torch_frame._stype.stype, torch.Tensor | torch_frame.data.multi_nested_tensor.MultiNestedTensor | torch_frame.data.multi_embedding_tensor.MultiEmbeddingTensor | dict[str, torch_frame.data.multi_nested_tensor.MultiNestedTensor]], col_names_dict: dict[torch_frame._stype.stype, list[str]], y: Optional[Tensor] = None, num_rows: Optional[int] = None)
Bases: object

A tensor frame holds a PyTorch tensor for each table column. Table columns are organized into their semantic types stype (e.g., categorical, numerical) and mapped to a compact tensor representation (e.g., strings in a categorical column are mapped to indices from {0, ..., num_categories - 1}), and can be accessed through feat_dict. For instance, feat_dict[stype.numerical] stores a concatenated PyTorch tensor for all numerical features, where the first and second dimensions represent the row and column in the original data frame, respectively.

TensorFrame handles missing values via float('NaN') for floating-point tensors, and -1 otherwise.

col_names_dict maps each column in feat_dict to its original column name. For example, col_names_dict[stype.numerical][i] stores the column name of feat_dict[stype.numerical][:, i].

Additionally, TensorFrame can store any target values in y.

```python
import torch
import torch_frame

tf = torch_frame.TensorFrame(
    feat_dict={
        # Two numerical columns:
        torch_frame.numerical: torch.randn(10, 2),
        # Three categorical columns:
        torch_frame.categorical: torch.randint(0, 5, (10, 3)),
    },
    col_names_dict={
        torch_frame.numerical: ['num_1', 'num_2'],
        torch_frame.categorical: ['cat_1', 'cat_2', 'cat_3'],
    },
)

print(len(tf))  # 10

# Row-wise filtering:
tf = tf[torch.tensor([0, 2, 4, 6, 8])]
print(len(tf))  # 5

# Transfer tensor frame to the GPU (requires a CUDA device):
tf = tf.to('cuda')
```
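The compact representation and the missing-value conventions described above can be illustrated with plain PyTorch. This is a minimal sketch, not the library's own encoding code: the helpers `encode_categorical` and `encode_numerical` are hypothetical, and assigning category indices in sorted order is an assumption made only for this example.

```python
import torch


def encode_categorical(values):
    """Hypothetical helper: map strings to indices in
    {0, ..., num_categories - 1} and missing entries (None) to -1."""
    # Sorted order is an assumption of this sketch, not the library's rule.
    categories = sorted({v for v in values if v is not None})
    index = {c: i for i, c in enumerate(categories)}
    codes = [index[v] if v is not None else -1 for v in values]
    return torch.tensor(codes, dtype=torch.long), categories


def encode_numerical(values):
    """Hypothetical helper: map missing entries (None) to NaN."""
    return torch.tensor(
        [float('nan') if v is None else float(v) for v in values],
        dtype=torch.float32,
    )


# A small table with four rows and some missing entries.
color, color_categories = encode_categorical(['red', None, 'blue', 'red'])
size, size_categories = encode_categorical(['S', 'L', 'L', None])
price = encode_numerical([1.5, None, 3.0, 4.25])

# Columns of the same semantic type are concatenated along dimension 1,
# so dimension 0 indexes rows and dimension 1 indexes columns.
feat_categorical = torch.stack([color, size], dim=1)  # shape [4, 2]
feat_numerical = price.unsqueeze(1)                   # shape [4, 1]

# Column names are kept in the same order as the tensor columns.
col_names_categorical = ['color', 'size']
col_names_numerical = ['price']
```

With this layout, `feat_categorical[:, i]` holds the column named `col_names_categorical[i]`, which mirrors the relationship between feat_dict and col_names_dict.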
- validate() -> None
Validates the TensorFrame object.
- get_col_feat(col_name: str, *, return_stype: bool = False) -> torch.Tensor | torch_frame.data.multi_nested_tensor.MultiNestedTensor | torch_frame.data.multi_embedding_tensor.MultiEmbeddingTensor | dict[str, torch_frame.data.multi_nested_tensor.MultiNestedTensor] | tuple[torch.Tensor | torch_frame.data.multi_nested_tensor.MultiNestedTensor | torch_frame.data.multi_embedding_tensor.MultiEmbeddingTensor | dict[str, torch_frame.data.multi_nested_tensor.MultiNestedTensor], torch_frame._stype.stype]
Returns the feature of the column named col_name. If return_stype is True, returns a tuple of the feature and the column's stype.
- property stypes: list[torch_frame._stype.stype]
Returns a canonical ordering of stypes in feat_dict.
- property num_cols: int
The number of columns in the TensorFrame.
- property num_rows: int
The number of rows in the TensorFrame.
- property device: torch.device | None
The device of the TensorFrame.
- property is_empty: bool
Returns True if the TensorFrame is empty.