torch_frame.nn.models.ExcelFormer
- class ExcelFormer(in_channels: int, out_channels: int, num_cols: int, num_layers: int, num_heads: int, col_stats: dict[str, dict[torch_frame.data.stats.StatType, Any]], col_names_dict: dict[torch_frame._stype.stype, list[str]], stype_encoder_dict: Optional[dict[torch_frame._stype.stype, torch_frame.nn.encoder.stype_encoder.StypeEncoder]] = None, diam_dropout: float = 0.0, aium_dropout: float = 0.0, residual_dropout: float = 0.0, mixup: Optional[str] = None, beta: float = 0.5)
Bases: Module
The ExcelFormer model introduced in the “ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data” paper.
ExcelFormer first converts the categorical features into numerical features with a target statistics encoder (i.e., CatBoostEncoder in the paper). It then sorts the numerical features by mutual information. The model itself is therefore limited to numerical features.
Note
For an example of using ExcelFormer, see examples/excelformer.py.
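The categorical-to-numerical step described above can be illustrated with a minimal sketch of target statistics encoding. This is a simplified, non-ordered variant written in plain Python; the function name target_statistic_encode and the smoothing parameter are hypothetical and are not part of torch_frame, and CatBoost-style encoders additionally order the rows to limit target leakage.

```python
from collections import defaultdict


def target_statistic_encode(categories, targets, smoothing=1.0):
    """Replace each category with a smoothed mean of its targets.

    Hypothetical helper for illustration only (not a torch_frame API).
    """
    prior = sum(targets) / len(targets)  # global target mean
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Smoothed mean: rare categories are pulled toward the global prior.
    table = {
        cat: (sums[cat] + smoothing * prior) / (counts[cat] + smoothing)
        for cat in counts
    }
    return [table[cat] for cat in categories], table


colors = ["red", "red", "blue", "blue", "blue", "green"]
labels = [1, 1, 0, 0, 1, 0]
encoded, table = target_statistic_encode(colors, labels)
```

After this step every column is numerical, which is the input the model expects; the mutual information sort would then reorder those columns.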
- Parameters:
in_channels (int) – Input channel dimensionality
out_channels (int) – Output channels dimensionality
num_cols (int) – Number of columns
num_layers (int) – Number of torch_frame.nn.conv.ExcelFormerConv layers.
num_heads (int) – Number of attention heads used in DiaM.
col_stats (dict[str, dict[torch_frame.data.stats.StatType, Any]]) – A dictionary that maps each column name to its stats. Available as dataset.col_stats.
col_names_dict (dict[torch_frame.stype, list[str]]) – A dictionary that maps each stype to a list of column names. The column names are sorted in the order in which they appear in tensor_frame.feat_dict. Available as tensor_frame.col_names_dict.
stype_encoder_dict (dict[torch_frame.stype, torch_frame.nn.encoder.StypeEncoder], optional) – A dictionary mapping stypes to their stype encoders. (default: None, will call ExcelFormerEncoder() for numerical features)
diam_dropout (float, optional) – Dropout rate for DiaM. (default: 0.0)
aium_dropout (float, optional) – Dropout rate for AiuM. (default: 0.0)
residual_dropout (float, optional) – Residual dropout rate. (default: 0.0)
mixup (str, optional) – Mixup type: None, feature, or hidden. (default: None)
beta (float, optional) – Shape parameter of the beta distribution used to compute the shuffle rate in mixup. Only used when mixup is not None. (default: 0.5)
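To make the role of beta more concrete, here is a rough sketch of a feature-level mixup in plain Python. It is not the library's implementation: the function name feature_mixup_sketch is hypothetical, and weighting the targets by the fraction of columns kept is an assumption made for illustration.

```python
import random


def feature_mixup_sketch(x, y, beta=0.5, rng=None):
    """Mix each row with a randomly chosen partner row.

    x: list of feature rows; y: list of one-hot target rows.
    Hypothetical helper for illustration only (not a torch_frame API).
    """
    rng = rng or random.Random()
    n, num_cols = len(x), len(x[0])
    partner = list(range(n))
    rng.shuffle(partner)  # partner[i] is the row mixed into row i
    # Shuffle rate drawn from Beta(beta, beta), shared by the whole batch.
    shuffle_rate = rng.betavariate(beta, beta)
    # Each column is taken from the partner row with this probability.
    swapped = [rng.random() < shuffle_rate for _ in range(num_cols)]
    # Assumption: weight targets by the fraction of columns kept.
    lam = 1.0 - sum(swapped) / num_cols
    mixed_x = [
        [x[partner[i]][j] if swapped[j] else x[i][j] for j in range(num_cols)]
        for i in range(n)
    ]
    mixed_y = [
        [lam * a + (1.0 - lam) * b for a, b in zip(y[i], y[partner[i]])]
        for i in range(n)
    ]
    return mixed_x, mixed_y


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
y = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
mixed_x, mixed_y = feature_mixup_sketch(x, y, beta=0.5, rng=random.Random(0))
```

With beta below 1, Beta(beta, beta) puts most of its mass near 0 and 1, so a sampled shuffle rate tends to swap either few or most of the columns.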
- forward(tf: TensorFrame, mixup_encoded: bool = False) -> torch.Tensor | tuple[torch.Tensor, torch.Tensor]
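Transform a TensorFrame object into output embeddings. If mixup_encoded is True, it produces the output embeddings together with the mixed-up targets in the self.mixup manner.
- Parameters:
tf (TensorFrame) – Input TensorFrame object.
mixup_encoded (bool, optional) – Whether to apply mixup and return the mixed-up targets as well. (default: False)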
- Returns:
The output embeddings of size [batch_size, out_channels]. If mixup_encoded is True, return the mixed-up targets of size [batch_size, num_classes] as well.
- Return type:
torch.Tensor | tuple[Tensor, Tensor]
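The constructor and forward call documented above might be wired together roughly as follows. This is a sketch under assumptions: the helper names excelformer_kwargs and run_excelformer are hypothetical, the hyperparameter values are illustrative, and dataset and tensor_frame are assumed to be already preprocessed to numerical features only, with targets available for mixup.

```python
def excelformer_kwargs(col_stats, col_names_dict, out_channels,
                       in_channels=32, num_layers=5, num_heads=4):
    """Collect constructor arguments for ExcelFormer (values illustrative)."""
    # num_cols is the total number of columns across all stypes.
    num_cols = sum(len(names) for names in col_names_dict.values())
    return dict(
        in_channels=in_channels,
        out_channels=out_channels,
        num_cols=num_cols,
        num_layers=num_layers,
        num_heads=num_heads,
        col_stats=col_stats,
        col_names_dict=col_names_dict,
        mixup="feature",  # one of None, "feature", "hidden"
        beta=0.5,
    )


def run_excelformer(dataset, tensor_frame, out_channels):
    """Build the model and call forward with and without mixup."""
    # Imported lazily so the helper above works without torch_frame.
    from torch_frame.nn.models import ExcelFormer

    model = ExcelFormer(**excelformer_kwargs(
        dataset.col_stats, tensor_frame.col_names_dict, out_channels))
    out = model(tensor_frame)  # size [batch_size, out_channels]
    # With mixup_encoded=True the mixed-up targets are returned as well.
    out_mixed, y_mixed = model(tensor_frame, mixup_encoded=True)
    return out, out_mixed, y_mixed
```

The script referenced in the note above, examples/excelformer.py, remains the authoritative end-to-end example.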