torch_frame.nn.models.ExcelFormer
- class ExcelFormer(in_channels: int, out_channels: int, num_cols: int, num_layers: int, num_heads: int, col_stats: dict[str, dict[StatType, Any]], col_names_dict: dict[torch_frame.stype, list[str]], stype_encoder_dict: dict[torch_frame.stype, StypeEncoder] | None = None, diam_dropout: float = 0.0, aium_dropout: float = 0.0, residual_dropout: float = 0.0, mixup: str | None = None, beta: float = 0.5)
Bases: Module
The ExcelFormer model introduced in the “ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data” paper.
ExcelFormer first converts the categorical features into numerical features with a target statistics encoder (i.e., CatBoostEncoder in the paper). It then sorts the numerical features by mutual information. The model itself is therefore limited to numerical features.
Note
For an example of using ExcelFormer, see examples/excelformer.py.
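The target statistics step described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the function name `target_statistic_encode`, the smoothing weight `prior_weight`, and the plain (unordered) smoothed-mean formula are assumptions made for illustration. CatBoost-style encoders additionally use an ordering scheme to limit target leakage, which is omitted here.

```python
from collections import defaultdict


def target_statistic_encode(categories, targets, prior_weight=1.0):
    """Replace each category with a smoothed mean of its targets.

    encoded(c) = (sum_c + prior_weight * global_mean) / (count_c + prior_weight)

    Hypothetical helper for illustration only.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        raise ValueError("at least one row is required")

    global_mean = sum(targets) / len(targets)

    # Accumulate the per-category target sum and row count.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1

    # Map every row to the smoothed statistic of its category.
    return [
        (sums[c] + prior_weight * global_mean) / (counts[c] + prior_weight)
        for c in categories
    ]


colors = ["red", "blue", "red", "green"]
labels = [1.0, 0.0, 1.0, 0.0]
encoded = target_statistic_encode(colors, labels)
```

After this step every column is numerical, which is the form the model expects.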
- Parameters:
in_channels (int) – Input channel dimensionality
out_channels (int) – Output channel dimensionality
num_cols (int) – Number of columns
num_layers (int) – Number of torch_frame.nn.conv.ExcelFormerConv layers.
num_heads (int) – Number of attention heads used in DiaM
col_stats (dict[str, dict[torch_frame.data.stats.StatType, Any]]) – A dictionary that maps each column name to its statistics. Available as dataset.col_stats.
col_names_dict (dict[torch_frame.stype, list[str]]) – A dictionary that maps each stype to a list of column names. The column names are sorted in the order in which they appear in tensor_frame.feat_dict. Available as tensor_frame.col_names_dict.
stype_encoder_dict (dict[torch_frame.stype, torch_frame.nn.encoder.StypeEncoder], optional) – A dictionary mapping stypes to their stype encoders. (default: None, which calls ExcelFormerEncoder() for numerical features)
diam_dropout (float, optional) – Dropout rate for DiaM. (default: 0.0)
aium_dropout (float, optional) – Dropout rate for AiuM. (default: 0.0)
residual_dropout (float, optional) – Residual dropout rate. (default: 0.0)
mixup (str, optional) – Mixup type: None, feature, or hidden. (default: None)
beta (float, optional) – Shape parameter of the beta distribution used to calculate the shuffle rate in mixup. Only used when mixup is not None. (default: 0.5)
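To illustrate how beta and mixup relate, here is a minimal pure-Python sketch of a feature-wise mixup step. It assumes a simplified scheme: a shuffle rate is drawn from Beta(beta, beta), each feature is kept with that probability and otherwise taken from a partner row, and the targets are blended by the fraction of features actually kept. The function name `feature_mixup_sketch` and these details are illustrative assumptions, not the library's implementation.

```python
import random


def feature_mixup_sketch(x, y, x_partner, y_partner, beta=0.5, rng=None):
    """Mix one row with a partner row, feature by feature.

    x, x_partner: lists of numerical features of equal length.
    y, y_partner: one-hot (or soft) target lists of equal length.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()

    # Shuffle rate in [0, 1]; beta < 1 concentrates mass near 0 and 1.
    rate = rng.betavariate(beta, beta)

    # Keep each feature of x with probability `rate`, else use the partner's.
    keep = [rng.random() < rate for _ in x]
    x_mixed = [a if k else b for a, b, k in zip(x, x_partner, keep)]

    # Blend the targets by the fraction of features that were kept.
    lam = sum(keep) / len(keep)
    y_mixed = [lam * p + (1.0 - lam) * q for p, q in zip(y, y_partner)]
    return x_mixed, y_mixed, lam


x_mixed, y_mixed, lam = feature_mixup_sketch(
    x=[1.0, 2.0, 3.0, 4.0],
    y=[1.0, 0.0],
    x_partner=[10.0, 20.0, 30.0, 40.0],
    y_partner=[0.0, 1.0],
    beta=0.5,
    rng=random.Random(0),
)
```

With mixup set to None no such step applies, which is why beta has no effect in that case.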
- forward(tf: TensorFrame, mixup_encoded: bool = False) → Tensor | tuple[Tensor, Tensor]
Transform a TensorFrame object into output embeddings. If mixup_encoded is True, it also returns the mixed-up targets, produced according to self.mixup.
- Parameters:
tf (torch_frame.TensorFrame) – Input TensorFrame object.
mixup_encoded (bool) – Whether to apply mixup to the encoded numerical features, i.e., FEAT-MIX and HIDDEN-MIX.
- Returns:
The output embeddings of size [batch_size, out_channels]. If mixup_encoded is True, return the mixed-up targets of size [batch_size, num_classes] as well.
- Return type:
torch.Tensor | tuple[Tensor, Tensor]
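Because the mixed-up targets have size [batch_size, num_classes] rather than being class indices, a training loop needs a loss that accepts soft targets. The pure-Python sketch below computes such a cross-entropy for a single row. The helper name `soft_cross_entropy` is hypothetical, and how any particular training script computes its loss is not specified here; in practice one would use the equivalent tensor operation of the deep learning framework.

```python
import math


def soft_cross_entropy(logits, soft_target):
    """Cross-entropy between a soft target distribution and softmax(logits).

    Hypothetical helper for illustration only.
    """
    # Numerically stable log-sum-exp for the softmax normalizer.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    log_probs = [v - log_z for v in logits]

    # Weight each class's negative log-probability by its target mass.
    return -sum(t * lp for t, lp in zip(soft_target, log_probs))


# One row of output embeddings (logits) and its mixed-up target.
loss = soft_cross_entropy([2.0, 0.0], [0.7, 0.3])
```

With a one-hot target this reduces to the usual cross-entropy on class indices.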