torch_frame.transforms.CatToNumTransform
- class CatToNumTransform
Bases: FittableBaseTransform

Transforms categorical features in TensorFrame using target statistics. The original transform is explained in the paper "A preprocessing scheme for high-cardinality categorical attributes in classification and prediction problems". Specifically, each categorical feature is transformed into a numerical feature using the m-probability estimate, defined by
\[\frac{n_c + p \cdot m}{n + m}\]
where \(n_c\) is the count of the category, \(n\) is the total count, \(p\) is the prior probability, and \(m\) is a smoothing factor.
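
The formula above can be evaluated directly to see how the smoothing factor behaves. The following is a minimal sketch in plain Python, not the library's implementation: the function name `m_probability_estimate` and the sample numbers are hypothetical, and the arguments simply mirror the symbols \(n_c\), \(n\), \(p\) and \(m\) as defined above.

```python
def m_probability_estimate(n_c: float, n: float, p: float, m: float) -> float:
    """Evaluate (n_c + p * m) / (n + m), the m-probability estimate.

    Hypothetical helper for illustration only; it is not part of torch_frame.
    """
    if n + m == 0:
        raise ValueError("n + m must be non-zero")
    return (n_c + p * m) / (n + m)


# Hypothetical values: n_c = 3, n = 10, prior p = 0.5.
n_c, n, p = 3, 10, 0.5

# With m = 0 the estimate is the raw ratio n_c / n;
# as m grows, the estimate is pulled toward the prior p.
for m in (0, 2, 1000):
    print(f"m={m}: {m_probability_estimate(n_c, n, p, m):.4f}")
```

The smoothing factor thus controls how strongly the estimate is shrunk toward the prior, which keeps rarely seen categories from getting extreme encoded values.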