TabModel

This is an implementation created by Ignacio Oguiza (oguiza@timeseriesAI.co) based on fastai’s TabularModel.

We built it so that it’s easy to change the head of the model, something that is particularly interesting when building hybrid models.


source

TabHead

 TabHead (emb_szs, n_cont, c_out, layers=None, fc_dropout=None,
          y_range=None, use_bn=True, bn_final=False, lin_first=False,
          act=ReLU(inplace=True), skip=False)

Basic head for tabular data.


source

TabBackbone

 TabBackbone (emb_szs, n_cont, embed_p=0.0, bn_cont=True)

Same as nn.Module, but no need for subclasses to call super().__init__


source

TabModel

 TabModel (emb_szs, n_cont, c_out, layers=None, fc_dropout=None,
           embed_p=0.0, y_range=None, use_bn=True, bn_final=False,
           bn_cont=True, lin_first=False, act=ReLU(inplace=True),
           skip=False)

Basic model for tabular data.

from fastai.tabular.core import *
from tsai.data.tabular import *
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
# df['salary'] = np.random.rand(len(df)) # uncomment to simulate a cont dependent variable
procs = [Categorify, FillMissing, Normalize]
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
y_names = ['salary']
y_block = RegressionBlock() if isinstance(df['salary'].values[0], float) else CategoryBlock()
splits = RandomSplitter()(range_of(df))
pd.options.mode.chained_assignment=None
to = TabularPandas(df, procs=procs, cat_names=cat_names, cont_names=cont_names, y_names=y_names, y_block=y_block, splits=splits, inplace=True, 
                   reduce_memory=False)
to.show(5)
tab_dls = to.dataloaders(bs=16, val_bs=32)
b = first(tab_dls.train)
test_eq((b[0].shape, b[1].shape, b[2].shape), (torch.Size([16, 7]), torch.Size([16, 3]), torch.Size([16, 1])))
workclass education marital-status occupation relationship race education-num_na age fnlwgt education-num salary
20505 Private HS-grad Married-civ-spouse Sales Husband White False 47.0 197836.0 9.0 <50k
28679 Private HS-grad Married-civ-spouse Craft-repair Husband White False 28.0 65078.0 9.0 >=50k
11669 Private HS-grad Never-married Adm-clerical Not-in-family White False 38.0 202683.0 9.0 <50k
29079 Self-emp-not-inc Bachelors Married-civ-spouse Prof-specialty Husband White False 41.0 168098.0 13.0 <50k
7061 Private HS-grad Married-civ-spouse Adm-clerical Husband White False 31.0 243442.0 9.0 <50k
tab_model = build_tabular_model(TabModel, dls=tab_dls)
b = first(tab_dls.train)
test_eq(tab_model.to(b[0].device)(*b[:-1]).shape, (tab_dls.bs, tab_dls.c))
learn = Learner(tab_dls, tab_model, splitter=ts_splitter)
p1 = count_parameters(learn.model)
learn.freeze()
p2 = count_parameters(learn.model)
learn.unfreeze()
p3 = count_parameters(learn.model)
assert p1 == p3
assert p1 > p2 > 0