FastAI Course - Lesson 1 - Models

Quick evaluation of different models introduced in first lesson
Published: February 4, 2021

I have finished watching the first lesson now. It would be great to actually run each of the types of models mentioned in the lesson and then perform a prediction with each. So let's just smash through them.


CNN Classifier

Code
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
cnn_dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

cnn_learn = cnn_learner(cnn_dls, resnet34, metrics=error_rate)
cnn_learn.fine_tune(1)
epoch train_loss valid_loss error_rate time
0 0.177180 0.027871 0.009472 00:11
epoch train_loss valid_loss error_rate time
0 0.033203 0.025562 0.005413 00:14
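The labelling trick here relies on a naming convention in the Oxford-IIIT Pet dataset: cat images are named after the breed with a capital letter, dog images with a lowercase one. A quick standalone check of `is_cat` (the filenames below are just examples of that convention):

```python
# Cat filenames start with a capitalised breed name, dog filenames lowercase
def is_cat(x): return x[0].isupper()

print(is_cat("Birman_121.jpg"))  # True  -> labelled as a cat
print(is_cat("pug_52.jpg"))      # False -> labelled as a dog
```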
Code
cnn_learn.show_results()

Code
Image.open(cnn_dls.valid.items[0])

Code
cnn_learn.predict(cnn_dls.valid.items[0])
('False', tensor(0), tensor([1.0000e+00, 9.3083e-08]))
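The tuple that `predict` returns is (decoded label, label index, per-class probabilities), with the probabilities in the order of the dataloader's vocab. For the `is_cat` labels the vocab is `[False, True]`, so the first probability being essentially 1.0 means "not a cat". A plain-Python re-reading of the output above:

```python
# Mirror of the predict output above; probabilities follow the vocab order
vocab = [False, True]              # is_cat labels: False = dog, True = cat
probs = [1.0000e+00, 9.3083e-08]   # copied from the prediction tuple
idx = max(range(len(probs)), key=probs.__getitem__)
print(vocab[idx], probs[idx])      # False 1.0 -> confidently "not a cat"
```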

This looks like it will be extremely easy!


UNet Segmentation

Code
from fastai.vision.all import *
path = untar_data(URLs.CAMVID_TINY)
segmentation_dls = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames = get_image_files(path/"images"),
    label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes = np.loadtxt(path/'codes.txt', dtype=str)
)

segmentation_learn = unet_learner(segmentation_dls, resnet34)
segmentation_learn.fine_tune(8)
epoch train_loss valid_loss time
0 3.659997 2.537637 00:02
epoch train_loss valid_loss time
0 2.103605 1.806246 00:01
1 1.754702 1.421853 00:01
2 1.542019 1.316897 00:01
3 1.405152 1.076015 00:01
4 1.268974 0.876026 00:01
5 1.140154 0.842787 00:01
6 1.037381 0.811368 00:01
7 0.956515 0.810816 00:01
Code
segmentation_learn.show_results(max_n=6, figsize=(7,8))

Code
Image.open(segmentation_dls.valid.items[0])

Code
outputs = segmentation_learn.predict(segmentation_dls.valid.items[0])
len(outputs)
3
Code
outputs[0].show() ; None

Code
outputs[1].show() ; None

I guess the houses and the tree are OK, the road is much worse, and the markings are missing. It looks bad, but this is actually reasonably accurate. Segmentation is hard!
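One way to put a number on "reasonably accurate" is per-pixel accuracy: the fraction of pixels whose predicted class index matches the label mask. A minimal numpy sketch with made-up 2x2 masks (not the actual CamVid output):

```python
import numpy as np

# Hypothetical tiny masks: each value is a class index (road, tree, ...)
pred   = np.array([[0, 1], [1, 2]])
target = np.array([[0, 1], [2, 2]])

accuracy = (pred == target).mean()  # 3 of the 4 pixels agree
print(accuracy)  # 0.75
```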


LSTM

This is quite an old technique by now. Hopefully soon they will be using transformers for this.

Code
from fastai.text.all import *

text_dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test', bs=32)
text_learn = text_classifier_learner(text_dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
text_learn.fine_tune(4, 1e-2)
epoch train_loss valid_loss accuracy time
0 0.477276 0.404300 0.822800 01:46
epoch train_loss valid_loss accuracy time
0 0.306954 0.352532 0.845080 03:16
1 0.235937 0.206796 0.918120 03:19
2 0.187152 0.183651 0.930120 03:21
3 0.154293 0.195161 0.929480 03:20
Code
print(text_dls.valid.items[0].read_text())
"Shivering Shakespeare" could be considered the first classic of the "Our Gang" talkie era. By now, Hal Roach Studios began to hit their stride in making talking pictures, and "Shakespeare" is the happy result.<br /><br />The Gang is appearing in a version of Quo Vadis produced by Kennedy the Cop's wife. The kids don't find the play very fun to be in and are distracted by people in the theatre and cannot remember their lines. Among the funniest bits are Kennedy the Cop as the giant, who pulls off his makeup to fight an overzealous man in a bull costume; and the terrible dancing girl (played by director Bob McGowan's daughter.)<br /><br />Several filmographies mention that "Shakespeare" has the first pie fight in a talkie. This may be true, seeing as they tried different speeds with the film during the fight. Buster Keaton's brother Harry is at the receiving end of one of the pies. Very funny and an early Gang talkie classic. 9 out of 10.
Code
text_learn.predict(text_dls.valid.items[0])
('pos', tensor(1), tensor([2.8925e-04, 9.9971e-01]))
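As with the image classifier, the last element of the tuple is the per-class probabilities, i.e. a softmax over the model's raw outputs, which is why the two values sum to 1. A minimal sketch of that final step (the logits below are invented, chosen to roughly reproduce the probabilities above):

```python
import math

def softmax(logits):
    # Exponentiate, then normalise so the outputs sum to 1
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits strongly favouring the "pos" class
probs = softmax([-4.0, 4.1])
print(probs)  # ~[0.0003, 0.9997], close to the prediction above
```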

They gave the movie 9 out of 10, so that certainly is a positive review.


Tabular Data

Code
from fastai.tabular.all import *
path = untar_data(URLs.ADULT_SAMPLE)

tabular_dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names="salary",
    cat_names = ['workclass', 'education', 'marital-status', 'occupation',
                 'relationship', 'race'],
    cont_names = ['age', 'fnlwgt', 'education-num'],
    procs = [Categorify, FillMissing, Normalize])

tabular_learn = tabular_learner(tabular_dls, metrics=accuracy)
tabular_learn.fit_one_cycle(3) # using a different training method here - not working from a pretrained model
epoch train_loss valid_loss accuracy time
0 0.364098 0.368269 0.827856 00:03
1 0.354876 0.359489 0.828317 00:03
2 0.341413 0.355986 0.832156 00:03
Code
tabular_dls.valid.items
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country salary education-num_na
9944 0.181847 5 -0.203521 16 -0.035090 3 2 6 5 Female 0 0 40 United-States 1 1
24761 -0.038134 5 -1.337343 16 -0.035090 3 15 1 5 Male 0 0 40 United-States 1 1
26474 0.915119 3 0.493768 14 -3.568862 3 9 1 5 Male 0 0 40 United-States 0 1
32353 -1.064716 5 0.161659 16 -0.035090 5 13 4 5 Female 5060 0 30 United-States 0 1
12069 -1.211370 5 -1.461741 12 -0.427732 5 9 2 5 Female 0 0 40 United-States 0 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9161 0.915119 2 0.248167 16 -0.035090 3 7 1 4 Male 0 0 40 ? 0 1
16570 1.061774 5 -1.114666 16 -0.035090 3 5 1 5 Male 0 0 55 United-States 0 1
6378 -0.038134 6 1.161370 16 -0.035090 3 13 1 5 Male 0 0 50 United-States 1 1
19639 1.721719 5 -0.327465 12 -0.427732 3 5 1 5 Male 0 0 45 United-States 1 1
30992 -0.404771 5 0.116361 13 1.535475 3 11 1 5 Male 0 0 50 United-States 0 1

6512 rows × 16 columns
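The continuous columns above no longer look like ages or weights because the Normalize proc has already standardised them: each value becomes (x - mean) / std, using statistics computed from the training set. Conceptually (the mean and std below are invented, not the real adult.csv statistics):

```python
# Sketch of what the Normalize proc does to a continuous column.
# fastai computes mean/std from the training set; these are made up.
def normalize(x, mean, std):
    return (x - mean) / std

print(normalize(50.0, 38.6, 13.6))  # a 50-year-old maps to roughly +0.84
```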

Code
tabular_learn.cpu() ; tabular_dls.cpu() ;

tabular_learn.predict(tabular_dls.valid.items.iloc[0])
0.00% [0/1 00:00<00:00]
IndexError: Target -1 is out of bounds.

So this hasn’t worked out. A pity.


Collaborative Filtering

Code
from fastai.collab import *
path = untar_data(URLs.ML_SAMPLE)
collab_dls = CollabDataLoaders.from_csv(path/'ratings.csv')
collab_learn = collab_learner(collab_dls, y_range=(0.5,5.5))
collab_learn.fine_tune(10)
epoch train_loss valid_loss time
0 1.497408 1.427355 00:00
epoch train_loss valid_loss time
0 1.372053 1.366258 00:00
1 1.270144 1.181694 00:00
2 1.023034 0.875672 00:00
3 0.794844 0.730298 00:00
4 0.693325 0.696969 00:00
5 0.645092 0.686418 00:00
6 0.632014 0.683180 00:00
7 0.615656 0.681403 00:00
8 0.602203 0.680848 00:00
9 0.595254 0.680619 00:00
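The `y_range=(0.5, 5.5)` argument makes the model squash its raw output through a scaled sigmoid so predicted ratings always land inside that interval (fastai calls this `sigmoid_range`). A standalone sketch of the idea:

```python
import math

# What y_range=(0.5, 5.5) does: squash a raw score into the rating range
def sigmoid_range(x, lo, hi):
    return lo + (hi - lo) / (1 + math.exp(-x))

print(sigmoid_range(0.0, 0.5, 5.5))   # 3.0  -> a raw score of 0 is mid-range
print(sigmoid_range(10.0, 0.5, 5.5))  # ~5.5 -> large scores saturate
```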
Code
collab_dls.valid.items
userId movieId rating timestamp
1607 7 60 2.0 1467005230
36 68 9 3.0 939691973
734 18 76 2.5 1138998912
2792 25 18 3.0 977725027
1554 9 52 3.0 1163253049
... ... ... ... ...
4020 2 45 4.5 1127469230
2657 89 82 1.0 1138705003
29 28 23 1.0 1462637943
5242 14 14 3.0 1101130498
3686 30 70 3.0 955092956

1206 rows × 4 columns

Code
collab_dls.cpu() ; collab_learn.cpu() ;

collab_learn.predict(collab_dls.valid.items.iloc[0])
TypeError: list indices must be integers or slices, not list

Another failure. I haven't really used tabular or collaborative models before, so there is probably something I am missing here. It's a shame that the "take something from the valid items" approach does not work on every learner.

I am impressed with how easy this is.