FastAI Course - Lesson 1 - Models

Quick evaluation of different models introduced in first lesson
Published: February 4, 2021

I have finished watching the first lesson now. It would be great to actually run each of the types of models mentioned in the lesson and then perform a prediction with each. So let's just smash through them.


CNN Classifier

Code
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()
cnn_dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

cnn_learn = cnn_learner(cnn_dls, resnet34, metrics=error_rate)
cnn_learn.fine_tune(1)
epoch train_loss valid_loss error_rate time
0 0.177180 0.027871 0.009472 00:11
epoch train_loss valid_loss error_rate time
0 0.033203 0.025562 0.005413 00:14
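The labelling trick here relies on a naming convention in the Oxford-IIIT Pet dataset: cat images are named after the breed with a capital letter, dog images with a lowercase one. A quick standalone check of `is_cat` (the filenames below are just examples of that convention):

```python
# Cat filenames start with a capitalised breed name, dog filenames lowercase
def is_cat(x): return x[0].isupper()

print(is_cat("Birman_121.jpg"))  # True  -> labelled as a cat
print(is_cat("pug_52.jpg"))      # False -> labelled as a dog
```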
Code
cnn_learn.show_results()

Code
Image.open(cnn_dls.valid.items[0])

Code
cnn_learn.predict(cnn_dls.valid.items[0])
('False', tensor(0), tensor([1.0000e+00, 9.3083e-08]))
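The tuple that `predict` returns is (decoded label, label index, per-class probabilities), with the probabilities in the order of the dataloader's vocab. For the `is_cat` labels the vocab is `[False, True]`, so the first probability being essentially 1.0 means "not a cat". A plain-Python re-reading of the output above:

```python
# Mirror of the predict output above; probabilities follow the vocab order
vocab = [False, True]              # is_cat labels: False = dog, True = cat
probs = [1.0000e+00, 9.3083e-08]   # copied from the prediction tuple
idx = max(range(len(probs)), key=probs.__getitem__)
print(vocab[idx], probs[idx])      # False 1.0 -> confidently "not a cat"
```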

This looks like it will be extremely easy!


UNet Segmentation

Code
from fastai.vision.all import *
path = untar_data(URLs.CAMVID_TINY)
segmentation_dls = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames = get_image_files(path/"images"),
    label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes = np.loadtxt(path/'codes.txt', dtype=str)
)

segmentation_learn = unet_learner(segmentation_dls, resnet34)
segmentation_learn.fine_tune(8)
epoch train_loss valid_loss time
0 3.659997 2.537637 00:02
epoch train_loss valid_loss time
0 2.103605 1.806246 00:01
1 1.754702 1.421853 00:01
2 1.542019 1.316897 00:01
3 1.405152 1.076015 00:01
4 1.268974 0.876026 00:01
5 1.140154 0.842787 00:01
6 1.037381 0.811368 00:01
7 0.956515 0.810816 00:01
Code
segmentation_learn.show_results(max_n=6, figsize=(7,8))

Code
Image.open(segmentation_dls.valid.items[0])

Code
outputs = segmentation_learn.predict(segmentation_dls.valid.items[0])
len(outputs)
3
Code
outputs[0].show() ; None

Code
outputs[1].show() ; None

I guess the houses and the tree are OK, the road is much worse, and the markings are missing. It looks bad, but this is actually reasonably accurate. Segmentation is hard!
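One way to put a number on "reasonably accurate" is per-pixel accuracy: the fraction of pixels whose predicted class index matches the label mask. A minimal numpy sketch with made-up 2x2 masks (not the actual CamVid output):

```python
import numpy as np

# Hypothetical tiny masks: each value is a class index (road, tree, ...)
pred   = np.array([[0, 1], [1, 2]])
target = np.array([[0, 1], [2, 2]])

accuracy = (pred == target).mean()  # 3 of the 4 pixels agree
print(accuracy)  # 0.75
```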


LSTM

This is quite an old technique by now. Hopefully soon they will be using transformers for this.

Code
from fastai.text.all import *

text_dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test', bs=32)
text_learn = text_classifier_learner(text_dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
text_learn.fine_tune(4, 1e-2)
epoch train_loss valid_loss accuracy time
0 0.477276 0.404300 0.822800 01:46
epoch train_loss valid_loss accuracy time
0 0.306954 0.352532 0.845080 03:16
1 0.235937 0.206796 0.918120 03:19
2 0.187152 0.183651 0.930120 03:21
3 0.154293 0.195161 0.929480 03:20
Code
print(text_dls.valid.items[0].read_text())
"Shivering Shakespeare" could be considered the first classic of the "Our Gang" talkie era. By now, Hal Roach Studios began to hit their stride in making talking pictures, and "Shakespeare" is the happy result.<br /><br />The Gang is appearing in a version of Quo Vadis produced by Kennedy the Cop's wife. The kids don't find the play very fun to be in and are distracted by people in the theatre and cannot remember their lines. Among the funniest bits are Kennedy the Cop as the giant, who pulls off his makeup to fight an overzealous man in a bull costume; and the terrible dancing girl (played by director Bob McGowan's daughter.)<br /><br />Several filmographies mention that "Shakespeare" has the first pie fight in a talkie. This may be true, seeing as they tried different speeds with the film during the fight. Buster Keaton's brother Harry is at the receiving end of one of the pies. Very funny and an early Gang talkie classic. 9 out of 10.
Code
text_learn.predict(text_dls.valid.items[0])
('pos', tensor(1), tensor([2.8925e-04, 9.9971e-01]))
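As with the image classifier, the last element of the tuple is the per-class probabilities, i.e. a softmax over the model's raw outputs, which is why the two values sum to 1. A minimal sketch of that final step (the logits below are invented, chosen to roughly reproduce the probabilities above):

```python
import math

def softmax(logits):
    # Exponentiate, then normalise so the outputs sum to 1
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits strongly favouring the "pos" class
probs = softmax([-4.0, 4.1])
print(probs)  # ~[0.0003, 0.9997], close to the prediction above
```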

They gave the movie 9 out of 10, so that certainly is a positive review.


Tabular Data

Code
from fastai.tabular.all import *
path = untar_data(URLs.ADULT_SAMPLE)

tabular_dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names="salary",
    cat_names = ['workclass', 'education', 'marital-status', 'occupation',
                 'relationship', 'race'],
    cont_names = ['age', 'fnlwgt', 'education-num'],
    procs = [Categorify, FillMissing, Normalize])

tabular_learn = tabular_learner(tabular_dls, metrics=accuracy)
tabular_learn.fit_one_cycle(3) # using a different training method here - not working from a pretrained model
epoch train_loss valid_loss accuracy time
0 0.364098 0.368269 0.827856 00:03
1 0.354876 0.359489 0.828317 00:03
2 0.341413 0.355986 0.832156 00:03
Code
tabular_dls.valid.items
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country salary education-num_na
9944 0.181847 5 -0.203521 16 -0.035090 3 2 6 5 Female 0 0 40 United-States 1 1
24761 -0.038134 5 -1.337343 16 -0.035090 3 15 1 5 Male 0 0 40 United-States 1 1
26474 0.915119 3 0.493768 14 -3.568862 3 9 1 5 Male 0 0 40 United-States 0 1
32353 -1.064716 5 0.161659 16 -0.035090 5 13 4 5 Female 5060 0 30 United-States 0 1
12069 -1.211370 5 -1.461741 12 -0.427732 5 9 2 5 Female 0 0 40 United-States 0 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9161 0.915119 2 0.248167 16 -0.035090 3 7 1 4 Male 0 0 40 ? 0 1
16570 1.061774 5 -1.114666 16 -0.035090 3 5 1 5 Male 0 0 55 United-States 0 1
6378 -0.038134 6 1.161370 16 -0.035090 3 13 1 5 Male 0 0 50 United-States 1 1
19639 1.721719 5 -0.327465 12 -0.427732 3 5 1 5 Male 0 0 45 United-States 1 1
30992 -0.404771 5 0.116361 13 1.535475 3 11 1 5 Male 0 0 50 United-States 0 1

6512 rows × 16 columns
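The continuous columns above no longer look like ages or weights because the Normalize proc has already standardised them: each value becomes (x - mean) / std, using statistics computed from the training set. Conceptually (the mean and std below are invented, not the real adult.csv statistics):

```python
# Sketch of what the Normalize proc does to a continuous column.
# fastai computes mean/std from the training set; these are made up.
def normalize(x, mean, std):
    return (x - mean) / std

print(normalize(50.0, 38.6, 13.6))  # a 50-year-old maps to roughly +0.84
```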

Code
tabular_learn.cpu() ; tabular_dls.cpu() ;

tabular_learn.predict(tabular_dls.valid.items.iloc[0])
0.00% [0/1 00:00<00:00]
IndexError: Target -1 is out of bounds.

So this hasn’t worked out. A pity.


Collaborative Filtering

Code
from fastai.collab import *
path = untar_data(URLs.ML_SAMPLE)
collab_dls = CollabDataLoaders.from_csv(path/'ratings.csv')
collab_learn = collab_learner(collab_dls, y_range=(0.5,5.5))
collab_learn.fine_tune(10)
epoch train_loss valid_loss time
0 1.497408 1.427355 00:00
epoch train_loss valid_loss time
0 1.372053 1.366258 00:00
1 1.270144 1.181694 00:00
2 1.023034 0.875672 00:00
3 0.794844 0.730298 00:00
4 0.693325 0.696969 00:00
5 0.645092 0.686418 00:00
6 0.632014 0.683180 00:00
7 0.615656 0.681403 00:00
8 0.602203 0.680848 00:00
9 0.595254 0.680619 00:00
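The `y_range=(0.5, 5.5)` argument makes the model squash its raw output through a scaled sigmoid so predicted ratings always land inside that interval (fastai calls this `sigmoid_range`). A standalone sketch of the idea:

```python
import math

# What y_range=(0.5, 5.5) does: squash a raw score into the rating range
def sigmoid_range(x, lo, hi):
    return lo + (hi - lo) / (1 + math.exp(-x))

print(sigmoid_range(0.0, 0.5, 5.5))   # 3.0  -> a raw score of 0 is mid-range
print(sigmoid_range(10.0, 0.5, 5.5))  # ~5.5 -> large scores saturate
```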
Code
collab_dls.valid.items
userId movieId rating timestamp
1607 7 60 2.0 1467005230
36 68 9 3.0 939691973
734 18 76 2.5 1138998912
2792 25 18 3.0 977725027
1554 9 52 3.0 1163253049
... ... ... ... ...
4020 2 45 4.5 1127469230
2657 89 82 1.0 1138705003
29 28 23 1.0 1462637943
5242 14 14 3.0 1101130498
3686 30 70 3.0 955092956

1206 rows × 4 columns

Code
collab_dls.cpu() ; collab_learn.cpu() ;

collab_learn.predict(collab_dls.valid.items.iloc[0])
TypeError: list indices must be integers or slices, not list

Another failure. I haven't really used tabular or collaborative models before, so there is probably something I am missing here. It's a shame that the "take something from the valid items" approach does not work on every learner.

I am impressed with how easy this is.