
Prerequisites and Introduction

This post assumes you know Python decently well (though I do explain a good chunk of the code, so it shouldn't be a big issue if you don't) and that you know how neural networks work.

If you would like to learn how neural networks work, I highly recommend this roughly 20-minute video by 3Blue1Brown.

Most of my knowledge for this post came from the first two lessons of Practical Deep Learning for Coders by fast.ai. This post is intended as a gentle introduction to programming A.I. and is intentionally surface level; I hope it demystifies the process of creating simple models for your own projects, for anyone interested in A.I. safety.

Jupyter Notebook

Jupyter Notebook is an interactive coding environment that allows us to write Python code in small chunks (called cells) and see the output any time we run one of these cells.

Not only does this encourage regularly "sanity checking" our code (a good habit in programming, since we catch silly errors quickly before they become major issues), but we also see each cell's output as we go, which is especially useful in conjunction with packages such as matplotlib.
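For example, a throwaway cell like this (a toy snippet, unrelated to the classifier we'll build) renders its plot directly below itself:

import matplotlib.pyplot as plt

xs = list(range(10))
plt.plot(xs, [x ** 2 for x in xs])  # the figure appears right under the cell
plt.show()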

A notebook can even be turned into a simple app using widgets (you can also deploy these apps, but that isn't covered in this post).

We'll be using Google Colab, an online Jupyter Notebook environment that's simple to use. Open up a fresh notebook via File > New Notebook and let's get started.

fast.ai

fast.ai is a deep learning library that provides high-level tools that make building models much simpler (by cutting through a lot of the boilerplate).

We'll be using the fastbook package, since it bundles fastai together with a few extra tools (such as the previously mentioned widgets). Copy and paste each code snippet into its own cell and hit Ctrl/Cmd + Enter to run the cell.

!pip install fastbook==0.0.17
!pip uninstall tornado -y
!yes | pip install tornado==5.1.0  # downgrade tornado to avoid a notebook compatibility issue
from fastbook import *
from fastai.vision.widgets import *

Getting our data

I recommend using the Bing Image Search API (it's free for 1,000 calls per month, and you can reuse it later to build your own classifier for other types of images)[1], as fastai lets us use it to easily fetch about 150 images per search term.

For this example, we're going to train our neural network to classify three types of bears: grizzly, black and teddy.

We provide our API key:

key = os.environ.get('AZURE_SEARCH_KEY', 'ENTER_THE_API_KEY_HERE')  # uses the env var if set, otherwise the placeholder

And set the types and the path name (where the images and their folders will be):

bear_types = 'grizzly', 'black', 'teddy'
path = Path('bears')

Here's the main loop to get the images from Bing. It checks whether the directory at the path we supplied exists (creating it if not) and, for each type, creates a folder filled with about 150 results from Bing Images:

if not path.exists():
    path.mkdir()
    for o in bear_types:
        dest = (path/o)
        dest.mkdir(exist_ok=True)  # one folder per bear type
        results = search_images_bing(key, f'{o} bear')  # e.g. searches for "grizzly bear"
        download_images(dest, urls=results.attrgot('contentUrl'))  # save the results into the folder

We now have our dataset organized into folders. However, there is a reasonable chance that some of the downloads failed, so we do a bit of data cleaning by unlinking (deleting) any images that fail the verify_images check provided by fastai:

failed = verify_images(get_image_files(path))  # find images that can't be opened
failed.map(Path.unlink)                        # delete each failed download
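If you're curious how much was cleaned up, a quick optional check:

print(f'Removed {len(failed)} broken images; {len(get_image_files(path))} images remain.')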

Setting up our neural network

To get our data into the neural network, we use fast.ai's DataBlock and DataLoaders classes, which let us split the images into two sets: the training set (used to actually train the neural network) and the validation set (a valid_pct fraction of the images that is held out of training and used to check the model's performance on images it has never seen, which helps us catch overfitting):

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # inputs are images, labels are categories
    get_items=get_image_files,                        # collect every image file under the path
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # hold out 20% of images for validation
    get_y=parent_label,                               # label each image by its folder name
    item_tfms=Resize(128)                             # resize every image to 128x128
)
bears = bears.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),  # take a random crop of each image instead
    batch_tfms=aug_transforms()                       # apply standard data augmentations per batch
)
dls = bears.dataloaders(path)
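To sanity-check everything so far, fastai can show us a sample batch with the augmentations applied:

dls.show_batch(max_n=6)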

ResNet is a model architecture that comes pre-trained through PyTorch, which lets us take advantage of a powerful technique called transfer learning: we start from a model with predefined weights and update those weights slightly to adapt to the new data.

This works because the initial hidden layers search for very general features (simple lines, polygons and circles) that apply almost everywhere. This saves us a lot of compute, as we only have to change a few weights in the later hidden layers to work on our new data. (Note: this is an oversimplification, and this would be a great stopping point to do some further reading. Here's a paper to get you started with this fascinating topic.)
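To make "general features like simple lines" a little more concrete, here's a minimal PyTorch sketch (separate from our fastai code) of a single edge-detecting filter, the kind of thing an early layer might learn:

import torch
import torch.nn.functional as F

# a classic 3x3 edge-detection kernel: its output is large wherever pixel values change sharply
kernel = torch.tensor([[-1., -1., -1.],
                       [-1.,  8., -1.],
                       [-1., -1., -1.]]).reshape(1, 1, 3, 3)
image = torch.rand(1, 1, 8, 8)      # a random 8x8 grayscale "image"
features = F.conv2d(image, kernel)  # one filter pass, like a single CNN channel
print(features.shape)               # torch.Size([1, 1, 6, 6])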

We call cnn_learner (CNN stands for convolutional neural network; as an oversimplified explanation, a CNN repeatedly applies filters that search for specific pixel gradients, and finally spits out a classification like a regular neural network) and fine_tune it, with the number of epochs as a parameter, to train our model:

learn = cnn_learner(dls, resnet18, metrics=error_rate)  # resnet18 is the smallest standard ResNet
learn.fine_tune(1)                                      # train for one epoch
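Under the hood, fine_tune does roughly the following (a simplified sketch of fastai's behaviour, shown for intuition rather than for you to run):

learn.freeze()          # first train only the newly added final layers
learn.fit_one_cycle(1)
learn.unfreeze()        # then let the pre-trained weights update slightly
learn.fit_one_cycle(1)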

Exporting and getting a prediction

We export our model to a pickle file:

learn.export()  # saves the model to 'export.pkl' by default

path = Path()
path.ls(file_exts='.pkl')  # confirm the exported file is in the current directory

With this, we can create an inference learner, which we reference any time we want to call the model to make a prediction for a new image it hasn't seen before (making such a prediction is called inference):

learn_inf = load_learner(path/'export.pkl')
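Before building any widgets, you can optionally sanity-check the loaded learner on one of the images we downloaded earlier:

test_img = get_image_files(Path('bears'))[0]  # grab any downloaded image
print(learn_inf.predict(test_img))            # (label, label index, probabilities)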

With widgets, we can supply our image by creating a simple upload button with its own state:

btn_upload = widgets.FileUpload()
btn_upload

We save the uploaded image to a variable and create an output widget for displaying it later:

img = PILImage.create(btn_upload.data[-1])  # the most recently uploaded file
out_pl = widgets.Output()                   # an output widget to display the image in
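If you'd like to actually show the uploaded image inside the output widget (as the fastbook notebook does), you can render a thumbnail:

out_pl.clear_output()
with out_pl: display(img.to_thumb(128, 128))  # draw a 128x128 thumbnail into the widget
out_pl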

And we get our prediction and probability from the inference variable we created earlier:

pred, pred_idx, probs = learn_inf.predict(img)

Finally, we display the result in a label widget and get our prediction:

lbl_pred = widgets.Label()
lbl_pred.value = f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}'
lbl_pred
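To tie everything together into the simple in-notebook app promised earlier, you can wire the widgets up roughly the way the fastbook notebook does, re-running the prediction whenever a Classify button is clicked:

btn_run = widgets.Button(description='Classify')

def on_click_classify(change):
    img = PILImage.create(btn_upload.data[-1])
    out_pl.clear_output()
    with out_pl: display(img.to_thumb(128, 128))
    pred, pred_idx, probs = learn_inf.predict(img)
    lbl_pred.value = f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}'

btn_run.on_click(on_click_classify)
VBox([widgets.Label('Select your bear!'), btn_upload, btn_run, out_pl, lbl_pred])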

A slight warning on making a classifier for a human attribute

I encourage you to make your own classifier for any object or animal you fancy. I don't recommend making a classifier on some human attribute.

For example: say we wanted to classify whether a skin tumour is malignant or benign. If we go over to Bing and search for "malignant skin tumour" (which I won't show here), we find that the first page mainly consists of images of fair skin tones, meaning a classifier trained on it would work much less well for a person of colour.

Obviously, your model isn't going to be used by oncologists, but this serves as a reminder that we need to be careful with the data we feed our models, powerful or not.

  1. ^

    As an alternative, you can download the dataset we'll be using via this command: !npx degit y-arjun-y/bears -f. If you do, run the bear_types cell and then skip to the "Setting up our neural network" heading.
