Matthew’s Blog - New Hugging Face Agents course

Hugging Face have released a course on agents, which is available here. I’ve been working on agents for a task at work, and I evaluated smolagents shortly after it was released. Since the first section of this course uses smolagents directly, I thought it would be fun to follow along with the tutorial and recreate it locally.

I’m not going to go over the structure of agents or how they work. Instead, this post focuses on the tool definitions and on creating and using an agent.

Tools

The course starts with the format for the tools. Tools are invocable Python functions, and they are presented to the agent by using metaprogramming to extract the name, arguments, return type and description of each tool. These details are then composed into a text description of the tool. When the agent wants to use a tool, it responds with a specially formatted message that invokes the tool. Once the tool has been invoked, the return value is presented to the agent, which can then proceed.
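
To make the metaprogramming step concrete, here is a minimal sketch of how a function's name, arguments, return type and docstring could be extracted and composed into a text description using only the standard library. This is not the smolagents implementation; `describe_tool` and `add_numbers` are hypothetical names used purely for illustration.

```python
import inspect


def describe_tool(func) -> str:
    """Compose a plain-text description of a function for an agent prompt.

    Hypothetical helper: illustrates the idea, not the smolagents internals.
    """
    signature = inspect.signature(func)
    # Render each parameter as "name: type", using the annotation's name.
    args = ", ".join(
        f"{name}: {param.annotation.__name__}"
        for name, param in signature.parameters.items()
    )
    return_type = signature.return_annotation.__name__
    # Use the first line of the docstring as the short description.
    summary = inspect.getdoc(func).splitlines()[0]
    return f"{func.__name__}({args}) -> {return_type}: {summary}"


def add_numbers(a: int, b: int) -> int:
    """Add two integers together.

    Args:
        a: the first number
        b: the second number
    """
    return a + b


description = describe_tool(add_numbers)
print(description)
```

This is also why the type hints and the docstring format matter in the tool definitions: they are the only information the agent gets about what a tool does.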

The following code defines a dummy tool and a timezone conversion tool:


from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel, load_tool, tool
import datetime
import requests
import pytz
import yaml
from tools.final_answer import FinalAnswerTool

@tool
def my_custom_tool(arg1:str, arg2:int)-> str: # it's important to specify the return type
    # Keep this format for the tool description / args description but feel free to modify the tool
    """A tool that does nothing yet 
    Args:
        arg1: the first argument
        arg2: the second argument
    """
    return "What magic will you build ?"

@tool
def get_current_time_in_timezone(timezone: str) -> str:
    """A tool that fetches the current local time in a specified timezone.
    Args:
        timezone: A string representing a valid timezone (e.g., 'America/New_York').
    """
    try:
        # Create timezone object
        tz = pytz.timezone(timezone)
        # Get current time in that timezone
        local_time = datetime.datetime.now(tz).strftime("%Y-%m-%d %H:%M:%S")
        return f"The current local time in {timezone} is: {local_time}"
    except Exception as e:
        return f"Error fetching time for timezone '{timezone}': {str(e)}"

We can define our own tools easily enough. Since there is a timezone tool, it would be nice if the agent could create a picture of a location that has the correct weather and time of day. This would involve three tools:

timezone conversion
weather api
image generation

Let’s see if this can be done. The timezone conversion from the definition above can be used already.

Weather Tool

Open-Meteo is a free and open source weather API. It takes a latitude and longitude, which means we now need a way to resolve a location name into coordinates. Nominatim is a free and open source geocoding API that seems quite simple to use.

Let’s start with the geocoding and then move on to the weather. To make it easier for the agent, I am going to combine these into a single tool. (This might be the wrong thing to do: there are often multiple places with the same name, and with separate tools the agent could disambiguate between them.)


import httpx
from http import HTTPStatus
from dataclasses import dataclass

@dataclass
class GeocodedLocation:
    name: str
    latitude: float
    longitude: float

def geocode_location(location: str) -> GeocodedLocation:
    response = httpx.get(
        "https://nominatim.openstreetmap.org/search",
        params={
            "q": location,
            "format": "json",
        },
    )
    assert response.status_code == HTTPStatus.OK

    first_result = response.json()[0]
    name = first_result["display_name"]
    latitude = float(first_result["lat"])
    longitude = float(first_result["lon"])
    return GeocodedLocation(
        name=name,
        latitude=latitude,
        longitude=longitude,
    )

london_location = geocode_location("London")
london_location

GeocodedLocation(name='London, Greater London, England, United Kingdom', latitude=51.5074456, longitude=-0.1277653)
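
On the disambiguation point: one option would be to show the agent a short list of candidate places instead of silently taking the first result. The sketch below formats a list of Nominatim-style results (dicts with the `display_name`, `lat` and `lon` keys read above) into numbered candidates. `format_candidates` is a hypothetical helper, the sample data is made up for illustration, and no network calls are made.

```python
def format_candidates(results: list[dict], limit: int = 3) -> str:
    """Format geocoding results as a numbered list the agent can choose from.

    Hypothetical helper: expects dicts with the "display_name", "lat" and
    "lon" keys that geocode_location reads from the Nominatim response.
    """
    if not results:
        return "No matching locations found."
    lines = []
    for index, result in enumerate(results[:limit], start=1):
        latitude = float(result["lat"])
        longitude = float(result["lon"])
        lines.append(
            f"{index}. {result['display_name']} ({latitude:.2f}, {longitude:.2f})"
        )
    return "\n".join(lines)


# Illustrative sample data only, not a real API response.
sample_results = [
    {"display_name": "London, Greater London, England, United Kingdom",
     "lat": "51.5074456", "lon": "-0.1277653"},
    {"display_name": "London, Ontario, Canada",
     "lat": "42.98", "lon": "-81.24"},
]
print(format_candidates(sample_results))
```

Handling the empty list explicitly also avoids the `IndexError` that `response.json()[0]` raises when a place name has no matches.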


import httpx
from http import HTTPStatus
from dataclasses import dataclass

@dataclass
class LocationWeather:
    is_day: bool
    humidity: str
    apparent_temperature: str
    precipitation: str
    snowfall: str
    cloud_cover: str
    wind_speed: str
    wind_gusts: str

def weather_at_coordinates(latitude: float, longitude: float) -> LocationWeather:
    response = httpx.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": round(latitude, 2),
            "longitude": round(longitude, 2),
            "current": (
                "is_day,"
                "relative_humidity_2m,"
                "apparent_temperature,"
                "precipitation,"
                "snowfall,"
                "cloud_cover,"
                "wind_speed_10m,"
                "wind_gusts_10m"
            ),
            "format": "json",
        },
    )
    assert response.status_code == HTTPStatus.OK

    result = response.json()
    units = result["current_units"]
    values = result["current"]

    def _get_value_and_unit(name: str) -> str:
        value = values[name]
        unit = units[name]
        return f"{value} {unit}"
    
    is_day = bool(values["is_day"])
    humidity = _get_value_and_unit("relative_humidity_2m")
    apparent_temperature = _get_value_and_unit("apparent_temperature")
    precipitation = _get_value_and_unit("precipitation")
    snowfall = _get_value_and_unit("snowfall")
    cloud_cover = _get_value_and_unit("cloud_cover")
    wind_speed = _get_value_and_unit("wind_speed_10m")
    wind_gusts = _get_value_and_unit("wind_gusts_10m")

    return LocationWeather(
        is_day=is_day,
        humidity=humidity,
        apparent_temperature=apparent_temperature,
        precipitation=precipitation,
        snowfall=snowfall,
        cloud_cover=cloud_cover,
        wind_speed=wind_speed,
        wind_gusts=wind_gusts,
    )

weather_at_coordinates(latitude=london_location.latitude, longitude=london_location.longitude)

LocationWeather(is_day=False, humidity='75 %', apparent_temperature='-2.6 °C', precipitation='0.0 mm', snowfall='0.0 cm', cloud_cover='27 %', wind_speed='9.6 km/h', wind_gusts='26.3 km/h')


from smolagents import tool
import httpx

@tool
def get_current_weather_in_location(location: str) -> str:
    """A tool that fetches the current weather for a specified location.
    You should use this to make your picture of that location more accurate by including the current weather.
    Args:
        location: A string containing a valid place name (e.g., 'London, UK').
    """

    try:
        coordinates = geocode_location(location)
    except Exception as e:
        return f"Error geocoding location '{location}': {str(e)}"

    try:
        weather = weather_at_coordinates(
            latitude=coordinates.latitude,
            longitude=coordinates.longitude,
        )
    except Exception as e:
        return f"Error fetching weather for location '{location}': {str(e)}"

    facts = [
        "it is daytime" if weather.is_day else "it is night",
        f"humidity of {weather.humidity}",
        f"temperature of {weather.apparent_temperature}",
        f"precipitation of {weather.precipitation}",
        f"snowfall of {weather.snowfall}",
        f"cloud cover of {weather.cloud_cover}",
        f"sustained wind speed of {weather.wind_speed}",
        f"wind gusts up to {weather.wind_gusts}",
    ]
    facts_str = ", ".join(facts)
    return f"Weather for {coordinates.name}: {facts_str}"

get_current_weather_in_location("Toronto")

'Weather for Toronto, Golden Horseshoe, Ontario, Canada: it is daytime, humidity of 63 %, temperature of -15.3 °C, precipitation of 0.0 mm, snowfall of 0.0 cm, cloud cover of 100 %, sustained wind speed of 24.6 km/h, wind gusts up to 50.8 km/h'

It’s looking pretty rough in Toronto right now. This is great though! Since the output includes day/night, I don’t even need the timezone converter.

Image Generation Tool

The next thing will be to generate an image of the location based on the current weather. Luckily the image generation tool was provided on the course page, so I am just going to use that for now. At some point it would be good to hook it up to a local model.

from smolagents import load_tool

image_generation_tool = load_tool("agents-course/text-to-image", trust_remote_code=True)

Final Answer Tool

This is what generates the response to the user. I can just copy this from the space for now.

from typing import Any, Optional
from smolagents.tools import Tool

class FinalAnswerTool(Tool):
    name = "final_answer"
    description = "Provides a final answer to the given problem."
    inputs = {'answer': {'type': 'any', 'description': 'The final answer to the problem'}}
    output_type = "any"

    def forward(self, answer: Any) -> Any:
        return answer

    def __init__(self, *args, **kwargs):
        self.is_initialized = False

Agent

The agent is then a model with a prompt. The Qwen/Qwen2.5-Coder-32B-Instruct model is chunky and available via the Hugging Face API without a token. That should be good enough for our needs. Then we can just add in the tools that we wrote to create the agent.


from smolagents import CodeAgent, HfApiModel, load_tool, tool
import yaml

image_generation_tool = load_tool("agents-course/text-to-image", trust_remote_code=True)
final_answer = FinalAnswerTool()

model = HfApiModel(
    max_tokens=2096,
    temperature=0.5,
    model_id='Qwen/Qwen2.5-Coder-32B-Instruct',
    custom_role_conversions=None,
)

with open("prompt.yaml", 'r') as stream:
    prompt_templates = yaml.safe_load(stream)
    
agent = CodeAgent(
    model=model,
    tools=[
        get_current_weather_in_location,
        image_generation_tool,
        final_answer,
    ],
    max_steps=6,
    verbosity_level=1,
    grammar=None,
    planning_interval=None,
    name=None,
    description=None,
    prompt_templates=prompt_templates
)

Now we can draw a lovely picture of Toronto:

agent.run("Can you draw me a picture of Toronto please?")

╭──────────────────────────────────────────────────── New run ────────────────────────────────────────────────────╮
│                                                                                                                 │
│ Can you draw me a picture of Toronto please?                                                                    │
│                                                                                                                 │
╰─ HfApiModel - Qwen/Qwen2.5-Coder-32B-Instruct ──────────────────────────────────────────────────────────────────╯

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

 ─ Executing parsed code: ──────────────────────────────────────────────────────────────────────────────────────── 
  weather = get_current_weather_in_location(location="Toronto, Canada")                                            
  print(f"The current weather in Toronto is: {weather}")                                                           
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Execution logs:
The current weather in Toronto is: Weather for Toronto, Golden Horseshoe, Ontario, Canada: it is daytime, humidity 
of 62 %, temperature of -15.7 °C, precipitation of 0.0 mm, snowfall of 0.0 cm, cloud cover of 100 %, sustained wind
speed of 24.6 km/h, wind gusts up to 51.1 km/h

Out: None

[Step 0: Duration 5.10 seconds| Input tokens: 2,180 | Output tokens: 81]

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

 ─ Executing parsed code: ──────────────────────────────────────────────────────────────────────────────────────── 
  prompt = f"A high-res, photorealistic image of Toronto, Canada, with the current weather conditions: it is       
  daytime, humidity of 62 %, temperature of -15. 7 °C, precipitation of 0. 0 mm, snowfall of 0. 0 cm, cloud cover  
  of 100 %, sustained wind speed of 24. 6 km/h, wind gusts up to 51. 1 km/h."                                      
  image = image_generator(prompt=prompt)                                                                           
  final_answer(image)                                                                                              
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Out - Final answer: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1024x1024 at 0x7180DEFC5E80>

[Step 1: Duration 11.51 seconds| Input tokens: 4,626 | Output tokens: 239]

The agent certainly needs to work on summarizing the weather conditions better. However, this image is much better with the inclusion of the weather. There is actually a live camera feed from the CN Tower, so we can even compare this to the real thing (different angle, obviously):

It is snowy and clear so the generated image matches the current weather. Wonderful!

Looking at this more deeply, it would be good if the prompt for the image was improved. The current prompt is:

A high-res, photorealistic image of Toronto, Canada, with the current weather conditions: it is daytime, humidity of 62 %, temperature of -15. 7 °C, precipitation of 0. 0 mm, snowfall of 0. 0 cm, cloud cover of 100 %, sustained wind speed of 24. 6 km/h, wind gusts up to 51. 1 km/h.

If this were rephrased to something like “an extremely cold and windy overcast day”, it’s likely that the grey box containing an approximation of the weather would not get added to the picture. This could be achieved with another tool that just invokes the LLM to summarize the weather. I added this when I turned this agent into a Hugging Face Space.
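
That tool asks the LLM for the summary. A deterministic alternative would be a rule-based function, sketched below. This is not the code from the space; the name `summarize_weather` and every threshold are assumptions picked for illustration.

```python
def summarize_weather(
    is_day: bool,
    apparent_temperature_c: float,
    cloud_cover_pct: float,
    wind_gusts_kmh: float,
    precipitation_mm: float = 0.0,
    snowfall_cm: float = 0.0,
) -> str:
    """Turn raw weather numbers into a short phrase for an image prompt.

    Hypothetical helper; all thresholds below are illustrative assumptions.
    """
    if apparent_temperature_c <= -10:
        temperature = "extremely cold"
    elif apparent_temperature_c <= 5:
        temperature = "cold"
    elif apparent_temperature_c <= 20:
        temperature = "mild"
    else:
        temperature = "hot"

    # Precipitation takes priority over cloud cover when describing the sky.
    if snowfall_cm > 0:
        sky = "snowy"
    elif precipitation_mm > 0:
        sky = "rainy"
    elif cloud_cover_pct >= 80:
        sky = "overcast"
    elif cloud_cover_pct >= 30:
        sky = "partly cloudy"
    else:
        sky = "clear"

    wind = " and windy" if wind_gusts_kmh >= 40 else ""
    time_of_day = "day" if is_day else "night"
    article = "an" if temperature[0] in "aeiou" else "a"
    return f"{article} {temperature}{wind} {sky} {time_of_day}"


# Values taken from the Toronto run above.
summary = summarize_weather(
    is_day=True,
    apparent_temperature_c=-15.7,
    cloud_cover_pct=100,
    wind_gusts_kmh=51.1,
)
print(summary)
```

A rule-based version is predictable and free to run, but an LLM summary can phrase unusual combinations more naturally, so which is better depends on how varied the weather descriptions need to be.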