Everything You Need to Know About Hugging Face Jobs

Run ML tasks using Hugging Face Jobs with zero setup. This beginner tutorial helps you launch cloud ML workflows in minutes — no DevOps needed.

Curious how to run machine learning tasks without setting up any servers or installing heavy frameworks? Hugging Face Jobs makes it incredibly simple to execute code and train models in the cloud — all with minimal setup. In this beginner’s guide, you’ll learn how to launch your own cloud ML workflows using Hugging Face’s on-demand compute service in just a few steps.

What Are Hugging Face Jobs?

Hugging Face Jobs is a cloud-based service provided by Hugging Face that lets you run machine learning tasks on demand, without managing your own servers. In simple terms, it offers compute resources (like CPUs, GPUs, or TPUs) on the Hugging Face Hub to execute your AI and data workflows. A Job on Hugging Face is defined by three main things: a command to run (for example, a Python script), a Docker container image (from Hugging Face or Docker Hub), and a hardware flavor (the type of machine – CPU or GPU or TPU – and size). When you launch a Job, it runs this command inside the specified container on Hugging Face’s cloud infrastructure, using the hardware you selected, and streams back the logs and results to you.

This service is pay-as-you-go, meaning you only pay for the compute time you use. You don’t need to rent servers by the hour or maintain long-running instances – you simply submit a Job, it runs to completion, and you’re billed for the seconds of computation consumed. (Note: As of now, Hugging Face Jobs is available to Pro subscribers or members of a Team/Enterprise plan. If you’re using a free account, you’ll need to upgrade to access this feature.)

In a nutshell, Hugging Face Jobs provides a convenient way to perform machine learning jobs in the cloud – whether it’s training a model, running data processing, or doing large-scale inference – using a simple interface tightly integrated with the Hugging Face ecosystem. You get to leverage powerful hardware (including GPUs like NVIDIA A10G or even TPUs) without having to set up any cloud environment yourself. The Hugging Face Hub handles scheduling the job, running it in an isolated environment (container), and giving you access to the output and logs. This is great for AI workflows where you need some heavy compute on demand, such as training a transformer model on a GPU or preprocessing a large dataset.

Figure: A conceptual flow of how Hugging Face Jobs execute a user-defined task on the cloud. The user submits code, selects an environment (Docker + hardware), and the Hugging Face Jobs service runs it on cloud infrastructure. The outputs and logs are then returned to the user.

Who Is It For and Why Use It?

Who can benefit from Hugging Face Jobs? This service is designed for researchers, data scientists, machine learning engineers, and developers who want to run intensive ML tasks without the hassle of managing infrastructure. If you’ve ever been limited by your local machine’s resources or struggled to configure cloud instances, Hugging Face Jobs offers an accessible alternative. It’s especially useful for:

  • Model Training and Fine-tuning: Need to fine-tune a Transformer on a GPU? With Jobs you can train models on powerful GPUs (T4, A10G, A100, etc.) or TPUs without setting up any servers. Just specify the hardware in your job and focus on your training code.
  • Data Processing and ETL Pipelines: For tasks like converting a large dataset, feature engineering, or batch processing thousands of files, you can spin up a high-CPU job. This way, you offload heavy processing to the cloud and free your local environment.
  • Batch Inference or Evaluation: If you have a large number of inputs to run through a model (e.g. generating embeddings for a dataset, or evaluating a model on a big test set), Jobs let you do this in parallel on strong hardware.
  • Experiments & Prototyping: Jobs are great for trying out ideas in a controlled, reproducible environment. Each job runs in an isolated container, so your dependencies and environment are consistent. It’s like having a fresh machine for every experiment.
  • Developers Lacking Local GPUs: If you’re a developer without access to a GPU or specialized hardware, Hugging Face Jobs gives you on-demand access to GPUs/TPUs for cloud AI training and testing. No need to purchase expensive hardware – you can rent it by the second.

In short, Hugging Face Jobs is for anyone who wants to accelerate AI workflows by leveraging cloud compute in a convenient way. It eliminates a lot of the DevOps overhead (no need to provision AWS instances or set up Docker on a remote machine yourself) and integrates with Hugging Face Hub where your models and datasets might already reside. This tight integration makes it easy to, for example, fine-tune a model from the Hub and then save the results back to the Hub, all within the Jobs framework.

Tip: Because Jobs are executed in the Hugging Face Hub environment, you automatically have access to repositories you have permission for. This means a Job can easily download a model from the Hub or read a dataset from the Hub. It’s a very seamless way to use Hugging Face’s compute alongside the Hub’s data and model storage.

Getting Started: Running Your First Hugging Face Job

Let’s walk through a step-by-step tutorial on how to create and run a job on Hugging Face. We’ll start from the basics, assuming you’re a beginner. By the end of this tutorial, you’ll have launched a cloud job, monitored its progress, and retrieved the results – all using Hugging Face Jobs. 🏁

Step 1: Setting Up Your Environment

Sign Up and Upgrade (if needed): First, ensure you have a Hugging Face account. If you don’t have one, create a free account on the Hugging Face Hub. As mentioned, Hugging Face Jobs require a Pro or higher plan. If you’re not a subscriber yet, you may consider upgrading to Pro (check the Pricing page on Hugging Face for details). Pro users get access to Jobs on a pay-as-you-go basis. Once your account has the required access, you’re ready to go.

Install the Hugging Face CLI: The easiest way to interact with Hugging Face Jobs is via the command-line interface (CLI) tool called hf. This comes as part of the huggingface_hub Python package. You can install it via pip:

pip install huggingface_hub

This will provide the hf command. (If the hf command isn’t found, upgrading to a recent version or installing with the CLI extras via pip install "huggingface_hub[cli]" usually helps.) You can also use the Hugging Face Hub API directly in Python instead of the CLI, but in this guide we’ll mostly use the CLI for simplicity.

Log in to Hugging Face: After installing, authenticate your CLI with your Hugging Face account. Run:

hf auth login

You’ll be prompted to enter your Hugging Face API token (you can get this from your Hugging Face account settings). Once authenticated, the CLI now knows who you are and what resources you have access to. (This is necessary because Jobs will be launched under your account.)
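
If you prefer to authenticate from Python (for example in a notebook or a CI environment), you can do so programmatically with the login() helper from huggingface_hub. This is a minimal sketch under the assumption that you store your token in an environment variable; HF_TOKEN here is just a placeholder name you set yourself.

import os

from huggingface_hub import login

# Read the token from an environment variable (placeholder name) and log in.
# This has the same effect as running `hf auth login` interactively.
login(token=os.environ["HF_TOKEN"])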

Step 2: Running a Job (Hello World)

Now that we’re set up, let’s run our first job! We’ll start with a simple “Hello World” example to demonstrate the basics. We’ll ask Hugging Face Jobs to run a small Python command on the cloud.

Choose a Docker image: Since Jobs run inside containers, we need to pick an environment. For many tasks, a standard Python image will do. Hugging Face provides some optimized images, or you can use any image from Docker Hub. We’ll use the official Python 3.12 image for now.

Command to run: We’ll just print a message from Python to confirm everything works.

Open your terminal and execute:

hf jobs run python:3.12 python -c "print('Hello from Hugging Face Jobs!')"

Let’s break down this command:

  • hf jobs run – this tells the CLI we want to run a new job.
  • python:3.12 – this is the Docker image name. Here we chose the official Python 3.12 image as our execution environment.
  • -c " ... " – everything after -c is the command we want to run inside the container. In this case, we use Python’s -c flag to execute a tiny snippet that prints a message.

When you run this, the job will be submitted to Hugging Face. The CLI will stream the logs by default, so you should see output from the job appear in your terminal. For our example, after a few moments you should see:

Hello from Hugging Face Jobs!

This confirms that our job ran successfully on the cloud and produced the expected output. 🎉

Behind the scenes, Hugging Face took our request, scheduled it on an available machine, pulled the python:3.12 container, ran our code, and captured the output. All of that happened within a few seconds. We didn’t have to manually start any server or VM – the service handled it.

Job ID: Every job is assigned a unique ID. When you submit a job, the CLI prints this ID, typically along with a URL to the job’s page, something like https://huggingface.co/jobs/<your-username>/<job-id>. You can open that link in a browser (while logged in) to see details like status and logs. The Python API also returns a JobInfo object that contains this ID and URL. Keep the ID handy for monitoring the job in the next steps.
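
If you’d rather launch jobs from Python, recent versions of the huggingface_hub library expose a Jobs API. Below is a minimal sketch assuming that API is available (run_job and a JobInfo return value); treat the exact attribute names as illustrative rather than definitive.

from huggingface_hub import run_job

# Launch the same "Hello World" job from Python instead of the CLI.
job = run_job(
    image="python:3.12",
    command=["python", "-c", "print('Hello from Hugging Face Jobs!')"],
)

print(job.id)   # unique job ID, useful for the monitoring commands below
print(job.url)  # link to the job's page on the Hub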

Screenshot: After running the above command, your terminal shows the job’s logs – the printed “Hello from Hugging Face Jobs!” message – along with the job ID.

Step 3: Monitoring and Managing Jobs

Once you have jobs running, you’ll want to check on them or retrieve their outputs. Hugging Face provides commands for monitoring:

  • List running jobs: hf jobs ps will list your active jobs (similar to docker ps if you come from a Docker background). This will show jobs and their statuses (e.g., running, completed, failed).
  • Inspect a job’s status: hf jobs inspect <job_id> gives detailed info on a specific job. This includes whether it’s in progress, completed, or errored, plus metadata like start time and the hardware used.
  • View logs: To fetch the logs of a job (past or running), use hf jobs logs <job_id>. This prints all output that the job has produced so far. It’s useful if you want to see training progress or any printouts from your script after the fact.
  • Cancel a job: If you started a job and then realize you need to stop it (maybe you found a bug in your script or it’s taking too long), you can cancel it with hf jobs cancel <job_id>. This will terminate the running job on the server.

For example, suppose you launched a long training job and you want to watch its progress. You could run:

hf jobs logs <your_job_id>

This will stream the logs (you can run this in a separate terminal). You might see your model printing validation metrics or other progress indicators in real time.

If you’re using the Python API instead, analogous functions exist. For instance, list_jobs() will give you a list of jobs (with statuses), inspect_job(job_id) returns a JobInfo object for a job, and fetch_job_logs(job_id) yields log lines so you can iterate over them or print them in your Python script.
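
Here’s a short sketch of that Python-side monitoring, again assuming a recent huggingface_hub release with the Jobs API; replace the placeholder job ID with one of your own.

from huggingface_hub import fetch_job_logs, inspect_job, list_jobs

# List your jobs and their statuses (analogous to `hf jobs ps`).
for job in list_jobs():
    print(job.id, job.status)

# Detailed info for one job (analogous to `hf jobs inspect`).
info = inspect_job(job_id="your-job-id")  # placeholder ID
print(info.status)

# Replay or stream the job's logs line by line (analogous to `hf jobs logs`).
for line in fetch_job_logs(job_id="your-job-id"):
    print(line)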

Pro Tip: The job page on the Hugging Face Hub (accessible via the URL in the job info) also shows the status and logs in a web interface. This can be convenient to share with colleagues (they must have access to your account or org to view it) or just to monitor without using the CLI. It’s essentially a simple dashboard for each job.

Step 4: Using Custom Hardware and Environments

The real power of Hugging Face Jobs is the ease of accessing different kinds of hardware. By default, if you don’t specify anything, your job runs on a basic CPU machine. But often for AI tasks, you’ll want a GPU or even multiple GPUs. Hugging Face provides a range of hardware flavors that you can choose via the --flavor option in the CLI (or the flavor= argument in the Python API).

Available hardware flavors: As of mid-2025, there are several options. For example:

  • CPU options: cpu-basic (default), cpu-upgrade (more powerful CPU).
  • GPU options: t4-small and t4-medium (a single NVIDIA T4 with different CPU/RAM allocations), L4 instances, a10g-small and larger A10G flavors (including multi-GPU variants), and a100-large (NVIDIA A100) for heavy-duty tasks.
  • TPU options: Google Cloud TPUs like v5e-1x1, v5e-2x2 etc., for specialized workloads.

Let’s say you want to run the same “Hello world” but on a GPU machine, perhaps to test that GPU is available. You could run:

hf jobs run --flavor a10g-small \
    pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel \
    python -c "import torch; print(torch.cuda.get_device_name())"

In this command:

  • We added --flavor a10g-small to request a machine with one A10G GPU (the small/large suffix refers to the CPU and RAM paired with the GPU; larger A10G flavors, including multi-GPU variants, are also available).
  • We switched the Docker image to one that has PyTorch with CUDA support (pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel is a PyTorch image with CUDA 12 and cuDNN, which is suitable for using the GPU).
  • The Python command checks what GPU is available via torch.cuda.get_device_name().

When this job runs, the output should confirm the GPU type, e.g., it might print NVIDIA A10G as the device name. If you see that, congrats – you just ran a job on a GPU in the cloud, with zero setup! That’s cloud AI compute at your fingertips.

You can similarly choose other flavors. For instance, use --flavor a100-large for an A100 GPU (if your plan supports it), or --flavor t4-small for a smaller NVIDIA T4. Keep in mind costs scale with the size of hardware – an A100 will cost more per second than a T4 or CPU. The Hub’s documentation provides up-to-date lists of flavors and pricing.

Custom Docker images: You’re not limited to generic Python or PyTorch images. You can run any container that suits your task. Hugging Face Spaces provide many ready-to-use images (e.g., for R, Julia, or specific ML frameworks). You can even use your own image from Docker Hub if you have a very custom environment. Just replace the image name in the command. For example, if you have a Docker image myuser/my-custom-env:latest, you can do: hf jobs run --flavor cpu-basic myuser/my-custom-env:latest python my_script.py. This will pull your image and run python my_script.py inside it.

Environment variables and secrets: If your job needs environment variables (like API keys or configuration flags), you can pass them with -e VAR=value for plain env vars, or -s VAR=value for secrets (secret values are encrypted and not shown in logs). There are also options to load these from .env files. For example: hf jobs run -e USER=alice -e MODE=production ... will make USER and MODE available to your script.
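
Inside the container, these values show up as ordinary environment variables, so your script reads them the usual way. A tiny sketch (USER and MODE come from the example command above; API_KEY is a hypothetical secret you would pass with -s):

import os

# Values passed with -e are plain environment variables inside the job.
user = os.environ["USER"]
mode = os.environ.get("MODE", "development")

# Values passed with -s behave the same way for your code,
# but are kept out of the job's logs and metadata.
api_key = os.environ["API_KEY"]  # e.g. launched with -s API_KEY=...

print(f"Running as {user} in {mode} mode")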

Using UV Scripts (advanced): Hugging Face Jobs also supports something called UV scripts, which are self-contained Python scripts with special comments to declare dependencies. This is an experimental feature for running one-off scripts easily. For a beginner, you might not need this, but just know it exists. You could run a local script with dependencies via hf jobs uv run my_script.py without even building a Docker image – the system will handle installing the needed packages as declared in the script. (For more, see the UV documentation linked in the Hugging Face guide.)
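
For the curious, a UV script is just a Python file with a small inline-metadata header (PEP 723) declaring its dependencies. A minimal sketch might look like this – the file name and dependency list are purely illustrative:

# my_script.py
# /// script
# dependencies = [
#     "pandas",
# ]
# ///
import pandas as pd

# The declared dependencies are installed automatically before the script runs.
df = pd.DataFrame({"value": [1, 2, 3]})
print(df.describe())

You would then launch it with hf jobs uv run my_script.py (adding --flavor if you need specific hardware), and the packages declared in the header are installed for you.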

With these capabilities, you can tailor the job environment exactly to your needs. Whether it’s a quick data tweak on CPU or a multi-GPU training on a big dataset, Hugging Face Jobs likely has a configuration to suit it.

Step 5: Retrieving Results

Finally, what about getting the outputs of your job? Logs are streamed, but what if your job produces files or model artifacts? There are a few ways to handle that:

  • Print to stdout: For small results (like metrics or a single prediction), simply printing them (as we did in our examples) might be enough – you’ll see it in the logs.
  • Save to Hugging Face Hub: A common pattern is to have your job push outputs to a repository on the Hub. For example, if you fine-tuned a model, you can use the Hugging Face Hub libraries to upload the trained weights to your model repository at the end of the job. Since the job inherits your credentials from hf auth login, it can seamlessly call functions like huggingface_hub.upload_file() or push a model to a repo with push_to_hub(). This way, when the job finishes, your results (model weights, etc.) are already on the Hub waiting for you (see the sketch below this list).
  • Use persistent storage: At the time of writing, each job’s filesystem is ephemeral (it disappears when the job finishes). Hugging Face may introduce persistent storage or output options later, but the current recommended approach is saving results to a Hub repo, or to external storage if needed.

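As an example of the second bullet, here is a minimal sketch of how a job’s script might push its outputs to the Hub before it exits. The repo ID and file paths are placeholders you would change; it assumes the job has a valid token (inherited from your login).

from huggingface_hub import HfApi

api = HfApi()  # uses the job's inherited credentials

# Create (or reuse) a model repo under your account -- placeholder name.
repo_id = "your-username/my-finetuned-model"
api.create_repo(repo_id, repo_type="model", exist_ok=True)

# Upload an artifact produced earlier in the job, e.g. saved model weights.
api.upload_file(
    path_or_fileobj="outputs/model.safetensors",  # local file written by your script
    path_in_repo="model.safetensors",
    repo_id=repo_id,
)
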
For now, if you follow the above steps, you have the basics to run and manage jobs. You’ve essentially got a handle on using Hugging Face compute on the cloud!

Comparing Hugging Face Jobs to AWS SageMaker and Google Vertex AI

You might be wondering: How does Hugging Face Jobs stack up against other cloud AI platforms? Two well-known alternatives are Amazon AWS SageMaker and Google Cloud Vertex AI. All three services let you run machine learning workloads in the cloud, but they have different strengths and target audiences. Below is a quick comparison:

| Feature | Hugging Face Jobs (Hugging Face Hub) | Amazon SageMaker (AWS) | Google Vertex AI (GCP) |
|---|---|---|---|
| Type of Service | Jobs on the Hugging Face Hub, focused on running scripts with models/data from the HF Hub. | Full-spectrum ML platform on AWS (training, hosting, AutoML, etc.). | Unified ML platform on Google Cloud (training, deployment, data tools). |
| Ease of Use | Easy – simple CLI (hf jobs run), minimal setup; ideal for quick jobs and integration with HF content. | Steeper learning curve; many features mean more configuration (not as plug-and-play). | Moderate complexity; integrated with the GCP console/UI, more guided than AWS but still enterprise-oriented. |
| Integration | Native integration with the Hugging Face Hub (models, datasets, Spaces). Great for users already in the HF ecosystem. | Deep integration with the AWS ecosystem (S3, IAM, CloudWatch, etc.); good for AWS-centric workflows. | Integrated with Google Cloud services (BigQuery, Dataflow, etc.); good for GCP-centric workflows. |
| Compute Options | CPU, GPU (various NVIDIA tiers like T4, A10G, A100), TPU v5e – single-node jobs (multi-GPU on one node possible via flavors). | Wide range of AWS instances (including multi-node distributed training and custom EC2 types), plus AWS-specific chips (Inferentia, Trainium). | Wide range including GPUs and TPUs on Google Cloud; managed services for distributed training and AutoML. |
| Pricing Model | Pay-as-you-go by compute seconds. Requires an HF Pro/Team subscription. No upfront or idle cost – jobs spin up and tear down. | Pay for provisioned resources (per-hour billing for instances). Some services (like endpoints) incur idle costs. Complex pricing structure with multiple components. | Pay-as-you-go for usage (e.g., training hours, prediction requests). Pricing is transparent per Google’s published rates. |
| MLOps Features | Lightweight approach: you manage jobs individually (no built-in pipelines or AutoML, but you can script workflows). | Full MLOps suite: training jobs, endpoints, pipeline workflows, model registry, monitoring, etc. (enterprise-grade features). | Full MLOps suite: pipelines, model registry, hyperparameter tuning, deployment scaling, etc., all in one platform. |
| Ideal For | Individuals and small teams who want a simple way to run ML tasks, especially when leveraging Hugging Face models/datasets. Great for experimentation, fine-tuning, and community-driven projects. | Enterprises or advanced users with heavy AWS usage; those who need end-to-end control and are comfortable with AWS-specific tools. Good for production at scale on AWS. | Enterprises or analysts in the Google ecosystem; those who want an end-to-end managed service with emphasis on AutoML and data integration in GCP. |

As you can see, Hugging Face Jobs shines in its simplicity and seamless integration with the Hugging Face Hub, making it extremely user-friendly for those already in the HF community. You write a command, and it runs – that’s it. In contrast, platforms like SageMaker and Vertex AI are part of larger cloud provider offerings, designed to handle everything from data preparation to deployment. They can train and serve models at massive scale and offer many advanced features, but with that comes complexity and often higher cost/effort to get started.

In practical terms, if you just want to quickly train a model on a GPU or process a dataset with minimal fuss, Hugging Face Jobs is likely the fastest route. It doesn’t require you to configure cloud storage or networking – the defaults work out-of-the-box. On the other hand, if you require a fully managed production pipeline, or integration with a company’s AWS/GCP infrastructure, you might compare how far you can go with HF Jobs versus when you’d step up to SageMaker/Vertex. Many users actually mix and match – for example, prototyping with Hugging Face Jobs, then moving to a SageMaker pipeline for a large-scale production training job.

According to industry analysis, one trade-off is that AWS SageMaker, while powerful, has a complex setup and pricing structure that can be daunting for beginners. Hugging Face’s approach, conversely, tends to have lower initial overhead and no cloud lock-in – you’re not tied to AWS or GCP accounts, just to the HF Hub. Google’s Vertex AI is praised for its clear pricing and unified interface, but it targets enterprise users and managed workflows, whereas Hugging Face Jobs offers more flexibility for developers and researchers to run custom code with community resources.

Ultimately, the best choice depends on your needs. If you value a beginner-friendly, community-driven platform, Hugging Face Jobs is a strong contender. If you need a full-blown enterprise solution with all the bells and whistles (and you have a team to manage it), SageMaker or Vertex might be suitable. The good news is that Hugging Face is also partnering with these cloud providers (for example, there are integrations where you can use Hugging Face datasets and models within SageMaker or Azure ML), so it’s not an either-or situation. You can start on the Hub and later scale out on a cloud platform as needed.

Conclusion and Next Steps

Hugging Face Jobs opens up an exciting new way to access cloud compute for AI, all within the friendly Hugging Face ecosystem. In this tutorial, we covered what Hugging Face Jobs are, who should use them, and walked through a simple example of launching a job. We also compared it with other platforms to give you context on where it stands.

Key takeaways:

  • Hugging Face Jobs provides on-demand cloud compute for machine learning tasks, defined by a command, container, and hardware of your choice.
  • It’s extremely easy to use (just a single CLI command to launch a job) and is integrated with Hugging Face Hub, so you can utilize models and data from the community effortlessly.
  • It uses a pay-as-you-go model – great for not overspending on idle resources – and requires a Pro/Team plan for access.
  • You can run anything from quick experiments to lengthy training jobs on CPUs, GPUs, or TPUs, without worrying about infrastructure maintenance.
  • Compared to big cloud services, HF Jobs is more beginner-friendly and flexible for individual developers or researchers, whereas services like SageMaker or Vertex AI target full-scale production pipelines (with correspondingly higher complexity).

Now, it’s your turn! Give Hugging Face Jobs a try for your cloud ML workflows – spin up a job with your own code or model. Perhaps fine-tune a model on a dataset you love, or offload that long-running data preprocessing task. The barrier to entry is low, and you might be surprised how much time it saves you. Check out the official Hugging Face Jobs documentation for more examples and advanced usage.

If you’re interested in other aspects of the Hugging Face platform, you might explore our other tutorials – for instance, learn how to deploy models with Hugging Face Inference Endpoints for building production APIs, or read about how to share and use datasets on the Hub to fuel your jobs. There’s a whole ecosystem to take advantage of.

Call to Action: Ready to supercharge your AI workflow? Head over to Hugging Face and launch your first Job today! Whether you’re a solo learner or part of a team, Hugging Face Jobs can streamline your ML experiments. If you found this guide helpful, please share it with your colleagues or on social media so more people can discover easier ways to run their machine learning tasks. Happy model training, and happy Hugging! 🤗🚀



Posted by Ananya Rajeev

Ananya Rajeev is a Kerala-born data scientist and AI enthusiast who simplifies generative and agentic AI for curious minds. B.Tech grad, code lover, and storyteller at heart.