Did the Puck Cross? Using Object Detection to Settle Goal-Line Reviews

April 27, 2026 · Danny · Computer Vision NHL Python

The Ducks beat the Oilers in overtime in Game 4, and the goal that ended it sat under review for almost five minutes. Tristan Jarry was sprawled across the crease, the puck slid past him, and his skate dropped on top of it right at the goal line. From the broadcast feed it was genuinely impossible to tell whether the rubber had fully crossed. The Situation Room ruled it a goal. Players from both sides said they had no idea how anyone could be sure.

That kind of moment, where a millimeter of black rubber decides a playoff series, is exactly what computer vision is good for. The human eye gets one camera angle and one frame. A model can look at every frame, every angle, and reason about position over time. I wanted to walk through what a goal-line CV pipeline would actually look like.

Overhead broadcast frame with the goal line highlighted in yellow and the puck circled in green

The yellow line marks the goal-side edge of the painted band, which is the edge the rule actually applies to. The green circle is the puck. Even with the line placed and the puck identified, you can already see the problem. The goalie's skate is parked on top of the disc, so the part of the puck that determines the call (its trailing edge, the last part to clear the line) sits in pixels we can't observe. Zooming in makes the problem clearer:

Zoomed-in view of the puck on the goal line, with the line and circle still annotated

Why these calls are hard

The broadcast goal-line cam is a single overhead camera. The puck is 1 inch tall and 3 inches across. A goalie's skate is roughly the same width. When a save ends with the goalie sprawled over the puck, the skate and the rubber occupy almost the same pixels, and the puck disappears for the only frames that matter.

A few things make it worse:

Frame rate. Broadcast is 30 to 60 fps. At NHL puck speeds the puck can easily travel 6 to 10 inches between frames, and more on a hard shot, so the "did it cross" moment may simply not exist in any frame the camera captured.
Occlusion. Whatever frame would have shown the puck across is exactly the frame where the goalie's body is on top of it.
One angle at a time. The official review uses multiple feeds, but each one is still human eyes on low-resolution frames from a single viewpoint, and nothing combines the feeds geometrically.
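
To make the frame-rate point concrete, here is a small sketch of the arithmetic. The puck speeds are illustrative values I picked, not measurements from this play, and `inches_per_frame` is just a helper name for this post.

```python
def inches_per_frame(speed_ft_per_s: float, fps: float) -> float:
    """Distance the puck travels between two consecutive frames, in inches."""
    return speed_ft_per_s * 12.0 / fps


if __name__ == "__main__":
    # Illustrative speeds only (assumption): a slow slide up to a quick rebound.
    for speed in (15.0, 30.0, 50.0):
        for fps in (30, 60):
            step = inches_per_frame(speed, fps)
            print(f"{speed:5.1f} ft/s at {fps} fps -> {step:4.1f} in/frame")
```

Since the puck is 3 inches across, any step larger than 3 inches per frame means the puck can go from touching the line to completely past it without a single frame showing the crossing.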

A model can't manufacture pixels that aren't there, but it can be more rigorous than a human about combining what is there.

A pipeline

The problem decomposes pretty cleanly into four pieces:

Puck detection on every frame of every available camera angle.
Goal-line segmentation so we know exactly where the line is in the image, not approximately.
Trajectory fitting across frames to interpolate positions when the puck is briefly hidden.
Multi-camera fusion so a frame from the rink-side cam can constrain a frame from the overhead cam.

1. Puck detection

A standard object-detection model trained on a few thousand annotated NHL frames does this surprisingly well. The puck is a small, high-contrast, mostly uniform object. The hard cases are exactly what you'd expect: motion blur on hard shots, occlusion behind sticks and skates, and the puck lying flat against a black goal-mouth shadow.

The output isn't just puck_present: true/false. It's a bounding box with a confidence score. For our problem the box's edge matters more than the box itself, specifically the edge nearest the goal line (the puck's trailing edge as it slides into the net), because the rule isn't "puck visible past the line." It's "the entire puck has crossed the line."
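
A minimal sketch of that edge test, assuming a detector that returns axis-aligned boxes as (x1, y1, x2, y2) in pixels and a goal line already given as two image points on its goal-side edge. The `Detection` class, the function names, and the `inside_net` reference pixel are all hypothetical, chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def signed_distance(p, a, b):
    """Signed perpendicular distance in pixels from point p to the line through a and b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross / math.hypot(b[0] - a[0], b[1] - a[1])


def trailing_edge_margin(det, line_a, line_b, inside_net):
    """Smallest distance of any box corner past the line, measured toward the net.

    inside_net is any pixel known to lie in the net; it fixes which side counts
    as "across". A positive result means every corner of the box is past the line.
    """
    sign = 1.0 if signed_distance(inside_net, line_a, line_b) > 0 else -1.0
    corners = [(det.x1, det.y1), (det.x2, det.y1), (det.x1, det.y2), (det.x2, det.y2)]
    return min(sign * signed_distance(c, line_a, line_b) for c in corners)
```

The box is a superset of the puck, so a positive margin is a conservative "fully across". A negative margin only says the box straddles the line, which is not the same as the puck straddling it.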

2. Where exactly is the goal line?

You can't measure "fully across" without knowing where the line is in the same coordinate space as the puck. The line is painted on the ice but goalies churn it up over the course of a period, and broadcast cameras zoom and pan. A simple approach:

Detect the goal frame (posts and crossbar). Same kind of detector, easy to train.
Fit a homography for the line from the known geometry: the posts are 6 feet apart and the line is 2 inches wide.
Re-fit each play, because the camera angle changes.

Now the goal line isn't "wherever the red ice is in this frame." It's a precise pixel-space line you can compare a bounding box against.
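
The homography step above can be sketched with a plain direct linear transform in NumPy. This is an illustration under stated assumptions, not a calibrated setup: the rink frame puts y = 0 on the goal-side edge of the line with the post bases at x = -3 and x = +3 feet, the two extra landmarks are hypothetical ice-plane points, and every pixel coordinate is made up.

```python
import numpy as np


def fit_homography(src_pts, dst_pts):
    """Direct linear transform: 3x3 H with dst ~ H @ src, from at least 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map Nx2 points through H, including the perspective divide."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


# Rink-plane landmarks in feet (post bases plus two hypothetical ice markings).
rink_pts = np.array([[-3.0, 0.0], [3.0, 0.0], [-4.0, 4.5], [4.0, 4.5]])
# Where a detector might find them in one frame (made-up pixel coordinates).
pixel_pts = np.array([[520.0, 300.0], [760.0, 300.0], [470.0, 470.0], [810.0, 470.0]])

H = fit_homography(rink_pts, pixel_pts)

# The painted line is 2 inches wide, so its other edge sits at y = 2/12 ft.
other_edge_px = apply_homography(H, [[-3.0, 2.0 / 12.0], [3.0, 2.0 / 12.0]])
print(other_edge_px)
```

With more than four landmarks the same function returns a least-squares fit, which matters in practice because individual detections are noisy.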

There's a subtler problem the zoom above hints at: in the broadcast frame the painted line doesn't look like a clean edge. It seems to bleed slightly into the surrounding ice. Whether that's the paint actually feathering, scuffed ice mixing with paint over the course of a period, motion blur, or just broadcast compression turning a 2-inch stripe into a fuzzy band, I honestly can't tell from a single still. Either way, "fully across the line" stops being a comparison of a sharp puck edge to a sharp painted edge. It's one fuzzy region against another. That's part of why the call is hard even before occlusion enters the picture, and it's a place where calibrated geometry from the posts (which are sharp edges) probably beats trying to read the paint directly.

3. Trajectory fitting

The puck doesn't actually disappear. It's just hidden for a few frames. If the model sees it at frame N-2 and frame N+3, you can fit a smooth trajectory through those points and ask where the puck was at the missing frames, accounting for momentum and (to a lesser degree) friction with the ice.
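
Here is a rough sketch of that interpolation, assuming the puck's position has already been reduced to one number per frame: inches past the goal line along the axis perpendicular to it. The quadratic model (constant deceleration) is my simplifying assumption, and the track is synthetic data generated inside the example, not anything measured.

```python
import numpy as np


def fit_track(frames, positions, degree=2):
    """Least-squares polynomial through the visible detections."""
    return np.poly1d(np.polyfit(frames, positions, degree))


def synthetic_track(t):
    """Made-up ground truth: inches past the line at frame offset t."""
    return 1.0 + 1.2 * t - 0.05 * t**2


# Frame offsets relative to N: seen up to N-2 and again from N+3, hidden between.
visible = np.array([-4.0, -3.0, -2.0, 3.0, 4.0, 5.0])
hidden = np.array([-1.0, 0.0, 1.0, 2.0])

track = fit_track(visible, synthetic_track(visible))
estimates = track(hidden)
print(dict(zip(hidden.tolist(), np.round(estimates, 2).tolist())))
```

Real detections carry noise, and a deflection off a pad or skate breaks the smoothness assumption entirely, so the interpolated values should be read as a plausible path rather than a measurement.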

In hockey, the deflection physics on these moments are messy. The puck has hit a pad, a skate, or both, so trajectory fitting is more useful for velocity-bounding than for placing the puck precisely. If at frame N the puck was 2 inches outside the line moving at 14 ft/s and at frame N+5 it was 4 inches inside moving at 1 ft/s, you can put a confidence interval on when it crossed.
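
As a sketch of that velocity-bounding idea, the numbers from the example above can be turned into a window on the crossing frame. The assumptions are mine: the feed runs at 60 fps, the puck keeps moving toward the net, and it never goes faster than the 14 ft/s seen at frame N. `crossing_window` is a hypothetical helper.

```python
def crossing_window(dist_before_in, dist_after_in, v_max_ft_s, n_frames, fps):
    """Earliest and latest crossing time, in frames after the first observation.

    dist_before_in: inches short of the line at the first observation.
    dist_after_in:  inches past the line at the last observation.
    v_max_ft_s:     assumed cap on puck speed over the whole interval.
    """
    v_max_in_s = v_max_ft_s * 12.0
    total_s = n_frames / fps
    # It cannot reach the line sooner than covering the gap at top speed.
    earliest_s = dist_before_in / v_max_in_s
    # It must cross early enough to still cover the remaining distance.
    latest_s = total_s - dist_after_in / v_max_in_s
    if earliest_s > latest_s:
        raise ValueError("observations are inconsistent with the speed cap")
    return earliest_s * fps, latest_s * fps


earliest, latest = crossing_window(2.0, 4.0, 14.0, n_frames=5, fps=60.0)
print(f"crossed between frame N+{earliest:.2f} and N+{latest:.2f}")
```

This is a hard bound rather than a confidence interval. It only narrows the crossing to a span of frames, and a probabilistic model on top of it is what would produce an actual interval.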

4. Multi-camera fusion

This is where it gets interesting. The Situation Room has access to angles the broadcast doesn't show. Each camera gives a 2D projection. With each camera's intrinsics and extrinsics (its position and orientation in the rink) calibrated, you can lift the detections into a single 3D rink coordinate system and ask the question once, in 3D, instead of separately on each feed.

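Soccer has been doing this for years. The Premier League has used camera-based goal-line technology since the 2013-14 season and has since added semi-automated offside. There's no technical reason hockey couldn't do the same.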
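
The lifting step can be sketched as linear triangulation from two calibrated views. Everything numeric here is hypothetical: the shared intrinsic matrix, the two camera poses, and the puck position are synthetic values chosen so the example can check itself.

```python
import numpy as np


def project(P, X):
    """Project a 3D point X through a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]


# Hypothetical shared intrinsics (focal length and principal point in pixels).
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
# Two hypothetical poses: one looking along z, one rotated 90 degrees about x.
R1, t1 = np.eye(3), np.array([0.0, 0.0, 10.0])
R2, t2 = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]), np.array([0.0, 0.0, 10.0])
P1 = K @ np.hstack([R1, t1[:, None]])
P2 = K @ np.hstack([R2, t2[:, None]])

puck_true = np.array([1.0, 0.5, 0.04])  # synthetic rink coordinates, in feet
observations = [project(P1, puck_true), project(P2, puck_true)]
puck_est = triangulate([P1, P2], observations)
print(puck_est)
```

With noise-free synthetic observations the estimate matches the input point. With real detections each extra camera adds two more rows to the system, which is how a rink-side view can constrain an overhead one.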

What this would have produced for that call

Probably not a confident "yes" or "no," and that's the point worth being upfront about. The frames where the puck mattered were the frames where the goalie's skate was sitting on top of it. Even a perfectly trained detector can't identify a puck that isn't visible in any pixel.

What the pipeline can do is bound the answer. It can say:

The puck was at coordinate X two frames before contact.
The puck was at coordinate Y three frames after.
Given the velocity and the time interval, the puck crossed the line with probability p.

If p is 92%, "conclusive evidence" gets a much more rigorous definition than "the senior referee thought so."
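
One hedged way to get from a position estimate to a number like p is to treat the error on the trailing-edge position as Gaussian. That distributional choice is my assumption, and so are the `prob_fully_across` name and the idea of summarizing the estimate as a margin in inches past the line.

```python
import math


def prob_fully_across(margin_in: float, sigma_in: float) -> float:
    """P(true margin > 0) if the estimated margin has Gaussian error sigma_in.

    margin_in: estimated distance of the puck's trailing edge past the line.
    sigma_in:  standard deviation of that estimate, in the same units.
    """
    if sigma_in <= 0:
        raise ValueError("sigma_in must be positive")
    return 0.5 * (1.0 + math.erf(margin_in / (sigma_in * math.sqrt(2.0))))


if __name__ == "__main__":
    # Illustrative inputs only: margins of 0, 1 and 1.4 sigma past the line.
    for margin in (0.0, 1.0, 1.4):
        print(f"margin {margin:.1f} sigma -> p = {prob_fully_across(margin, 1.0):.3f}")
```

Under this model a margin about 1.4 standard deviations past the line gives roughly the 92% figure, and a margin of zero gives a coin flip. The number is only as good as the error model behind sigma.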

What it would take to actually deploy this

Three things, none of them small:

Training data. A few thousand frames of NHL pucks, hand-labeled, including the messy occlusion cases. The league has the footage but not the labels. Getting both takes either NHL cooperation or a lot of patience.
Camera calibration. Every arena, every game. Solvable but tedious.
Latency budget. A review can take five minutes today. If a CV system can answer in fifteen seconds with a calibrated confidence number, that's the actual pitch: not "the model is always right," just "the model is faster and shows its work."

The bigger argument

The argument for using CV here isn't that it'll always settle the call, because it won't. It's that the league's current process, which is humans staring at the same handful of frames the broadcast shows, has hit a ceiling that better tooling could push through. The Premier League's semi-automated offside system has had its rough nights, but the trajectory of those tools is going up and to the right. Hockey will get there too. The real question is whether the league builds it itself or waits for someone else to.

That review went the Ducks' way. Both goalies on the broadcast said they couldn't tell. A model probably couldn't have told either, but it could have shown its uncertainty instead of asking everyone to take the Situation Room at its word.

This post describes the methodology of an object-detection pipeline for goal-line review. The screenshot is from the Sportsnet broadcast of Ducks at Oilers Game 4. No model was run on this single broadcast frame, since single-frame analysis of an occluded puck is the case where no detector can give a confident answer.