Inferring hidden constraints for scheduling solvers

I’m happy to share that my latest paper, “On Inferring Cumulative Constraints,” has been accepted to CP 2026! You can read the full paper here. In this post, I’d like to explain the core ideas in a more accessible way, building on the story started in last year’s post.

Where we left off

Last year, I wrote about a limitation shared by almost all constraint programming solvers for scheduling: they reason about resources one at a time. A solver might check that the “workers” resource isn’t overloaded, then separately check the “machines” resource — but it never steps back and asks what the two constraints mean together.

Our CP 2025 paper (Unite and Lead) tackled this by building a conflict graph: whenever two tasks couldn’t overlap on any resource, the solver drew an edge between them. Once a large enough group of mutually conflicting tasks was found—a clique—the solver could immediately rule out many schedules.

That approach worked well, but it had two fundamental weaknesses.

First, it only looked for tasks that can’t overlap at all. But some interactions are subtler: not “tasks A and B can never coexist,” but “tasks A, B, and C can’t all run at once, though any two of them can.” The earlier technique had no way to express or exploit this kind of group constraint.

Second, the clique search happened during the search for a solution. For the problem instances where no useful clique existed, this added overhead with nothing to show for it — and in the worst cases, the solver slowed down by more than an order of magnitude compared to doing nothing.

With that context in place, here’s how the new paper addresses both problems.

A new way to look at scheduling constraints

To fix these weaknesses, we need to introduce a new framing. Usually, we think of a scheduling constraint as a rule to check: “Is this time point overloaded?” In this work, I treat it instead as a linear inequality.

Here’s how it works. Each task has a presence pattern on the timeline: at any given time unit, it is either active (1) or inactive (0). You can think of this as a row of zeros and ones extending across time:

A Cumulative constraint — “the tasks together can’t use more than b units of resource” — is just the statement that these rows, when stacked and summed column by column with reweighting, never exceed b in any column. That’s a linear inequality, and a system of Cumulative constraints is a system of linear inequalities:

This reframing might seem like an abstraction for its own sake, but it has a powerful payoff: any valid inequality for this system of linear constraints can be converted back into a new Cumulative constraint. The paper calls this the inference transfer lemma. It means the entire machinery of integer programming, including decades of theoretical work on discovering and tightening inequalities, can be plugged directly into CP solvers. The question then becomes: how do we efficiently find useful new inequalities?

Step 1: covers

The first building block is the notion of a cover: a set of tasks whose combined resource demand exceeds the capacity. Tasks in a cover can never all be active simultaneously — there simply isn’t enough room. That gives us a first, simple constraint: “at most k − 1 tasks from this cover can overlap at any time.”

For instance, look at the overloaded column in the figure above: three tasks together demand 4 units of a resource whose capacity is 3. You can’t run all three at once. The cover inequality says: whenever you look at any snapshot in time, at most two of these three tasks are active. (For example, the figure above implies the constraint that can be compactly written as \(x_a + x_b + x_d \le 2\).)

That’s a real restriction, and many such restrictions can be discovered simply by looking at the table of resource consumption by tasks. However, this constraint is still local to a single resource, and it only says something about the tasks in the cover. What about the other tasks?

Step 2: lifting

This is where lifting comes in. Lifting is a classical technique from integer programming for strengthening a valid inequality by incorporating variables that weren’t in it originally. The idea is: pick a task that isn’t mentioned in the cover inequality yet, and ask — if this task is also running, does the remaining slack change?

Almost always, it does. If an extra task consumes some of the resource, there’s less room left for the covered tasks. By calculating exactly how much room remains when each outside task is added, lifting produces a tighter inequality — one that mentions more tasks and gives the solver more information.

At the end of the lifting procedure, we have a new Cumulative constraint: not just “at most 2 of these 3 tasks can overlap,” but “at most 2 of these 4 tasks can overlap” — the same bound, now covering an extra task for free. That’s a stronger statement, and it’s expressed in a form the solver’s propagation algorithms already know how to use.

Try it yourself below: pick a cover, then lift in the remaining tasks one at a time to see exactly how each coefficient is computed — and what happens if you take away one of the two resources.

Putting it all together

The complete method runs as a preprocessing step, before the solver begins its search. Here is a summary of what it does:

Enumerate a set of promising covers (both small “short covers” and larger “long covers” built from tasks with similar resource demands).
Lift each cover inequality using the procedure above.
Score the resulting constraints by how tight they are — specifically, by the ratio of total task demand to capacity — and keep the best ones.
Inject those constraints into the CP model, which the solver then uses normally throughout its search.

The solver doesn’t need to know anything about how these constraints were derived. It receives a few extra Cumulative constraints, and its existing propagation algorithms immediately start using them. For problem instances where the lifted constraints encode real structure, this can dramatically prune the search space before the solver makes a single decision.

To see what this means concretely, the demo below reuses the constraint you derived above (or a default one, if you skipped straight here). Drag each task’s duration and compare the lower bound on the makespan implied by each resource on its own against the one implied by the lifted constraint.

Does it work?

In a word: yes — and more robustly than the previous approach.

On benchmarks with rich multi-resource structure, the improvement can be enormous. The new approach helped discover 25 new lower bounds and 5 new best solutions on standard RCPSP benchmark instances that have been studied by researchers for years. Notably, some of the lower bound improvements required genuinely non-binary constraints — that is, “at most 3 tasks can overlap,” not just “these two tasks can’t overlap” — which the previous approach couldn’t produce at all.

The comparison with Unite and Lead tells an equally interesting story. UNL’s clique-search overhead meant that on instances without useful disjunctive structure, it sometimes slowed the solver down by more than ten times. The lifting approach runs only at the root node and has a much more predictable cost: degradations of two times or more happen on fewer than 5% of runs. That’s a significant practical improvement — you can apply it to your scheduling instances with fewer concerns over disastrous runs than before.

There is one honest caveat: the approach helps more with lower bounds (proving that a schedule can’t be shorter than some value) than with finding good solutions. Inferred constraints prune the search space and help the solver rule out bad regions faster, but they usually don’t directly guide it toward better solutions. That asymmetry is visible in the data, and it’s one of the open directions for future work.

The bigger picture

The main takeaway is that the single-resource viewpoint has real limits, and there’s a principled way to overcome them: establish a formal bridge between CP constraints and the polyhedral geometry studied in integer programming, then use that bridge to import powerful global inferences into the solver.

There’s much more to explore. Aside from establishing how to consistently discover better schedules, another natural step is to try using this machinery with other objectives—and there is no shortage of those in the RCPSP literature. However, this will most likely require a more advanced optimization solving procedure than the cover-and-lift flow discussed here, as well as handling the time windows of tasks. With that said, it’s an encouraging sign that even the preprocessing version, with no special solver support, already finds results that had previously eluded years of research.

Thanks for reading!