[Solved] Machine Learning Method to Classify using Demand

Accepted Answer

For nicely distributed data, I expect the most effective approach is to start with k-means clustering, using one cluster per school. If each resulting cluster fits within its school’s capacity, you have a solution.
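As a rough illustration of that first step, here is a minimal pure-Python sketch of k-means followed by a capacity check. The function names `kmeans` and `fits_capacity`, the choice of seeding the centers by hand, and the single shared `capacity` value are all assumptions made for the example, not part of the original question.

```python
from collections import Counter
from math import dist


def kmeans(points, centers, iterations=20):
    """Plain Lloyd's algorithm. Returns (centers, labels)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each student joins the nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels


def fits_capacity(labels, k, capacity):
    """True if no cluster holds more than `capacity` students."""
    counts = Counter(labels)
    return all(counts[j] <= capacity for j in range(k))
```

If `fits_capacity` returns `True`, the plain clustering is already a valid assignment; otherwise one of the adjustments described next is needed.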

However, your “worry” case is where at least one school is over capacity. For instance, suppose there are three schools, each with a capacity of 40: one on the north side of a wide river and two on the south. With 20 children on the north side and 90 on the south, the two southern schools can hold only 80, so you need to assign at least 10 children from the south to the north school.

The algorithmic way to deal with this is to use a different error function: add a term that makes it prohibitively expensive (effectively +infinity cost) to add a 41st student to any cluster.
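One way to picture the penalized error function is as a replacement for the assignment step of k-means. The sketch below is only an illustration under stated assumptions: `penalized_cost` and `assign_with_capacity` are hypothetical names, every school shares one `capacity`, and students are processed in list order, so the result depends on that order and is not guaranteed to be optimal.

```python
from math import dist, inf


def penalized_cost(student, center, load, capacity):
    """Distance to the center, or +infinity if the cluster is already full."""
    return inf if load >= capacity else dist(student, center)


def assign_with_capacity(students, centers, capacity):
    """Greedy assignment step: each student takes the cheapest non-full cluster."""
    loads = [0] * len(centers)
    labels = []
    for s in students:
        costs = [
            penalized_cost(s, c, loads[j], capacity)
            for j, c in enumerate(centers)
        ]
        best = min(range(len(centers)), key=costs.__getitem__)
        if costs[best] == inf:
            # Every cluster is full: the instance has no feasible assignment.
            raise ValueError("total capacity is smaller than the number of students")
        loads[best] += 1
        labels.append(best)
    return labels
```

Because the penalty is infinite, a full cluster can never win the `min`, which is exactly the "no 41st student" rule. A softer, finite penalty would instead trade overcrowding against travel distance.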

Another way is to let the clusters form normally and adjust afterward. Say the SE school ends up with 46 students and the SW school with 44: send the 6 SE students and the 4 SW students who are nearest to the north school to that school instead.
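The post-hoc adjustment could look something like the following sketch. It assumes an existing assignment `labels` (for example from k-means), a single shared `capacity`, and a hypothetical helper name `rebalance`. It repeatedly moves the student from an over-full school who is closest to any school with spare room, which is a simple greedy rule rather than a provably optimal one.

```python
from collections import Counter
from math import dist


def rebalance(students, labels, centers, capacity):
    """Move surplus students from over-full schools to the nearest school with room."""
    labels = list(labels)  # work on a copy
    loads = Counter(labels)
    k = len(centers)
    while True:
        over = {j for j in range(k) if loads[j] > capacity}
        if not over:
            return labels
        room = [j for j in range(k) if loads[j] < capacity]
        if not room:
            raise ValueError("total capacity is smaller than the number of students")
        # Among students in over-full schools, pick the one closest to a school with room.
        i, j = min(
            ((i, j) for i, lab in enumerate(labels) if lab in over for j in room),
            key=lambda pair: dist(students[pair[0]], centers[pair[1]]),
        )
        loads[labels[i]] -= 1
        loads[j] += 1
        labels[i] = j
```

Each pass moves exactly one student out of an over-full school into one with room, so the loop ends as soon as the total surplus is used up.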

Is this enough guidance to work for you? Do you have cases where you would have multiple schools both over and under capacity? I don’t want to over-engineer a solution.