Towards replication-robust analytics markets
2024, Article, Preprint
T Falconer, P Pinson, J Kazempour
preprint, under review
Publication year: 2024
Many industries rely on data-driven analytics, yet useful datasets are often distributed amongst market competitors that are reluctant to collaborate and share information. Recent literature proposes analytics markets to provide monetary incentives for data sharing; however, many of these market designs are vulnerable to malicious forms of replication, whereby agents replicate their data and act under multiple identities to increase revenue. We develop a replication-robust analytics market, centering on supervised learning for regression. To allocate revenue, we use a Shapley value-based attribution policy, framing the features of agents as players and their interactions as a characteristic function game. We show that there are different ways to describe such a game, each with causal nuances that affect robustness to replication. Our proposal is validated using a real-world wind power forecasting case study.
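As a rough illustration of Shapley value-based attribution over regression features, the sketch below enumerates all feature coalitions exactly. It is not the market design from the paper: the characteristic function (reduction in in-sample mean squared error of an ordinary least-squares fit relative to an intercept-only model) and the names `coalition_value` and `shapley_values` are assumptions chosen for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def coalition_value(X, y, subset):
    """Assumed characteristic function: in-sample MSE reduction of a
    least-squares fit on the features in `subset`, relative to an
    intercept-only model."""
    baseline = np.mean((y - y.mean()) ** 2)
    if not subset:
        return 0.0
    # Design matrix: intercept plus the selected feature columns.
    A = np.column_stack([np.ones(len(y)), X[:, list(subset)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return baseline - np.mean((y - A @ coef) ** 2)


def shapley_values(X, y):
    """Exact Shapley value of each feature (column of X).

    Enumerates every coalition, so the cost grows exponentially with the
    number of features; only suitable for a handful of features.
    """
    n = X.shape[1]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                gain = coalition_value(X, y, S + (i,)) - coalition_value(X, y, S)
                phi[i] += weight * gain
    return phi
```

Appending a copy of one column to `X` and calling `shapley_values` again gives a simple way to inspect how an agent's combined share changes when a feature is replicated under this particular characteristic function.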
Passenger ferry operations in the digital era: Forecasting and revenue management at Molslinjen
2024, Article, Journal paper, Preprint
P. Pinson, M. Bjørn, S. Kristiansen, C.B. Nielsen, L. Janerka, J. Skovgaard, K. Durhuus
INFORMS Journal on Applied Analytics, under review (invited paper - winner of the INFORMS Edelman Award 2024)
Publication year: 2024
Molslinjen, one of the world’s largest operators of fast-moving catamaran ferries, based in Denmark, adopted a strategic focus on digitalization to profoundly change their operations and business practice. They partnered with Halfspace, a data, analytics and AI company based in Copenhagen, Denmark, to support that transition. Halfspace and Molslinjen have jointly developed and deployed a successful forecasting and revenue management toolbox for the data-driven operation of ferries in Denmark, in operational use since 2020. This has resulted in yearly savings of $2.6-3.2 million, a significant reduction in the number of delayed departures and in average delays, and a 3% reduction in fuel costs and emissions. This toolbox relies on some of the latest advances in machine learning for forecasting and in analytics approaches to revenue management. The potential for generalizing to the global ferry industry is significant, with an impact on both revenues and ESG criteria.
Fairness by design in shared-energy allocation problems
2024, Article, Preprint
Z Fornier, V Leclère, P Pinson
preprint, under review
Publication year: 2024
This paper studies how to aggregate prosumers (or large consumers) and their collective decisions in electricity markets, with a focus on fairness. Fairness is essential for prosumers to participate in aggregation schemes. Some prosumers may not be able to access the energy market directly, even though it would be beneficial for them. Therefore, new companies offer to aggregate them and promise to treat them fairly. This leads to a fair resource allocation problem. We propose to use acceptability constraints to guarantee that each prosumer gains from the aggregation. Moreover, we aim to distribute the costs and benefits fairly, taking into account the multi-period and uncertain nature of the problem. Rather than using financial mechanisms to adjust for fairness issues, we focus on various objectives and constraints, within decision problems, that achieve fairness by design. We start from a simple single-period and deterministic model, and then generalize it to a dynamic and stochastic setting using, e.g., stochastic dominance constraints.
Data is missing again -- Reconstruction of power generation data using k-Nearest Neighbors and spectral graph theory
2024, Article, Preprint
A. Pierrot, P. Pinson
preprint, under review
Publication year: 2024
The risk of missing data and subsequent incomplete data records at wind farms increases with the number of turbines and sensors. We propose here an imputation method that blends data-driven concepts with expert knowledge, by using the geometry of the wind farm in order to provide better estimates when performing nearest-neighbour imputation. Our method relies on learning Laplacian eigenmaps from the graph of the wind farm through spectral graph theory. These learned representations can be based on the wind farm layout only, or additionally account for information provided by collected data. The related weighted graph is allowed to change with time and can be tracked in an online fashion. Application to the Westermost Rough offshore wind farm shows significant improvement over approaches that do not account for the wind farm layout information.
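The layout-only variant of this idea can be sketched as follows, under assumptions that are not taken from the paper: Gaussian-kernel edge weights between turbine coordinates, the unnormalized graph Laplacian, and a plain average over the k nearest observed turbines in the embedded space. The function names `laplacian_eigenmaps` and `knn_impute` and the `bandwidth` parameter are hypothetical.

```python
import numpy as np


def laplacian_eigenmaps(coords, n_components=2, bandwidth=1.0):
    """Embed turbines using eigenvectors of the graph Laplacian built from
    the farm layout (assumed Gaussian-kernel weights)."""
    # Pairwise squared distances between turbine positions.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized Laplacian D - W
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    # Skip the first (constant) eigenvector, keep the next n_components.
    return eigvecs[:, 1:n_components + 1]


def knn_impute(values, embedding, k=2):
    """Fill NaN entries with the mean of the k nearest observed turbines,
    where nearness is measured in the embedded space."""
    filled = values.copy()
    observed = ~np.isnan(values)
    k_eff = min(k, int(observed.sum()))
    for i in np.flatnonzero(~observed):
        dist = np.linalg.norm(embedding - embedding[i], axis=1)
        dist[~observed] = np.inf  # never borrow from other missing turbines
        nearest = np.argsort(dist)[:k_eff]
        filled[i] = values[nearest].mean()
    return filled
```

Here `values` holds one measurement per turbine at a single time step. Letting the weights depend on collected data, or updating them online as the abstract describes, would require replacing the fixed kernel in `laplacian_eigenmaps`.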
Privacy-preserving convex optimization: When differential privacy meets stochastic programming
2023, Article, Journal paper, Preprint
V. Dvorkin, F. Fioretto, P. Van Hentenryck, P. Pinson, J. Kazempour
preprint, under review
Publication year: 2023
Convex optimization finds many real-life applications, where – optimized on real data – optimization results may expose private data attributes (e.g., individual health records, commercial information, etc.), thus leading to privacy breaches. To avoid these breaches and formally guarantee privacy to optimization data owners, we develop a new privacy-preserving perturbation strategy for convex optimization programs by combining stochastic (chance-constrained) programming and differential privacy. Unlike standard noise-additive strategies, which perturb either optimization data or optimization results, we express the optimization variables as functions of the random perturbation using linear decision rules; we then optimize these rules to accommodate the perturbation within the problem’s feasible region by enforcing chance constraints. This way, the perturbation is feasible and makes different, yet adjacent in the sense of a given distance function, optimization datasets statistically similar in randomized optimization results, thereby enabling probabilistic differential privacy guarantees. The chance-constrained optimization additionally internalizes the conditional value-at-risk measure to model the tolerance towards the worst-case realizations of the optimality loss with respect to the non-private solution. We demonstrate the privacy properties of our perturbation strategy analytically and through optimization and machine learning applications.
Privacy-aware data acquisition under data similarity in regression markets
2023, Article, Preprint
S Pandey, P. Pinson, P. Popovski
preprint, under review
Publication year: 2023
Data markets facilitate decentralized data exchange for applications such as prediction, learning, or inference. The design of these markets is challenged by varying privacy preferences as well as data similarity among data owners. Related works have often overlooked how data similarity impacts pricing and data value through statistical information leakage. We demonstrate that data similarity and privacy preferences are integral to market design and propose a query-response protocol using local differential privacy for a two-party data acquisition mechanism. In our regression data market model, we analyze strategic interactions between privacy-aware owners and the learner as a Stackelberg game over the asked price and privacy factor. Finally, we numerically evaluate how data similarity affects market participation and traded data value.
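For readers unfamiliar with local differential privacy, the following minimal sketch shows a standard Laplace mechanism applied by a data owner before responding to a query. It is a generic textbook construction rather than the paper's query-response protocol; the function name `ldp_response`, the bounds `lower` and `upper`, and the use of `epsilon` as the privacy factor are assumptions made for illustration.

```python
import numpy as np


def ldp_response(value, epsilon, lower=0.0, upper=1.0, rng=None):
    """Return a noisy response satisfying epsilon-local differential privacy.

    The value is clipped to [lower, upper], so any two possible inputs differ
    by at most (upper - lower); Laplace noise with scale
    (upper - lower) / epsilon then masks which input was held.
    """
    rng = np.random.default_rng() if rng is None else rng
    clipped = np.clip(value, lower, upper)
    scale = (upper - lower) / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=np.shape(clipped))
    return clipped + noise
```

A smaller `epsilon` means stronger privacy and noisier responses, which is the kind of trade-off between privacy preference and data value that a pricing mechanism has to account for.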