-
Neumann, N.M.P. (author), de Heer, P.B.U.L. (author), Phillipson, F. (author) In this paper, we present implementations of an annealing-based and a gate-based quantum computing approach for finding the optimal policy to traverse a grid and compare them to a classical deep reinforcement learning approach. We extended these three approaches by allowing for stochastic actions instead of deterministic actions and by... article 2023
-
Arango, A.R. (author), Aguilar, J. (author), R-Moreno, M.D. (author) Hydro-thermal economic dispatch is a widely analyzed energy optimization problem, which seeks to make the best use of available energy resources to meet demand at minimum cost. This problem has great complexity in its solution due to the uncertainty of multiple parameters. In this paper, we view hydro-thermal economic dispatch as a multistage... article 2023
-
Coutino, M. (author), Uysal, F. (author) Recently, it has been shown that reinforcement learning (RL) is able to solve decision-based problems through a series of action-observation-reward cycles. In this paper, we pose the problem of constrained waveform optimization as a sequential decision problem and show how it can be solved by an RL agent. The proposed RL-based method is an... conference paper 2023
-
Albers, N. (author), Neerincx, M.A. (author), Brinkman, W.P. (author) Behavior change applications often assign their users activities such as tracking the number of smoked cigarettes or planning a running route. To help a user complete these activities, an application can persuade them in many ways. For example, it may help the user create a plan or mention the experience of peers. Intuitively, the application... article 2022
-
Pingen, G.L.J. (author), van Ommeren, C.R. (author), van Leeuwen, C.J. (author), Fransen, R.W. (author), Elfrink, T. (author), de Vries, Y.C. (author), Karunakaran, J. (author), Demirovic, E. (author), Yorke-Smith, N. (author) Logistics planning is a complex optimization problem involving multiple decision makers. Automated scheduling systems offer support to human planners; however, state-of-the-art approaches often employ a centralized control paradigm. While these approaches have shown great value, their application is hindered in dynamic settings with no central... conference paper 2022
-
Dimitrovski, T. (author), Chiscop, I. (author), Pileggi, P. (author), Panneman, J. (author) Anticipating a highly customized and flexible virtual service portfolio of 6G networks, unprecedented complexity will challenge network manageability if left to human operators. Automation and autonomy are key enablers of next generation cloud networks. With this in mind, we present a demonstrator of ML-assisted user plane resource management... conference paper 2022
- Arora, A. (author), Dimitrovski, T. (author), Litjens, R. (author), Zhang, H. (author) conference paper 2021
-
Albers, N. (author), Neerincx, M.A. (author), Brinkman, W.P. (author) There is certainly some behavior that you want to change. Maybe you want to become more physically active, call your mother more often or snack less when watching TV at night. Let’s assume that you want to quit smoking. You are not doing this alone, but are supported by your coach Hannah. Hannah constantly persuades you to stick to your... conference paper 2021
-
van Zoelen, E.M. (author), Cremers, A.H.M. (author), Dignum, F.P.M. (author), van Diggelen, J. (author), Peeters, M.M. (author) Artificially intelligent agents increasingly collaborate with humans in human-agent teams. Timely proactive sharing of relevant information within the team contributes to the overall team performance. This paper presents a machine learning approach to proactive communication in AI-agents using contextual factors. Proactive communication was... conference paper 2020
-
Neumann, N.M.P. (author), de Heer, P.B.U.L. (author), Chiscop, I. (author), Phillipson, F. (author) With quantum computers still under heavy development, numerous quantum machine learning algorithms have already been proposed for both gate-based quantum computers and quantum annealers. Recently, a quantum annealing version of a reinforcement learning algorithm for grid-traversal using one agent was published. We extend this work based on... conference paper 2020
- de Heer, P.B.U.L. (author), de Reus, N.M. (author), Tealdi, L. (author), Kerbusch, P.J.M. (author) conference paper 2019
-
de Heer, P.B.U.L. (author), de Reus, N.M. (author), Tealdi, L. (author), Kerbusch, P.J.M. (author) The density, diversity, connectedness and scale of urban environments make military operations challenging. This paper shows that different artificial intelligence techniques can be combined to provide the commander with various forms of intelligence augmentation and to support the decision-making process. A warfare model has been developed where... conference paper 2019
-
Grappiolo, C. (author), van Gerwen, M.J.A.M. (author), Verhoosel, J.P.C. (author), Somers, L. (author) The booming popularity of data science is also affecting high-tech industries. However, since these usually have different core competencies - building cyber-physical systems rather than e.g. machine learning or data mining algorithms - delving into data science by domain experts such as system engineers or architects might be more cumbersome... conference paper 2019
-
Grappiolo, C. (author), Verhoosel, J. (author), van Gerwen, E. (author), Somers, L. (author) The booming popularity of data science is also affecting high-tech industries. However, since these usually have different core competencies - building cyber-physical systems rather than e.g. machine learning or data mining algorithms - delving into data science by domain experts such as system engineers or architects might be more cumbersome... conference paper 2018
-
van Seijen, H.H. (author), Whiteson, S. (author), Kester, L.J.H.M. (author) This article addresses reinforcement learning problems based on factored Markov decision processes (MDPs) in which the agent must choose among a set of candidate abstractions, each built up from a different combination of state components. We present and evaluate a new approach that can perform effective abstraction selection that is more... article 2014
-
van Seijen, H.H. (author), Whiteson, S. (author), van Hasselt, H. (author), Wiering, M. (author) This article presents and evaluates best-match learning, a new approach to reinforcement learning that trades off the sample efficiency of model-based methods with the space efficiency of model-free methods. Best-match learning works by approximating the solution to a set of best-match equations, which combine a sparse model with a model-free Q... article 2011
- van Seijen, H.H. (author) doctoral thesis 2011
-
van Seijen, H.H. (author), van Hasselt, H. (author), Whiteson, S. (author), Wiering, M. (author), TNO Defensie en Veiligheid (author) This paper presents a theoretical and empirical analysis of Expected Sarsa, a variation on Sarsa, the classic on-policy temporal-difference method for model-free reinforcement learning. Expected Sarsa exploits knowledge about stochasticity in the behavior policy to perform updates with lower variance. Doing so allows for higher learning rates and... conference paper 2009
-
van Seijen, H.H. (author), Bakker, B. (author), Kester, L.J.H.M. (author), TNO Defensie en Veiligheid (author) This paper proposes a reinforcement learning architecture containing multiple "experts", each of which is a specialist in a different region of the overall state space. The central idea is that the different experts use qualitatively different (but sufficiently Markov) state representations, each of which captures different information regarding... conference paper 2008