An extensible approach for real-time bidding with model-free reinforcement learning

Yin Cheng; Luobao Zou; Zhiwei Zhuang; Jingwei Liu; Bin Xu; Weidong Zhang

doi:10.1016/j.neucom.2019.06.009

School of Automation

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

In this paper, we propose an extensible framework for model-free reinforcement learning (RL) for real-time bidding (RTB) in display advertising. The framework can be applied to simple environments and extended to the comprehensive environment in which the demand-side platform (DSP) bids for multiple advertisers at the same time. To process new information that is collected via real-time interaction with the environment, an extensible model is first introduced, which is based on the distribution of the recharging probability. Substantial effort is devoted to alleviating the sparsity of the click signal through the reward function. The proposed scheme is highly feasible and, in contrast to prior works that assumed the distribution of the feature vectors and the dealing price to be known in advance, can handle dynamic environments. Furthermore, a fund-recharging mechanism is introduced to transform the RTB model into an endless task, which allows the policy to be optimized in a farsighted rather than a myopic manner. Illustrative experiments on both small- and large-scale real datasets demonstrate the state-of-the-art performance of the proposed framework.

Original language	English
Pages (from-to)	97-106
Number of pages	10
Journal	Neurocomputing
Volume	360
DOIs	https://doi.org/10.1016/j.neucom.2019.06.009
State	Published - 30 Sep 2019

Keywords

Deep reinforcement learning
Extensible approach
Model-free
Real-time bidding

Cite this

@article{8fd7320bcf2d4bebb66f8038a7031170,

title = "An extensible approach for real-time bidding with model-free reinforcement learning",

abstract = "In this paper, we propose an extensible framework for model-free reinforcement learning (RL) for real-time bidding (RTB) in display advertising. This framework can be applied into both simple environments and extend to the comprehensive environment that the DSP bids for multiple advertisers at the same time. To process new information that is collected via real-time interaction with the environment, an extensible model is first introduced, which is based on the distribution of the recharging probability. Substantial effort is expended to alleviate the problem of the sparsity of the click signal with the reward function. The proposed scheme has high feasibility and can address dynamic environments in contrast to prior works, which assumed that the distribution of the feature vectors and the dealing price were already known. Furthermore, a fund-recharging mechanism is introduced for transforming the RTB model into an endless task, which allows the policy to be optimized in a farsighted rather than a myopic manner. Illustrative experiments on both the small- and large-scale real datasets demonstrate the state-of-the-art performance of the proposed framework for the issue of interest.",

keywords = "Deep reinforcement learning, Extensible approach, Model-free, Real-time bidding",

author = "Yin Cheng and Luobao Zou and Zhiwei Zhuang and Jingwei Liu and Bin Xu and Weidong Zhang",

note = "Publisher Copyright: {\textcopyright} 2019",

year = "2019",

month = sep,

day = "30",

doi = "10.1016/j.neucom.2019.06.009",

language = "English",

volume = "360",

pages = "97--106",

journal = "Neurocomputing",

issn = "0925-2312",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - An extensible approach for real-time bidding with model-free reinforcement learning

AU - Cheng, Yin

AU - Zou, Luobao

AU - Zhuang, Zhiwei

AU - Liu, Jingwei

AU - Xu, Bin

AU - Zhang, Weidong

PY - 2019/9/30

Y1 - 2019/9/30

N2 - In this paper, we propose an extensible framework for model-free reinforcement learning (RL) for real-time bidding (RTB) in display advertising. This framework can be applied into both simple environments and extend to the comprehensive environment that the DSP bids for multiple advertisers at the same time. To process new information that is collected via real-time interaction with the environment, an extensible model is first introduced, which is based on the distribution of the recharging probability. Substantial effort is expended to alleviate the problem of the sparsity of the click signal with the reward function. The proposed scheme has high feasibility and can address dynamic environments in contrast to prior works, which assumed that the distribution of the feature vectors and the dealing price were already known. Furthermore, a fund-recharging mechanism is introduced for transforming the RTB model into an endless task, which allows the policy to be optimized in a farsighted rather than a myopic manner. Illustrative experiments on both the small- and large-scale real datasets demonstrate the state-of-the-art performance of the proposed framework for the issue of interest.

AB - In this paper, we propose an extensible framework for model-free reinforcement learning (RL) for real-time bidding (RTB) in display advertising. This framework can be applied into both simple environments and extend to the comprehensive environment that the DSP bids for multiple advertisers at the same time. To process new information that is collected via real-time interaction with the environment, an extensible model is first introduced, which is based on the distribution of the recharging probability. Substantial effort is expended to alleviate the problem of the sparsity of the click signal with the reward function. The proposed scheme has high feasibility and can address dynamic environments in contrast to prior works, which assumed that the distribution of the feature vectors and the dealing price were already known. Furthermore, a fund-recharging mechanism is introduced for transforming the RTB model into an endless task, which allows the policy to be optimized in a farsighted rather than a myopic manner. Illustrative experiments on both the small- and large-scale real datasets demonstrate the state-of-the-art performance of the proposed framework for the issue of interest.

KW - Deep reinforcement learning

KW - Extensible approach

KW - Model-free

KW - Real-time bidding

UR - http://www.scopus.com/inward/record.url?scp=85067344390&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2019.06.009

DO - 10.1016/j.neucom.2019.06.009

M3 - Article

AN - SCOPUS:85067344390

SN - 0925-2312

VL - 360

SP - 97

EP - 106

JO - Neurocomputing

JF - Neurocomputing

ER -
