Back in the euphoric early days of the Web, people liked to claim that much of the enormous potential in a company like Yahoo! was in the “eyeballs”-the simple fact that millions of people look at its pages every day. Further, by convincing people to register personal data with the site, a site like Yahoo! can show each user an extremely targeted advertisement whenever he or she visits the site, in a way that TV networks or magazines couldn’t hope to match. So if a user has told Yahoo! that he or she is a 20-year-old computer science major from Cornell University, the site can present a banner ad for apartments in Ithaca, New York:on the other hand, if he or she is a 50-year-old investment banker from Greenwich, Connecticut, the site can display a banner ad pitching Lincoln Town Cars instead. But deciding on which ads to show to which people involves some serious computation behind the scenes. Suppose that the managers of a popular Web site have identified k distinct demographic groups G1, G2, …, Gk. (These groups can overlap:for example, G1 can be equal to all residents of New York State, and G2 can be equal to all people with a degree in computer science.) The site has contracts with m different advertisers, to show a certain number of copies of their ads to users of the site. Here’s what the contract with the ith advertiser looks like. For a subset X_i {G_1, …, G_k} of the demographic groups, advertiser i wants its ads shown only to users who belong to at least one of the demographic groups in the set Xi. For a number ri, advertiser i wants its ads shown to at least ri users each minute. Now consider the problem of designing a good advertising policy-a way to show a single ad to each user of the site. Suppose at a given minute, there are n users visiting the site. Because we have registration information on each of these users, we know that user j (for j = 1, 2, n) belongs to a subset U_j (G_j, …, G_k} of the demographic groups. The problem is:Is there a way to show a single ad to each user so that the site’s contracts with each of the m advertisers is satisfied for this minute? (That is, for each i = 1, 2,, m, can at least ri of the n users, each belonging to at least one demographic group in Xi, be shown an ad provided by advertiser i?) Give an efficient algorithm to decide if this is possible, and if so, to actually choose an ad to show each user.
Expert Answer
We add the following vertices to our graph:
- For each advertiser-i, we add a vertex ai.
- For each User-i, we add a vertex ui.
- For each group-i, we add a vertex gi.
- Add a source vertex ‘s’ and sink vertex ‘d’.
Now, we add directed edges to the graph:
- For every advertiser, we add an edge from s to ai, with weight (capacity), ri. Meaning that we dont want more than ri ads to be shown for advertiser-i. ( The problem does not specify ri as sthe upper limit but since, we are trying to find out if there’s an assignment where at least ri ads can be served for the advertiser, servinging ri ads is sufficient and hence we are limiting this to ri)
- For every advertiser-i and group-j , we add an edge from ai to gj, with weight ri, if Gj
Xi. This means that the flow of ads can only go into groups where advertiser wants to show ads. Again since the upper limit of ads for advertiser-i is ri, the upper limit of ads for advertiser-i shown to any group will have an upper limit of ri.
- For every user-i and group-j , we add an edge from gj to ui, with weight 1, if Gj
Ui. This means that the flow of ads can move from groups to user only if the user belongs to that group. And since every user can be shown a single ad, we have set the capacity to 1.
- For every user-i, we add an edge from ui to d, with weight 1. To finalize our flow.
This is what the graph will look like:
Now we solve the max-flow problem for this graph with source as s and sink as d. If the max flow is = say, R, for every advertiser-i, then at least ri users can be assigned to each advertiser-i.
Here’s the algorithm to assign users to the advertisers:
Create a graph G as per instructions above
Run ford-fulkerson’s algorithm on graph G. If the max flow is computed as R, a satisfactory assignment is possible
for each advertiser-i, flow between s and ai will be ri (as the max flow is computed as R)
for every and group-i and user-j:
for each advertiser-i:
for every and group-j:
f = flow(ai,gj) // flow from ai to gj as computed by the ford fulkerson’s algorithm
there will at least be f unassigned users such that flow(gj,uk) = 1, assign these f users to advertiser-i, and mark them assigned
——————————————————————–
Running time of the algorithm = running time of the ford-fulkerson algorithm = O(E*f).
Where f is the maximum outgoing flow at any edge which in this case is R, the sum of all ri’s, hence f =
E is the number of edges (= number of advertisers(say a) + number of users(say u) + sum of number of groups that each user is a part of + sum of number of groups that each advertiser is targeting) which is:
Hence the time complexity is: where a is the number of advertisers and u, the number of users.