Understanding solution to finding optimal strategy for game involving picking pots of gold

Shared by user 倾尽年华终是梦.

I am having trouble understanding the reasoning behind the solution to this question on CareerCup.

Pots of gold game: Two players, A and B. There are pots of gold arranged in a line, each containing some gold coins (the players can see how many coins there are in each pot - perfect information). They take alternating turns in which the player can pick a pot from one of the ends of the line. The winner is the player who has the higher number of coins at the end. The objective is to "maximize" the number of coins collected by A, assuming B also plays optimally. A starts the game.

The idea is to find an optimal strategy that makes A win knowing that B is playing optimally as well. How would you do that?

At the end I was asked to code this strategy!

This was a question from a Google interview.

The proposed solution is:

def max_coin(coin, start, end):
    # Best total the player to move can collect from coin[start..end]
    if start > end:
        return 0

    # I DON'T UNDERSTAND THESE NEXT TWO LINES
    a = coin[start] + min(max_coin(coin, start + 2, end), max_coin(coin, start + 1, end - 1))
    b = coin[end] + min(max_coin(coin, start + 1, end - 1), max_coin(coin, start, end - 2))

    return max(a, b)

There are two specific sections I don't understand:

1. In the first line, why do we use the ranges [start + 2, end] and [start + 1, end - 1]? It's always leaving out one coin jar. Shouldn't it be [start + 1, end], because we took the starting coin jar out?
2. In the first line, why do we take the minimum of the two results and not the maximum?

Because I'm confused about why the two lines take the minimum and why we choose those specific ranges, I'm not really sure what a and b actually represent.

Solution

a and b here represent the maximum A can get by picking the starting pot or the ending pot, respectively.

We're actually trying to maximize A-B, but since B = TotalGold - A, we're trying to maximize 2A - TotalGold, and since TotalGold is constant, we're trying to maximize 2A, which is the same as A, so we completely ignore the values of B's picks and just work with A.
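As a quick sanity check of that argument, here is a minimal Python sketch. `max_coin` is a Python transcription of the proposed recurrence, and `best_margin` is a hypothetical helper (not part of the proposed solution) that maximizes the mover's lead A - B directly. If the reasoning holds, the best lead should come out equal to 2A - TotalGold.

```python
def max_coin(coin, start, end):
    """Proposed recurrence: best total for the player to move on coin[start..end]."""
    if start > end:
        return 0
    a = coin[start] + min(max_coin(coin, start + 2, end), max_coin(coin, start + 1, end - 1))
    b = coin[end] + min(max_coin(coin, start + 1, end - 1), max_coin(coin, start, end - 2))
    return max(a, b)


def best_margin(coin, start, end):
    """Hypothetical helper: largest (mover's coins - opponent's coins) on coin[start..end]."""
    if start > end:
        return 0
    # After the mover takes a pot, the opponent becomes the mover on the rest,
    # so the opponent's best margin is subtracted from the pot just taken.
    return max(coin[start] - best_margin(coin, start + 1, end),
               coin[end] - best_margin(coin, start, end - 1))


pots = [3, 9, 1, 2]
a_best = max_coin(pots, 0, len(pots) - 1)       # 11
margin = best_margin(pots, 0, len(pots) - 1)    # 7
print(a_best, margin, 2 * a_best - sum(pots))   # 11 7 7
```

Both formulations pick out the same play here, which is why the proposed code can get away with tracking only A's coins.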

The updated parameters in the recursive calls account for B's pick as well. In the first line, coin[start] represents A taking the start pot. If B then also takes from the start, the remaining range is [start+2, end]; if B takes from the end instead, it is [start+1, end-1]. The second line works the same way after A takes the end pot: B's pick leaves either [start+1, end-1] or [start, end-2].

We're taking the min because B will try to maximize its own profit, so it will pick the choice that minimizes A's profit.
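One way to see where the ranges and the min come from is to write B's turn out explicitly instead of folding it into A's. The sketch below is an illustration under that assumption: `a_total` is a hypothetical helper that alternates turns, adding a pot to A's total on A's turn and letting B choose the branch that is worst for A on B's turn. `max_coin` is the proposed recurrence transcribed to Python.

```python
def max_coin(coin, start, end):
    """Proposed recurrence: A's best total on coin[start..end] with A to move."""
    if start > end:
        return 0
    a = coin[start] + min(max_coin(coin, start + 2, end), max_coin(coin, start + 1, end - 1))
    b = coin[end] + min(max_coin(coin, start + 1, end - 1), max_coin(coin, start, end - 2))
    return max(a, b)


def a_total(coin, start, end, a_to_move):
    """Hypothetical explicit version: A's final total from coin[start..end]."""
    if start > end:
        return 0
    if a_to_move:
        # A takes one end and keeps its coins, then it is B's turn.
        return max(coin[start] + a_total(coin, start + 1, end, False),
                   coin[end] + a_total(coin, start, end - 1, False))
    # B takes one end (nothing is added to A's total). The coins left are
    # fixed, so B maximizing its own share means minimizing A's.
    return min(a_total(coin, start + 1, end, True),
               a_total(coin, start, end - 1, True))


pots = [3, 9, 1, 2]
print(max_coin(pots, 0, len(pots) - 1))        # 11
print(a_total(pots, 0, len(pots) - 1, True))   # 11
```

Substituting the B branch of `a_total` into the A branch gives exactly the two lines in question: one pot removed by A, one more removed by B, and a min over B's two options.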

But actually I'd say this solution is lacking a bit in the sense that it just returns a single value, not 'an optimal strategy', which, in my mind, would be a sequence of moves. And it also doesn't take into account the possibility that A can't win, in which case one might want to output a message saying that it's not possible, but this would really be something to clarify with the interviewer.
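To make that concrete, here is a rough sketch of how the sequence of moves could be recovered. It is not from the original solution: `optimal_play` and `take_start_value` are hypothetical names, and memoizing with `functools.lru_cache` is my own addition to avoid recomputing sub-ranges. It relies on the game being symmetric, so the same recurrence gives the best total for whichever player is about to move.

```python
from functools import lru_cache


def optimal_play(coin):
    """Return (a_total, b_total, moves); moves is a list of (player, pot_index)."""
    pots = tuple(coin)

    @lru_cache(maxsize=None)
    def max_coin(start, end):
        # Best total for whoever moves first on pots[start..end].
        if start > end:
            return 0
        a = pots[start] + min(max_coin(start + 2, end), max_coin(start + 1, end - 1))
        b = pots[end] + min(max_coin(start + 1, end - 1), max_coin(start, end - 2))
        return max(a, b)

    def take_start_value(start, end):
        # Mover's total if the mover takes the start pot and both play optimally after.
        return pots[start] + min(max_coin(start + 2, end), max_coin(start + 1, end - 1))

    totals = {"A": 0, "B": 0}
    moves = []
    start, end = 0, len(pots) - 1
    player = "A"
    while start <= end:
        # Take the start pot if that achieves the optimum, otherwise the end pot.
        if take_start_value(start, end) == max_coin(start, end):
            pick, start = start, start + 1
        else:
            pick, end = end, end - 1
        totals[player] += pots[pick]
        moves.append((player, pick))
        player = "B" if player == "A" else "A"
    return totals["A"], totals["B"], moves


a_sum, b_sum, moves = optimal_play([3, 9, 1, 2])
print(a_sum, b_sum, moves)   # 11 4 [('A', 3), ('B', 0), ('A', 1), ('B', 2)]
print("A wins" if a_sum > b_sum else "A cannot win")
```

Comparing the two totals at the end also covers the case where A cannot win: for pots [1, 5, 1], for example, this sketch reports 2 for A and 5 for B.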
