# State-Value and Action-Value Functions

<strong><em>State-Value Function</em></strong> ![v_{\pi}(s)](https://latex.codecogs.com/svg.latex?v_{\pi}(s)) : It is the expected return when starting in state ![s](https://latex.codecogs.com/svg.latex?s) and following policy ![\pi](https://latex.codecogs.com/svg.latex?\pi) thereafter.

```{math}
v_{\pi} (s) = \mathbb{E}_{\pi} \left[  G_t | S_t = s \right]
```
```{math}
\implies v_{\pi} (s) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + ... | S_t = s \right]
```
```{math}
\implies v_{\pi} (s) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma G_{t+1} | S_t = s, A_t \sim \pi (s) \right]
```
```{math}
\therefore v_{\pi} (s) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma v_{\pi} (S_{t+1}) | S_t = s, A_t \sim \pi (s) \right]
```
This is known as the <b>Bellman expectation equation</b> for the state-value function.
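
The step in the derivation above that replaces the discounted sum of rewards with the one-step form (immediate reward plus the discounted return from the next step) can be checked numerically on a single sampled episode. The sketch below is only an illustration: the function name `discounted_return`, the reward list, and the discount factor are all made up for this example.

```python
def discounted_return(rewards, gamma):
    """Compute G_t from [R_{t+1}, R_{t+2}, ...] using G_t = R_{t+1} + gamma * G_{t+1}."""
    g = 0.0
    # Walk backwards so that g always holds the return from the next step.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical finite episode: rewards R_{t+1}, R_{t+2}, R_{t+3}.
rewards = [1.0, 0.0, 2.0]
gamma = 0.5

# Direct definition: sum over k of gamma^k * R_{t+k+1}.
direct = sum(gamma ** k * r for k, r in enumerate(rewards))
recursive = discounted_return(rewards, gamma)

print(direct, recursive)
```

Both computations agree, which is exactly the identity the derivation relies on before the expectation is taken.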

Writing the expectation out as explicit sums over actions, next states, and rewards gives the equivalent form:
```{math}
\therefore v_{\pi} (s) = \sum_{a \in A(s)} \pi (a | s) \sum_{s' \in S, r \in R} p (s', r | s, a) (r + \gamma v_{\pi} (s'))
```
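
Because the summed form above expresses the value of each state in terms of the values of its successors, it can be used as an update rule and applied repeatedly until the values stop changing (iterative policy evaluation). The following is a minimal sketch under stated assumptions: the two-state MDP `P`, the uniform random policy `PI`, the discount `GAMMA`, and the helper names are hypothetical and chosen only to keep the example small.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical dynamics: P[s][a] is a list of (probability, next_state, reward),
# playing the role of p(s', r | s, a).
P = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "go": [(1.0, "s0", 0.0)],
    },
}

# Hypothetical policy: PI[s][a] plays the role of pi(a | s).
PI = {
    "s0": {"stay": 0.5, "go": 0.5},
    "s1": {"stay": 0.5, "go": 0.5},
}


def bellman_backup(v, s):
    """Right-hand side of the Bellman expectation equation for state s."""
    return sum(
        PI[s][a] * sum(prob * (r + GAMMA * v[s2]) for prob, s2, r in P[s][a])
        for a in P[s]
    )


def evaluate_policy(tol=1e-10):
    """Apply the backup to every state until the largest change is below tol."""
    v = {s: 0.0 for s in P}
    while True:
        new_v = {s: bellman_backup(v, s) for s in P}
        delta = max(abs(new_v[s] - v[s]) for s in P)
        v = new_v
        if delta < tol:
            return v


v_pi = evaluate_policy()
for state, value in v_pi.items():
    print(state, round(value, 4))
```

With a discount factor below 1 the backup is a contraction, so the loop terminates; the returned values satisfy the equation up to the chosen tolerance.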

<strong><em>Action-Value Function</em></strong> ![q_{\pi}(s,a)](https://latex.codecogs.com/svg.latex?q_{\pi}(s,a)) : It is the expected return when starting in state ![s](https://latex.codecogs.com/svg.latex?s), taking action ![a](https://latex.codecogs.com/svg.latex?a), and following policy ![\pi](https://latex.codecogs.com/svg.latex?\pi) thereafter.

```{math}
q_{\pi} (s,a) = \mathbb{E}_{\pi} \left[  G_t | S_t = s, A_t = a \right]
```
```{math}
\implies q_{\pi} (s,a) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + ... | S_t = s, A_t = a \right]
```
```{math}
\implies q_{\pi} (s,a) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma G_{t+1} | S_t = s, A_t = a \right]
```
```{math}
\therefore q_{\pi} (s,a) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma v_{\pi} (S_{t+1}) | S_t = s, A_t = a \right]
```
```{math}
\text{or,} \, q_{\pi} (s,a) = \mathbb{E}_{\pi} \left[  R_{t+1} + \gamma q_{\pi} (S_{t+1}, A_{t+1}) | S_t = s, A_t = a \right]
```

This is known as the <b>Bellman expectation equation</b> for the action-value function.

Writing the expectation out as explicit sums over next states, rewards, and next actions gives the equivalent form:
```{math}
\therefore q_{\pi} (s, a) = \sum_{s' \in S, r \in R} p ( s', r | s, a ) \left( r + \gamma \sum_{a' \in A(s')} \pi (a'|s') \, q_{\pi} (s', a') \right)
```
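
The summed form for the action-value function can be iterated in the same way, this time over state-action pairs. The sketch below is illustrative only: the two-state MDP `P`, the policy `PI`, the discount `GAMMA`, and the helper names are hypothetical. It also recovers state values as the policy-weighted average of the action values, which is the inner sum over next actions in the equation above.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical dynamics: P[s][a] is a list of (probability, next_state, reward).
P = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "go": [(1.0, "s0", 0.0)],
    },
}

# Hypothetical policy: PI[s][a] plays the role of pi(a | s).
PI = {
    "s0": {"stay": 0.5, "go": 0.5},
    "s1": {"stay": 0.5, "go": 0.5},
}


def q_backup(q, s, a):
    """Right-hand side of the Bellman expectation equation for the pair (s, a)."""
    return sum(
        prob * (r + GAMMA * sum(PI[s2][a2] * q[(s2, a2)] for a2 in P[s2]))
        for prob, s2, r in P[s][a]
    )


def evaluate_q(tol=1e-10):
    """Apply the backup to every (s, a) pair until the largest change is below tol."""
    q = {(s, a): 0.0 for s in P for a in P[s]}
    while True:
        new_q = {(s, a): q_backup(q, s, a) for (s, a) in q}
        delta = max(abs(new_q[k] - q[k]) for k in q)
        q = new_q
        if delta < tol:
            return q


q_pi = evaluate_q()

# State values recovered from action values: v(s) = sum over a of pi(a | s) * q(s, a).
v_pi = {s: sum(PI[s][a] * q_pi[(s, a)] for a in P[s]) for s in P}

for (state, action), value in q_pi.items():
    print(state, action, round(value, 4))
```

Unlike the state values, the table `q_pi` shows directly which action is better in each state under the current policy, which is why action values are the usual starting point for policy improvement.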