RL | ML | ALGO TRADING | TRANSPORTATION | GAME THEORY

1. Introduction

2. Concept of Elasticity

3. Measuring Elasticity: Linear Regression

4. Example: Data & Code

5. Dynamic Pricing in Competition: Game Theory

6. Nash Equilibrium

One of the biggest challenges in e-commerce is using data mining methods to improve dynamic pricing policies. Products sold online usually face *severe competition*, either from the same products hosted on other e-commerce websites or from products of different brands.

In this blog, I will try to model this severe competition as a formal two-player strategic game and apply one of the most famous game-theoretic solution concepts…

We introduced the Nash equilibrium solution concept in the previous blog. In this blog we will start with a continuous-action example and then discuss the applicability of Nash equilibrium in mixed strategies.

A mixed strategy is one in which a player chooses actions stochastically (i.e., instead of choosing a single action, the player chooses a probability distribution over actions).
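As a rough illustration of the idea, here is a minimal sketch that computes a fully mixed equilibrium of a 2x2 game from the indifference condition: each player mixes so that the opponent is indifferent between their two actions. The function name `mixed_equilibrium_2x2` and the matching-pennies payoffs are assumptions made for this example, not something taken from the original post.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(A, B):
    """Fully mixed equilibrium of a 2x2 game (hypothetical helper).

    A[i][j] is the row player's payoff and B[i][j] the column player's
    payoff when row plays action i and column plays action j.
    Returns (p, q): the probabilities with which the row and column
    players choose their first action.
    """
    # Row mixes with p so that column is indifferent between its columns:
    # p*B00 + (1-p)*B10 = p*B01 + (1-p)*B11
    denom_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    # Column mixes with q so that row is indifferent between its rows:
    # q*A00 + (1-q)*A01 = q*A10 + (1-q)*A11
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        raise ValueError("indifference condition has no unique solution")
    p = Fraction(B[1][1] - B[1][0], denom_p)
    q = Fraction(A[1][1] - A[0][1], denom_q)
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no fully mixed equilibrium for these payoffs")
    return p, q


# Matching pennies (assumed payoffs): row wins on a match, column on a mismatch.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
p, q = mixed_equilibrium_2x2(A, B)
print(p, q)
```

In matching pennies no single action is a best response to itself, so the only equilibrium has both players randomising evenly between heads and tails.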

Let’s try to understand how self-interested players might behave when resources are scarce. Imagine there are **n** fertilizer manufacturing companies around a freshwater lake, each choosing how much to produce. Each manufacturing company degrades some amount of fresh water in…

Let’s introduce the idea of a solution concept in this section. So far we have focused on representing the payoffs for the different combinations of players’ decisions in a strategic environment. These representations are useless until we apply some model to predict the decision of a given player, considering the anticipated decisions of the other rational players. We call such a model a solution concept.

For example, Pareto optimality is a solution concept. **Pareto optimality** is a situation that cannot be modified so as to make any one individual better off without making at least one individual worse off. …
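To make the definition concrete, the sketch below lists the Pareto-optimal outcomes of a small game by checking, for every outcome, whether some other outcome makes at least one player better off and nobody worse off. The helper `pareto_optimal` and the prisoner's-dilemma payoffs are assumptions chosen for illustration only.

```python
def pareto_optimal(outcomes):
    """Return the labels of outcomes that are not Pareto-dominated.

    `outcomes` maps a label to a tuple of payoffs, one per player.
    (Hypothetical helper written for this example.)
    """
    def dominates(u, v):
        # u dominates v if nobody is worse off and somebody is strictly better off.
        return (all(a >= b for a, b in zip(u, v))
                and any(a > b for a, b in zip(u, v)))

    return [label for label, u in outcomes.items()
            if not any(dominates(v, u) for v in outcomes.values())]


# Assumed prisoner's dilemma payoffs: C = cooperate, D = defect.
prisoners_dilemma = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}
print(pareto_optimal(prisoners_dilemma))
```

With these payoffs, mutual defection is the only outcome that is not Pareto optimal, because mutual cooperation makes both players better off.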

This is an advanced theoretical blog that focuses on one of the most intriguing and complex aspects of policy gradient algorithms. The reader is assumed to have a basic understanding of policy gradient algorithms, a popular class of reinforcement learning algorithms that estimate the gradient of the expected return with respect to the parameters of a function approximator. You can refer to Chapter 13 of *Reinforcement Learning: An Introduction* to understand policy gradient algorithms.

In the policy gradient setup, the idea is to directly parameterise the policy. The optimal policy is the policy with the highest value function. This is easier and certainly different from value-based methods, where we first find…
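As a small illustration of what directly parameterising the policy can look like, here is a minimal sketch of a softmax policy trained with a REINFORCE-style update on a toy bandit. The softmax parameterisation, the function names, the bandit rewards, and the hyperparameters are all assumptions for this example; the blog itself may use a different setup.

```python
import math
import random


def softmax_policy(theta):
    """Action probabilities pi(a | theta) for a softmax over preferences."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_pi(theta, action):
    """Gradient of log pi(action | theta): one-hot(action) minus pi."""
    probs = softmax_policy(theta)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def reinforce_bandit(rewards, steps=500, lr=0.1, seed=0):
    """REINFORCE-style updates on a bandit with fixed rewards per arm."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax_policy(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        grad = grad_log_pi(theta, action)
        # Move theta along reward * grad log pi (no baseline, for brevity).
        theta = [t + lr * rewards[action] * g for t, g in zip(theta, grad)]
    return softmax_policy(theta)


# Assumed toy problem: arm 1 pays 1, arm 0 pays 0.
print(reinforce_bandit([0.0, 1.0]))
```

The policy parameters are adjusted directly, without first estimating a value for each action, and the probability of the better arm grows as training proceeds.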

**NLP Zero to One: Basics (Part 1/30)**

**NLP Zero to One : Sparse Document Representations (Part 2/30)**

**NLP Zero to One: Deep Learning Theory Basics (Part 3/30)**

**NLP Zero to One: Deep Learning Training Procedure (Part 4/30)**

**NLP Zero to One: Dense Representations, Word2Vec (Part 5/30)**

**NLP Zero to One: Count based Embeddings, GloVe (Part 6/30)**

**NLP Zero to One: Training Embeddings using Gensim and Visualisation (Part 7/30)**

**NLP Zero to One: Recurrent Neural Networks Basics Part(8/30)**

**NLP Zero to One: LSTM Part(9/30)**

**NLP Zero to One: Bi-Directional LSTM Part(10/30)**

**NLP Theory and Code: Encoder-Decoder Models (Part 11/30)**

**NLP Zero…**

Thanks for responding. It's h (Planck's constant) / (2*pi): 6.6/(2*3.14) ~ 1.05! I will leave a comment in the code.

Hey, thanks for the suggestion. I tried to type it in LaTeX, but it was too much effort, so I resorted to writing with a stylus. Anyway, I will try to redo the blog with LaTeX equations.