@@ -5,6 +5,9 @@ Quick start guide
55In the following we provide some pointers about which functions and classes
66to use for different problems related to optimal transport (OT).
77
8+ This document is not a tutorial on numerical optimal transport. For that, we strongly
9+ recommend reading the book [15]_.
10+
811
912Optimal transport and Wasserstein distance
1013------------------------------------------
@@ -20,10 +23,11 @@ Solving optimal transport
2023
2124The optimal transport problem between discrete distributions is often expressed
2225as
23- .. math ::
24- \gamma ^* = arg\min _\gamma \quad \sum _{i,j}\gamma _{i,j}M_{i,j}
2526
26- s.t. \gamma 1 = a; \gamma ^T 1 = b; \gamma\geq 0
27+ .. math::
28+     \gamma^* = arg\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j}
29+
30+     s.t. \gamma 1 = a; \gamma^T 1 = b; \gamma\geq 0
2731
2832 where :
2933
@@ -120,8 +124,6 @@ distributions. In this case when the finite sample dataset is supposed gaussian,
120124mapping.
121125
122126
123-
124-
125127Regularized Optimal Transport
126128-----------------------------
127129
@@ -146,6 +148,7 @@ We discuss in the following specific algorithms that can be used depending on
146148the regularization term.
147149
148150
151+
149152Entropic regularized OT
150153^^^^^^^^^^^^^^^^^^^^^^^
151154
@@ -168,23 +171,107 @@ solution of the resulting optimization problem can be expressed as:
168171 \gamma _\lambda ^*=\text {diag}(u)K\text {diag}(v)
169172
170173 where :math:`u` and :math:`v` are vectors and :math:`K=\exp(-M/\lambda)` where
171- the :math:`\exp` is taken component-wise.
174+ the :math:`\exp` is taken component-wise. In order to solve the optimization
175+ problem, one can use an alternating projection algorithm that can be very
176+ efficient for large values of regularization.
177+
178+ The main functions in POT are :any:`ot.sinkhorn` and
179+ :any:`ot.sinkhorn2`, which return respectively the OT matrix and the value of the
180+ linear term. Note that the regularization parameter :math:`\lambda` in the
181+ equation above is passed to those functions as the parameter :code:`reg`.
172182
183+ >>> import ot
184+ >>> a = [.5, .5]
185+ >>> b = [.5, .5]
186+ >>> M = [[0., 1.], [1., 0.]]
187+ >>> ot.sinkhorn(a, b, M, 1)
188+ array([[ 0.36552929,  0.13447071],
189+        [ 0.13447071,  0.36552929]])
173190
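The result above can be checked with a minimal pure-Python sketch of the alternating scaling iterations behind Sinkhorn, which update :math:`u = a / (Kv)` and :math:`v = b / (K^T u)` in turn. The helper name ``sinkhorn_sketch`` and its parameters are hypothetical and chosen for illustration only; this is not the implementation used by :any:`ot.sinkhorn`.

```python
import math


def sinkhorn_sketch(a, b, M, reg, n_iter=1000):
    """Illustrative Sinkhorn scaling (hypothetical helper, not POT's code)."""
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-M / reg), taken component-wise
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so that the row sums of diag(u) K diag(v) match a
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the column sums match b
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # OT matrix gamma = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


gamma = sinkhorn_sketch([.5, .5], [.5, .5], [[0., 1.], [1., 0.]], 1.0)
print(gamma)
```

On this small symmetric problem the entries of ``gamma`` should agree with the array returned by :any:`ot.sinkhorn` above, up to the displayed precision.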
174191
175192
193+ More details about the algorithms used are given in the following note.
194+
195+
196+ .. note::
197+     The main function to solve entropic regularized OT is :any:`ot.sinkhorn`.
198+     This function is a wrapper and the parameter :code:`method` helps you select
199+     the actual algorithm used to solve the problem:
200+
201+     + :code:`method='sinkhorn'` calls :any:`ot.bregman.sinkhorn_knopp`, the
202+       classic algorithm [2]_.
203+     + :code:`method='sinkhorn_stabilized'` calls :any:`ot.bregman.sinkhorn_stabilized`, the
204+       log stabilized version of the algorithm [9]_.
205+     + :code:`method='sinkhorn_epsilon_scaling'` calls
206+       :any:`ot.bregman.sinkhorn_epsilon_scaling`, the epsilon scaling version
207+       of the algorithm [9]_.
208+     + :code:`method='greenkhorn'` calls :any:`ot.bregman.greenkhorn`, the
209+       greedy Sinkhorn version of the algorithm [22]_.
210+
211+     In addition to all those variants of Sinkhorn, we have another
212+     implementation solving the problem in the smooth dual or semi-dual in
213+     :any:`ot.smooth`. This solver uses the :any:`scipy.optimize.minimize`
214+     function to solve the smooth problem with the :code:`L-BFGS` algorithm. To use
215+     this solver, use the functions :any:`ot.smooth.smooth_ot_dual` or
216+     :any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='kl'` to
217+     choose entropic/Kullback-Leibler regularization.
218+
219+ .. hint::
220+     Examples of use for :any:`ot.sinkhorn` are available in the following examples:
221+
222+     - :any:`auto_examples/plot_OT_2D_samples`
223+     - :any:`auto_examples/plot_OT_1D`
224+     - :any:`auto_examples/plot_OT_1D_smooth`
225+     - :any:`auto_examples/plot_stochastic`
226+
227+ Finally, note that we also provide in :any:`ot.stochastic` several implementations
228+ of stochastic solvers for entropic regularized OT [18]_ [19]_.
176229
177230Other regularization
178231^^^^^^^^^^^^^^^^^^^^
179232
180- Stochastic gradient descent
181- ^^^^^^^^^^^^^^^^^^^^^^^^^^^
233+ While entropic OT is the most common and favored in practice, there exist other
234+ kinds of regularization. We provide in POT two specific solvers for other
235+ regularization terms, namely quadratic regularization and group lasso
236+ regularization. We also provide in :any:`ot.optim` two generic solvers that allow solving any
237+ smooth regularization in practice.
238+
239+ The first general regularization term we can solve is the quadratic
240+ regularization of the form
241+
242+ .. math::
243+     \Omega(\gamma)=\sum_{i,j} \gamma_{i,j}^2
244+
245+ This regularization term has a similar effect to entropic regularization in
246+ densifying the OT matrix but it keeps some sort of sparsity that is lost with
247+ entropic regularization as soon as :math:`\lambda>0` [17]_. This problem can be
248+ solved with POT using solvers from :any:`ot.smooth`, more specifically the
249+ functions :any:`ot.smooth.smooth_ot_dual` or
250+ :any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='l2'` to
251+ choose the quadratic regularization.
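
To see why the quadratic term densifies the OT matrix, the sketch below evaluates :math:`\Omega(\gamma)` on two couplings that share the same uniform marginals. The helper ``quadratic_reg`` and the two example matrices are hypothetical and only illustrate the formula above; they are not part of POT.

```python
def quadratic_reg(gamma):
    """Omega(gamma) = sum_{i,j} gamma_{i,j}^2 (illustrative helper)."""
    return sum(g * g for row in gamma for g in row)


# Two couplings with the same marginals a = b = [0.5, 0.5]
sparse_plan = [[0.5, 0.0], [0.0, 0.5]]
dense_plan = [[0.25, 0.25], [0.25, 0.25]]

omega_sparse = quadratic_reg(sparse_plan)  # 0.5
omega_dense = quadratic_reg(dense_plan)    # 0.25
print(omega_sparse, omega_dense)
```

The spread-out coupling has the smaller penalty, so adding this term to the linear cost pulls the solution away from the sparsest plan.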
252+
253+ Another regularization that has been used in recent years is the group lasso
254+ regularization
255+
256+ .. math::
257+     \Omega(\gamma)=\sum_{j,G\in\mathcal{G}} \|\gamma_{G,j}\|_p^q
258+
259+ where :math:`\mathcal{G}` contains non-overlapping groups of rows in the OT
260+ matrix. This regularization, proposed in [5]_, promotes sparsity at the group level and for
261+ instance will force target samples to get mass from a small number of groups.
262+ Note that the exact OT solution is already sparse, so this regularization does
263+ not make sense if it is not combined with others such as entropic.
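
As an illustration of the formula above, the following sketch evaluates the group lasso term on two couplings with identical marginals. The helper ``group_lasso_reg``, the default exponents ``p=2`` and ``q=1``, and the example groups are assumptions made for this example only and are not taken from POT.

```python
def group_lasso_reg(gamma, groups, p=2, q=1):
    """Sum over columns j and groups G of ||gamma[G, j]||_p ** q (sketch)."""
    total = 0.0
    n_cols = len(gamma[0])
    for j in range(n_cols):
        for G in groups:
            # p-norm of the entries of column j restricted to the rows in G
            norm_p = sum(abs(gamma[i][j]) ** p for i in G) ** (1.0 / p)
            total += norm_p ** q
    return total


# Rows 0-1 form the first group, rows 2-3 the second one (hypothetical labels)
groups = [[0, 1], [2, 3]]
# Each column (target sample) receives mass from a single group
grouped = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
# Each column receives mass from both groups
mixed = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]

omega_grouped = group_lasso_reg(grouped, groups)
omega_mixed = group_lasso_reg(mixed, groups)
print(omega_grouped, omega_mixed)
```

The coupling in which every target sample draws mass from one group only gets the lower penalty, which is the group-level sparsity described above.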
264+
265+
266+
267+
268+
182269
183270Wasserstein Barycenters
184271-----------------------
185272
186273Monge mapping and Domain adaptation with Optimal transport
187- ----------------------------------------
274+ ----------------------------------------------------------
188275
189276
190277Other applications
207294 the OT transport matrix. If you want to solve a regularized OT you can
208295 use :py:mod:`ot.sinkhorn`.
209296
210-
211297
212298 Here is a simple use case:
213299
222308 :doc:`auto_examples/plot_OT_2D_samples`
223309
224310
225- 2. **Compute a Wasserstein distance **
311+ 2. **pip install POT fails with error: ImportError: No module named Cython.Build**
312+
313+ As discussed briefly in the README file, POT requires :code:`numpy`
314+ and :code:`cython` to be installed in order to build. This corner case is not yet handled
315+ by :code:`pip` and for now you need to install both libraries prior to
316+ installing POT.
317+
318+ Note that this problem does not occur when using conda-forge since the packages
319+ there are pre-compiled.
320+
321+ See `Issue #59 <https://github.com/rflamary/POT/issues/59>`__ for more
322+ details.
323+
324+ 3. **Why is Sinkhorn slower than EMD?**
325+
326+ This might come from the choice of the regularization term. The speed of
327+ convergence of Sinkhorn depends directly on this term [22]_ and when the
328+ regularization gets very small the problem tries to approximate the exact OT,
329+ which leads to slow convergence in addition to numerical problems. In other
330+ words, for large regularization Sinkhorn will be very fast to converge; for
331+ small regularization (when you need an OT matrix close to the true OT), it
332+ might be quicker to use the EMD solver.
333+
334+ Also note that the numpy implementation of Sinkhorn can use parallel
335+ computation depending on the configuration of your system, but a substantial
336+ speedup can be obtained by using a GPU implementation since all operations
337+ are matrix/vector products.
338+
339+ 4. **Using GPU fails with error: module 'ot' has no attribute 'gpu'**
340+
341+ In order to limit import time and hard dependencies in POT, we do not import
342+ some sub-modules automatically with :code:`import ot`. In order to use the
343+ acceleration in :any:`ot.gpu` you first need to import it with
344+ :code:`import ot.gpu`.
345+
346+ See `Issue #85 <https://github.com/rflamary/POT/issues/85>`__ and :any:`ot.gpu`
347+ for more details.
226348
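The dependence of Sinkhorn's convergence speed on the regularization term, mentioned in the question on Sinkhorn versus EMD above, can be observed on a toy problem. The sketch below is a plain-Python illustration under stated assumptions: the helper ``sinkhorn_iterations``, its tolerance and the example marginals are hypothetical, and the code is not POT's implementation.

```python
import math


def sinkhorn_iterations(a, b, M, reg, tol=1e-9, max_iter=100000):
    """Count Sinkhorn iterations until the row marginals match a (sketch)."""
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-M / reg), taken component-wise
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for it in range(1, max_iter + 1):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # Column marginals are exact after the v update, so check the rows
        err = sum(
            abs(u[i] * sum(K[i][j] * v[j] for j in range(m)) - a[i])
            for i in range(n)
        )
        if err < tol:
            return it
    return max_iter


a, b = [0.3, 0.7], [0.6, 0.4]
M = [[0., 1.], [1., 0.]]
n_large_reg = sinkhorn_iterations(a, b, M, reg=1.0)
n_small_reg = sinkhorn_iterations(a, b, M, reg=0.01)
print(n_large_reg, n_small_reg)
```

With the small regularization the loop needs noticeably more iterations to reach the same tolerance, which is the behaviour that can make the EMD solver the quicker option in that regime.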
227349
228350References