@@ -278,7 +278,7 @@ choose the quadratic regularization.
Group Lasso regularization
""""""""""""""""""""""""""

-Another regularization that has been used in recent years is the group lasso
+Another regularization that has been used in recent years [5]_ is the group lasso
regularization

.. math::
@@ -333,7 +333,7 @@ Another solver is proposed to solve the problem
    s.t. \gamma 1 = a; \gamma^T 1 = b; \gamma \geq 0

where :math:`\Omega_e` is the entropic regularization. In this case we use a
-generalized conditional gradient [7]_ implemented in :any:`ot.opim.gcg` that does not linearize the entropic term and
+generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that does not linearize the entropic term and
relies on :any:`ot.sinkhorn` for its iterations.

.. hint::
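Since :any:`ot.optim.gcg` relies on Sinkhorn iterations for its inner loop, it helps to see what those iterations do. Below is a minimal numpy sketch of the balanced Sinkhorn scaling; the helper name and toy data are illustrative, not the POT API:

```python
import numpy as np

def sinkhorn(a, b, M, reg, n_iter=1000):
    # Sinkhorn scaling: alternately rescale rows and columns of the
    # Gibbs kernel until both marginal constraints are (approximately) met
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
a = np.ones(5) / 5              # uniform source histogram
b = np.ones(4) / 4              # uniform target histogram
M = rng.random((5, 4))          # random cost matrix
G = sinkhorn(a, b, M, reg=0.1)
# the plan G has (approximately) marginals a and b
```

The plan returned is dense and strictly positive, which is exactly the property the entropic term buys compared with exact linear-programming OT.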
@@ -421,11 +421,11 @@ Estimating the Wasserstein barycenter with free support but fixed weights
corresponds to solving the following optimization problem:

.. math::
-    \min_\{x_i\} \quad \sum_{k} w_k W(\mu,\mu_k)
+    \min_{\{x_i\}} \quad \sum_{k} w_k W(\mu,\mu_k)

    s.t. \quad \mu = \sum_{i=1}^n a_i \delta_{x_i}

-WE provide an alternating solver based on [20]_ in
+We provide an alternating solver based on [20]_ in
:any:`ot.lp.free_support_barycenter`. This function minimizes the problem and
returns an optimal support :math:`\{x_i\}` for uniform or given weights
:math:`a`.
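The alternating scheme can be sketched in plain numpy: fix the support, compute a plan toward each input measure, then move every support point to the barycenter of the mass it sends. This is an illustrative sketch with hypothetical helper names (and entropic plans instead of exact ones), not the :any:`ot.lp.free_support_barycenter` implementation:

```python
import numpy as np

def sinkhorn_plan(a, b, M, reg=0.1, n_iter=500):
    # entropic OT plan between histograms a and b (cost rescaled for stability)
    K = np.exp(-M / (reg * M.max()))
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def free_support_barycenter(measures, weights, X, n_outer=15):
    # alternate between plan estimation and barycentric update of the support
    a = np.ones(len(X)) / len(X)            # uniform weights on the barycenter
    for _ in range(n_outer):
        X_new = np.zeros_like(X)
        for w, (b, Xk) in zip(weights, measures):
            M = ((X[:, None, :] - Xk[None, :, :]) ** 2).sum(-1)
            G = sinkhorn_plan(a, b, M)
            X_new += w * (G @ Xk) / a[:, None]   # barycentric projection
        X = X_new
    return X

rng = np.random.default_rng(0)
X1 = rng.normal(loc=-2.0, size=(30, 2))     # first input cloud
X2 = rng.normal(loc=2.0, size=(30, 2))      # second input cloud
b1 = b2 = np.ones(30) / 30
Xb = free_support_barycenter([(b1, X1), (b2, X2)], [0.5, 0.5],
                             rng.normal(size=(10, 2)))
# the barycenter support ends up between the two input clouds
```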
@@ -443,13 +443,149 @@ return an optimal support :math:`\{x_i\}` for uniform or given weights
Monge mapping and Domain adaptation
-----------------------------------

+The original transport problem investigated by Gaspard Monge sought a mapping
+function that maps (or transports) a source distribution onto a target
+distribution while minimizing the transport loss. The existence and uniqueness of this
+optimal mapping is still an open problem in the general case, but it has been proven
+for smooth distributions by Brenier in his eponymous `theorem
+<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf>`__. We provide in
+:any:`ot.da` several solvers for Monge mapping estimation and domain adaptation.
+
+Monge Mapping estimation
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+We now discuss several approaches implemented in POT to estimate or
+approximate a Monge mapping from finite distributions.
+
+First note that when the source and target distributions are assumed to be
+Gaussian, there exists a closed-form solution for the mapping, which is an
+affine function [14]_ of the form :math:`T(x)=Ax+b`. In this case we provide the function
+:any:`ot.da.OT_mapping_linear` that returns the operator :math:`A` and vector
+:math:`b`. Note that if the number of samples is too small, the parameter
+:code:`reg` provides a regularization for the covariance matrix estimation.
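The closed-form Gaussian mapping is simple enough to write out directly: :math:`A=\Sigma_s^{-1/2}(\Sigma_s^{1/2}\Sigma_t\Sigma_s^{1/2})^{1/2}\Sigma_s^{-1/2}` and :math:`b=\mu_t-A\mu_s`. A numpy sketch follows; the function and variable names are illustrative, not the :any:`ot.da.OT_mapping_linear` API:

```python
import numpy as np

def psd_sqrtm(C):
    # symmetric PSD matrix square root via eigendecomposition
    w, V = np.linalg.eigh(C)
    return (V * np.sqrt(np.maximum(w, 0))) @ V.T

def gaussian_monge_map(Xs, Xt, reg=1e-6):
    # closed-form affine Monge map T(x) = Ax + b between the Gaussian
    # approximations of two samples; reg shrinks the covariance estimates
    mu_s, mu_t = Xs.mean(0), Xt.mean(0)
    d = Xs.shape[1]
    Cs = np.cov(Xs.T) + reg * np.eye(d)
    Ct = np.cov(Xt.T) + reg * np.eye(d)
    Cs_half = psd_sqrtm(Cs)
    Cs_half_inv = np.linalg.inv(Cs_half)
    A = Cs_half_inv @ psd_sqrtm(Cs_half @ Ct @ Cs_half) @ Cs_half_inv
    b = mu_t - A @ mu_s
    return A, b

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 2))
Xt = rng.normal(size=(500, 2)) @ np.diag([2.0, 0.5]) + np.array([5.0, -3.0])
A, b = gaussian_monge_map(Xs, Xt)
mapped = Xs @ A.T + b
# the mapped samples match the target mean and covariance
```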
+
+For a more general mapping estimation we also provide the barycentric mapping
+proposed in [6]_. It is implemented in the class :any:`ot.da.EMDTransport` and
+other transport-based classes in :any:`ot.da`. Those classes are discussed below
+and follow an interface similar to sklearn classes. Finally, a
+method proposed in [8]_ that estimates a continuous mapping approximating the
+barycentric mapping is provided in :any:`ot.da.joint_OT_mapping_linear` for
+linear mappings and :any:`ot.da.joint_OT_mapping_kernel` for nonlinear mappings.
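For a quadratic cost, the barycentric mapping of [6]_ sends each source sample to the average of the target samples weighted by its row of the OT plan. A minimal numpy sketch, with an illustrative helper name rather than the POT API:

```python
import numpy as np

def barycentric_mapping(G, Xt):
    # map each source sample to the barycenter (weighted mean) of the
    # target samples it is coupled with in the plan G
    a = G.sum(axis=1, keepdims=True)   # source marginal of the plan
    return (G @ Xt) / a

# toy deterministic plan: source point i is matched to target point i
Xt = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
G = np.eye(3) / 3                      # uniform mass on a permutation
mapped = barycentric_mapping(G, Xt)
```

With a permutation plan the mapping simply picks the matched target point; with a dense entropic plan it returns a smoothed average instead.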
+
+.. hint::
+
+    An example of linear Monge mapping estimation is available
+    in the following example:
+
+    - :any:`auto_examples/plot_otda_linear_mapping`
+
+Domain adaptation classes
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The use of OT for domain adaptation (OTDA) was first proposed in [5]_, which also
+introduced the group Lasso regularization. The main idea of OTDA is to estimate
+a mapping of the samples between source and target distributions which allows one to
+transport labeled source samples onto the target distribution, which has no labels.
+
+We provide several classes based on :any:`ot.da.BaseTransport` that provide
+several OT and mapping estimations. The interface of those classes is similar to
+classifiers in the sklearn toolbox. At initialization, several parameters (for
+instance the regularization parameter) can be set. Then one needs to estimate the
+mapping with the function :any:`ot.da.BaseTransport.fit`. Finally one can map the
+samples from source to target with :any:`ot.da.BaseTransport.transform` and
+from target to source with :any:`ot.da.BaseTransport.inverse_transform`. Here is
+an example for the class :any:`ot.da.EMDTransport`
+
+.. code::
+
+    ot_emd = ot.da.EMDTransport()
+    ot_emd.fit(Xs=Xs, Xt=Xt)
+
+    Mapped_Xs = ot_emd.transform(Xs=Xs)
+
+A list of the provided implementations is given in the following note.
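To make the fit/transform pattern concrete without installing anything beyond numpy, here is a toy class that mimics the interface. It is purely illustrative and is not part of POT; it combines an entropic plan with the barycentric mapping:

```python
import numpy as np

class ToySinkhornTransport:
    # a stripped-down illustration of the fit/transform pattern of the
    # ot.da classes -- this toy class is NOT part of POT
    def __init__(self, reg=0.1, n_iter=500):
        self.reg = reg
        self.n_iter = n_iter

    def fit(self, Xs, Xt):
        # estimate an entropic OT plan between the two point clouds
        ns, nt = len(Xs), len(Xt)
        a, b = np.ones(ns) / ns, np.ones(nt) / nt
        M = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)
        K = np.exp(-M / (self.reg * M.max()))
        u = np.ones(ns)
        for _ in range(self.n_iter):
            v = b / (K.T @ u)
            u = a / (K @ v)
        self.coupling_ = u[:, None] * K * v[None, :]
        self.Xt_ = Xt
        return self

    def transform(self, Xs):
        # barycentric mapping of the fitted source samples
        return (self.coupling_ @ self.Xt_) / self.coupling_.sum(1, keepdims=True)

rng = np.random.default_rng(0)
Xs = rng.normal(size=(40, 2))
Xt = rng.normal(loc=3.0, size=(50, 2))
mapped = ToySinkhornTransport().fit(Xs, Xt).transform(Xs)
# the mapped source samples now lie on the target cloud
```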
+
+.. note::
+
+    Here is a list of the mapping classes inheriting from
+    :any:`ot.da.BaseTransport`
+
+    * :any:`ot.da.EMDTransport`: Barycentric mapping with EMD transport
+    * :any:`ot.da.SinkhornTransport`: Barycentric mapping with Sinkhorn transport
+    * :any:`ot.da.SinkhornL1l2Transport`: Barycentric mapping with Sinkhorn +
+      group Lasso regularization [5]_
+    * :any:`ot.da.SinkhornLpl1Transport`: Barycentric mapping with Sinkhorn +
+      non-convex group Lasso regularization [5]_
+    * :any:`ot.da.LinearTransport`: Linear mapping estimation between Gaussians
+      [14]_
+    * :any:`ot.da.MappingTransport`: Nonlinear mapping estimation [8]_
+
+.. hint::
+
+    Examples of the use of OTDA classes are available in the following examples:
+
+    - :any:`auto_examples/plot_otda_color_images`
+    - :any:`auto_examples/plot_otda_mapping`
+    - :any:`auto_examples/plot_otda_mapping_colors_images`
+    - :any:`auto_examples/plot_otda_semi_supervised`

Other applications
------------------

+We discuss below several implementations that have been used and
+proposed in the OT and machine learning communities.
+
Wasserstein Discriminant Analysis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+Wasserstein Discriminant Analysis [11]_ is a generalization of `Fisher Linear Discriminant
+Analysis <https://en.wikipedia.org/wiki/Linear_discriminant_analysis>`__ that
+allows discrimination between classes that are not linearly separable. It
+consists in finding a linear projector optimizing the following criterion
+
+.. math::
+    P = \text{arg}\min_P \frac{\sum_i OT_e(\mu_i\#P,\mu_i\#P)}{\sum_{i,j\neq i}
+    OT_e(\mu_i\#P,\mu_j\#P)}
+
+where :math:`\#` is the push-forward operator, :math:`OT_e` is the entropic OT
+loss and :math:`\mu_i` is the
+distribution of samples from class :math:`i`. :math:`P` is also constrained to
+be in the Stiefel manifold. WDA can be solved in POT using the function
+:any:`ot.dr.wda`. It requires :code:`pymanopt` and :code:`autograd` to be
+installed, for manifold optimization and automatic differentiation
+respectively. Note that we also provide the Fisher discriminant estimator in
+:any:`ot.dr.fda` for easy comparison.
+
+.. warning::
+    Note that due to the hard dependency on :code:`pymanopt` and
+    :code:`autograd`, :any:`ot.dr` is not imported by default. If you want to
+    use it you have to specifically import it with :code:`import ot.dr`.
+
+.. hint::
+
+    An example of the use of WDA is available in the following example:
+
+    - :any:`auto_examples/plot_WDA`
+
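For intuition, the WDA ratio criterion can be evaluated directly for a fixed projector :math:`P` with a small numpy implementation of the entropic OT loss. The helpers below are illustrative sketches, not the :any:`ot.dr.wda` solver, which additionally optimizes :math:`P` over the Stiefel manifold:

```python
import numpy as np

def entropic_ot_loss(X, Y, reg=1.0, n_iter=200):
    # entropic OT loss between two uniform empirical distributions
    a, b = np.ones(len(X)) / len(X), np.ones(len(Y)) / len(Y)
    M = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    G = u[:, None] * K * v[None, :]
    return (G * M).sum()

def wda_objective(P, classes):
    # ratio of within-class to between-class entropic OT after projection by P
    proj = [X @ P for X in classes]
    within = sum(entropic_ot_loss(Xi, Xi) for Xi in proj)
    between = sum(entropic_ot_loss(Xi, Xj) for i, Xi in enumerate(proj)
                  for j, Xj in enumerate(proj) if i != j)
    return within / between

rng = np.random.default_rng(0)
c0 = rng.normal(size=(30, 2)) + np.array([-3.0, 0.0])
c1 = rng.normal(size=(30, 2)) + np.array([3.0, 0.0])
P_good = np.array([[1.0], [0.0]])  # projects on the discriminative axis
P_bad = np.array([[0.0], [1.0]])   # projects on the noise axis
# the criterion is much smaller for the projector that separates the classes
```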
+
+Unbalanced optimal transport
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Unbalanced OT is a relaxation of the original OT problem where the violation of
+the constraint on the marginals is added to the objective of the optimization
+problem:
+
+.. math::
+    \min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + reg\cdot\Omega(\gamma) + \alpha KL(\gamma 1, a) + \alpha KL(\gamma^T 1, b)
+
+    s.t. \quad \gamma\geq 0
+
+
+where KL is the Kullback-Leibler divergence. This formulation allows for
+computing approximate mappings between distributions that do not have the same
+amount of mass. Interestingly, the problem can be solved with a generalization of
+the Bregman projections algorithm [10]_.
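The generalized Bregman/Sinkhorn iterations for the KL-relaxed problem (with entropic :math:`\Omega`) only change the scaling step: each update is raised to the power :math:`\phi=\alpha/(\alpha+reg)`, which softens the marginal constraints. A minimal numpy sketch with an illustrative helper name, not the POT solver:

```python
import numpy as np

def sinkhorn_unbalanced(a, b, M, reg, alpha, n_iter=1000):
    # generalized Sinkhorn scaling: the exponent fi < 1 turns the hard
    # marginal constraints into KL penalties of weight alpha
    K = np.exp(-M / reg)
    fi = alpha / (alpha + reg)
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]

# histograms with different total masses (1 vs 2)
a = np.array([0.5, 0.5])
b = np.full(4, 0.5)
M = np.abs(np.linspace(0, 1, 2)[:, None] - np.linspace(0, 1, 4)[None, :])
G = sinkhorn_unbalanced(a, b, M, reg=0.1, alpha=1.0)
# the total transported mass settles between the masses of a and b
```

Note that unlike balanced Sinkhorn, the marginals of the returned plan only approximately match :math:`a` and :math:`b`, and its total mass is a compromise between the two.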

Gromov-Wasserstein
^^^^^^^^^^^^^^^^^^
@@ -461,6 +597,10 @@ GPU acceleration
We provide several implementations of our OT solvers in :any:`ot.gpu`. These
implementations use the :code:`cupy` toolbox.

+.. warning::
+    Note that due to the hard dependency on :code:`cupy`, :any:`ot.gpu` is not
+    imported by default. If you want to use it you have to specifically import
+    it with :code:`import ot.gpu`.


FAQ