Given a system in $\mathbb{F}_2$ in RREF, how do I find a solution of minimal norm?

I have a $12 \times 12$ (so not really large) system of linear equations over $\mathbb{F}_2$ which I brought to RREF through the usual row reduction. Suppose the system has multiple solutions, and call the unknowns $x_i$. What is the least expensive way to find a solution that minimizes the number of $x_i$'s with $x_i = 1$, or equivalently, a solution of minimal norm (Hamming weight)? Is this solution unique?
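
For a system this small, one cheap approach is brute force over the whole affine solution set: take any particular solution read off from the RREF, compute a basis of the null space of the coefficient matrix, and try all $2^k$ combinations of the $k$ basis vectors, keeping the lightest result. Below is a minimal sketch in Python; the function name and the dense 0/1 representation are my own choices for illustration, not part of the question.

```python
import itertools
import numpy as np

def solve_gf2_min_weight(A, b):
    """Return a minimum-weight solution of A x = b over GF(2), or None.

    A: (m, n) 0/1 matrix, b: length-m 0/1 vector.  Brute-forces the
    2^k affine solution set, so it is only meant for small systems."""
    A = np.array(A, dtype=np.uint8) % 2
    b = np.array(b, dtype=np.uint8) % 2
    m, n = A.shape

    # Gauss-Jordan elimination over GF(2) on the augmented matrix.
    aug = np.concatenate([A, b.reshape(-1, 1)], axis=1)
    pivots = []
    row = 0
    for col in range(n):
        pivot_rows = np.nonzero(aug[row:, col])[0]
        if len(pivot_rows) == 0:
            continue
        aug[[row, row + pivot_rows[0]]] = aug[[row + pivot_rows[0], row]]
        for r in range(m):
            if r != row and aug[r, col]:
                aug[r] ^= aug[row]
        pivots.append(col)
        row += 1
        if row == m:
            break

    # Inconsistent system: a row 0 ... 0 | 1.
    if any(aug[r, n] and not aug[r, :n].any() for r in range(m)):
        return None

    free_cols = [c for c in range(n) if c not in pivots]

    # Particular solution: all free variables set to 0.
    x0 = np.zeros(n, dtype=np.uint8)
    for r, c in enumerate(pivots):
        x0[c] = aug[r, n]

    # Null-space basis: one vector per free variable.
    basis = []
    for fc in free_cols:
        v = np.zeros(n, dtype=np.uint8)
        v[fc] = 1
        for r, c in enumerate(pivots):
            v[c] = aug[r, fc]
        basis.append(v)

    # Enumerate all 2^k solutions and keep the one with the fewest ones.
    best = x0
    for k in range(1, len(basis) + 1):
        for combo in itertools.combinations(basis, k):
            x = x0.copy()
            for v in combo:
                x ^= v
            if x.sum() < best.sum():
                best = x
    return best

# Example: a tiny 3x4 system with two free variables.
A = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [0, 0, 0, 0]]
b = [1, 1, 0]
print(solve_gf2_min_weight(A, b))   # e.g. [0 0 1 0]
```

Note that the minimizer need not be unique in general: already the single equation $x_1 + x_2 = 1$ has the two minimal-weight solutions $(1,0)$ and $(0,1)$.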

Moments of the Schatten norm of a matrix

I am wondering what the connection is between the order of the moment of the $p$-th Schatten norm of a matrix and the order of the Schatten norm itself.

More precisely, why would one ever seek a bound on the $p$-th moment of the $p$-th Schatten norm of a matrix, rather than considering the $q$-th moment of the $p$-th Schatten norm?

(see for example https://scholar.google.nl/scholar?hl=en&as_sdt=0%2C5&as_vis=1&q=moment+of+schatten+norm&btnG=#d=gs_qabs&u=%23p%3Dap9f83X_ylcJ)

Reference request: norm topology on M(X) vs. weak topology

Let $ (X,d)$ be a metric space and $ \mathcal{M}(X)$ be the space of regular (e.g. Radon) measures on $ X$ . There are two standard topologies on $ \mathcal{M}(X)$ : The (probabilist’s) weak topology and the strong norm topology, where the norm is the total variation norm.

Surprisingly, I have found very little discussion in the literature comparing these two topologies rigorously, besides the oft-cited claim that the norm topology is much stronger than the weak topology. I am looking for a reference that discusses and compares these topologies, especially things like convergence, boundedness, open sets, projections, etc.

I am mostly concerned with probability measures $\mathcal{P}(X)\subset\mathcal{M}(X)$, but I am not sure how much of a difference this makes with respect to topological concerns.
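
A standard example of how much stronger the norm topology is (the usual textbook computation, not from the question): on $X=\mathbb{R}$, the Dirac measures $\delta_{1/n}$ converge to $\delta_0$ in the weak topology, yet $$\|\delta_{1/n}-\delta_0\|_{TV}=2\quad\text{for all }n$$ (with the convention $\|\mu\|_{TV}=|\mu|(X)$), so no such sequence converges in norm.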

Luxemburg norm as argument of Young’s function: $\Phi\left(\lVert f \rVert_{L^{\Phi}}\right)$

Let $\Phi$ be a Young's function, i.e. $$\Phi(t) = \int_0^t \varphi(s) \,\mathrm{d}s$$ for some $\varphi$ satisfying

  1. $ \varphi:[0,\infty)\to[0,\infty]$ is increasing
  2. $ \varphi$ is lower semicontinuous
  3. $ \varphi(0) = 0$
  4. $ \varphi$ is neither identically zero nor identically infinite

and define the Luxemburg norm of $f:\Omega\to\mathbb{R}$ as $$\lVert f \rVert_{L^{\Phi}} := \inf \left\{\gamma>0\,\middle|\, \int_{\Omega} \Phi\left(\frac {\lvert f(x)\rvert}{\gamma} \right)\,\mathrm{d}x \leq 1\right\}.$$


Question: What can we say about $\Phi\left(\lVert f \rVert_{L^{\Phi}}\right)$? In particular, I'd like to know if $$\Phi\left(\lVert f \rVert_{L^{\Phi}}\right) \leq C \int_{\Omega}\Phi(\lvert f(x)\rvert) \,\mathrm{d}x$$ holds for some $C$ independent of $f$.

Any idea or hint for a reference is welcome!


Notes:

  • The above inequality trivially holds for $\Phi(t) = t^p$, where $p>1$ (see the short computation after these notes)
  • Maybe it’s appropriate to consider this question in the more general framework of Musielak-Orlicz spaces. However, e.g. in Lebesgue and Sobolev Spaces with Variable Exponents I was unable to find an appropriate result.
  • I have asked this question on Math.Stackexchange without luck, so I’m trying here.
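
For completeness, here is the computation behind the first note, using only the definitions above: for $\Phi(t)=t^p$ the Luxemburg norm is the usual $L^p$ norm, so $$\Phi\left(\lVert f \rVert_{L^{\Phi}}\right) = \lVert f\rVert_{L^p}^{\,p} = \int_{\Omega}\lvert f(x)\rvert^{p}\,\mathrm{d}x = \int_{\Omega}\Phi(\lvert f(x)\rvert)\,\mathrm{d}x,$$ i.e. the desired inequality holds with $C=1$ (indeed with equality).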

Schur norm of weighted Cauchy matrix

The Schur norm of a matrix $ A$ is defined to be $ \|A\|_S=\max\{\|A\circ X\|: \|X\|\leq 1\}$ , where $ \|\cdot \|$ is the operator norm of a matrix, i.e., the largest singular value.

Let $a_1,\ldots, a_m, b_1,\ldots, b_n$ be positive reals. Let $A$ be the $m\times n$ matrix defined by $A_{i,j}=(a_i-b_j)/(a_i+b_j)$.

My question is how to compute $ \|A\|_S$ . Is it upper bounded by an absolute constant independent of $ m, n$ ?
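
Not an answer, but a cheap way to experiment numerically: since $\|A\|_S=\max\{\|A\circ X\|:\|X\|\leq 1\}$, every $X$ with $\|X\|\leq 1$ gives the lower bound $\|A\circ X\|\leq\|A\|_S$, so random trials give empirical lower bounds as $m,n$ grow. The helper below and its sampling scheme are my own illustration, not a method from the question.

```python
import numpy as np

def schur_norm_lower_bound(a, b, trials=2000, seed=0):
    """Empirical lower bound on the Schur norm of the weighted Cauchy
    matrix A[i, j] = (a[i] - b[j]) / (a[i] + b[j]).

    Any X with operator norm <= 1 gives ||A o X|| <= ||A||_S, so the
    maximum over random normalized X is a (crude) lower bound."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    A = (a - b) / (a + b)

    best = 0.0
    for _ in range(trials):
        X = rng.standard_normal(A.shape)
        X /= np.linalg.norm(X, ord=2)          # enforce ||X|| <= 1
        best = max(best, np.linalg.norm(A * X, ord=2))
    return best

# Example: geometrically spread weights; vary m and n to see how the bound behaves.
print(schur_norm_lower_bound(np.geomspace(1, 1e3, 30), np.geomspace(1, 1e3, 40)))
```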

The norm squared of a moment map

I am studying the paper by E. Lerman: https://arxiv.org/pdf/math/0410568.pdf

Let $(M,\sigma)$ be a connected symplectic manifold with a Hamiltonian action of a compact Lie group $G$, so that there exists a moment map $$\mu : M\to\mathcal{G}^\ast,$$ where $\mathcal{G}^\ast$ is the dual of the Lie algebra of $G$. We assume that $\mu$ is $G$-equivariant, $$\mu(g\cdot x)=\mathrm{Ad}_g^\ast\circ\mu(x),$$ and that $\mu$ is proper (the preimage of any compact set is compact). Let $f=\|\mu\|^2$ (for an $\mathrm{Ad}$-invariant norm on $\mathcal{G}^\ast$).

I know that the moment map is important because of:

1- a convexity theorem of Atiyah and Guillemin-Sternberg.

2- symplectic reduction, where the quotient of the zero level of the moment map by the group makes it possible to construct new symplectic manifolds.

Hence my question:

What is the motivation for studying the norm squared of a moment map? In particular, why is it important to know that the zero level set of the moment map is a deformation retract of a piece of the manifold?

As I understand it, $f$ behaves like a Morse–Bott function (by Kirwan's work), the stable manifold of a critical component of $f$ is a submanifold, and the gradient flow of $f$ is defined for all $t\geq0$. Here Lerman asserts that this last point holds because $f$ is proper, but $x^3$ is proper while its negative gradient flow, generated by $-3x^2\partial_x$, is not defined for all $t\geq0$.
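
To make the counterexample explicit (my computation, not from the paper): the negative gradient flow of $f(x)=x^3$ on $\mathbb{R}$ solves $\dot x=-3x^2$, and the solution with $x(0)=x_0$ is $$x(t)=\frac{x_0}{1+3x_0t},$$ which blows up at the finite time $t=-1/(3x_0)>0$ whenever $x_0<0$. So properness of $f$ by itself does not give completeness of the flow.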

I think we have to show that $\nabla f$ is $G$-invariant and therefore complete.

And that $f$ is real analytic, in order to show that the limit of the trajectory $\phi_t(x)$ of any point is a single point $\phi_\infty(x)$, and that the maps $t\mapsto\phi_t(x)$ and $x\mapsto\phi_\infty(x)$ are continuous.

Linear regression: not normalizing by y's norm

I was recently reading an article on Pearson correlation, and OLS coefficients. I came across the following section.

[image: excerpt from the article on Pearson correlation and OLS coefficients]

I understand that using calculus we can arrive at an expression for the coefficient $a$. The expression's denominator turns out not to contain $y$'s norm. In the last paragraph of the excerpt, I could not understand the following line:

Not normalizing for y is what you want for the linear regression

Why don’t we want to normalize for y? What is the physical/geometrical significance of this?
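
For concreteness, here is the comparison the excerpt is presumably making (standard formulas, written in my own notation with centered data vectors $x$ and $y$): the OLS slope and the Pearson correlation are $$a=\frac{\langle x,y\rangle}{\lVert x\rVert^{2}},\qquad r=\frac{\langle x,y\rangle}{\lVert x\rVert\,\lVert y\rVert},$$ so $r$ normalizes by both norms and is scale-free, while $a$ keeps the scale of $y$: if $y$ is rescaled (say multiplied by 10), the fitted values $\hat y = ax$ must also be multiplied by 10, which is why the regression coefficient must not be normalized by $\lVert y\rVert$.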

Hoeffding to bound Orlicz norm

I have been reading from Weak Convergence and Empirical Processes, and came across the following: let $a_1,\ldots,a_n$ be constants and let $\epsilon_1,\ldots,\epsilon_n$ be i.i.d. Rademacher random variables. Then

$ \mathbb{P}\left(\left|\sum_i\epsilon_i a_i\right|>x\right)\leq 2\exp\left(-\frac{x^2}{2||a||^2_2}\right)$

Consequently, $ ||\sum_i\epsilon_ia_i||_{\Psi_2}\leq\sqrt{6}||a||_2$ .

How does this follow (relation between Orlicz norm of Rademacher average and L2 norm of constants)? Thank you in advance for your time.
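
If it helps, the standard route (I believe it is Lemma 2.2.1 in the same book, so treat the exact numbering as an assumption) is: a tail bound $\mathbb{P}(|X|>x)\leq K e^{-Cx^p}$ for all $x>0$ implies $\|X\|_{\Psi_p}\leq\big((1+K)/C\big)^{1/p}$. Applying it with $K=2$, $p=2$ and $C=1/(2\|a\|_2^2)$ gives $$\Big\|\sum_i\epsilon_ia_i\Big\|_{\Psi_2}\leq\Big(\frac{1+2}{1/(2\|a\|_2^2)}\Big)^{1/2}=\sqrt{6}\,\|a\|_2.$$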

Strict convexity and uniqueness of the dual vector

So, I am having trouble proving the following; I'd be grateful if somebody could help me with it.

Let $z$ be a given point in $\mathbb{R}^m$. Then $x\in \mathbb{R}^m$ is a dual vector of $z$ with respect to $\|\cdot\|$ if it satisfies $\|x\|=1$ and $z^Tx=\|z\|'$, where $\|\cdot\|'$ denotes the dual norm.
A norm $\|\cdot\|$ is said to be strictly convex if the unit sphere $\{x:\|x\|=1\}$ contains no line segment.

Now, how does one prove that

The norm $ \|.\|$ is strictly convex if and only if each $ z\in \mathbb{R}^m$ has a unique dual vector.
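
One direction can be seen on a concrete non-strictly-convex norm (my example, not from the question): take $\|\cdot\|=\|\cdot\|_\infty$ on $\mathbb{R}^2$, whose dual norm is $\|\cdot\|_1$, and $z=(1,0)^T$. Then $\|z\|'=1$, and every $$x=(1,t)^T,\qquad |t|\leq 1,$$ satisfies $\|x\|_\infty=1$ and $z^Tx=1=\|z\|'$, so $z$ has infinitely many dual vectors; correspondingly, the unit sphere of $\|\cdot\|_\infty$ contains the segment joining $(1,-1)^T$ and $(1,1)^T$.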

Spectral radius is the greatest lower bound for some matrix norm

I’m studying matrix analysis with Horn and Johnson’s book.

I am having some trouble while reading the book.

The book states Lemma 5.6.10, and then gives the proof of that lemma.

My trouble is with the two lines of the proof just below the point where one obtains a matrix such that the $1$-norm of $D_t \triangle D_t^{-1}$ is less than or equal to $\rho(A)+\epsilon$.

Here the $1$-norm is defined as the sum of the absolute values of all entries of the matrix.

I understood that the off-diagonal elements can be bounded by $\epsilon$ for large $t$. However, I cannot understand why the sum of the absolute values of the eigenvalues should be bounded by the spectral radius of $A$.
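
A small numerical sketch of the construction being discussed, under what I take to be the standard setup of that proof (my reading: $A = U\triangle U^\ast$ is a Schur decomposition with the eigenvalues of $A$ on the diagonal of $\triangle$, and $D_t=\mathrm{diag}(t,t^2,\ldots,t^n)$), showing how the off-diagonal part of $D_t\triangle D_t^{-1}$ shrinks as $t$ grows:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

# Complex Schur form: A = U @ T @ U.conj().T with T upper triangular,
# eigenvalues of A on the diagonal of T.
T, U = schur(A, output='complex')
rho = np.max(np.abs(np.diag(T)))          # spectral radius of A

for t in [1.0, 10.0, 100.0, 1000.0]:
    D = np.diag(t ** np.arange(1, n + 1))
    S = D @ T @ np.linalg.inv(D)          # similar to A, same eigenvalues
    off_diag = S - np.diag(np.diag(S))
    print(f"t = {t:7.1f}   max off-diagonal entry = {np.max(np.abs(off_diag)):.2e}"
          f"   spectral radius = {rho:.4f}")
```

The diagonal of $D_t\triangle D_t^{-1}$ is not affected by the scaling: its entries are the eigenvalues of $A$, each of absolute value at most $\rho(A)$.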