Friends,

I have been working on implementing denoising Lanczos differentiators, following this paper. The following code works well (though I could certainly improve its performance), but *the design is a train wreck*, and I was hoping to get some advice to improve it.

The code is provided below, but let me provide some background so the design tradeoffs are more obvious. If you use standard finite difference stencils on noisy data, you’ll amplify noise to the extent that the results are totally useless. So instead of using the traditional finite difference filter, you want to create a filter which has $ O(\Delta t^{p})$ accuracy against the true derivative, but take the filter length $ 2n+1 \gg p$ so that the filter can average away noise. (For $ p = 2n$ , you recover the standard finite difference filter and for $ p > 2n$ the filter is undefined.) Then you need $ n$ boundary filters, each of length $ 2n+1$ , because the interior filter requires $ n$ points on each side. (The right side boundary filters are reflect+negate the left boundary filters, so don’t need to be stored.) So if I was going to precompute all the $ (n,p)$ -filters up to $ n_{\max}$ , it would require ~$ 4n_{\max}^{3}$ elements in storage, for each precision. That’s not counting the fact that people will want these filters precomputed in both float and double precision, perhaps long double as well. Hence did a back-of-the-envelope which concluded that to store $ n_{\max} = 25$ , I’d need a 40MB header file which would need to be parsed to find the desired filter. I think that’d make people pretty unhappy. In addition, the error of the computed derivative decreases as $ n^3$ , so there’s really no point of diminishing returns where I can say “this is a sane $ n_{\max}$ , there’s no cause to ever use a longer filter!”

So I feel like I have to compute the filters at runtime. But the filter computation is at least cubic complexity in $ n$ , so it’s pretty expensive. In addition, the filters *feel* like compile-time data, since they doesn’t depend on any runtime data and they’re immutable. So I feel like there’s some design that could be done which would make it so that the filters only have to be computed once and can be used globally for any data.

Here are some other design problems I feel should be addressed:

I’m passing the data vector into the class at the same time I’m computing the filters, and storing a reference member. This should be decoupled, as the filters don’t depend on the data. But then I’d lose the ability to call the class via an index, or would need to do something like `lanczos.set_data(v); Real dvdt = lanczos[i]`

. Another option might be `Real dvdt = lanczos(v, i)`

, or even `Real dvdt = lanczos(v, dt, i)`

, as the filters don’t depend on the timestep either. As I said, the API is a train wreck and it’s not clear to me how to make it more elegant.

`#ifndef BOOST_MATH_DIFFERENTIATION_LANCZOS_SMOOTHING_HPP #define BOOST_MATH_DIFFERENTIATION_LANCZOS_SMOOTHING_HPP namespace boost { namespace math { namespace differentiation { namespace detail { template <typename Real> class discrete_legendre { public: discrete_legendre(int n) : m_n{n} { // The integer n indexes a family of discrete Legendre polynomials indexed by k <= 2*n } Real norm_sq(int r) { Real prod = Real(2) / Real(2 * r + 1); for (int k = -r; k <= r; ++k) { prod *= Real(2 * m_n + 1 + k) / Real(2 * m_n); } return prod; } Real operator()(Real x, int k) { BOOST_ASSERT_MSG(k <= 2 * m_n, "r <= 2n is required."); if (k == 0) { return 1; } if (k == 1) { return x; } Real qrm2 = 1; Real qrm1 = x; Real N = 2 * m_n + 1; Real qr; for (int r = 2; r <= k; ++r) { Real num = (r - 1) * (N * N - (r - 1) * (r - 1)) * qrm2; Real tmp = (2 * r - 1) * x * qrm1 - num / Real(4 * m_n * m_n); qrm2 = qrm1; qrm1 = tmp / r; } return qrm1; } Real prime(Real x, int k) { BOOST_ASSERT_MSG(k <= 2 * m_n, "r <= 2n is required."); if (k == 0) { return 0; } if (k == 1) { return 1; } Real qrm2 = 1; Real qrm1 = x; Real qrm2p = 0; Real qrm1p = 1; Real N = 2 * m_n + 1; Real qr; for (int r = 2; r <= k; ++r) { Real s = (r - 1) * (N * N - (r - 1) * (r - 1)) / Real(4 * m_n * m_n); Real tmp1 = ((2 * r - 1) * x * qrm1 - s * qrm2) / r; Real tmp2 = ((2 * r - 1) * (qrm1 + x * qrm1p) - s * qrm2p) / r; qrm2 = qrm1; qrm1 = tmp1; qrm2p = qrm1p; qrm1p = tmp2; } return qrm1p; } private: int m_n; }; template <class Real> std::vector<Real> interior_filter(int n, int p) { // We could make the filter length n, as f[0] = 0, // but that'd make the indexing awkward when applying the filter. std::vector<Real> f(n + 1, 0); auto dlp = discrete_legendre<Real>(n); for (int j = 1; j < f.size(); ++j) { Real arg = Real(j) / Real(n); for (int l = 1; l <= p; l += 2) { f[j] += dlp.prime(0, l) * dlp(arg, l) / dlp.norm_sq(l); } f[j] /= (n * n); } return f; } template <class Real> std::vector<Real> boundary_filter(int n, int p, int s) { std::vector<Real> f(2 * n + 1, 0); auto dlp = discrete_legendre<Real>(n); for (int k = 0; k < f.size(); ++k) { int j = k - n; f[k] = 0; Real arg = Real(j) / Real(n); Real sn = Real(s) / Real(n); for (int l = 1; l <= p; ++l) { f[k] += dlp.prime(sn, l) * dlp(arg, l) / dlp.norm_sq(l); } f[k] /= (n * n); } return f; } } // namespace detail template <class RandomAccessContainer> class lanczos_smoothing_derivative { public: using Real = typename RandomAccessContainer::value_type; lanczos_smoothing_derivative(RandomAccessContainer const &v, Real spacing = 1, int filter_length = 18, int approximation_order = 3) : m_v{v}, dt{spacing} { BOOST_ASSERT_MSG(approximation_order <= 2 * filter_length, "The approximation order must be <= 2n"); BOOST_ASSERT_MSG(spacing > 0, "Spacing between samples must be > 0."); using std::size; BOOST_ASSERT_MSG( size(v) >= filter_length, "Vector must be at least as long as the filter length"); m_f = detail::interior_filter<Real>(filter_length, approximation_order); boundary_filters.resize(filter_length); for (size_t i = 0; i < filter_length; ++i) { // s = i - n; boundary_filters[i] = detail::boundary_filter<Real>( filter_length, approximation_order, i - filter_length); } } Real operator[](size_t i) const { using std::size; if (i >= m_f.size() - 1 && i <= size(m_v) - m_f.size()) { Real dv = 0; for (size_t j = 1; j < m_f.size(); ++j) { dv += m_f[j] * (m_v[i + j] - m_v[i - j]); } return dv / dt; } // m_f.size() = N+1 if (i < m_f.size() - 1) { auto &f = boundary_filters[i]; Real dv = 0; for (size_t j = 0; j < f.size(); ++j) { dv += f[j] * m_v[j]; } return dv / dt; } if (i > size(m_v) - m_f.size()) { int k = size(m_v) - 1 - i; auto &f = boundary_filters[k]; Real dv = 0; for (size_t j = 0; j < f.size(); ++j) { dv += f[j] * m_v[m_v.size() - 1 - j]; } return -dv / dt; } // OOB access: return std::numeric_limits<Real>::quiet_NaN(); } private: const RandomAccessContainer &m_v; std::vector<Real> m_f; std::vector<std::vector<Real>> boundary_filters; Real dt; }; } // namespace differentiation } // namespace math } // namespace boost #endif `

Any advice is appreciated. Note that I haven’t fine-toothed this code for elegance because I think it needs to be torn up and refactored; I’ll post another question once the API is stabilized.