Regex character class range generator with certain characters excluded

While working on a personal project, I found I needed to exclude certain characters from a regex character range. So I thought: why not implement a custom range exclusion check for greater efficiency? It’ll only take 5 lines or so. 60+ lines later, I produced this:

regex_supplement.py:

"""Supplementary regex modifying methods."""  from itertools import tee  def pairwise(iterable):     """s -> (s0,s1), (s1,s2), (s2, s3), ...      From the itertools recipes.     """     a, b = tee(iterable)     next(b, None)     return zip(a, b)  def _offset_byte_char(char, offset):     """Offset a single byte char"""     return (ord(char) + offset).to_bytes(1, byteorder='big')  def escape_byte_in_character_class(char):     """Escapes characters as necessary within the character class."""     return b'\' + char if char in b'-]' else char  def byte_range(startchar, endchar, excluded):     """Returns a pattern matching characters in the range startchar-endchar.     Characters in excluded are excluded from the range.     """     excluded = sorted(char for char in excluded if startchar <= char <= endchar)      char_ranges = []     if len(excluded) >= 1:         first_exclude = excluded[0]         if startchar != first_exclude:             # Another possibility = + 1             char_ranges.append(                 (startchar, _offset_byte_char(first_exclude, -1))             )         for start, end in pairwise(excluded):             # Adjacent or equal             if ord(end) - 1 <= ord(start):                 continue             char_ranges.append(                 (_offset_byte_char(start, 1), _offset_byte_char(end, -1))             )         last_exclude = excluded[-1]         if endchar != last_exclude:             char_ranges.append(                 (_offset_byte_char(last_exclude, 1), endchar)             )     else:         char_ranges = [(startchar, endchar)]      char_output = b''     escape = escape_byte_in_character_class     for char_range in char_ranges:         # Doesn't minimize all '-', but that quickly gets complicated.         # (Whether '-' needs to be escaped within a regex range is context dependent.)         
start, end = char_range         if start == end:             char_output += escape(start)         elif ord(start) == ord(end) - 1:             char_output += escape(start) + escape(end)         else:             char_output += escape(start) + b'-' + escape(end)     return b'[' + char_output + b']' 

test_regex_supplement.py:

"""Regex supplement tests"""  import regex_supplement as re_supp  def test_byte_regex_range_empty():     """Test that empty exclusions do not affect the range"""     assert re_supp.byte_range(b'a', b'c', []) == b'[a-c]'  def test_byte_regex_range_exclusion_outside():     """An exclusion outside of the regex range should have no effect."""     assert re_supp.byte_range(b'a', b'c', [b'e']) == b'[a-c]'  def test_byte_regex_range_escaped_1():     """Test that ']' is escaped"""     assert re_supp.byte_range(b']', b'`', [b'`']) == rb'[\]-_]'  def test_byte_regex_range_escaped_2():     """Test that '-' is escaped"""     assert re_supp.byte_range(b'-', b'0', [b'0']) == rb'[\--/]'  def test_byte_regex_range_standard_1():     """Test that a standard range behaves as expected"""     assert re_supp.byte_range(b'a', b'g', [b'd']) == b'[a-ce-g]'  def test_byte_regex_range_standard_2():     """Test that a standard range with multiple exclusions behaves as expected"""     assert re_supp.byte_range(b'a', b'k', [b'd', b'h']) == b'[a-ce-gi-k]'  def test_byte_regex_range_optimized_1():     """Test that ranges of 1 char are optimized to single characters."""     assert re_supp.byte_range(b'a', b'c', [b'b']) == b'[ac]'  def test_byte_regex_range_optimized_2():     """Test that multiple ranges of 1 chars are optimized to single characters."""     assert re_supp.byte_range(b'a', b'e', [b'b', b'd']) == b'[ace]' 

This is only implemented for bytestrings, because that’s what I needed for my project. I actually learned that Python regular expressions can handle bytestrings during this project. The tests are intended to be run by pytest. I was originally considering adding in an optimization for single character ranges, but I decided not to because it would lead to more complicated escape-handling code (and the possibility for subtle bugs like double-escaping ]), and it wasn’t needed for my purposes.

I’m mostly concerned with efficiency (mostly of the resultant regex, but also of the program) and accuracy-checking, but stylistic and readability improvements are also appreciated.
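For the accuracy-checking part, one option is a brute-force comparison over all 256 byte values. The sketch below is not part of my module; `check_class_pattern` is a hypothetical helper name, and it assumes the pattern is a single character class meant to match exactly one byte.

```python
import re


def check_class_pattern(pattern, startchar, endchar, excluded):
    """Return the byte values where `pattern` disagrees with the spec.

    Spec: a byte should match iff it lies in startchar..endchar
    (inclusive) and is not listed in `excluded`.
    """
    compiled = re.compile(pattern)
    excluded_values = {ord(char) for char in excluded}
    mismatches = []
    for value in range(256):
        expected = (ord(startchar) <= value <= ord(endchar)
                    and value not in excluded_values)
        actual = compiled.fullmatch(bytes([value])) is not None
        if expected != actual:
            mismatches.append(value)
    return mismatches


# A correct pattern produces no mismatches.
ok = check_class_pattern(b'[a-ce-g]', b'a', b'g', [b'd'])
# A pattern that forgets the exclusion is flagged at ord(b'd').
bad = check_class_pattern(b'[a-g]', b'a', b'g', [b'd'])
```

An empty list means the pattern agrees with the specification for every byte, so a pytest test could assert on the result of `check_class_pattern(byte_range(...), ...)` for a range of inputs.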

Also, in hindsight, I might have considered implementing a lookahead with an exclusion character check preceding the range, but my current approach does have the advantage of discarding excluded characters that are outside of the range, and requiring less escaping.
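For comparison, here is a minimal sketch of that lookahead alternative. It is not what I implemented; `byte_range_lookahead` is a hypothetical name, and it leans on `re.escape` for escaping rather than my own escape function. Unlike `byte_range`, it keeps out-of-range exclusions in the pattern.

```python
import re


def byte_range_lookahead(startchar, endchar, excluded):
    """Build a pattern of the form b'(?![excluded])[start-end]'.

    Hypothetical alternative: a negative lookahead rejects the
    excluded bytes before the plain range is tried.
    """
    char_class = b'[' + re.escape(startchar) + b'-' + re.escape(endchar) + b']'
    if not excluded:
        return char_class
    excluded_class = b'[' + b''.join(re.escape(char) for char in excluded) + b']'
    return b'(?!' + excluded_class + b')' + char_class


pattern = byte_range_lookahead(b'a', b'k', [b'd', b'h'])
# Collect every single byte the pattern accepts.
matched = b''.join(
    bytes([value]) for value in range(256)
    if re.fullmatch(pattern, bytes([value]))
)
```

Here `pattern` is `b'(?![dh])[a-k]'` and `matched` is `b'abcefgijk'`, the same set of bytes that `[a-ce-gi-k]` accepts.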

The Law of Excluded Miracle in the language of guarded commands

The definition of weakest precondition is familiar (let me use Isabelle’s syntax here):

definition "wp c Q s ≡ ∃t. (c,s) ⇒ t ∧ Q t" 

That is, the weakest precondition ensuring Q when executing a command c in an initial state s is given by the formula on the RHS.

Now Dijkstra, for instance in “Nondeterminacy and Formal Derivation of Programs”, talks about the Law of the Excluded Miracle:

definition "F ≡ λ t. False"

lemma "wp c F = F"
  unfolding wp_def F_def by simp

According to this article, the meaning of this proposition would be:

started in a given state, a program execution must either terminate or loop forever

However, I don’t see how this proposition states this fact. Unrolling the definitions I get:

∃t. (c,s) ⇒ t ∧ (λ t. False) t = ∃t. (c,s) ⇒ t ∧ False = False 

seen as a function in s, obviously this LHS matches the RHS. However, I don’t see how this tells that the program terminates or loops. Could you explain what is happening here?
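To make the unrolling concrete, here is a small finite model in Python. It is my own illustration and not taken from Dijkstra's paper: a command is assumed to be a set of (initial, final) state pairs, the state space and the example command are made up, and `wp` mirrors `wp_def` by collecting the initial states that have some final state satisfying the postcondition.

```python
def wp(command, postcondition, states):
    """Initial states s with some t such that (s, t) in command and postcondition(t).

    Mirrors: wp c Q s = (exists t. (c, s) => t and Q t).
    """
    return {
        s for s in states
        if any(postcondition(t) for (s0, t) in command if s0 == s)
    }


def F(t):
    """The everywhere-false postcondition."""
    return False


def T(t):
    """The everywhere-true postcondition."""
    return True


# Hypothetical three-state example: state 0 can end in 1 or 2,
# state 1 ends in 2, and state 2 has no final state at all
# (which stands in for an execution that never terminates).
STATES = {0, 1, 2}
COMMAND = {(0, 1), (0, 2), (1, 2)}

wp_false = wp(COMMAND, F, STATES)
wp_true = wp(COMMAND, T, STATES)
```

In this model `wp_false` is the empty set whatever `COMMAND` is, which is the lemma above, while `wp_true` is `{0, 1}`, the states from which some final state exists. This only restates the definitions; it does not by itself settle how the quoted sentence should be read.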

‘Excluded from display’ fields are still visible in the view table

I am using the footable module with the latest dev version.

In the footable view, I have some fields that are “excluded from display”, yet I can still see them inside the table (without the field label or the field value).

Excluded from display fields

I have created a test website for anyone interested in help:

URL: https://www.testdrupal.ml/test-footable

Any help, please? Or is there a CSS workaround?

Thank you,