Let’s define a “semirewritable” memory device as having the following properties:
 The initial blank media is initialised with all zeroes.
 When writing to the media, individual zeroes can be turned into ones.
 Ones cannot be turned back into zeroes.
A physical interpretation is easy to find: consider, for instance, a punch card where new holes can easily be made, but old holes cannot be filled.
What makes this different from a “write once, read many” device is that a used device can be rewritten (multiple times), at the cost of reduced capacity for each rewrite.
Implicit assumptions I would like to make explicit:
 The memory reader has no information about what was previously written on the device. It can therefore not be relied upon to use a mechanism such as “which symbols have been changed?” to encode data on a device rewrite. That is, the reader is stateless.
 On the other hand, different “generations” of the device may use different encoding schemes as the available capacity shrinks.
 The data stored can be assumed to be random bits.
Sample storage scheme, to demonstrate rewrite capability:
Information in this scheme is stored on the device as pairs of binary symbols, each pair encoding one of the three states of a ternary symbol; the fourth pair state, in which both symbols have been written, is read as [DISCARDED].
The first generation thus stores data at a density of $ \frac{\log_2(3)}{2} \approx 0.79$ times that of simple binary encoding.
When the device is rewritten, the encoder considers each pair of binary symbols in sequence. If the existing state already matches the ternary symbol it wants to write, the encoder considers that symbol written and moves on. Otherwise it writes the necessary modification to the pair; when that is impossible (it would require turning a one back into a zero), it writes [DISCARDED] to the pair and tries the next pair, repeating until the ternary symbol has been written.
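A minimal sketch of this procedure (the names and the concrete mapping are my own illustration; ternary symbols 0, 1, 2 map to the bit pairs 00, 01, 10, with 11 reserved as [DISCARDED]):

```python
# Sketch of the pair-based rewrite scheme described above (illustrative only).
# Ternary symbols 0, 1, 2 map to bit pairs (0,0), (0,1), (1,0);
# the pair (1,1) is the reserved [DISCARDED] marker.

SYMBOL_TO_PAIR = {0: (0, 0), 1: (0, 1), 2: (1, 0)}
DISCARDED = (1, 1)

def can_write(existing, target):
    """A pair can only be changed by turning zeroes into ones."""
    return all(t >= e for e, t in zip(existing, target))

def rewrite(device, symbols):
    """Write ternary symbols onto a list of bit pairs in place.

    Pairs that cannot accept the desired symbol are marked
    [DISCARDED] and skipped. Returns True if all symbols fit.
    """
    pos = 0
    for s in symbols:
        target = SYMBOL_TO_PAIR[s]
        while pos < len(device):
            existing = device[pos]
            if existing == DISCARDED:
                pos += 1            # skip pairs lost in earlier rewrites
            elif can_write(existing, target):
                device[pos] = target
                pos += 1
                break               # symbol successfully written
            else:
                device[pos] = DISCARDED
                pos += 1            # burn this pair, try the next one
        else:
            return False            # ran out of capacity
    return True

def read(device):
    """Decode the stored ternary symbols, skipping [DISCARDED] pairs."""
    pair_to_symbol = {v: k for k, v in SYMBOL_TO_PAIR.items()}
    return [pair_to_symbol[p] for p in device if p != DISCARDED]
```

Note that the reader is stateless as required: it only needs the current pair states, not any history of earlier generations.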
As such, every rewrite would discard, in expectation, $ \frac{4}{9}$ of the existing capacity.
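The $\frac{4}{9}$ figure can be checked by exact enumeration: under the random-data assumption, the existing pair and the desired symbol are each uniform over the three readable states, and a discard happens exactly when the desired pair would require a one-to-zero flip.

```python
# Exact enumeration of the per-rewrite discard fraction, assuming the
# existing pair and the desired ternary symbol are each uniformly random.
from fractions import Fraction
from itertools import product

PAIRS = [(0, 0), (0, 1), (1, 0)]  # the three readable pair states

def can_write(existing, target):
    """Only 0 -> 1 transitions are allowed within a pair."""
    return all(t >= e for e, t in zip(existing, target))

discards = sum(1 for e, t in product(PAIRS, PAIRS) if not can_write(e, t))
p_discard = Fraction(discards, len(PAIRS) ** 2)
print(p_discard)  # -> 4/9
```

Concretely: a pair in state 00 accepts any symbol, while a pair in state 01 or 10 accepts only its current symbol, discarding in 2 of 3 cases each, which gives $\frac{0 + 2 + 2}{9} = \frac{4}{9}$.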
For a large number of cycles, the device would in total have stored $ \frac{9\log_2(3)}{8} \approx 1.78$ times the data of a simple one-time binary encoding.
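Spelling out the sum: each generation retains a fraction $\frac{5}{9}$ of the previous generation's pairs, and each pair stores $\log_2(3)$ bits over 2 binary cells, so the lifetime capacity per binary cell is the geometric series

$$ \frac{\log_2(3)}{2}\sum_{k=0}^{\infty}\left(\frac{5}{9}\right)^{k} = \frac{\log_2(3)}{2}\cdot\frac{1}{1-\frac{5}{9}} = \frac{9\log_2(3)}{8}. $$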
(For a variation of the above, one could also encode the first generation in binary and then apply this scheme to every subsequent generation. The loss from the first generation to the second would be larger and the total lifetime capacity reduced, but the initial capacity would be larger.)
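To make this comparison concrete (my own back-of-the-envelope count, under the same random-data assumption): after a binary first generation the four pair states are equally likely, so $\frac{1}{4}$ of pairs read as [DISCARDED] before the second generation even starts, and the surviving pairs then lose $\frac{4}{9}$ per rewrite as before. The lifetime capacity per binary cell works out to

$$ 1 + \frac{3}{4}\cdot\frac{\log_2(3)}{2}\sum_{k=1}^{\infty}\left(\frac{5}{9}\right)^{k} = 1 + \frac{15}{32}\log_2(3) \approx 1.74, $$

slightly below $\frac{9\log_2(3)}{8} \approx 1.78$, while the first generation stores $1$ bit per cell instead of $\approx 0.79$.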
Question:

Is it possible to have a better lifetime capacity than $ \frac{9\log_2(3)}{8}$? I suspect the real asymptotic capacity is 2.

Can a scheme do better than having $ \frac{4}{9}$ capacity loss between rewrites?