Linear algorithm to measure how sorted an array is

I’ve just attended an algorithms course, in which I saw many sorting algorithms that perform better or worse depending on how sorted the elements of an array already are. The typical examples are quicksort, which takes $ O(n^2)$ time on sorted arrays, and mergesort, which operates in linear time on them. Conversely, quicksort performs better when the array is sorted from the highest to the lowest value.

My question is whether there is a way to measure in linear time how sorted the array is, and then decide which algorithm is better to use.
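One simple linear-time measure, offered here only as an illustrative sketch, is the number of maximal non-decreasing runs in the array: 1 means the array is already sorted, and n means it is strictly decreasing. The function name `count_runs` is hypothetical. Other measures exist (for example the number of inversions, which the usual merge-based count computes in $ O(n \log n)$ rather than linear time).

```python
def count_runs(a):
    """Return the number of maximal non-decreasing runs in the sequence a.

    Runs in O(n): every adjacent pair is inspected exactly once.
    An empty sequence has 0 runs; a sorted one has exactly 1.
    """
    if not a:
        return 0
    # Each "descent" (a[i] > a[i+1]) ends one run and starts the next.
    descents = sum(1 for x, y in zip(a, a[1:]) if x > y)
    return 1 + descents


if __name__ == "__main__":
    for sample in ([1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4]):
        print(sample, "->", count_runs(sample), "run(s)")
```

A caller could then compare the run count against a threshold of its own choosing before picking a sorting routine. Where that threshold should sit is not something this sketch settles.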

Why do we need security measures like control-flow integrity and buffer overflow guards if we have a good access control protocol in place?

Reading up on information security, I noticed two branches: access control when communicating with an external device, using some type of cryptographic authentication and encryption mechanism, and measures like control-flow integrity. My question is why we need the latter if the former is good enough. Are there examples of control-flow exploits against access control protocol implementations themselves? My focus is mainly on embedded devices.

How to identify measure words in Chinese text?

Measure words (aka classifiers) are used in Chinese to “measure” things, e.g.

三杯牛奶
Three glasses of milk

那个人
That person

一只乌鸦
One crow

We don’t have an equivalent in English [they’re not collective nouns (e.g. a murder of crows)]. When reading Chinese text, it would help to be able to highlight measure words in a distinct color.

Thus, I’m interested in the following problem:

Input: Chinese plaintext.
Output: Identification of which characters are measure words in that plaintext.

It’s a highly restricted version of Chinese text segmentation, and I expect it would be substantially simpler, e.g. there’s only a short list of characters which can be measure words. It may even be considered “solved”.

However, there are some nuances which make this challenging: measure words can be repeated (e.g. 个个 and 一根根), and characters used as measure words can also belong to ordinary Chinese words (e.g. 个人). Nevertheless, there may only be a small number of exceptions here.
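To make the "short list of characters" idea concrete, here is a naive baseline sketch, not a solution from the literature. It flags a character as a measure word only when it comes from a classifier list and directly follows a numeral or demonstrative. The two character lists and the name `find_measure_words` are assumptions chosen for illustration; the classifier list is deliberately tiny and incomplete.

```python
import re

# Assumption: a tiny illustrative subset of classifiers, not a complete list.
CLASSIFIERS = "个只杯根条张本"
# Assumption: numerals and demonstratives that commonly precede a classifier.
TRIGGERS = "一二两三四五六七八九十百千几这那每"

# One or more trigger characters followed by a single classifier character.
PATTERN = re.compile(f"[{TRIGGERS}]+([{CLASSIFIERS}])")


def find_measure_words(text):
    """Return (index, character) pairs for candidate measure words in text."""
    return [(m.start(1), m.group(1)) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for phrase in ("三杯牛奶", "那个人", "一只乌鸦", "个人"):
        print(phrase, find_measure_words(phrase))
```

Under these assumptions 个人 produces no match, because nothing triggers the 个. The sketch still misses reduplicated forms such as 个个 and the second 根 in 一根根, so it is a starting point rather than an answer.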

Question: How to identify measure words in Chinese text?

Searching for Chinese measure words in ACM didn’t give anything relevant, but this problem may have arisen as a sideline in other work on Chinese text segmentation.

Dynamic length of union of segments (1d Klee’s measure problem)

Finding the length of the union of segments (the 1-dimensional Klee’s measure problem) is a well-known algorithmic problem. Given a set of $ n$ intervals on the real line, the task is to find the length of their union. There is a simple $ O(n \log n)$ solution that sorts the endpoints of all intervals and then iterates over them while keeping track of a counter of currently open intervals.
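The static sweep described above can be sketched as follows. The function name `union_length` and the representation of intervals as `(lo, hi)` pairs with `lo <= hi` are assumptions made for this example.

```python
def union_length(intervals):
    """Length of the union of closed intervals given as (lo, hi) pairs.

    Sorts the 2n endpoints (O(n log n)) and sweeps left to right while
    counting how many intervals are currently open.
    """
    events = []
    for lo, hi in intervals:
        events.append((lo, 1))    # an interval opens at lo
        events.append((hi, -1))   # an interval closes at hi
    events.sort()

    total = 0
    open_count = 0
    prev = None
    for x, delta in events:
        # The stretch [prev, x] is covered iff some interval is open.
        if open_count > 0:
            total += x - prev
        open_count += delta
        prev = x
    return total


if __name__ == "__main__":
    print(union_length([(0, 2), (1, 3), (5, 6)]))
```

Ties between a closing and an opening endpoint at the same coordinate are harmless here, because the stretch between them has length zero.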

Now let’s look at a dynamic variant of this problem: we receive $ n$ segments one after another, and the task is to find the length of the current union of segments after each new segment arrives. It seems like a natural generalization, but I cannot find much information about this version of the problem. I wonder if it is still possible to solve this problem in $ O(n \log n)$ total time in an offline setting, i.e. when we are given all segments beforehand and can preprocess them in some way before giving an answer for each segment.

I feel like a modification of a segment tree might be useful here. In every node we need to keep track of both the total sum of lengths of segments (or their parts) corresponding to this node and the length of their union. However, I cannot figure out how to implement this modification without the performance of a single update degrading to linear time in some cases. Maybe using a segment tree is not the right approach and a better way exists.
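One alternative to the segment tree idea, given only as a hedged sketch and not taken from the question, is to keep the current union as a sorted collection of disjoint merged intervals. Each new segment absorbs every stored interval it overlaps or touches. Each stored interval is created once and absorbed at most once, so with a balanced search tree the total work would be $ O(n \log n)$. The class name `UnionTracker` is hypothetical. This Python version uses plain lists with `bisect`, so a single update can still cost linear time in list shifting; it illustrates the bookkeeping only.

```python
import bisect


class UnionTracker:
    """Maintains the union length of segments added one at a time."""

    def __init__(self):
        self.starts = []  # left endpoints of disjoint merged intervals, sorted
        self.ends = []    # matching right endpoints (also sorted)
        self.total = 0

    def add(self, lo, hi):
        """Add the segment [lo, hi] and return the new union length."""
        # Stored intervals with end < lo lie entirely to the left.
        i = bisect.bisect_left(self.ends, lo)
        # Stored intervals with start > hi lie entirely to the right.
        j = bisect.bisect_right(self.starts, hi)
        # Intervals i .. j-1 overlap or touch [lo, hi]; merge them in.
        if i < j:
            lo = min(lo, self.starts[i])
            hi = max(hi, self.ends[j - 1])
            for s, e in zip(self.starts[i:j], self.ends[i:j]):
                self.total -= e - s
        self.starts[i:j] = [lo]
        self.ends[i:j] = [hi]
        self.total += hi - lo
        return self.total


if __name__ == "__main__":
    tracker = UnionTracker()
    for seg in [(0, 2), (5, 6), (1, 3)]:
        print(seg, "->", tracker.add(*seg))
```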

Should user input be validated/checked for its length in PHP (server side) as a security measure?

It is important to note that this user input, after validation and sanitization, will be inserted into a database and later shown to other users on the same web site (for example, a forum). I’m referring both to the case where I know in advance what length to expect from the user and to the case where I don’t, but know roughly that it is no more than a length of 100. I’m trying to figure out whether there are any security advantages to checking user input length in PHP, taking into account that I’m already validating and sanitizing user input with regex, based on the type of content I’m expecting. I know this differs from language to language, so I want to focus on PHP this time, but references to other languages such as Java, .NET, or Python would be fine.

How can I measure the cost of this probabilistic algorithm for a pseudo 3-coloring graph problem?

Problem: I have a classic 3-coloring graph problem in which at least 2/3 of all edges must be well-colored. By well-colored I mean that the two vertices of an edge have different colors. I have to use a probabilistic algorithm with polynomial average cost.

Solution?: I assign each vertex one of the 3 possible colors at random, so the probability that a given edge is well-colored is 2/3. The cost of assigning each vertex a random color is linear, if I am not wrong. And since I have a 2/3 probability that the graph is well-colored, the probability of having found a solution is 2/3. So, with k being a small natural number bounding the number of attempts and n vertices, the algorithm’s cost is $ n^{k}$.
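The random-assignment idea above can be sketched as a retry loop. This is an illustration under stated assumptions, not a verified answer to the cost question: vertices are assumed to be numbered 0..n-1, edges are pairs of distinct vertices, and the name `random_two_thirds_coloring` is hypothetical. A coloring meeting the 2/3 target always exists, since the expected number of well-colored edges is 2m/3 for m edges. Applying Markov's inequality to the number of badly colored edges suggests that each attempt succeeds with probability at least 1/(m+1), so the loop needs at most m+1 attempts in expectation.

```python
import random


def random_two_thirds_coloring(n, edges, rng=None):
    """Retry random 3-colorings until at least 2/3 of the edges are well-colored.

    n     -- number of vertices, labelled 0 .. n-1 (assumption)
    edges -- list of (u, v) pairs with u != v
    rng   -- optional random.Random instance for reproducibility
    Returns (colors, good, attempts).
    """
    rng = rng or random.Random()
    attempts = 0
    while True:
        attempts += 1
        # O(n): one independent uniform color per vertex.
        colors = [rng.randrange(3) for _ in range(n)]
        # O(m): count edges whose endpoints received different colors.
        good = sum(1 for u, v in edges if colors[u] != colors[v])
        # Integer form of good >= (2/3) * m, avoiding floating point.
        if 3 * good >= 2 * len(edges):
            return colors, good, attempts


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(random_two_thirds_coloring(4, k4, random.Random(0)))
```

Because the loop only returns once the condition has been checked, the output is always correct and only the running time is random.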

Doubts: Is this reasoning about the time cost okay? Is this solution a kind of Las Vegas algorithm? Thanks in advance.

Practical use of O-card, or how to measure positive consent on the fly

I’m preparing to run a game of Bluebeard’s Bride with a couple of players I don’t know very well. This game can be quite heavy on disturbing content, so I certainly plan to have an X-card equivalent in the game. At the same time, I have put a clear indication of the theme and some of the possible triggers in the pre-game blurb (it will be run at a small convention dedicated purely to RPGs), so the assumption is that players will be willing to experience at least some of it and push their boundaries.

As it is a single-session game in a predefined timeframe (5-6 hours total), there is a limit to how much pre-game ‘session 0’ research or questionnaires I can do. I also do not expect to have any contact with the players before the game itself.

I’m strongly considering having an equivalent of the O-card in addition to the X-card. For people not familiar with the term, here is a definition from the TTRPG Safety Toolkit:

The O card can be used at any point if a participant wants to continue with the content. When the O card is used by tapping the card or typing an “O” in the chat, the group is ok to continue with the content. They can also regularly be prompted by a “O?” asked out loud or in the chat to check-in if everyone is still ok.

Let’s ignore the online play part.

How does it work in practice with multiple players? The X-card is simple: one player bails out and the scene stops. But with the O-card, is it enough that the directly involved player taps the card to intensify or continue the narration, and the rest can X-card it if they don’t agree? Can other players use the O-card even if they are just listening at the moment? Or do we hold a quick vote, which can be quite awkward with 5 players and puts a kind of peer pressure on the last one not joining, which is what those techniques are meant to avoid?

With LARPs it is a bit easier with red/yellow/green safety words, because

  • you often interact with just one person who can be affected by your actions
  • often you ask about physical interaction but you use verbal confirmation, which intrudes less into the flow

In TTRPGs, the physical gesture on the X-card provides the same distinction between the action (which is verbal) and the safety mechanism (touch in this case); verbal consent techniques would be more invasive.

Do you have any other techniques for players to indicate consent for moving into a ‘higher gear’ on the fly, which work with 5 players?

Can we think of information theory in terms of “a measure on a set of information”?

In information theory, we deal with the quantities $ I(X;Y), H(X),H(Y), H(X|Y), H(Y|X)$ . These are just numbers, but I intuitively think of them as the “measure” of a set of information.

There is at least one special case where this interpretation is exact: suppose there are independent variables $ V_1, \ldots, V_n$ , and the variables $ X,Y,Z$ are tuples of $ V_i$ . Then we can literally think of $ I(X;Y)$ as the measure (entropy) of the intersection of $ X$ and $ Y$ .
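As a small worked instance of this special case (the particular tuples are chosen only for illustration), take independent $ V_1, V_2, V_3$ with $ X = (V_1, V_2)$ and $ Y = (V_2, V_3)$ , and define $ \mu(S) = \sum_{i \in S} H(V_i)$ for a set $ S$ of indices. Independence then gives:

$$
\begin{aligned}
H(X) &= \mu(\{1,2\}) = H(V_1) + H(V_2), \\
H(Y) &= \mu(\{2,3\}) = H(V_2) + H(V_3), \\
H(X,Y) &= \mu(\{1,2,3\}) = H(V_1) + H(V_2) + H(V_3), \\
I(X;Y) &= H(X) + H(Y) - H(X,Y) = H(V_2) = \mu(\{1,2\} \cap \{2,3\}), \\
H(X|Y) &= H(X,Y) - H(Y) = H(V_1) = \mu(\{1,2\} \setminus \{2,3\}).
\end{aligned}
$$

So in this example the mutual information is the measure of the intersection of the index sets, and the conditional entropy is the measure of the set difference.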

But it is not obvious to me whether we can define a more general measure of information, such that mutual information can be interpreted as the measure of the intersection, and analogously for the other information quantities. Is this possible?