Streaming algorithm for counting triangles in a graph

As described in the reference, the algorithm (see at the bottom) supposes to output an estimator $ \hat T$ for the # of triangles in a given graph $ G = (V, E)$ , denoted $ T$ . It is written that "it can easily be shown" that $ $ E[\hat T] = T $ $ But unfortunately, I’m not seeing it. Trying to analyze $ E[\hat T]$ , I think as follows:

  • At line 1, denote the probability to randomly (and uniformly) choose an edge which is part of a triangle as $ p$ . Since triangles can share edges, $ $ \frac T m \le p \le \frac {3T} m $ $ For example, consider the following case:

    enter image description here

    The central triangle doesn’t add new edges to the # of possibilities to choose an edge which is part of a triangle. You can imagine a different configuration, in which there are only the 3 outer triangles and they don’t touch each other (in this configuration, we won’t see the central 4th triangle). In both cases ((case i) 4 triangles as seen in the image; (case ii) 3 disjoint triangles), the probability to choose an edge which is part of a triangle is 1 (although the # of triangles is different).

  • At line 2, the probability to choose uniformly at random a vertex which "closes a triangle" with the edge from the previous step is exactly $ \frac 1 {n-2}$ .

Therefore I only see that

$ $ T \le E[\hat T] \le 3T $ $

What am I missing?

Another question I have is regarding line 3. The stream is ordered, and we first pick a random edge $ (u, v)$ (line 1), then a random vertex $ w$ from $ V \backslash \{u, v\}$ (line 2). I feel that the analysis should take into account that at line 3 we check whether $ (u, w)$ and $ (v, w)$ appear after $ (u, v)$ in the stream. Maybe after I’ll understand the answer to my first question, it will be clearer.


  1. Pick an edge $ (u, v)$ uniformly at random from the stream.
  2. Pick a vertex $ w$ uniformly at random from $ V \backslash \{u, v\}$
  3. If $ (u, w)$ and $ (v, w)$ appear after $ (u, v)$ in the stream, then output $ m(n-2)$ . Else, output $ 0$ .

Also, although I didn’t see it written, I believe there’s an assumption that $ V$ is known ahead.

Reference: Data streams lecture notes by Prof. Amit Chakrabarti, section "15.3 Triangle Counting",

Best regards