Observation regarding finding the predecessor (or successor) of a node in a Binary Search Tree (BST)

I have some propositions regarding BSTs; can someone please confirm whether they are true or false?

Situation:

1. Suppose we have a node $ n_1$ with a value $ val_1$, i.e. $ n_1(val_1)$.

2. We wish to find the predecessor node of $ n_1$ with respect to inorder traversal (that is, the node $ n_2(val_2)$ where $ val_2$ is the greatest value satisfying $ val_2 < val_1$).

3. Assume for simplicity that the BST doesn't contain any repeated values.

Proposition 1: Let $ n_1$ have two children. Then $ n_2$ has at most one child, i.e. the number of children of $ n_2$ can't be two.

Proposition 2: Let $ n_1$ have only one child, and let $ n_1$ not be the root. Then $ n_2$ likewise has at most one child.

Please confirm whether these propositions are true; if one is false, can someone provide a counterexample?
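
For context, here is a minimal sketch of the standard predecessor lookup as I understand it, assuming parent-pointer nodes and distinct keys (the code is only illustrative, not from any particular source). The two branches correspond to the two situations the propositions are probing: a predecessor taken from the left subtree versus a predecessor found among the ancestors.

    class Node:
        def __init__(self, val, parent=None):
            self.val = val
            self.left = None
            self.right = None
            self.parent = parent

    def inorder_predecessor(n1):
        # Case 1: n1 has a left subtree. The predecessor n2 is the
        # maximum of that subtree: follow right pointers to the end,
        # so n2 has no right child by construction.
        if n1.left is not None:
            n2 = n1.left
            while n2.right is not None:
                n2 = n2.right
            return n2
        # Case 2: n1 has no left subtree. The predecessor is the
        # lowest ancestor whose right subtree contains n1.
        n2 = n1.parent
        while n2 is not None and n1 is n2.left:
            n1, n2 = n2, n2.parent
        return n2  # None when n1 holds the smallest value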

Proof and intuition behind a given observation

Consider the following problem:

Given an undirected tree, answer queries of the following type (the number of queries can be as high as $ 10^5$):

$ \text{LCA}(r, u, v)$ : Find the Lowest Common Ancestor of vertices $ u$ and $ v$ assuming vertex $ r$ as the root.


Now, the solution states that the answer will always be one of the following: $ r, u, v, \text{LCA}(r, u), \text{LCA}(r, v), \text{LCA}(u, v)$.

Here $ \text{LCA}(u,v)$ denotes the Lowest Common Ancestor of vertices $ u$ and $ v$ when vertex number $ 1$ is assumed to be the root.


So I'm looking for a proof of the claim made in the solution.
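
This is not a proof, but a brute-force check along the following lines could at least test the claim empirically on small random trees (vertices labelled $ 1$ to $ n$; the naive per-query LCA and all helper names are my own):

    import random
    from itertools import combinations

    def lca(adj, root, u, v):
        # Naive LCA for one query: build a parent map rooted at `root`,
        # collect the ancestors of u, then walk up from v until we hit one.
        parent = {root: None}
        stack = [root]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in parent:
                    parent[y] = x
                    stack.append(y)
        ancestors_u = set()
        while u is not None:
            ancestors_u.add(u)
            u = parent[u]
        while v not in ancestors_u:
            v = parent[v]
        return v

    def check(n=9, trials=100):
        for _ in range(trials):
            # Random labelled tree: attach each vertex to a random earlier one.
            adj = {i: [] for i in range(1, n + 1)}
            for i in range(2, n + 1):
                p = random.randint(1, i - 1)
                adj[i].append(p)
                adj[p].append(i)
            for r in range(1, n + 1):
                for u, v in combinations(range(1, n + 1), 2):
                    ans = lca(adj, r, u, v)
                    candidates = {r, u, v,
                                  lca(adj, 1, r, u),
                                  lca(adj, 1, r, v),
                                  lca(adj, 1, u, v)}
                    assert ans in candidates, (r, u, v)
        print("claim held on every sampled case")

    check()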

Storing earth observation derived attributes for polygons

I want to generate a number of vegetation-related indices (NDVI, NDTI, etc., with min/max/avg/std of the pixel values) from earth observation data for a large number of polygons during the entire growing season, roughly between March and November each year. The figures are about as follows: I have about one million polygons. Every day I get new EO data (Sentinel-1/Sentinel-2) for about 20% of them. For each of these polygons I generate 10-20 indices based on the EO data. This gives me approx. 2-4 million records every day, which makes approx. 500-1000 million during just one growing season (I'll need to store at least 5 seasons).

The infrastructure within which I have to operate is predetermined and will have to be something based on either Oracle (Locator) or PostGIS. Personally I'd prefer PostGIS, since open source allows for much more flexibility.

My initial idea is to create a PostGIS database that is partitioned by year. I thought about creating one attribute table in which I create a new row for each date and each interpreted property (the polygon geometries + IDs are stored in a separate table). It would look something like this:

[image: table_structure]
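
To make the partitioning idea concrete, roughly what I have in mind would look like the sketch below, using PostgreSQL declarative range partitioning (PostgreSQL 10+; the index on the partitioned parent needs 11+). All table and column names are hypothetical, as is the connection string.

    import psycopg2

    DDL = """
    CREATE TABLE polygon_indices (
        polygon_id bigint NOT NULL,  -- references the separate geometry table
        obs_date   date   NOT NULL,
        index_name text   NOT NULL,  -- e.g. 'NDVI', 'NDTI'
        val_min    real,
        val_max    real,
        val_avg    real,
        val_std    real
    ) PARTITION BY RANGE (obs_date);

    -- one partition per year
    CREATE TABLE polygon_indices_2019 PARTITION OF polygon_indices
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');

    CREATE INDEX ON polygon_indices (polygon_id, obs_date);
    """

    with psycopg2.connect("dbname=eo_indices") as conn:
        with conn.cursor() as cur:
            cur.execute(DDL)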

Since I have to apply different interpretations depending on the geographical zone in which a polygon lies, I also thought about creating a separate table for each zone. This will, however, make querying more difficult.

My questions are therefore:

  1. Does Oracle (Locator) or PostGIS as the base for all this make sense at all, or do I need to start asking for an account at an ESA DIAS/Google Earth Engine/AWS in order to be able to use cloud solutions?
  2. If this does make sense, what is your opinion on my planned table structure?

All comments & help welcome.

Keep Data Pure During On-site User Observation

I am going to be running small on-site user observation sessions for UX research on a mobile application that uses voice.

I have never conducted a user observation session before. I've read a lot of articles on NNGroup and other UX and usability sites, guides on how to run a user observation session such as this article, and a course on UX research and data on Lynda.com, but nothing has given me a conclusive or comfortable answer to this: how (and whether) should I prompt and guide the user to use the product during an on-site visit?

If I want to see how they use the product after a sales call, should I prompt the user to use specific features of the app or should I just watch what they do with the app without saying a word?

If they ask me how to use a feature, should I tell them? Of course this depends on what the observation is trying to achieve. I want to see how a salesperson would use the app without prompting them. But let's say they've never used the app and have no context: do I provide it?

I can imagine a user would get frustrated if they had no idea what I expect of them. If I tell them, "I want you to add details on your client to the app," is that interfering with the study? I want to see whether they add details to their client, and what troubles they run into when they try to do so.

I know some of these questions are a bit subjective, but keeping data honest is important and I am looking for the best way to do so.

Last Observation Carried Forward in Python with datetime

I have an event dataset that, during retrieval, only recorded changes, and I want these changes to be converted into a uniform time series. The data is recorded at a 12-hour interval. The retrieval_time column is an object and start_time is datetime64.

    ID         Count  retrieval_time                 start_time
    100231380  70     2017-10-11T23:30:00.000+10:30  21/10/17 23:30
    100231380  70     2017-10-12T11:30:00.000+10:30  21/10/17 23:30
    100231380  72     2017-10-12T23:30:00.000+10:30  21/10/17 23:30
    100231380  72     2017-10-13T11:30:00.000+10:30  21/10/17 23:30
    100231380  73     2017-10-13T23:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-14T11:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-14T23:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-15T11:30:00.000+10:30  21/10/17 23:30
    100231380  77     2017-10-15T23:30:00.000+10:30  21/10/17 23:30
    100231380  83     2017-10-16T11:30:00.000+10:30  21/10/17 23:30
    100231380  85     2017-10-16T23:30:00.000+10:30  21/10/17 23:30
    100231380  85     2017-10-17T11:30:00.000+10:30  21/10/17 23:30
    100231380  90     2017-10-17T23:30:00.000+10:30  21/10/17 23:30
    100231380  90     2017-10-18T11:30:00.000+10:30  21/10/17 23:30
    100231380  93     2017-10-18T23:30:00.000+10:30  21/10/17 23:30
    100231380  99     2017-10-19T23:30:00.000+10:30  21/10/17 23:30
    100231380  104    2017-10-20T23:30:00.000+10:30  21/10/17 23:30
    100231380  117    2017-10-21T23:30:00.000+10:30  21/10/17 23:30

I want to make it consistent. For example, in the last three rows (from 19/10/2017 onward in retrieval_time) there is no recorded data for 11:30 am. I want to add such rows and fill each with the last observation for the entire row.

I want the output to be something like this:

    ID         Count  retrieval_time                 start_time
    100231380  70     2017-10-11T23:30:00.000+10:30  21/10/17 23:30
    100231380  70     2017-10-12T11:30:00.000+10:30  21/10/17 23:30
    100231380  72     2017-10-12T23:30:00.000+10:30  21/10/17 23:30
    100231380  72     2017-10-13T11:30:00.000+10:30  21/10/17 23:30
    100231380  73     2017-10-13T23:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-14T11:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-14T23:30:00.000+10:30  21/10/17 23:30
    100231380  74     2017-10-15T11:30:00.000+10:30  21/10/17 23:30
    100231380  77     2017-10-15T23:30:00.000+10:30  21/10/17 23:30
    100231380  83     2017-10-16T11:30:00.000+10:30  21/10/17 23:30
    100231380  85     2017-10-16T23:30:00.000+10:30  21/10/17 23:30
    100231380  85     2017-10-17T11:30:00.000+10:30  21/10/17 23:30
    100231380  90     2017-10-17T23:30:00.000+10:30  21/10/17 23:30
    100231380  90     2017-10-18T11:30:00.000+10:30  21/10/17 23:30
    100231380  93     2017-10-18T23:30:00.000+10:30  21/10/17 23:30
    100231380  93     2017-10-19T11:30:00.000+10:30  21/10/17 23:30
    100231380  99     2017-10-19T23:30:00.000+10:30  21/10/17 23:30
    100231380  99     2017-10-20T11:30:00.000+10:30  21/10/17 23:30
    100231380  104    2017-10-20T23:30:00.000+10:30  21/10/17 23:30
    100231380  104    2017-10-21T11:30:00.000+10:30  21/10/17 23:30
    100231380  117    2017-10-21T23:30:00.000+10:30  21/10/17 23:30

I also want to know how to format retrieval_time and start_time into the same representation so that they can be compared.

Also, I want a generic solution, since I have aggregated grouped data for multiple events; the time interval is the same 12 hours for all of them, but retrieval_time and start_time differ from event to event.
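
For reference, here is a minimal sketch of what I imagine the transformation could look like in pandas; the file name, the parse format for start_time, and the choice to drop the fixed +10:30 offset are all assumptions on my part:

    import pandas as pd

    df = pd.read_csv("events.csv")

    # Make both columns proper datetimes so they can be compared.
    # Dropping the fixed +10:30 offset keeps retrieval_time comparable
    # with the offset-naive start_time.
    df["retrieval_time"] = (pd.to_datetime(df["retrieval_time"])
                              .dt.tz_localize(None))
    df["start_time"] = pd.to_datetime(df["start_time"],
                                      format="%d/%m/%y %H:%M")

    def locf(group):
        # Build the full 12-hour grid from the first to the last
        # retrieval of this event, then forward-fill the missing rows
        # (the LOCF step).
        group = group.set_index("retrieval_time").sort_index()
        grid = pd.date_range(group.index[0], group.index[-1], freq="12h")
        return group.reindex(grid, method="ffill")

    out = (df.groupby("ID", group_keys=False)
             .apply(locf)
             .rename_axis("retrieval_time")
             .reset_index())
    print(out)

Because each group's grid is anchored at that event's own first timestamp, the 11:30/23:30 pattern is preserved per event even though retrieval_time and start_time differ between events.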

Thanks.