Choosing an AI method to recreate a given binary 2D image


If the title wan not very clear, I want a method to take an input image like this,

[[0, 0, 0, 0],  [1, 1, 1, 0],  [1, 1, 1, 0],  [0, 1, 1, 0]] 

and output the 2D coordinates of the 1s of the image (So that I can recreate the image)

I want the output to be sequential because I need to reconstruct the input image pixel by pixel and there are some conditions on the image construction order (e.g. You cannot place a 1 somewhere when it is surrounded by 1s)

The image can change and the number of 1s in the image too.

  1. What is an appropriate AI method to apply in this case?
  2. How should I feed the image to the network? (Will flattening it to 2D affect my need of an output order?)
  3. Should I get the output coordinates one by one or as an ordered 2xN matrix?
  4. If one by one, should I feed the same image for each output or the image without the 1s already filled?

EDIT: The application is a robot creating the image using some kind of building blocks, placing one block after the other


I have tried to apply "NeuroEvolution of Augmenting Topologies" for this using neat-python but was unsuccessful. I am currently looking at RNNs but I am not sure if it is the best choice either.