How would a CPU designed purely for functional programming be different?

CPU’s are to an extent designed with in mind the software that people will write for it, implicitly or explicitly.

It seems to me that if you look at the design of instruction set architectures, they are very “imperative”, in the sense that each instruction encodes an imperative style command. It also seems to me that the current instruction set architectures have evolved partly based on the type of code programmers produce.

If one would design a CPU from scratch, knowing that it would only ever run programs written in a functional programming style, how would that CPU be designed differently from existing CPU’s?

Are the definitions of constructs in terms of lambda terms issues in implementation/design or uses of functional languages?

In Lambda Calculus, natural numbers, boolean values, list processing functions, recursion, if function are defined in terms of lambda terms. For example, natural numbers are defined as Church numerals, and recursion is defined in terms of a fixed point of a function.

Functional languages are said to be based on Lambda Calculus.

Who shall be concerned about the above concepts in terms of lambda terms: the implementer/designer of the languages, and/or programmers in the languages?

  • Do functional programming languages define/implement the above concepts in terms of lambda terms?

  • As programmers in regular functional programming languages (such as Haskell, Lisp, ML), is it correct that the above concepts are always given in the same way as in imperative languages, and we never have to understand or deal with their definitions in terms of lambda terms?

Thanks.

How does functional programming handle the situation where the same object is referenced from multiple places?

I am reading and hearing that people (also on this site) routinely praise the functional programming paradigm, emphasising how good it is to have everything immutable. Notably, people propose this approach even in traditionally imperative OO languages, like C#, Java or C++, not only in purely functional languages like Haskell that force this on the programmer.

I find it hard to understand, because I find mutability and side-effects… convenient. However, given how people currently condemn side effects and consider it a good practice to get rid of them wherever possible, I believe that if I want to be a competent programmer I have to start my jorney towards some better understanding of the paradigm… Hence my Q.

One place when I find problems with the functional paradigm is when an object is naturally referenced from multiple places. Let me describe it on two examples.

The first example will be my C# game I’m trying to make in my spare time. It’s turn-based web game where both players have teams of 4 monsters and can send a monster from their team to the battlefield, where it will face the monster sent by the opposing player. Players can also recall monsters from the battlefield and replace them with another monster from their team (similarily to Pokemon).

In this setting, a single monster can be naturally referenced from at least 2 places: a player’s team and the battlefield, which references two “active” monsters.

Now let’s consider the situation when one monster is hit and looses 20 health points. Within the brackets of the imperative paradigm I modify this monster’s health field to reflect this change – and this is what I’m doing now. However, this makes the Monster class mutable and the related functions (methods) impure, which I guess is considered a bad practice as of now.

Even though I gave myself the permission to have the code of this game in a lesser-than-ideal state in order to have any hopes of actually finishing it at some point of the future, I would like to know and understand how should it be written properly. Therefore: If this is a design flaw, how to fix it?

In the functional style, as I understand it, I’d instead make a copy of this Monster object, keeping it identical to the old one except for this one field; and the method suffer_hit would return this new object instead of modifying the old one in place. Then I’d likewise copy the Battlefield object, keeping all its fields the same except for this monster.

This comes with at least 2 difficulties:

  1. The hierarchy can easily be much deeper than this simplified example of just Battlefield -> Monster. I’d have to do such copying of all fields except one and returning a new object all the way up this hierarchy. This would be boilerplate code which I find annoying especially since functional programming is supposed to reduce boilerplate.
  2. A much more severe problem, however, is that this would lead to the data being out of sync. The field’s active monster would see its health reduced; however, this same monster, referenced from its controlling player Team, would not. If I instead embraced the imperative style, every modification of data would be instantly visible from all other places of code and in such cases like this one I find it really convenient – but the way I’m getting things this is precisely what people say is wrong with the imperative style!
    • Now it would be possible to take care of this issue by taking a journey to the Team after each attack. This is extra work. However, what if a monster can suddenly be later referenced from even more places? What if I come with an ability that, for example, lets a monster focus on another monster that is not necessarily on the field (I’m actually considering such an ability)? Will I surely remember to take also a journey to focused monsters immediatelly after each attack? This seems to be a time bomb that will explode as the code gets more complex, so I think it is a no solution.

An idea for a better solution comes from my second example, when I hit the same problem. In the academia we were told to write an interpreter of a language of our own design in Haskell. (This is also how I was forced to start understanding what FP is). The problem showed up when I was implementing closures. Once again the same scope could now be referenced from multiple places: Through the variable that holds this scope and as the parent scope of any nested scopes! Obviously, if a change is made to this scope via any of the references pointing to it, this change must also be visible through all other references.

The solution I came with was to assign each scope an ID and hold a central dictionary of all scopes in the State monad. Now variables would only hold the ID of the scope they were bound to, rather than the scope itself, and nested scopes would also hold the ID of their parent scope.

I guess the same approach could be attempted in my monster battling game… Fields and teams do not reference monsters; they instead hold IDs of monsters that are saved in a central monster dictionary.

However, I can once again see a problem with this approach that prevent me from accepting it without hesistation as the solution to the issue:

It once again is a source of boilerplate code. It makes one-liners necessarily 3-liners: what previously was a one-line in-place modification of a single field now requires (a) Retrieving the object from the central dictionary (b) Making the change (c) Saving the new object to the central dictionary. Also, holding ids of objects and central dictionaries instead of having references increases complexity. Since FP is advertised to reduce complexity and boilerplate code, this hints I’m doing it wrong.

I was also going to write about asecond problem that seems much more severe: This approach introduces memory leaks. Objects that are unreachable will normally be garbage collected. However, objects held in a central dictionary cannot be garbage collected, even if no reachable object references this particular ID. And while theoretically careful programming can avoid memory leaks (we could take care to manually remove each object from the central dictionary once it is no longer needed), this is error prone and FP is advertised to increase the correctness of programs so once again this may not be the correct way.

However, I found out in time that it rather seems to be a solved problem. Java provides WeakHashMap that could be used to solve this issue. C# provides a similar facility – ConditionalWeakTable – although according to the docs it is meant to be used by compilers. And in Haskell we have System.Mem.Weak.

Is storing such dictionaries the correct functional solution to this problem or is there a simpler one that I’m failing to see? I imagine that the number of such dictionaries can easily grow and badly; so if these dictionares are supposed to also be immutable this can mean a lot of parameter-passing or, in languages that support that, monadic computations, since the dictionaries would be held in monads (but once again I’m reading that in purely functional languages as little code as possible should be monadic, while this dictionary solution would place almost all code inside of the State monad; which once again makes me doubt if this is the correct solution.)

Is lack of functional requirements agile?

Nowadays everybody wants to be agile. In every team I worked with, the shape of agile was different. Some things are common – like daily stand-ups or planning, but other parts vary significantly.

In my current team there’s one detail which I find disturbing. It’s lack of functional requirements. Not only there’s no written form of expectations but also in the tasks it’s rather vaguely defined what needs to be done.

The project goal is to rewrite of the old system using new technologies. Old system doesn’t have any reasonable documentation as well. For sure up to date one doesn’t exist. Business owners’ description of requirements is – let’s do it in new implementation the same way as old. It seems reasonable but it’s not. Old system is kind of spaghetti code and extracting business requirements from it is costly. It seems that the situation affects planning in a negative way. For sure it’s prone to mistakes and bugs in new implementation (omitting some details).

Therefore I’m thinking – is it truly agile to have no business requirements in case of rewriting old system?

C++ Functional Programming: How can I convert nested loops into functional approach?

I first created my function using an imperative approach. This is just how I naturally write code most of the time. I then tried to think about the problem using a “What I want to accomplish?” approach, rather than “What do I want it to do?” In the end I think I still ended up with an imperative approach, just using STL algorithms as loops.

The code itself pre-calculates the winning combinations for an n-sized tic tac toe board.

Imperative approach:

std::vector<std::vector<int>> rules3(const int x){     using namespace std;      vector<vector<int>> seqs;     for(int n=0; n<x; ++n){         vector<int> seq;         for(int m=0; m<x; ++m){             seq.push_back(n*x+m);         }         seqs.push_back(seq);     }     for(int n=0; n<x; ++n){         vector<int> seq;         for(int m=0; m<x; ++m){             seq.push_back(n+x*m);         }         seqs.push_back(seq);     }     vector<int> seq;     for(int n=0; n<x; ++n){         seq.push_back(n*x+n);     }     seqs.push_back(seq);     seq.clear();     for(int n=0; n<x; ++n){         seq.push_back(x-1+n*(x-1));     }     seqs.push_back(seq);      return seqs; } 

Getting more functional:

std::vector<std::vector<int>> rules2(const int x){     using namespace std;      vector<vector<int>> seqs;     vector<int> iter(x);     iota(iter.begin(), iter.end(), 0);      for(auto n: iter){         vector<int> seq;         for(auto m: iter){             seq.push_back(n*x+m);         }         seqs.push_back(seq);     }     for(auto n: iter){         vector<int> seq;         for(auto m: iter){             seq.push_back(n+x*m);         }         seqs.push_back(seq);     }     vector<int> seq;     seq.clear();     for(auto n: iter){         seq.push_back(n*x+n);     }     seqs.push_back(seq);     seq.clear();     for(auto n: iter){         seq.push_back(x-1+n*(x-1));     }     seqs.push_back(seq);      return seqs; } 

Functional approach? (still feels imperative):

std::vector<std::vector<int>> rules1(const int x){     using namespace std;      vector<vector<int>> seqs;     vector<int> iter(x);     iota(iter.begin(), iter.end(), 0);      transform(iter.begin(), iter.end(), back_inserter(seqs), [&](const int& n){         vector<int> seq;         transform(iter.begin(), iter.end(), back_inserter(seq), [&](const int& m){             return n*x+m;         });         return seq;     });     transform(iter.begin(), iter.end(), back_inserter(seqs), [&](const int& n){         vector<int> seq;         transform(iter.begin(), iter.end(), back_inserter(seq), [&](const int& m){             return n+x*m;         });         return seq;     });     vector<int> seq;     transform(iter.begin(), iter.end(), back_inserter(seq), [&](const int& n){         return n*x+n;     });     seqs.push_back(seq);     seq.clear();     transform(iter.begin(), iter.end(), back_inserter(seq), [&](const int& n){         //return (x-n-1)*x+n; // 6,4,2         return x-1+n*(x-1); // 2,4,6     });     seqs.push_back(seq);      return seqs; } 

Output when printing:

rules3 std::vector(0, 1, 2) std::vector(3, 4, 5) std::vector(6, 7, 8) std::vector(0, 3, 6) std::vector(1, 4, 7) std::vector(2, 5, 8) std::vector(0, 4, 8) std::vector(2, 4, 6) rules2 std::vector(0, 1, 2) std::vector(3, 4, 5) std::vector(6, 7, 8) std::vector(0, 3, 6) std::vector(1, 4, 7) std::vector(2, 5, 8) std::vector(0, 4, 8) std::vector(2, 4, 6) rules1 std::vector(0, 1, 2) std::vector(3, 4, 5) std::vector(6, 7, 8) std::vector(0, 3, 6) std::vector(1, 4, 7) std::vector(2, 5, 8) std::vector(0, 4, 8) std::vector(2, 4, 6) 

In Tomasulo’s algorithm, how do reservation stations recognise which results are directed to them, especially where functional units are pipelined?

I am currently researching instruction level parallelism in CPUs and have come across Tomasulo’s algorithm for dynamic scheduling.

As I understand it so far, once a functional unit computes a result, it is sent on the CDB from which it can be received by reservation stations waiting for this result. How do these reservation stations recognise that the result is one they are waiting for? I assume results sent on the CDB are tagged with the ID of the reservation station involved in computing them, but then how is this implemented where the functional units are pipelined or there are multiple reservation stations per functional unit? Is the ID of the reservation station corresponding to the instruction being executed passed through the functional unit’s pipeline to be used as a tag when the result is sent?

Config file for Python code with functional dependencies between values

I need a config file for some code in Python with functional dependencies between values and possible overrides from command line arguments. What’s the best approach for it?

At the moment the config is like this:

PARAM_1 = 128 PARAM_2 = PARAM_1 * 2  PARAM_3 = 1024 PARAM_4 = (PARAM_3 / PARAM_1 - 1) * (PARAM_3 / PARAM_1) + 1 

I want to be able to override from command line the value of PARAM_1 and make other values change as well. I.e. if I pass --PARAM_1 64 two parameters above should change.

If I put it into config.py file and use import config.py, then updating PARAM_1 value from command line argument does not affect dependent parameters.

If I put it into an XML file, it looks ugly but should work using eval‘s (it’s an inhouse project so security vulnerability is not an issue).

Is there a good, established way for such a configuration file?

Config file for Python code with functional dependencies between values

I need a config file for some code in Python with functional dependencies between values and possible overrides from command line arguments. What’s the best approach for it?

At the moment the config is like this:

PARAM_1 = 128 PARAM_2 = PARAM_1 * 2  PARAM_3 = 1024 PARAM_4 = (PARAM_3 / PARAM_1 - 1) * (PARAM_3 / PARAM_1) + 1 

I want to be able to override from command line the value of PARAM_1 and make other values change as well. I.e. if I pass --PARAM_1 64 two parameters above should change.

If I put it into config.py file and use import config.py, then updating PARAM_1 value from command line argument does not affect dependent parameters.

If I put it into an XML file, it looks ugly but should work using eval‘s (it’s an inhouse project so security vulnerability is not an issue).

Is there a good, established way for such a configuration file?

Find the top N numbers in an endless stream of integers in a functional programming style

I came across this interesting Scala problem and not sure how to solve it:

class TopN {   def findTopN(n: Int)(stream: Stream[Int]): List[Int] = {    ???   } }  

This is a test of abstract data engineering skills.

The function findTopN(…) in TopN is supposed to find the top N highest unique integers in a presumed endless stream of integers. To process the Stream of Int, you can only hold a few values in memory at a given time. Therefore, a memory efficient way to process this list is required.