Can DDD entity rely upon infrastructure library?

Suppose I want this on my User entity:

user.createNewSecurityToken(); 

That means:

public void createNewSecurityToken()
{
    var buffer = new byte[32];
    new RNGCryptoServiceProvider().GetBytes(buffer);  // <--- here's the issue
    this.Token = Convert.ToBase64String(buffer);
}

This is not a DI dependency, just a class from the infrastructure. But does that “break” DDD?

The alternative is this:

user.setSecurityToken(token);   // pass it in (probably from a domain service) 

But that leads to anemic entities.
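For concreteness, here is a minimal sketch of the domain-service alternative (in Python rather than C#, with hypothetical names such as SecurityTokenService; it only illustrates the shape of the trade-off, not a prescribed implementation):

import base64
import secrets


class User:
    # Entity: only stores the token; generation lives outside the entity.
    def __init__(self):
        self.token = None

    def set_security_token(self, token):
        self.token = token


class SecurityTokenService:
    # Domain service that owns the randomness/infrastructure dependency.
    def create_token_for(self, user):
        buffer = secrets.token_bytes(32)                      # 32 random bytes
        user.set_security_token(base64.b64encode(buffer).decode())


user = User()
SecurityTokenService().create_token_for(user)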

Which is the preferred DDD approach, and what considerations should I take into account when faced with this sort of design decision?

Best Approach for PII and Databases

PII, or Personally Identifiable Information, may include the following, among other data:

  • First Name
  • Last Name
  • Email Address
  • Phone (any of them)
  • Billing Address
  • Shipping Address

Security researchers and legislation recommend that we encrypt this data. In my case, I’m using PHP7 and MariaDB, so I can store encrypted PII using the latest recommended and battle-tested encryption APIs. However, now that I need an admin system, I need to be able to:

  • Sort data by PII table columns
  • Do case-insensitive partial keyword searches

Unfortunately, I’m not clear on how to do that smoothly with encrypted data stored in MariaDB (possibly with PHP7 doing the encryption and decryption).
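One technique that often comes up for this is a “blind index”: alongside the encrypted column you store a keyed hash (e.g. an HMAC) of a normalized copy of the value, which supports exact, case-insensitive equality lookups but not sorting or partial keyword search. A minimal conceptual sketch follows (in Python rather than PHP, with a made-up key name, purely to illustrate the idea):

import hashlib
import hmac

INDEX_KEY = b"separate-key-used-only-for-blind-indexes"   # hypothetical key


def blind_index(value):
    # Keyed hash of the normalized value; store it next to the ciphertext.
    normalized = value.strip().lower()
    return hmac.new(INDEX_KEY, normalized.encode(), hashlib.sha256).hexdigest()


# On write: store encrypt(email) plus blind_index(email) in separate columns.
# On search: SELECT ... WHERE email_index = blind_index('Alice@Example.com')
print(blind_index("Alice@Example.com") == blind_index(" alice@example.com "))   # True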

System Design question about consistent hashing

Regarding data sharding,
Sharding based on UserID: We can try storing all the data of a user on one server. While storing, we can pass the UserID to our hash function, which will map the user to a database server where we will store all of the user’s tweets, favorites, follows, etc. While querying for tweets/follows/favorites of a user, we can ask our hash function where the user’s data can be found and then read it from there. This approach has a couple of issues:

What if a user becomes hot? There could be a lot of queries on the server holding that user. This high load will affect the performance of our service. Over time some users can end up storing a lot of tweets or having a lot of follows compared to others. Maintaining a uniform distribution of growing user data is quite difficult. To recover from these situations we either have to repartition/redistribute our data or use consistent hashing.

Sharding based on TweetID: Our hash function will map each TweetID to a random server where we will store that Tweet. To search for tweets, we have to query all servers, and each server will return a set of tweets. A centralized server will aggregate these results to return them to the user. Let’s look at the timeline generation example; here are the steps our system has to perform to generate a user’s timeline:

  • Our application (app) server will find all the people the user follows.
  • The app server will send the query to all database servers to find tweets from these people.
  • Each database server will find the tweets for each user, sort them by recency, and return the top tweets.
  • The app server will merge all the results and sort them again to return the top results to the user.

This approach solves the problem of hot users, but, in contrast to sharding by UserID, we have to query all database partitions to find the tweets of a user, which can result in higher latencies.
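For illustration only, here is a small sketch (hypothetical data, not part of the quoted design) of that final scatter-gather step, where each shard returns its tweets already sorted by recency and the app server merges them:

import heapq

# Hypothetical per-shard results: (timestamp, tweet_id), newest first.
shard_results = [
    [(1700000300, "t9"), (1700000100, "t3")],
    [(1700000250, "t7"), (1700000050, "t1")],
    [(1700000200, "t5")],
]


def merge_timelines(shard_results, top_n=3):
    # Merge the already-sorted shard results and keep the newest top_n tweets.
    merged = heapq.merge(*shard_results, key=lambda t: t[0], reverse=True)
    return [tweet_id for _, tweet_id in list(merged)[:top_n]]


print(merge_timelines(shard_results))   # ['t9', 't7', 't5']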

Consistent hashing can be used to overcome the issues of “search hot words” (words that many people are searching for) and “status hot words” (words that many people are using in their statuses). My question is: how can the consistent hashing technique help here? In consistent hashing, if we hash a word to get an index on the ring, then, since the hash function does not change, multiple people searching for that word will end up on the same server, unless we add some other attribute to the hash function. So how does the “search hot words” problem get solved? Similarly, with a key-value pair where the key is a word and the value is a list of statusIDs, if a key becomes popular in statuses it will accumulate more statusIDs in its value. How does consistent hashing help?
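To make the question concrete, here is a minimal consistent-hash ring sketch (the server names, vnode count, and the idea of appending an extra attribute to the key are all illustrative assumptions, not something the quoted design specifies):

import bisect
import hashlib


def _hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class ConsistentHashRing:
    # Minimal ring: each node gets a few virtual points on the ring.
    def __init__(self, nodes, vnodes=3):
        self.ring = sorted((_hash("%s#%d" % (n, i)), n) for n in nodes for i in range(vnodes))
        self.points = [h for h, _ in self.ring]

    def node_for(self, key):
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]


ring = ConsistentHashRing(["server-a", "server-b", "server-c"])

# The hash of a word never changes, so a hot word always lands on one server:
print({ring.node_for("superbowl") for _ in range(5)})            # a single server

# Only by adding some extra attribute to the key does the load spread out:
print({ring.node_for("superbowl:%d" % i) for i in range(5)})     # usually several servers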

Which is better approach – using subprocess vs communicating over socket?

I have been working on a Python project for some time, which has been structured in the following manner:

We have a function like this:

import subprocess

def execute_cmd(cmd):
    # capture stdout/stderr so communicate() actually returns them
    (out, err) = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE).communicate()
    return (out, err)

and this function is used over and over to execute binaries from a third-party library.
I did a little bit of research and understood that we could instead connect to the third-party library over a socket and communicate by sending JSON payloads.
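For comparison, a rough sketch of what the socket approach could look like (the host, port, and newline-delimited JSON framing are assumptions; the real protocol depends on the third-party library):

import json
import socket


def send_request(payload, host="127.0.0.1", port=9000):
    # One JSON request/response over TCP; assumes newline-delimited JSON.
    with socket.create_connection((host, port)) as sock:
        sock.sendall(json.dumps(payload).encode() + b"\n")
        response = sock.makefile().readline()
    return json.loads(response)


# Example (hypothetical command names):
# result = send_request({"command": "convert", "args": ["input.pdf"]})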

I am trying to understand which approach is better, and why.
From what I understand, every time we create a subprocess, the underlying OS has to create a PCB for it, allocate resources, and deallocate those resources after completion. However, I see that creating subprocesses is a really common practice in Python programming. This has left me completely confused.

It would be really helpful if anyone could point out which approach is better and why.

P.S.: Please let me know if this is not the correct forum to post questions like this.

Exposing services of an existing MVC app as REST API

We have an existing ASP.NET MVC project which has logging, exception handling and OCR services within it. Now we are developing a small SPA project where we would like to consume the existing services from our MVC project. My question is: what would be the best approach for doing that?

I was considering building an API project within my existing MVC project and exposing the existing services from there. But I am not sure if this is the best approach to go with.

I am also keen to know how authentication will work for my API. We are using OAuth/OIDC SSO with Azure AD for the MVC project, and our SPA uses ADAL.js for authentication. Both applications are internal to our organization.

Please provide your valuable suggestions. Many Thanks.

Deleting records and readding in MySQL using MyISAM forgets constraints with Hibernate [migrated]

MySQL – MyISAM – Hibernate

The first error in my assumptions was that my DB had foreign keys. (I’m not as versed in the different engines provided by MySQL.)

I was testing a multiple-ID delete, messed up a line, and it deleted everything. I was confused, since a FK constraint should have thrown an error.

I had copies of the data so I just imported them back in. Upon running my app the objects are no longer tied to each other.

I am unsure where in the process this gets tied together. I assume Hibernate, but I am not confident where.

Does anyone know where this connection gets defined and how to resolve the issue?

What’s the best approach when you need to work off two branches that still have pending PR requests?

Let’s say I have branch-1 (a major update, branched off master) and branch-2 (a minor update, also branched off master).

What is the best approach when I need to create another branch, e.g. branch-3, and continue the work from both branches while their PRs are still pending?

And what do I do once both the PRs have been approved?

How to create a safe data model managed using unique_ptr

My data model is essentially a collection of elements. Elements are subclassed (for example, a drawing app might have different types of geometric primitives). I use unique_ptr to manage the lifetimes of the elements. Basically:

class Model {
  // various methods
private:
  vector<unique_ptr<Element>> _elements;
};

I’d like to design an interface to this data model that is clean and not terribly error prone.

Because I want to ensure the elements are dynamically allocated, I use something like the following to add them to the collection:

template<class T>
T& Model::Create() {
   auto ptr = make_unique<T>(this);
   T& ref = *ptr;                 // grab the reference before the unique_ptr is moved
   _elements.push_back(move(ptr));
   // ... store undo information
   return ref;
}

So I can do:

Model m;
auto& e = m.Create<MyElement>();

which seems cleaner than:

Model m;
auto ptr = make_unique<MyElement>(&m);
// at this point we have an element which
// thinks it's part of the collection, but isn't
m.Add(move(ptr));

One problem with Create is that the returned reference can dangle if another operation removes the element from the collection (such as an undo). Is there a decent way to handle this? Using shared_ptr and weak_ptr would mean that the lifetime of an Element could be extended beyond the Model.

Another option is to have Element add itself to the Model as follows:

Element::Element(Model* model) {
  model->_elements.push_back(unique_ptr<Element>(this));
}

This would ensure that the data model is in a consistent state as soon as the element is constructed. But elements would need to be constructed using operator new as follows:

auto e = new MyElement(model); 

This seems error-prone because someone could try creating an Element on the stack. To my knowledge, there is no way to ensure an object is dynamically allocated.

How can I make my data model as safe as possible?

Where to store side effect state in an event sourcing system

I also had a look at “How do I deal with side effects in Event Sourcing?”, but the solution wasn’t clear to me.

If I store an “EmailSent” event in the event stream, I might issue the external request to send the email again, thinking the previous send timed out, moments before the confirmation that the first email was successfully sent arrives.

However, if I never store that, and instead have all sent email IDs persistently stored by the email service, I will never know to stop bothering the email service with my old requests to send a particular email, despite it answering “this email has been previously sent successfully” every time.

Should I do both? This way the system will stop bothering the email service a lot of the time, but when it accidentally bothers it multiple times, only one email will be sent. (Yes, the email service can never be transactional, and I still have to choose “at most once” or “at least once” but having its limited scope and data locality it can give practical results much closer to “once”.)
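For what it’s worth, here is a minimal sketch of the “do both” idea, assuming the request to the email service carries a deterministic email ID that the service uses as an idempotency key (all names are hypothetical):

import uuid

sent_ids = set()   # persisted by the (hypothetical) email service


def email_service_send(email_id, to, body):
    # Idempotent send: retrying with the same email_id is a no-op.
    if email_id in sent_ids:
        return "already sent"
    sent_ids.add(email_id)
    # ... hand the message to the mail provider here ...
    return "sent"


def handle_order_confirmed(order_id):
    # Deterministic ID: retrying the same command yields the same email_id,
    # so at most one email goes out even if we ask the service twice.
    email_id = str(uuid.uuid5(uuid.NAMESPACE_URL, "order-confirmed-%s" % order_id))
    result = email_service_send(email_id, "customer@example.com", "Your order shipped")
    # Also record an EmailSent event in the stream, so we usually stop retrying.
    return result


print(handle_order_confirmed(42))   # "sent"
print(handle_order_confirmed(42))   # "already sent"; the duplicate sends nothing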

How do you discover what libraries to use when solving a problem?

I’ve read that experienced software developers tend to use libraries more often than less experienced developers. However, how does one find out about these libraries?

How does one even become aware that they have a problem that could be solved by a library?

After learning the syntax of a language, do developers tend to spend their time learning the most popular libraries for their specific language on GitHub, to add to their mental toolbox?

Or do they just Google “library for solving X” as they work on a project?