Is a GIS Tile Server reading from a distributed Postgres database a good production design or not?

We’re preparing to migrate our current two-server architecture to two new servers, and I would like to make the wisest possible decision in how we utilize these machines.

Our current setup has evolved piecemeal, without much planning for the future, and IMO deploys server-side resources inefficiently: two systems, each running Postgres, Geoserver, and Drupal. One of them has a MySQL instance feeding its Drupal site, and the other has a Tilestache/Mapnik render stack in addition to a running, but unused, Geoserver instance. Since both systems render the same underlying GIS data, all of our ETL tasks are duplicated, with the only difference between them being the geometry’s coordinate system... I know. 🙂

Some notes on the I/O demands: both of the Drupal sites are low-demand homepages with very little write/insert activity, and I would characterize the overall demand on both databases as read-intensive. (In fact, two tables are so read-intensive that I would like to cache them in RAM if possible.) We do have one insert API, though, that collects data on app-usage patterns: for example, a session id is created when our single-page app initially loads, and subsequent interactions are logged, like layers being turned on/off, the measuring of an in-map feature, the type and location of features “clicked” by the user, etc. It makes about 15k inserts on a typical day but needs to handle up to 50k inserts/day, based on the maximum daily counts we’ve seen. However, these tables are only used in-house and are infrequently queried. Finally, I expect the tile rendering itself is probably disk-write-intensive.
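For the two hot tables, I was picturing something like pg_prewarm after a restart. A rough sketch only; pg_prewarm being installed, the connection details, and the table names are all assumptions on my part:

import psycopg2

# Warm the two read-heavy tables into Postgres' shared buffers after a restart.
# Assumes the pg_prewarm extension is already installed; host, credentials, and
# table names below are placeholders.
conn = psycopg2.connect(host="db-server", dbname="gis", user="gis_user",
                        password="secret")
with conn, conn.cursor() as cur:
    for table in ("hot_table_one", "hot_table_two"):
        cur.execute("SELECT pg_prewarm(%s);", (table,))
conn.close()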

For obvious reasons I don’t want to duplicate our current architecture. My primary goal is to have a single spatial database using geometry types of the same coordinate system in order to simplify our ETL.

So my main question is whether it is better to have all the data served from one machine configured and tuned as a stand-alone database, and whether the tile-rendering software will perform better on the same system as that database or installed on the other system, which will run the web server (Apache). Worth mentioning: Geoserver and Tilestache have to be configured to handle web requests forwarded through Apache, with Geoserver running on Tomcat (port 8080) and Tilestache running under mod_python.

If we use a distributed database architecture, my concern is that the wire between the two servers will be the chokepoint, especially considering that the tile renderers will be making very frequent requests for geometry collections, which are a heavy data type. Of course, I would also appreciate any other insights into the best architecture.
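To get a feel for just how heavy, I was planning to measure the stored size of the geometries the renderers pull. A rough sketch; the connection details, layer table, and geometry column are placeholders:

import psycopg2

# Rough proxy for how much geometry data the renderers would pull over the wire:
# average and maximum stored size of the geometry column for one layer table.
conn = psycopg2.connect(host="db-server", dbname="gis", user="gis_user",
                        password="secret")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT count(*)                  AS n_rows,
               avg(pg_column_size(geom)) AS avg_bytes,
               max(pg_column_size(geom)) AS max_bytes
        FROM parcels  -- placeholder layer table
    """)
    print(cur.fetchone())
conn.close()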

Cannot find MissingFeature to remove it from a Content Database

We are trying to clean up our Content Databases to remove errors such as MissingFeature. However, we can’t find these missing features through normal means, or via any article I’ve found so far.

An example is:

Category        : MissingFeature
Error           : True
UpgradeBlocking : False
Message         : Database [WSS_Content_DB1] has reference(s) to a missing feature: Id = [2510d73f-7109-4ccc-8a1c-314894deeb3a].
Remedy          : Feature (Id = [2510d73f-7109-4ccc-8a1c-314894deeb3a]) is referenced in database [WSS_Content_DB1], but isn't installed on the current farm. The missing feature might cause upgrade to fail. If necessary, please install any solution that contains the feature and restart upgrade.
Locations       :

However, the only place I can find a reference to this feature Id is in the database itself, in the FeatureTracking table.

I can directly delete these entries from that table and the errors go away.

Is there a supported method to remove these references so that the errors no longer appear in Test-SPContentDatabase output?

Can my server handle 12,000 database requests per-minute?

First of all, apologies if this is a silly question. I’ve never really had to manage servers and databases on a large scale before.

ANYWAY, on to the question. I am trying to figure out whether our current server can handle 12,000 database requests per minute, which works out to about 200 per second. (Once again, I don’t know if that is a lot; I assume it’s mid-range.) I estimate that 2/3 of the 12,000 requests will be simple SELECT queries against very small tables, no more than 20,000 rows each; I’ve made a point of pruning them on a regular basis. It’s a LAMP stack.

Below are the server hardware and software specs:

Processors – Intel Haswell @ 2095.050 MHz

Memory – 7.45 GB usable

Storage – 80 GB SSD

OS – Ubuntu, CentOS 7

DB – MySQL 5.7.25

PHP – 7.2.7

The database is stored on the same server that serves the files.

If this server is capable of this, how much further can the server be pushed?
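To sanity-check, I was thinking of running a small script against a copy of one of the tables, along these lines. A rough sketch only; PyMySQL, the credentials, the table, and the query are placeholders I would adapt:

import time

import pymysql

# Measure how many simple SELECTs per second this box can serve.
# 12,000 requests/minute works out to roughly 200/second.
conn = pymysql.connect(host="localhost", user="app", password="secret",
                       database="appdb")
cur = conn.cursor()

iterations = 10000
start = time.time()
for i in range(iterations):
    cur.execute("SELECT * FROM small_table WHERE id = %s", (i % 20000,))
    cur.fetchall()
elapsed = time.time() - start

print("%.0f queries/sec (target is about 200/sec)" % (iterations / elapsed))
conn.close()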

Thank you in advance (And sorry if this seems to be a dumb question)

Which is more secure: storing a reference link in the database to a file in the filesystem, or storing the entire file in the database?

We have forms where we get clients to upload images of very sensitive information (passport, signature, etc.). We will store their written information in a database. However, there is the issue of the images. Usually, I hear of people saving the images in the filesystem and storing a link to each file in a database column as a reference. I know I can also save the entire image as a BLOB in the database.

My question is about which would be more secure, both from the outside and the inside. This isn’t an issue of users uploading malicious files; it is more a question of where the files would be better hidden and protected.
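For reference, the two options I’m weighing look roughly like this (a SQLAlchemy sketch with illustrative model and column names, not our real schema):

from sqlalchemy import Column, Integer, LargeBinary, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class DocumentByPath(Base):
    """Option 1: the image lives on the filesystem; the DB stores only a path."""
    __tablename__ = "documents_by_path"
    id = Column(Integer, primary_key=True)
    client_id = Column(Integer, nullable=False)
    file_path = Column(String(255), nullable=False)  # e.g. a path under a restricted upload dir


class DocumentAsBlob(Base):
    """Option 2: the whole image is stored in the database row as a BLOB."""
    __tablename__ = "documents_as_blob"
    id = Column(Integer, primary_key=True)
    client_id = Column(Integer, nullable=False)
    image = Column(LargeBinary, nullable=False)  # raw image bytes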

Keeping objects in RAM vs more queries to the database

So this is my data structure:

Project
  - Name
  - ID
  - Image
  - History
      - User
      - Comment

When my application first starts, it pulls all projects with all their details.

This is obviously very slow and uses a lot of RAM (mostly because of the images).

For now it works, but as more projects are added, performance gets worse every day.

I’ve done some research but cannot figure out what would be the best approach.

The options I think I have:

  1. Lazy-loading of ProjectDetails (load the details only when the user clicks on a project)
  2. Using thumbnails for images (showing the full-size image only in detail mode)

Questions for Option 1:

  1. Wouldn’t this increase the query count drastically?
  2. What is better after all: keeping things in RAM, or querying more often?
  3. What do I do with the data I’ve loaded lazily? Do I dispose of it when the details collapse, or should I keep it for a while in case the user opens them again?

I’m planning on implementing both, but I’m unsure about the questions above. A rough sketch of how I picture option 1 is below.
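Something along these lines (SQLAlchemy purely to illustrate the idea; my actual stack, models, and columns differ):

from sqlalchemy import Column, ForeignKey, Integer, LargeBinary, String
from sqlalchemy.orm import declarative_base, deferred, relationship

Base = declarative_base()


class Project(Base):
    __tablename__ = "projects"
    id = Column(Integer, primary_key=True)
    name = Column(String(200), nullable=False)
    # deferred(): the image bytes are NOT fetched with the project list,
    # only when project.image is actually accessed (detail view).
    image = deferred(Column(LargeBinary))
    # lazy="select": history rows are loaded by a separate query the first
    # time project.history is accessed.
    history = relationship("HistoryEntry", lazy="select", back_populates="project")


class HistoryEntry(Base):
    __tablename__ = "history_entries"
    id = Column(Integer, primary_key=True)
    project_id = Column(Integer, ForeignKey("projects.id"), nullable=False)
    user = Column(String(100))
    comment = Column(String(1000))
    project = relationship("Project", back_populates="history")

# The project overview would then run only one cheap query, e.g.
#   session.query(Project.id, Project.name).all()
# and the image/history load on demand when a project's details are opened.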

Tables missing from MySQL database after removing then restoring the mysql data directory

I carried out the following operations.

Make a mysqldump of testdb, then:

# Stop service
/etc/init.d/mysql stop
# Create copy of directory
cd /var/lib
cp -R mysql mysql.backup
# Delete datafile on disk
rm -rf mysql/testdb
# Start service
/etc/init.d/mysql start

At this stage I was going to import testdb again from the mysqldump; however, we wanted to roll back for an unrelated reason, so I restored the mysql directory:

/etc/init.d/mysql stop
rm -rf mysql/
cp -R mysql.backup mysql
# Restore ownership and permissions
chown -R mysql:mysql mysql
chmod -R 700 mysql

At this point testdb shows up when I run show databases; its tables are listed, but they no longer appear to contain any data.

Is this recoverable, or am I wasting my time? I do have the option of refreshing the DB from prod.

Github WordPress MySQL database

I’m kinda new to this, so sorry if my question is kinda dumb. I am currently hosting a website on GitHub, and I wanted to attach WordPress to it. In the manual I found, I saw that I have to create a MySQL database on the server where I’m hosting the website for the WordPress plugins. Can any of you tell me whether that is possible on GitHub and, if anyone has done it, give me any advice/instructions on how to do it? Thank you all very much, I highly appreciate it. – turb0

Saving OAuth2 credentials to database with SQLAlchemy

I’m writing a web application that uses the Google Proximity Beacon API to interact with beacon data. In order to access that data, I need to grant my web application access to the beacons’ information with my Google account. When my app is authorized, it receives the credentials needed to call the desired API. Because I don’t want to request these credentials each time I need to call the API, I’ve created SQLAlchemy models for storing them.

# Assumed imports (not shown above): `db` is the Flask-SQLAlchemy instance.
from google.oauth2.credentials import Credentials

scopes_m2m = db.Table(
    'scopes',
    db.Column(
        'scope_id',
        db.Integer,
        db.ForeignKey('oauth2_scopes.id'),
        primary_key=True),
    db.Column(
        'credential_id',
        db.Integer,
        db.ForeignKey('oauth2_credentials.id'),
        primary_key=True),
)


class OAuth2Scope(db.Model):
    '''
    Google OAuth2 scope for resource
    '''
    __tablename__ = 'oauth2_scopes'

    id = db.Column(db.Integer, primary_key=True)
    scope = db.Column(db.String(100), unique=True, nullable=False)

    @classmethod
    def from_scope(cls, scope):
        ''' Create OAuth2Scope model object if not exists

        :param scope: str - scope name
        :return: OAuth2Scope object
        '''
        scopes = {row.scope: row for row in cls.query.all()}
        if scope not in scopes:
            return cls(scope=scope)
        return scopes[scope]


class OAuth2Credential(db.Model):
    '''
    Google OAuth2 credentials for accessing Proximity Beacon API
    '''
    __tablename__ = 'oauth2_credentials'

    id = db.Column(db.Integer, primary_key=True)
    token_uri = db.Column(db.String(100), nullable=False)
    client_id = db.Column(db.String(100), unique=True, nullable=False)
    client_secret = db.Column(db.String(100), nullable=False)
    token = db.Column(db.String(250), nullable=False)
    refresh_token = db.Column(db.String(250), nullable=True)
    scopes = db.relationship(
        'OAuth2Scope',
        secondary=scopes_m2m,
        backref=db.backref('credentials', lazy=True),
        lazy='subquery')

    @classmethod
    def from_creds(cls, credentials):
        ''' Create model object from google.oauth2.credentials.Credentials

        :param credentials: google.oauth2.credentials.Credentials object
        :return: OAuth2Credential object
        '''
        client_id = credentials.client_id
        scopes = credentials.scopes
        token = credentials.token
        refresh_token = credentials.refresh_token
        client_secret = credentials.client_secret
        token_uri = credentials.token_uri
        oauth2_credential = OAuth2Credential.query.filter_by(
            client_id=client_id).first()
        oauth2_scopes = [OAuth2Scope.from_scope(scope) for scope in scopes]
        if not oauth2_credential:
            oauth2_credential = OAuth2Credential(
                client_id=client_id,
                token_uri=credentials.token_uri,
                client_secret=credentials.client_secret,
                token=credentials.token,
                refresh_token=credentials.refresh_token)
            for oauth2_scope in oauth2_scopes:
                oauth2_credential.scopes.append(oauth2_scope)
        else:
            oauth2_credential.token_uri = token_uri
            oauth2_credential.token = token
            oauth2_credential.refresh_token = refresh_token
            oauth2_credential.client_secret = client_secret
            for oauth2_scope in oauth2_scopes:
                # add any scopes that aren't already linked to this credential
                if oauth2_scope not in oauth2_credential.scopes:
                    oauth2_credential.scopes.append(oauth2_scope)
        return oauth2_credential

    def get_creds(self):
        ''' Get google.oauth2.credentials.Credentials object
        from model's object

        :return: google.oauth2.credentials.Credentials object
        '''
        data = {
            'token_uri': self.token_uri,
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'token': self.token,
            'refresh_token': self.refresh_token,
            'scopes': [s.scope for s in self.scopes]
        }
        return Credentials(**data)

    def to_dict(self):
        return {
            'token_uri': self.token_uri,
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'token': self.token,
            'refresh_token': self.refresh_token,
            'scopes': [s.scope for s in self.scopes]
        }

It’s important to note that each credential stores a list of scopes, which are links to the particular APIs my app has been granted access to. Because many scopes can be used by many credentials, I’ve created a many-to-many relationship between the OAuth2Scope model and the OAuth2Credential model. To create a new instance of OAuth2 credentials I call the from_creds class method of OAuth2Credential, and here is the place where things look a little bit complicated:

  • First, I verify whether the credentials already exist in the DB:

oauth2_credential = OAuth2Credential.query.filter_by(
    client_id=client_id).first()
  • Then I create a list of scope objects based on the scopes list provided by the Credentials object.
  • If there is a credential with the provided client_id, I update it (in case there are changes, for example new scopes being added); otherwise I create a new OAuth2Credential instance.

To me this code looks too complicated, and it queries the database more often than it should, which will cause performance issues in the future.

Any suggestions on what can be improved here?
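One idea I had (an untested sketch, not something I’ve benchmarked) is to resolve all the scopes with a single query instead of loading the whole oauth2_scopes table inside OAuth2Scope.from_scope:

def get_or_create_scopes(scope_names):
    # Fetch every existing scope in one query instead of one query per scope.
    existing = {
        row.scope: row
        for row in OAuth2Scope.query.filter(
            OAuth2Scope.scope.in_(scope_names)).all()
    }
    # Reuse existing rows; create model objects only for the missing ones.
    return [existing.get(name, OAuth2Scope(scope=name)) for name in scope_names]

# ...and then inside from_creds:
#     oauth2_scopes = get_or_create_scopes(credentials.scopes)

But I’m not sure whether that is the right direction, or whether the whole from_creds flow should be restructured.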