We’re preparing to migrate our current two-server architecture to two new servers, and I would like to make the wisest possible decision in how we utilize these machines.
Our current setup has evolved piecemeal, without much planning for the future, and IMO has inefficient deployment of server-side resources–i.e. two systems, each running Postgres, Geoserver, and Drupal. One of them has a MySQL instance feeding its Drupal site, and the other has a Tilestache/Mapnik render stack in addition to a running, but unused Geoserver instance. Since both systems render the same underlying GIS data, all of our ETL tasks are duplicated, with the only difference between them being the geometry’s coordinate system ..I know. 🙂
Some notes on the I/O demands. Both of the Drupal sites are low-pressure homepages with very little write/insert activity. I would characterize the overall demand on both databases as read-intensive. (In fact, two tables are so read-intensive I would like to cache them in RAM if it’s possible.) We do have one insert API, though, that collects data on app usage patterns–for example, a session id is created when our single page app initially loads, then some subsequent interaction is logged, like layers being turned on/off, the measuring of an in-map feature, the type and location of features “clicked” by the user, etc. It makes about 15k inserts on a typical day but needs to handle up to 50k inserts/day based on some of the max counts when grouped by day. However these tables are only used in-house and are infrequently queried. Finally, I expect the tile-rendering itself is probably disk write-intensive.
For obvious reasons I don’t want to duplicate our current architecture. My primary goal is to have a single spatial database using geometry types of the same coordinate system in order to simplify our ETL.
So my main question is whether it is better to have all the data served from one machine configured and tuned as a stand-alone database? And whether the tile rendering software will perform better on the same system as that database, or if the renderers should be installed to the other system, which will run the web server (Apache). Worth mentioning: Geoserver and Tilestache have to be configured to handle web requests forwarded through Apache, Geoserver being on Tomcat::8080, and Tilestache using mod_python.
If we use a distributed database architecture, my concern is that THE WIRE between the two servers will be the chokepoint, especially considering that the tile renderers will be making very frequent requests for geometry collections, which are a heavy data type. Of course I would also appreciate any other insights into the best architecture.