work with sets, filters and cycles

Im trying to represented a set of data with a mathematical expression but I don’t know what is the best method for this.

In programming you have data set like

var x = [1,2,3,4] 

and you can apply filter like:

x.filter( element => element > 2 ) 

or loops like:

x.foreach( element => { some filter conditions } ) 

exists a mathematical expression for this cases?

An apology if the question is very ambiguous, but I would like your guidance to begin to understand programming processes at a more abstract level and to be able to represent cases with clear and universal mathematical expressions that can later be translated into programming languages.

Optimization of a query to retrieve records randomly with multiple joins and filters

I have the following schema:

DB Schema

This question was posted also in StackOverflow, but I want to consult also to specialists more focused on DB administration because the nature of my project. Sorry if this is a mistake

Right now, the table Property hold more than 70K records. I’m developing an update to support more than 500 concurrent sessions. The application will support a map a to make the searches, that’s why GeoLocation declares Coordinate as geography data type. Now we have a big problem, because the response time for some queries (the most important ones) is very slow. I mean, the application has to return around 1000 records at once if there are that quantity of results for the specified parameters.

The parameters are distributed on all the tables of the schema (actually, it’s a portion of the schema). Being Features a table which holds all the principal “characteristics” of the properties (# of bedrooms, # of garages, etc).

With that on mind, the query that is taking so much time right now is the following:

DECLARE @cols NVARCHAR(MAX), @query NVARCHAR(MAX);  DECLARE @properties TABLE(     [ID] INT )  INSERT INTO @properties     SELECT p.[Id]     FROM[Property] p     INNER JOIN[GeoLocation] AS[g]          ON[p].[Id] = [g].[PropertyId]     INNER JOIN[PropertyFeature] AS[pf]          ON[pf].[PropertyId] = [p].[Id]     INNER JOIN[Feature] AS[f]          ON[pf].[FeatureId] = [f].[Id]     WHERE[g].[Address] IS NOT NULL AND(([g].[Address] <> N'') OR[g].[Address] IS NULL)         AND[pf].[FeatureId] IN(             Select ID from feature where featuretype = 1)     GROUP BY p.Id, p.ModificationDate     ORDER BY [p].ModificationDate DESC, newid()     OFFSET 0 ROWS     FETCH NEXT 1000 ROWS ONLY  DECLARE @features TABLE(     [Name] NVARCHAR(80) )  INSERT INTO @features     select Name from feature where FeatureType = 1  CREATE TABLE #temptable (     Id INT,     Url NVARCHAR(200),     Title NVARCHAR(300),     Address NVARCHAR(200),     Domain Tinyint,     Price Real,     Image NVARCHAR(150),      Name NVARCHAR(80),     Value NVARCHAR(150) )  INSERT INTO #temptable SELECT     [t].[Id],      [t].[Url],      [t].[GeneratedTitle] AS[Title],      [t].[Address],      [t].[Domain],      [t].[Price],     (SELECT TOP(1) ISNULL([m].[Resize1200x1200], [m].Resize730x532)      FROM [Multimedia] AS[m]      WHERE [t].[Id] = [m].[PropertyId]         and m.MultimediaType = 1      ORDER BY [m].[Order]) AS[Image],      [t].[Name],      [t].[Value] FROM     (SELECT         [p].[Id],         [p].[Url],         [p].[GeneratedTitle],         [g].[Address],         [p].[Domain],         [pr].[Amount] AS Price,         [p].[ModificationDate],         [f].[Name],         [pf].[Value]     FROM [Property] AS [p]     INNER JOIN [GeoLocation] AS[g]          ON [p].[Id] = [g].[PropertyId]     INNER JOIN [PropertyFeature] AS[pf]          ON [pf].[PropertyId] = [p].[Id]     INNER JOIN [Feature] AS[f]          ON [pf].[FeatureId] = [f].[Id]     INNER JOIN [Operation] AS [o]          ON [p].[Id] = [o].[PropertyId]      INNER JOIN [OperationType] AS [o0]          ON [o].[OperationTypeId] = [o0].[Id]      INNER JOIN [Price] AS [pr]          ON [pr].[OperationId] = [o].[Id]      WHERE p.Id in          (Select Id from @properties)     GROUP BY [p].[Id],               [p].[Url],              [p].[GeneratedTitle],               [g].[Address],              [p].[Domain],               [pr].[Amount],              [p].[ModificationDate],              [f].[Name],              [pf].[Value]) AS[t]     ORDER BY[t].[ModificationDate] DESC  SET @cols = STUFF(                 (                     SELECT DISTINCT                             ','+QUOTENAME(c.[Name])                     FROM @features c FOR XML PATH(''), TYPE                  ).value('.', 'nvarchar(max)'), 1, 1, ''); SET @query = 'SELECT [Id],                       [Url],                       [Title],                       [Address],                       [Domain],                       [Price],                       [Image],                       ' + @cols + '                FROM (SELECT [Id],                              [Url],                              [Title],                              [Address],                              [Domain],                              [Price],                              [Image],                              [Value] AS [value],                              [Name] AS[name]                       FROM #temptable)x                       PIVOT(max(value) for name in ('+@cols+')) p'; EXECUTE(@query);  DROP TABLE #temptable 

The execution plan and Live query statistics show me the following:

Query Execution Plan

The previous query tries to obtain randomly a X number of records IDs, holding all the filter criteria to obtain only the IDs of the records which meet that criteria. The time right now is up to 15 seconds. It’s a lot if we talk about more than 400 users using concurrently the application.

Please, help me with this. I’m three weeks trying to solve this problem with no success, but a lot of advances has been made (before it was consuming 2 minutes in avg).

If it helps, I can give you access to a “dummy” deployed version of the DB with the same quantity of records to test and see directly the problem.

Thanks in advance…

===================================================================================================== INDEXES:

the indexes that are currently on the tables are the following:

GO CREATE UNIQUE NONCLUSTERED INDEX IX_Property_ModificationDate  ON [dbo].[Property] (ModificationDate DESC)  WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON)  GO CREATE NONCLUSTERED INDEX [IX_Property_ParentId_StatusCode]  ON [dbo].[Property] ([ParentId] ASC, [StatusCode] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Property_ParentId_StatusCode_Id_ModificationDate]  ON [dbo].[Property] ([ParentId] ASC, [StatusCode] ASC, [Id] ASC, [ModificationDate] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Property_ParentId]     ON [dbo].[Property]([ParentId] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);      GO CREATE NONCLUSTERED INDEX [IX_Property_Identity_Domain_StatusCode]     ON [dbo].[Property]([Identity] ASC, [Domain] ASC, [StatusCode] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Property_Id_ModificationDate]  ON [dbo].[Property] (Id ASC, ModificationDate ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Property_PublisherId]     ON [dbo].[Property]([PublisherId] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);   GO CREATE NONCLUSTERED INDEX [IX_Property_RealEstateTypeId]     ON [dbo].[Property]([RealEstateTypeId] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON)   GO  CREATE INDEX FIX_Property_StatusCode_Online ON [dbo].[Property](StatusCode) WHERE StatusCode = 1 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE INDEX FIX_Property_StatusCode_Offline ON [dbo].[Property](StatusCode) WHERE StatusCode = 0 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE INDEX FIX_Property_Domain_Urbania ON [dbo].[Property](Domain) WHERE Domain = 1 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE INDEX FIX_Property_Domain_Adondevivir ON [dbo].[Property](Domain) WHERE Domain = 2 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  GO CREATE NONCLUSTERED INDEX [IX_GeoLocation_PropertyId_ModificationDate]  ON [dbo].[GeoLocation] (PropertyId ASC, [ModificationDate] DESC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_GeoLocation_PropertyId_Address]  ON [dbo].[GeoLocation] (PropertyId ASC, [Address] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE UNIQUE NONCLUSTERED INDEX IX_GeoLocation_ModificationDate  ON [dbo].[GeoLocation] (ModificationDate DESC)  WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE NONCLUSTERED INDEX [IX_GeoLocation_Ubigeo] ON [dbo].[GeoLocation]([Ubigeo] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON)  GO CREATE UNIQUE NONCLUSTERED INDEX [IX_GeoLocation_PropertyId]     ON [dbo].[GeoLocation]([PropertyId] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE SPATIAL INDEX SIX_GeoLocation_Coordinate ON [dbo].[GeoLocation](Coordinate) GO  CREATE INDEX FIX_GeoLocation_Domain_Urbania ON [dbo].[GeoLocation](Domain) WHERE Domain = 1 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  CREATE INDEX FIX_GeoLocation_Domain_Adondevivir ON [dbo].[GeoLocation](Domain) WHERE Domain = 2 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  GO CREATE NONCLUSTERED INDEX [IX_Multimedia_PropertyId_Order]  ON [dbo].[Multimedia] (PropertyId ASC, [Order] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Multimedia_PropertyId]     ON [dbo].[Multimedia]([PropertyId] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_Multimedia_Order]     ON [dbo].[Multimedia]([Order] ASC)     WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON); GO  CREATE NONCLUSTERED INDEX [PK_Multimedia_Property]     ON [dbo].[Multimedia]([Id] ASC, [PropertyId] ASC); GO  CREATE INDEX FIX_Multimedia_MultimediaType_Image ON [dbo].[Multimedia](MultimediaType) WHERE MultimediaType = 1 WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON) GO  GO CREATE NONCLUSTERED INDEX [IX_PropertyFeature_PropertyId_FeatureId]  ON [dbo].[PropertyFeature] (PropertyId ASC, [FeatureId] ASC) WITH( SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, FILLFACTOR = 90, ONLINE = ON);  GO CREATE NONCLUSTERED INDEX [IX_PropertyFeature_FeatureId]     ON [dbo].[PropertyFeature]([FeatureId] ASC);   GO CREATE NONCLUSTERED INDEX [IX_PropertyFeature_PropertyId]     ON [dbo].[PropertyFeature]([PropertyId] ASC);   GO CREATE NONCLUSTERED INDEX [IX_PropertyFeature-FeatureId]     ON [dbo].[PropertyFeature]([Id] ASC, [FeatureId] ASC);   GO CREATE NONCLUSTERED INDEX [IX_PropertyFeature_Property]     ON [dbo].[PropertyFeature]([Id] ASC, [PropertyId] ASC);  GO CREATE NONCLUSTERED INDEX [IX_Operation_PropertyId]     ON [dbo].[Operation]([PropertyId] ASC);  GO CREATE NONCLUSTERED INDEX [IX_Operation_OperationTypeId]     ON [dbo].[Operation]([OperationTypeId] ASC);  GO CREATE NONCLUSTERED INDEX [IX_Price_OperationId]     ON [dbo].[Price]([OperationId] ASC);  GO CREATE NONCLUSTERED INDEX [IX_Price_Operation]     ON [dbo].[Price]([Id] ASC, [OperationId] ASC); 

Best Practices for Filters

I have several domains that I want to filter from my projects so they never attempt to post to them.

I have 2 questions.

1. Is it better to create a .txt file with a list of all the domains I want to filter out and enter in the URL of that .txt file in the “Options” —> “Filter” section and GSA will pull the list daily, -OR- is it better to list each domain individually within the Filter list itself?

2. If I want to filter/block a domain and ALL of it’s subdomains, what format do I use? I have seen *domain.com suggested?

For example, if I want to block all art.blog subdomains, and I put in *art.blog would that also block bestart.blog (I don’t want that, I only want to blog art.blog subdomains).

Country Select with region filters

I’m tinkering with a design for user to select a single country, or multiple countries that are historically/culturally grouped (eg United Kingdom).

Just looking for feedback, and wondering if anyone has seen something along these lines.

Dropdown 1: Region (whole world, Europe, Nth America, Sth America, SE Asia etc)

Drowdown 2: Country (all, then list of appropriate countries based on dropdown 1)

The user first selects a region (eg Europe) then an Country (eg Germany).

If Country=All, then the search results will return all countries in the selected region (or all countries if region=whole world).

Changing region changes the options in the Country list.

One issue I have is how to deal with regions like United Kingdom. Ignoring Brexit, the UK countries are also in the Europe region. One country being in multiple regions is not a problem. The problem is there are quite a few regions like this (UK, British Isles, Balkans, Caribbean etc). I can absolutely see users wanting to have results returned for “all countries in the UK”.

Should I just have lots of regions? All world, continents (EU, Africa etc), then these special subsets (UK etc)? Or would you go All world, then Continents+SpecialRegions [sorted a-z]?

Implementing Filters as Tabs in a Web Dashboard

We are trying to provide users with a strong filter mechanism on a Dashboard that collects visit data. It feels like we have arrived at a point where only a few pieces of the puzzle are missing and we don’t know how to place them.

The user should be able to:

  1. Define the filter parameters.
  2. Save them for future use along with a name.
  3. Apply them for one-time use to quickly check or search something without the need for saving them.
  4. Modify the parameters of an existing saved filter and save it again.
  5. Open tabs in the software itself and apply different filters (saved and unsaved) to easily switch from one type of data to another.
  6. Close tabs.

Constraints:

There is a live view tab that is static and can’t be modified or removed by the user. (Mentioned from now on as the default tab)

Solution: (So far) 1. A filter button on the dashboard that opens a side panel for filters that contains all the parameters and allows the user to save a filter, save as existing or just apply. Filter Panel View

  1. A Tab System that works like you’d expect. It allows the user to change the filter applied in a tab by clicking on a drop-down icon on the right side of the tab itself and selecting from the saved filters list. Drop-down implementation to select filters in tabs

  2. A plus button at the end of the tab list which works exactly like the drop-down above, prompting the user to select which filter view to apply on the new tab avoiding any empty states or empty tabs open.

Points we still can’t seem to get a grip on:

  1. How to make the experience of applying a filter to any open tab as seamless as possible? For example, if the user is on a tab and he clicks on the Filters button and changes some parameters and clicks Apply. Should we update the view to Unsaved View? Should we keep the name of the tab same with an asterisk to denote unsaved state with a reset button? Or should we open a new tab (potentially leading to a lot of unsaved tabs open?)

  2. If subsequently, he refreshes his page? Should the unsaved view disappear (being unsaved) or should we keep it in memory (potentially not really creating a need to use the ‘Save Filter’ feature)

  3. How to treat the default tab in all this? Should we disable all filtering options in the default tab to avoid any confusion leading to some loss in discoverability of filter features that are available as they will not show in the default tab or should we show filtering options on the default tab and open any changes made on the default tab in a new tab leading to a little bit of inconsistency (changing any filters from the default tab always opens a new tab while changing any filters from other tabs only makes changes to that specific tab and then implementing a fail safe or an error state if the user tries to make changes to the default tab with the maximum limit reached)

  4. Creating a responsive design around the solution.

It’ll be great to get insights and thought process on how to go about this problem.

String filtering – process hundreds to millions of filters

What would be the most efficient way (whether with algorithims, cpu(s), DBs & SQL, distributed computing, etc) to process many strings, say ~1000/minute, and filter each string over 100s to potentially millions of different filter parameters. A parameter can be a simple statement such as including or not including the word “cat” or including “dog” not “cat” and as “complicated” as including multiple boolean gates with added timestamp ranges (logs). Each individual filter that marks true would be collected and some operation would run for each.