Pre-filtering

Pre-filtering (or server-side filtering) lets you download only the portion of a dataset you need to work on, rather than the whole dataset. Because your machine handles less data, it runs faster. For example, if you only plan to work on Finance rows in a given session, you can choose to load only the Finance nodes.
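
As a rough illustration of the idea only (not the product's actual implementation), the Python sketch below selects the matching nodes before anything is handed to the client. The sample data, the property names and the prefilter function are all hypothetical.

```python
# Hypothetical sample data; in practice these rows live on the server.
NODES = [
    {"id": "1", "Department": "Finance", "Country": "UK"},
    {"id": "2", "Department": "HR", "Country": "UK"},
    {"id": "3", "Department": "Finance", "Country": "France"},
]


def prefilter(nodes, criteria):
    """Return only the nodes that match every criterion.

    criteria maps a property name to the set of accepted values.
    Values are compared as text, since filter properties are treated as text.
    """
    return [
        node
        for node in nodes
        if all(
            prop in node and str(node[prop]) in accepted
            for prop, accepted in criteria.items()
        )
    ]


# Load only the Finance nodes instead of the whole dataset.
finance_nodes = prefilter(NODES, {"Department": {"Finance"}})
print([node["id"] for node in finance_nodes])
```

Here only two of the three nodes would be sent to the client, which is where the speed gain comes from.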

To turn on pre-filtering

Pre-filters are enabled by tagging each property you want to be able to filter on with the tag aggregate.


Once tagged this way, the contents of this property will be indexed server-side.

Note:

  • To be indexed, this data is held unencrypted. Never tag a sensitive property with aggregate. In general, this includes any property whose values are unique or nearly unique. The tag should be reserved for text fields (dimensions) with a small number of distinct values; Country and Department are good examples. Numeric fields are treated as text. All other field types are unsupported.
  • You may tag as many properties as you wish to use in the filter. However, around 5 properties is the recommended size for a usable set of filter properties.
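
Before tagging, it can help to check how many distinct values each candidate property holds. The sketch below is one possible way to do this on an exported sample of the data, assuming rows are available as Python dictionaries. The sample rows, the function names and the 0.5 threshold are assumptions chosen for illustration, not values recommended by the product.

```python
# Hypothetical sample rows; the property names are illustrative only.
ROWS = [
    {"Department": "Finance", "Country": "UK", "Email": "a@example.com"},
    {"Department": "Finance", "Country": "UK", "Email": "b@example.com"},
    {"Department": "HR", "Country": "France", "Email": "c@example.com"},
    {"Department": "HR", "Country": "France", "Email": "d@example.com"},
    {"Department": "Finance", "Country": "Spain", "Email": "e@example.com"},
    {"Department": "HR", "Country": "Spain", "Email": "f@example.com"},
]


def distinct_ratio(rows, prop):
    """Share of non-empty values that are distinct (1.0 means all unique)."""
    values = [str(row[prop]) for row in rows if row.get(prop) not in (None, "")]
    return len(set(values)) / len(values) if values else 0.0


def low_cardinality_properties(rows, props, max_ratio=0.5):
    """Keep the properties whose distinct ratio is at or below max_ratio.

    max_ratio is an arbitrary illustrative threshold, not a product setting.
    """
    return [prop for prop in props if distinct_ratio(rows, prop) <= max_ratio]


candidates = low_cardinality_properties(ROWS, ["Department", "Country", "Email"])
print(candidates)
```

In this sample, Email is unique on every row and is excluded, while Department and Country remain as candidates for the aggregate tag.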

To open a dataset with filters applied

When you open a dataset from the Home Screen that has properties tagged this way, a Filter Control opens, from which you can select the subset of data to open.

Once the desired subset has been selected in the pre-filter, there are two options:

  1. Clicking 'Open' will open the dataset with this filter applied.
  2. Clicking 'Add' saves the selection as a new Filtered view, accessible from the 'Filters' section of the Home Screen. You can also create branches this way.

If the parent_id property has also been tagged with index, the dataset opens with the additional nodes that are required to build the hierarchy (the ghost nodes). These nodes are opened in read-only mode, with their _outgoing_count calculated server-side.
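
The ghost-node behaviour can be pictured with the following minimal sketch. It assumes that ghost nodes are the ancestors of the selected nodes that fall outside the filter, and that _outgoing_count is the number of direct children a node has in the full dataset; both are interpretations made for illustration, and the function names and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical hierarchy: 1 is the root; 2 and 3 report to 1; and so on.
ALL_NODES = [
    {"id": "1", "parent_id": None, "Department": "Executive"},
    {"id": "2", "parent_id": "1", "Department": "Executive"},
    {"id": "3", "parent_id": "1", "Department": "Executive"},
    {"id": "4", "parent_id": "2", "Department": "Finance"},
    {"id": "5", "parent_id": "2", "Department": "Finance"},
    {"id": "6", "parent_id": "3", "Department": "HR"},
]


def ghost_node_ids(all_nodes, selected_ids):
    """Return ids of ancestors needed to connect the selection to the root."""
    parent_of = {node["id"]: node.get("parent_id") for node in all_nodes}
    ghosts = set()
    for node_id in selected_ids:
        parent = parent_of.get(node_id)
        # Walk up until the root, a selected node, or an already-seen ghost.
        while parent is not None and parent not in selected_ids and parent not in ghosts:
            ghosts.add(parent)
            parent = parent_of.get(parent)
    return ghosts


def outgoing_counts(all_nodes):
    """Count direct children per node across the full (unfiltered) dataset."""
    return Counter(
        node["parent_id"] for node in all_nodes if node.get("parent_id") is not None
    )


selected = {n["id"] for n in ALL_NODES if n["Department"] == "Finance"}
ghosts = ghost_node_ids(ALL_NODES, selected)
counts = outgoing_counts(ALL_NODES)
print(sorted(ghosts), {node_id: counts[node_id] for node_id in sorted(ghosts)})
```

In this sketch the counts are taken over the full dataset, so a ghost node can still report how many children it has even when some of them were filtered out.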

Considerations when using the server-side filters

  • Properties tagged with aggregate are not encrypted in the source database
  • The server-side filters should not be used on high cardinality properties (properties with many distinct values) because:
    • This risks making the data personally identifiable on the database, due to the unique combinations of values
    • This adds to the filter loading time
  • Server-side filtering works only with dimension (text) properties
  • Server-side filtering does not work with calculated properties
