
Tuesday, August 28, 2018

How to locally test SignalR in an ASP.NET Core app with different connection issues

As an example, I used an ASP.NET Core app with SignalR.
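The app under test can be as simple as a single hub. Here is a minimal sketch; the ChatHub class, the /chat route, and the method names are illustrative placeholders, not the exact app from my tests:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.DependencyInjection;

// A minimal hub: clients call Send, and the message is broadcast to everyone.
public class ChatHub : Hub
{
    public Task Send(string message) => Clients.All.SendAsync("Receive", message);
}

public class Startup
{
    public void ConfigureServices(IServiceCollection services) => services.AddSignalR();

    // Maps the hub to the /chat endpoint (ASP.NET Core 2.1-style routing).
    public void Configure(IApplicationBuilder app) =>
        app.UseSignalR(routes => routes.MapHub<ChatHub>("/chat"));
}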

How to limit connection speed between localhost apps using NetLimiter 4

It can be downloaded at https://www.netlimiter.com/

  1. Run the application from Visual Studio and select Google Chrome (or whatever browser you use) in NetLimiter

  2. Right-click the selected row and choose the connection speed you want. Then enable the incoming/outgoing limit rule by ticking the checkbox on the selected row
My tests of the SignalR app showed that the WebSocket connection works perfectly at 5 KB/s. It even worked for me at a very slow speed (0.5 KB/s), with some delay.
To view WebSocket frames, open Google Chrome Developer Tools (F12), select the Network tab, and apply the WS filter.


How to test network disconnection between machines:
You can use a virtual machine to test this case. I prefer the Hyper-V built into Windows 10 (available in the Professional edition and higher).
  1. Hyper-V setup documentation: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v
  2. Install Windows (or another OS) from an OS image via Hyper-V Manager.
  3. Run your app on the host machine with dotnet run --urls http://0.0.0.0:5000 so that Kestrel accepts external connections.
  4. Set up the Windows firewall to allow external connections: netsh advfirewall firewall add rule name="Http Port 5000" dir=in action=allow protocol=TCP localport=5000
  5. Run a browser in your virtual machine and connect to your host machine by its IP address and the specified port (5000 in my example).

Tuesday, January 30, 2018

Introduction to MongoDB Aggregation Pipeline

The main goal of this document is to describe the most commonly used stages of the aggregation pipeline and to give some recommendations for implementing aggregation requests. There is also a sample solution for a C# environment at the end of the document.

Quick Reference

$match

Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.
{ $match: { <query> } }

$project

Passes along the documents with the requested fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
{ $project: { <specification(s)> } }
Specifications can be the following:
  • <field>: <1 or true> specifies the inclusion of a field.
  • _id: <0 or false> specifies the suppression of the _id field.
  • <field>: <expression> adds a new field or resets the value of an existing field.
  • <field>: <0 or false> specifies the exclusion of a field.

$sort

Sorts all input documents and returns them to the pipeline in sorted order.
{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }
Sort order can be the following values:
  • 1 to specify ascending order.
  • -1 to specify descending order.

$lookup

Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.
{
    $lookup:
    {
        from: <collection to join>,
        localField: <field from the input documents>,
        foreignField: <field from the documents of the "from" collection>,
        as: <output array field>
    }
}
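To show how these stages are combined from C#, here is a hedged sketch using the MongoDB .NET driver's fluent aggregate API. The Orders/Users collections, the "shop" database, and the field names are illustrative assumptions, not a real schema:

using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class AggregationSample
{
    public static void Run()
    {
        var client = new MongoClient("mongodb://localhost:27017");
        var db = client.GetDatabase("shop");                      // hypothetical database name
        var orders = db.GetCollection<BsonDocument>("Orders");

        // $match -> $lookup -> $project -> $sort, built with the fluent API.
        var results = orders.Aggregate()
            .Match(new BsonDocument("Status", "Paid"))            // $match: filter documents early
            .Lookup("Users", "UserId", "_id", "User")             // $lookup: joined docs land in the "User" array
            .Project(new BsonDocument { { "UserId", 1 }, { "Total", 1 }, { "User", 1 } }) // $project
            .Sort(new BsonDocument("Total", -1))                  // $sort: descending by Total
            .ToList();

        Console.WriteLine($"Matched {results.Count} orders");
    }
}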

Aggregation Pipeline Optimization

The goal of all these optimizations is to minimize the amount of data passed between pipeline stages. The MongoDB engine applies them automatically, but it is still a good idea to write pipelines that need as little of this optimization as possible, so the engine has less work to do.
Optimization happens in two phases: sequence optimization and coalescence optimization. As a result, long chains of aggregation stages can sometimes be transformed into fewer stages that require less memory.

Pipeline Sequence Optimization

$project or $addFields + $match

If $match follows $project or $addFields, the parts of the $match expression that do not depend on fields computed in the projection stage are moved before the projection stage.

$sort + $match

In this case $match is moved before $sort to minimize the number of documents to sort.

$redact + $match

If $redact comes before $match, the optimizer can sometimes move a portion of the $match statement before $redact to limit the number of documents aggregated.

$skip + $limit

During optimization $limit is moved before $skip, and the $limit value is increased by the $skip amount. For example, { $skip: 10 }, { $limit: 5 } becomes { $limit: 15 }, { $skip: 10 }.

$project + $skip or $limit

In this case $skip or $limit is moved before $project to limit the number of documents that have to be projected.

Pipeline Coalescence Optimization

When possible, the optimization phase coalesces a pipeline stage into its predecessor. Generally, coalescence occurs after any sequence reordering optimization.

$sort + $limit

When a $sort immediately precedes a $limit, the optimizer can coalesce the $limit into the $sort. This allows the sort operation to only maintain the top n results as it progresses, where n is the specified limit, and MongoDB only needs to store n items in memory.

$limit + $limit

When a $limit immediately follows another $limit, the two stages can coalesce into a single $limit where the limit amount is the smaller of the two initial limit amounts.

$skip + $skip

When $skip immediately follows another $skip, the two stages can coalesce into a single $skip where the skip amount is the sum of the two initial skip amounts.

$match + $match

When a $match immediately follows another $match, the two stages can coalesce into a single $match combining the conditions with an $and.

$lookup + $unwind

When a $unwind immediately follows a $lookup, and the $unwind operates on the as field of the $lookup, the optimizer can coalesce the $unwind into the $lookup stage. This avoids creating large intermediate documents.

Aggregation Pipeline Limits

Each document in the result set is limited by the maximum BSON document size, which is currently 16 megabytes. If any single document exceeds this limit, the aggregate command will produce an error. The limit only applies to the returned documents; during pipeline processing, the documents may exceed this size.
Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB will produce an error. To allow for the handling of large datasets, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.
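In the C# driver, this option is exposed through AggregateOptions. A minimal sketch, reusing the same hypothetical "shop"/Orders names as the earlier example:

using MongoDB.Bson;
using MongoDB.Driver;

public static class AllowDiskUseSample
{
    public static void Run()
    {
        var db = new MongoClient("mongodb://localhost:27017").GetDatabase("shop"); // hypothetical database

        // Let stages spill to temporary files instead of failing at the 100 MB per-stage RAM limit.
        var options = new AggregateOptions { AllowDiskUse = true };

        var totalsPerUser = db.GetCollection<BsonDocument>("Orders")
            .Aggregate(options)
            .Group(new BsonDocument
            {
                { "_id", "$UserId" },                              // group orders by user
                { "total", new BsonDocument("$sum", "$Total") }    // sum order totals
            })
            .ToList();
    }
}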
The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate operation, allowDiskUse: true option will affect these other stages.

Aggregation Pipeline and Sharded Collections

The aggregation pipeline supports operations on sharded collections.

Behavior

If the pipeline starts with an exact $match on a shard key, the entire pipeline runs on the matching shard only. Previously, the pipeline would have been split, and the work of merging it would have to be done on the primary shard.
For aggregation operations that must run on multiple shards, if the operations do not require running on the database’s primary shard, these operations will route the results to a random shard to merge the results to avoid overloading the primary shard for that database. The $out stage and the $lookup stage require running on the database’s primary shard.

Optimization

When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many stages as possible with consideration for optimization.
To see how the pipeline was split, include the explain option in the db.collection.aggregate() method.
Optimizations are subject to change between releases.

Time measure experiment

For this experiment I used the following data:
100 000 records for Users
100 000 records for Items
1 000 000 records for Orders; each order contains a user id and up to 10 item ids.
The experiment consisted of five tests:
  1. simple match
  2. match using the collection 'contains' function
  3. lookup & match
  4. lookup & unwind & match & unwind & group
  5. lookup & unwind & match & unwind & lookup & unwind & group
Time was also measured for several foreign key and index combinations: ObjectId, GUID, ascending GUID, and CombGUID, each with an ascending index and with a hash index, plus a non-ID field without an index and with a hash index.
I got the following results (columns refer to the test numbers above):

Configuration                      Test 1    Test 2    Test 3                   Test 4                   Test 5
ObjectId & Ascending Index         0.168s    0.709s    66.373s                  75.856s                  73.797s
ObjectId & Hash Index              0.168s    0.722s    79.796s                  74.563s                  74.129s
GUID & Ascending Index             0.171s    0.814s    79.823s                  81.012s                  83.504s
GUID & Hash Index                  0.18s     0.83s     97.733s                  86.546s                  87.338s
Ascending GUID & Ascending Index   0.183s    0.793s    83.317s                  84.847s                  85.418s
Ascending GUID & Hash Index        0.179s    0.828s    97.171s                  86.375s                  86.321s
CombGUID & Ascending Index         0.185s    0.798s    85.767s                  86.045s                  86.303s
CombGUID & Hash Index              0.183s    0.787s    98.76s                   85.928s                  86.572s
Non-ID without Index               0.194s    0.79s     42501.798s (≈ 11h 48m)   44605.215s (≈ 12h 23m)   44763.693s (≈ 12h 26m)
Non-ID & Hash Index                0.177s    0.781s    83.502s                  82.692s                  82.749s
As we can see, the results are pretty bad. But there is a way to do it right!
These measurements were taken by running the aggregation pipeline on Orders, doing a $lookup on Users, and filtering by username AFTER that. But what if we run the aggregation pipeline on Users, filter it by username first, and then do a $lookup on the Orders collection? The results I got are the following:
  • simple match: 0.187s
  • match with 'contains': 0.807s
  • lookup & match: 0.062s
  • lookup & unwind & match & unwind & group: 0.038s
  • lookup & unwind & match & unwind & lookup & unwind & group: 0.014s
As we can see, the aggregation pipeline can now run these requests very quickly. So we should think very carefully about optimization when using the aggregation pipeline.
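A hedged C# sketch of this reordering (the UserName and UserId field names are assumptions about the test schema): the pipeline starts on Users, filters by username first, and only then does the $lookup into Orders.

using MongoDB.Bson;
using MongoDB.Driver;

public static class ReorderedPipelineSample
{
    public static void Run()
    {
        var db = new MongoClient("mongodb://localhost:27017").GetDatabase("shop"); // hypothetical database
        var users = db.GetCollection<BsonDocument>("Users");

        // Filter Users FIRST, then join Orders: only the matched users reach the expensive $lookup.
        var userOrders = users.Aggregate()
            .Match(new BsonDocument("UserName", "john.doe"))   // cheap, index-friendly filter
            .Lookup("Orders", "_id", "UserId", "Orders")       // joined orders land in the "Orders" array
            .Unwind("Orders")                                  // one document per (user, order) pair
            .ToList();
    }
}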
I also ran the same tasks without the aggregation pipeline. The results are the following:
  • simple match: 0.008s
  • match with 'contains': 0.012s
  • lookup & match: 0.566s
  • lookup & unwind & match & unwind & group: 0.652s
  • lookup & unwind & match & unwind & lookup & unwind & group: 0.838s
Of course, these tests were run without things such as indexes, which is why they take a bit longer to execute.

Thursday, December 21, 2017

How-to: Make MongoDB HIPAA compliant

Configuring encryption at rest

To enable encryption in MongoDB, start mongod with the --enableEncryption option.
You also need to decide where to store the master key. You can store it either in an external key manager, which is the recommended way since it is necessary to meet HIPAA guidelines, or locally.
You will need an external key manager application that supports the KMIP communication protocol, for example: https://www.townsendsecurity.com/products/centralized-encryption-key-management

To start MongoDB with a new key, use this command:
mongod --enableEncryption --kmipServerName <KMIP Server HostName> --kmipPort <KMIP server port> --kmipServerCAFile <ca file path> --kmipClientCertificateFile <certificate file path>
A note about the last two options:
--kmipServerCAFile <string>
Path to CA File. Used for validating secure client connection to KMIP server.
--kmipClientCertificateFile <string>
A string containing the path to the client certificate used for authenticating MongoDB to the KMIP server.

If the command succeeds, you will see the following messages in the log file:
[initandlisten] Created KMIP key with id: <UID>
[initandlisten] Encryption key manager initialized using master key with id: <UID>
If a key already exists, use the following command to start MongoDB:
mongod --enableEncryption --kmipServerName <KMIP Server HostName> --kmipPort <KMIP server port> --kmipServerCAFile <ca file path> --kmipClientCertificateFile <certificate file path> --kmipKeyIdentifier <UID>
To read the full article about MongoDB encryption at rest, follow this link: https://docs.mongodb.com/manual/core/security-encryption-at-rest/

Transport encryption

On server side

Before you can use SSL, you must have a .pem file containing a public key certificate and its associated private key.
MongoDB can use any valid SSL certificate issued by a certificate authority, or a self-signed certificate. If you use a self-signed certificate, although the communications channel will be encrypted, there will be no validation of server identity.

Set Up mongod with SSL Certificate and Key

To use SSL in your MongoDB deployment, start mongod with the following run-time options:
  • net.ssl.mode set to requireSSL. This setting restricts each server to use only SSL encrypted connections. You can also specify either the value allowSSL or preferSSL to set up the use of mixed SSL modes on a port.
  • PEMKeyfile with the .pem file that contains the SSL certificate and key.
The syntax is the following:
mongod --sslMode requireSSL --sslPEMKeyFile <pem> <additional options>
You may also specify these options in the configuration file, as in the following example:
net:
    ssl:
        mode: requireSSL
        PEMKeyFile: /etc/ssl/mongodb.pem

Set Up mongod with Certificate Validation

Along with the options from the previous section, you should also set CAFile to the name of the .pem file that contains the root certificate chain from the Certificate Authority.
Syntax:
mongod --sslMode requireSSL --sslPEMKeyFile <pem> --sslCAFile <ca> <additional options>
If you prefer using a configuration file, then:
net:
    ssl:
        mode: requireSSL
        PEMKeyFile: /etc/ssl/mongodb.pem
        CAFile: /etc/ssl/ca.pem

Disallow Protocols

To prevent MongoDB servers from accepting incoming connections that use specific protocols, include the --sslDisabledProtocols option, or, if using the configuration file, the net.ssl.disabledProtocols setting.
mongod --sslMode requireSSL --sslDisabledProtocols TLS1_0,TLS1_1 --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem <additional options>
If you use a configuration file:
net:
    ssl:
        mode: requireSSL
        PEMKeyFile: /etc/ssl/mongodb.pem
        CAFile: /etc/ssl/ca.pem
        disabledProtocols: TLS1_0,TLS1_1

SSL Certificate Passphrase

The PEM files for PEMKeyfile and ClusterFile may be encrypted. With encrypted PEM files, you must specify the passphrase at startup with a command-line or a configuration file option or enter the passphrase when prompted. To specify the passphrase in clear text on the command line or in a configuration file, use the PEMKeyPassword and/or the ClusterPassword option.

On client side

For C#:
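Here is a minimal sketch using the MongoDB .NET driver. The host name, certificate path, and password are placeholders; older 2.x drivers expose the UseSsl property, while newer versions call it UseTls.

using System.Security.Cryptography.X509Certificates;
using MongoDB.Driver;

public static class SslClientSample
{
    public static MongoClient Create()
    {
        var settings = new MongoClientSettings
        {
            Server = new MongoServerAddress("mongodb.example.com", 27017), // placeholder host
            UseSsl = true,                                                 // require an encrypted channel
            SslSettings = new SslSettings
            {
                // Client certificate for authenticating to the server (placeholder path and password).
                ClientCertificates = new[] { new X509Certificate2("client.pfx", "password") },
                CheckCertificateRevocation = true
            }
        };

        return new MongoClient(settings);
    }
}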
To read the full article about MongoDB transport encryption, follow this link: https://docs.mongodb.com/manual/core/security-transport-encryption/

Performance (of encryption at rest)

CPU: 3.06GHz Intel Xeon Westmere(X5675-Hexcore)
RAM: 6x16GB Kingston 16GB DDR3 2Rx4
OS: Ubuntu 14.04-64
Network Card: SuperMicro AOC-STGN-i2S
Motherboard: SuperMicro X8DTN+_R2
Document Size: 1KB
Workload: YCSB
Version: MongoDB 3.2
In this environment, they got the following results:
In addition to throughput, latency is also a critical component of encryption overhead. From our benchmark, average latency overheads ranged between 6% to 30%. Though average latency overhead was slightly higher than throughput overhead, latencies were still very low—all under 1ms.

Average latency (µs), unencrypted vs. encrypted:
Workload                                               Unencrypted   Encrypted   % Overhead
Insert Only                                            32.4          40.9        -26.5%
Read Only, working set fits in memory                  230.5         245.0       -6.3%
Read Only, working set exceeds memory                  447.0         565.8       -26.6%
50% Insert / 50% Read, working set fits in memory      276.1         317.4       -15.0%
50% Insert / 50% Read, working set exceeds memory      722.3         936.5       -29.7%