Simple MongoDB Aggregation in C#

Grouping Data with Count

Note: This starts with version 1.10.1 of the Mongo driver (https://www.nuget.org/packages/mongocsharpdriver/1.10.1). The v2 driver wraps its async methods in synchronous calls to stay 'compatible', and as such is slower. So running 'synchronously' on v2 is at a disadvantage, while going async with the async driver may bring a minor benefit. The async version is covered further down using driver 2.1.1 (https://www.nuget.org/packages/MongoDB.Bson/2.1.1).

In C# we tend to name each collection after the type of the DTO stored in it; if you don't, just swap typeof(MyCollection).Name for the collection's name as a string.

To run an aggregate we select the collection, then pass it the aggregation pipeline. The pipeline is represented by the match, group and sort variables.
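
If it helps to see what the three stages compute, here's a rough in-memory equivalent using LINQ. It's only a sketch: `GroupCountSketch` and `CountByUser` are made-up names, the sequence of Guids stands in for the UserId field of each document, and nothing here touches MongoDB.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the MongoDB driver: it mirrors the pipeline in memory.
public static class GroupCountSketch
{
    // userIds stands in for the UserId field of every document in the collection.
    // An empty $match lets every document through, so there is no Where() here.
    public static List<KeyValuePair<Guid, int>> CountByUser(IEnumerable<Guid> userIds)
    {
        return userIds
            .GroupBy(id => id)                                           // $group with _id: "$UserId"
            .Select(g => new KeyValuePair<Guid, int>(g.Key, g.Count()))  // { "$sum": 1 } per group
            .OrderByDescending(pair => pair.Value)                       // $sort with -1
            .ToList();
    }
}
```

Feeding it three ids where one appears twice gives two entries back, with the repeated id first and a count of 2.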

V1.10.1 Mongo Driver

      // MongoUtil.GetMongoDatabase simply gets an instance of the MongoDB database
      var _db = MongoUtil.GetMongoDatabase("mongodb://somedb:port/mydata?safe=true");
      var col = _db.GetCollection<BsonDocument>(typeof(MyCollection).Name);

      // You can add a query within the match to reduce the result set; this is basically your where clause
      var match = new BsonDocument { { "$match", new BsonDocument { } } };

      // We group by setting the group-by clause (the _id) to our field, in this instance UserId
      // (it's a GUID but could be anything).
      // We prefix the field name with $, then set the name of the field that holds the count,
      // NumberOfReferencedUsers (overly descriptive?)
      var group = new BsonDocument
          {
            {
             "$group", new BsonDocument
              {
                {"_id", "$UserId"},
                {"NumberOfReferencedUsers", new BsonDocument {{"$sum", 1}}}
              }
            }
          };

      // We want to sort descending so we'll sort by our count with `-1`; ascending would be `1`
      var sort = new BsonDocument { { "$sort", new BsonDocument { { "NumberOfReferencedUsers", -1 } } } };
      // Create our pipeline array then feed it into an instance of AggregateArgs and pass to the aggregate method.
      var pipeline = new[] { match, group, sort };
      var mongoArgs = new AggregateArgs { Pipeline = pipeline };
      var res = col.Aggregate(mongoArgs).ToList();

      // BsonDocument is dynamic enough, so we enumerate the list from ToList() and get on with things.
      foreach (var x in res)
      {
          // You can grab the entries by their key name and cast, call other methods etc. as usual.
          // There's also `TryGetElement` and `GetElement`, and the same again for values (`TryGetValue`, `GetValue`).
          var id = (Guid)x["_id"];
          Console.WriteLine($"{id},{x["NumberOfReferencedUsers"]}");
      }
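
The `TryGetElement` route mentioned in the comments can be sketched like this, for when you'd rather skip a malformed row than have the cast throw. `ResultRowReader` and `TryRead` are names I've made up for illustration, and it assumes the UserId values come back from the driver as Guids.

```csharp
using System;
using MongoDB.Bson;

// Hypothetical helper for illustration; not part of the MongoDB driver.
public static class ResultRowReader
{
    // Returns false instead of throwing when a key is missing or has an unexpected type.
    public static bool TryRead(BsonDocument row, out Guid userId, out int count)
    {
        userId = Guid.Empty;
        count = 0;

        BsonElement idElement;
        BsonElement countElement;
        if (!row.TryGetElement("_id", out idElement) ||
            !row.TryGetElement("NumberOfReferencedUsers", out countElement))
        {
            return false;
        }

        // Assumption: _id holds a Guid and the count is a numeric BSON value.
        if (!idElement.Value.IsGuid || !countElement.Value.IsNumeric)
        {
            return false;
        }

        userId = idElement.Value.AsGuid;
        count = countElement.Value.ToInt32();
        return true;
    }
}
```

Inside the loop you'd call `ResultRowReader.TryRead(x, out id, out count)` and only write the line when it returns true.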

V2 Async Mongo Driver

The async version (forced to resolve synchronously, as I whacked this in a console app) is a little more fluent and a little less nested.

      var db = new MongoContext("mongodb://server:port/mydata?safe=true");
      db.Connect();

      var col = db.Database.GetCollection<BsonDocument>(typeof(MyCollection).Name);

      // Again this is the 'where clause', so put your filter conditions in the match document.
      // If we wanted to only get results with more than 3 matches we could use
      // var match = new BsonDocument{{"NumberOfReferencedUsers", new BsonDocument {{"$gt" , 3}} }};
      // and we'd have to reorder the Aggregate call so match came after group, or NumberOfReferencedUsers
      // wouldn't have a value to filter on
      var match = new BsonDocument();

      var group = new BsonDocument
      {
        {"_id", "$UserId"},
        {"NumberOfReferencedUsers", new BsonDocument {{"$sum", 1}}}
      };

      var sort = new BsonDocument {{"NumberOfReferencedUsers", -1}};

      // Fluent and oh yes less code. Tasty
      var aggregate = col.Aggregate().Match(match).Group(group).Sort(sort);

      // ToListAsync() returns a Task, so you'd normally await it, which means the calling method has to be async
      // and all the usual async stuff applies. Here it's in a console app so we just block on .Result.
      var res = aggregate.ToListAsync().Result;

      foreach (var x in res)
      {
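        try
        {
          var id = (Guid)x["_id"];
          Console.WriteLine($"{id},{x["NumberOfReferencedUsers"]}");
        }
        catch (Exception ex)
        {
          // Don't swallow failures silently; report the row that couldn't be read
          Console.Error.WriteLine($"Skipping row: {ex.Message}");
        }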
      }
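
The comments in the v2 example mention two variations: filtering on the count (which means matching after grouping) and awaiting `ToListAsync()` instead of blocking on `.Result`. Here's a minimal sketch of both together. `ReferencedUserCounts`, `BuildHaving` and `GetAsync` are hypothetical names of mine, and it assumes the same `UserId` field and a collection you've already obtained as above.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper for illustration; the class and method names are not from the driver.
public static class ReferencedUserCounts
{
    // Filter applied to the grouped documents, so it can see the computed count.
    public static BsonDocument BuildHaving(int minCount)
    {
        return new BsonDocument
        {
            { "NumberOfReferencedUsers", new BsonDocument { { "$gt", minCount } } }
        };
    }

    public static async Task<List<BsonDocument>> GetAsync(
        IMongoCollection<BsonDocument> col, int minCount)
    {
        var group = new BsonDocument
        {
            { "_id", "$UserId" },
            { "NumberOfReferencedUsers", new BsonDocument { { "$sum", 1 } } }
        };
        var sort = new BsonDocument { { "NumberOfReferencedUsers", -1 } };

        // Group first, then match on the count, then sort.
        return await col.Aggregate()
            .Group(group)
            .Match(BuildHaving(minCount))
            .Sort(sort)
            .ToListAsync();
    }
}
```

From an async method you'd call it as `var res = await ReferencedUserCounts.GetAsync(col, 3);`. Matching after grouping behaves like a SQL HAVING clause, whereas a match placed before the group behaves like WHERE.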

In case you're interested this is the 'context' the mongo util creates; I had to quickly rewrite it for the V2 driver but it doesn't change that much.

    internal class MongoContext
    {
      private readonly string _connectionString;
      private string _databaseName;
      private MongoClient _mongoClient;

      public MongoContext(string connectionString)
      {
        _connectionString = connectionString;
        BsonDefaults.GuidRepresentation = GuidRepresentation.Standard;
      }

      public IMongoDatabase Database { get; private set; }

      public void Connect()
      {
        _databaseName = MongoUrl.Create(_connectionString).DatabaseName;
        if (string.IsNullOrWhiteSpace(_databaseName))
        {
          throw new Exception("Could not determine database name from connection string");
        }
        _mongoClient = new MongoClient(_connectionString);
        Database = _mongoClient.GetDatabase(_databaseName);
      }
    }

With the pipeline it's pretty much up to you; happily, the v2 library cuts down on the number of BsonDocument objects you need to create.

Chris McKee

https://chrismckee.co.uk

Software Engineer, Web Front/Backend/Architecture; all-round tech obsessed geek. I hate unnecessary optimism