Monthly Archives: December 2010

Cutting down your data the right way

Throughout my career, I’ve seen two types of methods for getting what you need from data. One method grabs a mass collection from a data source and iterates through it to get what you need. The second method builds a complex filter and sends it across to the data source, hoping that the data source can interpret it quickly. Personally, I have gone the latter route even though it seems riskier. Querying a database is, in my mind, much more useful than building your own queries in memory.

Assume we have a Person object with a Hometown City and a Hometown State assigned to it. Let's take example A:

Dim lstPerson As List(Of Person) = GetAllPeople()
For Each objPerson As Person In lstPerson
  ' Skip anyone whose hometown state is not "CA"
  If objPerson.HometownState <> "CA" Then
    Continue For
  End If
  ' Do some logic
Next

While this code is logical and well put together, it has two flaws. First, it assumes that the impact of loading every person into memory will be small. Once we get into the tens of thousands of records, that is no longer the case. Ideally, we should have a method that returns only the Person objects with a HometownState of "CA".

The second flaw is speed. Iterating through the whole collection checks every record, which is wasteful compared to hitting an index. Assuming we are on a SQL database, the HometownState column can be indexed there, so the database can typically find the matching rows without scanning all of them.

While this method can seem small and insignificant at first, you must consider the possibility that someone else will want to use it on a larger scale. If we plan for that, we can start to use the tools provided to us to speed up our application and allow it to scale.
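
To make the database-side filter concrete, here is a minimal sketch of what such a method could look like. Everything about the schema is an assumption for illustration: the People table, its column names, the index name, the Person class shape, and the GetPeopleByState method are all hypothetical, and the columns are assumed to be non-null. It uses the standard System.Data.SqlClient classes with a parameterized query.

```vbnet
Imports System.Collections.Generic
Imports System.Data.SqlClient

' Hypothetical Person class, shaped after the description in this post.
Public Class Person
    Public Property Name As String
    Public Property HometownCity As String
    Public Property HometownState As String
End Class

Public Module PersonQueries
    ' Assumed schema: a People table with non-null Name, HometownCity and
    ' HometownState columns, and an index on HometownState, for example:
    '   CREATE INDEX IX_People_HometownState ON People (HometownState)
    Private Const SelectByStateSql As String =
        "SELECT Name, HometownCity, HometownState " &
        "FROM People WHERE HometownState = @State"

    ' Returns only the people from the given state. The WHERE clause runs
    ' in the database, so non-matching rows are never loaded into memory.
    Public Function GetPeopleByState(connectionString As String,
                                     state As String) As List(Of Person)
        Dim people As New List(Of Person)()
        Using conn As New SqlConnection(connectionString)
            Using cmd As New SqlCommand(SelectByStateSql, conn)
                ' Pass the state as a parameter instead of concatenating it
                ' into the SQL text.
                cmd.Parameters.AddWithValue("@State", state)
                conn.Open()
                Using reader As SqlDataReader = cmd.ExecuteReader()
                    While reader.Read()
                        people.Add(New Person With {
                            .Name = reader.GetString(0),
                            .HometownCity = reader.GetString(1),
                            .HometownState = reader.GetString(2)
                        })
                    End While
                End Using
            End Using
        End Using
        Return people
    End Function
End Module
```

With something like this in place, the loop from example A would iterate over GetPeopleByState(connectionString, "CA") and the If check inside the loop would no longer be needed.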