Split'ing Strings and the Performance Implications

Jason IM’d me with a question last night about how to split a string using a delimiter with multiple characters. Of course, my first answer was to use the VB Split method (he’s doing this from C#). He also found that there is another way using System.Text.RegularExpressions.Regex.Split (hmmm… that’s easy to remember). Since there were two ways to do it, he decided to do some performance testing.

Before getting the results from him, I decided to do a quick and dirty test as well. In my testing, I also included System.String.Split. System.String.Split doesn’t allow for multiple-character delimiters, which is the reason for the original question. However, with single-character delimiters, System.String.Split is definitely faster than the VB Split method. What I did notice, though, is that VB Split showed no real performance difference between a single-character and a multiple-character delimiter. System.Text.RegularExpressions.Regex.Split, on the other hand, was very noticeably slower.

Of course, after I got the results from Jason, our data didn’t seem to match. He was only showing a minor difference between the two. It turns out we were measuring differently. He takes the difference between each test, sums those differences, and takes the average. Mine is the summed difference in total timing over several iterations. To me, it’s more important to look at how long 1000 iterations took. We are both right to a degree, but I think my measurement more accurately shows the timing difference, and by that measure there is a huge difference between the two. I’ve modified Jason’s example to show both of these values, and it doesn’t require opening the data in Excel. You can get it here.
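To make the two ways of reporting concrete, here is a minimal sketch of how both numbers can come out of the same run. It is not the downloadable example mentioned above; the class name SplitTiming, the test string, the "<>" delimiter, and the use of Stopwatch are all assumptions made for illustration, and it reflects my reading of how the two measurements differ. It also assumes the project can reference the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;
using Microsoft.VisualBasic;

public static class SplitTiming
{
    // Hypothetical test data; the inputs used in the original tests are not shown.
    private const string Input = "alpha<>beta<>gamma<>delta";
    private const string Delimiter = "<>";

    // Times a single call and returns the elapsed Stopwatch ticks.
    private static long TimeOnce(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.ElapsedTicks;
    }

    // Runs both splitters the given number of times and returns the summed ticks for each.
    public static (long VbTotal, long RegexTotal) Measure(int iterations)
    {
        long vbTotal = 0;
        long regexTotal = 0;

        for (int i = 0; i < iterations; i++)
        {
            vbTotal += TimeOnce(() => Strings.Split(Input, Delimiter));
            // Regex.Escape keeps the delimiter from being read as a pattern.
            regexTotal += TimeOnce(() => Regex.Split(Input, Regex.Escape(Delimiter)));
        }

        return (vbTotal, regexTotal);
    }

    public static void Main()
    {
        const int iterations = 1000;
        var (vbTotal, regexTotal) = Measure(iterations);

        // Summed difference over the whole run.
        long totalDifference = regexTotal - vbTotal;
        // The same difference expressed as an average per iteration.
        double averageDifference = (double)totalDifference / iterations;

        Console.WriteLine($"VB Split total ticks:      {vbTotal}");
        Console.WriteLine($"Regex.Split total ticks:   {regexTotal}");
        Console.WriteLine($"Total difference:          {totalDifference}");
        Console.WriteLine($"Average difference / call: {averageDifference:F2}");
    }
}
```

Both figures come from the same underlying timings. The per-call average divides the gap by the iteration count, so it looks small, while the total shows what the gap adds up to over 1000 calls.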

So, what’s the rule of thumb on this one? Well, if you are using a single-character delimiter and are concerned with performance, use System.String.Split. If you need to split a string on a multiple-character delimiter, use the VB Split method (or Microsoft.VisualBasic.Strings.Split for you C# folks).
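For the C# folks, the calls behind that rule of thumb look roughly like this. The wrapper class SplitExamples and its method names are made up for illustration; only the three underlying Split calls are real APIs. The Strings.Split call assumes a reference to the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Text.RegularExpressions;
using Microsoft.VisualBasic;

public static class SplitExamples
{
    // Single-character delimiter: System.String.Split.
    public static string[] SplitOnChar(string input, char delimiter)
    {
        return input.Split(delimiter);
    }

    // Multiple-character delimiter: the VB runtime's Split, called from C#.
    public static string[] SplitOnString(string input, string delimiter)
    {
        return Strings.Split(input, delimiter);
    }

    // The regular expression alternative. Regex.Escape makes the delimiter
    // match literally instead of being interpreted as a pattern.
    public static string[] SplitWithRegex(string input, string delimiter)
    {
        return Regex.Split(input, Regex.Escape(delimiter));
    }
}
```

For example, SplitExamples.SplitOnString("a<>b<>c", "<>") returns the three elements "a", "b" and "c". If you are targeting .NET Framework 2.0 or later, it is also worth knowing that System.String.Split gained an overload taking a string array and a StringSplitOptions value, which handles multiple-character delimiters too; it was not part of the tests described here.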

Also, I think it’s interesting to point out how presenting data using one method vs. another kind of follows along with this document and the reasons why you should (as Jason pointed out on his site) take the time to validate the information yourself.

This post is licensed under CC BY 4.0 by the author.