I had two projects over the weekend. One was to get my bathroom ready to install a new bathtub. The other was an experimental coding project.
The bathroom preparation went fairly well. I ripped out all the cabinets and countertops. I'm glad my brother showed up unexpectedly, because that countertop was incredibly heavy. We also got a lot of the carpet ripped up. I'm going to be putting down new flooring as well.
On the coding front, as an experiment in adopting new features in Whidbey, I implemented a binary file parser for Standard Test Data Format (STDF) files. These files make up 99% of the data we work with at work, and they fill our many-terabyte test result database. We have a fairly complex parser and DB loader framework, implemented in C# on .NET 1.x. It works very well, but it was written early in our adoption of .NET, with little knowledge of what the CLR could do for us. So my experiment was basically to see how the new features in Whidbey, along with my now deeper experience in .NET, could make the parser better.
STDF is record-based. The spec defines a lot of records and leaves room for user-defined ones. The new parser reads chunks of the file based on the record headers and produces "unknown records". I define the record layouts using attributes on record classes. Then the parser uses LCG (lightweight code generation using DynamicMethod) to generate converters that read the content of the unknown records into the concrete record classes (based on the attributes). The benefit of using LCG is that record types can be registered or removed on the fly and the GC can collect the generated code. I could just as easily have implemented it using on-the-fly interpretation of the attributes. I'll measure and see how the performance works out.
The parser is pull-based, meaning that you ask it for records, or alternatively just "foreach" through them using an iterator-based IEnumerable implementation, which is pretty sweet. On top of the pull-based parser, I built an event-based "processor" where a consumer can register to receive certain record types. This is the model used in our current parser, but after the XmlReader vs. SAX discussions, I thought exposing the pull-based approach was the right thing to do.
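As a rough illustration of the pull-based reader with an event-based processor layered on top, here is a minimal sketch. It is not the parser described in this post. The names `UnknownRecord`, `StdfSketch`, `RecordProcessor`, and `Demo` are hypothetical, handlers are keyed by raw type and subtype codes rather than by concrete record classes, and the attribute-driven conversion and LCG steps are left out entirely. It assumes the 4-byte STDF record header (2-byte length, 1-byte type, 1-byte subtype) and little-endian data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical type: one raw record whose payload has not been decoded yet.
public sealed class UnknownRecord
{
    public readonly byte Type;
    public readonly byte SubType;
    public readonly byte[] Content;

    public UnknownRecord(byte type, byte subType, byte[] content)
    {
        Type = type;
        SubType = subType;
        Content = content;
    }
}

public static class StdfSketch
{
    // Pull model: nothing is read until the caller asks for the next record.
    // Assumption: lengths are little-endian. A real reader would have to
    // determine the byte order from the file itself.
    public static IEnumerable<UnknownRecord> ReadRecords(Stream stream)
    {
        BinaryReader reader = new BinaryReader(stream);
        while (true)
        {
            // Header: 2-byte payload length, 1-byte type, 1-byte subtype.
            byte[] header = reader.ReadBytes(4);
            if (header.Length == 0) yield break; // clean end of stream
            if (header.Length < 4) throw new EndOfStreamException("Truncated record header.");

            int length = header[0] | (header[1] << 8);
            byte[] content = reader.ReadBytes(length);
            if (content.Length < length) throw new EndOfStreamException("Truncated record payload.");

            yield return new UnknownRecord(header[2], header[3], content);
        }
    }
}

// Event model built on top of the pull model: consumers register callbacks
// and the processor drains the record sequence on their behalf.
public sealed class RecordProcessor
{
    private readonly Dictionary<int, Action<UnknownRecord>> handlers =
        new Dictionary<int, Action<UnknownRecord>>();

    public void Register(byte type, byte subType, Action<UnknownRecord> handler)
    {
        int key = (type << 8) | subType;
        Action<UnknownRecord> existing;
        handlers.TryGetValue(key, out existing);
        handlers[key] = existing + handler; // combining with null yields handler
    }

    public void Process(IEnumerable<UnknownRecord> records)
    {
        foreach (UnknownRecord record in records)
        {
            Action<UnknownRecord> handler;
            if (handlers.TryGetValue((record.Type << 8) | record.SubType, out handler))
                handler(record);
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Two hand-built records: (type 0, subtype 10) with a 2-byte payload,
        // then (type 1, subtype 20) with an empty payload.
        byte[] data = { 2, 0, 0, 10, 2, 4, 0, 0, 1, 20 };

        // Pull: iterate directly over the records.
        foreach (UnknownRecord rec in StdfSketch.ReadRecords(new MemoryStream(data)))
            Console.WriteLine("{0}/{1}: {2} bytes", rec.Type, rec.SubType, rec.Content.Length);

        // Push: register interest in one record kind and let the processor pull.
        RecordProcessor processor = new RecordProcessor();
        processor.Register(0, 10, delegate(UnknownRecord record) { Console.WriteLine("handler saw 0/10"); });
        processor.Process(StdfSketch.ReadRecords(new MemoryStream(data)));
    }
}
```

Because the processor is just another consumer of the `IEnumerable`, callers who prefer the pull style can ignore it and use `foreach` directly, which is the layering the XmlReader vs. SAX comparison suggests.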
I had a few challenges, which I think represent work for the next version of the CLR:
Oddly enough, I spent about equal time on both projects, but I seem to have a lot more to say about the latter.
[UPDATE] I realized that the entry box swallowed some of my generics syntax, so I fixed that, as well as fixing some minor spelling and grammatical errors.
Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.